[Dev] Metadata modelling question

Windhouwer, Menzo Menzo.Windhouwer at mpi.nl
Mon Feb 15 11:49:08 CET 2016


Hi, Sander, All,

This is just legacy I’m afraid :-(

I’m about to clean up the CMDI 1.1 toolkit, as it contains a lot of obsolete scripts, examples, etc. I assumed olac2cmdi.xsl was also one of those.

My target CMDI 1.1 toolkit looks as follows:

general-component-schema.xsd
xsd/minimal-cmdi.xsd
xslt/comp2schema-v2
xslt/comp2schema-v2/cleanup-xsd.xsl
xslt/comp2schema-v2/comp2schema-header.xsl
xslt/comp2schema-v2/comp2schema.xsl

These are the files that play an active role in CMDI 1.1.

@All: please let me know if this list is missing anything that you currently use from

https://infra.clarin.eu/cmd/xslt/

@Sander: I’ll have a look at the concept of git submodules. It would be great if the OLAC-to-CMDI stylesheet could simply live in its own repository …

Best,

Menzo
--
The Language Archive – tla.mpi.nl

From: Sander Maijers <sander at clarin.eu>
Date: Friday 12 February 2016 at 19:00
To: Sander Maijers <sander at clarin.eu>
Cc: Menzo Windhouwer <menzo.windhouwer at mpi.nl>, Dieter van Uytvanck <dieter at clarin.eu>, developers list CLARIN <dev at lists.clarin.eu>
Subject: Re: [Dev] Metadata modelling question

Hi,

A clarification of the links I gave:
https://github.com/clarin-eric/cmdi-toolkit/tree/cmdi-1.1/toolkit<https://github.com/clarin-eric/cmdi-toolkit/tree/cmdi-1.1/toolkit/xslt> == https://infra.clarin.eu/cmd/xslt/

Best,
Sander
--
Sent as system administrator and engineer for CLARIN
Please send inquiries about specific services to the corresponding e-mail address
See https://www.clarin.eu/content/support

Max Planck Institute for Psycholinguistics (https://tla.mpi.nl/), software developer
personal Skype: sander.maijers | work address: Wundtlaan 1, 6525 XD, Nijmegen (NL)



On Fri, Feb 12, 2016 at 6:58 PM, Sander Maijers <sander at clarin.eu> wrote:
Hi Menzo,

As you requested, the CMDI toolkit has recently started being pulled from a Git repo:
https://github.com/clarin-eric/cmdi-toolkit/tree/cmdi-1.1/toolkit/xslt

So, https://infra.clarin.eu/cmd/scripts/ is in fact up to date with the CMDI toolkit 1.1 branch on the Git repo.

What is the relation with the Git repo you just referred to? It would be best and least confusing if all changes were made on https://github.com/clarin-eric/cmdi-toolkit and the Harvest Manager depended on that via a git submodule.

Best,
Sander



On Fri, Feb 12, 2016 at 5:35 PM, Windhouwer, Menzo <Menzo.Windhouwer at mpi.nl> wrote:
Hi Jörg,

If so, you might try

https://github.com/TheLanguageArchive/oai-harvest-manager/blob/master/src/main/resources/olac2cmdi.xsl

which is newer and actively maintained. Fixes (or bug reports ;-) are welcome!

Best,

Menzo
--
The Language Archive – tla.mpi.nl

From: <dev-bounces at lists.clarin.eu> on behalf of Dieter van Uytvanck <dieter at clarin.eu>
Date: Friday 12 February 2016 at 17:28
To: developers list CLARIN <dev at lists.clarin.eu>
Subject: Re: [Dev] Metadata modelling question

Hi Jörg,

Are you using https://infra.clarin.eu/cmd/xslt/olac2cmdi.xsl? Judging from the code, it should recognize the role.

If you're using that version, there must be a bug in the XSLT; I hope you can track it down!

best,
Dieter

On Fri, Feb 12, 2016 at 4:57 PM, "Jörg Knappen" <jknappen at web.de> wrote:
In our current curation project, I got some really nice and fine-grained DC metadata in which several roles of the
contributors are differentiated like this:

<dc:contributor xsi:type="olac:role" olac:code="compiler">Con, Tributor</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="depositor">Con, Tributor</dc:contributor>

The olac2cmdi XSLT script simply drops the role distinctions between the contributors, leaving only flattened information
like this:

     <OLAC-DcmiTerms>

         <contributor>Con, Tributor</contributor>
         <contributor>Con, Tributor</contributor>

     </OLAC-DcmiTerms>

Thus, some information from the DC metadata is lost in the resulting CMDI.

So, my questions are:

1. Should I care about the information loss? Is the information relevant for further processing in the VLO?
2. If I should care, what should I do?
3. If I don't care, should I remove duplicate lines from the CMDI metadata?
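
To make the trade-off behind these questions concrete, here is a minimal Python sketch. It is not part of olac2cmdi.xsl; the wrapping `record` element and the helper names `contributors_with_roles` and `flattened_unique` are assumptions made for illustration only. It reads the two contributor lines above, shows the role information that is present in the DC input, and shows what remains once the names are flattened and duplicates are removed.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for Dublin Core elements, OLAC 1.1 and XML Schema instance.
DC = "http://purl.org/dc/elements/1.1/"
OLAC = "http://www.language-archives.org/OLAC/1.1/"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical wrapper element around the two lines quoted in the mail.
SAMPLE = f"""<record xmlns:dc="{DC}" xmlns:olac="{OLAC}" xmlns:xsi="{XSI}">
  <dc:contributor xsi:type="olac:role" olac:code="compiler">Con, Tributor</dc:contributor>
  <dc:contributor xsi:type="olac:role" olac:code="depositor">Con, Tributor</dc:contributor>
</record>"""


def contributors_with_roles(xml_text):
    """Return (name, role) pairs; role is None when olac:code is absent."""
    root = ET.fromstring(xml_text)
    pairs = []
    for el in root.iter(f"{{{DC}}}contributor"):
        name = (el.text or "").strip()
        role = el.get(f"{{{OLAC}}}code")
        pairs.append((name, role))
    return pairs


def flattened_unique(pairs):
    """Drop the roles and remove duplicate names, keeping first-seen order."""
    return list(dict.fromkeys(name for name, _ in pairs))


pairs = contributors_with_roles(SAMPLE)
print(pairs)                    # roles still attached to each contributor
print(flattened_unique(pairs))  # what is left after flattening and deduplication
```

With the roles kept, the two entries are distinct; once the roles are dropped, they collapse into a single name, which is why removing duplicates only makes sense if the role information is not needed downstream.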

Greetings from Saarbrücken,

Jörg Knappen





_______________________________________________
Dev mailing list
Dev at lists.clarin.eu
https://lists.clarin.eu/cgi-bin/mailman/listinfo/dev




--
Dieter Van Uytvanck
Technical Director CLARIN ERIC
www.clarin.eu | tel. +31-(0)850091363 | skype: dietervu.mpi



