[Tf-curation] clavas vocabulary for licenses

Sugimoto, Go Go.Sugimoto at oeaw.ac.at
Tue Jun 12 15:23:53 CEST 2018


Dear all,

Just a general thought on the topic.
Correct me if I am wrong, as I have not worked with CMDI for long. :)

I think what metadata within the CMDI-VLO lacks is a standard way to carry, for the same value, both a custom field (which ideally should also be controlled) and a CLARIN-normalised field (to reach near-100% facet coverage). Instead, whatever source fields a provider defines are normalised centrally (and/or mapped to other fields CLARIN needs), which could cause confusion.

This is different from the approach of other metadata schemas. For instance, Europeana has dc:type (where you can put anything about the type of a resource, e.g. photo, newspaper) and edm:type (which is mandatory and has to be one of TEXT, IMAGE, VIDEO, SOUND, and 3D). A similar situation can be seen in other metadata standards such as EAD (with its NORMAL attribute). This is the usual way to make a facet consistent.

CLARIN, on the other hand, has too few restrictions on controlled vocabularies, elements, cardinalities etc., which leads to many metadata curation tasks. So I think this kind of point could at least be clarified further in the CMDI guidelines people are preparing. For example, we could say “keep original values and recommend creating a new field to map them to CLARIN vocabularies (CLAVAS etc).” In this way, we could finally have the controlled vocabularies for VLO we have dreamed of. This would be an alternative solution in which CLARIN passes the curation tasks on to the data providers (although I cannot say for sure whether it is the easiest solution, now that VLO/CMDI is already mature and hard to change).
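
A minimal sketch of the "keep the original, add a normalised field" idea, using the Europeana example above, might look as follows in Python. The mapping table, the function name and the dictionary-based record are assumptions made purely for illustration; they are not part of CMDI, VLO, CLAVAS or the Europeana tooling. Only the list of allowed edm:type values is taken from the text above.

```python
# Controlled values for the normalised field (the edm:type list quoted above).
EDM_TYPES = {"TEXT", "IMAGE", "VIDEO", "SOUND", "3D"}

# Hypothetical provider-side mapping from free-text values to the vocabulary.
PROVIDER_MAPPING = {
    "photo": "IMAGE",
    "newspaper": "TEXT",
}


def add_normalised_type(record: dict) -> dict:
    """Return a copy of the record with a normalised type next to the original.

    The original free-text value (under the key "dc:type") is left untouched;
    the controlled value is written to a separate key ("edm:type").
    """
    original = record["dc:type"]
    normalised = PROVIDER_MAPPING.get(original.strip().lower())
    if normalised is None:
        # Unmapped values are reported instead of being silently guessed.
        raise ValueError(f"no mapping for type value: {original!r}")
    if normalised not in EDM_TYPES:
        raise ValueError(f"mapped value is not in the vocabulary: {normalised!r}")
    return {**record, "edm:type": normalised}


record = add_normalised_type({"title": "Front page", "dc:type": "Newspaper"})
print(record)
```

The point of the sketch is only that the provider's own value survives alongside the controlled one, so a facet can be built on the second field without losing the first. Values without a mapping raise an error here; a real workflow might queue them for curation instead.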

Best,
Go Sugimoto
Go.Sugimoto at oeaw.ac.at
Austrian Centre for Digital Humanities (ACDH) <acdh.oeaw.ac.at>
Austrian Academy of Sciences <oeaw.ac.at>

Skype: heygo4it
LinkedIn: https://www.linkedin.com/in/gosavethequeen
ResearchGate: http://www.researchgate.net/profile/Go_Sugimoto2

From: tf-curation-bounces at lists.clarin.eu <tf-curation-bounces at lists.clarin.eu> On Behalf Of Ondrej Košarko
Sent: Tuesday, June 12, 2018 10:07 AM
To: Odijk, J.E.J.M. (Jan) <j.odijk at uu.nl>
Cc: tf-curation at lists.clarin.eu
Subject: Re: [Tf-curation] clavas vocabulary for licenses

Dear Jan, all,

I'm wondering whether CLAVAS (or any "fixed" vocabulary) is a good fit for fields where new values will appear over time. Even though we try to avoid it as much as possible, we do have custom licenses that apply to just one particular item. How would a dictionary handle that? A catch-all "other" value doesn't feel right. Would that mean updating the dictionary periodically, and having "invalid" values in between the updates?

Currently we use a URL, with this concept http://hdl.handle.net/11459/CCR_C-6586_2c79d86a-5a75-0890-d407-7d9cb86b9beb, to identify/point to the license. The license name in our profile is then optional and just a string.

Best,
Ondrej



2018-06-11 22:02 GMT+02:00 Odijk, J.E.J.M. (Jan) <j.odijk at uu.nl>:

Dear Curation task force,

The list of values for license in the (META-SHARE-originated) profile resourceInfo (clarin.eu:cr1:p_1360931019836) is quite extensive, and I would like to reuse it for a profile I manage. A disadvantage is that no semantics are defined for the vocabulary items, but one can imagine working on that together with the original profile creator (Penny Labropoulou).

I cannot reuse the vocabulary because it is a vocabulary of a CMDI element (and not, on its own, of a component).

An option would be to make this vocabulary a vocabulary in CLAVAS.
I understood from Menzo that there are plans for a CLAVAS vocabulary for licenses, and I wonder whether this one is being taken into consideration, and what the plans for such a vocabulary are.

I look forward to your response.
Jan
-----------------------------------------------------
Prof.dr. Jan Odijk
Professor of Language and Speech Technology
Director CLARIAH
UiL-OTS
Trans 10 k. 2.33
3512 JK Utrecht
T +31 30 253 5745
F +31 30 253 6000
Skype janodijk10
-----------------------------------------------------


