[Userinvolvement] Supporting citation

Thorsten Trippel thorsten.trippel at uni-tuebingen.de
Thu Oct 1 21:44:57 CEST 2015


Dear Tomaž and all,

you are right, of course. Let's stick to the LINDAT example. LINDAT uses
handles, and the metadata states the correct handle. But the URL to cite
should be the handle, not the LINDAT site. And of course a resolver
can then easily find out that the handle belongs to the LINDAT
repository. But if you want to find every citation of the resource, you
will have to look at every handle and find out whether it is part of
LINDAT (or some other CLARIN centre). Now it would be easier if each
CLARIN centre used its own handle prefix. In this case we could just
search for hdl:PREFIX (for every CLARIN centre). However, this would
still not be obvious to readers: people outside the CLARIN community
would not be able to see that it is a CLARIN resource.
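
A minimal sketch of the prefix check described above, assuming each
centre had its own handle prefix. The registry is hypothetical: only
the prefix 11372 (LINDAT) appears in this thread, and the function and
variable names are made up for illustration.

```python
import re

# Hypothetical registry of CLARIN handle prefixes. Only 11372 (LINDAT)
# is taken from this thread; a real list would need to be maintained.
CLARIN_PREFIXES = {"11372": "LINDAT"}

# Matches both hdl:PREFIX/SUFFIX and http(s)://hdl.handle.net/PREFIX/SUFFIX.
HANDLE_RE = re.compile(
    r"(?:hdl:|https?://hdl\.handle\.net/)(?P<prefix>[0-9.]+)/(?P<suffix>\S+)"
)


def clarin_centre(citation):
    """Return the centre name if the cited handle has a known prefix, else None."""
    match = HANDLE_RE.search(citation)
    if match is None:
        return None
    return CLARIN_PREFIXES.get(match.group("prefix"))
```

Note that a direct repository URL (one that merely has "handle" in its
path) is not recognised here, since it is not cited as a handle.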

I think there are two purposes for citations of resources:

1. automatic counting: a crawler (or Google) looks at references and
counts the number of references to CLARIN resources. If only the handle
is cited, this crawler would have to know all CLARIN centres' prefixes
and look at all handles, or resolve all handles, to see which of them
point to CLARIN repositories. Though this would be technically possible,
it may involve some work, and performance might be tricky.

2. making CLARIN resources visible to humans: handles are persistent,
but a human reader cannot see any relation to CLARIN in them. These
readers will not use a resolver to find out that it is a CLARIN
resource. So we need to find a different way of attributing it to
CLARIN. When I say CLARIN, I of course also include the national
consortia.
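
The automatic counting in point 1 could be sketched roughly as follows.
This is an illustration under assumptions, not an existing crawler: the
set of known prefixes is hypothetical (only 11372 comes from this
thread), and the function name is invented.

```python
import re
from collections import Counter

# Assumption: the crawler is given the set of CLARIN handle prefixes.
# Only 11372 (LINDAT) is taken from this thread.
KNOWN_PREFIXES = {"11372"}

# Captures (prefix, suffix) for hdl:... and http(s)://hdl.handle.net/... forms.
HANDLE_RE = re.compile(
    r"(?:hdl:|https?://hdl\.handle\.net/)([0-9.]+)/([^\s,;\"<>)]+)"
)


def count_clarin_citations(text):
    """Count how often each handle with a known prefix is cited in text."""
    counts = Counter()
    for prefix, suffix in HANDLE_RE.findall(text):
        if prefix in KNOWN_PREFIXES:
            # Drop a trailing full stop that belongs to the sentence.
            counts[f"{prefix}/{suffix.rstrip('.')}"] += 1
    return counts
```

Without a list of prefixes, the same loop would have to resolve every
handle it finds, which is where the performance concern comes in.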


One more thing about DOIs: DOIs are handles (!), see 
https://www.doi.org/factsheets/DOIHandle.html

So if you have DOIs and DOIs fit your data, it is my understanding
that this is not contrary to CLARIN policy. In fact, as long as a PID is
actionable, it should not be a problem to use any type of persistent
identifier (I know this might be a challenge for web services). DOIs are
just slightly problematic because the policy for assigning DOIs does not
fit all the kinds of resources we work with. Handles (as a superset) are
more flexible. And of course there is no reason in principle not to
assign multiple persistent identifiers to a resource. It may only result
in a maintenance nightmare...
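
To illustrate what "actionable" means here, a small sketch that turns a
prefixed identifier into a URL a browser can follow. The function name
and the doi: example identifier are hypothetical; the two resolver
addresses are the public Handle and DOI resolvers.

```python
def actionable_url(pid):
    """Turn a 'hdl:' or 'doi:' identifier into a resolvable URL."""
    scheme, sep, name = pid.partition(":")
    if not sep or not name:
        raise ValueError(f"not a prefixed PID: {pid!r}")
    resolvers = {
        "hdl": "http://hdl.handle.net/",
        "doi": "https://doi.org/",
    }
    try:
        return resolvers[scheme.lower()] + name
    except KeyError:
        # Any other scheme is outside this sketch.
        raise ValueError(f"unknown PID scheme: {scheme!r}") from None
```

For example, actionable_url("hdl:11372/LRT-386") gives
http://hdl.handle.net/11372/LRT-386, the citation form discussed in
this thread.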

Does this clarify my argument?

Cheers
Thorsten

On 01.10.15 at 18:25, Tomaž Erjavec wrote:
> Dear Thorsten, all,
>
> On 01/10/2015 at 17:39, Thorsten Trippel wrote:
>> Dear all,
>>
>> the Lindat example may contain a handle, but this is not the classic
>> way to cite a handle, which would be hdl:11372/LRT-386 or, as an
>> actionable PID, http://hdl.handle.net/11372/LRT-386. Though the path
>> in the URL says "handle", it does not become a handle just by saying
>> so in the path, but by being resolvable via the handle resolver.
> But surely this is just because Martin has cited it using the direct
> LINDAT URL?
> But if you go to this URL it does clearly say that the way to cite this
> item is via http://hdl.handle.net/11372/LRT-386, just as you suggest above.
>
>> Neither of them shows an obvious link to lindat.
> Well, if you use the handle that is where you wind up, and there it does
> say LINDAT (more than once:).
>
>> So I guess we should prefer something like a footnote, "The data is
>> available at http://hdl.handle.net/11372/LRT-386, provided via the
>> CLARIN Research infrastructure", or in the reference section, "Branco,
>> António (2014). Nexing Corpus. http://hdl.handle.net/11372/LRT-386
>> provided via the CLARIN Research infrastructure"
>>
>> Zotero would not be more helpful as I do not see that Zotero provides
>> PIDs that are sustainable (i.e. independent of the hosting
>> institution). Maybe some day Zotero will be certified and provide real
>> PIDs ;-)
>>
> Maybe https://zenodo.org/ is a better exemplar, as it is also used for
> datasets and does provide DOIs.
> Which brings me to a point which I already made with LINDAT folks, but
> was not met with much enthusiasm:
> in spite of the short time our repository has been working, I've already
> had several experiences where potential depositors were put off by the
> fact that we don't use DOIs - it's a system they know and trust, whereas
> handles are to them some strange beast. I know that it is A Bad Thing to
> multiply identifiers, but maybe for the DOIs we could make an exception
> (as an option, not obligatory of course), as it does seem to be what
>  >90% of people use.
> This might be getting off-topic though and something to better discuss
> in Poland.
> Best,
> Tomaž
>
>> Greetings
>>
>> Thorsten
>>
>> On 01.10.15 at 16:56, Martin Wynne wrote:
>>> Dear all,
>>>
>>> One of the suggestions at the User Involvement meeting last month was
>>> that repositories should make it easy for users to cite CLARIN resources
>>> that they have accessed and used. It was thought that this would
>>> encourage use of CLARIN resources and help measure usage of
>>> resources.
>>>
>>> I promised to make a recommendation to the Centres Committee, which I
>>> have done in an email, but Dieter would like some more details of the
>>> requirements, and some examples of good practice in this area.
>>>
>>> I think that the most important requirements are:
>>> - to provide text on the landing page for a resource giving guidance to
>>> users on how to cite resources (e.g. fragments of text for them to copy
>>> and paste);
>>> - to make sure that there are persistent identifiers for online resources,
>>> that can be cited in publications and which will remain valid;
>>> - to inform users that citing the resources that they use is important
>>> for the ongoing sustainability of the resources.
>>>
>>> For examples of good practice I have found the following, which provide
>>> a fragment of text that can be pasted into a document (although the
>>> British History one is only a URL not a handle or DOI):
>>>
>>> - https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-386
>>>
>>> -
>>> http://www.british-history.ac.uk/cal-state-papers/domestic/edw-eliz/addenda/1547-65/p554
>>>
>>>
>>>
>>> Are there other requirements, and more examples? For example, would
>>> links to index resources on Zotero be useful?
>>>
>>> Please send any comments or suggestions to the list, or to me and I will
>>> summarize.
>>>
>>> Best wishes,
>>> Martin
>>>


-- 
----------------------------------------------------------------------------
///////// Dr. Thorsten Trippel   thorsten.trippel at uni-tuebingen.de
    //     Seminar für Sprachwissenschaft
   //  //  Eberhard-Karls-Universität Tübingen
  //  //   Office:  Wilhelmstr. 19 #2.17
     //    Phone:   +49 (0)7071-29-77352
///////// Federal Republic of Germany
-----------------------------------------------------------------------------

