[Userinvolvement] Supporting citation

Pavel Stranak stranak at ufal.mff.cuni.cz
Fri Oct 2 10:33:43 CEST 2015


Dear Thorsten and all,

thanks for using our example. Let me add a few clarifications.

> On 01 Oct 2015, at 21:44, Thorsten Trippel <thorsten.trippel at uni-tuebingen.de> wrote:
> 
> Dear Tomaž and all,
> 
> you are right, of course. Let's stick to the lindat example. Lindat uses handles and the metadata states the correct handle. But the URL  to cite should not be the lindat site but the handle.

Which it clearly is. If you copy and paste the citation, you get the actionable (URLified) handle, which is incidentally the currently recommended form. Nowhere do we recommend using other URLs for citations.

> And of course a resolver can then easily find out that the handle belongs to the lindat repository. But if you want to find every citation of the resource you will have to look at every handle and find out if the handle is part of lindat (or some other CLARIN centre).

The repository is also mentioned explicitly in the citation text, following the Force 11 Joint Declaration of Data Citation Principles examples (see below).

> Now it would be easy(-ier) if each CLARIN centre used its own handle prefix.

Which we do. Feel free to resolve the handle in the example with the parameter "Don't Redirect to URLs" at http://hdl.handle.net. In fact we use two prefixes: one for the Clarin LRT Inventory records, one for our national resources. The reason is sustainability: being able to transfer the LRT Inventory to another centre, including the PID management (the whole prefix).
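
A minimal sketch of how a citation counter could use per-centre prefixes, assuming citations carry the handle as a resolver URL of the form http://hdl.handle.net/PREFIX/SUFFIX. The prefixes and centre names in the table are made up for illustration, and the function names are hypothetical; a real crawler would need the actual prefix list of each centre.

```python
from urllib.parse import urlparse

# Hypothetical prefix table: these values are placeholders, not real prefixes.
CENTRE_PREFIXES = {
    "12345": "Example centre (LRT Inventory records)",
    "67890": "Example centre (national resources)",
}


def handle_prefix(citation_url):
    """Return the prefix of a handle written as a resolver URL."""
    path = urlparse(citation_url).path.lstrip("/")
    prefix, sep, suffix = path.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError("not a PREFIX/SUFFIX handle: %r" % citation_url)
    return prefix


def centre_for(citation_url):
    """Look up the centre that owns the handle's prefix, or None if unknown."""
    return CENTRE_PREFIXES.get(handle_prefix(citation_url))
```

With such a table the lookup is a plain string comparison on the cited URL, so no handle has to be resolved just to decide which centre it belongs to.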

> In this case we could just search for the hdl:PREFIX (for every CLARIN centre). However this would still not be too obvious, non-CLARIN-ingroup persons would not be able to see that it is a CLARIN resource.

The "hdl:" format is not really recommended for citations AFAIK. I remember Larry Lannom explaining that the reason is that the scheme never got supported by browsers. Sounds sensible to me. Even though the URLified form is longer, we follow that recommendation.
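
For illustration, converting the "hdl:" form into the URLified form is a simple string rewrite. This is only a sketch: the function name and the sample handle are hypothetical, and it assumes the resolver address http://hdl.handle.net mentioned above.

```python
RESOLVER = "http://hdl.handle.net/"


def to_actionable(pid):
    """Turn 'hdl:PREFIX/SUFFIX' into a resolver URL; leave other input unchanged."""
    pid = pid.strip()
    if pid.lower().startswith("hdl:"):
        # Drop the 'hdl:' scheme and any stray leading slashes.
        return RESOLVER + pid[4:].lstrip("/")
    return pid


# Hypothetical handle, used only to show the shape of the result.
print(to_actionable("hdl:12345/abc-def"))
```

The result can be clicked in any browser, which is the practical reason to prefer it in a citation.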

> 
> I think there are two purposes for citations of resources:

I actually think there are a few more. 

1) Giving credit and citations to resource creators. 

2) Ensuring replicability of results. That requires citing resources directly via their PIDs (and proper versioning of the resources with new PIDs).

Those would be the most prominent reasons usually mentioned. For more see the Force 11 declaration: https://www.force11.org/group/joint-declaration-data-citation-principles-final

> 
> 1. automatic counting: a crawler (or google) looks at references and counts the number of references to CLARIN resources. If the handle is cited only, this crawler would have to know all CLARIN-centres prefixes and look at all handles or resolve all handles to see which ones of them point to CLARIN repositories. Though this would be possible technically, it may involve some work and performance might be tricky.
> 
> 2. making CLARIN resources visible to humans: handles are persistent, but for a human it is impossible to see a relation to CLARIN. These readers will not use a resolver to find out that it is a CLARIN resource. So we need to find a different way of attributing it to CLARIN somehow. If I say CLARIN, I would of course also include national consortia.
> 
> 
> One more thing about DOIs: DOIs are handles (!), see https://www.doi.org/factsheets/DOIHandle.html

Of course they are; Clarin has recognised this since the beginning of the preparation of the PID policy. Any Clarin centre is free to use DOIs as their PIDs.

To sum up, we have tried to make the format of our citation conform to the Force 11 declaration mentioned above (endorsed by Clarin ERIC: https://www.force11.org/datacitation/endorsements) and to the work of the RDA working group for Data Citation. We followed the examples published by Force 11: https://www.force11.org/node/4771, but we are of course open to rational criticism and further improvements.

Best,
Pavel


