[Userinvolvement] Supporting citation

Thorsten Trippel thorsten.trippel at uni-tuebingen.de
Fri Oct 2 11:48:27 CEST 2015


Dear all and Pavel (who is actually also included in all...),

First: Sorry Pavel, I did not intend to sound offensive; I second 
everything you said here.

Following up on that discussion: if someone wants to measure the impact 
of cited data, that is not easy without resolving the handles. I don't 
think that resolving the PID will solve the citation issue, even though 
all the information may be available in the CMDI metadata you can get 
via the PID. Hence, to support citation, we should probably recommend 
some additional clue, for us and for human readers of a publication, 
that the resource is located somewhere in the CLARIN community.
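
To illustrate the counting problem, here is a minimal Python sketch that 
scans reference text for handles and counts those whose prefix belongs to 
a known centre. It is only a sketch under assumptions: the prefix table, 
the function name and the sample text are made up for illustration. Only 
the prefix 11372 is taken from the Lindat example in this thread; a real 
table would have to list the prefixes of every CLARIN centre.

```python
import re
from collections import Counter

# Assumption: illustrative prefix-to-centre table. Only 11372 (from the
# Lindat example handle in this thread) is real; extend it per centre.
CLARIN_PREFIXES = {"11372": "LINDAT"}

# Matches the URL form (http://hdl.handle.net/PREFIX/SUFFIX)
# as well as the "hdl:PREFIX/SUFFIX" form.
HANDLE_RE = re.compile(
    r"(?:https?://hdl\.handle\.net/|hdl:)([0-9.]+)/([^\s,;\"<>)]+)"
)

def count_clarin_handles(text):
    """Count cited handles per centre, based on the handle prefix only."""
    counts = Counter()
    for prefix, _suffix in HANDLE_RE.findall(text):
        centre = CLARIN_PREFIXES.get(prefix)
        if centre is not None:
            counts[centre] += 1
    return counts

# Hypothetical reference list: the second handle uses a made-up prefix.
references = """
Nexing Corpus. http://hdl.handle.net/11372/LRT-386
Some other dataset. http://hdl.handle.net/99999/abc
"""
print(count_clarin_handles(references))
```

This works without resolving anything, but it depends entirely on the 
prefix table being complete and up to date, and it does nothing for the 
human reader of the paper.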

Lindat is actually a pretty good example: it is a certified repository, 
using PIDs, the system is transparent, etc.

So let me ask you (that is everyone):

1. What would you answer to a user coming to you saying: "I used António 
Branco's Nexing Corpus, which I found at 
https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-386. It has 
the PID http://hdl.handle.net/11372/LRT-386 and seems to have been 
published in 2014. I don't know how to cite this. BibTeX doesn't have a 
@languageresource type. How do I cite it? How do I do this in Microsoft 
Word?"

2. How do you make sure that somebody (a user, Google Scholar...) 
reading that paper notices that the resource was provided via CLARIN? 
And how do you make sure that also the national consortium that provides 
it (in this case Lindat (!)) is credited? And António and the University 
of Lisbon as well? In Linked Data and hypertext it might work to resolve 
everything on a click, but in a paper, even as PDF?

3. How do you measure the use of CLARIN resources based on such citations?
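
For question 1, one possible workaround (not an agreed recommendation) 
is to fall back on BibTeX's generic @misc type and put the repository 
name next to the handle. The Python sketch below only formats such an 
entry. The function name, the citation key, the choice of the 
howpublished field and the repository label are assumptions made for 
illustration; the author, title, year and handle come from the example 
above.

```python
def bibtex_misc(key, author, title, year, handle_url, repository):
    """Format a BibTeX @misc entry that names the repository explicitly."""
    fields = [
        ("author", author),
        ("title", title),
        ("year", str(year)),
        # Assumption: howpublished carries the repository name so that
        # human readers can see where the resource is hosted.
        ("howpublished", repository),
        ("url", handle_url),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@misc{{{key},\n{body}\n}}"

entry = bibtex_misc(
    key="branco2014nexing",               # hypothetical citation key
    author="Branco, António",
    title="Nexing Corpus",
    year=2014,
    handle_url="http://hdl.handle.net/11372/LRT-386",
    repository="LINDAT/CLARIN repository",  # placeholder wording
)
print(entry)
```

This puts the repository name into the printed reference, but it does 
not answer the Microsoft Word part of the question, nor how a crawler 
would recognise the entry as a CLARIN resource.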

I gave my suggestion in a previous e-mail, but I know that it is not 
ideal and may not solve all issues.

Cheers

Thorsten





On 02.10.15 at 10:33, Pavel Stranak wrote:
> Dear Thorsten and all,
>
> thanks for using our example. Let me add a few clarifications.
>
>> On 01 Oct 2015, at 21:44, Thorsten Trippel <thorsten.trippel at uni-tuebingen.de> wrote:
>>
>> Dear Tomaž and all,
>>
>> you are right, of course. Let's stick to the lindat example. Lindat uses handles and the metadata states the correct handle. But the URL to cite should not be the lindat site but the handle.
>
> Which it clearly is. If you copy and paste the citation, you use the actionable (URLified) handle, which is incidentally the currently recommended form. Nowhere do we recommend using other URLs for citations.
>
>> And of course a resolver can then easily find out that the handle belongs to the lindat repository. But if you want to find every citation of the resource you will have to look at every handle and find out if the handle is part of lindat (or some other CLARIN centre).
>
> The repository is also mentioned explicitly in the citation text, following the Force 11 Joint Declaration of Data Citation Principles examples (see below).
>
>> Now it would be easy(-ier) if each CLARIN centre used its own handle prefix.
>
> Which we do. Feel free to resolve the handle in the example with the parameter "Don't Redirect to URLs" at http://hdl.handle.net. In fact we use two prefixes: one for the Clarin LRT Inventory records, one for our national resources. The reason is sustainability: being able to transfer the LRT Inventory to another centre, including the PID management (the whole prefix).
>
>> In this case we could just search for the hdl:PREFIX (for every CLARIN centre). However this would still not be too obvious, non-CLARIN-ingroup persons would not be able to see that it is a CLARIN resource.
>
> The "hdl:" format is not really recommended for citations AFAIK. I remember Larry Lannom explaining that the reason is that the scheme never got supported by browsers. Sounds sensible to me. Even though the URLified form is longer, we follow that recommendation.
>
>>
>> I think there are two purposes for citations of resources:
>
> I actually think there are a few more.
>
> 1) Giving credit and citations to resource creators.
>
> 2) Ensuring replicability of results. That requires citing resources directly via their PIDs (and proper versioning of the resources with new PIDs).
>
> Those would be the most prominent reasons usually mentioned. For more see the Force 11 declaration: https://www.force11.org/group/joint-declaration-data-citation-principles-final
>
>>
>> 1. automatic counting: a crawler (or Google) looks at references and counts the number of references to CLARIN resources. If only the handle is cited, this crawler would have to know all CLARIN centres' prefixes and look at all handles, or resolve all handles, to see which of them point to CLARIN repositories. Though this would be technically possible, it may involve some work and performance might be tricky.
>>
>> 2. making CLARIN resources visible to humans: handles are persistent, but for a human it is impossible to see a relation to CLARIN. These readers will not use a resolver to find out that it is a CLARIN resource. So we need to find a different way of attributing it to CLARIN somehow. If I say CLARIN, I would of course also include national consortia.
>>
>>
>> One more thing about DOIs: DOIs are handles (!), see https://www.doi.org/factsheets/DOIHandle.html
>
> Of course they are; this has been recognised by Clarin since the beginning of the preparation of the PID policy. Any Clarin centre is free to use DOIs as their PIDs.
>
> To sum up, we have tried to make the format of our citation conform to the Force 11 declaration mentioned above (and endorsed by Clarin ERIC: https://www.force11.org/datacitation/endorsements) and to the RDA working group for Data Citation. We followed the examples published by Force 11: https://www.force11.org/node/4771, but we are of course really open to rational criticism and further improvements.
>
> Best,
> Pavel
>


-- 
----------------------------------------------------------------------------
///////// Dr. Thorsten Trippel   thorsten.trippel at uni-tuebingen.de
    //     Seminar für Sprachwissenschaft
   //  //  Eberhard-Karls-Universität Tübingen
  //  //   Office:  Wilhelmstr. 19 #2.17
     //    Phone:   +49 (0)7071-29-77352
///////// Federal Republic of Germany
-----------------------------------------------------------------------------

