[Userinvolvement] Supporting citation

Thorsten Trippel thorsten.trippel at uni-tuebingen.de
Fri Oct 2 13:34:04 CEST 2015

Hi all,

thanks for the constructive discussion, and thanks for pointing out "the 
obvious": that is exactly what I mean, not all repositories have a box like 
that. LINDAT is a positive example of how to help users cite their data.

Looking at

"Branco, António, 2014, Nexing Corpus, LINDAT/CLARIN digital library at 
Institute of Formal and Applied Linguistics, Charles University in 
Prague, http://hdl.handle.net/11372/LRT-386."
shows a couple of implications:

1. including the name of the repository: if the repository name contains 
LINDAT/CLARIN (here: LINDAT/CLARIN digital library at Institute of 
Formal and Applied Linguistics) then it is easy to spot. Hence it 
seems advisable to select a repository name that contains "CLARIN".

2. the handle is an actionable PID; even users unfamiliar with the 
handle system will recognise it as "some kind of website", and we can 
cross our fingers that they will use it.

3. the part "LINDAT/CLARIN digital library at Institute of Formal and 
Applied Linguistics" is not in the CMDI file, but is obviously hard 
coded in the citation box generation routine. I wonder if this is 
physically the same instance as the LINDAT centre or if the CLARIN 
centre and the library can be independent. The name is not the same as 
in the centre registry, as far as I can tell.

4. the generated BibTeX entry
  title = {Nexing Corpus},
  url = {http://hdl.handle.net/11372/LRT-386},
  note = {{LINDAT}/{CLARIN} digital library at Institute of Formal and 
Applied Linguistics, Charles University in Prague},
  year = {2014} }

does not contain the reference to António, who is coded as 
dc.contributor.other on the website and olac:contributor in the LINDAT 
CMDI. BibTeX only allows authors or editors, so he could only go into a 
note (sorry António if you read this...)

5. If I understand you, @Pavel and @Martin, correctly, your suggestion for 
a citation is

[AUTHOR|Editor|contributor], Publication year, [Resource Title|Resource 
Name|DC Title], Name of repository, institution of repository, PID.

6. My suggestion would be slightly different:

The name of the repository would be replaced by: "distributed by the 
CLARIN centre [name of the CLARIN centre as in the centre registry]"

advantage: standard and consistent naming with the CLARIN centre name, 
so even those centres that do not have CLARIN in their name (Huygens, 
Meertens, IDS, MPI, CMU, National Library of Norway, Oxford Text Archive, 
to name some prominent examples) would be CLARINified in the reference; 
also, if one centre operates more than one repository, the citation would 
still reference the centre, not the repository.

disadvantage: duplication for those centres that already have CLARIN in 
their name: "distributed by the CLARIN centre CLARIN Norway|CLARIN-PL 
Language Technology Centre...."
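
To make the two citation patterns from points 5 and 6 concrete, here is a 
minimal Python sketch. It is only an illustration, not what LINDAT or any 
other repository actually runs: the function names, the metadata keys 
("author", "editor", "contributor") and the centre name placeholder are 
assumptions of mine.

```python
def first_available(record, *keys):
    """Return the first non-empty value among the given metadata keys.

    Models the "[AUTHOR|Editor|contributor]" fallback; the key names
    are hypothetical and not taken from CMDI or Dublin Core.
    """
    for key in keys:
        value = record.get(key)
        if value:
            return value
    return None


def format_citation(creator, year, title, pid,
                    repository=None, institution=None, centre=None):
    """Build a plain-text citation string.

    If `centre` is given, use the "distributed by the CLARIN centre ..."
    wording (point 6); otherwise use repository name and institution
    (point 5).
    """
    if centre:
        source = f"distributed by the CLARIN centre {centre}"
    else:
        source = ", ".join(p for p in (repository, institution) if p)
    fields = [creator, str(year), title, source, pid]
    return ", ".join(f for f in fields if f) + "."


# Example record: only a contributor is known, no author or editor.
record = {"author": None, "editor": None, "contributor": "Branco, António"}
creator = first_available(record, "author", "editor", "contributor")

# Point 5: repository name plus institution.
print(format_citation(
    creator, 2014, "Nexing Corpus", "http://hdl.handle.net/11372/LRT-386",
    repository="LINDAT/CLARIN digital library at Institute of Formal "
               "and Applied Linguistics",
    institution="Charles University in Prague"))

# Point 6: centre name as in the centre registry (placeholder here).
print(format_citation(
    creator, 2014, "Nexing Corpus", "http://hdl.handle.net/11372/LRT-386",
    centre="<name as in the centre registry>"))
```

With the repository and institution filled in, the first call reproduces 
the text from the LINDAT citation box; the second shows how the same record 
would read with the centre-based wording.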

However, I think that we are getting to some ideas on how to cite data 
in CLARIN, and these will help the Copenhagen-group to count such 
publications in their impact counts. The more consistent we are, the 
easier it will be for users working with resources from different 
institutions and/or for those crawling for citations.
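
As a rough illustration of why consistency helps those crawling for 
citations, the following Python sketch looks for the two signals discussed 
in this thread: a handle URL and the word "CLARIN". The regular expression 
and the function name are my own assumptions, not an existing crawler.

```python
import re

# Matches handle proxy URLs such as http://hdl.handle.net/11372/LRT-386.
# This pattern is an assumption for illustration; it is not an official
# definition of handle syntax.
HANDLE_RE = re.compile(r"https?://hdl\.handle\.net/[^\s\"'<>]+")


def find_citation_signals(text):
    """Return the handle URLs found in `text` and whether "CLARIN" occurs."""
    # Strip trailing punctuation that belongs to the sentence, not the PID.
    handles = [h.rstrip(".,;)") for h in HANDLE_RE.findall(text)]
    return {"handles": handles, "mentions_clarin": "CLARIN" in text}


reference = ("Branco, António, 2014, Nexing Corpus, LINDAT/CLARIN digital "
             "library at Institute of Formal and Applied Linguistics, "
             "Charles University in Prague, "
             "http://hdl.handle.net/11372/LRT-386.")
print(find_citation_signals(reference))
```

A reference that follows either of the proposed patterns would yield both 
signals; a free-form mention of the corpus without handle or centre name 
would yield neither.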

@Martin, how should we proceed from here? Of course we should wait for 
"the others" to contribute, but should we forward this as a suggestion to 
the standards committee? And to the centres committee, suggesting that the 
"citation box" be integrated by all repositories?



On 02.10.15 at 12:41, Martin Wynne wrote:
> On 02/10/15 10:48, Thorsten Trippel wrote:
>> 1. What would you answer to a user coming to you saying: "I used
>> António Branco's Nexing Corpus, which I found at
>> https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-386. It
>> has the PID http://hdl.handle.net/11372/LRT-386 and seems to have been
>> published 2014. I don't know how to cite this. Bibtex doesn't have a
>> @languageresource type. How do I cite it? How do I do this in
>> Microsoft Word?"
> As it prominently states on the above web page:
> "Please use the following text to cite this item or export to a
> predefined format:
> Branco, António, 2014, Nexing Corpus, LINDAT/CLARIN digital library at
> Institute of Formal and Applied Linguistics, Charles University in
> Prague, http://hdl.handle.net/11372/LRT-386."
> I don't think that could be any clearer or easier to paste into a
> document, and it contains the word "CLARIN", which a crawler could find,
> and it contains the handle, so it seems to tick all of the boxes,
> doesn't it? Or am I missing something?
> Martin

///////// Dr. Thorsten Trippel   thorsten.trippel at uni-tuebingen.de
    //     Seminar für Sprachwissenschaft
   //  //  Eberhard-Karls-Universität Tübingen
  //  //   Office:  Wilhelmstr. 19 #2.17
     //    Phone:   +49 (0)7071-29-77352
///////// Federal Republic of Germany
