[Userinvolvement] Supporting citation

Pavel Stranak stranak at ufal.mff.cuni.cz
Mon Oct 5 14:40:07 CEST 2015


Hi,

nobody else reacted, so allow me to add a little. I promise to be brief :-)

> On 2. 10. 2015, at 13:34, Thorsten Trippel <thorsten.trippel at uni-tuebingen.de> wrote:
> 
> Hi all,
> 
> thanks for the constructive discussion. And thanks for pointing out "the obvious": that is what I meant, not all repositories have a box like that. LINDAT is a positive example of how to help users cite their data.

I agree with this main point of Martin's first message. I also have no better idea than providing this type of citation/sharing mechanism: a formatted citation text, a choice of export formats, etc. We can improve on this, however.

> Looking at
> 
> "Branco, António, 2014, Nexing Corpus, LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague, http://hdl.handle.net/11372/LRT-386."
> shows a couple of implications:
> 
> 1. including the name of the repository: if the repository name contains LINDAT/CLARIN (here: LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics), it is easy to spot. Hence it seems advisable to select a repository name that contains it.

> 2. the handle is an actionable PID; even users unfamiliar with the handle system will recognize it as "some kind of website", and we can cross our fingers that they will use it.
> 
> 3. the part "LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics" is not in the CMDI file, but is obviously hard coded in the citation box generation routine. I wonder if this is physically the same instance as the LINDAT centre or if the CLARIN centre and the library can be independent. The name is not the same as in the centre registry, as far as I can tell.

The general idea follows the Force11 example: https://www.force11.org/node/4771. We can certainly discuss a unified way of naming our repositories. You are correct that the "repository name" does not come from the metadata; it is configured (not hardcoded) in a repository config file.

> 4. the generated BibTeX entry
> @misc{11372/LRT-386,
> title = {Nexing Corpus},
> url = {http://hdl.handle.net/11372/LRT-386},
> note = {{LINDAT}/{CLARIN} digital library at Institute of Formal and Applied Linguistics, Charles University in Prague},
> year = {2014} }
> 
> does not contain the reference to Antonio, who is coded as dc.contributor.other on the website and olac:contributor in the LINDAT CMDI. BibTeX only allows authors or editors, so he could only become a note (sorry Antonio if you read this...)

Nice catch, this is our bug. We map dc.contributor.other to "Authors" in other places, but not in the BibTeX export: https://github.com/ufal/lindat-dspace/issues/393

(Why Antonio is listed as "dc.contributor.other" and not "dc.contributor.author" is another issue, probably something to do with the origin of these records in the old LRT Inventory on the original CLARIN website.)
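
To illustrate the kind of fallback the BibTeX export needs, here is a minimal sketch in Python. It is not the actual lindat-dspace code: the flat record layout, the "key", "year" and "repository_name" keys, and the function name are assumptions made up for this example. It also skips the brace protection ({LINDAT}/{CLARIN}) that the real export applies to the note.

```python
def bibtex_entry(record):
    """Build a @misc BibTeX entry from a flat metadata dict (hypothetical layout)."""
    # Prefer real authors; fall back to "other" contributors so that a
    # record like the Nexing Corpus does not lose its creator.
    people = (record.get("dc.contributor.author")
              or record.get("dc.contributor.other")
              or [])
    fields = []
    if people:
        # BibTeX separates multiple names with " and ".
        fields.append(("author", " and ".join(people)))
    fields.append(("title", record["dc.title"]))
    fields.append(("url", record["dc.identifier.uri"]))
    fields.append(("note", record["repository_name"]))
    fields.append(("year", record["year"]))
    body = ",\n".join(f"  {key} = {{{value}}}" for key, value in fields)
    return f"@misc{{{record['key']},\n{body}\n}}"


# Sample record assembled from the values quoted in this thread.
example = {
    "key": "11372/LRT-386",
    "dc.title": "Nexing Corpus",
    "dc.contributor.other": ["Branco, António"],
    "dc.identifier.uri": "http://hdl.handle.net/11372/LRT-386",
    "repository_name": ("LINDAT/CLARIN digital library at Institute of Formal "
                        "and Applied Linguistics, Charles University in Prague"),
    "year": "2014",
}
print(bibtex_entry(example))
```

With the sample record, the entry gains an author field even though the record only has a dc.contributor.other value.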

> 5. If I understand you, @Pavel and @Martin, correctly, your suggestion for citation is
> 
> [AUTHOR|Editor|contributor], Publication year, [Resource Title|Resource Name|DC Title], Name of repository, institution of repository, PID.

Close. The Force11 recommendation (see above) is: 
Author(s), Year, Dataset Title, Data Repository or Archive, Version, Global Persistent Identifier. 

The apparent "Institution" is just part of how we have configured our "Repository name", nothing else.
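
As a rough sketch of that element order (a hypothetical helper, not our repository code), the citation text could be assembled like this in Python. The function name, the parameter names and the "; " separator between multiple authors are my assumptions; the version element is simply left out when there is none.

```python
def format_citation(authors, year, title, repository, pid, version=None):
    """Join the citation elements in the Force11 order quoted above."""
    parts = ["; ".join(authors), str(year), title, repository]
    if version:
        # Version is optional; omit it when the record has none.
        parts.append(version)
    parts.append(pid)
    return ", ".join(parts) + "."


citation = format_citation(
    authors=["Branco, António"],
    year=2014,
    title="Nexing Corpus",
    repository=("LINDAT/CLARIN digital library at Institute of Formal and "
                "Applied Linguistics, Charles University in Prague"),
    pid="http://hdl.handle.net/11372/LRT-386",
)
print(citation)
```

For the Nexing Corpus values this reproduces the citation text quoted earlier in the thread.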
 
> 6. My suggestion would be slightly different:
> 
> The name of the repository would be replaced by: "distributed by the CLARIN centre [name of the CLARIN centre as in the centre registry]"

Certainly possible. I agree that we should settle on how we do this "archive attribution".

> advantage: standard and consistent naming with the CLARIN centre name, so even those centres that do not have CLARIN in their name (Huygens, Meertens, IDS, MPI, CMU; National Library of Norway, Oxford Text Archive to name some prominent examples) would be CLARINified in the reference; also if one centre operates more than one repository, it would still reference the centre, not the repository.

"Clarinification" of Oxford Text Archive, National Library of Norway, etc. is an advantage for CLARIN. But this looks like something national coordinators should decide. 

I see 3 reasonable naming options:
- <centre name>
- the CLARIN centre <centre name> 
- the <national> CLARIN centre <centre name>
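
If this ends up as a configuration choice in the repositories, the three options could be sketched like this in Python. The function name, the style labels and the "Czech"/"LINDAT" values are only placeholders for illustration, not an agreed convention.

```python
def archive_attribution(centre_name, style="plain", national=None):
    """Render the archive attribution in one of the three naming options."""
    if style == "plain":
        return centre_name
    if style == "clarin":
        return f"the CLARIN centre {centre_name}"
    if style == "national":
        if not national:
            raise ValueError("the 'national' style needs a national adjective")
        return f"the {national} CLARIN centre {centre_name}"
    raise ValueError(f"unknown style: {style}")


print(archive_attribution("Oxford Text Archive"))
print(archive_attribution("Oxford Text Archive", style="clarin"))
print(archive_attribution("LINDAT", style="national", national="Czech"))
```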

> disadvantage: Duplication for those who already have it: "distributed by the CLARIN centre CLARIN Norway|CLARIN-PL Language Technology Centre...."

Those are consortia, not centres, I think. So maybe it would actually be OK.

Pavel


> However, I think that we are getting to some ideas on how to cite data in CLARIN, and these will help the Copenhagen group count such publications in their impact counts. The more consistent we are, the easier it will be for users utilizing resources from different institutions and/or those crawling for citations.
> 
> @Martin, how should we proceed from here? Of course waiting for "the others" to contribute, but should we forward this as a suggestion to the standards committee? And the centres committee to suggest that the "Citation box" could be integrated by all repositories?
> 
> Cheers
> 
> Thorsten


