[Standards] Fwd: Invitation to join the CLARIN Resource Families Taskforce

Piotr Banski banski at ids-mannheim.de
Wed Mar 3 13:27:04 CET 2021


Dear all,

We looked at this invitation briefly at the last meeting (with
Dieter presenting its context and aims) and decided that I would
forward Darja's message to the mailing list, asking whether any
member would be willing to take on the responsibility of serving on
the taskforce.

Note that the deadline for registering interest is March 10th. If we
don't come up with a representative by then, I will participate in
the 1st meeting but hope to cede this position ASAP, because, in
the coming two years or so, I simply won't be able to allocate any
productive time to this taskforce on a regular basis.

Please note that a major keyword below is "metadata" followed by 
(roughly speaking) "interoperability". This promises to be an extremely 
important initiative with high impact across CLARIN.

Please kindly register your interest by e-mailing this list, and let's 
see where that leads us.

Thanks in advance and best regards,

    Piotr



-------- Forwarded Message --------

Dear all,

you are probably all familiar with the successful CLARIN Resource
Families initiative. In the initial stage, we have prepared some 
overviews of resource and tool families as well as analyzed some 
metadata issues. We have also carried out some dissemination and 
awareness raising activities. In the second stage of the initiative, we
would like to go beyond surveys and also develop and implement best
practices as well as tackle a more comprehensive evaluation of the
resource and tool families, which cannot be done by a single person.
This is why we have proposed to the BoD to form a special
taskforce with members from all the relevant CLARIN bodies:

- Technical Director (Dieter van Uytvanck)
- Director of User Involvement (Francesca Frontini)
- a SCCTC representative (they select a candidate from their members)
- a representative of the Standards Committee (they select a candidate 
from their members)
- a KSIC representative (they select a candidate from their members)
- a representative of the UI Committee (they select a candidate from 
their members)
- CRF Officer (Jakob)

The taskforce would work on the following tasks:

- Discuss models and incentives to maximally curate the identified
metadata issues with CLARIN resources and tools which have been
collected and published on GitHub
- Develop and implement preventive measures which will minimize the 
number of metadata issues for any newly deposited resources and tools in 
the future in order to ensure high quality future deposits and minimize 
issues we have encountered in the past as well as avoid post-hoc 
curation requirements
- Develop a user-friendly best practice guide for depositing resources
and tools that takes into account the above-mentioned preventive measures
- Perform a gap analysis for the CRF initiative -> to inform future CRF work
- Evaluate the resource and tool families from a more qualitative 
perspective, taking not only into account the availability or absence of 
metadata, but also in which way metadata are reported for the observed 
categories. For instance, it is often the case that resource size is 
reported in broadly different ways even for resources in the same 
resource family. For example, where some corpora in the same family 
report size in terms of sentences, others report it in terms of tokens, 
words, hours or file size, which hinders the cross-comparability
of the resources. In addition to different measurement units for size, 
certain resources specify their annotation layers very imprecisely, 
using vague descriptors such as “multitagged” or “tagged with lexical 
relations” without further qualifying them, which is far from 
informative and user-friendly. In the case of licence information, some 
tools and resources list unhelpful values such as an “other” licence.
- Promote a greater inclusion of key publications describing the tools 
and resources. Listing key publications is not only important from the 
perspective of ensuring and facilitating author attribution, but the 
publications themselves generally provide the most detailed descriptions 
of the resources and tools, thereby crucially complementing the metadata 
presented in CLARIN repositories with documentation that enables
appropriate reuse of resources and tools as well as interpretation of 
research results.
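
To illustrate the kind of qualitative check described in the evaluation
item above, here is a minimal sketch in Python. It assumes metadata
records are available as simple dictionaries; the field names
(`family`, `size_unit`, `licence`), the sample records and the list of
uninformative licence values are all hypothetical and are not taken
from any actual CLARIN schema or repository.

```python
from collections import defaultdict

# Hypothetical sample records; field names and values are illustrative
# assumptions, not real CLARIN metadata.
records = [
    {"name": "corpus-a", "family": "parliamentary", "size_unit": "tokens", "licence": "CC BY 4.0"},
    {"name": "corpus-b", "family": "parliamentary", "size_unit": "sentences", "licence": "other"},
    {"name": "corpus-c", "family": "spoken", "size_unit": "hours", "licence": "CC BY-SA 4.0"},
]

# Assumed set of licence values treated as uninformative.
VAGUE_LICENCES = {"", "other", "unknown"}


def mixed_size_units(records):
    """Return {family: sorted units} for families that report size in more than one unit."""
    units = defaultdict(set)
    for rec in records:
        units[rec["family"]].add(rec["size_unit"])
    return {family: sorted(found) for family, found in units.items() if len(found) > 1}


def vague_licences(records):
    """Return the names of records whose licence value is uninformative."""
    return [rec["name"] for rec in records if rec["licence"].strip().lower() in VAGUE_LICENCES]


print(mixed_size_units(records))
print(vague_licences(records))
```

A report of this kind would only flag candidates for curation; deciding
on a common unit or a proper licence value would remain a manual step.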

Work dynamics:

The taskforce is to meet virtually for the time being and, once travel
is possible again, also F2F if required/desired. Participation in the
taskforce is not financially remunerated, but travel expenses will be
reimbursed. For the moment, the taskforce is planned to run until the
end of 2021; after the final report, the BoD will decide whether they
wish to prolong the initiative.

I attach the CLARIN Resource Families workplan for 2021 which also
includes this taskforce. It would be great if the committees could 
inform us of the candidates they propose to represent their committee by 
10 March so that we could officially launch the taskforce and send an
invitation for the first meeting by 15 March.

If you have any questions or comments, do not hesitate to get in touch.

Best,

Darja Fišer


-------------- next part --------------
A non-text attachment was scrubbed...
Name: CE-2020-1724-Resource-Families-Workplan-2021-4.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 50334 bytes
Desc: not available
URL: <http://lists.clarin.eu/pipermail/standards/attachments/20210303/64dca811/attachment-0001.docx>

