[Userinvolvement] Fwd: Invitation to join the CLARIN Resource Families Taskforce
Fišer, Darja
Darja.Fiser at ff.uni-lj.si
Thu Feb 25 11:36:17 CET 2021
Dear all,
you are probably all familiar with the successful CLARIN Resource Families initiative. In the initial stage, we prepared overviews of resource and tool families and analyzed some metadata issues. We also carried out dissemination and awareness-raising activities. In the second stage of the initiative, we would like to go beyond surveys: to develop and implement best practices and to undertake a more comprehensive evaluation of the resource and tool families, which cannot be done well by one or two people alone. This is why the BoD has agreed to form a special taskforce with members from all the relevant CLARIN bodies:
- Technical Director (Dieter van Uytvanck)
- Director of User Involvement (Francesca Frontini)
- a SCCTC representative (they select a candidate from their members)
- a representative of the Standards Committee (they select a candidate from their members)
- a KSIC representative (they select a candidate from their members)
- a representative of the UI Committee (they select a candidate from their members)
- CRF Officer (Jakob)
The taskforce would work on the following tasks:
- Discuss models and incentives for curating as many as possible of the identified metadata issues with CLARIN resources and tools, which have been collected and published on GitHub
- Develop and implement preventive measures that minimize metadata issues in newly deposited resources and tools, in order to ensure high-quality future deposits and avoid the need for post-hoc curation
- Develop a user-friendly best practice guide for depositing resources and tools that takes into account the above-mentioned preventive measures
- Perform a gap analysis for the CRF initiative -> to inform future CRF work
- Evaluate the resource and tool families from a more qualitative perspective, taking into account not only the availability or absence of metadata but also the way in which metadata are reported for the observed categories. For instance, resource size is often reported in widely different ways even for resources in the same resource family: where some corpora report size in terms of sentences, others report it in terms of tokens, words, hours or file size, which hinders the cross-comparability of the resources. In addition to using different measurement units for size, certain resources specify their annotation layers very imprecisely, using vague descriptors such as “multitagged” or “tagged with lexical relations” without further qualifying them, which is far from informative and user-friendly. In the case of licence information, some tools and resources list unhelpful values such as an “other” licence.
- Promote a greater inclusion of key publications describing the tools and resources. Listing key publications is important not only for ensuring and facilitating author attribution but also because the publications themselves generally provide the most detailed descriptions of the resources and tools, thereby crucially complementing the metadata presented in CLARIN repositories with documentation that enables appropriate reuse of resources and tools as well as interpretation of research results.
Work dynamics:
The taskforce will meet virtually for the time being and, once travel is possible again, also face to face if required or desired. Participation in the taskforce is not financially remunerated, but travel expenses will be reimbursed. The taskforce is currently planned to run until the end of 2021; after the final report, the BoD will decide whether to prolong the initiative.
I attach the CLARIN Resource Families workplan for 2021, which also includes this taskforce.
We would like to invite all members of the User Involvement Committee who wish to serve on the taskforce to express their interest by 4 March. If we receive multiple (self-)nominations, we will organize a selection procedure by 10 March.
If you have any questions or comments, do not hesitate to get in touch.
Best,
Darja Fišer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CE-2020-1724-Resource-Families-Workplan-2021-4.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 50334 bytes
Desc: CE-2020-1724-Resource-Families-Workplan-2021-4.docx
URL: <http://lists.clarin.eu/pipermail/userinvolvement/attachments/20210225/51b4f6d0/attachment-0001.docx>