[Standards] Fwd: Invitation to join the CLARIN Resource Families Taskforce
Piotr Banski
banski at ids-mannheim.de
Wed Mar 3 13:27:04 CET 2021
Dear all,
We looked at this invitation briefly at the last meeting (with
Dieter presenting its context and aims) and decided that I would
forward Darja's message to the mailing list, asking whether
any member would be willing to take on the responsibility of serving at
Note that the deadline for registering interest is March 10th. If we
don't come up with a representative by then, I will participate in
the 1st meeting but hope to cede this position ASAP, because, in
the coming two years or so, I simply won't be able to allocate any
productive time to this taskforce on a regular basis.
Please note that a major keyword below is "metadata" followed by
(roughly speaking) "interoperability". This promises to be an extremely
important initiative with high impact across CLARIN.
Please kindly register your interest by e-mailing this list, and let's
see where that leads us.
Thanks in advance and best regards,
Piotr
-------- Forwarded Message --------
Dear all,
you are probably all familiar with the successful CLARIN Resource
Families initiative. In the initial stage, we have prepared some
overviews of resource and tool families as well as analyzed some
metadata issues. We have also carried out some dissemination and
awareness-raising activities. In the second stage of the initiative, we
would like to go beyond surveys: to develop and implement best
practices and to tackle a more comprehensive evaluation of the resource
and tool families, which cannot be done by a single person. This is why
we have proposed to the BoD to form a special taskforce with members
from all the relevant CLARIN bodies:
- Technical Director (Dieter van Uytvanck)
- Director of User Involvement (Francesca Frontini)
- a SCCTC representative (they select a candidate from their members)
- a representative of the Standards Committee (they select a candidate
from their members)
- a KSIC representative (they select a candidate from their members)
- a representative of the UI Committee (they select a candidate from
their members)
- CRF Officer (Jakob)
The taskforce would work on the following tasks:
- Discuss models and incentives to curate as fully as possible the
identified metadata issues with CLARIN resources and tools, which have
been collected and published on GitHub
- Develop and implement preventive measures which will minimize the
number of metadata issues for any newly deposited resources and tools,
in order to ensure high-quality future deposits, avoid repeating the
issues we have encountered in the past and avoid post-hoc curation
requirements
- Develop a user-friendly best practice guide for depositing resources
and tools that takes into account the above-mentioned preventive measures
- Perform a gap analysis for the CRF initiative -> to inform future CRF work
- Evaluate the resource and tool families from a more qualitative
perspective, taking into account not only the availability or absence of
metadata, but also the way in which metadata are reported for the
observed categories. For instance, resource size is often reported in
widely different ways even within the same resource family: where some
corpora report size in terms of sentences, others report it in terms of
tokens, words, hours or file size, which hinders the cross-comparability
of the resources. In addition to different measurement units for size,
certain resources specify their annotation layers very imprecisely,
using vague descriptors such as “multitagged” or “tagged with lexical
relations” without further qualifying them, which is far from
informative and user-friendly. In the case of licence information, some
tools and resources list unhelpful values such as an “other” licence.
- Promote a greater inclusion of key publications describing the tools
and resources. Listing key publications is not only important from the
perspective of ensuring and facilitating author attribution, but the
publications themselves generally provide the most detailed descriptions
of the resources and tools, thereby crucially complementing the metadata
presented in CLARIN repositories with documentation that enables
appropriate reuse of resources and tools as well as interpretation of
research results.
Work dynamics:
The taskforce will meet virtually for the time being and, once travel
is possible again, also F2F if required or desired. Participation in the
taskforce is not financially remunerated, but travel expenses will be
reimbursed. The taskforce is currently planned to run until the end of
2021; after the final report, the BoD will decide whether to prolong
the initiative.
I attach the CLARIN Resource Families workplan for 2021, which also
includes this taskforce. It would be great if the committees could
inform us of the candidates they propose to represent their committee by
10 March so that we could officially launch the committee and send an
invitation for the first meeting by 15 March.
If you have any questions or comments, do not hesitate to get in touch.
Best,
Darja Fišer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CE-2020-1724-Resource-Families-Workplan-2021-4.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 50334 bytes
Desc: not available
URL: <http://lists.clarin.eu/pipermail/standards/attachments/20210303/64dca811/attachment-0001.docx>