[Userinvolvement] CLARIN Tool Families: Overview of Named Entity Recognizers in the CLARIN infrastructure

Lenardič, Jakob Jakob.Lenardic at ff.uni-lj.si
Tue Nov 19 15:16:15 CET 2019


Dear Pavel,


sorry for the late reply.


Darja and I suggest not duplicating the lines, but instead adding a note to the functionalities category specifying which entities each model recognizes.


You say that there are no training data for English, but could you nevertheless tell me which categories are recognized in English? I've added a note to the relevant cell in the spreadsheet if you want to input this directly.


Best,
Jakob


Univerza v Ljubljani
Filozofska fakulteta    asist. Jakob Lenardič


Oddelek za prevajalstvo / Department of translation

Filozofska fakulteta / Faculty of arts

Aškerčeva cesta 2, SI-1000 Ljubljana, Slovenija / Slovenia
T.: 241-1143
Jakob.Lenardic at ff.uni-lj.si, www.ff.uni-lj.si


________________________________
From: Pavel Stranak <stranak at ufal.mff.cuni.cz>
Sent: Friday, November 8, 2019 1:18 PM
To: Fišer, Darja
Cc: userinvolvement at lists.clarin.eu; Lenardič, Jakob
Subject: Re: [Userinvolvement] CLARIN Tool Families: Overview of Named Entity Recognizers in the CLARIN infrastructure

Hi Darja and all,

I have made a quick check of our NameTag information and I see 2 issues:
- We provide models for Czech and English; both are available online, so I added English to your list.
- The "NER categories" column: at least for our system, the recognised entities (types and sub-types) are a feature of the model (and ultimately of the training corpus), not of the tool. This is specified in detail in the NameTag documentation that you link to. In short, the models 'czech-cnec2.0-<version>', for instance, recognise the schema of entities described here (see http://ufal.mff.cuni.cz/~strakova/cnec2.0/ne-type-hierarchy.pdf). For the English models the schema is very different, because there is no training dataset with this detailed classification of entities.

So should we duplicate the tool's line for each model? I don't see another way to fill it in. The categories must be per model, at least for NameTag.

Best,
Pavel




On 7 Nov 2019, at 17:31, Fišer, Darja <Darja.Fiser at ff.uni-lj.si> wrote:

Dear all,

Jakob and I have started the second tool survey with which we will expand the CLARIN Resource Families initiative. It is for tools for Named Entity Recognition that are provided by national CLARIN consortia. Please find the spreadsheet where we’re collecting the data below:

https://docs.google.com/spreadsheets/d/1W7Yv-HMUt0LGsK19btJ_wMK-fmtXWTt0jmSq7OKnrcc/edit?usp=sharing

We have already provided input for three tools (NameTag by LINDAT, janes-ner by CLARIN.SI, and PolDeepNer by CLARIN-PL) to show what kind of information we are looking for with this survey, so please use these entries as a template for providing information on NER tools for your consortium. Unlike in previous surveys of resources, we did not search for the tools of all CLARIN countries ourselves, since tools are currently less likely to be discoverable through CLARIN repositories. We kindly request your input by 28 November.

Best,
Darja and Jakob

Univerza v Ljubljani
Filozofska fakulteta    Assoc. Prof. dr. Darja Fišer


Oddelek za prevajalstvo / Department of translation

Filozofska fakulteta / Faculty of arts

Aškerčeva cesta 2, SI-1000 Ljubljana, Slovenija / Slovenia

darja.fiser at ff.uni-lj.si, www.ff.uni-lj.si





