[Userinvolvement] Overview of manually annotated text corpora

Pavel Stranak stranak at ufal.mff.cuni.cz
Thu Jan 31 13:57:26 CET 2019

Dear Darja,

please see the attached list of missing treebanks and other missing information. For now we have added corrections only to section 2. In a second round we would also correct section 3 (once it changes based on the changes to Sec. 2).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corrections.odt
Type: application/vnd.oasis.opendocument.text
Size: 29474 bytes
Desc: not available
URL: <https://lists.clarin.eu/pipermail/userinvolvement/attachments/20190131/987bda77/attachment-0001.odt>
-------------- next part --------------

Some treebanks that are not "ours" but that we have in our treebank search (https://lindat.mff.cuni.cz/services/pmltq/) are listed below; they are not in the report. We think they are of CLARIN origin and perhaps should be included in the report, but we are not sure.
	- Index Thomisticus Treebank: it is manually annotated (https://itreebank.marginalia.it/view/projet.php)
	- Latin Dependency Treebank: the same as for ITTB (https://perseusdl.github.io/treebank_data/)
	- Bulgarian Treebank (HPSG-based Syntactic Treebank of Bulgarian): only the Morphologically Annotated Part of BulTreeBank is mentioned in the report, and it is not clear whether this HPSG treebank is considered manual or not.
	- LVTB - Latvian dependency-constituency treebank v 2.3: we couldn't find any information except that it's the same data as in UD (maybe it's the original annotation); also, there is no Latvian in the report.
	- Swedish Treebank from Talbanken: it looks like the same data as in UD_Swedish-Talbanken, probably originally manual, but again there is no explicit info.


> On 23 Jan 2019, at 19:00, Fišer, Darja <Darja.Fiser at ff.uni-lj.si> wrote:
> Dear all,
> I'm happy to share the draft report of the manually annotated text corpora in the CLARIN infrastructure. If you see anything that needs to be improved or added, please let us know:
> CE-2019-1384-Manually-annotated-corpora-report.pdf
> CE-2019-1384-Manually-annotated-corpora-report.docx 
> We will soon be adding the overview to our webpage as well.
> Best,
> Darja Fišer
> Univerza v Ljubljani
> Filozofska fakulteta	doc. dr. Darja Fišer, Assistant Professor 
> http://lojze.lugos.si/darja/
> Oddelek za prevajalstvo / Department of translation
> Filozofska fakulteta / Faculty of arts
> Aškerčeva cesta 2, SI-1000 Ljubljana, Slovenija / Slovenia
> darja.fiser at ff.uni-lj.si, www.ff.uni-lj.si
> _______________________________________________
> Userinvolvement mailing list
> Userinvolvement at lists.clarin.eu
> https://lists.clarin.eu/cgi-bin/mailman/listinfo/userinvolvement
