[Dev] retrieving the CQL endpoints from the center registry

Dieter Van Uytvanck dieter.vanuytvanck at mpi.nl
Tue Aug 14 17:47:43 CEST 2012


On 14/8/12 17:14 , Thomas Zastrow wrote:
> a) At the moment, it seems so that every endpoint represents one
> corpus.

No, that is not completely correct. See e.g.
http://trac.clarin.eu/wiki/RepositoryRegistry#Listofcorporaperendpoint.

> b) How can I get the language of the resource from the center
> registry?

That's not possible at this point. But we need this information, so
several approaches are possible:

- hard-code it for now (based on the trac page)

- have it in the scan response, see
http://trac.clarin.eu/wiki/FCS-specification#Scan

- extract it from the VLO (but that means we need a close connection
between the aggregator and the VLO, which might be good in the long
term but will probably take a while to set up)

- have a collection record per endpoint (a CMDI giving a language list,
modality, etc. for each corpus) to which we can refer in the center
registry or from the scan response

I would prefer the last option, as it is relatively light-weight, not
too hard to build, and it would be in the hands of the centers providing
the endpoints (instead of being hard-coded). What is your opinion?
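
To make the last option a bit more concrete, here is a rough Python
sketch of how the aggregator could look up languages per endpoint. It
prefers a collection record and falls back to a hard-coded table as a
stopgap. Everything in it is an assumption for illustration: the
class names, the record layout, the example.org URL and the
languages_for helper are hypothetical and do not reflect an existing
CMDI profile or center registry API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CorpusInfo:
    """Hypothetical per-corpus metadata, as a collection record might list it."""
    name: str
    languages: List[str]            # language codes, e.g. "nld"
    modality: Optional[str] = None  # e.g. "written" or "spoken"


@dataclass
class CollectionRecord:
    """Hypothetical collection record for one endpoint (not a real CMDI profile)."""
    endpoint_url: str
    corpora: List[CorpusInfo] = field(default_factory=list)


# Stopgap: hard-coded languages per endpoint (placeholder URL and value).
HARDCODED_LANGUAGES: Dict[str, List[str]] = {
    "http://example.org/sru": ["deu"],
}


def languages_for(endpoint_url: str,
                  records: Dict[str, CollectionRecord]) -> List[str]:
    """Return the sorted languages of an endpoint.

    Uses the endpoint's collection record when one exists; otherwise
    falls back to the hard-coded table (empty list if unknown).
    """
    record = records.get(endpoint_url)
    if record is not None:
        langs = {lang for corpus in record.corpora for lang in corpus.languages}
        return sorted(langs)
    return sorted(HARDCODED_LANGUAGES.get(endpoint_url, []))
```

With such a lookup the hard-coded table can shrink as centers start
publishing their own records, without changing the aggregator code.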

> c) Are you sure that MPI is giving back KWIC dataview?

We should. Herman can state this with a higher degree of certainty.

best,
-- 
Dieter Van Uytvanck
Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
tel. +31-(0)24-3521-191 | <http://www.mpi.nl/>

