[Dev] retrieving the CQL endpoints from the center registry

Thomas Zastrow thomas.zastrow at uni-tuebingen.de
Thu Aug 16 14:57:58 CEST 2012


On 16.08.12 14:44, Herman Stehouwer wrote:
> Shouldn't corpora already have CMDI files?
> Shouldn't those CMDI files already contain all the information you are going to get about a specific corpus?
>
> Otherwise we can keep adding stuff ...
Yes - but as I already said:

a)
The center registry is my starting point for finding out which
endpoints are available.

b)
I need information about the corpora at these endpoints, for example the
language. If I can't get this information directly from the endpoints,
then for every corpus I would have to:

1.)
Find the CMDI file: Resolve the PID and parse that document (most, but
not all, people are using the handle system, which means that at this
point I probably have to deal with more than one PID-resolver format ...)

2.)
Harvest the CMDI file

3.)
Parse the CMDI file
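
To make the extra work concrete, the three steps above could be sketched
roughly as follows. This is only an illustration, not code from the FCS:
the function names, the "hdl:" prefix handling and the element name
"Language" are assumptions (CMDI profiles differ in how they record the
language), and only two PID formats are covered.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Optional

# Public proxy of the Handle System; other PID systems need their own resolver.
HANDLE_PROXY = "https://hdl.handle.net/"


def pid_to_url(pid: str) -> str:
    """Step 1: turn a PID into a resolvable URL (hypothetical helper)."""
    if pid.startswith("hdl:"):
        return HANDLE_PROXY + pid[len("hdl:"):]
    if pid.startswith(("http://", "https://")):
        return pid
    raise ValueError(f"unsupported PID format: {pid!r}")


def http_fetch(url: str) -> bytes:
    """Step 2: harvest the document behind the URL."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def find_element_text(cmdi: bytes, element: str) -> Optional[str]:
    """Step 3: parse the CMDI file and return the text of the first
    element with the given local name, ignoring XML namespaces."""
    root = ET.fromstring(cmdi)
    for node in root.iter():
        local_name = node.tag.rsplit("}", 1)[-1]
        if local_name == element:
            return (node.text or "").strip() or None
    return None


def corpus_language(
    pid: str,
    fetch: Callable[[str], bytes] = http_fetch,
    element: str = "Language",  # assumed element name, depends on the profile
) -> Optional[str]:
    """Run all three steps for one corpus."""
    url = pid_to_url(pid)
    document = fetch(url)
    return find_element_text(document, element)
```

Each corpus costs at least one extra HTTP round trip plus an XML parse
here, on top of the requests to the center registry and the endpoint.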

So that would be *much* more effort in the user interface part of the
FCS. At the moment, with 10 corpora or so, I can harvest the necessary
information from the center registry and the endpoints in real time.
Having to take the detour via the CMDI files would slow everything
down a lot.

So, let's be pragmatic: we have less than 10 months to finish the whole
thing and not many people are *really* writing code at the moment ...

Best,

Tom


-- 
Dr. Thomas Zastrow
Seminar fuer Sprachwissenschaft
Universitaet Tuebingen

Wilhelmstr. 19
D-72074 Tuebingen

http://www.thomas-zastrow.de

Tel.: 07071/29-73968
Fax: 07071/29-5214


