[Dev] Validation of OAI-PMH output for CLARIN centers
Thomas Zastrow
thomas.zastrow at uni-tuebingen.de
Wed Feb 20 10:05:09 CET 2013
Dear CLARIN colleagues,
As CLARIN centers, we publish our metadata via the OAI-PMH protocol.
To do so, the metadata from a number of CMDI files is concatenated and
offered via an OAI-PMH provider.
To make sure that your CLARIN center is compatible with the OAI-PMH
standard, please test your software at
http://re.cs.uct.ac.za/
We would like to draw your attention to the use of XML IDs in the CMDI
files: these XML IDs have to be unique within a single XML instance.
However, when CMDI files are concatenated for OAI-PMH harvesting, IDs
from different CMDI files may have the same value. In that case, the
OAI-PMH output is no longer valid. Please make sure that this does
not happen in your repository!
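To check a response for this problem yourself, a minimal sketch along
the following lines may help. It is not an official CLARIN tool. It
assumes that the ID-typed attributes are named "id" (as on a CMDI
ResourceProxy) or "xml:id"; without the schema, the script cannot know
which attributes are really declared as XML IDs. The function name
duplicate_ids and the sample response are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Expanded name of the xml:id attribute as ElementTree reports it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def duplicate_ids(xml_text, id_attrs=("id", XML_ID)):
    """Return the sorted ID values that occur more than once in the document.

    Assumption: every attribute listed in id_attrs is ID-typed. Adjust
    id_attrs if your metadata profile uses other attribute names.
    """
    root = ET.fromstring(xml_text)
    counts = Counter(
        elem.get(attr)
        for elem in root.iter()
        for attr in id_attrs
        if elem.get(attr) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical, heavily shortened ListRecords response with two records.
SAMPLE = """<ListRecords>
  <record><CMD><ResourceProxy id="r1"/><ResourceProxy id="r2"/></CMD></record>
  <record><CMD><ResourceProxy id="r1"/></CMD></record>
</ListRecords>"""

print(duplicate_ids(SAMPLE))
```

An empty result means that no ID value is repeated among the attributes
that were checked. It does not replace a full schema validation.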
Among others, you have the following options to achieve uniqueness of
XML IDs within OAI responses:
* Use unique XML IDs in your CMDI files within your repository. Please
note that prepending or concatenating Handle PIDs is not a good
solution, because Handle syntax is not compatible with XML ID syntax.
* Limit the response set size of your OAI provider to one record per
request. In this case, your OAI provider must have complete support for
resumption tokens. Please note that this approach will increase
harvesting time and network bandwidth (and, depending on your OAI
provider, system load on your repository).
* Reassign XML IDs when generating an OAI response. Please note that this
approach usually requires modifications to your OAI provider
implementation.
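As an illustration of the third option, the following sketch prefixes
all IDs of one record with a record-specific string and rewrites the
references to them. It is only a sketch under assumptions, not the way
any particular OAI provider does it: it assumes that IDs live in "id"
attributes and that references to them live in whitespace-separated
"ref" attributes, and the names reassign_ids and NCNAME are made up
here. The pattern is an ASCII-only approximation of the XML ID syntax,
which allows neither "/" nor a leading digit; this is why a Handle PID
cannot be used as a prefix unchanged.

```python
import re
import xml.etree.ElementTree as ET

# ASCII-only approximation of the XML NCName production. The full
# production also permits many non-ASCII characters (assumption: ASCII
# prefixes are sufficient for this purpose).
NCNAME = re.compile(r"^[A-Za-z_][A-Za-z0-9._-]*$")


def reassign_ids(record, prefix, id_attr="id", ref_attr="ref"):
    """Prefix every ID in one record and rewrite references to match.

    Assumption: id_attr holds ID values and ref_attr holds one or more
    whitespace-separated references to them.
    """
    if not NCNAME.match(prefix):
        raise ValueError(f"prefix {prefix!r} is not usable in an XML ID")

    # First pass: rename the IDs and remember old -> new.
    mapping = {}
    for elem in record.iter():
        old = elem.get(id_attr)
        if old is not None:
            mapping[old] = f"{prefix}_{old}"
            elem.set(id_attr, mapping[old])

    # Second pass: update references; unknown references stay unchanged.
    for elem in record.iter():
        refs = elem.get(ref_attr)
        if refs is not None:
            elem.set(ref_attr, " ".join(mapping.get(r, r) for r in refs.split()))
    return record


# Two hypothetical records that both use the ID "r1".
rec_a = ET.fromstring('<CMD><ResourceProxy id="r1"/><Component ref="r1"/></CMD>')
rec_b = ET.fromstring('<CMD><ResourceProxy id="r1"/><Component ref="r1"/></CMD>')
reassign_ids(rec_a, "rec1")
reassign_ids(rec_b, "rec2")
print(ET.tostring(rec_a, encoding="unicode"))
print(ET.tostring(rec_b, encoding="unicode"))
```

The prefix has to be unique per record within one response, for example
a running counter. Since the rewritten IDs differ from those in the
stored CMDI files, anything that refers to them from outside the record
would no longer match.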
Best regards,
Dieter, Oli and Tom
--
Dr. Thomas Zastrow
Seminar fuer Sprachwissenschaft
Universitaet Tuebingen
Wilhelmstr. 19
D-72074 Tuebingen
http://www.thomas-zastrow.de
Tel.: 07071/29-73968
Fax: 07071/29-5214