[Dev] CLARIN-FCS: clarification about FCS schema
Matej Durco
xnrn at gmx.net
Mon Oct 1 16:28:48 CEST 2012
Hi,
ad 1)
I too vote for c)
(the @type attribute was introduced
because the initial values (kwic, fulltext)
did not seem to fit in the mime-type domain.
But as we can define our own mime-types,
that sounds like the most proper way.)
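To make option c) a bit more concrete, here is a rough sketch of how a @mime-type attribute might be declared in the schema. It is only an illustration under assumptions: the type name mimeTypeType, the lax content model of DataViewType and the placeholder namespace urn:example:fcs are made up for this example and are not taken from the actual Resource.xsd. The pattern is deliberately loose and only checks for a "type/subtype" shape.

```xml
<!-- Sketch only: mimeTypeType, the lax content model and the
     placeholder namespace urn:example:fcs are assumptions. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:fcs="urn:example:fcs"
           targetNamespace="urn:example:fcs"
           elementFormDefault="qualified">

  <!-- Loose "type/subtype" check; accepts values such as
       application/x-clarin-fcs-kwic+xml or
       application/vnd.google-earth.kml+xml -->
  <xs:simpleType name="mimeTypeType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Za-z0-9.+\-]+/[A-Za-z0-9.+\-]+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- DataView carries @mime-type instead of an enumerated @type -->
  <xs:complexType name="DataViewType" mixed="true">
    <xs:sequence>
      <xs:any minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
    </xs:sequence>
    <xs:attribute name="mime-type" type="fcs:mimeTypeType" use="required"/>
    <xs:attribute name="ref" type="xs:anyURI" use="optional"/>
  </xs:complexType>

</xs:schema>
```

With something along these lines, a new dataview such as KML would not require a schema change or a version bump, only agreement on the mime-type string.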
ad 2)
The recursiveness was introduced to allow
referencing the parent collections of the (matching) Resource
(including (potentially resolved) references to the CMD record):
<sru:recordData>
  <fcs:Resource pid="{ancestor-collection}">
    <fcs:Resource pid="{parent-collection}">
      <fcs:DataView mime-type="application/x-clarin-cmd+xml"
                    ref="{cmd-url}" />
      <fcs:Resource pid="{matching-resource-handle}">
        <fcs:ResourceFragment pid="{fragment-identifier}">
          <fcs:DataView
              mime-type="application/x-clarin-fcs-kwic+xml">... </fcs:DataView>
        </fcs:ResourceFragment>
      </fcs:Resource>
    </fcs:Resource>
  </fcs:Resource>
</sru:recordData>
This is problematic insofar as multiple matching Resources
within one collection
should still each be put in a separate hit (<sru:record>).
So admittedly this is not applicable in practice:
<sru:recordData>
  <fcs:Resource pid="{collection-handle}">
    <fcs:Resource pid="{res1-pid}"> ... </fcs:Resource>
    <fcs:Resource pid="{res2-pid}"> ... </fcs:Resource>
  </fcs:Resource>
</sru:recordData>
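If each matching Resource goes into its own hit, the same two resources would presumably be returned roughly as follows, with the collection wrapper repeated in every record. This is only a sketch of that reading; the pids are the same placeholders as above and all other SRU record elements are omitted.

```xml
<!-- Sketch: one matching Resource per record, parent collection repeated -->
<sru:record>
  <sru:recordData>
    <fcs:Resource pid="{collection-handle}">
      <fcs:Resource pid="{res1-pid}"> ... </fcs:Resource>
    </fcs:Resource>
  </sru:recordData>
</sru:record>
<sru:record>
  <sru:recordData>
    <fcs:Resource pid="{collection-handle}">
      <fcs:Resource pid="{res2-pid}"> ... </fcs:Resource>
    </fcs:Resource>
  </sru:recordData>
</sru:record>
```

In this form every Resource has at most one Resource child, so the nesting only expresses the chain of parent collections.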
I would still vote for (corrected) recursiveness, unless there is an
alternative proposal for referencing the parent collections.
best,
matej
On 01.10.2012 14:12, Oliver Schonefeld wrote:
> [X-Posted to CLARIN-D developers]
>
> Hi,
>
> while building a SRU client for FCS, I revisited the current CLARIN-FCS
> record schema [1].
> I've got two issues with the current schema that I'd like to discuss
> with interested developers:
>
> 1) [minor] The dataview type currently allows only three values
> ("kwic", "fulltext", "image"). Some endpoints, e.g. Meertens, also
> have a DataView for KML. However, "kml" is currently not within
> the set of allowed values, thus resulting in invalid XML.
> We have several options to deal with this:
> a) add "kml" to the list of allowed values (and do this every time a
> new dataview pops up, including bumping the version number of the
> schema)
> b) get rid of the predefined values and define the attribute value to be
> of type xs:NMTOKEN (or something similar)
> c) drop the @type attribute in favor of a proper @mime-type
> attribute. For our own types (e.g. kwic) we could define
> non-standard mime types (cf. RFC 2045, RFC 4288), e.g. like
> "application/x-clarin-fcs-kwic+xml"
>
> (SN: KML has an officially registered mime-type:
> "application/vnd.google-earth.kml+xml")
>
> BTW, I'd vote for solution c ...
>
>
> 2) [major] "Resource" is currently defined semi-recursively:
> <xs:complexType name="ResourceType">
>   <xs:sequence>
>     <xs:element maxOccurs="unbounded" minOccurs="0"
>                 name="Resource" type="fcs:ResourceType"/>
>     <xs:element maxOccurs="unbounded" minOccurs="0"
>                 name="DataView" type="fcs:DataViewType"/>
>     <xs:element maxOccurs="unbounded" minOccurs="0"
>                 name="ResourceFragment" type="fcs:ResourceFragmentType"/>
>   </xs:sequence>
>   <xs:attribute name="pid" type="fcs:pidType" use="optional"/>
>   <xs:attribute name="ref" type="fcs:refType" use="optional"/>
> </xs:complexType>
> Since maxOccurs defaults to 1 (not "unbounded"), the definition of
> the type in the XSD allows for structures where a Resource may have
> zero or one Resource as child, thus forming structures like
> (namespaces and other elements omitted for brevity):
> <Resource ...>
>   <Resource ...>
>     <Resource ...>
>       <Resource ...>
>         <!-- ad infinitum -->
>       </Resource>
>     </Resource>
>   </Resource>
> </Resource>
> However, it does not allow Resource elements with more than one
> Resource element as child, like:
> <Resource ...>
>   <Resource ...>
>   </Resource>
>   <Resource ...>
>   </Resource>
> </Resource>
>
> The first structure does not really make sense to me, while one
> could argue that the second could be used to produce a structured
> result in the form of a (sub-)corpus.
> My suggestion is either to drop the recursiveness or to define it
> properly (including some real-world use cases showing why it is needed).
>
> BTW, I'd vote for dropping the recursiveness ...
>
> Comments, Ideas, Thoughts?
>
> Best,
> Oliver
>
> [1] http://trac.clarin.eu/browser/FederatedSearch/Resource.xsd