[Standards] [Teiweblicht] [Dev] proposal: using a common mime type for TEI files

Piotr Bański banski at ids-mannheim.de
Mon Jul 11 19:10:08 CEST 2016


Dear Dieter,

Thank you so much for this catch. Indeed, both RFC 6129 and the
"Architecture..." spec (in the previously quoted fragment [1]) call
for application/ in all cases. I'll modify this in the wiki now.

[1]: https://www.w3.org/TR/webarch/#xml-media-types

Best regards,

   P.

On 11/07/16 17:58, Dieter Van Uytvanck wrote:
> On 08/07/16 17:42, Piotr Bański wrote:
>> I have summarized Thomas's proposal at
>> https://trac.clarin.eu/wiki/MIME%20format%20variants
> Thank you Piotr!
>
> Great to see this discussion moving forward. Before I start editing the
> wiki, let me first check and mention a few points:
>
> - right now it states "text/tei+xml" as mimetype for TEI; shouldn't that
> be "application/tei+xml" ?
>
> - we have a CMDI component where we have gathered (at least a subset of)
> CLARIN-relevant mime types:
>
> https://catalog.clarin.eu/ds/ComponentRegistry#/?itemId=clarin.eu%3Acr1%3Ac_1271859438106&registrySpace=public
>
> It is not complete, but I will add it as a starting point to the trac
> (might take me a few days with the DH conference coming up)
>
> - I agree that the "format-variant=" is a necessary and elegant solution
> in case of e.g. the mimetype "application/tei+xml".
>
> - I am not sure about using this approach for general XML-based formats
> where no disambiguation on top of the mimetype is strictly necessary,
> since some form of mimetype is already in use (e.g. "text/tcf+xml", see
> https://vlo.clarin.eu/?q=text/tcf or "text/exb+xml", see
> https://vlo.clarin.eu/search?q=text/exb%2Bxml). Here we risk "inventing"
> a new standard where some practice is already in use in the wild.
>
> - Optional parameter(s) like "tokenized=0/1" were seen as problematic
> when discussing these with some of our developers - they can lead to
> arbitrary and unpredictable combinations. Maybe we can use something
> like "application/tei+xml;format-variant=dta-tokenized" instead? A major
> advantage of a finite list of format variants is that we can document
> every variant, e.g. with a link to an example file.
>
> best regards,
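
To illustrate the last point above, here is a minimal sketch of how a
consumer could check a media type string against a finite list of
documented format variants. The only value taken from this thread is
"application/tei+xml;format-variant=dta-tokenized"; the KNOWN_VARIANTS
table and both function names are hypothetical, and the parser is
deliberately naive (it does not handle quoted parameter values that
contain semicolons).

```python
# Hypothetical registry: media type -> documented format variants.
# Only "dta-tokenized" comes from the discussion; a real list would
# live in the wiki and link each variant to an example file.
KNOWN_VARIANTS = {
    "application/tei+xml": {"dta-tokenized"},
}


def parse_media_type(value):
    """Split 'type/subtype;name=value;...' into (type, {name: value})."""
    essence, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing semicolon
        name, _, val = raw.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return essence.lower(), params


def is_documented(value):
    """True if the media type (and its format-variant, if any) is listed."""
    media_type, params = parse_media_type(value)
    variant = params.get("format-variant")
    if variant is None:
        return media_type in KNOWN_VARIANTS
    return variant in KNOWN_VARIANTS.get(media_type, set())


if __name__ == "__main__":
    print(is_documented("application/tei+xml;format-variant=dta-tokenized"))
    print(is_documented("application/tei+xml;format-variant=something-else"))
```

With a closed list like this, an unknown variant is rejected outright,
whereas independent flags such as "tokenized=0/1" would have to be
validated in every combination.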


-- 
Piotr Bański, Ph.D.
Senior Researcher,
Institut für Deutsche Sprache,
R5 6-13
68-161 Mannheim, Germany


