[Standards] [Dev] proposal: using a common mime type for TEI files

Sander Maijers sander at clarin.eu
Fri Jun 17 12:59:09 CEST 2016


Hi Dieter,

Custom media types can be used in a number of ways; see
https://en.wikipedia.org/wiki/Media_type#Registration_trees
You stated the benefits of your solution well. Your solution has the
following costs:
- You'll have to either go through the IANA registration procedure for new
media types in the ‘Personal or Vanity’ tree, go through an IETF
Standards Action to get a CLARIN-specific tree, or break the standards
and use custom media types outside of this process.
- Whichever option you choose, no third-party tools (i.e., general,
standards-compliant tools) will recognize the media type of a centre's
content retrieved via PID URLs anymore.
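
To illustrate the second cost, here is a minimal sketch in Python. It
assumes a generic client that only knows a fixed set of registered
types. The function names and the KNOWN_TYPES set are hypothetical. The
two x-tei types are the ones Thomas proposed (quoted below), and the
tree prefixes are the ones listed on the page linked above.

```python
# Hypothetical set of types a general-purpose tool might recognize.
KNOWN_TYPES = {"application/xml", "text/xml", "application/tei+xml"}


def classify_media_type(media_type):
    """Split a media type into top-level type, subtype, tree and suffix."""
    essence = media_type.partition(";")[0].strip().lower()  # drop parameters
    top, _, subtype = essence.partition("/")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:  # no structured suffix such as "+xml"
        base, suffix = subtype, ""
    if base.startswith("vnd."):
        tree = "vendor"
    elif base.startswith("prs."):
        tree = "personal"
    elif base.startswith(("x.", "x-")):
        tree = "unregistered"
    else:
        # No tree prefix: standards tree, provided the type is registered.
        tree = "standards"
    return {"type": top, "subtype": base, "tree": tree, "suffix": suffix}


def generic_handling(media_type):
    """Return the type a generic client would treat the content as, or None."""
    essence = media_type.partition(";")[0].strip().lower()
    if essence in KNOWN_TYPES:
        return essence
    if classify_media_type(essence)["suffix"] == "xml":
        return "application/xml"  # best case: fall back to plain XML
    return None
```

In this sketch both "text/x-tei-dta+xml" and "text/x-tei-isospoken+xml"
end up handled as plain "application/xml", so the TEI variant is lost to
any tool that has not been taught the custom types.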

I find Menzo's approach both the proper and the most useful one
compared to media-type-based approaches. After all, you would want a
resource's metadata to be completely descriptive of such elementary
aspects as the internal structure and content of the TEI files, and not
dependent on system configuration (served as custom media type x or y,
only as long as the server remains so configured).

Best,
Sander


On Fri, Jun 17, 2016 at 11:39 AM, Dieter Van Uytvanck <dieter at clarin.eu> wrote:
> On 16/06/16 20:35, Thomas Schmidt wrote:
>> Therefore, we would need to distinguish this at whatever the place is
>> where WebLicht distinguishes file formats. If it is via the mime type,
>> we would need a mime type extension like "text/x-tei-isospoken+xml"
>> vs. "text/x-tei-dta+xml". If it is on some other level, we would have
>> to know which and agree on a suitable set of TEI variant identifiers.
>> I'm copying relevant parts of the mailing list exchange below for your
>> information.
>
> Dear Thomas,
>
> Thank you for this very insightful summary of the discussions on this
> topic. Looking at all the suggestions made, I think having detailed
> mimetype extensions would be the most convenient for most parties involved:
>
> - It puts the responsibility of providing an exact data type for a file
> at the side of the metadata creator/resource provider. This is always
> better than relying on interpretation by a third-party tool.
>
> - It does not require changes to (CMDI) metadata profiles.
>
> - It makes it feasible for tool/data matching applications (WebLicht,
> Switchboard, ...) to provide a meaningful processing application.
>
> There are of course approaches on other levels too (like suggested by
> Bart and Menzo), and these could be used in addition to the extended TEI
> mimetypes:
>
> - Matching applications could still try to parse a TEI file (in the
> absence of a detailed mime type) and make a guess about the sub-type,
> using @type where available. This is of course not trivial.
>
> - The ParameterGroup in the CMDI description can be added. But in many
> cases that requires metadata providers to change their profiles, which
> means quite a bit of additional work.
>
> I will join the TEI WebLicht list and try to gather more concrete
> information in the near future at
>
> https://trac.clarin.eu/wiki/TEI%20variants
>
> (feel free to edit along)
>
> When we have that additional information, we can try to come up with
> concrete recommendations.
>
> best regards,
> --
> Dieter Van Uytvanck
> Technical Director CLARIN ERIC
> www.clarin.eu | tel. +31-(0)850091363 | skype: dietervu.mpi

