value of the attribute format for formats without URI of Internet Media Type


Nicolas Le Novere

Sep 23, 2014, 7:56:39 AM
to combine...@googlegroups.com
Hello,

The current specification states that the attribute "format" of the element "content" in the manifest file is mandatory. Its value has to be an identifiers.org URI or an Internet Media Type. However, we also want to use the archive to store any type of relevant content, for instance open formats that are not part of COMBINE such as PharmML, but also widely used proprietary formats such as MatLab or NMTRAN. So we are in a bit of a pickle.

Possible solutions forward:

1a) We ban files that do not have an identifiers.org or Media Type URI
1b) Files that do not have an identifiers.org or Media Type URI are not listed in the manifest

1a and 1b are actually very close. I think 1b is pragmatic, while 1a does not bring anything.

2a) We accept the value unknown for the attribute format
2b) We accept other URIs as values for the attribute format and we quickly develop the MAMO branch that was discussed at HARMONY 2014

3) We make the attribute format optional

I believe 1) is compatible with the current version of the specification, and IMHO we should use 1b.

As for the future, there is not much difference between 2a and 3, except that 2a does not require a change of schema. 2b is nice but doomed to be incomplete.
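
To make the difference between options 1b and 2a concrete, here is a minimal sketch of a manifest writer. The "format" attribute and the "content" element come from the discussion above; the root element name, the "location" attribute, the file names and the function name are assumptions made only for illustration, not the schema itself.

```python
import xml.etree.ElementTree as ET


def build_manifest(files, policy="skip"):
    """Build a manifest tree from (location, format_or_None) pairs.

    policy="skip"    -> option 1b: files without a format are not listed
    policy="unknown" -> option 2a: files without a format get format="unknown"
    """
    root = ET.Element("omexManifest")  # root element name is an assumption
    for location, fmt in files:
        if fmt is None:
            if policy == "skip":
                continue  # option 1b: leave the file out of the manifest
            fmt = "unknown"  # option 2a: placeholder value
        # "location" is an assumed attribute name; "format" is from the thread
        ET.SubElement(root, "content", {"location": location, "format": fmt})
    return root


# Illustrative input: one file with a known format URI, one without.
files = [
    ("./model.xml", "http://identifiers.org/combine.specifications/sbml"),
    ("./fit.m", None),
]

listed_1b = build_manifest(files, policy="skip").findall("content")
listed_2a = build_manifest(files, policy="unknown").findall("content")
```

With option 1b only the first file ends up in the manifest, while with option 2a both are listed and the second carries format="unknown". Neither variant needs a schema change in this sketch.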

I wonder if this is not a case of OMEX being the basis of the COMBINE archive, the latter being more restrictive.

What do you think?

--
Nicolas LE NOVERE, Babraham Institute, Babraham Campus Cambridge, CB22 3AT
Tel: +441223496433 Fax: +441223496034 Mob:+447833147074 twitter:@lenovere
Skype:n.lenovere, n.len...@gmail.com, ORCID: 0000-0002-6309-7327
http://lenoverelab.org/, http://lenoverelab.org/perso/lenov/

Frank T. Bergmann

Sep 23, 2014, 8:18:27 AM
to n.len...@gmail.com, combine...@googlegroups.com

> I wonder if this is not a case of OMEX being the basis of the COMBINE
> archive, the latter being more restrictive.
>
> What do you think?
>

I think applications need to be able to store all files they want in the archive. But to be exchangeable, these files need to be listed in the manifest. For the longest time, we stated that the format should either be an identifiers.org URL or a MIME type. With MIME types we would not face the problem, as any application would be free to store its files using either the 'application/vnd.*' or 'application/x-*' prefixes.
We could optionally allow the use of all MIME types again. The only change this would make is that applications would not write:

http://purl.org/NET/mediatypes/application/pdf

but

application/pdf

as was defined originally.
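
Since the two forms differ only by a fixed prefix, a reader could accept both. The following is a minimal sketch of that idea; the function names are hypothetical and only the purl.org prefix is taken from the example above.

```python
PURL_PREFIX = "http://purl.org/NET/mediatypes/"


def to_purl(media_type):
    """Wrap a bare media type such as 'application/pdf' in the purl.org form."""
    return PURL_PREFIX + media_type


def to_bare(value):
    """Return the bare media type, whether or not the purl.org prefix is present."""
    if value.startswith(PURL_PREFIX):
        return value[len(PURL_PREFIX):]
    return value
```

For instance, to_bare(to_purl("application/pdf")) gives back "application/pdf", and to_bare leaves an already bare value untouched.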

Cheers
Frank
> --
> You received this message because you are subscribed to the Google Groups
> "COMBINE archive" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to combine-archi...@googlegroups.com.
> Visit this group at http://groups.google.com/group/combine-archive.
> For more options, visit https://groups.google.com/d/optout.

Felix Winter

Sep 23, 2014, 8:21:07 AM
to combine...@googlegroups.com
Hi Nicolas,

Maybe there is something about the specification which I miss, but I don't see a
problem with the "format" attribute being mandatory. As far as I know, there is
no format which cannot be described by a MIME type. In the case of Matlab files
I would just choose "application", which is defined as:
"The "application" media type is to be used for discrete data which do not fit
in any of the other categories, and particularly for data to be processed by
some type of application program." http://www.ietf.org/rfc/rfc2046.txt
The MIME types "text" and "application" together should be able to cover all
needs. Please correct me if I am wrong, which is clearly possible.

Or do you think that we need something more specific? Covering every possible
type of file format would be a bit too much to ask for, imho.

Kind regards,
Felix

> On September 23, 2014 at 1:57 PM Nicolas Le Novere <n.len...@gmail.com>
Felix Winter
Thomas-Müntzer-Platz 63
18057 Rostock

IM: f...@jabber.systemli.org

Nicolas Le Novere

Sep 23, 2014, 8:30:56 AM
to Frank T. Bergmann, combine...@googlegroups.com
On 23/09/14 13:18, Frank T. Bergmann wrote:
>
>> I wonder if this is not a case of OMEX being the basis of the COMBINE
>> archive, the latter being more restrictive.
>>
>> What do you think?
>>
>
> I think applications need to be able to store all files they want in the archive. But to be exchangeable, these files need to be listed in the manifest. For the longest time, we listed that the format should either be an identifiers.org URL, or a MIME-type. With MIME-types we would not face the problem, as any application would be free to store their files using either 'application/vnd-*' or 'application/x-*' prefixes.
> We could optionally allow again the use of all MIME-types. The only change this would make, is that applications would not write:
>
> http://purl.org/NET/mediatypes/application/pdf
>
> but
>
> application/pdf

There is no difference between those two since neither is resolvable. I would rather keep the same form for all, with at least some of them being resolvable.

Nicolas Le Novere

Sep 23, 2014, 8:41:12 AM
to combine...@googlegroups.com
On 23/09/14 13:21, Felix Winter wrote:

> Maybe there is something about the specification which I miss, but I don't see a
> problem with the "format" attribute being mandatory. As far as I know, there is
> no format which cannot be described by a MIME type. In the case of Matlab files
> I would just choose "application", which is defined as:
> "The "application" media type is to be used for discrete data which do not fit
> in any of the other categories, and particularly for data to be processed by
> some type of application program." http://www.ietf.org/rfc/rfc2046.txt
> The MIME types "text" and "application" together should be able to cover all
> needs. Please correct me if I am wrong, which is clearly possible.

Your answer is congruent with Frank's

So we could use:
http://purl.org/NET/mediatypes/application/x-matlab

(Media Types in the x branch do not need to be registered or linked to a registered vendor)

But then, what do we mean by that? The m-file? What about other MatLab files (e.g. mat files, DAQ files etc.)?

> Or do you think that we need something more specific? Covering every possible
> type of file format would be a bit too much to ask for, imho.
>
> Kind regards,
> Felix

Frank T. Bergmann

Sep 23, 2014, 8:42:36 AM
to Felix Winter, combine...@googlegroups.com
Hello Felix,

The only problem is the wording in the specification. While you are correct that MIME types are flexible, the specification currently states that they should be encoded in the form of a purl.org URL. These are only resolvable for specific standardized formats (those with an RFC).

Personally, I don't require the MIME types to be resolvable. The applications I would write would deal with the identifiers.org types. If my application received a MIME type, I would pass the file on to an application installed on the user's machine (i.e. a previewer) that is registered with the OS (very common for PDFs, images and the like). Otherwise I would let the user decide how those files should be treated.

But I would still much rather see application/x-something instead of 'unknown', as then applications could register for the specific MIME type (as they should).
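
The handling described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the returned labels and the simplified media-type pattern are hypothetical, and the two URI prefixes are the ones used in this thread.

```python
import re

IDENTIFIERS_PREFIX = "http://identifiers.org/"
PURL_PREFIX = "http://purl.org/NET/mediatypes/"

# Simplified "type/subtype" pattern; a stand-in, not a full media type grammar.
_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*"
MEDIA_TYPE_RE = re.compile(_NAME + "/" + _NAME)


def classify_format(value):
    """Decide who should handle a manifest entry, based on its format value."""
    if value.startswith(IDENTIFIERS_PREFIX):
        return "handle-in-application"  # identifiers.org format
    bare = value[len(PURL_PREFIX):] if value.startswith(PURL_PREFIX) else value
    if MEDIA_TYPE_RE.fullmatch(bare):
        return "hand-to-os"  # let a registered viewer open the file
    return "ask-user"  # anything else, e.g. the literal value 'unknown'
```

Both "application/x.matlab" and its purl.org form would be routed to the operating system here, whereas the bare value "unknown" falls through to the user.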

Cheers
Frank

> -----Original Message-----
> From: combine...@googlegroups.com [mailto:combine-
> arc...@googlegroups.com] On Behalf Of Felix Winter
> Sent: Tuesday, September 23, 2014 2:21 PM
> To: combine...@googlegroups.com
> Subject: Re: value of the attribute format for formats without URI of Internet
> Media Type
>

Jonathan Cooper

Sep 23, 2014, 8:45:50 AM
to combine...@googlegroups.com
This seems like the right approach to me.

Frank T. Bergmann

Sep 23, 2014, 8:51:36 AM
to n.len...@gmail.com, combine...@googlegroups.com
> Your answer is congruent with Frank's
>
> So we could use:
> http://purl.org/NET/mediatypes/application/x-matlab
>

If that is not prohibited then that is what we should do and not worry.

> (Media Types in the x branch do not need to be registered or linked to a
> registered vendor)
>
> But then, what do we mean by that? The m-file? What about other MatLab
> files (e.g. mat files, DAQ files etc.)
>

I think as far as the manifest is concerned, all we need are processing instructions. And that is what we get with the MIME type. Applications are free to tell whether it is a MIME type they understand or not (in my case, whether a previewer exists or not); from there they need to let the user decide what to do with the file. I think that is fine, and I don't see a problem with the archive potentially containing files that an application cannot process.

In the specific case of Matlab mentioned above, if Matlab is installed on the system, it will have registered a way to open those files, and otherwise the user will again have to be consulted if interested in the files.

But then there is no problem with the manifest that we need to solve here.
Frank

Nicolas Le Novere

Sep 23, 2014, 10:10:11 AM
to Frank T. Bergmann, combine...@googlegroups.com
On 23/09/14 13:51, Frank T. Bergmann wrote:
>> Your answer is congruent with Frank's
>>
>> So we could use:
>> http://purl.org/NET/mediatypes/application/x-matlab
>>
>
> If that is not prohibited then that is what we should do and not worry.

Actually to comply with the current specification, it should be

http://purl.org/NET/mediatypes/application/x.matlab

Another example would be

http://purl.org/NET/mediatypes/application/x.copasi
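
The difference between "x-matlab" and "x.matlab" is the subtype prefix: RFC 6838 names its registration trees by facets such as "vnd.", "prs." and "x.", while "x-" is the older convention. A small sketch of telling them apart, where the function name and the returned labels are hypothetical:

```python
def registration_tree(media_type):
    """Guess the registration tree of a bare media type from its subtype prefix."""
    _type, _sep, subtype = media_type.partition("/")
    if subtype.startswith("x."):
        return "unregistered"  # e.g. application/x.matlab
    if subtype.startswith("x-"):
        return "legacy x- prefix"  # e.g. application/x-matlab
    if subtype.startswith("vnd."):
        return "vendor"
    if subtype.startswith("prs."):
        return "personal"
    # No facet: a standards-tree name, provided it is actually registered.
    return "standards"
```

A writer could use such a check to warn when it is about to emit an "x-" name where an "x." name is expected.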

>> (Media Types in the x branch do not need to be registered or linked to a
>> registered vendor)
>>
>> But then, what do we mean by that? The m-file? What about other MatLab
>> files (e.g. mat files, DAQ files etc.)
>>
>
> I think as far as the Manifest is concerned, all we need are processing instructions. And that is what we get with MIME-type. Applications are free to tell whether it is a MIME type they understand, or not (in my case whether a previewer exists or not), from then they need to let the user decide what to do with the file. I think that is fine, and I don't see a problem with there being potentially files in there that an application cannot process.
>
> In the specific case of Matlab mentioned above, if Matlab is installed on the system, it will have registered a way to open those files, and otherwise the user will again have to be consulted if interested in the files.
>
> But then there is no problem with the manifest that we need to solve here.

Cool

Stian Soiland-Reyes

Sep 23, 2014, 10:58:32 AM
to Nicolas Le Novere, Frank T. Bergmann, combine...@googlegroups.com
I would suggest adding wording to use

http://purl.org/NET/mediatypes/application/octet-stream

if the media type is unknown. Falling back to application/octet-stream
is common practice by web-servers, and it gives a hint to readers that
the manifest writer simply doesn't know, and the reader can (out of
spec) try other ways to identify the content (such as using mime
magic).
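
As a rough sketch of that fallback on the writer side, using Python's standard mimetypes module (which guesses from the file extension only, not from the content); the function name is hypothetical:

```python
import mimetypes

PURL_PREFIX = "http://purl.org/NET/mediatypes/"
FALLBACK = "application/octet-stream"


def format_for(filename):
    """Return a purl.org media type URI for a file, falling back when unknown."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return PURL_PREFIX + (guessed or FALLBACK)
```

A file such as "paper.pdf" maps to the application/pdf URI, while a file with an unrecognised extension gets the application/octet-stream URI.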

If a producer needs to invent mediatypes on other people's behalf,
then I would suggest doing that with identifiers.org and a similar x-
naming style.


BTW - I (finally) moaned to the IANA mailing list about the lack of
consistent URIs at the official iana.org site.
(Subject: "Cool" Linked Data URIs for all IANA mediatypes?, which should
soon appear at http://www.ietf.org/mail-archive/web/media-types/current/maillist.html
)

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/ http://orcid.org/0000-0001-9842-9718