[normal download], [torrent], [metalink], [next download format]
I'd like to think that this list could be just 2 options, normal
download & metalink, since hopefully metalink will be able to describe
the next download format as well.
but for some people, even the 2 options are too much.
is there a way we can make using metalinks easier for people already
using metalink clients?
at first, we had a way similar to Link Fingerprints -
URL#!metalink3!URLtoMetalink - which was a bit of a hack but worked!
apparently against spec, but it was a nice way to piggyback a metalink
URL onto a regular URL. most clients dropped the extra info starting
at #, while metalink clients could just get the metalink.
another (unimplemented) idea was a microformat, where Operator or
FlashGot passed on the metalink URL to clients that could use it,
while other clients got the URL to the regular file.
I think the semantic web people had a similar problem, where a URL
could serve up HTML for browsers, or RDF for RDF-aware clients.
maybe something like: if the client has the metalink MIME type in the
Accept header & requests a file that has a metalink describing it,
send the metalink first; but then (if the same server serving the
metalink also has the regular file) if the metalink client requests
the file again, it will get the regular file (& not go into some loop).
is that possible or is there an elegant solution?
--
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
)) Easier, More Reliable, Self Healing Downloads
How about just have the download link server look at the user agent and
then redirect the download link accordingly?
if user_agent == known_metalink_client:
    redirect: "myfile.metalink"
else:
    redirect: "myfile.exe"
Neil
Errrr, nevermind, I now see the error of my ways. That only works if
you manually plug the URL into the metalink client, it won't
automatically launch the client when clicking on the link.
I think to do what you are talking about you would need to check the
accept value when you present the download page to the user:
if accept_header == "metalink":
    print "myfile.metalink"
else:
    print "myfile.exe"
Of course this requires a browser modification or a plugin to do right.
Once you have a plugin, you can do some javascript checking to see if
the plugin or browser version is correct and change the URL link too.
Neil
> if accept_header == "metalink":
>     print "myfile.metalink"
> else:
>     print "myfile.exe"
i agree ;)
let's look at this from a native browser modification or plugin angle
(I'm guessing FlashGot could alter the Accept header if a metalink
compatible download manager is installed).
won't this loop, once the client requests myfile again to start the
actual download after it's requested it first and gotten the metalink?
assuming the metalink is stored on the same server as the file to be
downloaded...
it is true that X months ago there was a thread about a small test of
Nicolas's on his own server, which made it possible to offer an XML page
that itself included the metalink on a classical web page...
Nope. There is never a request for "myfile" in this case. The two
situations are like this:
1. Metalink
- Browser requests download.html, server generates page with the
myfile.metalink download link since special accept header is present
- Browser requests myfile.metalink, pass file to metalink client
- Metalink client does normal download of myfile.exe
2. No Metalink client
- Browser requests download.html, server generates page with the normal
download link to myfile.exe.
- Browser requests myfile.exe as usual.
This isn't a complete solution as it requires some type of server side
scripting. A javascript component would probably cover any websites
that don't have that option.
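something like this would be a minimal sketch of that server-side check
(the function and file names here are just made up for illustration):

```python
# Sketch of the page-generation logic described above: put the metalink
# link on download.html only when the client advertises the metalink
# MIME type. The substring check on Accept is deliberately naive.

METALINK_TYPE = "application/metalink+xml"

def pick_download_link(accept_header):
    """Return the link to embed in download.html for this client."""
    if METALINK_TYPE in accept_header:
        return "myfile.metalink"  # metalink-capable client
    return "myfile.exe"           # everyone else gets the plain file

print(pick_download_link("text/html," + METALINK_TYPE + ",*/*"))
print(pick_download_link("*/*"))
```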
Neil
This sounds good because it will work. However, the requirement for
server-side intelligence is something that I would tend to avoid.
I'd rather suggest the following:
- the client sends an Accept header indicating its ability, but not
application/metalink+xml, because nearly every client sends "Accept:
application/*" which would mean the same. Thus, some different header
is required. Either "Accept: x-application/metalink" (maybe), or a new
header, like "X-Metalink: yes"
- the server may notice and understand the header, and return a metalink
file (either by generating it, or by rewriting the request to
${url}.metalink). The rewrite could be an internal redirect or an
external redirect which sends the client to the new location (the one
with ".metalink" appended).
- _if_ the client is about to follow a link which it acquired from within
a metalink <url> element, in order to get the requested file from
the same server, it must not add the header which indicates its
ability, to avoid a loop.
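to make the loop-avoidance rule concrete, a hypothetical client could do
something like this (all names are made up for illustration):

```python
# Sketch of the client-side rule above: advertise metalink capability on
# normal requests, but drop it when following a URL taken from a
# metalink's <url> element, so the server can't hand back yet another
# metalink (a loop).

METALINK_TYPE = "application/metalink+xml"

def request_headers(from_metalink_url):
    """Build request headers; from_metalink_url is True when the URL
    was obtained from a metalink <url> element."""
    if from_metalink_url:
        return {"Accept": "*/*"}  # plain request: no loop possible
    # first contact: declare metalink ability
    return {"Accept": METALINK_TYPE + ", */*"}
```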
I think that would work best.
In my opinion, the server logic needed here, in order to handle
and distinguish metalink/non-metalink requests is minimal, and can be
implemented by pure mod_rewrite magic, and doesn't require a script, or
a content generator which creates "special" content. The metalink files
that it redirects to can simply be on-disk. (Or of course they can still
be created on the fly.)
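for illustration, the mod_rewrite rules could look roughly like this
(an untested sketch, not a real configuration -- it serves the on-disk
${url}.metalink when the client asks for the metalink type and such a
file exists):

```apache
# Hypothetical sketch only: internal rewrite to the metalink variant
RewriteEngine On
# client advertises metalink capability ...
RewriteCond %{HTTP_ACCEPT} application/metalink\+xml
# ... and a matching .metalink file exists on disk
RewriteCond %{REQUEST_FILENAME}.metalink -f
# rewrite internally; use a redirect flag instead for an external redirect
RewriteRule ^(.*)$ $1.metalink [L]
```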
Does that make sense?
I believe that such support for transparently negotiated metalink
handling will be a big leap forward for metalinks!
Thanks,
Peter
--
"WARNING: This bug is visible to non-employees. Please be respectful!"
SUSE LINUX Products GmbH
Research & Development
There is one bit, that occurred to me later, that I didn't mention
explicitly here -- the case where the metalink-enabled client can't
know in advance whether the server is going to be metalink-capable, or
if it is a plain HTTP server which will return the requested object, and
not a metalink.
The client needs to be able to handle that "normal" (non-metalink)
reply. I don't know if that is practical, I hope so though -- I somehow
assume that a metalink client would, naturally, be able to act as normal
HTTP client.
Thus, the client could speak to non-metalink HTTP servers, as well as to
metalink-enabled HTTP servers which return metalinks only for some
files.
(Think of
- files that are not supposed to be redirected for various reasons
- small files which can more efficiently be returned as-is, instead of
sending a mirror list
- no mirror being available for certain files
)
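that fallback is easy to picture in code; a minimal sketch (function
and action names are made up) that dispatches on the reply's
Content-Type:

```python
# Sketch of the fallback described above: a metalink client can't know
# in advance whether the server is metalink-capable, so it decides what
# to do based on the Content-Type of whatever comes back.

def classify_reply(content_type):
    """Decide how to treat a response, given its Content-Type header."""
    base = content_type.split(";")[0].strip().lower()
    if base == "application/metalink+xml":
        return "parse-metalink"  # got a metalink: read mirrors, hashes
    return "save-file"           # ordinary reply: just store the body

print(classify_reply("application/metalink+xml; charset=UTF-8"))
print(classify_reply("application/octet-stream"))
```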
> I think that would work best.
>
> In my opinion, the server logic needed here, in order to handle
> and distinguish metalink/non-metalink requests is minimal, and can be
> implemented by pure mod_rewrite magic, and doesn't require a script, or
> a content generator which creates "special" content. The metalink files
> that it redirects to can simply be on-disk. (Or of course they can still
> be created on the fly.)
>
> Does that make sense?
>
> I believe that such support to transparent negotiate metalink handling
> will be a big leap forward for metalinks!
Another consideration is to keep intermediate caches in mind. If the
response varies with regard to what the client sent, it probably needs a
Vary: header to indicate which part of the request is causing the
variation (metalink or not, in our case), or, if nothing else works,
make the metalink response uncacheable.
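in code, the rule amounts to something like this (a sketch with example
header values, not from any real server):

```python
# Whenever the reply depends on the Accept header, emit "Vary: Accept"
# so intermediaries keep separate cache entries for the metalink and
# non-metalink variants of the same URL.

def negotiated_headers(served_metalink):
    headers = {"Vary": "Accept"}  # response varies on this request header
    if served_metalink:
        headers["Content-Type"] = "application/metalink+xml; charset=UTF-8"
    else:
        headers["Content-Type"] = "application/octet-stream"
    return headers
```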
thanks for picking this up!
On Thu, Apr 24, 2008 at 02:48:58PM -0700, Nils wrote:
>
> First of all, I don't share the concerns about application/*.
> You would simply use a q= value for application/metalink+xml, which
> the server might then interpret (Apache Multiviews/TCN is capable of
> this, for instance [1]).
>
> Then the statement that most clients send application/* isn't exactly
> true[2]...
Indeed, you are right. I now sampled some minutes of logs on a busy web
server, and I see that application/* turns up exactly once (msnbot/1.1),
while there are gazillions of clients sending */* (and lots that send no
accept header at all), but no other client which sends application/*.
> And besides, this doesn't even actually matter.
> Because you can tell the server to serve application/octet-stream
> instead of application/metalink+xml if the client doesn't explicitly
> ask for application/metalink+xml.
You are right. What I initially had in mind, is that some servers might
have a particular server-side "variant selection algorithm" in place,
which might lead to unwanted results, if the client sends either
application/* or */*. I had these concerns because I don't know very
well which existing mechanism might be out there. I now realize that
such mechanisms
- are maybe not in wide use, and Apache's mod_negotiation is probably
used primarily for selecting language variants of static pages.
- and that there isn't any existing server-side implementation which
negotiates metalinks yet, and no other negotiation mechanism would
return a metalink anyway (because it doesn't know about them.)
> The client can tell from the response if it has been served a metalink
> or not (Content-Type).
>
> The "avoid loop" is indeed an important remark. ;)
Exactly.
> All in all the tools for "transparent" metalinks via TCN have existed
> for quite some time.
> There is no real need to "reinvent" the wheel for metalinks. ;)
Seems you are right :-)
> I'll do some experiments myself, I guess.
I have changed the download.opensuse.org server to negotiate upon the
Accept header now. I kept the Accept-Features negotiation in for now,
but it is scheduled to be removed later. (At the moment there is still
some documentation referring to it.)
Thus, metalinks can now be negotiated like this:
% curl -s -H "Accept: foobar,application/metalink+xml,*/*" 'http://download.opensuse.org/distribution/10.3/repo/oss/GPLv3.txt' | head -1
<?xml version="1.0" encoding="UTF-8"?>
The reply comes with
Content-Disposition: attachment; filename="GPLv3.txt.metalink"
Content-Type: application/metalink+xml; charset=UTF-8
For clients not sending "application/metalink+xml" in the Accept header,
the file itself is returned.
% curl -s 'http://download.opensuse.org/distribution/10.3/repo/oss/GPLv3.txt' | head -1
GNU GENERAL PUBLIC LICENSE
Can you try if that works for you / makes sense for you?
Let me know what you think.
Thanks!
this is so nice & useful!
it brings up some issues regarding MIME type.
"application/metalink+xml" was originally chosen because it follows
the convention of other XML formats, but this is not officially
registered. (KDE had an issue with this).
looking at http://www.iana.org/cgi-bin/mediatypes.pl ("Note:
Registrations in the standards tree must be approved by the IESG and
must correspond to a formal publication by a recognized standards
body." "Standards Tree - " (blank)) makes it sound like
"application/metalink+xml" would not be possible until metalink is
approved by a standards group. (which I'm not against).
so, we could continue using the unofficial unregistered MIME type
(strangely, formats like RSS and bittorrent are not registered, at
least according to
http://www.iana.org/assignments/media-types/application/ ) or we could
register one.
I'd like to register one, and it looks like we'd be in the Vendor tree,
at least for now. I don't know if we could "upgrade" back to
application/metalink+xml later if approved by a standards group.
I think application/vnd.metalinker.org+xml would be good.
anyone see possible problems with that?
On Wed, Apr 30, 2008 at 06:46:41AM -0700, Nils wrote:
> On Apr 27, 12:45 pm, Peter Poeml <po...@suse.de> wrote:
> > I have changed the download.opensuse.org server to negotiate upon the
> > Accept header now. I kept the Accept-Features negotiation in for now,
> > but it is scheduled to be removed later. (At the moment there is still
> > some documentation referring to it.)
> >
> > Thus, metalinks can now be negotiated like this:
> >
> > % curl -s -H "Accept: foobar,application/metalink+xml,*/*" 'http://download.opensuse.org/distribution/10.3/repo/oss/GPLv3.txt'| head -1
> > <?xml version="1.0" encoding="UTF-8"?>
> >
> > The reply comes with
> > Content-Disposition: attachment; filename="GPLv3.txt.metalink"
> > Content-Type: application/metalink+xml; charset=UTF-8
> >
> > For clients not sending "application/metalink+xml" in the accept header,
> > the file itself returned.
> > % curl -s 'http://download.opensuse.org/distribution/10.3/repo/oss/GPLv3.txt'| head -1
> > GNU GENERAL PUBLIC LICENSE
> >
> > Can you try if that works for you / makes sense for you?
> >
> > Let me know what you think.
>
> Sorry, I didn't find the time to test till now. I'll try it (if it is
> still working) and get DownThemAll! trunk to work with it.
It actually didn't work "quite right" yesterday because I broke the
<size> element, which caused clients to refuse the metalink. But I just
fixed it, so it works as it should now. Sorry if you ran into that.
"Other ways of getting a description through HTTP
* Use content negotiation. If you ask for RDF, you get the
description. If you ask for something else, you get the thing
described. (The TAG, TimBL, and others have pointed out that this
contradicts web architecture, which requires that content negotiation
choose among things that all carry the same information. That goes for
CN between RDF and HTML as much as it does for CN between GIF and
JPEG.)"
the correct, web architecture compliant way to do this is apparently
the HTTP Link header:
Link: <http://example.com/resource.metalink>; rel="describedby";
type="application/metalink+xml";
http://tools.ietf.org/html/draft-nottingham-http-link-header-03
http://tools.ietf.org/html/draft-hammer-discovery-01
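a client-side sketch of picking that metalink out of a Link header
(the parsing here is deliberately naive -- it ignores commas inside
URLs -- and the function name is made up):

```python
# Find a rel="describedby" metalink entry in an HTTP Link header, per
# the draft-nottingham-http-link-header syntax cited above.

import re

def metalink_from_link_header(link_header):
    """Return the URL of a describedby metalink link, or None."""
    for part in link_header.split(","):
        if 'rel="describedby"' in part and "application/metalink+xml" in part:
            m = re.search(r"<([^>]+)>", part)
            if m:
                return m.group(1)
    return None

hdr = ('<http://example.com/resource.metalink>; rel="describedby"; '
       'type="application/metalink+xml"')
print(metalink_from_link_header(hdr))
```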
Ugh, first we're told not to use file.iso#!metalink!file.metalink, and now
this...
But on second thought, discouraging this use seems correct in principle...
Those seem to be years, not RFC numbers.
On Sat, Jan 17, 2009 at 2:10 PM, Eran Hammer-Lahav <er...@hueniverse.com> wrote:
>
> Again, there is nothing 'technically' wrong with this approach, and
> others have taken a similar position that a descriptor can be
> collapsed into a single resource URI. But a consensus is building that
> this is the wrong way of doing things. What you might want to
> consider, since you find the semantic discussion as nonsense (which I
> can respect) is the deployment ramifications of using the Accept
> header. Many platforms limit access to such headers, some proxies
> mishandle Vary headers (which, BTW, the spec should require with any
> Accept reply), and some providers will not allow using it on their
> servers. You might want to read John Panzer's view of this [1].
what should the spec require? could you propose some text, I'm not
familiar w/ that.
Eran started a thread about TCN on the HTTP list at
http://lists.w3.org/Archives/Public/ietf-http-wg/2009JanMar/0014.html
(it wouldn't hurt the draft process for metalink people to be involved
on there :) which includes Mark's reply:
"To my knowledge, caching intermediaries haven't deployed it (i.e.,
they'll work with TCN, but they won't be able to serve negotiated
requests from cache... somebody please correct me if I'm wrong).
I'm not sure about browser implementation, but I did a quick check of
the request headers seen by a very high-traffic Web site, and a
vanishingly small number contained the Negotiate header..."
(someone had been working on a metalink plugin for squid).
I figured it wouldn't hurt to quote what we use now & what we could
use in the future from Eran's draft directly:
http://tools.ietf.org/html/draft-hammer-discovery-01
Appendix A.2.1. HTTP Response Header
When a resource representation is retrieved using an HTTP GET
request, the server includes in the response a header pointing to the
location of the descriptor document. For example, POWDER uses the
'Link' response header to create an association between the resource
and its descriptor. XRDS [XRDS] (based on the Yadis protocol
[Yadis]) uses a similar approach, but since the Link header was not
available when Yadis was first drafted, it defines a custom header
X-XRDS-Location which serves a similar but less generic purpose.
[+] Self Declaration - using the Link header, any resource can point
to its descriptor documents.
[-] Direct Descriptor Access - the header is only accessible when
requesting the resource itself via an HTTP GET request. While
HTTP GET is meant to be a safe operation, it is still possible for
some resources to have side-effects.
[+] Web Architecture Compliant - uses the Link header which is an
IETF Internet Standard [[ currently a standard-track draft ]], and
is consistent with HTTP protocol design.
[-] Scale and Technology Agnostic - since discovery accounts for a
small percent of resource requests, the extra Link header is
wasteful. For some hosted servers, access to HTTP headers is
limited and will prevent implementation.
[+] Extensible - the Link header provides built-in extensibility by
allowing new link relationships, mime-types, and other extensions.
Minimum roundtrips to retrieve the resource descriptor: 2
Appendix A.2.2. HTTP Response Header Via HEAD
Same as the HTTP Response Header method but used with an HTTP HEAD
request. The idea of using the HEAD method is to solve the wasteful
overhead of including the Link header in every reply. By limiting
the appearance of the Link header only to HEAD responses, typical GET
requests are not encumbered by the extra bytes.
[+] Self Declaration - Same as the HTTP Response Header method.
[-] Direct Descriptor Access - Same as the HTTP Response Header
method.
[-] Web Architecture Compliant - HTTP HEAD should return the exact
same response as HTTP GET with the sole exception that the
response body is omitted. By adding headers only to the HEAD
response, this solution violates the HTTP protocol and might not
work properly with proxies as they can return the header of the
cached GET request.
[+] Scale and Technology Agnostic - solves the wasted bandwidth
associated with the HTTP Response Header method, but still suffers
from the limitation imposed by requiring access to HTTP headers.
[+] Extensible - Same as the HTTP Response Header method.
Minimum roundtrips to retrieve the resource descriptor: 2
Appendix A.2.3. HTTP Content Negotiation
Using the HTTP Accept request header or Transparent Content
Negotiation as defined in [RFC2295], the consumer informs the server
it is interested in the descriptor and not the resource itself, to
which the server responds with the descriptor document or its
location. In Yadis, the consumer sends an HTTP GET (or HEAD) request
to the resource URI with an Accept header and content-type
application/xrds+xml. This informs the server of the consumer's
discovery interest, which in turn may reply with the descriptor
document itself, redirect to it, or return its location via the
X-XRDS-Location response header.
[-] Self Declaration - does not address this, as it focuses on the
consumer declaring its intentions.
[+] Direct Descriptor Access - provides a simple method for directly
requesting the descriptor document.
[-] Web Architecture Compliant - while it can be argued that the
descriptor can be considered another representation of the
resource, it is very much external to it. Using the Accept header
to request a separate resource (as opposed to a different
representation of the same resource) violates web architecture.
It also prevents using the discovery content-type as a valid
(self-standing) web resource having its own descriptor.
[-] Scale and Technology Agnostic - requires access to HTTP request
and response headers, as well as the registration of multiple
handlers for the same resource URI based on the Accept header. In
addition, improper use or implementation of the Vary header in
conjunction with the Accept header will cause caches to serve the
descriptor document instead of the resource itself - a great
concern to large providers with frequently visited front-pages.
[-] Extensible - applies an implicit relationship type to the
descriptor mime-type, limiting descriptor formats to a single
purpose. It also prevents existing mime-types from being used as a
descriptor format.
Minimum roundtrips to retrieve the resource descriptor: 1
unfortunately, this rules out content negotiation, one of the
easiest/coolest features metalink uses.
It doesn't if you use HEAD.
Well, this unfortunately misses the point of what we are doing.
Avoiding the additional round-trip is key to putting this to use in a
high-scalability setting. An additional round-trip is a deal-breaker for me.
Without getting too philosophical, I see no problems in using content
negotiation to achieve this. Of all the options it seems to be the best
and most suitable one, and I neither see a conflict with what it's meant
for.
And actually, doing an extra request to find out whether a Link: to a
metalink could be provided sounds superfluous to me. What's more, when I
do a GET request, I get a response with the resource anyway, so I (as a
client) would rather have to do an extra HEAD request to discover such
Link: headers.
This would seem to me about as useful as if, in language variant
negotiation, a web browser first did a separate request to discover
the variants, and only after choosing one did the real request.
This just wouldn't fly, would it?
This might not make a noteworthy difference on a smallish server, but in
a large-scale environment, a doubling of requests makes a real difference.
For me, this is not just theory. I have to deal with 15-40 million
requests per day on one server, and I don't want to double those.
And while server load is one thing in these matters -- client response
latency is another. Far away clients would get notably worse response
times when they have to do two requests. The latency of overseas
connections, and even more so of those to countries with bad Internet
connectivity, satellite links, etc., always causes painful delays.
And the additional bandwidth used would also not make it better.
An additional Link header would be a good idea, though. I would happily
support this. It could be an interesting option.
Thank you very much for your thoughts and insight!
> Let me know if I can help in any way.
>
> EHL
>
> [1] http://tools.ietf.org/html/draft-nottingham-http-link-header-03
> [2] http://tools.ietf.org/html/draft-hammer-discovery-01
> [3] http://www.hueniverse.com/hueniverse/2008/09/discovery-and-h.html
>
>
> On Jan 16, 4:13 pm, Anthony Bryan <anthonybr...@gmail.com> wrote:
> > Eran Hammer-Lahav and Mark Nottingham have informed me that using
> > transparent content negotiation for serving a "description" of a file,
> > and not an alternative version (like PNG vs JPG) of the same thing has
> > been ruled against by the W3C TAG. see http://esw.w3.org/topic/FindingResourceDescriptions
> >
> > "Other ways of getting a description through HTTP
> > * Use content negotiation. If you ask for RDF, you get the
> > description. If you ask for something else, you get the thing
> > described. (The TAG, TimBL, and others have pointed out that this
> > contradicts web architecture, which requires that content negotiation
> > choose among things that all carry the same information. That goes for
> > CN between RDF and HTML as much as it does for CN between GIF and
> > JPEG.)"
> >
> > the correct, web architecture complient way to do this is apparently
> > the HTTP Link header:
> >
> > Link: <http://example.com/resource.metalink>; rel="describedby";
> > type="application/metalink+xml";
> >
> > http://tools.ietf.org/html/draft-nottingham-http-link-header-03
> > http://tools.ietf.org/html/draft-hammer-discovery-01
Peter
--
Contact: ad...@opensuse.org (a.k.a. ftpa...@suse.com)
#opensuse-mirrors on freenode.net
Info: http://en.opensuse.org/Mirror_Infrastructure
On Thu, Feb 05, 2009 at 02:34:19PM -0800, Nils wrote:
> On Jan 20, 1:05 am, Anthony Bryan <anthonybr...@gmail.com> wrote:
[...]
> > Eran started a thread about TCN on the HTTP list at http://lists.w3.org/Archives/Public/ietf-http-wg/2009JanMar/0014.html
> > (it wouldn't hurt the draft process for metalink people to be involved
> > on there :) which includes Mark's reply:
> >
> > "To my knowledge, caching intermediaries haven't deployed it (i.e.,
> > they'll work with TCN, but they won't be able to serve negotiated
> > requests from cache... somebody please correct me if I'm wrong).
One caching intermediary that perfectly supports this is Apache's
disk_cache. As an example, you could load the mirror list from
http://mirrors.opensuse.org/ and you'll get a gzipped version if your
browser indicates so, and a plain version if not. Both are correctly
cached according to the set Expires, and served from the cache. If you
refresh them and cause a cache miss, you'll get a newly compressed, or
newly generated plain, version from the origin server (which is the same
Apache). (This is mostly used locally, or for taking load off backends.
And it has come a long way; Apache 2.2 is required.)
Another one is squid -- even old versions of squid support this just
fine. Squid is much more common as intermediary proxy of course.
There may be (many) other caching intermediary proxies that I am not
aware of and haven't worked with -- I don't know. Anyhow, transparent
negotiation of gzip encoding is so widely used that we can probably
safely assume that it is handled in most cases. Too many highly popular
web sites use it.
And ISPs doing interception caching, and not doing it right in this
regard, would pretty soon be out of business I think.
> > I'm not sure about browser implementation, but I did a quick check of
> > the request headers seen by a very high-traffic Web site, and a
> > vanishingly small number contained the Negotiate header..."
>
> A typical website usually doesn't have much to negotiate...
> Likely only image representations, if different ones are even
> available.
>
> What you will likely see far more often is:
> Vary: Accept-Encoding
> A lot of sites offer either "plain" or gzip encoding. This is, as far
> as I know, relatively widely deployed.
> Hence I guess (but have no concrete data to back up that guess) that
> most proxies today in fact have no problem dealing with this.
It is quite commonplace to add a Vary: Accept-Encoding header when
compressing content. It is not as well documented as it should be, but
admins find out about it pretty quickly, usually. Luckily enough, it is
pretty obvious and users notice that they get "garbage" soon, and the
problem is easy enough to find out about, and the solution trivial.
Thank you very much for your educated and detailed thoughts.
Thanks also for being present here on this list; I have subscribed to
the ietf-http-wg list to get a better interconnection by being present
there myself. Also to learn.
I have a bit of difficulty seeing the use case for the Link header right
now (although I'm fully supportive of it as another option), because in
my case (metalink generator running on download server), any HTML page
that the user might be looking at with a browser is typically running on
a different server -- often even outside of my control -- which doesn't
support Metalinks itself. I have not dug deeply enough into the Link
header matter to fully understand it, but this seems like an intricacy
to me.
In addition, the fact that the metalink is transparently negotiated
allows for two further particularities -- one not unimportant, the other
crucial:
1) efficiency: for a small resource, let's say 512 bytes in size, it
would be inefficient to construct a metalink for it, or
even HTTP redirect the request, because to return the
resource directly results in about the same amount of
data being transferred to the client as the metalink or
HTTP redirect. The server response might even fit into a
single TCP payload together with the headers.
2) security: the server can decide to return certain resources
directly for security reasons.
There can also be exceptions of other kinds; e.g. one part of the URL
space, or objects of a certain mime type, or objects matching a certain
file pattern, being redirected (HTTP 30x + Location header) to another
server, the latter not being metalink capable.
All these examples are not made up -- I could show you real Apache
config files. :)
This makes it look infeasible to me to declare metalink capability
globally. Of course, the server could maybe, eventually, do a metalink
for all the resources; but for a HEAD request or to generate a Link
header it would have to run through the full request processing phase to
decide on it.
All in all, the Link header seems to me to be more suitable for related
(other) resources. For instance, I can see a use case for adding a Link
header, for foobar, to a foobar.md5 and foobar.asc and foobar.sha1 and
foobar.torrent resource. From there it would be a logical step to say
that foobar.metalink would also be such a resource - yes. However, how
many Link headers can I practically add without exceeding the size of a
single TCP packet? It would not be practical as far as I can see. It
makes sense only in selected cases, I think. A Metalink, on the other
hand, already _is_ a "directory" of such related resources, in a format
that encompasses them together and can be handled efficiently. Like a
hundred Link headers at a time ;) Looks like a similar effort, doesn't
it? ;)
I'm new to this, and happy to learn more.
Thanks,
Do a GET. If there is no Link, continue downloading from it. If there is a
Link for a .metalink, download the .metalink, parse it, and... you will
probably find the original URL is one of the mirrors. So you never have to
abort the first GET, just keep using it as a download source.
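that strategy can be sketched in a few lines (the action names are
purely illustrative, not from any real client):

```python
# Start the GET, inspect the response's Link header, and if a metalink
# is advertised, fetch it without aborting the transfer already in
# progress -- the original URL is usually among the mirrors anyway.

def plan_after_get(response_headers):
    """Return the actions a client would take after the initial GET."""
    link = response_headers.get("Link", "")
    if 'rel="describedby"' in link and "application/metalink+xml" in link:
        return ["fetch-metalink", "keep-current-get", "add-mirrors"]
    return ["keep-current-get"]
```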
I want to reiterate that I am not suggesting replacing the content of a Metalink document with a set of links. Just the opposite. I think Metalink is a perfect example of a resource descriptor with very specific use cases (as opposed to more generic descriptors such as POWDER and XRD). I am also not objecting to the use of content negotiation. It is your prerogative to define Metalink as a valid representation of a file resource. Either way, this is more of a philosophical discussion than a technical one and I think we are past that.
What I am suggesting is that links can offer similar functionality. <LINK> elements are not valid in this case because the HTML page is not likely to be the file being downloaded (unless you consider the HTML page another representation of the same resource, but that is a stretch in this case). This leaves us with HTTP Link headers, which can be obtained from a HEAD request on the file URI, and Link-Pattern records in the site's /host-meta file.
The last option is interesting because it allows an entire download server to declare how to obtain a Metalink for any file within its authority (host). Yes, if you only download one file, you need two round trips. But if you are downloading multiple files from the same server, you can cache the /host-meta information and go directly to the Metalink descriptor for any given download.
One more thing that links can offer you is the ability to support Metalink on shared hosting environments where the service is not likely to give you access to content negotiation configuration on the server. But if you can drop a /host-meta file, you can bypass that and still support Metalink. But this of course requires that the spec tell clients to look for such links if the Accept header approach fails.
Now, the design requirements for the link framework I'm proposing are much more restrictive than what I am assuming you are using for Metalink. For example:
1. Clients can be assumed to have full access to the full HTTP feature set including content negotiation.
2. There is no such thing as a Metalink of a Metalink file (i.e. second derivative).
3. Metalink is not useful as a format for anything else. It is always associated with a file which is the primary focus.
4. File servers are willing to support content negotiation for file downloads.
None of these are possible for my own use cases. For example, Yahoo! will not allow using content negotiation on key properties such as the front page, but I still need to associate a descriptor with it. Yes, it is unlikely that Y!'s front page will even support or benefit from Metalink. Also, many of the platforms I need to support, such as Javascript, Flash, and old versions of PHP, will not give clients easy access to some HTTP features. And many hosting services will not allow users to set up content negotiation for their files.
So my suggestion is for you to keep what you have, and consider if the value of what I am proposing is worth including as a secondary discovery mechanism.
EHL