I have recently submitted a proposal for a new email header,
the Archived-At header. You can find it at
http://www.ietf.org/internet-drafts/draft-duerst-archived-at-00.txt
It defines a new email header, Archived-At:, to provide a direct
link to the archived form of an individual mail message. We use
this extremely successfully (currently in the form X-Archived-At)
at W3C.
I would highly appreciate any feedback and comments on this proposal.
Please make sure you copy me, as I'm not subscribed to this list.
Regards, Martin.
It might be appropriate for the field name to start with List-*, at
least for those cases where the archive is associated with a list. That
way, it's easier to separate fields added by a list from fields supplied
by the sender.
I'd also recommend that the link point to an archived copy of the
message in original form, rather than, say, one that is translated to
HTML. Translating to HTML causes a loss of information and potentially
a loss of functionality. You should be able to reply to an archived
message, refile it into a folder, follow threads, etc., but those
things are harder to do if the message is no longer in its original
format.
Thinking about this I realized that one reason I'd like such a field is
so that I could, given a message, more easily find other messages in the
thread. After all, if I already have a copy of the message with the
archived-at field, why would I want to download it? I'm much more likely
to want to look at either the messages that preceded or followed that
message.
As long as the field remains completely unstructured, I see no need to
support comments. I'd probably change my mind if the field were changed
to be, say, a list of URIs.
> I'd also recommend that the link point to an archived copy of the
> message in original form, rather than, say, one that is translated
> to HTML. Translating to HTML causes a loss of information and
> potentially a loss of functionality. You should be able to reply to
> an archived message, refile it into a folder, follow threads, etc.,
> but those things are harder to do if the message is no longer in its
> original format.
>
> Thinking about this I realized that one reason I'd like such a field
> is so that I could, given a message, more easily find other messages
> in the thread.
Actually, if the link points to a message/rfc822 resource, won't it be
harder to find the other messages in the thread than if the link points
to a text/html page with hyperlinks? On the other hand, it's easier to
reply to a message if it's given in message/rfc822 form (or at least, it
could be, with a little browser support). Maybe it would be best for
the Archived-At: field to point to a text/html page which in turn links
to a message/rfc822 version of the same message.
> After all, if I already have a copy of the message with the
> archived-at field, why would I want to download it?
You probably wouldn't, but regardless of what data type the Archived-At:
field points at, it will make citing that message easier. Currently,
you can either include a copy of the message (which is inefficient),
or figure out the URI yourself by searching the archive (which is
inconvenient for you) or provide the message-id and let the reader
search the archive (which is inconvenient for them).
AMC
obviously it depends on the html page with hyperlinks. there are a lot
of poorly designed archives out there.
the point is, if the fact that the resource is a message is lost, it
really cannot be handled correctly. you no longer have a message, you
have a translation of a message into text.
it's too bad that we don't have a mailbox access protocol that handles
threads well. maybe we need yet another IMAP extension.
> On the other hand, it's easier to reply to a message if it's given in
> message/rfc822 form (or at least, it could be, with a little browser
> support). Maybe it would be best for the Archived-At: field to point
> to a text/html page which in turn links to a message/rfc822 version of
> the same message.
no, I doubt it. once something is html, it's really not reasonable to
associate special semantics with it.
the client and server could do http content negotiation, but that only
works for http.
>> After all, if I already have a copy of the message with the
>> archived-at field, why would I want to download it?
>
> You probably wouldn't, but regardless of what data type the
> Archived-At: field points at, it will make citing that message easier.
agreed. I just don't think I'd want to cite _a message_ as often as
I'd want to cite _the context of a message_.
> As long as the field remains completely unstructured, I see no need to
> support comments. I'd probably change my mind if the field were changed
> to be, say, a list of URIs.
Comments are incompatible with URIs unless some quoting mechanism is used.
That is because URIs may contain parentheses, as in
http://users.erols.com/blilly/(foo)(bar)
See RFC 2396 for URI syntax details. RFC 2369 quotes URIs in angle brackets,
which are not themselves allowed in URIs.
One issue to consider is long URIs. If comments are disallowed, no quoting
is required. URIs cannot contain whitespace characters, so a simple way to
handle a long URI is to allow a URI to be line-folded; it can be reconstructed
by unfolding and eliding any whitespace. To handle multiple URIs, a quoting
mechanism can be used where line folding and whitespace (but not comments)
are permitted within the quotes; the URI can be reconstructed as described
above. Comments could be permitted outside the quotes.
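
A minimal sketch of the reconstruction described above, assuming the field carries a single URI and no comments. The function name and the example.org URI are illustrative only and do not come from the draft.

```python
import re


def unfold_uri(field_body: str) -> str:
    """Rebuild a URI that was folded across several header lines.

    URIs cannot contain whitespace, so any space, tab, CR, or LF in
    the field body must have been introduced by folding and is dropped.
    """
    return re.sub(r"[ \t\r\n]+", "", field_body)


# Hypothetical folded field body (the part after "Archived-At:").
folded = " http://lists.example.org/archives/some-list/2004Feb/\r\n very/long/path/0042.html"
print(unfold_uri(folded))
```

This only works while comments are disallowed; once comments or multiple URIs are permitted, some quoting mechanism is needed first.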
> Keith Moore <mo...@cs.utk.edu> wrote:
>
>> I'd also recommend that the link point to an archived copy of the
>> message in original form, rather than, say, one that is translated
>> to HTML. Translating to HTML causes a loss of information and
>> potentially a loss of functionality. You should be able to reply to
>> an archived message, refile it into a folder, follow threads, etc.,
>> but those things are harder to do if the message is no longer in its
>> original format.
>>
>> Thinking about this I realized that one reason I'd like such a field
>> is so that I could, given a message, more easily find other messages
>> in the thread.
>
> Actually, if the link points to a message/rfc822 resource, won't it be
> harder to find the other messages in the thread than if the link points
> to a text/html page with hyperlinks? On the other hand, it's easier to
> reply to a message if it's given in message/rfc822 form (or at least, it
> could be, with a little browser support). Maybe it would be best for
> the Archived-At: field to point to a text/html page which in turn links
> to a message/rfc822 version of the same message.
Another idea is to serve both message/rfc822 and text/html at the same
URI. Browsers generally indicate the format they want, and servers
can pick the right content accordingly. As far as conformance is
concerned, message/rfc822 could be a MUST (to make it useful from
programs), and text/html be a MAY (to improve rendering for humans).
On the other hand, perhaps this solution is too fragile.
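
To make the idea concrete, here is a rough server-side sketch of picking a representation from the Accept header. It is deliberately simplified (it ignores q-values and wildcards), and the function name is hypothetical; a real archive would use its HTTP server's own negotiation support.

```python
def pick_representation(accept_header: str) -> str:
    """Choose the media type to serve for one archived message.

    Simplification: q-values and wildcard ranges are ignored; only
    exact mentions of the two media types are considered.
    """
    offered = [part.split(";")[0].strip().lower()
               for part in accept_header.split(",")]
    if "message/rfc822" in offered:
        return "message/rfc822"
    if "text/html" in offered:
        return "text/html"
    # Fall back to the original message form (the proposed MUST).
    return "message/rfc822"


print(pick_representation("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
print(pick_representation("message/rfc822"))
```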
>I like the idea.
As someone who's made extensive use of the W3C list archive facility, I
heartily concur. It is a really valuable idea, and works very well in
practice.
>It might be appropriate for the field name to start with List-*, at
>least for those cases where the archive is associated with a list. That
>way, it's easier to separate fields added by a list from fields supplied
>by the sender.
No argument.
>I'd also recommend that the link point to an archived copy of the
>message in original form, rather than, say, one that is translated to
>HTML. Translating to HTML causes a loss of information and potentially
>a loss of functionality. You should be able to reply to an archived
>message, refile it into a folder, follow threads, etc., but those
>things are harder to do if the message is no longer in its original
>format.
Hmmm... the way the W3C system works, the URL works by redirection (or
something) to the actual archived message. I guess that multiple formats
could be served at the same URI using existing Web protocols. (e.g. use
HTTP GET with Accept: message/rfc822 -- then it's up to the archive
implementation to determine if it honours or rejects that.)
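
A client-side sketch of such a request, using Python's standard urllib. The function name and the example.org URI are hypothetical, and the request is only built here, not sent; whether a given archive honours the Accept header is up to that archive.

```python
import urllib.request


def build_raw_message_request(archived_at: str) -> urllib.request.Request:
    """Build a GET request asking for the original message/rfc822 form."""
    return urllib.request.Request(
        archived_at,
        headers={"Accept": "message/rfc822"},
    )


req = build_raw_message_request("http://lists.example.org/mid/abc%40example.org")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen() would perform the fetch; the archive may still answer with text/html or an error if it does not offer the raw form.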
The existing W3C system provides links for following threads, etc., using a
regular browser.
(Did anyone mention "running code"? ;-)
#g
--
>Thinking about this I realized that one reason I'd like such a field is
>so that I could, given a message, more easily find other messages in the
>thread. After all, if I already have a copy of the message with the
>archived-at field, why would I want to download it? I'm much more likely
>to want to look at either the messages that preceded or followed that
>message.
>
>As long as the field remains completely unstructured, I see no need to
>support comments. I'd probably change my mind if the field were changed
>to be, say, a list of URIs.
------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact
Many thanks for your comments.
At 20:37 04/02/20 -0500, Bruce Lilly wrote:
>Keith Moore wrote:
>
> > As long as the field remains completely unstructured, I see no need to
> > support comments. I'd probably change my mind if the field were changed
> > to be, say, a list of URIs.
>
>Comments are incompatible with URIs unless some quoting mechanism is used.
>That is because URIs may contain parentheses, as in
>http://users.erols.com/blilly/(foo)(bar)
>See RFC2396 for URI syntax details. RFC 2369 quotes URIs in angle brackets,
>which are not themselves allowed in URIs.
I have disallowed comments.
>One issue to consider is long URIs. If comments are disallowed, no quoting
>is required. URIs cannot contain whitespace characters, so a simple way to
>handle a long URI is to allow a URI to be line-folded; it can be reconstructed
>by unfolding and eliding any whitespace. To handle multiple URIs, a quoting
>mechanism can be used where line folding and whitespace (but not comments)
>are permitted within the quotes; the URI can be reconstructed as described
>above. Comments could be permitted outside the quotes.
I have mentioned the length limitations of RFC 2822. I think the 78
limitation may be a bit tough in some cases, but the 998 limitation
should not cause any problems in practice.
Regards, Martin.
At 18:21 04/02/20 -0500, Keith Moore wrote:
>I like the idea.
Thanks. Not exactly mine, but I'm glad to spread the word.
>It might be appropriate for the field name to start with List-*, at
>least for those cases where the archive is associated with a list. That
>way, it's easier to separate fields added by a list from fields supplied
>by the sender.
Good point. But it is not really restricted to lists. There are
other potential applications, which are not related to mailing lists.
So starting with List-* would be confusing. I added a sentence
to that effect.
>I'd also recommend that the link point to an archived copy of the
>message in original form, rather than, say, one that is translated to
>HTML. Translating to HTML causes a loss of information and potentially
>a loss of functionality. You should be able to reply to an archived
>message, refile it into a folder, follow threads, etc., but those
>things are harder to do if the message is no longer in its original
>format.
I have added some text about different formats, along the lines
suggested by Graham. I think functions such as being able to reply,...
would be great. Actually, our archives have such a function;
look for "Mail actions: [ respond to this message ]". The functionality
is limited to what can be done with the 'mailto' URI, and only
preserves some headers, and not the body. If you have ideas on
how to implement improvements, that would be great.
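
For readers unfamiliar with what a 'mailto' link can carry, here is a sketch of how such a reply link might be built. This is not the W3C archive's actual code; the function name, the choice of header fields, and the example addresses are assumptions, and many mail clients ignore header fields other than Subject when following a mailto URI.

```python
from urllib.parse import quote


def reply_mailto(list_address: str, subject: str, message_id: str) -> str:
    """Build a mailto URI that pre-fills a reply to an archived message."""
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject
    fields = {
        "Subject": subject,
        "In-Reply-To": message_id,
        "References": message_id,
    }
    # Percent-encode every value so "<", ">", "@" and spaces survive.
    query = "&".join(
        f"{name}={quote(value, safe='')}" for name, value in fields.items()
    )
    return f"mailto:{quote(list_address, safe='@')}?{query}"


print(reply_mailto("some-list@example.org", "Archived-At", "<abc@example.org>"))
```

As noted above, this approach cannot carry the body of the original message, which is the main limitation of doing replies through 'mailto'.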
>Thinking about this I realized that one reason I'd like such a field is
>so that I could, given a message, more easily find other messages in the
>thread. After all, if I already have a copy of the message with the
>archived-at field, why would I want to download it? I'm much more likely
>to want to look at either the messages that preceded or followed that
>message.
Yes. The other way we use it at W3C is to point others to emails
without having to look them up in the archive. The typical example
goes as follows: As a chair of a WG, I have to prepare an agenda for
a teleconference. I go through the mails (in my email archive) that
I think are relevant. With the system we have, I just copy/paste
the URI to the agenda. Without that system, I'd either have to
go to the archives and search for the message, or try to describe
the message in terms of sender, date/time, subject,...
>As long as the field remains completely unstructured, I see no need to
>support comments. I'd probably change my mind if the field were changed
>to be, say, a list of URIs.
I have added a sentence saying that multiple headers should be
used for multiple URIs. I also mentioned that for HTTP URIs,
there may be a "300 Multiple Choices" response. We actually
produce such a response if a mail goes to more than one mailing
list, and therefore appears more than once in the archive.
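
As an illustration of the one-URI-per-header rule, a reader could collect all archive locations of a message as sketched below with Python's standard email package. The sample message and its example.org URIs are made up for the example.

```python
from email import message_from_string
from typing import List

# Hypothetical message that was archived in two places.
raw = (
    "From: sender@example.org\r\n"
    "Subject: test\r\n"
    "Archived-At: http://lists.example.org/archive/list-a/0001.html\r\n"
    "Archived-At: http://lists.example.org/archive/list-b/0007.html\r\n"
    "\r\n"
    "body\r\n"
)


def archived_at_uris(raw_message: str) -> List[str]:
    """Return the value of every Archived-At field, in order of appearance."""
    msg = message_from_string(raw_message)
    return [value.strip() for value in msg.get_all("Archived-At", [])]


print(archived_at_uris(raw))
```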
Regards, Martin.
Maybe it should only start with List- when added by a list. But it
seems like we're finding more and more cases where it's a bad idea to
have a header field without very clear rules on who is allowed to set
that field. We've seen this with Reply-To and Sender and also with
Carl Malamud's Solicitation field proposal.
>> I'd also recommend that the link point to an archived copy of the
>> message in original form, rather than, say, one that is translated to
>> HTML. Translating to HTML causes a loss of information and potentially
>> a loss of functionality. You should be able to reply to an archived
>> message, refile it into a folder, follow threads, etc., but those
>> things are harder to do if the message is no longer in its original
>> format.
>
> I have added some text about different formats, along the lines
> suggested by Graham. I think functions such as being able to reply,...
> would be great. Actually, our archives have such a function;
> look for "Mail actions: [ respond to this message ]". The functionality
> is limited to what can be done with the 'mailto' URI, and only
> preserves some headers, and not the body. If you have ideas on
> how to implement improvements, that would be great.
There are limits on what you can do with HTML and HTTP. Being able to
access the archived message from IMAP is potentially a lot better, but
(as with many things) even IMAP needs some tweaking to make it work
well for this.
> look for "Mail actions: [ respond to this message ]". The
> functionality is limited to what can be done with the 'mailto' URI,
> and only preserves some headers, and not the body. If you have ideas
> on how to implement improvements, that would be great.
My first idea is to add a "raw message" link to a message/rfc822
resource. The browser could invoke an internal or affiliated MUA, or
provide the ability to save it to a local mailbox (so that you could
then use your favorite MUA), or provide the ability to resend the
message to yourself (again, so that you could then use your favorite
MUA). Those would all involve some special browser support for
message/rfc822. Without such support, users of Unix-like systems could
save the message to a file and then run sendmail to send it to their own
address.
This suggests a second idea: the archive server accepts requests to mail
a copy of a message to the user. There are various ways such requests
could be issued (typing your address into a form field, logging in and
having a saved preference, sending the request via mail [via a mailto:
link] and having the server reply), but they all raise the possibility
that the system could be abused and made to send arbitrary messages to
arbitrary recipients.
A third idea is for the HTML version to provide a link to
imap://server/msg-id, which would refer to a public (anonymous)
read-only virtual mailbox that contains one message. Your browser might
be able to invoke an MUA and point it at that mailbox, or you might be
able to paste the URI into your favorite MUA.
AMC
How many people actually use IMAP? (I don't.)
That said, I see no reason why the URI specified should not be an IMAP URI.
#g
> How many people actually use IMAP? (I don't.)
That depends on the question:
IMAP the protocol? Many.
IMAP the URI scheme? Probably very few.
There are many software packages that support message storage and access
via IMAP, several of which are in use at quite large private installations
(ca. 100,000 users), as well as at ISPs. http://www.imap.org/about/whatisIMAP.html
is a good place to start to learn more about what is available.
As far as IMAP the URI scheme, I have tried a few browsers, and only one
(Konqueror) appears to support it.
> That said, I see no reason why the URI specified should not be an IMAP URI.
Or POP. Browser support for the POP URI scheme is also an issue for that,
though it would likely be much easier to add support for POP than for IMAP
since the POP protocol is simpler.
IMAP certainly is more complex than POP, but I'm not sure that's
relevant to this issue.
If all you want to do is fetch a single message based on a URL, using
IMAP should be about as much work as POP. Things like IMAP's complex
support for server-side search don't matter to this simple task.
Arnt
why does it matter what they use to read their mail, as long as the
right thing happens when they click on an IMAP URI?
lots of MUAs do support IMAP. whether they support the ability to
reference messages via IMAP URLs is a different question.
mail archives won't get better without specifications for how to make
them better.
> IMAP certainly is more complex than POP, but I'm not sure that's
> relevant to this issue.
>
> If all you want to do is fetch a single message based on a URL, using
> IMAP should be about as much work as POP. Things like IMAP's complex
> support for server-side search don't matter to this simple task.
It depends on what protocols and implementation languages one is
familiar with. If one has been using something like C and is familiar
with other simple protocols (e.g. SMTP, NNTP, HTTP), POP is easy to
implement. Unless one is familiar with and actually likes programming
languages like LISP, one is unlikely to find implementing an IMAP
client to be pleasant, let alone easy. Aside from that, the simplicity
of POP's state-based protocol makes for a straightforward
implementation, as opposed to IMAP's asynchronous protocol (where
commands are tagged and response tags have to be matched to the
corresponding command).
The big issue though is that there doesn't appear to be a good mechanism
for responding to a mailing list archived message (or a message in a
list digest, in most cases) while preserving References etc. Lack of
support for IMAP and POP URI schemes may be a factor, but even widespread
support for those schemes would still require a means of getting the
message content into an application that can generate a response with
appropriate References, In-Reply-To, etc. fields. I suppose that could
in theory be done by having a browser launch a POP or IMAP email
application to retrieve the message, however in practice there would be
problems with that approach (e.g. launching Mozilla when there is already
an instance running doesn't work). And yes, the choice of POP or IMAP
doesn't really matter as far as addressing that big issue is concerned.
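
To show what "a response with appropriate References, In-Reply-To, etc. fields" involves once the original message is available in message/rfc822 form, here is a sketch using Python's standard email package. How the message is retrieved (HTTP, POP, IMAP) is left out; the function name and sample addresses are hypothetical.

```python
from email import message_from_string
from email.message import EmailMessage


def _clean(value: str) -> str:
    """Collapse folding whitespace so the value fits on one logical line."""
    return " ".join(value.split())


def build_reply(original_raw: str, sender: str, body: str) -> EmailMessage:
    """Create a reply that stays in the thread of the original message."""
    original = message_from_string(original_raw)
    msg_id = _clean(original.get("Message-ID", ""))
    old_refs = _clean(original.get("References", ""))
    subject = _clean(original.get("Subject", ""))
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject
    recipient = _clean(original.get("Reply-To") or original.get("From", ""))

    reply = EmailMessage()
    reply["From"] = sender
    if recipient:
        reply["To"] = recipient
    reply["Subject"] = subject
    if msg_id:
        # In-Reply-To names the parent; References lists the whole chain.
        reply["In-Reply-To"] = msg_id
        reply["References"] = " ".join(p for p in (old_refs, msg_id) if p)
    reply.set_content(body)
    return reply
```

None of this is possible if only an HTML rendering is available and the Message-ID and References fields have been dropped from it.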
> lots of MUAs do support IMAP. whether they support the ability to
> reference messages via IMAP URLs is a different question.
Here's one data point: The way you tell Mutt to open an IMAP (or POP)
mailbox rather than a local mailbox is by giving it an imap: (or pop:)
URI instead of a pathname. I don't think it understands the more
fine-grained imap: URIs though (the ones that refer to resources smaller
than a mailbox, like individual messages).
AMC
>>>There are limits on what you can do with HTML and HTTP. Being able to
>>>access the archived message from IMAP is potentially a lot better, but
>>>(as with many things) even IMAP needs some tweaking to make it work well
>>>for this.
>>
>>How many people actually use IMAP? (I don't.)
>
>why does it matter what they use to read their mail, as long as the right
>thing happens when they click on an IMAP URI?
>
>lots of MUAs do support IMAP. whether they support the ability to
>reference messages via IMAP URLs is a different question.
>
>mail archives won't get better without specifications for how to make them
>better.
I feel as if we're each missing the other's point. Mine was two-fold:
(a) that a new facility that depended on IMAP might not be easy to get
deployed. Maybe. I was asking a question rather than making an assertion.
(b) that I didn't see the need for any fundamental change to
draft-duerst-archived-at-00.txt to allow IMAP to be used in conjunction with
the new header.
So it seems I'm missing something in your position that makes mention of
IMAP important to this specification?
(Also, RFC3205 [1] notwithstanding, I do tend to wonder what it is that
IMAP offers in this context that isn't performed equally well using an
HTTP-based implementation. With respect to section 2.5 of RFC 3205, HTTP
seems pretty appropriate on all counts, given that most email archives are
accessed today using HTTP. Accessing email archives seems to be a function
that fits right into HTTP's core competence.)
#g
--
[1] http://www.ietf.org/rfc/rfc3205.txt
>Comments are incompatible with URIs unless some quoting mechanism is used.
>That is because URIs may contain parentheses, as in
>http://users.erols.com/blilly/(foo)(bar)
>See RFC2396 for URI syntax details. RFC 2369 quotes URIs in angle brackets,
>which are not themselves allowed in URIs.
Well this is not the first time someone has proposed a structured header
with a URI in it, surely? So what has been done about this on previous
occasions? Parentheses in URIs are not all that common, so they could be
escaped. Or else the whole thing could be put inside a quoted-string.
Using angle-brackets would doubtless also be a good idea, though I do not
think that on its own would allow comments to appear outside of the angle
brackets with parentheses inside being treated differently.
>One issue to consider is long URIs. If comments are disallowed, no quoting
>is required. URIs cannot contain whitespace characters, so a simple way to
>handle a long URI is to allow a URI to be line-folded; it can be reconstructed
>by unfolding and eliding any whitespace.
Long URIs are a pain in message/article bodies too. Is there any
possibility that those responsible for fixing the format of URIs could
include a folding mechanism? I do not like the idea of fixing what is a
more widespread URI problem within just a few particular email headers.
Another possibility is some adaptation of the mechanism used for long
parameters in RFC 2231. Not that I am enamoured of the particular solution
provided there, but I am even less enamoured of inventing YAFM.
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133 Web: http://www.cs.man.ac.uk/~chl
Email: c...@clerew.man.ac.uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
>It defines a new email header, Archived-At:, to provide a direct
>link to the archived form of an individual mail message. We use
>this extremely successfully (currently in the form X-Archived-At)
>at W3C.
This could also have applications within Usenet. Currently, FAQs etc
posted to news.answers have a "pseudo-header" in the body of the article,
for example:
Archive-name: news-answers/guidelines
Version: $Id: guidelines,v 2.49 98/09/28 20:02:32 jjnichol Exp $
Posting-Frequency: monthly
Copyright: see Section 5
The meaning of that header is "look in the usual archive site, or one of
its mirrors, and look for the file news-answers/guidelines". I don't
overmuch like that "pseudo-header" convention, and I would like to see a
better mechanism, but for now there it is.
Now if an Archived-At header were to be used for the same purpose, it would
look like:
Archived-At: <ftp://rtfm.mit.edu/pub/usenet/news.answers/news-answers/guidelines>
Observe that the URI doesn't need to be 'http', and even if it is, it
doesn't mean that the document retrieved would be in HTML.
Anyway, this raises two further issues:
1. The syntax should allow a comma-separated list of URIs. This might
allow for the article to be retrieved in various different formats
(presumably the file extension at the end would tell you which).
2. The situation where there are numerous mirrors of the archive site
needs to be borne in mind. Is there already a mechanism within the URI
system for retrieving a resource from one of a number of sites?
If not, would something like the following work:
List-Archive: <ftp://rtfm.mit.edu/pub/usenet/news.answers/>,
<ftp://rtfm.mirror.someplace.net/news.answers/>
Archived-At: <news-answers/guidelines>
where the Archived-At is presumed to be relative to one of the
List-Archive alternatives.
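
A sketch of how a client might expand that combination into candidate locations, assuming the relative reference is resolved against each List-Archive URI in turn by ordinary relative-URI resolution. This is only one reading of the suggestion; the function name is made up, and nothing here is defined by the draft or by RFC 2369.

```python
import re
from typing import List
from urllib.parse import urljoin


def candidate_locations(list_archive: str, archived_at: str) -> List[str]:
    """Resolve a relative Archived-At value against every List-Archive URI."""
    bases = re.findall(r"<([^<>]*)>", list_archive)
    relative = archived_at.strip().strip("<>")
    return [urljoin(base, relative) for base in bases]


list_archive = ("<ftp://rtfm.mit.edu/pub/usenet/news.answers/>,\r\n"
                " <ftp://rtfm.mirror.someplace.net/news.answers/>")
for uri in candidate_locations(list_archive, "<news-answers/guidelines>"):
    print(uri)
```

The client would then try the candidates in order until one can be retrieved.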
This also suggests that the name of the new header should be
List-Archived-At, as others have suggested.
Let me put it another way.
1. I really don't think we should standardize a header field pointer to
mail archives that depends on HTML and HTTP.
2. IMAP seems like the best protocol available for the purpose of
accessing mail archives, so I think that IMAP should be cited in
any standard for such archives.
3. However, I don't think that we should limit such a standard to be
used exclusively with IMAP or any other protocol. Clearly it's much
easier to deploy HTTP archives than IMAP archives.
There's a conflict between what is easy to deploy and what would be
a desirable state to encourage. I don't want to insist that we only
approve a header that works in that more desirable state - I just
want to make sure there's a clear upgrade path to that more desirable
state.
> (Also, RFC3205 [1] notwithstanding, I do tend to wonder what it is
> that IMAP offers in this context that isn't performed equally well
> using an HTTP-based implementation. With respect to section 2.5 of
> RFC 3205, HTTP seems pretty appropriate on all counts, given that most
> email archives are accessed today using HTTP. Accessing email
> archives seems to be a function that fits right into HTTP's core
> competence.)
HTML archives of mail suck, for reasons that RFC 3205 didn't even begin
to touch on. You don't want to have to use a different interface to read
mail archives than to read ordinary mail. You want the same kinds of
sorting, searching, (re)filing, address book handling, annotation,
replying, etc. available to mail archived elsewhere as for your personal
mail archives. HTTP could be used as a mailbox access protocol, but
it lacks most of the features needed - it is less functional than POP
for that purpose.
Here are some problems with using HTTP to access mail archives:
- Most MUAs don't know how to access messages using HTTP, which pretty
much limits access to web browsers. See below.
- There are no conventions for the mapping of HTTP URLs onto commonly
understood abstractions for groups of email messages such as
folders. There is no way to list all messages in a folder, or to
search through a folder for messages matching a particular pattern,
etc. WEBDAV might provide some of the missing functionality,
but it still needs to be defined how to use it for email.
Here are some problems with representing mail messages in HTML:
- it discards the fact that the resource was a mail message
- it translates valuable semantic content into mere text
(more specific examples below)
- mail readers won't know what to do with it other than to display it.
they can't reply to it, refile it, etc.
- server-side mail interfaces are slow, and cannot take advantage
of client-side context such as address books, preferences for
how outgoing mail is sent, how replies are composed, etc.
- can't do duplicate message suppression when the same message
appears from multiple sources
On this, we agree.
And there is nothing in Martin's draft that has any such dependency.
So I don't think there is a problem with [1] on these grounds.
(I've skipped the rest of your message: I did read it, and I agree with
some points you make, and not with others, but I don't think they're really
germane to the specification on the table.)
#g
--
[1] http://www.ietf.org/internet-drafts/draft-duerst-archived-at-00.txt
> Well this is not the first time someone has proposed a structured header
> with a URI in it, surely? So what has been done about this on previous
> occasions? Parentheses in URIs are not all that common, so they could be
> escaped. Or else the whole thing could be put inside a quoted-string.
RFC 2017 provides for URIs in parameters. RFC 2557 uses URIs in a structured
field. 2557 syntax is broken because it allows comments (this has been
discussed on the MHTML list). RFC 2369 uses URIs in various List- (structured)
fields. 2369 uses a quoting mechanism which permits comments outside of the
quoted URIs.
> Using angle-brackets would doubtless also be a good idea, though I do not
> think that on its own would allow comments to appear outside of the angle
> brackets with parentheses inside being treated differently.
See RFC 2369. The quoting mechanism differentiates what is quoted (the
URI) from what is not quoted (commas in the case of lists, whitespace,
line folding, comments).
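
A rough sketch of a parser in the spirit of that quoting mechanism: URIs are taken only from inside angle brackets, so parentheses inside a URI are left alone, while parenthesized comments outside the brackets are skipped. It is a simplification for illustration, not a full RFC 2369 or RFC 2822 parser; the function name and the example.org address are hypothetical.

```python
from typing import List


def extract_uris(field_body: str) -> List[str]:
    """Return the angle-bracketed URIs of a field, ignoring comments."""
    uris = []
    i, n = 0, len(field_body)
    while i < n:
        ch = field_body[i]
        if ch == "<":
            end = field_body.find(">", i + 1)
            if end == -1:
                break  # unterminated bracket: stop rather than guess
            # Drop folding whitespace; URIs cannot contain any.
            uris.append("".join(field_body[i + 1:end].split()))
            i = end + 1
        elif ch == "(":
            # Skip a (possibly nested) comment outside the brackets.
            depth = 1
            i += 1
            while i < n and depth:
                if field_body[i] == "\\":
                    i += 1  # skip the escaped character
                elif field_body[i] == "(":
                    depth += 1
                elif field_body[i] == ")":
                    depth -= 1
                i += 1
        else:
            i += 1
    return uris


body = ("<http://users.erols.com/blilly/(foo)(bar)> (Web archive),\r\n"
        " <mailto:some-list@example.org> (see <not-a-uri>)")
print(extract_uris(body))
```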
> Long URIs are a pain in message/article bodies too. Is there any
> possibility that those responsible for fixing the format of URIs could
> include a folding mechanism? I do not like the idea of fixing what is a
> more widespread URI problem within just a few particular email headers.
Whitespace and CRLF are not permitted in URIs. So folding isn't a problem.
See RFC 2396 Appendix E.
I think there is an implied dependency of sorts. Without specifying how
to access archives in IMAP people are going to assume that HTTP is the
only thing that can be used. The danger is that we will paint ourselves
into a corner where we are stuck with using HTTP.
Is this extension for email readers or for web browsers?
>>Good point. But it is not really restricted to lists. There are
>>other potential applications, which are not related to mailing lists.
>>So starting with List-* would be confusing.
>
>Maybe it should only start with List- when added by a list.
Although I expect this header to mostly be used in connection
with mailing lists, the archiving functionality is really not
at all tied to mailing lists. Also, what would be the benefit
of having Archived-At and List-Archived-At for the user who
wants to find an archived copy of the message?
In our implementation, the URI you get with Archived-At is
completely independent of any particular mailing list, and
is the same for all mailing lists in case of cross-posting.
I don't even know whether the header is added once when the
email comes into our mailing list system, or it is added
independently for each mailing list. Indeed, my guess is
that it's something in the middle. If we had been able to
tightly integrate the email sending and the archiving
process, we wouldn't have needed the MID redirection.
>But it seems like we're finding more and more cases where it's a bad idea
>to have a header field without very clear rules on who is allowed to set
>that field. We've seen this with Reply-To and Sender and also with Carl
>Malamud's Solicitation field proposal.
These are certainly good examples. But I think there is a difference
between a header that can be added several times (as the Archived-At
header) and a header that can only be added once (as I think is the
case with the above examples).
Or in other words: The rules are very clear, anybody who archives
the message can add an Archived-At header, and nobody should
overwrite existing ones. I clarified this in the draft.
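
A minimal sketch of that rule (each archiver appends its own field and
nobody overwrites an existing one), using Python's standard email package.
The helper name add_archived_at and the archive-a.example and
archive-b.example URIs are made up for illustration; nothing here comes
from the draft itself.

```python
from email.message import EmailMessage


def add_archived_at(msg, uri):
    """Append an Archived-At field, leaving any existing ones untouched."""
    # add_header() always appends a new field; it never replaces one.
    msg.add_header("Archived-At", uri)


msg = EmailMessage()
msg["Subject"] = "test"
# Two independent archivers (hypothetical URIs) each add their own field.
add_archived_at(msg, "http://archive-a.example/msg/1")
add_archived_at(msg, "http://archive-b.example/msg/1")

# get_all() returns every occurrence, in the order the fields were added.
all_links = msg.get_all("Archived-At")
```

A recipient that wants every known archive location would read the field
with get_all() rather than a plain lookup, which returns only one value.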
>Being able to access the archived message from IMAP is potentially a lot
>better, but (as with many things) even IMAP needs some tweaking to make it
>work well for this.
Very good idea. I have mentioned this in the draft. I thought
that the IMAP URI scheme only allowed addressing folders,
but that's not at all the case.
Regards, Martin.
Having different fields for different parties adding an archived-at
field would allow the recipient to determine which party added that
information. That way, if the information is misleading or inaccurate
or the archive is not properly maintained, the recipient has some idea
of who is responsible.
> In our implementation, the URI you get with Archived-At is
> completely independent of any particular mailing list, and
> is the same for all mailing lists in case of cross-posting.
The question is not so much whether the archive is associated with
a list, as what party supplied the information. (though I'd also assert
that it's useful to see how a message fit into the conversation of
a particular list, even if the message was posted to more than one
list.) Also, sometimes you want to filter out certain kinds
of header fields (e.g. you want to remove List- fields from messages
that were re-submitted to other lists).
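
To illustrate the filtering argument: if list-supplied fields share the
List- prefix, a resubmitting agent can strip them with one prefix test.
This is only a sketch using Python's standard email package; the field name
List-Archived-At is the hypothetical variant under discussion, and the URIs
are invented.

```python
from email.message import EmailMessage


def strip_list_fields(msg):
    """Remove every header field whose name starts with "List-"."""
    # Collect the names first so the header list is not modified mid-iteration.
    names = {name for name in msg.keys() if name.lower().startswith("list-")}
    for name in names:
        del msg[name]  # deletes all occurrences of that field name
    return msg


msg = EmailMessage()
msg["List-Archive"] = "<http://lists.example.org/archive/>"
msg["List-Archived-At"] = "<http://lists.example.org/archive/msg1>"
msg["Archived-At"] = "http://sender.example.org/sent/msg1"
strip_list_fields(msg)
remaining = msg.keys()
```

With a single unprefixed Archived-At field, the same agent could not tell
list-added instances from sender-added ones by name alone.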
> >But it seems like we're finding more and more cases where it's a bad idea
> >to have a header field without very clear rules on who is allowed to set
> >that field. We've seen this with Reply-To and Sender and also with Carl
> >Malamud's Solicitation field proposal.
>
> These are certainly good examples. But I think there is a difference
> between a header that can be added several times (as the Archived-At
> header) and a header that can only be added once (as I think is the
> case with the above examples).
> Or in other words: The rules are very clear, anybody who archives
> the message can add an Archived-At header, and nobody should
> overwrite existing ones. I clarified this in the draft.
Being able to have multiple instances of the field does help, but it still
begs the question of which intermediary added the field.
> >Being able to access the archived message from IMAP is potentially a lot
> >better, but (as with many things) even IMAP needs some tweaking to make it
> >work well for this.
>
> Very good idea. I have mentioned this in the draft. I thought
> that the IMAP URI scheme only allowed addressing folders,
> but that's not at all the case.
I'll look forward to seeing the next draft.
Keith
Any definitive answers appreciated here. For the moment,
I'm just keeping it with "one URI, no comments, no special
mechanisms, follow the length limitations in RFC 2822".
>Parentheses in URIs are not all that common, so they could be
>escaped. Or else the whole thing could be put inside a quoted-string.
>
>Using angle-brackets would doubtless also be a good idea, though I do not
>think that on its own would allow comments to appear outside of the angle
>brackets with parentheses inside being treated differently.
The fewer complications, the better in my opinion.
> >One issue to consider is long URIs. If comments are disallowed, no quoting
> >is required. URIs cannot contain whitespace characters, so a simple way to
> >handle a long URI is to allow a URI to be line-folded; it can be
> >reconstructed by unfolding and eliding any whitespace.
>
>Long URIs are a pain in message/article bodies too. Is there any
>possibility that those responsible for fixing the format of URIs could
>include a folding mechanism? I do not like the idea of fixing what is a
>more widespread URI problem within just a few particular email headers.
There is something explicit in the new URI draft, at
http://gbiv.com/protocols/uri/rev-2002/rfc2396bis.html,
but as far as I know, it's not followed by MUAs.
Regards, Martin.
Neither, or both. It's an extension to the message *format*, not the
protocol. Applications may use it (or not) as they see fit.
My mail reader recognizes the http: protocol in an x-archived-at header and
invokes a browser to retrieve and display the message. But I'd expect it
to respond differently to other URI schemes.
If you feel there's an implied dependency, maybe something like this could
be added to the text:
[[
One way to use this header field is to include an HTTP URI, which can be
passed to a web browser to retrieve and display a copy of the archived
message. But other URI schemes may be used, such as imap:, and mail
applications may use the information thus provided in any way that assists
access to archived mail messages.
]]
#g
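
The behaviour described above (hand an http: URI to a browser, treat other
schemes differently) could be sketched as a small dispatch on the URI
scheme. The handler labels "browser", "mail-client" and "unknown" are
placeholders invented for this example, not anything a real mail reader
defines.

```python
from urllib.parse import urlsplit


def choose_handler(uri):
    """Pick a handler label for an Archived-At URI based on its scheme."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme in ("http", "https"):
        return "browser"      # pass the URI to a web browser
    if scheme == "imap":
        return "mail-client"  # fetch the message with the mail reader itself
    return "unknown"          # leave the decision to the user


http_choice = choose_handler("http://lists.example.org/archive/msg1")
imap_choice = choose_handler("imap://mail.example.org/ietf-822/;UID=42")
```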
While that might be true in principle, it's not true in practice. A
mail reader is much more likely to be able to make effective use of an
IMAP URL than an HTTP URL. And it's more likely to do a good job
handling a message/rfc822 resource than with a text message encoded as
text/html. There's something perverse in asking a mail reader to fire
up a web browser to read an email message. And for some strange reason
I think that extensions to the email format that exist for the purpose
of reading mail should be usable by mail readers.
It seems to me that the problem we're dancing around is that, as Keith said
earlier, there is little reason to access the same message in an archive
that one already has since the Archived-At field was gotten from it. On the
other hand, message/rfc822 doesn't have a good mechanism for linking to
the "context" of the message in an archive. References could be used in
concert with List-Archive to access its precursor messages, but finding its
successors and reconstructing the thread tree would require a huge amount
of effort.
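
The asymmetry can be sketched with a toy in-memory archive: a message's
precursors are listed in its own References field, but its successors can
only be found by examining every other message. The dictionary layout and
the message-ids below are assumptions made for the example.

```python
def predecessors(message):
    """Precursors come straight from the message's own References list."""
    return list(message["references"])


def successors(archive, message_id):
    """Successors require scanning the whole archive for back-references."""
    return [m["id"] for m in archive if message_id in m["references"]]


# A toy archive: each entry has a message-id and its References list.
archive = [
    {"id": "<1@example.org>", "references": []},
    {"id": "<2@example.org>", "references": ["<1@example.org>"]},
    {"id": "<3@example.org>", "references": ["<1@example.org>", "<2@example.org>"]},
]

before = predecessors(archive[1])
after = successors(archive, "<2@example.org>")
```

The scan in successors() is linear in the size of the archive, which is
the "huge amount of effort" when the archive can only be fetched message
by message.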
I guess that the data object that we are missing here is the Archive itself
with some sort of standardized access protocol. Then the Archived-At field
becomes some sort of <msgarch://host.name/archive.name?msg-id>.
--
Bill McQuillan <McQu...@pobox.com>
> I guess that the data object that we are missing here is
> the Archive itself with some sort of standardized access
> protocol. Then the Archived-At field becomes some sort of
> <msgarch://host.name/archive.name?msg-id>.
You mean like <imap://host/mailbox/;UID=uid> ?
(See RFC-2192.)
AMC
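
As a rough illustration of how a mail reader might take such a URL apart,
the sketch below handles only the simple imap://host/mailbox/;UID=uid form
quoted above. It ignores the other parts that RFC 2192 allows (user
information, UIDVALIDITY, sections), and the host and mailbox names are
invented.

```python
import re
from urllib.parse import unquote, urlsplit

# Matches only the simple "/mailbox/;UID=n" path form.
_PATH = re.compile(r"/(?P<mailbox>.+?)/;UID=(?P<uid>\d+)", re.IGNORECASE)


def parse_imap_message_url(url):
    """Return (host, mailbox, uid) for a simple IMAP message URL, else None."""
    parts = urlsplit(url)
    if parts.scheme.lower() != "imap":
        return None
    match = _PATH.fullmatch(parts.path)
    if match is None:
        return None
    return parts.hostname, unquote(match.group("mailbox")), int(match.group("uid"))


location = parse_imap_message_url("imap://mail.example.org/ietf-822/;UID=42")
```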
If developers don't choose to deploy imap: in the face of such a comment,
then I submit that they are never likely to do so at the exhortation of any
standard.
I think there's a real danger that your quest for something else will
impede the standardization of a Really Useful Feature, for which there is
running code. I think I have heard you subscribe to the idea that the best
is the enemy of the good in standards development ... I think this could be
a shining case in point.
What is it that you want to happen to this proposal?
#g
--
PS: FWIW, I use the feature described on an almost-daily basis, and I find
that using a browser to search email archives is not at all
perverse. Quite often, I find myself switching between a mail archive and
Google to track down information, and the browser is a natural user
interface that spans both of these. The feature described here is really
very useful and is deployed today (modulo an X- at the front of the header
field name).
------------
I think the answer is that something else has been bugging me about this
proposal that I'm only starting to get a handle on. I got some more
idea about it this morning when I woke up.
As I was saying earlier, there's something perverse in asking an email
reader to fire up a web browser to read an email message. Part of the
reason that this is perverse, of course, is that the email reader is
tailor-made to do things with email messages and the web browser is not.
Unless the MUA and web browser are very tightly coupled (which seems to
be the case less often these days) this seems like a suboptimal
arrangement.
And yes, HTTP can be used to download email messages, and HTTP content
negotiation can be used to allow a mail user agent to pick between
getting the message in message/rfc822 format and text/html format (or
some other format). If the message is in rfc822 format it should stay
in the mail reader; if it's in html format it should probably go to the
web browser. (Just as web browsers make poor mail readers, mail readers
make poor web browsers. Even if they support HTML they probably don't
support scripting, tabs, cookie management, security, history - things
that we have come to expect in order to use the web effectively, even
to read email archives). But at the time the MUA decides what to do
with the URI, it doesn't know whether it should download the message
itself or hand it off to a web browser.
So I'm currently thinking that Archived-At needs to supply more
information: namely the content types that are available. An
alternative might be to use a HEAD request to find out what
content-types are available.
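
A sketch of the HEAD-request alternative, under stated assumptions: the
request is only constructed here, not sent, and the routing labels
"mail-reader", "browser" and "ask-user" are invented. Note that a HEAD
response reports the type of the one representation the server selected,
not every type it could offer.

```python
import urllib.request


def build_head_request(uri):
    """Build (but do not send) a HEAD request for an archived message."""
    return urllib.request.Request(uri, method="HEAD")


def route_by_content_type(content_type):
    """Decide which tool should handle a resource of the given Content-Type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "message/rfc822":
        return "mail-reader"
    if media_type in ("text/html", "application/xhtml+xml"):
        return "browser"
    return "ask-user"


request = build_head_request("http://lists.example.org/archive/msg1")
```

Sending the request with urllib.request.urlopen(request) and passing the
response's Content-Type header to route_by_content_type() would complete
the round trip.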
Either way, this document probably needs to say some things about
- use of different representations of email messages, with native
(message/rfc822) format being strongly encouraged as the primary
format and HTML as a useful alternative.
- use of HTTP vs. IMAP (maybe also mentioning POP or even NNTP)
as an access protocol
- how a mail reader (or for that matter a web browser) should decide
whether to handle the message itself or hand it off to a different
tool.
- (maybe) organization of mail archives.
Some of this I understand Martin has already written, but I haven't
seen the revised document yet.
You asked me to comment on specific text that you suggested. The reason
I didn't do so was because I sensed that the issues needed more thought
and that it was premature to try to write specific text.
I do accept that there's a limit to how far this document should try to
go to address these issues - this is after all a document about a new
header field, not about how to set up email archives. My guess is that
about one paragraph for each of the above points is about right.
We could of course write a separate document on the subject of mail
archives, and perhaps we should. But we don't want to hold up this
document waiting for a recommendation for mail archives that we can
agree on.
> I think there's a real danger that your quest for something else will
> impede the standardization of a Really Useful Feature, for which there
> is running code. I think I have heard you subscribe to the idea that
> the best is the enemy of the good in standards development ... I think
> this could be a shining case in point.
I don't think it's too much to ask to discuss a Really Useful Feature
for a few days to try to understand its broader implications before we
try to promote some particular way of implementing it. Email header
fields are a data model, and there's a large body of experience that
data models are really difficult to get right. We want standards to be
stable, and we want them to be flexible enough to use for a long time.
We're not going to get that by standardizing the first idea that comes
to mind -- even if it's easy to implement and there's running code and
that running code is found to have some utility.
It may be that we place too much emphasis on running code. Any
programmer who has been around for a few years knows that a lot of code
that is "running" - maybe most of it - even if it is found to be useful,
often has a slapdash quality that won't necessarily hold up to the
strain of widespread use. Also there's an important difference between
running code in the form of a prototype and running code that implements
a well-designed specification. The prototype is useful as a
proof-of-concept and to identify flaws in the initial design, but it's
rarely suitable for widespread deployment. The running code that IETF
demands as a condition of advancement to Draft Standard serves a
different purpose - it is a check on the clarity and precision of the
specification, and that implementation is not too onerous.
--
He not busy being born, is busy dying. - Bob Dylan
> Any definitive answers appreciated here. For the moment,
> I'm just keeping it with "one URI, no comments, no special
> mechanisms, follow the length limitations in RFC 2822".
In terms of ABNF, similar to RFC 2822, that would be something like:
archived-at = "Archived-At" ":" [FWS] URI-reference *WSP CRLF ; URI-reference not empty
where "URI-reference" is defined in RFC 2396. You may also want to
support parsing (but not generation in new messages) of older forms,
e.g.:
obs-archived-at = ("Archived-At" / "X-Archived-At") *WSP ":" [FWS] URI-reference [FWS] CRLF
You might want to replace URI-reference with absoluteURI in both
places, though that would preclude use of a fragment identifier.
Yet another alternative would be to use "absoluteURI ["#" fragment]"
which precludes relative URIs while permitting a fragment identifier.
It really depends on what restrictions you want to place on the types
of URIs to be permitted.
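
For illustration, a loose approximation of the first rule as a Python
regular expression. It assumes the field has already been unfolded and
treats any run of non-whitespace characters as the URI, so it does not
check the RFC 2396 grammar; the function name is hypothetical.

```python
import re

# Field name, colon, optional whitespace, one whitespace-free token,
# optional trailing whitespace, optional CRLF.
_ARCHIVED_AT = re.compile(r"Archived-At:[ \t]*(\S+)[ \t]*(?:\r\n)?", re.IGNORECASE)


def parse_archived_at(line):
    """Return the URI from an unfolded Archived-At field line, or None."""
    match = _ARCHIVED_AT.fullmatch(line)
    return match.group(1) if match else None


uri = parse_archived_at("Archived-At: http://example.org/msg/1 \r\n")
```

Because nothing but whitespace may follow the URI, a trailing comment makes
the line fail to parse, which matches the "one URI, no comments" position.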
>RFC 2017 provides for URIs in parameters. RFC 2557 uses URIs in a structured
>field. 2557 syntax is broken because it allows comments (this has been
>discussed on the MHTML list). RFC 2369 uses URIs in various List- (structured)
>fields. 2369 uses a quoting mechanism which permits comments outside of the
>quoted URIs.
RFC 2017 deals with URIs as parameters, insists that they be in the form
of quoted-strings, and then allows you to insert FWS at selected places,
e.g.:
URL="ftp://ftp.deepdirs.org/1/2/3/4/5/6/7/
8/9/10/11/12/13/14/15/16/17/18/20/21/
file.html"
The broken-ness of RFC 2557 seems to arise from the fact that it does not
insist that any parentheses within the URI get %-quoted. Also from the
fact that it tells you to use the RFC 2017 mechanism for long URIs,
whereas that mechanism only applies to parameters. Also, it makes life
hard for itself by not providing delimiters (whether "..." or <...>)
around the URI. But, worryingly, Martin's proposed syntax also suffers
from some of these problems.
RFC 2369 appears to set the best precedents, including delimiting within
<...>, allowing FWS within those <...> (to be eliminated before use as a
URI), and provision for comma-separated lists of such URIs.
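
A simplified sketch of how a reader might consume a field value written in
that style: take everything between angle brackets, delete any whitespace
inside, and ignore whatever lies outside the brackets (commas, comments,
folding). It assumes comments never contain angle brackets themselves, and
the URIs are invented.

```python
import re


def parse_uri_list(value):
    """Extract URIs from a comma-separated list of <...>-delimited URIs."""
    uris = []
    for inner in re.findall(r"<([^<>]*)>", value):
        # Folding whitespace inside the brackets is removed before use.
        uris.append(re.sub(r"\s+", "", inner))
    return uris


value = "<http://a.example/1>,\r\n <http://b.example/very/\r\n long/path> (mirror)"
uris = parse_uri_list(value)
```

The order of the returned list is the order in the field, which is what
would carry any priority between the alternatives.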
Since the List-Archive header and the proposed [List-]Archived-At header
are quite likely to occur in the same message, it would be a great pity if
they did not follow the same syntactic conventions.
And I would much prefer a comma-separated list of URIs than multiple
Archived-At headers. For a start, it gives you a way to express some
priority between the various alternatives (as in the List-Archive header).
And then, multiple headers with the same name are a Bad Thing for other
reasons (for example if ever you come to have digital signatures of
headers, such as have been mentioned recently in other threads on this
list).
>Whitespace and CRLF are not permitted in URIs. So folding isn't a problem.
>See RFC 2396 Appendix E.
Not quite so.
<http://gbiv.com/protocols/uri/rev-2002/rfc2396bis.html>
seems to recommend, in its Appendix C, that you should "just put spaces"
into your URI (all genuine SP having already been replaced by %20) and
then fold, and it especially recommends delimiting within <...> at the
same time. So if that is the way the URI people are thinking, then that is
the way we should be going.
One wishes, however, that the provision had been made in the body of their
draft, rather than as an afterthought in an Appendix.
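
On the generating side, the approach described in this message (delimit
with <...>, break the URI with whitespace, strip the whitespace again on
receipt) might look roughly like this. The 40-character chunk width and
both function names are arbitrary choices for the example.

```python
def fold_uri(uri, width=40):
    """Wrap a URI in <...> and fold it into chunks of at most `width` chars."""
    chunks = [uri[i:i + width] for i in range(0, len(uri), width)]
    # CRLF followed by a space is ordinary header folding.
    return "<" + "\r\n ".join(chunks) + ">"


def unfold_uri(text):
    """Undo fold_uri: drop the delimiters and all embedded whitespace."""
    inner = text.strip()
    if inner.startswith("<") and inner.endswith(">"):
        inner = inner[1:-1]
    return "".join(inner.split())


long_uri = "http://example.org/" + "a/" * 50
folded = fold_uri(long_uri)
restored = unfold_uri(folded)
```

This only round-trips because genuine spaces in the URI are assumed to be
written as %20 already, as the message notes.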
>So I'm currently thinking that Archived-At needs to supply more
>information: namely the content types that are available. An
>alternative might be to use a HEAD request to find out what
>content-types are available.
I think the problem you are raising is a generic URI problem, and not just
a problem with this particular header.
AIUI, a URI gives you access to a "resource"; i.e. it tells you where to
find it. It tells you nothing about what to do with it when you have got
it, except what is implicit in the "scheme".
So if the scheme is 'ftp', then all you know is that you are going to get
a file of some sort.
If the scheme is 'imap', then you know that the object to be retrieved
will be something like an email message, and the rules for imap URIs will
doubtless tell you much more.
Now one of the possible benefits of the 'http' scheme is that the object
that comes back should have a Content-Type header. And what we want is for
this Content-Type to be message/rfc822.
So what Martin's draft needs to say is something like "If the scheme of the
URI is 'http', then the entity that is returned SHOULD have the
Content-Type message/rfc822". Then whatever system asked for the
Archived-At object to be retrieved has a decent chance of being able to
display and process it like an email.
>Either way, this document probably needs to say some things about
>- use of different representations of email messages, with native
> (message/rfc822) format being strongly encouraged as the primary
> format and HTML as a useful alternative.
Yes, see above.
>- use of HTTP vs. IMAP (maybe also mentioning POP or even NNTP)
> as an access protocol
Yes.
>- how a mail reader (or for that matter a web browser) should decide
> whether to handle the message itself or hand it off to a different
> tool.
You can't really give detailed instructions in a standard about how
individual readers or browsers are supposed to behave. The general
intention is that all readers/browsers that recognize Content-Type headers
are supposed to use the best available tool for handling that Content-Type
(possibly even using a special plugin). Naturally, some current systems
make a better job of this than others :-( .
yes it's a generic problem with URIs (or to look at it a different way,
a design choice made for URIs)
> AIUI, a URI gives you access to a "resource"; i.e. it tells you where to
> find it. It tells you nothing about what to do with it when you have got
> it, except what is implicit in the "scheme".
some would say that making assumptions about the nature of the resource
based on the "scheme" is a bad idea.
> So what Martin's draft needs to say is something like "If the scheme of the
> URI is 'http', then the entity that is returned SHOULD have the
> Content-Type message/rfc822".
Maybe. I'd like to encourage more archives to support native format
access. I do think it's reasonable for a standard for accessing mail
archives to behave predictably (as in, provide some minimum format for
the sake of interoperability), and that it's reasonable to give mail
readers the message in a format that they can use. I'm concerned that
simply saying "SHOULD use 822" will alienate those that don't. "SHOULD
provide 822; MAY provide HTML or other formats if the access protocol
supports content-negotiation" seems about right to me.
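
One way a mail reader could ask for the native format first when the
access protocol is HTTP is an Accept header that prefers message/rfc822
and falls back to HTML. This is a sketch only: the q-value of 0.5 is an
arbitrary choice, the URI is invented, and the request is built but not
sent.

```python
import urllib.request

# Prefer the original message; accept HTML as a lower-priority fallback.
ACCEPT = "message/rfc822, text/html;q=0.5"


def build_archive_request(uri):
    """Build (but do not send) a GET request that prefers message/rfc822."""
    return urllib.request.Request(uri, headers={"Accept": ACCEPT})


request = build_archive_request("http://lists.example.org/archive/msg1")
```

Whether the archive honours the preference is up to the server; the
Content-Type of the response still has to be checked.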
>> - how a mail reader (or for that matter a web browser) should decide
>> whether to handle the message itself or hand it off to a different
>> tool.
>
> You can't really give detailed instructions in a standard about how
> individual readers or browsers are supposed to behave.
no, but I don't think a paragraph or two is too much.
Keith
What is the usefulness of a header field containing a URI which only
fetches another copy of the same message?
--
Bill McQuillan <McQu...@pobox.com>
> What is the usefulness of a header field containing a URI which only
> fetches another copy of the same message?
Even if that is all it did, it would be useful in this scenario: I have
a copy of a message in my personal mailbox, and I'm creating a web page
in which I want to refer to that message. Rather than copy the message
into my own web page, I can copy the URI from the Archived-At: field
of the message into a href attribute in the web page, and let readers
fetch the message from the public archive. A similar argument applies
when I am sending a message to a large number of recipients, a few of
whom might be interested in following my reference to another message.
Rather than including a copy of the other message (causing it to be
pushed out to everyone), I can copy the URI and let the readers fetch it
if they care to.
There is also the possibility that the proposed header field will allow
more than just fetching the message; in particular, it might allow an
easy way to access other messages in the same thread.
AMC
In this case shouldn't the referred to item be a web page rather than a
message/rfc822?
> A similar argument applies
>when I am sending a message to a large number of recipients, a few of
>whom might be interested in following my reference to another message.
>Rather than including a copy of the other message (causing it to be
>pushed out to everyone), I can copy the URI and let the readers fetch it
>if they care to.
This is one legitimate use I can see, but it seems to be of low utility to me.
>There is also the possibility that the proposed header field will allow
>more than just fetching the message; in particular, it might allow an
>easy way to access other messages in the same thread.
This was actually my point! Some sort of access other than an imap: or ftp:
access to the raw message/rfc822 is what most of us would like to get.
--
Bill McQuillan <McQu...@pobox.com>
a) so you can copy the URI to anywhere else you would like to put a
reference to that message.
b) so you can delete the message but keep the URI.
c) so you can use that URI as a starting point to get to other messages
that appeared in the same context. the trick is how to identify what
that context is (is it the mailing list in which the message appeared?
which mailing list?) and how to navigate around that context.
Well, if the IMAP archive were organized so that all of the messages
sent to one list were in the same folder, the URI would actually serve
as a pointer to not only the message, but the location of the message
within the list archive. And from there (especially with a threads
extension) the MUA could presumably look for other messages in the same
thread.
I think our debate has pretty much run its course. I happen to think
HTTP-based archives have greater utility than you appear to do, but that's
fine.
(Answering another question on this list recalled to me that one of the
benefits of this proposal used in conjunction with HTTP-based archives is
that it makes it easier to construct threads of discussion and reasoning
that may be completely separate from specific message/response sequences
seen in any single email thread.)
While I don't regard it as essential, I've no objection that the spec might
say some things about the topics you suggest, assuming that it doesn't
become overbearing. And, of course, someone has to provide some suitable text.
#g
--
At 10:39 24/02/04 -0500, Keith Moore wrote:
>So I'm currently thinking that Archived-At needs to supply more
>information: namely the content types that are available. An
>alternative might be to use a HEAD request to find out what
>content-types are available.
>
>Either way, this document probably needs to say some things about
>
>- use of different representations of email messages, with native
> (message/rfc822) format being strongly encouraged as the primary
> format and HTML as a useful alternative.
>
>- use of HTTP vs. IMAP (maybe also mentioning POP or even NNTP)
> as an access protocol
>
>- how a mail reader (or for that matter a web browser) should decide
> whether to handle the message itself or hand it off to a different
> tool.
>
>- (maybe) organization of mail archives.
------------
It is extraordinarily useful in group discussions by email, in which a
participant creates messages with direct, easy-to-retrieve references to
previous messages. While looking at one message, one can cut-and-paste the
URI into another message under construction. It is especially useful when
reading messages without a live Internet connection (*).
The alternative is to spend time online rummaging around in the mail
archives to find the same message in archived form, which can be a pain,
especially when the archives are not well organized.
Partly because of this feature, but also because of the way the archives
are presented, I have found the W3C mailing lists service to be the most
usable of any that I have experienced.
#g
--
(*) I sometimes think that there are protocol designers who take an
always-on Internet connection for granted. One doesn't have to live in
some desert region of the world to be reliant on dial-up for Internet access.
Ridiculous!
United States Patent 6,704,772
Ahmed, et al. March 9, 2004
Thread based email
Abstract
Systems and methods for providing electronic messaging services to multiple
users by storing a single copy of an electronic message at a central
location and notifying recipients of the stored single copy. An electronic
message includes a distribution list and a message content. A distribution
list identifying multiple recipients causes prior art systems to duplicate
the entire message for each recipient, placing potentially large demands on
both processing power and storage space. In contrast, the systems and
methods disclosed herein store a single copy or a limited number of copies
of an electronic message addressed to multiple recipients and provide each
recipient with a relatively small notification. In addition to providing
information regarding content and origin, the notification also provides
access to the stored message. Furthermore, the methods and systems also aid
in organizing replies to electronic messages. Replies are associated with an
initial message through a message identifier. The association helps to
organize electronic messages by subject and provides context without
requiring an author to duplicate the content of the initial message with the
reply.
Inventors: Ahmed; Muhammad A. (Seattle, WA); Alam; Mohammad Shabbir
(Bellevue, WA)
Assignee: Microsoft Corporation (Redmond, WA)
Appl. No.: 399417
Filed: September 20, 1999
PS: I don't know about others but this was already prior art on our product
where we used email distribution into a single message folder for multiple
user participation. This is used every day for our technical support staff
as one example. Nothing elegant about this. An EMAIL <---> NEWS interface!
--
Hector Santos, Santronics Software, Inc.
http://www.santronics.com
----- Original Message -----
From: "Martin Duerst" <due...@w3.org>
To: <ietf...@imc.org>
Sent: Friday, February 20, 2004 4:13 PM
Subject: New Internet Draft: draft-duerst-archived-at-00.txt
>
> Dear email specialists,
>
> I have recently submitted a proposal for a new email header,
> the Archived-At header. You can find it at
> http://www.ietf.org/internet-drafts/draft-duerst-archived-at-00.txt
>
> It defines a new email header, Archived-At:, to provide a direct
> link to the archived form of an individual mail message. We use
> this extremely successfully (currently in the form X-Archived-At)
> at W3C.
>
> I would highly appreciate any feedback and comments on this proposal.
> Please make sure you copy me, as I'm not subscribed to this list.
>
>
> Regards, Martin.
>
>
Best Regards,
-- Tim
>My memory is a bit fuzzy on this, but I seem to remember at least one,
>likely more systems in the early 90's that did this. Lotus cc:Mail, Notes,
>and possibly a system by Oracle of the time. Unfortunately it has been too
>long to remember details, but perhaps someone who was internal to these
>organizations may be able to shed a bit more light on the subject.
>
>
All-In-1 (from Digital Equipment Corporation) is another example that
has been around since the late 80's which did this. IMHO M$ can't claim
it has invented this, but maybe they are the first to patent it?
/rolf
TK> My memory is a bit fuzzy on this, but I seem to remember at least one,
TK> likely more systems in the early 90's that did this. Lotus cc:Mail, Notes,
Most or all of the LAN-based email products, including cc:Mail, were
based on a central, shared message store. This was the latter 1980's.
As noted, other systems like DEC's All-In-1 also saved on storage by
sharing central message copies.
As for Nick's reference to Notes's mid 1980's use of inter-document
dynamic links, SRI's NLS system had them operational no later than 1972,
but probably more like 1966.
d/
--
Dave Crocker <dcrocker-at-brandenburg-dot-com>
Brandenburg InternetWorking <www.brandenburg.com>
Sunnyvale, CA USA <tel:+1.408.246.8253>