beware of "Courier-IMAP"

Mark Crispin

Mar 10, 2000, 3:00:00 AM
to IMAP Interest List
The Courier-IMAP server is non-compliant with the IMAP specification, and
its author states that he has no intention to make Courier-IMAP compliant:

> It's completely absurd. The parser in Courier-IMAP is a straightforward
> parser, so I treat [ and ] as distinct lexical units, so they are rejected
> when sent as part of an unquoted string. I'm not going to insert a bunch
> of spaghetti code, and break something, just to comply with completely
> nonsensical portions of IMAP4rev1.

-- Mark --

* RCW 19.190 notice: This email address is located in Washington State. *
* Unsolicited commercial email may be billed $500 per message. *
Science does not emerge from voting, party politics, or public debate.


Nicholas Lee

Mar 10, 2000, 3:00:00 AM

"Mark Crispin" <m...@CAC.Washington.EDU> wrote in message
news:Pine.NXT.4.30.000310...@Tomobiki-Cho.CAC.Washington.EDU...

> The Courier-IMAP server is non-compliant with the IMAP specification, and
> its author states that he has no intention to make Courier-IMAP compliant:
>
> > It's completely absurd. The parser in Courier-IMAP is a straightforward
> > parser, so I treat [ and ] as distinct lexical units, so they are rejected
> > when sent as part of an unquoted string. I'm not going to insert a bunch
> > of spaghetti code, and break something, just to comply with completely
> > nonsensical portions of IMAP4rev1.

I must say that I'm rather displeased that you have posted these comments (to
myself from the author of Courier imap that I was discussing with you
privately) to an open forum.

In fact I'd say it was rather irresponsible and out of context. I might even
go so far as to say you are misrepresenting the discussion and using
bully-boy tactics.

Nicholas


Mark Crispin

Mar 10, 2000, 3:00:00 AM
to Nicholas Lee
There is no misrepresentation. The facts are clear:

1) Courier-IMAP rejected an atom with a [ character.

2) The vendor of Courier-IMAP claims that it is a client bug (in my code,
no less) to send an atom with a [ character.

3) The vendor of Courier-IMAP acknowledges that the IMAP specification
permits this, but states "I'm not going to insert a bunch of spaghetti
code, and break something, just to comply with completely nonsensical
portions of IMAP4rev1."
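
For illustration, the atom rule in dispute can be sketched roughly as follows. This is a minimal Python sketch, assuming the atom_specials set from the RFC 2060 formal syntax ("(", ")", "{", SPACE, CTL, the list wildcards "%" and "*", and the quoted specials); the function names are hypothetical and are not taken from Courier-IMAP, Pine, or any other implementation.

```python
# Assumption: atom_specials as listed in the RFC 2060 formal syntax.
# "[" and "]" are deliberately absent from this set.
ATOM_SPECIALS = set('(){ %*"\\')


def is_atom_char(ch: str) -> bool:
    """True if ch is a 7-bit, non-control character that is not an atom special."""
    code = ord(ch)
    if code < 0x20 or code >= 0x7F:  # CTL (including DEL) and 8-bit characters
        return False
    return ch not in ATOM_SPECIALS


def is_atom(token: str) -> bool:
    """True if token is one or more atom characters."""
    return len(token) > 0 and all(is_atom_char(c) for c in token)
```

Under this reading, a token such as foo[bar is accepted as an atom, while foo bar and foo(bar are not.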

I have an obligation to report non-compliant servers and defiant vendors
who refuse to implement the specification. It is unfair to the dozens of
other vendors -- all of whom implement IMAP according to specification --
to be burdened by bug reports caused by a vendor who openly defies the
specification and claims that everybody else is wrong.

It has also come to my attention that he posts a so-called "client bugs"
list, which misrepresents problems in his server (or simply his failure to
understand IMAP) as being bugs in various clients.

On Fri, 10 Mar 2000, Nicholas Lee wrote:
> I must say that I'm rather displeased that you have posted these comments (to
> myself from the author of Courier imap that I was discussing with you
> privately) to an open forum.
>
> In fact I'd say it was rather irresponsible and out of context. I might even
> go so far as to say you are misrepresenting the discussion and using
> bully-boy tactics.

-- Mark --

Nicholas Lee

Mar 10, 2000, 3:00:00 AM

"Mark Crispin" <m...@CAC.Washington.EDU> wrote in message
news:Pine.NXT.4.30.000310...@Tomobiki-Cho.CAC.Washington.EDU...

> There is no misrepresentation. The facts are clear:

I think you've completely missed the point. My email to you was private. I
feel somewhat offended that you took certain comments from a third party
that I directed to your attention and placed those in a public forum.

Ignoring time for delivery, you seem to have judged and replied to my email within
an hour. Then, without waiting for a further response (I was asleep), 11 minutes
later you posted a rather negative message regarding that third party's product
to this forum.

I state again: whatever your motives might be, taking private communications
out of context and using them in a public forum is somewhat irresponsible.
I'm somewhat disheartened by your actions in this matter, particularly for a
member of your standing in the community.

The fact of the matter was that I was presenting an analysis for improvement
of the IMAP spec. This was given to me by the courier imap author in
response to difficulties I had discovered while installing his product and
attempting to use it with both pine (4.21) and outlook express.


Nicholas

Sam

Mar 11, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <Pine.NXT.4.30.000310...@tomobiki-cho.cac.washington.edu>,
Mark Crispin <m...@CAC.Washington.EDU> writes:

> I have an obligation to report non-compliant servers and defiant vendors
> who refuse to implement the specification. It is unfair to the dozens of
> other vendors -- all of whom implement IMAP according to specification --
> to be burdened by bug reports caused by a vendor who openly defies the
> specification and claims that everybody else is wrong.

Oh, you're full of it, Mark. There are numerous interoperability problems
between many different clients and servers, and you know it. When it comes
down to it, I think what's really eating you is that you're finally
accepting the fact that all the interoperability problems that have
surfaced over the years are simply due to RFC 2060 being a very poorly
written spec, both from a readability and a technical standpoint. Nobody
can read that and implement anything right off the bat. You do not see the
same level of interoperability problems with any other wire protocol, be it
ESMTP, POP3, or anything else for that matter.

And when someone gets caught in the middle of those interoperability
problems, and ends up agreeing with my analysis that RFC 2060 is poorly
designed, you go off the deep end. Well, Mark, that's just too bad, and I
guess you'll just have to learn how to deal with some constructive
criticism, without getting personal and going bonkers, like that.

I can't help but mention another incident several weeks ago where a similar
issue cropped up with another IMAP client -- Mulberry's Mac client. But,
unlike yourself, that fella was very polite and courteous, and, after
hashing it over in E-mail, a couple of times, he made a few tweaks to his
code, and so did I, and everyone lived happily ever after.

But, when someone on a huge ego trip decides to act like a total jackass in
public, I don't think that that kind of behavior is going to encourage
much cooperation.

It seems that what's really getting your goat, Mark, is your decade-old feud
with Dan Bernstein, of which I really couldn't care less. For years you've
satisfied your enormous ego by refusing numerous requests to support
maildirs, with some flimsy excuse. That's how you got your kicks. And now
that's no longer necessary -- people now have a reliable alternative to
UW-IMAP that is not the bloated monster that it is, and that just bugs the
hell out of you.

> It has also come to my attention that he posts a so-called "client bugs"
> list, which misrepresent problems in his server (or simply his failure to
> understand IMAP) as being bugs in various clients.

Grow up, Mark. Stop acting like a baby.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4ym8O+3BFaxHnGY0RAhykAJ9EixentV/kLOMEZ72waHWX+yHBkwCgnFYJ
uDwbAf2Dlk+3zlY4KaIvLh4=
=cXU4
-----END PGP SIGNATURE-----


Lawrence Greenfield

Mar 11, 2000, 3:00:00 AM
Since I don't know the details of the e-mail exchange that is being
debated here, I'm not prepared to defend or attack Mark. However, the
technical problems he raises are real problems. The current IMAP we
have may not be the perfect IMAP in all people's views, but it's the
specification we have after long amounts of debate and give and take,
and interoperability is important and is achieved by complying with
the spec.

Building an interoperable IMAP server _is_ hard---that's why we have
documents like RFC 2683, an excellent document on real world
implementation recommendations by Barry Leiba.

Personally, I'm very pleased to see another free IMAP server---I think
that's definitely for the best. I'd like to see it interoperate with
as many clients as possible.

I pointed out several interoperability gotchas with Courier IMAP
around three weeks ago in personal e-mail. Some of the issues
raised in the Courier IMAP BUGS file are legitimate client bugs. Some
are not---and most issues are clearly dealt with in the specification.

Here are some of the clearer issues (the quoted text is from the
imap/BUGS file distributed with courier-imap-0.27):

> 1) Pine chokes on whitespace between BODY and [

msg-att-static = "ENVELOPE" SP envelope / "INTERNALDATE" SP date-time /
                 "RFC822" [".HEADER" / ".TEXT"] SP nstring /
                 "RFC822.SIZE" SP number / "BODY" ["STRUCTURE"] SP body /
                 "BODY" section ["<" number ">"] SP nstring /
                 "UID" SP uniqueid

The only way of generating a "BODY" followed by a [ is the
"BODY" section [ "<" number ">"] SP nstring
rule.

section = "[" [section-spec] "]"

Since "section" MUST begin with a [, there can be NO whitespace
between "BODY" and "[".
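
As a rough illustration of that grammar argument, the check below accepts a FETCH data item only when the "[" of the section follows BODY immediately. It is a minimal Python sketch under the assumption that the section text contains no nested brackets; the pattern and the function name are hypothetical, not taken from any client or server.

```python
import re

# "BODY" section ["<" number ">"] SP ... where section = "[" [section-spec] "]".
# Assumption: the section text itself contains no "]" characters.
BODY_SECTION_ITEM = re.compile(r"BODY\[[^\]]*\](?:<\d+>)? ", re.IGNORECASE)


def starts_body_section(item: str) -> bool:
    """True if item begins with BODY immediately followed by a bracketed section."""
    return BODY_SECTION_ITEM.match(item) is not None
```

With this check, a line beginning BODY[HEADER] is accepted, while one beginning BODY [HEADER] (with a space before the bracket) is rejected, which is the behaviour the grammar above implies.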

> 3) Occasionally Pine sends a FETCH request with an invalid UID.
> This usually happens after you resume a postponed message, and
> send it. It looks like other IMAP servers simply ignore this
> error condition, however Courier-IMAP will return an error
> message, which Pine shows briefly on the status line. This is
> similar to the Netscape Communicator bug (see below), but not as
> bad.

Section 6.4.8
A non-existent unique identifier is ignored without any error
message generated. Thus it is possible for a UID FETCH command to
return OK without any data or a UID COPY or UID STORE to return OK
without performing any operations.
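
A server following that paragraph might handle the request along these lines. This is a minimal Python sketch, assuming a toy mailbox modelled as a dict from UID to message sequence number; the function name, the data model, and the response wording are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def uid_fetch(mailbox: Dict[int, int], uids: Iterable[int]) -> Tuple[List[str], str]:
    """Return (untagged FETCH lines, status text) for the requested UIDs.

    UIDs that are not in the mailbox are skipped silently, so the status
    is OK even when no data is returned.
    """
    data = [
        "* {} FETCH (UID {})".format(mailbox[uid], uid)
        for uid in uids
        if uid in mailbox
    ]
    return data, "OK UID FETCH completed"
```

Requesting only a UID that no longer exists yields an empty data list together with an OK status, rather than an error.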

> 1) Netscape Communicator insists that the response in HEADER.FIELDS is
> terminated by a blank line, supposedly the end of message headers.

Section 6.4.5
The HEADER, HEADER.FIELDS, and HEADER.FIELDS.NOT part
specifiers refer to the [RFC-822] header of the message or of
an encapsulated [MIME-IMT] MESSAGE/RFC822 message.
HEADER.FIELDS and HEADER.FIELDS.NOT are followed by a list of
field-name (as defined in [RFC-822]) names, and return a subset
of the header. The subset returned by HEADER.FIELDS contains
only those header fields with a field-name that matches one of
the names in the list; similarly, the subset returned by
HEADER.FIELDS.NOT contains only the header fields with a
non-matching field-name. The field-matching is
case-insensitive but otherwise exact. In all cases, the
[RFC-822] delimiting blank line between the header and the body
is always included.
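
The last sentence of that section is the relevant one: even a subset of the header ends with the blank line. A minimal Python sketch of such an extraction follows, assuming CRLF line endings and treating lines that start with a space or tab as continuations of the previous field; the function name is hypothetical.

```python
from typing import Iterable


def header_fields(raw_message: str, names: Iterable[str]) -> str:
    """Return the named header fields followed by the delimiting blank line."""
    wanted = {name.lower() for name in names}
    kept = []
    keep = False
    for line in raw_message.split("\r\n"):
        if line == "":
            break  # blank line: end of the header
        if line[0] in " \t":
            # Folded continuation line: it belongs to the previous field.
            if keep:
                kept.append(line)
            continue
        field_name = line.split(":", 1)[0].strip().lower()
        keep = field_name in wanted  # case-insensitive but otherwise exact
        if keep:
            kept.append(line)
    # The delimiting blank line is always included, even if nothing matched.
    return "".join(line + "\r\n" for line in kept) + "\r\n"
```

Note that when no field matches, the result is still a single CRLF rather than an empty string.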

Larry


Sam

Mar 11, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> The Courier-IMAP server is non-compliant with the IMAP specification, and
> its author states that he has no intention to make Courier-IMAP compliant:
>
>> It's completely absurd. The parser in Courier-IMAP is a straightforward
>> parser, so I treat [ and ] as distinct lexical units, so they are rejected
>> when sent as part of an unquoted string. I'm not going to insert a bunch
>> of spaghetti code, and break something, just to comply with completely
>> nonsensical portions of IMAP4rev1.

Now, now, Mark, what exactly are you trying to accomplish, here? If I was
really interested in prolonging this pissing match, I would probably go
ahead and publish the entire exchange that took place, not just a single
isolated paragraph out of context, so that everyone could see for
themselves what the fuss is all about.

But, I'm not.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4ydL9+3BFaxHnGY0RAhLuAKCJGdBmoN6ibxViXnlzbaSwnGQIWwCgmjme
Ys5yIhGm/tOon2J4ZzGT6h8=
=mrU8
-----END PGP SIGNATURE-----


ra...@adsl-151-203-22-73.bellatlantic.net

Mar 12, 2000, 3:00:00 AM
On 11 Mar 2000 16:07:05 GMT, Sam <s...@email-scan.webcircle.com> wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>In article <Pine.NXT.4.30.000310...@tomobiki-cho.cac.washington.edu>,
> Mark Crispin <m...@CAC.Washington.EDU> writes:
>
>> I have an obligation to report non-compliant servers and defiant vendors
>> who refuse to implement the specification. It is unfair to the dozens of
>> other vendors -- all of whom implement IMAP according to specification --
>> to be burdened by bug reports caused by a vendor who openly defies the
>> specification and claims that everybody else is wrong.
>
>Oh, you're full of it, Mark. There are numerous interoperability problems
>between many different clients and servers, and you know it. When it comes

Mark and I have disagreed about technical details in the past, especially
incompatibilities.

Publishing a known-non-compliant product, refusing to fix it, and getting
pissy when the refusal gets published is a problem. While Mark may have been
rude to publish notes from a private email, it's certainly legal and may have
even been appropriate.

>down to it, I think what's really eating you is that you're finally
>accepting the fact that all the interoperability problems that have
>surfaced over the years are simply due to RFC 2060 being a very poorly
>written spec, both from a readability and a technical standpoint. Nobody
>can read that and implement anything right off the bat. You do not see the
>same level of interoperability problems with any other wire protocol, be it
>ESMTP, POP3, or anything else for that matter.

Oh, my great aunt Petunia, are you a newbie.... "Easy to read", "easy to
implement" does not mean reliable, workable, complete, or even consistent.

>But, when someone on a huge ego trip decides to act like a total jackass in
>public, I don't think that that kind of behavior is going to encourage
>much cooperation.

I'm sure you've noticed this yourself....

--

Nico Kadel-Garcia
nka...@bellatlantic.net

Sam

Mar 12, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <slrn8cm6f2...@adsl-151-203-22-73.bellatlantic.net>,
ra...@adsl-151-203-22-73.bellatlantic.net () writes:

> Mark and I have disagreed about technical details in the past, especially
> incompatibilities.
>
> Publishing a known-non-compliant product, refusing to fix it, and getting
> pissy when the refusal gets published is a problem.

I agree. The user who reported this problem concluded that it was a Pine
bug, and asked to have it fixed.

> While Mark may have been
> rude to publish notes from a private email, it's certainly legal and may have
> even been appropriate.

Mark Crispin published one paragraph out of a rather drawn out E-mail
exchange; this was totally misleading and didn't really have much to do
with anything. I don't really have a problem with hashing this out
publicly, but certainly not in this manner -- with intentional
misrepresentation, inflammatory rhetoric, and complete disregard for the
issues at hand, which quickly degenerates into a pissing match. Leave me
out of it, please.

As I wrote, it looks to me that Mark Crispin is simply looking to pick up
the age-old feud he -- and others -- have been having with someone else. I'm
not interested.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4yyuQ+3BFaxHnGY0RAiRIAKClf9MUpuEuXYnBthobr5pSK5AIFwCg3Kc8
eO/UyicuvC5OzsNKwxnjvqQ=
=93oV
-----END PGP SIGNATURE-----


Sam

Mar 12, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <2000031203...@smtp3.andrew.cmu.edu>,
Lawrence Greenfield <le...@andrew.cmu.edu> writes:

> Since I don't know the details of the e-mail exchange that is being
> debated here, I'm not prepared to defend or attack Mark. However, the
> technical problems he raises are real problems.

He did not raise any technical problems. He was upset simply because I
dared to question the gospel of RFC 2060; and when I explained what the
problems in that document were to someone else, that someone else agreed
with my conclusions, considered it to be a Pine bug, and asked to have it
fixed, noting that at least two other IMAP clients work just fine.

I was prepared to go over those same issues again, but, I suddenly realized
that this would merely prolong a meaningless pissing match, and I would be
wasting my breath. It doesn't really matter, in the grand scheme of things.
I have no interest in actively participating in a pissing match. I can't
help it if someone else is hell-bent on starting one, the only thing I can
do is avoid wasting my time in it.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4yyip+3BFaxHnGY0RAqLrAJ0dxAAGuy3KcLVWWSKDMfXeNxf6hwCcCfSs
nOUu3iZLI0ggYqCaAyKgVg0=
=HgyZ
-----END PGP SIGNATURE-----


Nicholas Lee

Mar 12, 2000, 3:00:00 AM

<ra...@adsl-151-203-22-73.bellatlantic.net> wrote in message
news:slrn8cm6f2...@adsl-151-203-22-73.bellatlantic.net...

> Publishing a known-non-compliant product, refusing to fix it, and getting
> pissy when the refusal gets published is a problem. While Mark may have been
> rude to publish notes from a private email, it's certainly legal and may have
> even been appropriate.

As much as I hate to post another OT message, I feel I have to disagree
here. It's not a question of whether posting private email conversations is
legal or not. It's just not good practice or netiquette. He took something
from private conversation out of context and used that in a public forum to
discredit someone. For a member of the community such as himself this is
just not good behaviour.

Nicholas

Yiorgos Adamopoulos

Mar 13, 2000, 3:00:00 AM
In article <95289681...@shelley.paradise.net.nz>, Nicholas Lee wrote:
>discredit someone. For a member of the community such as himself this is
>just not good behaviour.

I see two problems here:

- Problem #1 is Mark jumping the gun whenever someone posts something
"against" what he has documented as being IMAP, or NFS with the UW
toolkit, or maildir. Well, this is Mark's personality and it cannot be
changed, the same way as my or any other's personality cannot be changed.
Yes he is a flaming gun, but also whenever he flames, I see technical
arguments from his side (not to add that I collect his flames - I sort of
have the same behavior in our local ntua.* newsgroups for different
reasons).

- Problem #2 is the IMAP spec and how some choose to implement it. Well,
here there are two choices. Choice #1 says, you follow the spec no
matter how you disagree with it, and certainly you do not break it.
Choice #2 says *write and implement your own spec*. Just as Bernstein
did (for example) with QMTP and Maildir. If you do not like what is
already there (and cannot convince the inventor to do otherwise) present
your alternative and let the community decide what to use. But if you
set out to implement the standard, it is inexcusable if you don't
(because you *claim* to implement it). Simply stating that the IMAP RFC
has a nonsensical requirement does not prove anything. The RFC has the
requirement, so you are required to follow it no matter what.
Otherwise write (and implement) your own RFC. Nobody will stop you on
that.

I too am working on similar things and try to put my thoughts into code. I
do not think that I will ever state X is done the "wrong" way in protocol
Y. The standard is there and I either follow it, extend it or write my
own. But I do not break the existing, *especially* just because I do not
like the personality of the inventor.

--
${talks}

Sam

Mar 13, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <slrn8coi74...@ithaca.dbnet.ece.ntua.gr>,
ad...@dblab.ece.ntua.gr (Yiorgos Adamopoulos) writes:

> In article <95289681...@shelley.paradise.net.nz>, Nicholas Lee wrote:
>>discredit someone. For a member of the community such as himself this is
>>just not good behaviour.
>
> I see two problems here:
>
> - Problem #1 is Mark jumping the gun whenever someone posts something
> "against" what he has documented as being IMAP, or NFS with the UW
> toolkit, or maildir. Well, this is Mark's personality and it cannot be
> changed, the same way as my or any other's personality cannot be changed.
> Yes he is a flaming gun, but also whenever he flames, I see technical

Well, that's fine, but as long as you do not cross the line that he has
crossed. It's one thing to be a flamehead -- which is a fine Usenet
tradition, after all, that is beyond all reproach -- but it's a completely
different situation when you take things out of context, and completely
misrepresent someone else in a totally underhanded and cowardly manner.
This is not flaming, this is something quite different.

> - Problem #2 is the IMAP spec and how some choose to implement it. Well,
> here there are two choices. Choice #1 says, you follow the spec no
> matter how you disagree with it, and certainly you do not break it.
> Choice #2 says *write and implement your own spec*. Just as Bernstein
> did (for example) with QMTP and Maildir.

You are not entirely 100% correct. DJB did break the letter of ESMTP.
Qmail advertises 8BITMIME, but doesn't do anything about it -- it will send
8-bit mail to non-8bit mailers without downshifting it to quoted-printable.
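
To make the term concrete, the downshifting described here could look roughly like the following. This is a minimal Python sketch using the standard quopri module; the function name, the peer_extensions parameter, and the returned encoding labels are hypothetical, and it does not describe how Qmail or any other mailer is actually written.

```python
import quopri
from typing import Set, Tuple


def prepare_body(body: bytes, peer_extensions: Set[str]) -> Tuple[bytes, str]:
    """Return (body to send, transfer encoding label).

    If the body contains 8-bit data and the receiving server did not
    advertise 8BITMIME, convert it to quoted-printable first.
    """
    has_8bit = any(byte > 0x7F for byte in body)
    if has_8bit and "8BITMIME" not in peer_extensions:
        return quopri.encodestring(body), "quoted-printable"
    return body, ("8bit" if has_8bit else "7bit")
```

The behaviour criticised above corresponds to skipping the first branch and sending the 8-bit body unchanged regardless of what the peer advertised.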

Now, I do have my own problems with Qmail, and DJB's reasons for doing that
happen to be 100% analogous -- he's stated that the whole 8BITMIME business
is just plain dumb -- yet you do not see me making a spectacle out of it on
Usenet and on private mailing lists. I have flamed DJB in the past over
this, but I draw the line at twisting someone else's words in order to
further my own personal agenda.

> If you do not like what is
> already there (and cannot convince the inventor to do otherwise) present
> your alternative and let the community decide what to use. But if you
> set out to implement the standard, it is inexcusable if you don't
> (because you *claim* to implement it). Simply stating that the IMAP RFC
> has a nonsensical requirement does not prove anything. The RFC has the
> requirement, so you are required to follow it no matter what.
> Otherwise write (and implement) your own RFC. Nobody will stop you on
> that.

Well, this is what that crowd would _like_ for you to believe the issue is,
but it's not. It's a red herring. What's really happening is that people
are simply having a major cow because someone dared to diss IMAP4rev1,
that's all there is to it. Every time I run into something dumb in RFC
2060, and have to put in yet another workaround due to its weirdness, I
document it, and it's now grown to be quite a collection of bloopers.
Apparently, some egos got slightly bruised because of nothing more than
just a silly web page. And this latest issue will be just another footnote
in the next revision.

> I too am working on similar things and try to put my thoughts into code. I
> do not think that I will ever state X is done the "wrong" way in protocol
> Y. The standard is there and I either follow it, extend it or write my

The fact that something is a "standard" does not mean that everyone must
agree that it makes sense, and does not exempt the "standard" from being
subject to criticism. That may be what SOME people would like you to
believe, but I refuse to accept that line of thinking.

> own. But I do not break the existing, *especially* just because I do not
> like the personality of the inventor.

Congratulations -- you've completely fallen for their trap. If I were to
have actually done what you've been led to believe I've done, neither Pine,
nor Outlook Express would work at all, with Courier-IMAP. Well, to be
technically correct, Pine would break with the next revision, because I
haven't yet revved since the conflict flared up (and I was never known for
making positive comments vis-a-vis Microsoft). But, it won't. One thing
I've realized is that I don't think that I really want to earn the same
reputation as UW-IMAP. In fact, I wrote Courier-IMAP precisely because of
UW-IMAP's reputation of ignoring repeated requests for compatibility with
software written by someone who's been feuding with UW-IMAP's authors.
And I'm certainly not going to go down the same path myself.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4zIzU+3BFaxHnGY0RApcHAJoD1Q7Wj2S7B4il9u9GitSUo+l0WQCeOiUT
TkRrFm8kdX5A+kMxPYibFi4=
=nqgh
-----END PGP SIGNATURE-----


ra...@adsl-151-203-22-73.bellatlantic.net

Mar 13, 2000, 3:00:00 AM
On 13 Mar 2000 06:38:17 GMT, Sam <s...@email-scan.webcircle.com> wrote:

>Well, this is what that crowd would _like_ for you to believe the issue is,
>but it's not. It's a red herring. What's really happening is that people
>are simply having a major cow because someone dared to diss IMAP4rev1,
>that's all there is to it. Every time I run into something dumb in RFC
>2060, and have to put in yet another workaround due to its weirdness, I
>document it, and it's now grown to be quite a collection of bloopers.

Then stop kvetching and *POST IT*. You're starting to sound like a meower....

--

Nico Kadel-Garcia
nka...@bellatlantic.net

Sam

Mar 13, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <slrn8cpnns...@adsl-151-203-22-73.bellatlantic.net>,
ra...@adsl-151-203-22-73.bellatlantic.net () writes:

> On 13 Mar 2000 06:38:17 GMT, Sam <s...@email-scan.webcircle.com> wrote:
>

>>Well, this is what that crowd would _like_ for you to believe the issue is,
>>but it's not. It's a red herring. What's really happening is that people
>>are simply having a major cow because someone dared to diss IMAP4rev1,
>>that's all there is to it. Every time I run into something dumb in RFC
>>2060, and have to put in yet another workaround due to its weirdness, I
>>document it, and it's now grown to be quite a collection of bloopers.
>

> Then stop kvetching and *POST IT*. You're starting to sound like a meower....

I have posted it on the project's web page, and it's included in the source
tarball. It's been out there for a while.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4zOpy+3BFaxHnGY0RAjU3AJ0eoehJXqo5qkkk1+vvtSMikVPpjgCgvwC0
ueXxTeqfHvpi6h0BJiN6h5U=
=1zAg
-----END PGP SIGNATURE-----


Vladimir A. Butenko

Mar 13, 2000, 3:00:00 AM
In article <courier.38CC...@email-scan.webcircle.com>, Sam
<s...@email-scan.webcircle.com> wrote:

> Well, this is what that crowd would _like_ for you to believe the issue is,
> but it's not. It's a red herring. What's really happening is that people
> are simply having a major cow because someone dared to diss IMAP4rev1,
> that's all there is to it. Every time I run into something dumb in RFC
> 2060, and have to put in yet another workaround due to its weirdness, I
> document it, and it's now grown to be quite a collection of bloopers.
> Apparently, some egos got slightly bruised because of nothing more than
> just a silly web page. And this latest issue will be just another footnote
> in the next revision.

Look. Calm down, please. You are not the only person Mark attacked here
:-). And I'd agree that taking info from a personal E-mail and posting it
on a public forum is a BAD thing. But be professional. Try to separate
personal relations/customs/etc. from the technical side of the things.

a) Yes, IMAP standard is not perfect. But it is a complicated protocol,
and for the level of its complexity - it is written pretty well. I bet you
have never read LDAP RFCs :-). If you want a perfect standard, find
someone who DOES NOT implement his/her own server or client and make that
person re-write the standard. Otherwise some things that are obvious for
the implementor, but completely unknown for others inevitably slip
through. Mark did a very good job on the IMAP standard - again, just read
the LDAP RFCs that do not even care to explain you what the whole thing is
all about.

b) Standards are called standards because they are standards. Yes, both
client and server vendors sometimes do not get what the RFC author meant,
but if they then find that the standard really SAYS what he meant, they
should comply. Otherwise the whole sense of having a standard vanishes.
For example, our implementation of the ACAP server was found to be
incompatible with a client that was designed by one of ACAP spec
co-authors. But we did what the standard SAID, and the author himself
agreed, and had to change his code, or find a workaround, and finally we
started to talk about drafting a new version of the ACAP specs, because
the standard actually said not what the authors meant. That's how the
things work in the world of standards. And this is why all standards start
their lives as drafts so everybody can try and comment and minimize the
risk of misunderstanding in the future.

c) I personally do not like the way the IMAP spec treats spaces. Most of the
client parsers simply skip spaces before any lexeme, so only when we
started to deploy our servers in universities that use pine, the problem
of our server inserting spaces where the IMAP standard does not require
them came up. Yes, someone wrote to Mark, because Netscape, Outlook,
Mulberry had no problem - only pine had. And Mark posted a similar note
here, called "CommuniGate Pro IMAP bug". Whatever our feelings were about
this way of handling things (instead of writing to our tech. team
directly, as all other vendors do) - from the TECHNICAL point of view Mark
was right, and the IMAP standard says nothing about a space in that place
- so we had to fix it immediately and release a new version out of
schedule. Because it was OUR fault, and if we say that we comply with
IMAP4rev1, we have to comply.

d) there are many other issues in IMAP specs that can be discussed. But
they should be discussed in a professional manner, and not by starting
the "pissing match" that you always say you do not want to participate in.
And till those issues are formulated in some IMAP5 or whatever, you either
follow the written IMAP4rev1 standard, or you say that your server does
not support that standard.

> The fact that something is a "standard" does not mean that everyone must
> agree that it makes sense, and does not exempt the "standard" from being
> subject to criticism. That may be what SOME people would like you to
> believe, but I refuse to accept that line of thinking.

Standards are not equal to the law. One may think that he can change the
laws of one southern state by having oral sex in public, and then protesting
from the prison cell, demanding the change of that law. The RFC standards
are neither about the moral standards, nor about political issues, - they
are about interoperability. The "social activism" is not a working method
here. If you want to change the things without waiting for a new standard,
and/or you want to push the development of a new standard - it's doable
and it's simple: if you have a client vendor who also needs a different
protocol, you can:

a) present a new keyword in your IMAP CAPABILITY response:
COURIERMODE, for example.
b) let a client issue a special command, let's say COURIERMODE ON.
c) do whatever you want with the IMAP standard after that - i.e. work in
your own protocol that your server and that client both understand.

But if the client has not issued that command, the server should work
strictly as described in RFC2060, otherwise it won't be an IMAP server.
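
The three steps above can be sketched as a tiny session object. This is a minimal Python sketch only: the COURIERMODE keyword and the COURIERMODE ON command are the example names used above, while the class, its method, and the response wording are hypothetical and do not describe any real server.

```python
class Session:
    """Toy IMAP-like session with an opt-in private mode (hypothetical)."""

    # Step a): advertise the extra keyword next to the standard one.
    CAPABILITIES = ["IMAP4rev1", "COURIERMODE"]

    def __init__(self) -> None:
        # Until the client opts in, the session stays in standard mode.
        self.private_mode = False

    def handle(self, tag: str, command: str) -> list:
        words = command.upper().split()
        if words == ["CAPABILITY"]:
            listing = " ".join(self.CAPABILITIES)
            return ["* CAPABILITY " + listing, tag + " OK CAPABILITY completed"]
        if words == ["COURIERMODE", "ON"]:
            # Step b): the client explicitly asks for the private protocol.
            self.private_mode = True
            return [tag + " OK private mode enabled"]
        # Step c) would branch on self.private_mode; not modelled here.
        return [tag + " BAD command not handled in this sketch"]
```

The point of the design is that private_mode starts out False, so a client that never issues the command only ever sees standard behaviour.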

> Congratulations -- you've completely fallen for their trap. If I were to
> have actually done what you've been led to believe I've done, neither Pine,
> nor Outlook Express would work at all, with Courier-IMAP.

That would be a bad thing. The result would be lower
popularity for your server, compared to a standard-compliant IMAP4rev1
server.

> making positive comments vis-a-vis Microsoft). But, it won't. One thing
> I've realized is that I don't think that I really want to earn the same
> reputation as UW-IMAP. In fact, I wrote Courier-IMAP precisely because of
> UW-IMAP's reputation of ignoring repeated requests for compatibility with
> software written by someone who's been feuding with UW-IMAP's authors.

Could you please list those requests? We sold approx. 5 million seats
last year. Approx. 10% of those are using IMAP. We would hear about any
problem with any IMAP software. And I should tell you - we have no
"improvements" on IMAP4rev1 in our servers. They follow the
standard strictly, and those clients do follow the standard. So I'd be very
interested in learning about the problems the current standard presents to
the current clients.

> And I'm certainly not going to go down the same path myself.

If you want to DEVELOP a BETTER standard - let's discuss it. Right here.
As you can see, there are many IMAP4rev1 extensions that are documented in
RFCs that do not have Mark's name on them. So, I do not understand why you
think that the only way to implement a better standard is to screw up the
existing one: that definitely has nothing to do with Mark's attitude.

Hydrogen fuel is better than gas. So, let's add hydrogen pumps at the gas
stations and encourage car manufacturers to build H-powered cars
(clients). But if you start to put hydrogen instead of gas into all cars
that stop by because your sign reads "gas station" - I seriously
doubt that you will be able to earn a better reputation than UW-IMAP.

--
Vladimir Butenko
Stalker Software, Inc.

Mark Crispin

Mar 13, 2000, 3:00:00 AM
to Vladimir A. Butenko
On Mon, 13 Mar 2000, Vladimir A. Butenko wrote:
> Whatever our feels were about
> this way of handling things (instead of writing to our tech. team
> directly, as all other vendors do) - from the TECHNICAL point of view Mark
> was right, and the IMAP standard says nothing about a space in that place
> - so we had to fix it immediately and release a new version out of
> schedule. Because it was OUR fault, and if we say that we comply with
> IMAP4rev1, we have to comply.

If it makes you feel any better, I've been on the other side, and with
MUCH more embarrassing problems. It's not a pleasant situation.
Unfortunately, as you've discovered, the only way out is to make an
emergency release.

I strongly recommend that you go to the periodic IMC IMAP interoperability
bakeoffs. This is the best way to avoid such problems in the future. I
don't know when the next one will be held, but it should be announced
here. It's always better to get interoperability problems discovered and
resolved in pre-release code!

> Could you please list those requests?

I think that what he is talking about is that I don't want to get into the
business of supporting the "maildir" format. There are at least three
third-party c-client drivers available for maildir. If someone uses
maildir, they can go to one of those third parties for code and support.

I believe that it is infeasible to build maildir support that scales well
(e.g. does not exhibit performance problems with a moderately large
mailbox of 2000 messages) and also does not violate a major rule of either
maildir or IMAP. It's a no-win situation for me; and therefore I choose
to allow the maildir enthusiast community to do their own development,
distribution, and support of maildir IMAP code.

Mark Crispin

Mar 13, 2000, 3:00:00 AM
to
On 13 Mar 2000, Sam wrote:
> > Then stop kvetching and *POST IT*. You're starting to sound like a meower....
> I have posted it on the project's web page, and it's included in the source
> tarball. It's been out there for a while.

That's where my material came from, and your statements are incorrect.

Please implement the specification, not your notion of what the
specification should be.

If you want to see a specification changed to conform to your views (such
as forbidding "[" in atoms), please follow the process for doing so.

Please make sure you have your facts right before you attack other
people's software.

Vladimir A. Butenko

Mar 13, 2000, 3:00:00 AM
to
In article
<Pine.NXT.4.30.000313...@Tomobiki-Cho.CAC.Washington.EDU>,
Mark Crispin <m...@CAC.Washington.EDU> wrote:

> On Mon, 13 Mar 2000, Vladimir A. Butenko wrote:

> > Whatever our feels were about
> > this way of handling things (instead of writing to our tech. team
> > directly, as all other vendors do) - from the TECHNICAL point of view Mark
> > was right, and the IMAP standard says nothing about a space in that place
> > - so we had to fix it immediately and release a new version out of
> > schedule. Because it was OUR fault, and if we say that we comply with
> > IMAP4rev1, we have to comply.
>

> If it makes you feel any better, I've been on the other side, and with
> MUCH more embarrassing problems. It's not a pleasant situation.
> Unfortunately, as you've discovered, the only way out is to make an
> emergency release.

I do not see anything unpleasant in the situation itself: it's our job.
It's virtually impossible for anyone (including the author :-) to
implement EVERYTHING exactly as outlined in the docs. Some things can
be misunderstood, some can be simply overlooked. And when we get a
problem report, we investigate and fix the problem.

I grepped the http://www.stalker.com/CommuniGatePro/History.html file
for "Bug" and "IMAP". There are 16 of them there (over the last 2 years),
though there has been only 1 since that "CGatePro IMAP Bug" report you posted
many months ago, and I doubt that it counts as a bug :-). But who knows -
one can run into some problem in the future, and that's what CGatePro Logs are
for, and that's what we are for: to fix things if something is wrong.
I do not see anything unpleasant in these things: bugs do happen; the
question is how many of them there are, how important they are and how
quickly they are fixed.


> I strongly recommend that you go to the periodic IMC IMAP interoperability
> bakeoffs. This is the best way to avoid such problems in the future. I
> don't know when the next one will be held, but it should be announced
> here. It's always better to get interoperability problems discovered and
> resolved in pre-release code!

While I'd enjoy going to such a forum myself (if time permits :( ), I do
not think that this is the best way to fix things. We work slightly
differently: when a problem is reported, it's investigated immediately, and
the fix appears in the next release - at least, in the next beta release; those
go out every 1-2 weeks. I think that you do the same, and the main purpose
of the forum is to settle ideological misunderstandings, chart a way
for new development, and solve those interoperability problems that go
beyond the written specs. For example, I'd like to see the eyes of those
Microsoft fellows who made their clients open 20 connections for one
session... :-) and :-(


> > Could you please list those requests?
>

> I think that what he is talking about is that I don't want to get into the
> business of supporting the "maildir" format. There are at least three
> third-party c-client drivers available for maildir. If someone uses
> maildir, they can go to one of those third parties for code and support.
>
> I believe that it is infeasible to build maildir support that scales well
> (e.g. does not exhibit performance problems with a moderately large
> mailbox of 2000 messages) and also does not violate a major rule of either
> maildir or IMAP. It's a no-win situation for me; and therefore I choose
> to allow the maildir enthusiast community to do their own development,
> distribution, and support of maildir IMAP code.

I could argue with you that directory-based solutions can be made scalable,
but this is not the point: the internal structure of some IMAP server has
nothing to do with the IMAP protocol specs themselves. That, I'm afraid,
is Sam's problem here: he had some disagreements with you as the designer
of one of the IMAP servers, and that's completely up to him - he can create an
IMAP server that has nothing in common with your server, and if it is a
better server - so be it.

But then he transfers that disagreement with Mark-designer to
Mark-protocol-maintainer, and this is a completely different issue. One can
use a completely different approach and code to develop a server, but as
long as it complies with the IMAP4rev1 specs, it's an IMAP4rev1 server. On the
other hand, if one takes the imapd-uw code and changes just one line of it so
that it does not comply (be that person named Sam or Mark) - that server will
not be an IMAP4rev1 server, and that's it.

What I wanted to see posted is not the problems of implementing this or
that in someone's code, but the problems in the IMAP4rev1 protocol itself,
and the requests for improvements to that protocol. Improvement of a
particular server's code is a completely different issue, and should be
discussed on that server's support mailing list/forum.

> -- Mark --

Lyndon Nerenberg

Mar 13, 2000, 3:00:00 AM
to
>>>>> "Vladimir" == Vladimir A Butenko <but...@stalker.com> writes:

    Vladimir> permits :( ), I do not think that this is the best way
    Vladimir> to fix things. We work slightly differently: when a
    Vladimir> problem is reported, it's investigated immediately, and
    Vladimir> the fix appears in the next release - at least, in the
    Vladimir> next beta release; those go out every 1-2 weeks. I think
    Vladimir> that you do the same, and the main purpose of the forum
    Vladimir> is to settle ideological misunderstandings, chart a way
    Vladimir> for new development, and solve those interoperability
    Vladimir> problems that go beyond the written specs. For example,

Which means you're fixing things in a reactive mode, and without
the hive knowledge present when you get a bunch of experienced
engineers together. Being able to discuss problems in a group,
especially when there is more than one "right answer" to the
problem, is invaluable. You will also find your software taking
a much more severe beating at the interop than it *ever* will in
the field. For our SMS server product, about 80% of the bug fixes
that went in were a direct result of IMC interop testing. A lot
of these were edge cases that you won't likely run into in the
field, but *will* see when you're testing against other people's
alpha software. And then there are those of us mercenaries who
while away the time telnetting to the servers and doing "unexpected"
things. (My t/golf shirt collection has grown considerably as a
result of these activities ;-)

Vladimir> I'd like to see the eyes of those Microsoft fellows who
Vladimir> made their clients open 20 connections for one
Vladimir> session... :-) and :-(

Even more reason to attend.

IMAP is a complex and subtle protocol. Any vendor of IMAP
products wouldn't be taking their job seriously if they
didn't attend the IMC interop events on at least a semi-
regular basis.

--lyndon

Vladimir A. Butenko

Mar 13, 2000, 3:00:00 AM
to
In article <s9u7lf6...@zappa.esys.ca>, Lyndon Nerenberg
<lyn...@MessagingDirect.COM> wrote:

OK, I agree with all that, but I still cannot get the point.

a) there are already several thousand CGatePro servers installed, some of
them at major ISPs, so it's not a problem to test any alpha-mode software
against it.

b) the CGatePro server software is available for free testing for 18
platforms directly from http://www.stalker.com/CommuniGatePro/ and I do
not see the problem in testing against it. Several E-mail client vendors
already use it as a "testbed" not only for their alpha versions, but for
the development process, too.

What difference will it make if, say, I appear at some forum? I'm not
an IMAP server and will break as soon as someone puts a single IMAP
command into any of my "outside world connections". If someone wants to
test something against our servers and for any reason cannot deploy it on
one of their own machines - just call 800-262-4722, ask for Philip, and
ask him to set up a test account on one of our own CGatePro servers - there
are at least 3 available on the "visible" Internet, and we can even create
an account on a Dynamic Cluster, too - but that's unlikely to make any
difference for the tester. Last time I looked, mail.stalker.com had about
half a dozen accounts created for various mail-client developers.

If there is a PROBLEM, and that problem has to be discussed with client
vendors - then, of course, a forum of some kind is a must. But how, for God's
sake, can we find interoperability problems if we just sit together
somewhere and start to chat? I really do not get it, please explain.

> Vladimir> I'd like to see the eyes of those Microsoft fellows who
> Vladimir> made their clients open 20 connections for one
> Vladimir> session... :-) and :-(
>
> Even more reason to attend.

Yes, because there is a KNOWN problem. And it is of some interest to all
server vendors, not only to Stalker.

> IMAP is a complex and subtle protocol. Any vendor of IMAP
> products wouldn't be taking their job seriously if they
> didn't attend the IMC interop events on at least a semi-
> regular basis.

I do understand that it's kinda strange that we do not attend your
meetings, and I would take it as a good pitch for that conference, but
while you may say that we do not take our job seriously, I must say that
IMAP code is.. let me check.. - just 5% of CGatePro code. And it does not
create any problems either for us or for our clients. Meanwhile there are much
more serious issues, and portions of the code that are much more
complicated than IMAP with all its extensions - and those things do
require our full attention - S/MIME incorporation, distributed LDAP,
Calendaring, WML - that's a huge list of things under active development
now.

This is all not to say that I do not see a reason for such meetings, but I
just want to know - what exactly we want to discuss there, what problems
do exist now, and - since we are in E-mail messaging business - why all
this cannot be discussed via E-mail or on Usenet?

Call me old and lazy, but if someone has to cross 11 time zones every 2
months and to participate in several Expos all over the globe, that person
inevitably develops a habit of using E-mail and Usenet and asking
carefully what one more flight is needed for...

If only I could explain this to the cops, too - that I HAVE to drive fast,
because I'm tired of planes.... %-)


> --lyndon

Nicholas Lee

Mar 13, 2000, 3:00:00 AM
to

"Vladimir A. Butenko" <but...@stalker.com> wrote in message
news:butenko-1303...@stalker.gamma.ru...

The issue that has given rise to this thread is essentially to provide
explicit tokenisation for at least the password field, but I might suggest,
given the "Which clients canNOT handle '@' in authentication identifiers"
thread, that tokenising the login id might be worthwhile as well.

I was having a problem because '[]' are used as token delimiters (MIME
stuff) elsewhere in the protocol spec, whereas they were not 'special'
characters in the password auth field. For various reasons Sam at this
stage treats [] as distinct token delimiters everywhere. (Personally I think
his reason, that it reduces code bloat, is extremely valid.)

The fix he gave me for including [] in the password auth field is to
stringify the field, i.e. "foo[]bar". Once this was done I managed to get my
"telnet localhost imap" testing to work; Outlook Express worked regardless,
but pine, using the complete spec, doesn't.

I can only say that, IMO, having a token delimiter in one part of the protocol
stream but not in another part would seem to increase the code bloat
and maintenance requirements. Of course, not being an IMAP server author, I
can't say this for certain. Adding four bytes (tokenising the login id and
password auth fields) certainly seems worthwhile in this case.


Of course, back to the topic at hand: the thing that pissed me off about
Mark's actions is that he took a paragraph out of context and used it to
further his agenda. I don't care if he thinks he's doing it for the public
good or whether other people accept this. It's bad behaviour and I'm going
to tell him off for it. There are more civil ways of doing this without
resorting to being a bully-boy.

Nicholas

Lyndon Nerenberg

Mar 13, 2000, 3:00:00 AM
to
>>>>> "Vladimir" == Vladimir A Butenko <but...@stalker.com> writes:
Vladimir> If there is a PROBLEM, and that problem has to be
Vladimir> discussed with client vendors - then, of course, a forum
    Vladimir> of some kind is a must. But how, for God's sake, can we find
Vladimir> interoperability problems if we just sit together
Vladimir> somewhere and start to chat? I really do not get it,
Vladimir> please explain.

Vladimir, it's an interop event. 80 engineers in a large room with
a large ethernet and a large number of computers running a large
number of client and server implementations, all making sure they
can talk to each other. [Note that sales, marketing, press, and other
non-engineering scum are explicitly forbidden from attending ;-)]

There is very little chatting going on. In fact, if you hear
someone talking it means they found a bug.

Since these are all engineers, you're almost guaranteed that they
have source to their products with them. Thus, things get fixed
in minutes, and the fixes can be tested against everyone else's code
very rapidly.

One example of this involved early deployments of the SASL DIGEST-MD5
mechanism. At the interop last March there were three or four vendors
working on this. We discovered quite quickly that there were some
vast differences in interpretation of some parts of the spec. Having
a group of engineers in the same room we were able to quickly break out
and identify the issues, propose a solution, implement that, and see
if it got us any further along. All inside of an hour. If we had tried
to do that by email it a) probably never would have happened to begin
with and b) would still be a work in progress.

Interop events are invaluable for this sort of thing, and I stand
completely behind my statement about serious vendors attending
them. And that's not a bullet aimed at you. It sounds like you've
not been to one, so it's hard to appreciate just how valuable they
are. (And for those of us who have attended the IMC IMAP events
over the years, we have noticed a *direct* correlation between participation
and product quality. It's amazing to see how the quality of a product
has improved by the second time a vendor shows up at the event. This
is a Good Thing for everyone.)

--lyndon

Nicholas Lee

Mar 13, 2000, 3:00:00 AM
to

"Mark Crispin" <m...@CAC.Washington.EDU> wrote in message
news:Pine.NXT.4.30.000313...@Tomobiki-Cho.CAC.Washington.EDU...


> I believe that it is infeasible to build maildir support that scales well
> (e.g. does not exhibit performance problems with a moderately large
> mailbox of 2000 messages) and also does not violate a major rule of either

As a point of interest, I have several mailboxes at this stage with nearly 1000
messages, many with large attachments. I merged (copied) a few of them into one
mailbox, giving me 2060 messages, about 34 megs worth.

Both Outlook Express and pine (4.21) have no issue with this mailbox at all.
Pine opens the mailbox instantly, having never seen it before, and of course
Outlook Express goes through the process of caching the headers, which takes a
little while.

It might be noted that this is currently a semi-loaded K6-2 400 with
128 megs of RAM, with only my mailbox being open. However, I can't see how
having to parse a 2000-message mbox-format file of 25-34 megs is going to
be faster (and safer) than the file system and maildir format. Worst case, you
use something like ReiserFS to deal with all the small files.

Of course I'm using courier-imap.

Nicholas


Lyndon Nerenberg

Mar 13, 2000, 3:00:00 AM
to
>>>>> "Nicholas" == Nicholas Lee <nj.le...@kiwa.co.nz> writes:

    Nicholas> The issue that has given rise to this thread is
    Nicholas> essentially to provide explicit tokenisation for at
    Nicholas> least the password field, but I might suggest, given the
    Nicholas> "Which clients canNOT handle '@' in authentication
    Nicholas> identifiers" thread, that tokenising the login id might be
    Nicholas> worthwhile as well.

Sorry, I wasn't clear about this. It's not a protocol issue, it's
a UI issue. The UIs of some (very) popular IMAP and POP clients
arbitrarily throw away any authentication string data entered
after an '@' character, along with the '@' character itself.

Thus, (imagine you're looking at a GUI login/password dialog
box) if I enter 'lyn...@messagingdirect.com' in the login
field of the dialog, the client throws out the '@messagingdirect.com'
part and tries to log me in as 'lyndon' (which is not my authentication
id on the server).

--lyndon

Mark Crispin

Mar 13, 2000, 3:00:00 AM
to
On Tue, 14 Mar 2000, Vladimir A. Butenko wrote:
> You forgot to mention one thing - the OS you are using. I guess that you
> are doing this on Linux, and Mark spends most of his time on - let me
> guess - Solaris :-). The speed of directory scan varies a lot on those
> platforms.

Not Solaris, but your point is correct. On some platforms, doing a stat()
on 2000 files in a directory takes 30 seconds, not 3 seconds. Ouch.

It quickly gets worse. If you are lucky, you can get the IMAP
internaldate and rfc822.size from the stat() call. But, that assumes that
the messages are stored in RFC822 format, not UNIX format, meaning CR/LF
newlines instead of LF newlines. If you don't have some sort of index
file (ala Cyrus), you end up having to open and read the file to count the
newlines. A lot of clients do "FETCH 1:* FAST"...oh dear...oh my...
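
To make the newline-counting cost concrete, here is a minimal sketch, assuming messages are stored one per file with bare LF newlines. The function names are made up for illustration. stat() only gives the on-disk size; the RFC822.SIZE a client expects counts each line ending as CR/LF, so without an index the server has to read the whole file to find out how many bare LFs it contains.

```python
import os


def rfc822_size(raw: bytes) -> int:
    """Size the message would have with every line ending sent as CR/LF."""
    # Each bare LF grows by one byte (the added CR); existing CR/LF pairs do not.
    bare_lf = raw.count(b"\n") - raw.count(b"\r\n")
    return len(raw) + bare_lf


def size_from_stat(path: str) -> int:
    """Cheap: one stat() call, but only right if the file already uses CR/LF."""
    return os.stat(path).st_size


def size_by_reading(path: str) -> int:
    """Correct for LF-stored files, at the price of opening and reading each one."""
    with open(path, "rb") as f:
        return rfc822_size(f.read())
```

With 2000 messages and a client that asks for sizes across the whole mailbox, `size_by_reading` means 2000 opens and full reads, which is the cost described above.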

You also should have an index file to store envelopes and body structures,
so you don't have to open the file. Many filesystems serialize path
references, so lots of consecutive opens are a problem. Yes (shudder),
there are clients which do "FETCH 1:* ALL" or "FETCH 1:* FULL".

So, you want to avoid doing a stat() at all; meaning that you have some
other way of discovering which objects from readdir() are files (messages)
and which are directories (subfolders). Again, the index file.

That's a lot of work for the index file to do. One reason that my mx
format is not encouraged is that it didn't go far enough and failed to
eliminate the stat(). mx and mbx were simultaneous experiments, and mbx
kicked mx's rear end by two orders of magnitude. That's why work on mx
was abandoned with it half-finished.

The problem with an index file in the maildir context is that it defeats
one of the primary advertised capabilities of maildir: the use of
filesystem primitives to do locking (and hence NFS safety). So, an index
file, which is precisely what maildir tries to avoid, is probably out of
the question. What maildir does is store lots of stuff in the file names,
taking advantage of readdir() being much cheaper than the stat(). But you
can only cram so much here; and you have to interact with other maildir
software.
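
As an illustration of storing state in the file name, here is a minimal sketch of reading and writing message flags that way. It assumes the usual maildir convention of an "info" suffix of the form `:2,` followed by flag letters; the helper names and the example file name in the usage note are made up, and this is not taken from Courier-IMAP or any c-client driver.

```python
# Maildir flag letters and the IMAP system flags they correspond to.
# "P" (passed) has no IMAP system flag, so it is ignored here.
MAILDIR_TO_IMAP = {
    "D": "\\Draft",
    "F": "\\Flagged",
    "R": "\\Answered",
    "S": "\\Seen",
    "T": "\\Deleted",
}


def flags_from_name(filename: str) -> set:
    """Read the flags from a name returned by readdir(); no stat() or open() needed."""
    _, sep, info = filename.partition(":2,")
    if not sep:
        return set()
    return {MAILDIR_TO_IMAP[c] for c in info if c in MAILDIR_TO_IMAP}


def name_with_flags(filename: str, imap_flags: set) -> str:
    """Build the new name; changing flags then means renaming the file."""
    base = filename.partition(":2,")[0]
    letters = sorted(l for l, f in MAILDIR_TO_IMAP.items() if f in imap_flags)
    return base + ":2," + "".join(letters)
```

For example, `flags_from_name("952981680.1234.host:2,RS")` yields the `\Answered` and `\Seen` flags. Anything that does not fit into a few letters (sizes, envelopes, body structures) has no such place to live, which is the limit mentioned above.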

It is a set of difficult design tradeoff decisions that I don't wish to
make; no matter which one I choose, someone will be unhappy. That's why
it's better to let the maildir enthusiasts develop their own c-client
maildir drivers.

Mark Crispin

Mar 13, 2000, 3:00:00 AM
to
On Mon, 13 Mar 2000, Mark Crispin wrote:
> Not Solaris, but your point is correct. On some platforms, doing a stat()
> on 2000 files in a directory takes 30 seconds, not 3 seconds. Ouch.

I should add that there's worse than just stat(). I just ran some quick
tests.

What really takes forever is copying 2000 files, because of the
serialization of open() on most filesystems. On most systems, it's much
faster to copy a 2MB file than to copy 2000 1KB files. On my system, it
came out to be 2 seconds vs. 103 seconds!

Deleting 2000 files took 80 seconds. The more messages you expunge, the
slower it is with a one-file/one-message format, and the faster it is with
a flat file format.

Renaming 2000 files, such as what you would have to do in maildir with a
bulk flags change if you store flags in the file name, took 93 seconds.

Big ouch.
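
A rough way to repeat this kind of measurement is sketched below. It is not the test described above (the post does not say how that was run); the `run` and `timed` helpers are made up, and the numbers will depend heavily on the filesystem, the OS and its caches, so no expected timings are given.

```python
import os
import shutil
import tempfile
import time


def timed(fn) -> float:
    """Return the wall-clock seconds taken by fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def run(n: int = 2000, size: int = 1024) -> dict:
    """Time flat-file vs. one-file-per-message operations in a temp directory."""
    results = {}
    with tempfile.TemporaryDirectory() as root:
        many = os.path.join(root, "many")
        os.mkdir(many)
        payload = b"x" * size
        for i in range(n):  # n small files, one per "message"
            with open(os.path.join(many, f"msg{i}"), "wb") as f:
                f.write(payload)
        flat = os.path.join(root, "flat")
        with open(flat, "wb") as f:  # one file holding the same amount of data
            f.write(payload * n)

        results["copy_flat"] = timed(lambda: shutil.copyfile(flat, flat + ".copy"))
        results["copy_many"] = timed(lambda: shutil.copytree(many, many + ".copy"))

        def rename_all():
            # Stand-in for a bulk flags change stored in the file name.
            for name in os.listdir(many):
                os.rename(os.path.join(many, name), os.path.join(many, name + ",S"))

        results["rename_many"] = timed(rename_all)
        results["delete_many"] = timed(lambda: shutil.rmtree(many))
    return results


if __name__ == "__main__":
    for label, seconds in run().items():
        print(f"{label}: {seconds:.3f}s")
```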

Mark Crispin

Mar 13, 2000, 3:00:00 AM
to Vladimir A. Butenko
On Tue, 14 Mar 2000, Vladimir A. Butenko wrote:
> But that's not the point. Even if you can get the RFC size directly from
> the file, it's still a "stat" call. And almost all mail clients want
> to know the size of the message when they scan the mailbox.

Yes, which is why I made a mistake in my mx experiment by depending upon
stat() to get the size instead of storing it in the index. I didn't
realize how slow stat() could be.

Cyrus did not make that mistake; it also stores envelopes and body
structures in the index file. I think that Cyrus pretty much maximizes
the performance that you can squeeze out of one-file/one-message formats,
and they did a great job.

> And this still does not solve the problem of using a cluster when several
> clients can access the store from different servers

And, of course, Cyrus just punts on that entirely by outlawing NFS.

> mdir helps to avoid
> file locks (if it does not use index files), but does not help to
> synchronise changes.

Which is why we use locks in the first place! ;-)

> That's the problem only for the servers that rely on other software for
> mail delivery/automated processing. But the point is clear, and I hope that
> all of us should agree:

Yes, I agree completely with your (a)-(e).

> There is, though, one case that came to my mind, and it can become more and
> more important these days. Some of our clients said that they chose mdir
> for only one feature. It has nothing to do with what we have discussed here
> before. This feature is complete transparency.

Yes, transparency is very important, and often is neglected.

c-client is "mostly" transparent with traditional UNIX mailbox format; the
">" is not needed unless the line really looks like a UNIX mailbox header
line.

However, I agree this is an issue with traditional UNIX mailbox format,
and it's one of the big reasons why I never liked it.

mbx (the current favorite) and mtx (the old Tenex/TOPS-20 format) are
guaranteed fully transparent, even for binary data. I expect to be doing
some work soon to bum extra performance out of mbx format. It used to be
much faster than traditional UNIX format, but after the work I did to bum
performance in the traditional UNIX format it's now only "somewhat
faster". So I need to do some hacking to restore the big performance
advantage of my favorite format... ;-)

tenex format is transparent for normal text; in spite of the name, it's
actually a UNIXified mtx format (originally used by the UNIX port of MM)
that uses UNIX-style bare LF newlines instead of CR/LF. So it's not
transparent if you care about CR and LF transparency and/or binary.

mbx, mtx, and tenex all allow shared read/write access, but they require
the ability to synchronize updates and that means no NFS. Even when you
get locking out of the way, the inode vs. data cache problem over NFS will
still bite you.

mbx has the additional win that it allows shared expunge. That's great
for people who leave an IMAP client running 24/7, and then want to run an
IMAP client on their mail someplace else. For many folks at UW, it's
their office system that runs an IMAP client 24/7; for me, it's my home
system...

Vladimir A. Butenko

Mar 14, 2000, 3:00:00 AM
to
In article <95298168...@shelley.paradise.net.nz>, "Nicholas Lee"
<nj.le...@kiwa.co.nz> wrote:

> "Vladimir A. Butenko" <but...@stalker.com> wrote in message
> news:butenko-1303...@stalker.gamma.ru...
> > In article
> >
> <Pine.NXT.4.30.000313...@Tomobiki-Cho.CAC.Washington.EDU>,
> > Mark Crispin <m...@CAC.Washington.EDU> wrote:
>
> > What I wanted to see posted is not the problems of implementing this or
> > that in someone's code, but the problems in the IMAP4rev1 protocol itself,
> > and the requests for improvements to that protocol. Improvement of a
> > particular server's code is a completely different issue, and should be
> > discussed on that server's support mailing list/forum.
>

> The issue that has given rise to this thread is essentially to provide
> explicit tokenisation for at least the password field, but I might suggest,
> given the "Which clients canNOT handle '@' in authentication identifiers"
> thread, that tokenising the login id might be worthwhile as well.

Hmm. Try to look at it from the other side, OK? How long did it take for
major client developers to stop stripping off the "@xxxx" things from the
account name? As has already been noted here, even 4.5 versions of those
products still do it (though they do not do it for me, I have to say). Now
you want to change the way the password field is sent. Look - they did not
fix a problem that at least 100,000 people have (and those people did
write to them, not only to the server vendors), and you want them to
change something to make their clients work with just one server that has
no installed base (at least, not now).

That simply won't work. You cannot EXTEND a protocol by breaking its
compatibility with the original one. All you can do is to create a NEW
version of the protocol, or a completely new protocol. Not a bad thing,
but the question is - is anybody gonna support it? So, the changes you are
talking about are changes that can be made in a NEW protocol only, and
given the time it took for major players to implement at least SOME IMAP
functionality, I would say that we can discuss the NEW protocol, but we
are unlikely to see it deployed during the next 5-10 years.

This is why my position is: stay with whatever is specified in the current
standard, and invest the development efforts in extensions, not in a
completely new protocol. I'd say that designing a new, not
backward-compatible protocol is worth doing when it is realized that the
current one prevents further development. It was like the switch from POP
to IMAP - it's hardly possible to "extend" the POP protocol to give it
IMAP functionality, so a new protocol was worth the development effort and all
the trouble of deploying it. Mark was doing IMAP for over 10 years (correct
me if I'm wrong here), and IMAP started to play SOME role in the
marketplace only 2-3 years ago. When I made a presentation on the IMAP
protocol (thanks, Terry, for the papers) at Macworld'97, most of the
audience heard about IMAP for the first time.

And now you (and Sam) suggest that we, actually, develop a NEW protocol - i.e.
enter the same 10-year cycle - and for what? For better parsing of some
lexemes in IMAP?! Please, get real.


> I was having a problem because '[]' are used as token delimiters (MIME
> stuff) elsewhere in the protocol spec, whereas they were not 'special'
> characters in the password auth field. For various reasons Sam at this
> stage treats [] as distinct token delimiters everywhere. (Personally I think
> his reason, that it reduces code bloat, is extremely valid.)

IMAP is NOT a language. While one can try to deploy a "traditional"
yacc-style parser for it, there are many catches there, because parsing is
place-dependent: how a character is treated depends on where in the command
it appears. There are no "atoms" and "special symbols" there, in the
strict sense of those terms. A regular parser that just calls a "getLexem"
function would fail in many places there, as it would fail for most
Internet protocols - SMTP, for example.

For some people it's a very bad thing, because it's not what they EXPECT
to see. That's just a question of selecting the right tool for the job. A
generic, programming-language-style lexical analyzer is not the right tool
for parsing IMAP protocol commands.


> The fix he gave me for including [] in the password auth field is to
> stringify the field, ie. "foo[]bar". Once this is done I managed to get my
> "telnet localhost imap" testing to work, outlook express worked regards,
> but pine using the complete spec doesn't.

It's a good idea to put the password in quotes in any case. Otherwise you
have to check that the password does not contain a quote mark. Moreover,
passwords can contain characters that are not allowed in a q-string
either. Some clients (I do not remember their names) always send passwords
as LITERALs.


> I can only say IMO that having a token limiter in one part of the protocol
> stream but not in another other part, would seem to increase the code bloat
> and maintance requirements. Of course not being a imap server authour I
> can't say this exactly. Adding four bytes (tokinising the login id and
> password auth fields) certainly in this case seems worth while.

Any CLIENT vendor can do that. But any SERVER is supposed to accept
anything there - atom, q-string or a LITERAL.

> Of course back to the topic on hand, the thing that pissed me off about
> Mark's actions is he took a paragraph out context and used it to extend his
> agenda. I don't care if he thinks he's doing in the public good or other
> people accept this . It's bad behaviour and I'm going to tell him off for
> it. There are more civil ways of doing this without resorting to being a
> bully-boy.

I think that there was already some agreement here and things, hopefully,
calmed down with the lesson taken by all parties. Let's move forward.


> Nicholas

Vladimir A. Butenko

Mar 14, 2000, 3:00:00 AM
In article <s9u4saah9...@zappa.esys.ca>, Lyndon Nerenberg
<lyn...@MessagingDirect.COM> wrote:

> Vladimir> of some kind is a must. But how for God sake can we find
> Vladimir> interoperability problems if we just sit together
> Vladimir> somewhere and start to chat? I really do not get it,
> Vladimir> please explain.
>
> Vladimir, it's an interop event. 80 engineers in a large room with
> a large ethernet and a large number of computers running a large
> number of client and server implementations, all making sure they
> can talk to each other. [Note that sales, marketing, press, and other
> non-engineering scum are explicitly forbidden from attending ;-)]

Then I have to admit that my position as not only the CTO, but the
president of Stalker completely disqualifies me from joining, since such
scum as myself is not welcome there :-(. Poor me.

> One example of this involved early deployments of the SASL DIGEST-MD5
> mechanism.

You got one more point. DIGEST-MD5 is not implemented correctly in
CGatePro :-). But since no popular client supported it, it was a forgotten
issue waiting for an RFC standard to appear. OK, one more point to go (if
such a scum as myself is allowed there).

> At the interop last March there were three or four vendors
> working on this. We discovered quite quickly that there were some
> vast differences in interpretation of some parts of the spec.

You bet.


> and identify the issues, propose a solution, implement that, and see
> if it got us any further along. All inside of an hour. If we had tried
> to do that by email it a) probably never would have happened to begin
> with and b) still be a work in progress.

While I'm talking to you, the source code of CGatePro is open in an IDE 2
feet from my desk. And it's always open and accessible to the tech staff
24x7. If a problem arises and there is a solution for it, the code is
changed the same day, if not immediately. But, OK, you got the point.



> Interop events are invaluable for this sort of thing, and I stand
> completely behind my statement about serious vendors attending
> them. And that's not a bullet aimed at you.

C'mon. I always wear a vest. Bullets are very welcome here :-)

> It sounds like you've
> not been to one, so it's hard to appreciate just how valuable they
> are.

Heh... Things are more complicated for my poor soul. AFAIR, last time you
had a meeting during one of the shows (LinuxWorld?). I was about to go
there and at least see what it was all about (guys from Cyrusoft told me
about it and convinced me), but my scum part had to stay at our booth, since
there were some large clients or press people coming with whom I had to
talk. Not an excuse, but an explanation. I think if you hold that meeting
next time during some show, we have to schedule at least two people
to attend, so if I cannot go, someone from Stalker will.

> (And for those of us who have attended the IMC IMAP events
> over the years, we have noticed a *direct* correlation between participation
> and product quality. It's amazing to see how the quality of a product
> has improved by the second time a vendor shows up at the event. This
> is a Good Thing for everyone.)

OK, OK, you bought me. When will the next IMC meeting happen? I think
there's some fee to attend - to whom and when should we pay it? I hope that
it will be posted here, as Mark has said, and if some of you drop a message
to my E-mail address, that would work too. Just make it happen in the Bay
area, pleeese... :-)

Vladimir A. Butenko

Mar 14, 2000, 3:00:00 AM
In article <95298367...@shelley.paradise.net.nz>, "Nicholas Lee"
<nj.le...@kiwa.co.nz> wrote:

You forgot to mention one thing - the OS you are using. I guess that you
are doing this on Linux, and Mark spends most of his time on - let me
guess - Solaris :-). The speed of directory scan varies a lot on those
platforms.

CGatePro in the "native" mode scans all its account directories to read
the names of available accounts. And on some sites, it takes up to 10
minutes. That's too bad because they cannot afford 10 minutes of downtime to
upgrade the software. Yes, there are not 2000 files in that directory
(several orders of magnitude more) and they are not in a flat directory,
and there is a "stat" call issued for each of the found files, but that's
still too slow. So, a 3 second delay to scan a 2000-file directory can be
easily observed on some platforms. And with 2000 files, you usually have
about a 5MB mailbox that can be easily read and parsed in 3 seconds.

So, it all depends, and things are not as simple as they seem on the surface.

> Nicholas

Vladimir A. Butenko

Mar 14, 2000, 3:00:00 AM

> On Tue, 14 Mar 2000, Vladimir A. Butenko wrote:

> > You forgot to mention one thing - the OS you are using. I guess that you
> > are doing this on Linux, and Mark spends most of his time on - let me
> > guess - Solaris :-). The speed of directory scan varies a lot on those
> > platforms.
>

> Not Solaris, but your point is correct. On some platforms, doing a stat()
> on 2000 files in a directory takes 30 seconds, not 3 seconds. Ouch.

Hmmmm. AIX? :-)


> It quickly gets worse. If you are lucky, you can get the IMAP
> internaldate and rfc822.size from the stat() call. But, that assumes that
> the messages are stored in RFC822 format, not UNIX format, meaning CR/LF
> newlines instead of LF newlines. If you don't have some sort of index
> file (ala Cyrus), you end up having to open and read the file to count the
> newlines. A lot of clients do "FETCH 1:* FAST"...oh dear...oh my...

I'd even say most of them.


> You also should have an index file to store envelopes and body structures,
> so you don't have to open the file. Many filesystems serialize path
> references, so lots of consecutive opens are a problem. Yes (shudder),
> there are clients which do "FETCH 1:* ALL" or "FETCH 1:* FULL".

Again, not just "there are", there are pretty popular IMAP clients that do
that. Not ALL, not FULL, but at least ENVELOPE and BODYSTRUCTURE (just to
show if there is an attachment in the file). But even if they need just
the "envelope" - that's enough to demand file opening for all files or you
need an index file.



> So, you want to avoid doing a stat() at all; meaning that you have some
> other way of discovering which objects from readdir() are files (messages)
> and which are directories (subfolders). Again, the index file.

Not exactly. There is no need to store the messages and the subfolders in
the same physical directory. In CGatePro, for example, the mailboxes are
name.mbox or name.mdir files/directories, while submailboxes are stored
inside the name.folder directory.

But that's not the point. Even if you can get the RFC size directly from
the file, it's still a "stat" call. And almost all mail clients want
to know the size of the message when they scan the mailbox.

> That's a lot of work for the index file to do. One reason that my mx
> format is not encouraged is that it didn't go far enough and failed to
> elimate the stat(). mx and mbx were simultaneous experiments, and mbx
> kicked mx's rear end by two orders of magnitude. That's why work on mx
> was abandoned with it half-finished.
>
> The problem with an index file in the maildir context is that it defeats
> one of the primary advertised capabilities of maildir: the use of
> filesystem primitives to do locking (and hence NFS safety).

And this still does not solve the problem of using a cluster when several
clients can access the store from different servers: mdir helps to avoid
file locks (if it does not use index files), but does not help to
synchronise changes. At least, it requires the server to check what has
changed in the directory each time (to issue the IMAP unilateral
FETCH/EXPUNGE messages), and for a mailbox with 2000 messages it means
constant delays.
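
A rough Python sketch of that periodic check, under the assumption that the server keeps the previous directory listing in memory and that file names follow the maildir "base:2,flags" convention. The function names are hypothetical. Every poll has to read the whole directory again, so the cost grows with the mailbox size.

```python
def _split(name):
    """Split a maildir file name into (unique base name, set of flag letters)."""
    base, _, info = name.partition(":2,")
    return base, frozenset(info)

def diff_snapshots(old_names, new_names):
    """Compare two directory listings.

    Returns (expunged, arrived, reflagged): base names that disappeared,
    base names that are new, and base names whose flags changed.
    """
    old = dict(_split(n) for n in old_names)
    new = dict(_split(n) for n in new_names)
    expunged = sorted(old.keys() - new.keys())
    arrived = sorted(new.keys() - old.keys())
    reflagged = sorted(b for b in old.keys() & new.keys() if old[b] != new[b])
    return expunged, arrived, reflagged
```

A server session would translate the three lists into unilateral EXPUNGE, EXISTS and FETCH FLAGS responses; that mapping is left out here.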

> So, an index
> file, which is precisely what maildir tries to avoid, is probably out of
> the question. What maildir does is store lots of stuff in the file names,
> taking advantage of readdir() being much cheaper than the stat(). But you
> can only cram so much here; and you have to interact with other maildir
> software.

That's the problem only for the servers that rely on other software for
mail delivery/automated processing. But the point is clear, and I hope that
all of us can agree:

a) there are many more problems with the .mdir format than one can see at
first glance.

b) there are rather small narrow areas where .mdir really provides an
improvement over .mbox, while in most other cases it's slower (and always
- more resource hungry)

c) those narrow areas are the ones (storing a small number of large (MB+)
messages) where .mdir has a plus.

d) as a result of a)-c) it's very reasonable for a server implementor
not to support the .mdir format, if he does not see a great demand (in those
narrow areas)

e) as a result of a)-c) it's very unreasonable for any server
implementor to provide .mdir as the ONLY way to store messages.


> It is a set of difficult design tradeoff decisions that I don't wish to
> make; no matter which one I choose, someone will be unhappy. That's why
> it's better to let the maildir enthusiasts develop their own c-client
> maildir drivers.

Exactly. Actually, I know only a handful of our clients that actually use
the .mdir format on their servers. I can never say for sure, because it's
not even in the admin's hands - any CGatePro user can create mailboxes of
any type inside his/her account, but at least we have not heard about
.mdir being too popular on the server that provides BOTH.

There is, though, one case that came to my mind, and it can become more and
more important these days. Some of our clients said that they chose mdir
for only one feature. It has nothing to do with what we have discussed here
before. This feature is complete transparency.

In "classic" .mbox mailbox managers, you have to add a ">" to any line that
starts with "From " and has an empty line in front of it. Not a big deal,
one would say - but not these days. S/MIME becomes more and more popular,
and if the message is not encrypted (and thus is in base-64), but is just
signed, then adding that ">" to the message by the server invalidates the
digital signature.
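
The effect can be shown with a small Python sketch. It is an illustration only: `mbox_escape` is a made-up helper implementing the rule as described above, and a plain SHA-256 digest stands in for the hash that a real signature check would compute over the signed content.

```python
import hashlib

def mbox_escape(message: bytes) -> bytes:
    """Prefix '>' to any line that starts with 'From ' and follows an empty line."""
    out = []
    prev_blank = False
    for line in message.split(b"\n"):
        if prev_blank and line.startswith(b"From "):
            line = b">" + line
        out.append(line)
        prev_blank = (line.strip(b"\r") == b"")
    return b"\n".join(out)

# A signed (but not encrypted) body that happens to contain such a line.
signed_body = b"Hello,\n\nFrom now on we ship on Fridays.\n"
stored = mbox_escape(signed_body)

# The digest over the stored copy no longer matches the digest that was signed.
digest_sent = hashlib.sha256(signed_body).hexdigest()
digest_stored = hashlib.sha256(stored).hexdigest()
```

Unless the reader undoes the escaping exactly, the bytes handed to the verifier differ from the bytes that were signed.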

I'm not 100% sure; possibly it's an issue not with S/MIME but with PGP
only, but it does exist, and we know some business customers who use
secure mail a lot and thus had to choose .mdir over .mbox.

Again, it has nothing to do with .mdir as the format; any single-file
format that provides transparency will do this job, too.


> -- Mark --

Sam

Mar 14, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <butenko-1303...@stalker.gamma.ru>,
but...@stalker.com (Vladimir A. Butenko) writes:

> Look. Calm down, please. You are not the only person Mark attacted here
>:-). And I'd agree that taking info from a personal E-mail and posting it
> on a public forum is a BAD thing.

No, I have no problem with that, but only as long my position is accurately
represented, and not twisted around in order to further one's own petty
disputes with third parties.

> a) Yes, IMAP standard is not perfect. But it is a complicated protocol,

It's not much more complicated than SMTP with all the ESMTP extensions; yet
you do not find the same level of interoperability problems there.

> d) there are many other issues in IMAP specs that can be discussed. But
> they should be discussed in the professional manner, and not in starting
> the "pissing match" that you always say you do not want to participate in.

Well, you don't have much choice there when someone's deliberately
misrepresenting you, in order to further his own agenda. If I see
Mark Crispin, or anyone else for that matter, behaving in a totally
unprofessional and unethical manner -- again, that has nothing to do with
publishing personal mail -- I am going to challenge that no matter who he
or she is.

>> making positive comments vis-a-vis Microsoft). But, it won't. One thing
>> I've realized is that I don't think that I really want to earn the same
>> reputation as UW-IMAP. In fact, I wrote Courier-IMAP precisely because of
>> UW-IMAP's reputation of ignoring repeated requests for compatibility with
>> software written by someone who's been feuding with UW-IMAP's authors.
>
> Could you please list those requests? We have now aprx. 5mln seats sold
> last year. Aprx 10% of those are using IMAP. We would hear about any
> problem with any IMAP software. And I should tell you - we have no
> "improvements" of the IMAP4rev1 in our servers. It's strictly on the
> standard, and those clients do follow the standard. So I'd be very
> interested in learning about the problems the current standard presents to
> the current clients.

Not those kind of requests -- I've watched for a couple of years as
repeated requests to add maildir support to c-client (UW IMAP and Pine)
were refused. The initial excuses given were that there is no need for
maildir support in the UW, and this software is really for UW's use only.
I suppose that at some point that no longer seemed to be very credible;
eventually c-client grew to support a large assortment of back end mail
formats, and it would've been a stretch to ask people to believe that every
one of them was in use in the University of Washington.

So, I suppose the current excuse has something to do with performance, and
I see that elsewhere that FUD has been thoroughly debunked, so I don't need
to go over that. But the bottom line is that I got tired of watching this
constant bickering, and decided to do the job myself, and did it.

>> And I'm certainly not going to go down the same path myself.
>
> If you want to DEVELOP a BETTER standard - let's discuss it. Right here.
> As you can see, there are many IMAP4rev1 extensions that are documented in
> RFCs that do not have Mark's name on them. So, I do not understand why you

Oh, although I do believe that a better remote mail access protocol would
be very welcome, and very useful, that's nothing more than a nice fantasy.
It's not going to happen.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4zbfq+3BFaxHnGY0RApovAKCPfnnjkQoc7vfy5X4VbKY7IMk3OQCeL68p
GTNgGGe86mRVe5HQBwj5z2M=
=s161
-----END PGP SIGNATURE-----


Sam

Mar 14, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <butenko-1403...@stalker.gamma.ru>,
but...@stalker.com (Vladimir A. Butenko) writes:

> In article <95298367...@shelley.paradise.net.nz>, "Nicholas Lee"
> <nj.le...@kiwa.co.nz> wrote:
>
>> "Mark Crispin" <m...@CAC.Washington.EDU> wrote in message
>> news:Pine.NXT.4.30.000313...@Tomobiki-Cho.CAC.Washington.ED
>> U...
>>
>>
>> > I believe that it is infeasible to build maildir support that scales well
>> > (e.g. does not exhibit performance problems with a moderately large
>> > mailbox of 2000 messages) and also does not violate a major rule of either
>>
>> As a point of interest, I have several mailbox at stage with near a 1000
>> message. Many with large attactments. I merged (copied) a few them into one
>> mailbox giving me 2060 messages, about 34 megs worth.
>>
>> Both Outlook express and pine (4.21) have no issue with this mailbox at all.
>> Pine opens the mailbox instantly having never seen it before and of course
>> outlook express goes though the process of caching the headers which takes a
>> little while.
>>
>> It might be noted that this is currently an semi-loaded K6-2 400 with
>> 128megs of ram. With only my mail box being open. However I can't see how
>> having too parse a 2000 mbox format message of size 25-34 megs is going to
>> be fast (and saver) than the file system and maildir format. Worse case you
>> use something liek Resifs (sp?) to deal with all the small files.
>>
>> Of course I'm using courier-imap.
>

> You forgot to mention one thing - the OS you are using. I guess that you
> are doing this on Linux, and Mark spends most of his time on - let me
> guess - Solaris :-). The speed of directory scan varies a lot on those
> platforms.

Really? There's such a huge performance difference in the speed of
opendir() and readdir() on different platforms?

> upgrade the software. Yes, there are not 2000 files in that directory
> (several orderes of magnitude more) and they are not in a flat directory,
> and there is a "stat" call issued for each of the found files, but that's
> still too slow.

Well, I guess I'm lucky, because I do not need to stat each file when
opening a maildir. Just opendir() and readdir(). That's all.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4zbf8+3BFaxHnGY0RAqE4AJ4jiA3WfhZiEAGCY+/zlrXFwjHGBACgkIHx
c9KJy4WoXNWknedqKUde31A=
=jxqf
-----END PGP SIGNATURE-----


Sam

Mar 14, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <butenko-1303...@stalker.gamma.ru>,
but...@stalker.com (Vladimir A. Butenko) writes:

> I can argue with you that directory-based solutions can be made scalable,
> but this is not the point: the internal structure of some IMAP server has
> nothing to do with the IMAP protocol specs themselves. That's I'm afraid,
> the Sam's problem here: he had some disagreements with you as the designer
> of one of IMAP servers, and that's completely up to him - he can create an

Please don't misrepresent me like Mark did. The only issues I've ever
had were some issues with IMAP4rev1 itself, and some problems with several
IMAP clients. Go ahead and search Deja, or any other search engine, and
try to catch me badmouthing either Mark Crispin, or UW-IMAP, before last
week. Perhaps you are referring to my comments regarding the repeated
refusals to add maildir support to UW-IMAP, but that was only a personal
observation that I haven't really discussed with anyone; in fact I don't
even mention it in the release notes. That was only a personal motivation
for me, nothing more.

> IMAP server that has nothing in common with your server, and if it is a
> better server -so let it be so.
>
> But then he passes that disagreement with Mark-designer to
> Mark-protocol-maintainer, and this is completely different issue. One can

No. The only "disagreement", if you want to call it that, is with Mark
period. He decided to take a purely technical disagreement on the merits
of IMAP4rev1, and turn it into a personal attack and a smear. I don't care
who he is, a "designer" or "maintainer".

> What I wanted to see posted is not some problems of implementing this or
> that in someone's code, but the problems in the IMAP4rev1 protocol itself,

Well, yes, I think I wouldn't have any problems coming up with a list;
maybe tomorrow.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4zbqv+3BFaxHnGY0RArDgAJsFQWUPHhSgQMiOOPA2c4k6E28SiwCeJ8gO
hnkJCnRy+Mx0tNpQKQwkXes=
=tWp8
-----END PGP SIGNATURE-----


Sam

Mar 14, 2000, 3:00:00 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <butenko-1403...@stalker.gamma.ru>,
but...@stalker.com (Vladimir A. Butenko) writes:

> And now you (and Sam) suggest to, actually, develop a NEW protocol - i.e.

No, I don't believe I've ever made any such suggestion.

> IMAP is NOT a laguage. While one can try to deploy a "traditional"

Well, it's not a language in the traditional sense, like C or Pascal.
Still, I think it's too complicated to be described as a mere protocol.
There's a lot of sophistication there, more than just meets the eye.

> yacc-style parser for it, there are many catches there, because it is
> place-dependent. There are no "atoms" and "special symbols" there, in the
> strict sense of those terms. A regular parser that just calls "getLexem"
> function would fail in many places there, as it would fail for most of
> Inet protocols - SMTP, for example.

And that's precisely the problem. Anything with the syntactical complexity
of IMAP should have a distinct separation between its lexical and
grammatical constructors. There's no such thing here. Everything is just
a large writhing glob of spaghetti. And that is what I believe is the root
of most of the problems with IMAP.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4zbvL+3BFaxHnGY0RAkYVAKCXyUuJCSF27G7FjAQQpN9movIPpQCdG6Nu
inQ00pVyMe7AuXxUiQufMlA=
=dBOl
-----END PGP SIGNATURE-----


Steve Sobol

Mar 14, 2000, 3:00:00 AM
From 'Sam':

>> You forgot to mention one thing - the OS you are using. I guess that you
>> are doing this on Linux, and Mark spends most of his time on - let me
>> guess - Solaris :-). The speed of directory scan varies a lot on those
>> platforms.
>
>Really? There's such a huge performance difference in the speed of
>opendir() and readdir() on different platforms?

I would bet that it's not the performance within the system libraries,
but rather the layout of the filesystem that makes the difference.

Linux boxen and Sun Microsystems' products both use the same hard drives.
I could take the IBM SCSI hard drive connected to the SUN SCSI controller
in the SparcStation 20 I revived back at the end of 1998, and stick it in
a PC running Linux using a BT958, and it would work just as well.

But Suns do tend to handle large directories more efficiently, from what
I understand.


--
North Shore Technologies, Cleveland, OH http://NorthShoreTechnologies.net
Steve Sobol, President, Chief Website Architect and Janitor
sjs...@NorthShoreTechnologies.net - 888.480.4NET - 216.619.2NET

Yiorgos Adamopoulos

Mar 14, 2000, 3:00:00 AM
In article <courier.38CD...@email-scan.webcircle.com>, Sam wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>In article <butenko-1403...@stalker.gamma.ru>,
> but...@stalker.com (Vladimir A. Butenko) writes:
>
>> And now you (and Sam) suggest to, actually, develop a NEW protocol - i.e.
>
>No, I don't believe I've ever made any such suggestion.

No, that was me ;-)

Yiorgos Adamopoulos

Mar 14, 2000, 3:00:00 AM
>It's not going to happen.

If I am ever to finish my PhD, another (maybe not better) protocol is going
to happen ;-)

--
${talks} /* money talks */

Yiorgos Adamopoulos

Mar 14, 2000, 3:00:00 AM
>Really? There's such a huge performance difference in the speed of
>opendir() and readdir() on different platforms?

Yes, because you execute them on different filesystems. On Linux this
is most probably ext2fs, whereas on Solaris this is UFS.

Vladimir A. Butenko

Mar 14, 2000, 3:00:00 AM
<s...@email-scan.webcircle.com> wrote:

> >
> > You forgot to mention one thing - the OS you are using. I guess that you
> > are doing this on Linux, and Mark spends most of his time on - let me
> > guess - Solaris :-). The speed of directory scan varies a lot on those
> > platforms.
>

> Really? There's such a huge performance difference in the speed of
> opendir() and readdir() on different platforms?

Yes, really. C'est la vie.



> > upgrade the software. Yes, there are not 2000 files in that directory
> > (several orderes of magnitude more) and they are not in a flat directory,
> > and there is a "stat" call issued for each of the found files, but that's
> > still too slow.
>
> Well, I guess I'm lucky, because I do not need to stat each file when
> opening a maildir. Just opendir() and readdir(). That's all.

Actually, in THAT operation (account scan, not mailbox scan and not
maildir mailbox scan) CGatePro in pre-3.2 versions did not use "stat" at
all. The results were still very slow under Solaris. OK, it was not 10
minutes, it was 7, but that's still too long for a start-up of even a
major site.

So - the short answer is: really.

Vladimir A. Butenko

Mar 14, 2000, 3:00:00 AM


> > IMAP is NOT a laguage. While one can try to deploy a "traditional"
>

> Well, it's not a language in the traditional sense, like C or Pascal.
> Still, I think it's more complicated to be described as a mere protocol.
> There's a lot of sophistication there, more than just meets the eye.
>

> > yacc-style parser for it, there are many catches there, because it is
> > place-dependent. There are no "atoms" and "special symbols" there, in the
> > strict sense of those terms. A regular parser that just calls "getLexem"
> > function would fail in many places there, as it would fail for most of
> > Inet protocols - SMTP, for example.
>

> And that's precisely the problem. Anything with the syntactical complexity
> of IMAP should have a distinct separation between its lexical and
> grammatical constructors. There's no such thing here. Everything is just
> a large writhing glob of spaghetti. And that is what I believe is the root
> of most of the problems with IMAP.

Sam, while I do understand your frustration, let me politely remind you
that there are many more things in the world than one can find on a farm
in Kansas. If you attempt to use a yacc-type parser with languages like
Fortran, Snobol4 or even APL, you will get into even more problems than
you have with IMAP. It's just not the right tool. Not all languages are
based on the same design, and some bear the mark of a long evolution...

Yes, I'd agree with you that today, after 40+ years of compiler
development and a more or less standardized process of parsing, it would be
nice if everywhere we need a parser, we could use the same algorithms. But
- IMAP appeared many years ago, making it Pascal-language-like was not (I
think) one of Mark's goals, and while we may all say - yes, it would
be better, IF... - we have what we have now, and I do not think anyone can
BLAME Mark for that.

If you are willing to accept some humble advice, I'd risk giving you some. I
developed about 8 compilers during the 1980s. In none of them did I use a
table-driven parser, even for those languages that were designed to be
table-parsed: Pascal and Ada, for example. Plain SWITCH and IF/THEN parsing
proved (at least, in my experience) to be much more readable, customizable
and easier to support than all those table-driven parsers one can read about
in all those thick and clever books. One of the advantages of that design is
context-sensitivity - you do not have to rely on a single "getLexem"
function (though you can rely on it for languages like Pascal/Ada), but
you can, instead, call getSymbol, or getAString, etc. in the places where
it is needed. This allows one to parse languages like IMAP as easily as one
can parse languages like Pascal.
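
As an illustration of that style, here is a minimal hand-written sketch in Python. It is not code from CGatePro or any other server. The class and method names are invented, the set of atom specials is my reading of RFC 2060 and should be treated as an assumption, and the literal case assumes the whole client exchange has already been collected into one buffer (a real server has to send a continuation response before the literal data arrives).

```python
# Characters that end an atom, as I read RFC 2060: ( ) { SP % * " \ plus controls.
ATOM_SPECIALS = set(b'(){ %*"\\')

class Reader:
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def skip_space(self):
        if self.data[self.pos:self.pos + 1] != b" ":
            raise ValueError("space expected")
        self.pos += 1

    def get_atom(self) -> bytes:
        start = self.pos
        while self.pos < len(self.data):
            c = self.data[self.pos]
            if c in ATOM_SPECIALS or c < 0x20 or c == 0x7F:
                break
            self.pos += 1
        if self.pos == start:
            raise ValueError("atom expected")
        return self.data[start:self.pos]

    def get_quoted(self) -> bytes:
        self.pos += 1  # skip the opening quote
        out = bytearray()
        while self.pos < len(self.data):
            c = self.data[self.pos]
            if c == 0x22:  # closing quote
                self.pos += 1
                return bytes(out)
            if c == 0x5C:  # backslash: take the next byte as-is
                self.pos += 1
                if self.pos >= len(self.data):
                    break
                c = self.data[self.pos]
            out.append(c)
            self.pos += 1
        raise ValueError("unterminated quoted string")

    def get_literal(self) -> bytes:
        end = self.data.index(b"}\r\n", self.pos)  # ValueError if missing
        size = int(self.data[self.pos + 1:end])
        start = end + 3
        if start + size > len(self.data):
            raise ValueError("literal is shorter than announced")
        self.pos = start + size
        return self.data[start:self.pos]

    def get_astring(self) -> bytes:
        # Context decides: at this position any of the three forms is legal.
        first = self.data[self.pos:self.pos + 1]
        if first == b'"':
            return self.get_quoted()
        if first == b"{":
            return self.get_literal()
        return self.get_atom()

def parse_login(line: bytes):
    """Parse 'tag LOGIN user password' and return (tag, user, password)."""
    r = Reader(line)
    tag = r.get_atom()
    r.skip_space()
    if r.get_atom().upper() != b"LOGIN":
        raise ValueError("only LOGIN is handled in this sketch")
    r.skip_space()
    user = r.get_astring()
    r.skip_space()
    password = r.get_astring()
    return tag, user, password
```

With this layout a password such as foo[]bar needs no special casing: the caller asks for an astring at that position, so "[" and "]" are just ordinary atom characters there, and the same password is accepted quoted or as a literal.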

That's just my humble opinion. In no way do I want to say that this is how
all IMAP parsers SHOULD be built. I just say that there is a method that
allows one to parse IMAP easily, quickly, and w/o any spaghetti code.

> -----END PGP SIGNATURE-----

Vladimir A. Butenko

Mar 14, 2000, 3:00:00 AM
I think we should not keep the post subject unchanged, since the subject
of the discussion has changed, and since I do not think anyone should
"beware of Courier-IMAP" - I think that we all should encourage Sam to
create a good IMAP server, and that you, Mark, as the creator and evangelist
of the IMAP protocol, will join me in wishing Sam the best of luck. Good
parents can even kick their girl if she is acting wrong with her friends,
but they are not supposed to say "beware of her" to her potential, hmmm,
customers. And you are in the position of such a parent, whether you like
it or not :-)

> On Tue, 14 Mar 2000, Vladimir A. Butenko wrote:

> > mdir helps to avoid
> > file locks (if it does not use index files), but does not help to
> > synchronise changes.
>

> Which is why we use locks in the first place! ;-)

Yes, but with a reminder: locks just ensure consistency of the mailbox
data, but they do not help to inform other "clients" (client agents
within the server) that the change has occurred. Thus - a need to check the
mailbox datafile/indexfile/directory periodically - thus an overhead.



> Yes, I agree completely with your (a)-(e).

Good! :-)



> Yes, transparency is very important, and often is neglected.
>
> c-client is "mostly" transparent with traditional UNIX mailbox format; the
> ">" is not needed unless the line really looks like a UNIX mailbox header
> line.

Mark, I'm pretty sure you get lots of support E-mail. And some of those
mails can say "what happened to my mailbox?" and contain the data from the
mailbox file. Copy-pasted. With exactly that separator line. So, to get it
inside a message is not such an improbable event...



> However, I agree this is an issue with traditional UNIX mailbox format,
> and it's one of the big reasons why I never liked it.

There it's just more probable. BTW, on some systems (AIX?) Unix mailbox
format was modified to include one (or 4?) 0x01 (^A) characters in front
of the From line - I guess in an attempt to make it more transparent.



> mbx, mtx, and tenex all allow shared read/write access, but they require
> the ability to synchoronize updates and that means no NFS. Even when you
> get locking out of the way, the inode vs. data cache problem over NFS will
> still bite you.

Nope :-). We do it differently: we do not access data directly on NFS from
different sources, so there is no problem if something is cached.



> mbx has the additional win that it allows shared expunge. That's great
> for people who leave an IMAP client running 24/7, and then want to run an
> IMAP client on their mail someplace else. For many folks at UW, it's
> their office system that runs an IMAP client 24/7, for me, it's my home
> system...

So, you separate shared read/write and shared expunge, right? We treat
them all as read/write, and shared means shared - i.e. all types of
mailboxes support all types of shared operations - read, write (flag
modification), and expunge.

Yiorgos Adamopoulos

Mar 14, 2000, 3:00:00 AM
In article <butenko-1403...@stalker.gamma.ru>, Vladimir A. Butenko wrote:
>There it's just more probable. BTW, on some systems (AIX?) Unix mailbox
>format was modified to include one (or 4?) 0x01 (^A) characters in front
>of the From line - I guess in an attempt to make it more transparent.

IIRC, 4 ctrl-As means using MMDF.

--
${talks}

Mark Crispin

Mar 14, 2000, 3:00:00 AM
to Vladimir A. Butenko
On Tue, 14 Mar 2000, Vladimir A. Butenko wrote:
> Mark, I'm pretty sure you get lots of support E-mail. And some of those
> mails can say "what happened to my mailbox?" and contain the data from the
> mailbox file. Copy-pasted. With exactly that separator line. So getting it
> inside a message is not such an improbable event...

Well, since I don't use traditional UNIX mailbox format (and have not for
many years), that has not been an issue for me. ;-)

Nevertheless, I can't remember the last time that I received a mailbox
that was copy-pasted. If people send me a mailbox to analyze, they
usually send it as a MIME attachment. About once or twice a year
someone sends me a mailbox by uuencode, but I just don't see copy-pasted
mailboxes.

> BTW, on some systems (AIX?) Unix mailbox
> format was modified to include one (or 4?) 0x01 (^A) characters in front
> of the From line - I guess in an attempt to make it more transparent.

That is MMDF format, which is the standard on SCO systems. It was quite
popular about 15-20 years ago. The CTRL/A characters replace the "From "
line, although some MMDF mailboxes have both.
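
As a rough illustration of that framing, the sketch below splits an MMDF-style file into messages. It assumes each message is bracketed by a line of exactly four CTRL/A characters before and after it; split_mmdf is a hypothetical helper, not a c-client routine, and it ignores the mixed case where a "From " line is also present.

```python
from typing import List, Optional

MMDF_DELIM = b"\x01\x01\x01\x01\n"  # four CTRL/A characters on a line of their own


def split_mmdf(data: bytes) -> List[bytes]:
    """Return the messages found between pairs of delimiter lines."""
    messages: List[bytes] = []
    current: Optional[List[bytes]] = None  # None means "outside a message"
    for line in data.splitlines(keepends=True):
        if line == MMDF_DELIM:
            if current is None:
                current = []  # opening delimiter
            else:
                messages.append(b"".join(current))
                current = None  # closing delimiter
        elif current is not None:
            current.append(line)
    return messages
```

Because the delimiter can never occur in ordinary text, a body line that starts with "From " needs no quoting in this format.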

> > mbx, mtx, and tenex all allow shared read/write access, but they require
> > the ability to synchronize updates and that means no NFS. Even when you
> > get locking out of the way, the inode vs. data cache problem over NFS will
> > still bite you.
> Nope :-). We do it differently: we do not access data directly on NFS from
> different sources, so there is no problem if something is cached.

Could you explain this? Do you access data on NFS? Do you access data
from different sources simultaneously? If the answer to these is "yes",
then how do you avoid doing both? That is, without telling your users
"don't do both", which is effectively what I do when I say "don't use
mbx/mtx/tenex formats over NFS".

> So, you separate shared read/write and shared expunge, right?

Not exactly, but for the purposes of this discussion, "yes".

More specifically, there are two lock states in mbx (mtx/tenex and
unix/mmdf also have two lock states, but these work differently so I'm not
discussing them here).

One lock state indicates that a process has the mailbox selected. Every
process that has the mailbox selected owns a share lock on this lock
state. If a process can acquire an exclusive lock on this lock
state, it can compress out expunged messages during a CHECK or EXPUNGE;
otherwise, EXPUNGE will just mark deleted messages as invisible and allow
the other sharing processes to discover that (and percolate the untagged
EXPUNGE event to the MUA) on their own.

The other lock state governs the ability to parse the mailbox (meaning
discover new messages) with a share lock, or to append to the mailbox with
an exclusive lock. Unlike the first lock state, this is transient; a
process holds this lock (share or exclusive) as long as is necessary to do
the task at hand, then it releases it. The normal effect is that any
number of processes can be parsing the mailbox (this is done at select
time for the entire mailbox, and for new messages when they arrive), but
the MDA must be able to shut out all parsing (and other MDAs) when
delivering mail.

"Parse" in this context does not mean RFC822/MIME parsing. It just means
locating internal headers and acquiring the IMAP "fast" data for all the
messages reported with an untagged EXISTS.
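
The first lock state can be sketched with flock(). This is an illustration under assumptions, not the mbx driver's code: the function names are made up, and the re-lock after a failed upgrade is there because a failed non-blocking conversion may drop the shared lock on some systems.

```python
import fcntl
import os


def select_mailbox(path: str) -> int:
    """Open the mailbox and hold a shared lock for as long as it is selected."""
    fd = os.open(path, os.O_RDWR)
    fcntl.flock(fd, fcntl.LOCK_SH)
    return fd


def try_compress(fd: int) -> bool:
    """Try to upgrade to an exclusive lock without blocking.

    True means nobody else has the mailbox selected, so expunged messages
    could be physically squeezed out. False means the caller should only
    mark them invisible and let the other sessions notice.
    """
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # The failed conversion may have released our shared lock; retake it.
        fcntl.flock(fd, fcntl.LOCK_SH)
        return False
    return True


def end_compress(fd: int) -> None:
    """Drop back to the shared lock once compression is finished."""
    fcntl.flock(fd, fcntl.LOCK_SH)
```

With two sessions holding the mailbox, try_compress() returns False for both; as soon as one of them closes its descriptor, the other can get the exclusive lock.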

> We treat
> them all as read/write, and shared means shared - i.e. all types of
> mailboxes support all types of shared operations - read, write (flag
> modification), and expunge.

If I remember correctly, you accomplish this by having a multi-threaded
server, so you don't have to worry about process/process interaction. You
also essentially assume that other software isn't going to be operating on
your files, right?

That's certainly the right thing to do if you can make those assumptions.
Unfortunately, I can not in UW imapd; I *must* have process/process
interaction and external software that is completely out of my control.

chris ulrich

Mar 15, 2000, 3:00:00 AM
to
%%On Mon, 13 Mar 2000, Vladimir A. Butenko wrote:
%%
%%I believe that it is infeasible to build maildir support that scales well
%%(e.g. does not exhibit performance problems with a moderately large
%%mailbox of 2000 messages) and also does not violate a major rule of either
%%maildir or IMAP. It's a no-win situation for me; and therefore I choose
%%to allow the maildir enthusiast community to do their own development,
%%distribution, and support of maildir IMAP code.
%%-- Mark --

I just did a trial run with a folder with 9000+ messages in it, mostly
between 1 and 4k in size. I used the netscape4 mail client from a
pentiumII to connect to an E250 with an older CPU (250?mhz). It took
about one minute to open the entire folder. Moving about 2000 of these
messages from the middle of the folder to another folder that already had
stuff in it took about 2 minutes. The sun is running solaris 2.6 and is
using normal disks.
While this is hardly speedy, it is also not terrible given the size of
the folders. Given that this is a pretty evil boundary case, and given the
advantages of having an NFS safe mailstore, I'd consider this to be
acceptable performance. I'm using courier-imap version 0.25a.
(and for the record, wu-imap performed better opening a different 8000
message folder on another machine; I didn't test moving messages from one
folder to another; wu-imap was using mbox formatted folders).
chris

Sam

Mar 15, 2000, 3:00:00 AM
to
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article <8amm0s$bie$1...@pravda.ucr.edu>,
cdu@jawa. (chris ulrich) writes:

> Moving about 2000 of these
> messages from the middle of the folder to another folder that already had
> stuff in it took about 2 minutes.

That's not bad, when you consider that 2000 messages were physically
duplicated there. They were not simply hardlinked into a different folder,
they were physically copied. Then, Netscape probably went in and marked
the originals as deleted, which translates to a filesystem rename. I'm
pretty sure there's no expunge here, though.

Your BUFSIZ is probably 8K, so all your messages required only one read and
one write to be copied over. So, that's 4000 opens, 4000 closes, 2000
reads, 2000 writes; and 2000 renames, in two minutes.

I don't think it would matter whether you've taken them from the beginning
or from the end of the folder. The order in which the messages were
displayed had absolutely nothing to do with their physical order in the
directory.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: http://www.geocities.com/SiliconValley/Peaks/5799/GPGKEY.txt

iD8DBQE4zvTd+3BFaxHnGY0RAinRAKCoEkH8HPAtCE1ObNbjwXrLLLIBEwCbBRij
pcqO2n2abfOZpI5+/TRcNgY=
=kGMG
-----END PGP SIGNATURE-----


Villy Kruse

Mar 15, 2000, 3:00:00 AM
to
On Tue, 14 Mar 2000 18:44:58 +0300,
Vladimir A. Butenko <but...@stalker.com> wrote:

> ... taken out of context ...

>
>> However, I agree this is an issue with traditional UNIX mailbox format,
>> and it's one of the big reasons why I never liked it.
>
>There it's just more probable. BTW, on some systems (AIX?) Unix mailbox
>format was modified to include one (or 4?) 0x01 (^A) characters in front
>of the From line - I guess in an attempt to make it more transparent.
>
FYI this is MMDF format, which has been used for a long time on SCO
systems, and which, by the way, is also supported by c-client, and
therefore by UW imapd and pine.

However, I've seen a problem (very rarely) when some pop server on a SCO
system treats this mailbox as a regular unix mailbox, and therefore
the ^A^A^A^A sneaks into the reply. Pine will then take this as a
message separator and the message is cut in half. This problem will
probably go away as the old SCO systems (now 5 to 10 years old) are
phased out and the MMDF mailbox format with them.


Villy

Vladimir A. Butenko

Mar 15, 2000, 3:00:00 AM
to
In article
<Pine.NXT.4.30.000314...@Tomobiki-Cho.CAC.Washington.EDU>,
Mark Crispin <m...@CAC.Washington.EDU> wrote:

> That is MMDF format, which is the standard on SCO systems. It was quite
> popular about 15-20 years ago. The CTRL/A characters replace the "From "
> line, although some MMDF mailboxes have both.

Thank you all for educating me - I did not know that.

> > > mbx, mtx, and tenex all allow shared read/write access, but they require
> > > the ability to synchronize updates and that means no NFS. Even when you
> > > get locking out of the way, the inode vs. data cache problem over NFS will
> > > still bite you.
> > Nope :-). We do it differently: we do not access data directly on NFS from
> > different sources, so there is no problem if something is cached.
>
> Could you explain this?

A bit :-)

> Do you access data on NFS?

On back-end Mail Servers - yes.

> Do you access data from different sources simultaneously?

If you look at this from the client point of view - yes. If you look from
the NFS server point of view you will see that any mailbox can be accessed
by only one back-end server at a time. It's the CommuniGate Pro
Dynamic Cluster software that does the trick.

> If the answer to these is "yes", then how do you avoid doing both?
> That is, without telling your users
> "don't do both", which is effectively what I do when I say "don't use
> mbx/mtx/tenex formats over NFS"

That's the point. Imagine a simpler cluster (w/o frontends). Just 5
back-end servers running with the same set of NFS boxes. With a load
balancer in front of them. BTW, regular load balancers are not that good at
balancing MAIL load, since mail load != mail traffic, but for a smaller
(1,000,000 users) site - a regular load balancer is still OK.

So, you hit the server A and open an IMAP/POP session with account X
mailbox Y (WebMail sessions are processed in a slightly different manner,
so I'm talking about POP/IMAP only). The server A accesses the mailbox
data on NFS directly. Other connections can be opened, and as long as they hit
the same server, all works OK, like they work on a single-server
CommuniGate Pro installation - with the server software doing all
synchronization INSIDE the server.

Then a new connection is established (from the same user or from a
different location) - to the same account X, mailbox Y. Besides POP and
IMAP, it can be, for example, an incoming message that came from SMTP or
that was generated locally by some Automated Rule, etc.

So, that connection was directed by the load balancer to a DIFFERENT
server B. What the CommuniGate Pro Dynamic Cluster component does is
detect the conflict and make the server B access that mailbox VIA THE
SERVER A, where the server "B" is treated as just one more client -
like those real POP, IMAP, etc. clients connected to the server A and
accessing that mailbox.
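
The routing rule just described can be modelled in a few lines. This is only a toy model under assumptions: ClusterDirectory and its methods are invented for illustration, and nothing here reflects how the Dynamic Cluster actually detects conflicts or exchanges data between servers.

```python
class ClusterDirectory:
    """Toy registry of which back-end currently has each mailbox open."""

    def __init__(self):
        self._owner = {}     # mailbox -> name of the server holding it open
        self._sessions = {}  # mailbox -> number of open sessions

    def open_session(self, server, mailbox):
        """Return "direct" for the owning server, "via <owner>" for any other."""
        owner = self._owner.setdefault(mailbox, server)
        self._sessions[mailbox] = self._sessions.get(mailbox, 0) + 1
        return "direct" if owner == server else "via " + owner

    def close_session(self, mailbox):
        """Forget the owner once the last session on the mailbox closes."""
        self._sessions[mailbox] -= 1
        if self._sessions[mailbox] == 0:
            del self._sessions[mailbox]
            del self._owner[mailbox]
```

The first server to open account X, mailbox Y becomes its owner; a session landing on server B is answered "via A" until every session on that mailbox has closed, after which any server may open it directly again.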

Advantages of this design are obvious, disadvantages are few:
a) overhead caused by that conflict-catching mechanism
b) overhead caused by indirect access to mailbox if the conflict is detected.

a) - I just ask you to believe me that it's small - we have designed it so
that it can cause additional latency, but not decrease the throughput.
For IMAP and WebMail the overhead is almost zero, for POP it is larger -
since there are many more POP *sessions* going up and down every second
compared to IMAP/Web sessions that stay open for a long time. Since major
sites are still mostly POP, and the overhead there is small - we think that as
IMAP and Web take over from POP the overhead will only decrease.

b1) happens ONLY when the conflict is detected, and on a site with just 10,000
accounts, the probability is already very small (<<1%), and on a site with
1,000,000 accounts - it's VERY small.

b2) b1 is not a point at all :-). Because if you build a REALLY big
CGatePro Cluster, you do use front-ends - for reliability of the site
(they protect backends from all types of attacks), for REAL load
balancing, etc. And if you use frontends, ALL connections are "indirect",
so there is no additional overhead.

CommuniGate Pro 3.3 now has a built-in SNMP agent, and it lets you grab
that statistical data from all servers. Since it is in beta now, no major
Cluster site is using it, but as soon as it is released (hopefully, in
mid-April), we will have much more statistical data from the REAL servers,
not the test clusters in our labs. If you are interested, drop me a note
in May - I hope we'll have more precise data by that time.

CGatePro Cluster ensures that you do not have indirection loops. Imagine
that the server B tries to connect to some mailbox via the server A, but
while it is doing that the server A closes all its sessions to that
mailbox and the server C opens it, so if special measures are not taken,
the server A will have to connect to the server C on the server B's behalf,
etc. - you can even model a situation where you may have loops. CGatePro
Cluster ensures that there are no indirection loops and also takes special
efforts to minimize the probability of the "wrong hit". On one of the
clusters there were just 345 "wrong hits" during 11,000,000 IMAP sessions.

> One lock state indicates that a process has the mailbox selected. Every
> process that has the mailbox selected owns a share lock on this lock
> state. If a process can acquire an exclusive lock on this lock
> state, it can compress out expunged messages during a CHECK or EXPUNGE;
> otherwise, EXPUNGE will just mark deleted messages as invisible and allow
> the other sharing processes to discover that (and percolate the untagged
> EXPUNGE event to the MUA) on their own.

But it will discover that by some [expensive] file-level operation that it
has to do periodically, right?

> The other lock state governs the ability to parse the mailbox (meaning
> discover new messages) with a share lock, or to append to the mailbox with
> an exclusive lock.

BTW, the design we use requires only one mailbox parsing, since the
Mailbox Object stays open as long as at least one client needs it, and
parsing happens only during the mailbox opening process. But, as I said, on
large sites the probability of simultaneous access to mailbox is very
small - so there is no big win here.

> Unlike the first lock state, this is transient; a
> process holds this lock (share or exclusive) as long as is necessary to do
> the task at hand, then it releases it.

What happens if the process fails at that time? Will the lock be released
automatically?


> The normal effect is that any
> number of processes can be parsing the mailbox (this is done at select
> time for the entire mailbox, and for new messages when they arrive), but
> the MDA must be able to shut out all parsing (and other MDAs) when
> delivering mail.

Sure. But we do not need to re-parse the mailbox when a new message
arrives (or added by IMAP APPEND, for example): if the mailbox is in the
"parsed" state, the message is added, and the mailbox "parsed" data is
updated to include the new info (already available at the time of message
arrival).

> "Parse" in this context does not mean RFC822/MIME parsing. It just means
> locating internal headers and acquiring the IMAP "fast" data for all the
> messages reported with an untagged EXISTS.

Sure.


> > We treat
> > them all as read/write, and shared means shared - i.e. all types of
> > mailboxes support all types of shared operations - read, write (flag
> > modification), and expunge.
>
> If I remember correctly, you accomplish this by having a multi-threaded
> server, so you don't have to worry about process/process interaction. You
> also essentially assume that other software isn't going to be operating on
> your files, right?

Yes and no. You can specify that some mailboxes are "EXTERNAL", and thus
exposed to "legacy applications" - delivery agents, "local mailers", etc.
Those mailboxes are treated differently, of course, - but usually they
are used on small sites only (mostly - in Universities and during the
migration process).


> That's certainly the right thing to do if you can make those assumptions.
> Unfortunately, I can not in UW imapd; I *must* have process/process
> interaction and external software that is completely out of my control.

Yes, there are different markets for our products.

If someone needs to continue to use some legacy mailers and delivery
agents that deal directly with mailboxes (and deal differently and not
always correctly) - it is NOT wise to try to install CommuniGate Pro
there, and we always say that those sites should stay with UW imapd, since
UW imapd is designed for that.

If someone needs to build either a very large server, or a completely
"new" server, so there is no need to continue to support legacy delivery
agents and "local mailers" (i.e. all access will be via POP/IMAP/Web),
then that's the CGatePro market. As well as other vendors' market, of
course. But that is the field where we hope to continue to lead the pack
:-)

Mark Crispin

Mar 15, 2000, 3:00:00 AM
to Vladimir A. Butenko
On Wed, 15 Mar 2000, Vladimir A. Butenko wrote:
> > [Discovery of flag changes in shared read/write mbx mailboxes.]

> But it will discover that by some [expensive] file-level operation that it
> has to do periodically, right?

It's the same operation that's done to discover new mail: stat().
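
A minimal sketch of that kind of polling, assuming a change shows up as a different size or modification time; snapshot and changed_since are illustrative names, not c-client functions.

```python
import os


def snapshot(path):
    """Record the cheap metadata that a later poll is compared against."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def changed_since(path, previous):
    """True if the file's size or mtime differs from the earlier snapshot."""
    return snapshot(path) != previous
```

No mailbox data is read at all; only when the poll reports a change does the process go on to look at new messages or flags.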

> > Unlike the first lock state, this is transient; a
> > process holds this lock (share or exclusive) as long as is necessary to do
> > the task at hand, then it releases it.
> What happens if the process fails at that time? Will the lock be released
> automatically?

System call locks are used, so the locks are automatically released in the
event of process failure.

> Sure. But we do not need to re-parse the mailbox when a new message
> arrives (or added by IMAP APPEND, for example): if the mailbox is in the
> "parsed" state, the message is added, and the mailbox "parsed" data is
> updated to include the new info (already available at the time of message
> arrival).

We don't "re-parse" either; once the SELECT is done, the only parsing is
on new messages. The only thing that is ever touched on old messages
after the SELECT-time parse is done are the flags. Flags can be changed,
of course. Also, when some other process changes the flags of a message,
we only know that some message's flags were changed; we don't know which
so we have to do a global flags sweep. Fortunately, that sweep is
relatively fast, and doesn't need to be done often in any case.

> Those mailboxes are treated differently, of course, - but usually they
> are used on small sites only (mostly - in Universities and during the
> migration process).

UW has 80K users, so isn't quite a "small" site. ;-)

> If someone needs to continue to use some legacy mailers and delivery
> agents that deal directly with mailboxes (and deal differently and not
> always correctly) - it is NOT wise to try to install CommuniGate Pro
> there, and we always say that those sites should stay with UW imapd, since
> UW imapd is designed for that.

Exactly. However, given that CommuniGate Pro effectively runs in a
"sealed" server ala Cyrus and Exchange, I'm surprised that you stayed with
the UNIX mbox format instead of something that is much more optimized for
IMAP (like what Cyrus and especially Exchange did).

Vladimir A. Butenko

Mar 16, 2000, 3:00:00 AM
to
In article
<Pine.NXT.4.30.00031...@Tomobiki-Cho.CAC.Washington.EDU>,
Mark Crispin <m...@CAC.Washington.EDU> wrote:

> > > Unlike the first lock state, this is transient; a
> > > process holds this lock (share or exclusive) as long as is necessary to do
> > > the task at hand, then it releases it.
> > What happens if the process fails at that time? Will the lock be released
> > automatically?
>

> System call locks are used, so the locks are automatically released in the
> event of process failure.

Hm. But you've said you use two locks per mailbox. Where mailbox is a
file. So, how do you put two different locks on one file using OS-level
locks?



> > Sure. But we do not need to re-parse the mailbox when a new message
> > arrives (or added by IMAP APPEND, for example): if the mailbox is in the
> > "parsed" state, the message is added, and the mailbox "parsed" data is
> > updated to include the new info (already available at the time of message
> > arrival).
>

> We don't "re-parse" either; once the SELECT is done, the only parsing is
> on new messages. The only thing that is ever touched on old messages
> after the SELECT-time parse is done are the flags. Flags can be changed,
> of course. Also, when some other process changes the flags of a message,
> we only know that some message's flags were changed; we don't know which
> so we have to do a global flags sweep. Fortunately, that sweep is
> relatively fast, and doesn't need to be done often in any case.

OK, so you have either a small additional file with "indices" (that keeps
all message flags) or you keep them in the original mailbox file and then
you have to re-read it.



> > Those mailboxes are treated differently, of course, - but usually they
> > are used on small sites only (mostly - in Universities and during the
> > migration process).
>

> UW has 80K users, so isn't quite a "small" site. ;-)

While I can believe that there is ONE server @ UW that handles all those
users, it's hard to imagine that all of those users have and USE shell
accounts on that very server. For example, at Stanford (which has started
to switch its mail services to CGatePro from Sun's SIMS and other
solutions) an average server handles just 1000-5000 users, though they have
many servers installed. A system with 5000 shell accounts is a large site
from the Unix OS point of view, but a small site in terms of a CGatePro mail
server.


> > If someone needs to continue to use some legacy mailers and delivery
> > agents that deal directly with mailboxes (and deal differently and not
> > always correctly) - it is NOT wise to try to install CommuniGate Pro
> > there, and we always say that those sites should stay with UW imapd, since
> > UW imapd is designed for that.
>

> Exactly. However, given that CommuniGate Pro effectively runs in a
> "sealed" server ala Cyrus and Exchange, I'm surprised that you stayed with
> the UNIX mbox format instead of something that is much more optimized for
> IMAP (like what Cyrus and especially Exchange did).

We do not "stay" :-) CGatePro mailbox management is completely modular.
Hopefully it will be of some interest to this group's readers - here is an
almost complete interface of the abstract Mailbox "object" as it is seen
from the CGatePro kernel (I left the virtual methods only):

class VMailbox : public STObject {
  virtual int getUIDValidity(void) = NIL;
  virtual bool getFirstRecent(VMailboxMessageID& theID,
                              bool resetRecent) = NIL;
  virtual bool parse(void) = NIL;
  virtual mailboxView* createPhysicalView(void) = NIL;
  virtual STErrorCode addMessage(ReadableSource* theSource,
                                 VString theReturnPath,
                                 SBData* additionalHeaders,
                                 messageView* pNewView) = NIL;
  virtual bool getPhysicalMessageView(VMailboxMessageID theID,
                                      messageView* pView) = NIL;
  virtual STFileOffset getPhysicalMessageSize(VMailboxMessageID theID,
                                              bool withCRLF) = NIL;
  virtual STErrorCode readPhysicalMessage(SBMutableData& theBuffer,
                                          VMailboxMessageID theID,
                                          STFileOffset offset,
                                          size_t maxLength) = NIL;
  virtual void physicallyRemoveMarkedMessages(mailboxView* theView) = NIL;
  virtual int storeMessageFlags(VMailboxMessageID theID,
                                messageFlags* updateFlags,
                                flagsOperation operation) = NIL;
};

The synchronization and other boring tasks are performed by the Mailbox Manager
itself, and implementations of particular mailbox formats should not care
about it, they can be written (and they are written) in the same manner they
would be written for a server that handles just one client at a time.

As you can see, to support a new mailbox format one would just write 10
functions/methods.
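
The division of labour described here (format drivers written as if for a single client, with a manager doing the synchronization) can be sketched as follows. This is a much reduced Python analogue with invented names (MailboxFormat, InMemoryFormat, MailboxManager); it is not the CGatePro interface shown above and makes no claim about how the real Mailbox Manager is built.

```python
import threading
from abc import ABC, abstractmethod


class MailboxFormat(ABC):
    """Per-format driver; written as if only one client ever used it."""

    @abstractmethod
    def add_message(self, data: bytes) -> int: ...

    @abstractmethod
    def read_message(self, msg_id: int) -> bytes: ...

    @abstractmethod
    def remove_messages(self, msg_ids) -> None: ...

    @abstractmethod
    def list_ids(self) -> list: ...


class InMemoryFormat(MailboxFormat):
    """Trivial driver used only to exercise the manager."""

    def __init__(self):
        self._messages = {}
        self._next_id = 1

    def add_message(self, data):
        msg_id = self._next_id
        self._next_id += 1
        self._messages[msg_id] = data
        return msg_id

    def read_message(self, msg_id):
        return self._messages[msg_id]

    def remove_messages(self, msg_ids):
        for msg_id in msg_ids:
            self._messages.pop(msg_id, None)

    def list_ids(self):
        return sorted(self._messages)


class MailboxManager:
    """Serializes access so drivers need no locking of their own."""

    def __init__(self, driver: MailboxFormat):
        self._driver = driver
        self._lock = threading.Lock()

    def append(self, data):
        with self._lock:
            return self._driver.add_message(data)

    def fetch(self, msg_id):
        with self._lock:
            return self._driver.read_message(msg_id)

    def expunge(self, msg_ids):
        with self._lock:
            self._driver.remove_messages(msg_ids)

    def list_ids(self):
        with self._lock:
            return self._driver.list_ids()
```

Adding another storage format then means writing one more MailboxFormat subclass; the IMAP, POP and delivery code would keep talking to the manager only.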

We can support, for example, a BSD mailbox with an index, if anyone would
need it. But as you saw in the Log files I posted here a few months ago,
parsing even a multi-MB "BSD" mailbox with 10,000 messages is a matter of
seconds in CGatePro, so why should we implement something (an additional
mailbox format) that no one is likely to use? .mdir is different - and as
we've agreed, there are several situations when it is useful, also there
are plenty of "mdir fans" running around - so we have .mdir support.

A much more interesting thing is "database"-based mailboxes. This was long
on our "To Do" list, but putting all those ORACLE client-side utilities
into CGatePro code seemed to be a VERY bad idea. Fortunately, better,
standardized interfaces to various DB sources are now emerging, and we plan
to support them.

So, one is able to store messages in some SQL database, but still access
them via POP,IMAP, WebMail, and AT THE SAME time - make advanced SQL
operations on those messages. This feature becomes more and more popular
on many corporate sites.

Yiorgos Adamopoulos

Mar 16, 2000, 3:00:00 AM
to
In article <butenko-1603...@stalker.gamma.ru>, Vladimir A. Butenko wrote:
>A much more interesting thing is "database"-based mailboxes. This was long
>on our "To Do" list, but putting all those ORACLE client-side utilities
>into CGatePro code seemed to be a VERY bad idea. Fortunately, better,
>standardized interfaces to various DB sources are now emerging, and we plan
>to support them.
>
>So, one is able to store messages in some SQL database, but still access
>them via POP,IMAP, WebMail, and AT THE SAME time - make advanced SQL
>operations on those messages. This feature becomes more and more popular
>on many corporate sites.

You pretty much describe some of what I am trying to do. Although you can
act upon emails as BLOBs and have a simple design where every user is a
table on the system and every email a record in the table with attributes
serial number, headers and body. If you need subfolders it gets trickier
(but not hard): add OO - but hey, Object Databases are still very slow.
You need to play with the native interface of every database (and version)
to achieve speed. For the large number of transactions that you will need,
ODBC (and JDBC, DBI/DBD) won't be faster than what you have today.
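
The simple design mentioned above (one table per user, one row per message with serial number, headers and body) looks roughly like this with SQLite. The table naming scheme and the helper functions are assumptions made for the example, and SQLite stands in for whatever database would really be used.

```python
import sqlite3


def _table(user):
    # Table names cannot be bound as SQL parameters, so validate instead.
    if not user.isidentifier():
        raise ValueError("unsupported user name: %r" % user)
    return '"mail_%s"' % user


def create_mailbox(conn, user):
    """Create the per-user table: serial number, headers, body."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS %s ("
        "serial INTEGER PRIMARY KEY AUTOINCREMENT, "
        "headers TEXT NOT NULL, "
        "body BLOB NOT NULL)" % _table(user)
    )


def store_message(conn, user, headers, body):
    """Insert one message and return its serial number."""
    cur = conn.execute(
        "INSERT INTO %s (headers, body) VALUES (?, ?)" % _table(user),
        (headers, body),
    )
    return cur.lastrowid


def find_by_header(conn, user, needle):
    """Return serial numbers of messages whose headers contain needle."""
    rows = conn.execute(
        "SELECT serial FROM %s WHERE headers LIKE ? ORDER BY serial" % _table(user),
        ("%" + needle + "%",),
    )
    return [row[0] for row in rows]
```

The body is stored as an opaque BLOB, so any query beyond header matching would need either extra columns filled in at delivery time or the kind of feature extraction discussed next.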

It would be nice to devise a *storage manager* that deals only with email
(and multimedia data - since all those cool .jpg and .avi are sent around).
You could then have filters to do feature extraction on the message and
flag it as needed, so that you could select "all the videos" from the mail
store, etc. A storage manager, not a complete DB with all the bells and
whistles.

--
Yiorgos Adamopoulos -- #include <std/disclaimer.h>
ad...@dblab.ece.ntua.gr -- Knowledge and Data Base Systems Laboratory, NTUA

Mark Crispin

Mar 16, 2000, 3:00:00 AM
to Vladimir A. Butenko
On Thu, 16 Mar 2000, Vladimir A. Butenko wrote:
> > System call locks are used, so the locks are automatically released in the
> > event of process failure.
> Hm. But you've said you use two locks per mailbox. Where mailbox is a
> file. So, how do you put two different locks on one file using OS-level
> locks?

Ha! That's a trade secret! ;-)

I use an auxiliary file whose sole purpose is to hold the second system
call lock. This is different from .lock locking, in which the second file
is the lock.

I wish that UNIX offered thawed vs. frozen opens; and that it offered
multiple named locks on a file. But it doesn't, so one has to do what he
can within UNIX's limitations.
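
A sketch of the auxiliary-file idea, assuming flock() as the system call lock and a made-up ".aux" suffix for the second file (neither is claimed to match what the mbx driver really does):

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def aux_lock(mailbox_path, exclusive):
    """Hold a transient shared or exclusive lock on '<mailbox>.aux'.

    The lock lives in the kernel, attached to the open descriptor. The
    file is only something to hang the lock on, so a leftover file after
    a crash locks nothing, unlike a .lock file whose existence is the lock.
    """
    fd = os.open(mailbox_path + ".aux", os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield fd
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A parser would wrap its work in aux_lock(path, exclusive=False) and a delivery agent in aux_lock(path, exclusive=True), while the lock on the mailbox file itself stays free to represent "selected".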

> OK, so you have either a small additional file with "indices" (that keeps
> all message flags) or you keep them in the original mailbox file and then
> you have to re-read it.

You don't have to re-read the file; just the flags. Random access I/O is
wonderful. A lot of people don't understand it; and NFS does its best to
discourage you from doing read/write random access I/O. But the
functionality is there.
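
To show what "just the flags" can mean in practice, here is a sketch that assumes each message header carries its flags as four hex digits at a file offset remembered from the SELECT-time parse. That layout, FLAG_WIDTH and both function names are hypothetical and are not the real mbx record format.

```python
FLAG_WIDTH = 4  # hypothetical: flags kept as 4 hex digits at a known offset


def sweep_flags(path, flag_offsets):
    """Re-read only the flag fields, seeking straight to each one."""
    flags = []
    with open(path, "rb") as f:
        for offset in flag_offsets:
            f.seek(offset)
            flags.append(int(f.read(FLAG_WIDTH), 16))
    return flags


def store_flags(path, offset, value):
    """Overwrite one flag field in place; the file length never changes."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(b"%04x" % (value & 0xFFFF))
```

A global sweep over N messages then costs N small reads regardless of how large the message bodies are.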

> While I can believe that there is ONE server @ UW that handles all those
> users, it's hard to imagine that all of those users have and USE shell
> accounts on that very server.

We do not have shell accounts on our mail servers, and we have many mail
servers.

> We do not "stay" :-) CGatePro mailbox management is completely modular.

As is c-client. But the point is, if it's a sealed server why bother with
legacy formats?

> As you can see, to support a new mailbox format one would just write 10
> functions/methods.

Do you support IMAP as one of your mailbox formats (e.g. can you proxy)?
If so, I think that you need more than 10 methods. c-client has 33
methods per driver, mostly because of IMAP which uses most of them.
Fortunately, many of these can be null (= use the default) and some others
are boilerplate, so only about a dozen are significant. So we're in the
same ballpark.

The actual c-client API closely matches IMAP (no big surprise there), so
from the application's perspective it looks like all the world is IMAP.

This also means that the c-client based IMAP server is a very simple
program. All it does is parse IMAP commands into c-client API calls.

Vladimir A. Butenko

Mar 17, 2000, 3:00:00 AM
to

> On Thu, 16 Mar 2000, Vladimir A. Butenko wrote:
> > > System call locks are used, so the locks are automatically released in the
> > > event of process failure.
> > Hm. But you've said you use two locks per mailbox. Where mailbox is a
> > file. So, how do you put two different locks on one file using OS-level
> > locks?
>

> Ha! That's a trade secret! ;-)
>
> I use an auxiliary file whose sole purpose is to hold the second system
> call lock. This is different from .lock locking, in which the second file
> is the lock.

So, if the system crashes, you do not have to care about all those .lock
files that, hmm, some large-scale servers use. And that make them "close
down for a clean-up" for many hours....



> > OK, so you have either a small additional file with "indices" (that keeps
> > all message flags) or you keep them in the original mailbox file and then
> > you have to re-read it.
>

> You don't have to re-read the file; just the flags. Random access I/O is
> wonderful. A lot of people don't understand it; and NFS does its best to
> discourage you from doing read/write random access I/O. But the
> functionality is there.

NFS has nothing to do with it. ANY file access is slow, and random is just
a bit slower. When we optimized the flag update algorithm in CGatePro 5
months ago, on heavily loaded sites the disk I/O subsystem load dropped 2-3 fold.
Mostly for POP operations, but it improved IMAP, too.

Speaking about the flags: there is a Q for you. I think you should give us
a definite answer, since you were the person who created the IMAP
specs, so you must know better ;-).

What SHOULD happen if I copy a message from one mailbox to a different
one? Should that message appear as "RECENT" in that mailbox or not? We
have a huge fight among our customers - what is the "right way" to do
this.


> > While I can believe that there is ONE server @ UW that handles all those
> > users, it's hard to imagine that all of those users have and USE shell
> > accounts on that very server.
>

> We do not have shell accounts on our mail servers, and we have many mail
> servers.

If you do not have shell accounts there, then what's the sense of using a
server design that intentionally limits itself to be compatible with tools
used from shell accounts on the same server?


> > We do not "stay" :-) CGatePro mailbox management is completely modular.
>

> As is c-client. But the point is, if it's a sealed server why bother with
> legacy formats?

Because the format itself is not bad enough. We could, of course, change
"From " to something like ^A^A^A^A, to make the mailbox more transparent,
but that's not a big deal. And the idea of one text file used as a mailbox
is not bad at all - if the manager handling that format is designed in an
efficient way.


> > As you can see, to support a new mailbox format one would just write 10
> > functions/methods.
>

> Do you support IMAP as one of your mailbox formats (e.g. can you proxy)?

Yes.

> If so, I think that you need more than 10 methods.

It turned out that we need fewer, and I showed you all of them :-) Do not
forget that those are just the internal objects of the Mailbox Manager,
and the Manager itself has some brains. IMAP, POP, WebMail and Delivery
modules do not talk directly to those objects - they talk to the Manager.
That does a lot (like synching, flag handling optimization, etc.) - but
does not care about the physical implementation of the mail store.

> c-client has 33
> methods per driver, mostly because of IMAP which uses most of them.

That's the point: we have an additional layer that simplifies the things.


> Fortunately, many of these can be null (= use the default) and some others
> are boilerplate, so only about a dozen are significant. So we're in the
> same ballpark.

Then, it's very good.



> The actual c-client API closely matches IMAP (no big surprise there), so
> from the application's perspective it looks like all the world is IMAP.

As you can see, the VMailbox methods I showed are not quite IMAP-like, but
if you look at the mailbox manager methods, they do resemble IMAP to a
certain extent.



> This also means that the c-client based IMAP server is a very simple
> program. All it does is parse IMAP commands into c-client API calls.

Sure. As I said, the IMAP server is a very small portion of CGatePro code
- even with all those ACL, QUOTA, STARTTLS, etc extensions it has to
handle.

Vladimir A. Butenko

Mar 17, 2000, 3:00:00 AM