UoW IMAP, lock files and multiple servers

John S. Humanski

Feb 10, 2000, 3:00:00 AM
I work for a local ISP in Chicago and we are trying to implement IMAP
using UoW IMAP. We have multiple mail servers that NFS mount a disk
array that contains our mail directory structure.

What is the expected behavior if a user accesses their mail from
multiple servers at once? Is there an expected behavior?

Also, regarding the lock files that are created, the naming convention
is .devno.ino where devno is the device number of the file system the
mail folder is on and ino is the inode of the mail file. Since each
server has a different idea about the device number, I changed the
naming convention to .username.ino. (BTW, we have all of the lock files
going to a centralized location so they are visible by all servers.)
The PID of the imapd process is written to these lock files. Is this
information (i.e., the PID) used in any way?

I looked through the UoW documentation but I couldn't find the answers
to my questions. If these questions are answered in the docs or a FAQ
could you please point me in the right direction?

TIA

--
John S. Humanski mailto:jhum...@interaccess.com
Senior Software Engineer Phone: 312-496-4669 Fax: 312-496-4499
InterAccess Co. http://www.interaccess.com
Data CLEC offering DSL & Internet Services


Ed Symanzik

Feb 10, 2000, 3:00:00 AM
"John S. Humanski" wrote:

> What is the expected behavior if a user accesses their mail from
> multiple servers at once? Is there an expected behavior?

Mr. Crispin will insist that the expected behavior is for mailboxes to
be corrupted. We have been using a procmail-style lock for several years
and have not had a problem.

username.inode is a good idea. uid.inode might be better.

John S. Humanski

Feb 10, 2000, 3:00:00 AM
Ed Symanzik wrote:

> "John S. Humanski" wrote:
>
> > What is the expected behavior if a user accesses their mail from
> > multiple servers at once? Is there an expected behavior?
>
> Mr. Crispin will insist that the expected behavior is for mailboxes to
> be corrupted.

I was afraid of that. I hope you're wrong. :-)

> We have been using a procmail style lock for several years
> and have not had a problem.

Can you be more specific about what you mean by "procmail style" locking?

>
> username.inode is a good idea. uid.inode might be better.

Unfortunately, we use a customer management software package and must
perform our authentication against its database. Our customers do not
have uids per se. But thanks for the idea.

Ed Symanzik

Feb 10, 2000, 3:00:00 AM
"John S. Humanski" wrote:

> Can you be more specific about what you mean by "procmail style" locking?

The typical method of creating lockfiles is something like this:

    /* succeeds only if "lockfile" does not already exist */
    open("lockfile", O_WRONLY | O_CREAT | O_EXCL, 0444);

This creates a file but fails if the file exists. It works fine on the
local filesystem, but NFS doesn't support O_EXCL. The trick is to find an
atomic function that will create a file but fail if it already exists.

Procmail does this with links. First, create a file with a unique name.
Then, create a hard link to it with the name of your lock file:

    open("uniquename", O_WRONLY | O_CREAT, 0444);
    link("uniquename", "username.inode");

If the link fails with errno == EEXIST, then someone else has the lock.

John S. Humanski

Feb 10, 2000, 3:00:00 AM
Ed Symanzik wrote:

Thanks for the quick response. This looks very promising. I will look into
it.

Mark Crispin

Feb 10, 2000, 3:00:00 AM
to Ed Symanzik
On Thu, 10 Feb 2000, Ed Symanzik wrote:
> > What is the expected behavior if a user accesses their mail from
> > multiple servers at once? Is there an expected behavior?
> Mr. Crispin will insist that the expected behavior is for mailboxes to
> be corrupted.

That is absolute, unadulterated nonsense. If you are going to answer
technical questions you have a moral obligation to be truthful.

-- Mark --

* RCW 19.190 notice: This email address is located in Washington State. *
* Unsolicited commercial email may be billed $500 per message. *
Science does not emerge from voting, party politics, or public debate.


Mark Crispin

Feb 10, 2000, 3:00:00 AM
to John S. Humanski
On Thu, 10 Feb 2000, John S. Humanski wrote:
> I work for a local ISP in Chicago and we are trying to implement IMAP
> using UoW IMAP. We have multiple mail servers that NFS mount a disk
> array that contains our mail directory structure.

It is generally not recommended to use NFS filesystems with IMAP
servers. Propaganda from the NFS disk array vendors notwithstanding, NFS
is not a proper filesystem and an Ethernet cable is not a disk channel.

NFS does not implement complete UNIX filesystem semantics. What you will
discover is that there is minimal CPU overhead with IMAP and significant
disk overhead; consequently your multiple mail servers will be doing not
much of anything, with the bottleneck being in your Ethernet to the NFS
filesystem.

It's not that it won't work; it's that you'll be wasting your money on
unnecessary resources and not addressing the real issue. Put another way,
you won't notice a significant performance difference between 1 CPU and 10
CPUs in your configuration.

You would get much better performance if each of your CPUs had its own
piece of the overall user space on a local (not NFS) disk, and use a
mechanism to direct incoming IMAP sessions to the proper NFS server. The
idea is to have an I/O bandwidth equal to 10 times the individual CPU
local disk bandwidth.

> What is the expected behavior if a user accesses their mail from
> multiple servers at once? Is there an expected behavior?

The behavior that you will see is that -- because of NFS -- nothing will
prevent more than one IMAP server from opening the same mailbox at a time.
This in turn means that more than one IMAP server will think that it can
rewrite the mailbox.

The reason for this is that the locks which prevent this from happening
don't work over NFS.

Fortunately, as long as you allow file-create (protection 1777) on the
/var/spool/mail directory, all is not lost. There is a second level of
locking, common to all UNIX applications, which will prevent any writes
being done simultaneously. This lock is exclusive, and fortunately will
work over NFS -- AS LONG AS THE /var/spool/mail DIRECTORY is 1777 and not
775!!!

This is what Ed Symanzik refers to as "procmail locks" -- but like most
sorcerer's apprentices he's not too clear on the details. I wish that
people would make an effort to find out the facts rather than posting
untrue and misleading statements.

What will happen is that if one imapd rewrites the mailbox, then any other
imapd that has it open will see that the mailbox has changed and will go
to look at what the change is before doing any rewrite of its own. It
expects that change to be new mail (from sendmail, etc.) and is prepared
to handle that. But, since the change is other than new mail, the other
imapd no longer has a good grasp of what's going on; it will report an
"Unexpected changes to mailbox (try restarting)" error and close the
session *without* writing any changes of its own.

It is important to understand that if the /var/spool/mail directory is
protected 775 (which is the default on many Linux and SVR4 systems), then
imapd can not make this kind of lock and thus the mailbox is vulnerable to
having multiple entities rewrite it simultaneously.
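
As a rough way to tell which case a given server is in, the sketch below
tests whether a directory has exactly mode 1777. The function name
spool_allows_dotlocks is made up for illustration; it is not part of imapd
or c-client, and a real check might also accept other modes that still let
the imapd user create files in the directory.

```c
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of UW imapd).
 * Returns 1 if dir is a directory with mode exactly 1777 (world-writable
 * plus sticky bit), 0 if it is a directory with any other mode, and -1 if
 * it cannot be examined or is not a directory.
 */
int spool_allows_dotlocks(const char *dir)
{
    struct stat sb;

    if (stat(dir, &sb) != 0 || !S_ISDIR(sb.st_mode))
        return -1;

    /* Keep only the permission, set-id, and sticky bits. */
    return (sb.st_mode & 07777) == 01777;
}
```

Calling spool_allows_dotlocks("/var/spool/mail") on each mail server would
show whether the directory is 1777 or something else, such as 775.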

> Also, regarding the lock files that are created, the naming convention
> is .devno.ino where devno is the device number of the file system the
> mail folder is on and ino is the inode of the mail file. Since each
> server has a different idea about the device number, I changed the
> naming convention to .username.ino. (BTW, we have all of the lock files
> going to a centralized location so they are visible by all servers.)
> The PID of the imapd process is written to these lock files. Is this
> information (i.e., the PID) used in any way?

You could have saved yourself the trouble. The lock that you are
describing is the one that does not work over NFS. This is the lock that
prevents multiple imapds from opening the mailbox simultaneously.

In any case, .username.inode is not a good choice. If you know that only
one device will be in use, just .inode is better.

But since it won't work with NFS anyway, it's entirely an academic
question.
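
To make the naming issue concrete, here is a minimal sketch of how a
.devno.ino style name could be built from stat(). The function name, the
lockdir parameter, and the hexadecimal formatting are assumptions for
illustration, not necessarily what imapd does. The device number comes from
st_dev, which is the value that each NFS client may see differently for the
same mailbox file.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: write "<lockdir>/.<devno>.<ino>" for the given
 * mailbox into buf. Returns 0 on success, -1 if the mailbox cannot be
 * examined or buf is too small.
 */
int lock_name_for(const char *mailbox, const char *lockdir,
                  char *buf, size_t len)
{
    struct stat sb;
    int n;

    if (stat(mailbox, &sb) != 0)
        return -1;

    /* st_dev is local to this host; st_ino identifies the file itself. */
    n = snprintf(buf, len, "%s/.%lx.%lx", lockdir,
                 (unsigned long) sb.st_dev, (unsigned long) sb.st_ino);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

Running this for the same mailbox on two servers and comparing the results
is a quick way to see whether the names agree.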

Andrew Lochart

Feb 10, 2000, 3:00:00 AM
"John S. Humanski" wrote:
>
> I work for a local ISP in Chicago and we are trying to implement IMAP
> using UoW IMAP. We have multiple mail servers that NFS mount a disk
> array that contains our mail directory structure.
>
> What is the expected behavior if a user accesses their mail from
> multiple servers at once? Is there an expected behavior?

I'll leave it to others to explain the whys and wherefores of why using
NFS for e-mail is usually a bad idea. Earthlink has even written a white
paper (http://www.earthlink.com/about/papers/mailarch.html) on the
subject and admits that locking is a serious concern in their
architecture. The temptation to use simple NFS appliances like NetApp's
can be great, but resist!

Cheers,
Andrew

--
Andrew Lochart Director, Product Marketing, Message Routers
and...@mirapoint.com http://www.mirapoint.com +1-408-517-1326

Alan J. Flavell

Feb 11, 2000, 3:00:00 AM
On Thu, 10 Feb 2000, Mark Crispin wrote:

> You would get much better performance if each of your CPUs had its own
> piece of the overall user space on a local (not NFS disk), and use a
> mechanism to direct incoming IMAP sessions to the proper NFS server.

(I guess you meant "to the proper IMAP server").

I'm only aware of this as a user, so the details aren't too clear to
me, but CERN has a local DNS entry for every registered mail user, of
which there must be many thousands, of the form
USERNAME.mailbox.cern.ch, which points to one of their imap servers
and can be switched around behind the scenes without users needing to
know. This seems (from a user point of view) to work very
effectively.


Ed Symanzik

Feb 11, 2000, 3:00:00 AM
Mark Crispin wrote:
> Fortunately, as long as you allow file-create (protection 1777) on the
> /var/spool/mail directory, all is not lost. There is a second level of
> locking, common to all UNIX applications, which will prevent any writes
> being done simultaneously. This lock is exclusive, and fortunately will
> work over NFS -- AS LONG AS THE /var/spool/mail DIRECTORY is 1777 and not
> 775!!!

My mistake. NFS v3 added support for exclusive creates. While most Unix
applications do support dotlocks, they will fail in an NFS v2
environment.

>
> This is what Ed Symanzik refers to as "procmail locks" -- but like most
> sorcerer's apprentices he's not too clear on the details. I wish that
> people would make an effort to find out the facts rather than posting
> untrue and misleading statements.

Ditto for condescending wizards.

Vladimir A. Butenko

Feb 11, 2000, 3:00:00 AM
In article <38A376E0...@mirapoint.com>, Andrew Lochart
<and...@mirapoint.com> wrote:

> "John S. Humanski" wrote:
> >
> > I work for a local ISP in Chicago and we are trying to implement IMAP
> > using UoW IMAP. We have multiple mail servers that NFS mount a disk
> > array that contains our mail directory structure.
> >
> > What is the expected behavior if a user accesses their mail from
> > multiple servers at once? Is there an expected behavior?
>
> I'll leave it to others to explain the whys and wherefores of why using
> NFS for e-mail is usually a bad idea. Earthlink has even written a white
> paper (http://www.earthlink.com/about/papers/mailarch.html) on the
> subject and admits that locking is a serious concern in their
> architecture. The temptation to use simple NFS appliances like NetApp's
> can be great, but resist!

I'd like to follow Mark here, and ask you to provide more carefully
composed advice.

a) this has nothing to do with IMAP (which is the protocol), we are
talking about the mail store that has to provide concurrent multi-access
to mailboxes - the same requirement exists for all NATIVE WebMail
solutions (i.e. those that access the mailbox store directly, and not via
IMAP/POP). Even if your clients use POP only, there ARE (very rare)
situations when multi-access to a mailbox is a must.

b) IF and ONLY IF the mail store you use relies on file locking, then you
WILL have problems with NFS-based systems (even with NFS v3).

c) IF your mail store does NOT rely on file locking, then you CAN use NFS
w/o ANY problem, and you can create very powerful clusters.

If you need an example, have a look at, say, mail.charter.net. Try to
connect to its port 110 or 143. You will see (watching prompts) that
different servers answer on that IP address (they use a Layer4 switch). But
you can get access to an account from any server. THERE IS a NetApp box
(or boxes, I do not know the details) behind all that, and they work
perfectly, because NO LOCKING is needed.

I do not know their policy, but I think you can get an account with them.
Get it, connect to it via IMAP, then open a different connection, trying
several times making sure that you have connected to a different server.
Then, when you have 2 mail clients connected to the same account via
different servers, try to use all those IMAP operations - delete messages
from mailboxes, copy, etc. - on both clients and see if you can force any
problem there.

You should consult with the site admins to learn if the .mbox format is
the default format they use. Or you can specify mailbox types explicitly,
by creating mailboxes as:
mailboxname.mbox
mailboxname.mdir
(the suffix will be removed, the mailboxname mailbox will be created, but
it will have the specified format).

And then you can see that if you do not rely on OS/FileSystem locks, you
can safely use both types of mailboxes and access them from different
servers and still keep all things in synch.

Side note: an idea of a Static Cluster was also mentioned here. The Static
Cluster is a set of servers serving the same domain(s), but with each
server having its portion of the domain accounts on its "local" disks.
I.e., unlike in a Dynamic Cluster, each account has its "home" on one of
the servers.

To distribute the load one can use a DNS hack (like one that was mentioned
here), but a more solid and scalable approach is using a Central Directory
that keeps the records for all accounts and specifies the server on which
the account is located.

Even in THIS Static situation, using a File Server for mailbox storage
can be a big plus. The main problem of the Static Cluster is reliability:
if one of the servers goes down, all accounts "hosted" on that server
become unavailable. If the account store is located on an NFS (or any
other) file server, the site admin can simply update the "home" attribute
for all failed accounts in the Central Directory so they will point to one
of the surviving servers, ensure that that server also has access to the
same directory on the NFS box (where all those accounts are located) and
all those accounts are back in business, with downtime in the 1-5 minute
range.

Again, this is all possible if the mail server you use does NOT use
File-Level Locks for synching.
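
The Central Directory idea above can be sketched as a lookup table that maps
each account to its "home" server, plus a routine that re-homes accounts
after a server failure. Everything here (struct account, home_server,
rehome_accounts, and the host names in the usage note) is hypothetical and
only illustrates the bookkeeping; a real site would presumably keep these
records in a directory service rather than in an in-memory array.

```c
#include <stddef.h>
#include <string.h>

/* One Central Directory record: an account and the server hosting it. */
struct account {
    const char *name;
    const char *home;
};

/* Return the home server for an account, or NULL if it is unknown. */
const char *home_server(const struct account *dir, size_t n,
                        const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(dir[i].name, name) == 0)
            return dir[i].home;
    return NULL;
}

/*
 * Point every account hosted on the failed server at a surviving one.
 * Returns the number of records changed. The survivor must be able to
 * reach the same mailbox directory on the file server.
 */
size_t rehome_accounts(struct account *dir, size_t n,
                       const char *failed, const char *survivor)
{
    size_t moved = 0;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(dir[i].home, failed) == 0) {
            dir[i].home = survivor;
            moved++;
        }
    }
    return moved;
}
```

For example, with accounts spread over imap1.example.net and
imap2.example.net, calling rehome_accounts(dir, n, "imap1.example.net",
"imap2.example.net") after imap1 fails moves its accounts to imap2.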

> Cheers,
> Andrew
>
> --
> Andrew Lochart Director, Product Marketing, Message Routers
> and...@mirapoint.com http://www.mirapoint.com +1-408-517-1326

--
Vladimir Butenko
Stalker Software, Inc.

John S. Humanski

Feb 11, 2000, 3:00:00 AM
Mark Crispin wrote:

> On Thu, 10 Feb 2000, John S. Humanski wrote:

[many lines of valuable information snipped]

Thanks Mark. I will spend some time absorbing this info and discussing the
options within my organization.

John S. Humanski

Feb 15, 2000, 3:00:00 AM
Mark Crispin wrote:

> > What is the expected behavior if a user accesses their mail from
> > multiple servers at once? Is there an expected behavior?
>

> The behavior that you will see is that -- because of NFS -- nothing will
> prevent more than one IMAP server from opening the same mailbox at a time.
> This in turn means that more than one IMAP server will think that it can
> rewrite the mailbox.
>
> The reason for this is that the locks which prevent this from happening
> don't work over NFS.
>

> Fortunately, as long as you allow file-create (protection 1777) on the
> /var/spool/mail directory, all is not lost. There is a second level of
> locking, common to all UNIX applications, which will prevent any writes
> being done simultaneously. This lock is exclusive, and fortunately will
> work over NFS -- AS LONG AS THE /var/spool/mail DIRECTORY is 1777 and not
> 775!!!

Mark,

Could you be more specific as to why protection 1777 works for NFS? We have
been in contact with Sun on this and they don't seem to know why but they are
checking on it. (As a matter of fact when we told them that you were the
source of the information they said it must be true! Kudos to you, sir.)

Thanks.

Ali Liptrot

Feb 15, 2000, 3:00:00 AM
In article <38A376E0...@mirapoint.com>, Andrew Lochart
<and...@mirapoint.com> wrote:

> "John S. Humanski" wrote:
> >
> > I work for a local ISP in Chicago and we are trying to implement IMAP
> > using UoW IMAP. We have multiple mail servers that NFS mount a disk
> > array that contains our mail directory structure.
> >

> > What is the expected behavior if a user accesses their mail from
> > multiple servers at once? Is there an expected behavior?
>

> I'll leave it to others to explain the whys and wherefores of why using
> NFS for e-mail is usually a bad idea. Earthlink has even written a white
> paper (http://www.earthlink.com/about/papers/mailarch.html) on the
> subject and admits that locking is a serious concern in their
> architecture. The temptation to use simple NFS appliances like NetApp's
> can be great, but resist!

(Sorry if this is a repeat. We experienced server problems and did not see
the original post listed)

Mark Crispin

Feb 15, 2000, 3:00:00 AM
to John S. Humanski
On Tue, 15 Feb 2000, John S. Humanski wrote:
> Could you be more specific as to why protection 1777 works for NFS? We have
> been in contact with Sun on this and they don't seem to know why but they are
> checking on it. (As a matter of fact when we told them that you were the
> source of the information they said it must be true! Kudos to you, sir.)

It isn't that "protection 1777 works for NFS", but rather "the mechanism
that works for NFS needs 1777 protection."

There's a detailed document called locking.txt in the IMAP toolkit
documentation. Here's a summary:

The c-client library (the heart of imapd, ipop[23]d, Pine, etc.) uses two
forms of locks for the traditional UNIX mailbox format. There are other
locking strategies for other mailbox formats; for the purposes of this
discussion these are disregarded.

One form is private to c-client, and uses files in /tmp. It arbitrates
between multiple agents having the mailbox open read/write and knowing
about messages and where they are. This is an exclusive lock. There is a
mechanism called "kiss of death"; a process which covets the lock but
finds it busy can send a "kiss of death" signal to the process which owns
the lock. If the process owning the lock commits suicide within 10
seconds, the process which covets the lock can then seize it for itself.

This form of locking doesn't work over NFS. The loss of this locking
mechanism means that a process which covets read/write open will just do
so, without regard for any other process already having read/write.
Sooner or later, one process will write a change to the mailbox that the
other did not expect; this will cause the other to report a fatal error
and quit.

The basic impact is that instead of "last in wins, and you get a chance to
save your work before you die", the situation is "whoever writes the file
wins, and the other guys don't get a chance to save their work." It is
less graceful, but not a disaster.

Now, let's turn to what works with NFS.

UNIX mail applications from prehistory have used a lock file, on the same
directory with the mailbox file, consisting of the mailbox file name with
".lock" appended. For example, if mrc's INBOX is /var/spool/mail/mrc,
then the lock file is /var/spool/mail/mrc.lock

This lock is an exclusive lock, and it arbitrates any read() or write() to
the file. It is used to communicate between the MUA (such as a c-client
application) and the MDA (such as /bin/mail called from sendmail). Its
sole purpose is to prevent mail delivery and mail reading/updating from
happening at the same time.

If the file exists, the lock is applied. So you have to be able to create
the file and you have to be able to delete it. This is where directory
protection 1777 comes in.

Note that the MUA can not keep this lock on, since while the lock is
applied new mail can't be delivered. So, the program locks, does its
thing, and then immediately unlocks.

Fortunately, when there is only an MUA and an MDA, there's no problem with
this, because the MDA only appends. So it just has to stop the MUA from
reading (or updating) while it's appending. Similarly, while the MUA is
reading (or updating), it has to stop the MDA from appending.

Because old stuck locks can be left lying around, locks older than 5
minutes are ignored. Similarly, a program that has waited for 5 minutes
for the lock to unlock can also ignore the lock.
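
A minimal sketch of the age test, assuming the lock's age is judged from its
modification time (st_mtime). The function name and that choice of timestamp
are assumptions made for illustration; locking.txt is the place to check
what c-client really does. The second rule (giving up after waiting 5
minutes) is not shown.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

#define LOCK_MAX_AGE_SECONDS (5 * 60)

/*
 * Hypothetical helper. Returns 1 if lockfile exists and is no more than
 * 5 minutes old (so it must be honored), 0 if there is no lock or it is
 * old enough to be ignored, and -1 if it cannot be examined.
 */
int lock_must_be_honored(const char *lockfile, time_t now)
{
    struct stat sb;

    if (stat(lockfile, &sb) != 0)
        return errno == ENOENT ? 0 : -1;

    /* Assumption: the lock's age is measured from its mtime. */
    return (long) (now - sb.st_mtime) <= LOCK_MAX_AGE_SECONDS;
}
```

A caller would pass time(NULL) as now and retry later while the function
returns 1.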

All of this assumes cooperative behavior; UNIX is a "gentlemen's
timesharing system." Fortunately, there are plenty of nastier ways that
one can be obnoxious on UNIX other than playing with lock files, and it is
relatively easy to catch brats who do it. So usually, those with bad
behavior in mind quickly move on from lock files.

Now, what's important is that you are able to create the lock atomically.
That is, you are able to say "create this lock file" and know from the
system call return value whether or not you actually created it.

Unfortunately, under NFS, you can't do it with the standard O_EXCL bit to
the open() system call. OK, NFSv3 claims that you can, but all the world
isn't NFSv3. Also, anyone who actually trusts SUN's claims to have fixed
all these NFS problems is either naive or is smoking some serious loco
weed... ;-)

This has led to claims that you can't do this form of locking over NFS,
and that you must therefore use something like maildir. This is
untrue. It requires some skullduggery, but it is possible.

What you do is create a different file name with a random name that you
are reasonably certain won't be repeated by any other program using that
NFS cluster. Techniques for generating such names are fairly well
understood; this isn't rocket science.

OK, you have the name; so how do you get the lock file? You use hard
links. You make a hard link from your random name to the lock file. Now,
you have to ignore the return from the system call, because NFS being
non-atomic makes both success and failure a lie. What you do instead is
check the link count on your random name. If it is 1, the link failed,
and you didn't get the lock. If it is 2, the link succeeded, and you got
the lock -- and you know that you and only you got it. Either way, you
now delete the random name, and either proceed (because you got the lock)
or retry (because you didn't).
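
The steps above can be sketched in C as follows. This is an illustration
under assumptions, not the c-client code: the function names are invented,
and the unique name is built from the lock name, host name, process ID, and
time, which is only one possible way to get a name that is unlikely to
collide. The return value of link() is deliberately not consulted; only the
link count decides.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Try once to take lockfile using the hard-link technique.
 * Returns 1 if the lock was acquired, 0 if someone else holds it,
 * and -1 if the temporary file could not be created.
 */
int try_hardlink_lock(const char *lockfile)
{
    char host[256] = "localhost";
    char unique[1024];
    struct stat sb;
    int fd, n, rc, got;

    /* Build a name no other process on the cluster should be using. */
    gethostname(host, sizeof host - 1);
    n = snprintf(unique, sizeof unique, "%s.%s.%ld.%ld", lockfile, host,
                 (long) getpid(), (long) time(NULL));
    if (n < 0 || (size_t) n >= sizeof unique)
        return -1;

    fd = open(unique, O_WRONLY | O_CREAT, 0444);
    if (fd < 0)
        return -1;
    close(fd);

    /* Attempt the link, but do not trust what it reports over NFS. */
    rc = link(unique, lockfile);
    (void) rc;

    /* Link count 2 means our link exists, so the lock is ours. */
    got = (stat(unique, &sb) == 0 && sb.st_nlink == 2);

    /* Either way, the unique name is no longer needed. */
    unlink(unique);
    return got;
}

/* Release a lock previously acquired with try_hardlink_lock(). */
void release_hardlink_lock(const char *lockfile)
{
    unlink(lockfile);
}
```

A caller would loop on try_hardlink_lock("/var/spool/mail/mrc.lock") with a
short sleep between attempts, do its work, and then call
release_hardlink_lock() right away so that delivery is not held up.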

Yes, it really does work. It required reading NFS source code to verify
that it does work.

Now, it's claimed that you don't have to go to that trouble on
NFSv3. Bill Clinton didn't inhale, and he didn't have sex with Monica
Lewinsky either. Right.

Andrej Borsenkow

Feb 16, 2000, 3:00:00 AM

"Mark Crispin" <m...@CAC.Washington.EDU> wrote in message
news:Pine.NXT.4.30.00021...@Tomobiki-Cho.CAC.Washington.EDU
...

> On Tue, 15 Feb 2000, John S. Humanski wrote:
> > Could you be more specific as to why protection 1777 works for NFS? We have
> > been in contact with Sun on this and they don't seem to know why but they are
> > checking on it. (As a matter of fact when we told them that you were the
> > source of the information they said it must be true! Kudos to you, sir.)
>
> It isn't that "protection 1777 works for NFS", but rather "the mechanism
> that works for NFS needs 1777 protection."
>

More precisely - the described (description omitted) locking mechanism needs
the ability to create files in the mail spool directory. Traditionally, MDAs
and MUAs run SGID mail, and the mail spool directory belongs to user
root/group mail and has 775 permission, enabling every SGID mail program to
create/delete the lock file.

The UoW imapd, Pine or any utility from imap-utils do not run as SGID
programs; hence the spool directory has to have write permission for all
users. And the sticky bit simply prevents users from deleting files belonging
to others (at least, on more or less modern Unix).

Please, do not start yet another flame war "SGID or not SGID" :-)

/andrej


Ed Symanzik

Feb 16, 2000, 3:00:00 AM
Mark Crispin wrote:

> What you do is create a different file name with a random name that you
> are reasonably certain won't be repeated by any other program using that
> NFS cluster. Techniques for generating such names are fairly well
> understood; this isn't rocket science.
>
> OK, you have the name; so how do you get the lock file? You use hard
> links. You make a hard link from your random name to the lock file. Now,
> you have to ignore the return from the system call, because NFS being
> non-atomic makes both success and failure a lie. What you do instead is
> check the link count on your random name. If it is 1, the link failed,
> and you didn't get the lock. If it is 2, the link succeeded, and you got
> the lock -- and you know that you and only you got it. Either way, you
> now delete the random name, and either proceed (because you got the lock)
> or retry (because you didn't).

This requires four operations:
create(unique); link(unique,lock); stat(unique); unlink(unique);

What do you think of using symlinks?
symlink(unique,lock); readlink(lock);

Here the unique file doesn't actually exist and the readlink is done to
verify success.
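
A rough sketch of this symlink idea, assuming the caller supplies the unique
string. try_symlink_lock is a made-up name. The return value of symlink() is
not trusted; only what readlink() reports decides the outcome. Note that the
resulting lock is a symbolic link, not a regular file.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try once to take lockfile by making it a symlink whose target is the
 * caller's unique string. Returns 1 if the lock now points at unique,
 * 0 otherwise (someone else holds it, or it could not be created).
 */
int try_symlink_lock(const char *lockfile, const char *unique)
{
    char target[1024];
    ssize_t n;
    int rc;

    /* symlink() fails if lockfile already exists; result is checked below. */
    rc = symlink(unique, lockfile);
    (void) rc;

    /* Read the link back to see whose name it carries. */
    n = readlink(lockfile, target, sizeof target - 1);
    if (n < 0)
        return 0;
    target[n] = '\0';

    return strcmp(target, unique) == 0;
}
```

The unique string never has to exist as a file; it only needs to differ
between contenders, for example a host name and process ID joined together.
Releasing the lock is a plain unlink(lockfile).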

Mark Crispin

Feb 16, 2000, 3:00:00 AM
to Ed Symanzik
On Wed, 16 Feb 2000, Ed Symanzik wrote:
> What do you think of using symlinks?
> symlink(unique,lock); readlink(lock);
>
> Here the unique file doesn't actually exist and the readlink is done to
> verify success.

Unfortunately, this is incompatible with other UNIX tools which require
that the lock be a regular file.

John S. Humanski

Feb 17, 2000, 3:00:00 AM
"John S. Humanski" wrote:

> I work for a local ISP in Chicago and we are trying to implement IMAP
> using UoW IMAP. We have multiple mail servers that NFS mount a disk
> array that contains our mail directory structure.

I just want to thank everyone who replied, especially Mark. It was all
extremely helpful and we have learned quite a bit.

Thanks again.
