I was assuming that someone who operates advanced hardware such as
mentioned in the subject has a general understanding of operating
systems, of file systems and of network protocols.
That's not right. I apologize.
Wietse
> The reply has been given: I refer to the text that talks about disk
> update rates and rotational latencies, and the texts on synchronous
> versus asynchronous disk updates.
>
> If those replies do not make sense, then the reader MUST acquire a
> minimal amount of clue first. That I consider a reasonable request.
You are right. It's just the tone of discussions on the list that annoys me.
People often don't know a lot about this stuff, but are trying to learn, and
since you wrote Postfix, people expect that you can help them with learning
how to use it. And it seems that these questions are annoying for the people
with a lot of experience. And maybe this list is the wrong place for
less-experienced users, but where should they go? (Except of course to the
FAQ :)
Sander.
Wietse & Brad & Rafi: sorry for the annoyed reaction, it's not your fault.
Justin Robertson
<zu...@linux.com>
We're all pleading ignorance, Judge :-)
The Linux ext2fs file system invalidates some assumptions that are
true with UNIX file systems. With Linux, directory updates are
asynchronous, so even when a file's content is fsync(2)-ed to disk,
there is no guarantee about the status of the file's directory
entry (*). For example, open()-ing a new file returns before the
file system has been updated. Until very recently, when a file was
renamed, the directory updates could be done such that the old file
name was removed before the new file name was put in place. If the
system crashed in the middle of a rename, it was possible to lose files.
UNIX file systems, esp those based on FFS, take great pains to
ensure that directories are updated in a safe manner. For example,
when a file is renamed, the new name is put in place before the
old name is removed. And open()-ing a new file returns *after* the
directory entry is put in place.
Because of this difference, Postfix on Linux by default enforces
synchronous writes on the queue directory tree, so that Postfix
will not lose mail should the machine crash. This makes Postfix
on Linux slower than Postfix on comparable UNIX systems. That's
too bad, but I simply can't recommend configurations that lose mail
in a system crash.
It is possible that Sendmail uses the default asynchronous ext2fs
directory updates, which trade reliability for speed. If
the machine crashes in the middle of receiving a burst of mail,
then you can expect to find some mail in the lost+found directory.
The measurement with the low-cost IDE disk was flawed. 1024 messages
in 2 seconds means that hardly any data was written to disk. That
figures with Postfix or whatever on an async ext2fs file system.
The measurement gives no useful data about mail system performance.
If it is acceptable to lose mail I can speed up Postfix tremendously.
Wietse
(*) for that, Linus recommends that the application open()s the
directory and fsync()s it. Right. When pigs fly. Fortunately we
still have a choice of operating systems.
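
A minimal sketch of what the footnote describes, written in Python rather
than in Postfix's C, and assuming a POSIX system where a directory can be
opened read-only and handed to fsync(2). The function name write_durably
is hypothetical; this is not Postfix code, only an illustration of the
extra step an application would have to take on ext2fs.

```python
import os
import tempfile


def write_durably(directory, name, data):
    """Write data to directory/name, then flush the file and its directory."""
    path = os.path.join(directory, name)
    # Create the file and push its contents to stable storage.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # Open the containing directory and fsync it too, so that the new
    # directory entry is asked to reach the disk as well (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as spool:
        print(write_durably(spool, "msg", b"Subject: test\n\nhello\n"))
```

Every create, rename and remove would need the same directory fsync,
which is the kind of per-operation extra code the footnote objects to.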
... which is why it confuses me that FFS is so incredibly flaky.
If you interrupt the power to a machine running FFS, quite
frequently you will lose files. Sometimes files which had not
even been accessed for weeks before the power interruption. Most
of our machines, except those that are very new, have gaps in the
/usr/src directory where some crash or other wiped files.
I have even had fsck then go on to destroy the entire disk, when
given an almost-entirely-working one.
If anyone could educate me why our Windows 95 workstations have a much
more reliable filesystem than our (BSD) Unix servers, I'd love to hear
from them.
Cheers
Jon
--
\/ Jon Ribbens / j...@oaktree.co.uk
It could be a problem with the Linux SCSI subsystem. While stable, it is
somewhat evil code as of the 2.2.x kernel revisions. Much work has been
done in the 2.3.x development series to clean up the SCSI subsystem (with
good results). Also, the SCSI HDs likely support TCQ (tagged command
queuing), but this is not enabled by default in the Linux kernel for
compatibility. I suggest he tunes the Linux kernel SCSI AIC7xxx driver
parameters before he does anything else, as the "most compatible" SCSI mode
is slower than the "best speed" mode of IDE (which is also default).
Otherwise, this is not a Postfix issue, nor does it necessitate further
discussion/flaming on this list.
--
Hi! I'm a .signature virus! Copy me into your ~/.signature to help me
spread!
Just for everybody who still disagrees, here's some interesting stuff.
IDE Machine
(sync)
# grep 'hda' /etc/mtab
/dev/hda1 / ext2 rw,sync 0 0
# /usr/bin/time ./smtp-source -l 15024 -m 1000 -f root@localhost -t
root@localhost -d 127.0.0.1:25
0.27user 0.39system 0:50.27elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (137major+18minor)pagefaults 0swaps
SCSI Machine
(async on - making a point)
# grep 'sda' /etc/mtab
/dev/sdb1 / ext2 rw,errors=remount-ro,errors=remount-ro 0 0
# /usr/bin/time ./smtp-source -l 15024 -m1000 -f root@localhost -t
root@localhost -d 127.0.0.1:25
0.66user 1.82system 4:38.42elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (142major+22minor)pagefaults 0swaps
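
The two mtab lines above differ in their mount options, which is what
the "(sync)" and "(async on)" labels refer to. A small sketch of reading
that out of an mtab line, assuming the usual six-field layout (device,
mount point, type, options, dump, pass). The function names are
hypothetical. It only sees mount options; a per-directory "chattr +S"
attribute does not show up in mtab.

```python
def mount_options(mtab_line):
    """Split one mtab line into (device, mount_point, fstype, options)."""
    fields = mtab_line.split()
    device, mount_point, fstype, opts = fields[:4]
    return device, mount_point, fstype, opts.split(",")


def is_sync_mount(mtab_line):
    """True if the file system is mounted with the 'sync' option."""
    # Exact match on the option list, so 'async' does not count as 'sync'.
    return "sync" in mount_options(mtab_line)[3]


ide = "/dev/hda1 / ext2 rw,sync 0 0"
scsi = "/dev/sdb1 / ext2 rw,errors=remount-ro,errors=remount-ro 0 0"
print(is_sync_mount(ide))   # True
print(is_sync_mount(scsi))  # False
```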
I'd be willing to bet that Dylan is right on this one.
Justin Robertson
<zu...@linux.com>
On ext2fs with default settings, the fsstone number of files/second
is largely a matter of CPU speed and of available memory. Hardly
anything is written to disk in those 2 seconds.
> it took 44 seconds to send 1000 15k emails.
Let me demonstrate how these numbers depend on conditions.
With my 450MHZ Redhat 6.1 box with Ultra 2 LVD SCSI (*) and Postfix
19991231-pl04 installed as per Postfix defaults, pumping in 1000
15k emails via SMTP takes 162 seconds elapsed time. When I revert
to the default Linux async directory updates that becomes 93 seconds.
When in addition to that I turn off the default synchronous syslog
writes, the elapsed time goes down to 12 seconds. With async syslog
and with sync directory updates, the elapsed time to pump in the
mail is 85 seconds.
(*) Adaptec 2940U2W, Seagate Barracuda ST39175LW, connected by a
twisted-pair LVD cable.
Summarized in a table, the time to pump in 1000 15kbyte messages
into Postfix on a 450MHZ Redhat 6.1 box with Ultra-2 Wide LVD SCSI:
    directory   syslog   seconds
    ============================
    sync        sync       162
    async       sync        93 (1)
    sync        async       85 (2)
    async       async       12
(1) Indicates how I expect other mailers to run on Linux.
(2) Is how I would run Postfix if I had to use Linux.
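
For comparison with figures quoted as messages per second, the elapsed
times in the table can be converted to rates. This is only arithmetic on
the numbers above (1000 messages per run); the variable and function
names are made up for the example.

```python
MESSAGES = 1000

# (directory updates, syslog writes) -> elapsed seconds, from the table.
results = {
    ("sync", "sync"): 162,
    ("async", "sync"): 93,
    ("sync", "async"): 85,
    ("async", "async"): 12,
}


def rate(seconds, messages=MESSAGES):
    """Messages per second for one run."""
    return messages / seconds


for (directory, syslog), seconds in results.items():
    print(f"{directory:<6} {syslog:<6} {seconds:>4} s {rate(seconds):7.1f} msg/s")
```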
My FreeBSD 2.2.8 Thinkpad 600 does the same test in 69 seconds
elapsed time. I would hope that FreeBSD is faster than that on the
Ultra 2 LVD SCSI disk. However, I'm not going to wipe out Linux
just for this test. I need the machine for running VMware.
Command used for stress test:
/usr/bin/time ./smtp-source -m1000 -s20 -l15360 -c -tnull@localhost localhost
How to toggle sync/async writes on Linux:
    chattr -R +S /var/spool/postfix   turns on safe directory updates (and more)
    chattr -R -S /var/spool/postfix   turns off sync writes (Linux default)
    mail.*  -/var/log/maillog         turns off sync syslog writes
    mail.*  /var/log/maillog          turns on sync syslog writes (Linux default)
By the way, it's interesting to see that out of the box, Linux
handles logging more securely (sync writes) than email (async
directory updates).
Wietse
what exactly is the problem with doing that if:
a) it can be wrapped in "#ifdef linux" so it doesn't affect compilation on
other systems, and
b) doing so will safely speed up postfix on linux?
is it just an aesthetic distaste for doing something "ugly" or is there a
good technical reason for not doing it?
craig
ps: yes, using reiserfs would be better - but it's not a standard part
of the kernel yet and many people won't (or don't know how to) use
experimental patches.
--
craig sanders
Since you mention it: this is a common confusion. The driver that
used to be a big mess and has been revised in 2.3.x kernels is the SCSI
_generic_ driver (/dev/sg*, as opposed to /dev/sd*), used for "SCSI
emulation" with IDE CD writers, DATs, Zip drives, parallel port discs,
and a few other oddball devices, but _never_ with "normal" internal
discs. The driver for internal discs hasn't changed much in 2.3.x,
except for adding support for new controllers.
> Also, the SCSI HDs likely support TCQ (tagged command queuing), but
> this is not enabled by default in the Linux kernel for compatibility.
> I suggest he tunes the Linux kernel SCSI AIC7xxx driver parameters
> before he does anything else, as the "most compatible" SCSI mode is
> slower than the "best speed" mode of IDE (which is also default).
Now, this may be indeed the real crux of the matter. The design of
the PC DMA is simply abysmal; without TCQ the latency of a SCSI disc
on a PC can be as much as 40 times larger than the one you'd normally
get from the same disc on a less brain-dead machine, say a Sparc.
(Typically the throughput would be smaller too --- this time because
of the abysmal PC bus --- but that can't be "fixed" in software.) But
for some reason whoever was supposed to document this for Linux either
didn't understand what this TCQ was about, or was a big fan of black
magic.
> Otherwise, this is not a Postfix issue, nor does it necessitate further
> discussion/flaming on this list.
Regards,
Liviu Daia
--
Dr. Liviu Daia e-mail: Liviu...@imar.ro
Institute of Mathematics web page: http://www.imar.ro/~daia
of the Romanian Academy PGP key: http://www.imar.ro/~daia/daia.asc
So would migrating to a better file system, and that would have
the benefit of avoiding unnecessary code that jumps hoops whenever
Postfix creates, renames or removes a queue file, bounce/defer log
file, mailbox file or maildir file.
> is it just an aesthetic distaste for doing something "ugly" or is there a
> good technical reason for not doing it?
I have something against unnecessary code that can't be tested.
Every line of code is a potential bug. Microsoft has a goal of 4
bugs per 1000 lines; my goal is fewer bugs than that. Code that
isn't written doesn't have bugs. That's how I avoided a lot of bugs
with Postfix.
> ps: yes, using reiserfs would be better - but it's not a standard part
> of the kernel yet and many people won't (or don't know how to) use
> experimental patches.
Reiserfs still needs to prove itself.
Wietse
We don't know for sure what your syncing behavior is, but regardless the
default monolithic sendmail installation does less disk I/O than postfix.
To get sendmail to behave like the default postfix installation you'd set
in sendmail.cf
# default delivery mode
O DeliveryMode=queue
# queue up everything before forking?
O SuperSafe=True
A high-volume production sendmail installation would probably set these
anyway.
I run production mail systems with Linux, but I don't trust ext2fs. I sync
writes and gain the speed back by spending money on hardware RAID with
battery-backed cache. Given limited local resources and expertise (it's
much easier to find Linux admins willing to accept .edu pay scales than
UNIX admins), that tradeoff seems better than supporting another OS. Plus
hardware RAID isn't a bad thing to have. I used to try to run maildir on a
NetApp but had too many problems with Linux NFS.
If your hardware is fixed, I think your best bet would be to invest some
time learning Open/FreeBSD. Or just live with the risk. If your customers
and boss are used to Windows, they'll be forgiving of the occasional mail
lossage in case of catastrophic failure. It's not like it'll happen every
day. It takes real UNIX or mainframe experience to know that such lossage
need not be accepted.
--
Rich Graves <rcgr...@brandeis.edu>
UNet Systems Administrator
Where's the difference? Sendmail tries to deliver by default, which
involves half a dozen files.
In queue-only mode, sendmail manipulates the following files, according
to the strace command:
    create qf<QUEUE ID>
    create xf<QUEUE ID>
    create df<QUEUE ID>
    create tf<QUEUE ID>
    rename tf<QUEUE ID> to qf<QUEUE ID>
    remove xf<QUEUE ID>
That's Sendmail 8.9.3, invoked with -O DeliveryMode=queue.
Postfix is optimized for receiving mail via SMTP:
    create incoming/temporary-name
    rename incoming/temporary-name to incoming/<QUEUE ID>
With local submission, mail is copied by a privileged process in
order to cross the boundary between user-land and Postfix:
    user-land:
        create maildrop/temporary-name
        rename maildrop/temporary-name to maildrop/<QUEUE ID>
    pickup/cleanup daemon:
        open maildrop/<QUEUE ID>
        create incoming/temporary-name
        rename incoming/temporary-name to incoming/<NEW QUEUE ID>
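
The create-then-rename pattern in these listings can be sketched as
follows. This is an illustration in Python, not Postfix's actual code;
the function name enqueue and the queue ID "ABC123" are hypothetical. It
assumes the temporary name lives in the same directory as the final
name, so that rename(2) stays within one file system.

```python
import os
import tempfile


def enqueue(queue_dir, queue_id, data):
    """Store data as queue_dir/queue_id via a temporary name."""
    # Create the file under a temporary name in the same directory.
    fd, tmp_path = tempfile.mkstemp(dir=queue_dir, prefix="tmp.")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # content flushed before the final name appears
        final_path = os.path.join(queue_dir, queue_id)
        # Readers see either no file under the final name or a complete one.
        os.rename(tmp_path, final_path)
    except BaseException:
        # Do not leave a half-written temporary file behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as incoming:
        print(enqueue(incoming, "ABC123", b"Subject: test\n\nhello\n"))
```

Whether the rename itself survives a crash still depends on how the file
system orders its directory updates, which is the point of this thread.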
> To get sendmail to behave like the default postfix installation you'd set
> in sendmail.cf
>
> # default delivery mode
> O DeliveryMode=queue
> # queue up everything before forking?
> O SuperSafe=True
>
> A high-volume production sendmail installation would probably set these
> anyway.
Wietse
Hi Justin, did you make sure that your LVD HD is actually running at 80MB/s
and not at 10MB/s? I've seen this happen quite often. Sometimes you need to
fiddle with the scsi bios.
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 15.
Cor
How do my Ultra 2 Wide LVD SCSI measurements compare with yours?
It took me some time to line up the data, and it would be a shame
if you didn't notice them.
Wietse
Justin Robertson:
>
> Yeah it's good. Thanks for the concern though. :)
>
>
>
>
> Justin Robertson
> <zu...@linux.com>
At the risk of inflaming the situation, the moment you put an IDE disk in
with write-back caching, all this goes out the window. I suspect this is
what is happening here, i.e., the IDE disk is caching the writes and telling
the OS that they have been written when in fact they have not. In this scenario
the IDE disk is always doing async writes internally, regardless of whether
or not chattr(), fsync() etc are used. With SCSI disks, the OS has much
better control over this situation, especially with ordered tag queueing
etc.
This is probably why the IDE box is winning, the disk is probably cheating
and defeating the "don't lose mail" safety checks, while the SCSI box is
doing the right thing.
Cheers,
-Peter
Sure you can. When the power is finally cut, the system has already
synced and shutdown cleanly because the UPS has alerted it beforehand.
Or don't BSD people believe in UPS's because Linux people use them?
(asynchronous writes are bad because we don't do it, POSIX conformity is
bad because we don't do it, our TCP/IP implementation is the standard
even though it conflicts with RFCs because we invented sockets, ...)
You can be defeated by hardware caching anyway (IDE drives come standard
with 2Mb of cache these days, decent SCSI controllers have more) so the
power-failure argument is flawed from that perspective too.
People subscribe here to talk about a rather nice cross-platform MTA
called postfix, not to be fed anti-Linux crap from BSD bigots. Try to
appreciate anything that can be considered Unix-like, you'll miss them
when middle-managers turn everything into Windows NT and wonder why
armageddon arrived early.
Cheers,
--
Matt "...the only place for 63,000 bugs is a rain forest"
> Sure you can. When the power is finally cut, the system has already
> synced and shutdown cleanly because the UPS has alerted it beforehand.
>
It's not an operating system's job to rely on the existence of certain
hardware. It's also not exactly great practice to write software with
certain assumptions that a given piece of hardware--especially something
like a UPS--is in use.
When you stop assuming the lowest common denominator, people will get
screwed.
Now, I know that because of the fact that I just subscribed, I have no
place to ask this and I'm contradicting myself by replying, but what
the hell does this have to do with Postfix? Yeah, the original thread was
asking about Postfix performance on a user's machine, but I fail to see
where "BSD bigotry" and using a UPS fits into that thread ... I
honestly don't really care, I'm just curious.
Hi, I'm Scott.
> Or don't BSD people believe in UPS's because Linux people use them?
Oh, we use UPSes. Heck, we have entire computer rooms that are
on UPS. We just don't depend solely on them, because we know that
UPSes sometimes fail, too (maybe the battery goes bad, maybe there's
something wrong with the cable from the computer to the UPS, who
knows?).
At each and every step in the process, we set our machines up in
the fastest but most robust manner we can, so that even if all the
components before it fail, we still have the highest possible chance
of recovery without loss.
> (asynchronous writes are bad because we don't do it,
No, pure asynchronous writes are bad and *THEREFORE* we don't do
them *BY DEFAULT*.
> People subscribe here to talk about a rather nice cross-platform MTA
> called postfix, not to be fed anti-Linux crap from BSD bigots.
How about anti-BSD (or anti-anyone else) crap from Linux bigots?
We know, FOR A FACT, that pure asynchronous writes are unsafe.
We still allow you to shoot yourself in the foot if you really want
to, but we don't turn them on by default, and we put in all sorts of
warnings to discourage people from doing them.
We also know that using softupdates *is* safe, and we know from
empirical testing that it gets you the same performance as (or more
than) pure asynchronous writes get you, so why would anyone want to
run with asynchronous writes? Maybe because they don't have
something that is both fast *and* safe, such as softupdates? Perhaps
this is a little bit of sour grapes?
When there are valid criticisms to be made, it is not bigotry to
make them. Linus himself has recognized most of the shortcomings in
Linux that you're ever likely to see mentioned here, and it is my
understanding that they are all supposed to be fixed someday.
But why wait to get them fixed when you can run an OS that is
fully compatible with Linux (so all your binaries should just plain
run), and will actually run many of them faster than Linux (because
of the improved memory management), and has the /usr/ports subsystem
with over 3200 ports defined that require nothing more than "cd
/usr/ports/net/spegla; make install", and all the rest of the work is
done for you?
I'll be the first to concede that FreeBSD is not for everyone.
Right now, our threading still needs work: although there is a
Linux-compatible threading mode and there are POSIX pthreads, they're
still not as good as they should be. I'm sure there are other areas
in which FreeBSD is not yet as good as it should be. However, on the
whole, I believe that it is the best freely available Unix or
Unix-like OS for x86.
--
These are my opinions and should not be taken as official Skynet policy
=========================================================================
Brad Knowles, <b...@skynet.be> Sys. Arch., Mail/News/FTP/Proxy Admin
Note: No Microsoft programs were used in the creation or distribution of
this message. If you are using a Microsoft program to view this message,
be forewarned that I am not responsible for any harm you may encounter as
a result.
See <http://i-want-a-website.com/about-microsoft/twelve-step.html> for
details.
This is the wrong list for advocacy. Take it elsewhere.
-Dan
Difficult to say. You are not giving much information about the disk
subsystem, so this is pure speculation: The IDE disk has write-back cache
enabled while the SCSI disk has it disabled.
Also, by now you've probably found out (the hard way) that mentioning
Linux on this mailing list can completely sidestep a thread. Try to avoid
such politically incorrect "swear words".
I'm disappointed about how few people actually read your message before
replying to it.
Regards,
/ŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻTŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻŻ\
| Rask Ingemann Lambertsen | E-mail: mailto:ra...@kampsax.dtu.dk |
| Please do NOT Cc: to me or the | WWW: http://www.gbar.dtu.dk/~c948374/ |
| mailing list. I am on the list.| "ThrustMe" on XPilot, ARCnet and IRC |
| You have not converted a man because you have silenced him. |
Lighten up, please. There's more in the world than Linux and BSD.
My personal experience is based on a dozen OSes. On this same list
I have complained many times about brain damaged Solaris, which is
what I use a lot. Writing high-performance software means that
one will inevitably be confronted by the quirks of the underlying
OS. A taboo on such matters just does not make sense to me.
Wietse
> Lighten up, please. There's more in the world than Linux and BSD.
Like HP-UX. And no need for bashing there, I already do that myself all the
time :)
--
Ralf Hildebrandt <R.Hild...@tu-bs.de> www.stahl.bau.tu-bs.de/~hildeb
Okay, so I have this coworker who believes that NT is God's Gift to Sysadmins.
There are lots of weird gods around, aren't they?
Yeah, he means Cthulhu. That's the kind of OS he/she/it'd give as a gift.
>
> I fail to see what that has to do with it...
> The entire point I'm trying to make here is that faster hardware is
> running slower, and I'd like to find out why. Now it's not a case of me
> having b0rked hardware, because I've had other people experience the same
> 'issue' on SCSI drives vs IDE drives. What specifically is it that is
> handled so differently between the two that would slow down mail transfer
> so intensely? The reason I say it's SCSI is that SCSI is the only common
> factor between some of the boxes this has been tried on and slow, and all
> the IDE boxes, even when I place an IDE drive into the machine in question,
> seem to yield better performance...
Check to see if sync updates for that directory are on (if on the
old machine they were not...).
Also consider file system size: running the same (default) block and
inode size on, say, a 2GB and a 16GB partition is not the best idea.
---
As folks might have suspected, not much survives except roaches,
and they don't carry large enough packets fast enough...
--About the Internet and nuclear war.
what linux bigotry?
the message you are referring to as "linux bigotry" was a response to
your BSD bigotry, basically telling you that it wasn't appropriate for
this list.
you know a lot about mail systems, and your contributions on this list
are generally worthwhile and interesting to read - but i must admit i
have to tune out when you start on an anti-linux rant. your opinions
are certainly valid but they are not the whole story, they are just one
perspective. other people, also with a lot of unix experience, come to
different conclusions. some prefer linux, some prefer one of the BSDs.
it really doesn't matter.
i've used both linux and freebsd (and several commercial unices too) for
the last 8 or 9 years or so. i haven't yet seen anything good enough in
the BSD kernels or bad enough in the linux kernels to offset the fact
that BSD userland is abysmal compared to Debian GNU/Linux - /usr/ports
is cool, but still only a pale imitation of what a decent package
manager can do...it's fine if you only have to administer one or two
machines, but a real PITA if you have to look after dozens.
the kernels are different, with different strengths and weaknesses. both
are more than good enough. so, for me, the deciding factors are a) speed
of development (linux wins here) and b) userland (debian wins here,
regardless of what kernel it's running on - HURD and Linux so far, with
a freebsd port in progress).
YMMV - your selection criteria may be different.
if i were a kernel hacker then i might care more about the differences
between the kernels...but i'm not, i'm a systems administrator so i pick
the best overall system. i don't base my decision on just one small but
important part of the system. if and when debian on freebsd gets to a
usable state then i'll trial that and maybe use it regularly. until then
i'll use whatever kernel debian runs best on...which happens to be linux
at the moment.
the point of this message is not to continue a BSD vs Linux (vs
whatever) flamewar. it is to point out that it basically comes down to
personal preference...both systems have advantages and disadvantages.
take your pick and leave others to make their own choices based on their
own needs and experiences.
BTW, if async writes are so bad then why have they failed to cause me even
a single problem in the 5+ years i've been running linux systems? i've
built literally hundreds of linux boxes in that time, many of them under
quite heavy disk IO load (e.g. squid proxy servers, large mail servers,
postgres database servers, news servers, web servers, and more)...and
very few of them with a UPS.
if async writes were such a big problem then you'd expect to see at
least a dozen or more serious filesystem failures in that time across
that many machines.
craig
--
craig sanders
Can we stop this, please. It seems I can't criticize an OS without
people falling over each other.
Remember, all software sucks. Some sucks more, and some sucks less.
But it sucks regardless. If I want to see something elegant I go
look for a piece of art.
Wietse
Can we stop whinging about how OS x y or z sucks, and instead get back to
whinging about how postfix sucks? :>
-Dan
<SMILEY>
No - let's complain about how 2 year old *alpha* postfix is stable so
that your boss wants to know why you want to upgrade it to a current
version (with regexp, better bounce handling, etc...)
Contrast this with most commercial software that needs to be upgraded at
least 2-3 times a year - that gives you an excuse to upgrade to a current
feature set & needs twice as much CPU & RAM - so you get to upgrade your
computer every year or two as well ...
</SMILEY>
>
> -Dan
>
Rafi
> BTW, if async writes are so bad then why have they failed to cause me even
> a single problem in the 5+ years i've been running linux systems?
Let me just summarize the known facts.
Async writes are dangerous, regardless of what OS you use. If
you use async writes, you take a significantly increased risk that
you will lose files, or even entire filesystems. As an
administrator, you have the option of explicitly choosing to use
async writes or not, but you should fully understand the dangers
before you do so, and be willing to take the risks.
The one thing that Linux does wrong here is by default taking
that decision out of the hands of the person setting up the machine
-- most people who set these things up won't know the risks, and
probably would not choose to take those risks if they knew what they
were.
BSD (and other OSes) force you to explicitly take an extra step
in order to be able to shoot yourself in the foot in this manner,
while Linux gives you a loaded gun without first checking to see if
you know what you're doing.
At Wietse's request, I will not be continuing to post on this thread.