I admit it. I'm panicking.
I took a deep breath, looked through the FAQ, searched a few websites and I'm
still stuck. Here's the situation:
Yesterday /var/log/mail started filling with a repetitive series of postdrop
warnings:
# excerpt from /var/log/mail (note repetition after 14 messages)
Oct 15 02:48:03 bastion postfix/postdrop[3665]: warning: mail_queue_enter:
create file maildrop/74326.3665: No space left on device
Oct 15 02:48:03 bastion postfix/postdrop[3627]: warning: mail_queue_enter:
create file maildrop/212653.3627: No space left on device
Oct 15 02:48:04 bastion postfix/postdrop[3592]: warning: mail_queue_enter:
create file maildrop/58021.3592: No space left on device
Oct 15 02:48:04 bastion postfix/postdrop[3540]: warning: mail_queue_enter:
create file maildrop/38075.3540: No space left on device
Oct 15 02:48:05 bastion postfix/postdrop[3501]: warning: mail_queue_enter:
create file maildrop/145119.3501: No space left on device
Oct 15 02:48:05 bastion postfix/postdrop[3470]: warning: mail_queue_enter:
create file maildrop/28170.3470: No space left on device
Oct 15 02:48:06 bastion postfix/postdrop[3418]: warning: mail_queue_enter:
create file maildrop/21640.3418: No space left on device
Oct 15 02:48:06 bastion postfix/postdrop[3378]: warning: mail_queue_enter:
create file maildrop/118365.3378: No space left on device
Oct 15 02:48:07 bastion postfix/postdrop[3345]: warning: mail_queue_enter:
create file maildrop/38291.3345: No space left on device
Oct 15 02:48:07 bastion postfix/postdrop[3296]: warning: mail_queue_enter:
create file maildrop/6541.3296: No space left on device
Oct 15 02:48:08 bastion postfix/postdrop[3260]: warning: mail_queue_enter:
create file maildrop/222109.3260: No space left on device
Oct 15 02:48:11 bastion postfix/postdrop[3786]: warning: mail_queue_enter:
create file maildrop/80247.3786: No space left on device
Oct 15 02:48:12 bastion postfix/postdrop[3748]: warning: mail_queue_enter:
create file maildrop/188474.3748: No space left on device
Oct 15 02:48:12 bastion postfix/postdrop[3717]: warning: mail_queue_enter:
create file maildrop/69373.3717: No space left on device
Oct 15 02:48:13 bastion postfix/postdrop[3665]: warning: mail_queue_enter:
create file maildrop/74326.3665: No space left on device
Oct 15 02:48:13 bastion postfix/postdrop[3627]: warning: mail_queue_enter:
create file maildrop/212653.3627: No space left on device
Apparently, I have plenty of disk and inodes (see mailing list archives.) I
have plenty of swap. I rebooted and the warnings continue.
# df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda3 4294967295 0 4294967295 0% /
/dev/sda1 10040 40 10000 1% /boot
shmfs 4294967295 1 4294967294 1% /dev/shm
# df -k
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/sda3 8650728 4170112 4480616 49% /
/dev/sda1 38859 3733 33120 11% /boot
shmfs 975488 0 975488 0% /dev/shm
# excerpted from top
3:09am up 1:28, 3 users, load average: 0.09, 0.04, 0.08
104 processes: 102 sleeping, 1 running, 1 zombie, 0 stopped
CPU0 states: 0.0% user, 2.0% system, 0.0% nice, 97.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
Mem: 1028028K av, 931472K used, 96556K free, 0K shrd, 21472K buff
Swap: 265064K av, 0K used, 265064K free 725576K cached
The repetitive cycle of events could indicate a group of objects stuck in a
queue continuing to error out; honestly, I don't know enough about the
underlying architecture to know what to look for in that respect. And
realistically, I'm guessing based on the symptoms. The warning messages are
consistent and repetitive.
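One way to test the "same objects stuck in a loop" guess is to count how often each PID and queue file pair shows up in the log. The following is only a minimal Python sketch: the function name and the regular expression are illustrative assumptions modeled on the excerpt above, not anything shipped with Postfix.

```python
import re
from collections import Counter

# Pattern modeled on the log excerpt; \s+ tolerates the line wrap that the
# mail client inserted between "mail_queue_enter:" and "create file".
WARNING = re.compile(
    r"postfix/postdrop\[(\d+)\]: warning: mail_queue_enter:\s+"
    r"create file (\S+): No space left on device"
)

def count_retries(log_text):
    """Return a Counter mapping (pid, queue_file) to the number of warnings."""
    return Counter(WARNING.findall(log_text))

# Three entries copied from the excerpt; PID 3665 appears twice.
sample = """\
Oct 15 02:48:03 bastion postfix/postdrop[3665]: warning: mail_queue_enter:
create file maildrop/74326.3665: No space left on device
Oct 15 02:48:03 bastion postfix/postdrop[3627]: warning: mail_queue_enter:
create file maildrop/212653.3627: No space left on device
Oct 15 02:48:13 bastion postfix/postdrop[3665]: warning: mail_queue_enter:
create file maildrop/74326.3665: No space left on device
"""

for (pid, queue_file), hits in sorted(count_retries(sample).items()):
    print(pid, queue_file, hits)
```

In the excerpt, PID 3665 reports the same file at 02:48:03 and again at 02:48:13. That would be consistent with a fixed set of long-lived processes retrying roughly every ten seconds, rather than new submissions arriving.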
This system has been running fine for months; there have been no changes to
main.cf since May (system configuration info attached below - including
'postconf -n' output.) I've been making changes to .procmailrc to deliver to
mh mailboxes because I've been seeing too much mbox corruption under KMail to
trust it (or the mbox format) anymore; also I'm trying to integrate
bogofilter into my spam defenses, alongside SpamAssassin, Razor, and DCC.
I've made minor DNS changes but nothing involving MX records; minor
adjustments to PTR and A records. I mention all this in case it jogs
someone's memory. I'm fairly certain I broke something; it's just not clear
what since everything seems to be working besides the flood of (spurious?)
warnings from postdrop.
I didn't notice the problem right off since mail kept getting delivered; how
reliably, I'm not sure. My ISP has been breaking their mail and POP servers,
KMail's been breaking my inbox, and I've been breaking procmail so I'm having
a difficult time separating the effects from all the potential causes.
Suffice it to say, I'm not obviously out of space, I've restarted everything,
and I'm at my wit's end. I can't get a PID to run lsof against and it's far
too late at night for me to be playing with strace. Postfix has been quiet
and stable for months; I have no idea why it has caught fire now or how to
put it out.
I know; helluva first post. :/
Thanks much for your time and attention,
--
Bob Apthorpe
# cat /etc/issue
Welcome to SuSE Linux 7.2 (i386) - Kernel \r (\l).
# uname -a
Linux soyokaze 2.4.4-64GB-SMP #1 SMP Fri May 18 14:54:08 GMT 2001 i686 unknown
# rpm -qa | egrep postfix
postfix-20010228pl03-9
# output from postconf -n
access_map_reject_code = 550
alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
allow_mail_to_commands = alias,forward
allow_mail_to_files = alias,forward
allow_untrusted_routing = no
bounce_size_limit = 50000
canonical_maps = hash:/etc/postfix/canonical
command_directory = /usr/sbin
daemon_directory = /usr/lib/postfix
debug_peer_level = 2
default_destination_concurrency_limit = 10
default_privs = nobody
default_transport = smtp
forward_path = $home/.forward$recipient_delimiter$extension,$home/.forward
invalid_hostname_reject_code = 501
local_destination_concurrency_limit = 2
local_recipient_maps = $relocated_maps $alias_maps unix:passwd.byname
mail_name = Postfix
mail_owner = postfix
mail_spool_directory = /var/mail
mailbox_command = /usr/bin/procmail -a "$EXTENSION"
maps_rbl_domains = relays.ordb.org, relays.osirusoft.com
maps_rbl_reject_code = 550
masquerade_domains = $mydomain
masquerade_exceptions =
message_size_limit = 20971520
mydestination = $myhostname, localhost.$mydomain, localhost, $mydomain
mydomain = cynistar.net
myhostname = soyokaze.cynistar.net
mynetworks = 206.225.63.56/29, 127.0.0.0/8
mynetworks_style = subnet
program_directory = /usr/lib/postfix
queue_directory = /var/spool/postfix
recipient_delimiter = +
reject_code = 550
relay_domains = $mydestination
relay_domains_reject_code = 550
relocated_maps = hash:/etc/postfix/relocated
smtpd_banner = $myhostname ESMTP $mail_name
smtpd_client_restrictions = permit_mynetworks, reject_maps_rbl
smtpd_error_sleep_time = 5
smtpd_etrn_restrictions =
smtpd_hard_error_limit = 25
smtpd_helo_required = yes
smtpd_helo_restrictions = permit_mynetworks, reject_invalid_hostname
smtpd_recipient_limit = 50
smtpd_recipient_restrictions = permit_mynetworks, permit_mx_backup,
reject_maps_rbl, reject_unauth_pipelining, check_relay_domains, reject
smtpd_sender_restrictions = permit_mynetworks, reject_maps_rbl,
reject_unknown_sender_domain, permit
smtpd_soft_error_limit = 10
smtpd_timeout = 300s
strict_rfc821_envelopes = yes
transport_maps = hash:/etc/postfix/transport
unknown_address_reject_code = 450
unknown_client_reject_code = 450
unknown_hostname_reject_code = 450
virtual_maps = hash:/etc/postfix/virtual
postdrop is called from the "sendmail" command to enter new mail into
the maildrop directory.
If I understand the structure correctly, postdrop is invoked by sendmail
and is not under the control of the "master" process.
Its activity is therefore unrelated to mail arriving via SMTP.
To track down your problem, you need to find out which processes
are trying to send email by calling the "sendmail" command.
(You mention procmail later on.)
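As a rough illustration of how the caller of "sendmail" might be identified, the sketch below walks from a given PID up through its parents using /proc/<pid>/status. It assumes a Linux /proc filesystem; the helper names are made up for this example, and the PID mentioned in the comment is simply one taken from the log excerpt.

```python
import os

def read_status(pid, proc_root="/proc"):
    """Parse /proc/<pid>/status into a dict of field name to value."""
    fields = {}
    with open(os.path.join(proc_root, str(pid), "status")) as handle:
        for line in handle:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

def ancestry(pid, proc_root="/proc"):
    """Return [(pid, name), ...] from the given process up toward init."""
    chain = []
    while pid > 0:
        try:
            status = read_status(pid, proc_root)
        except OSError:
            break  # process has exited, or /proc is not available
        chain.append((pid, status.get("Name", "?")))
        pid = int(status.get("PPid", "0"))
    return chain

# Demonstration on this script's own process; for the problem at hand one
# would pass a postdrop PID from the log instead, e.g. ancestry(3665).
for pid, name in ancestry(os.getpid()):
    print(pid, name)
```

This only works while the postdrop process is still alive, so it has to be run while the warnings for that PID are still being logged.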
> Apparently, I have plenty of disk and inodes (see mailing list archives.) I
> have plenty of swap. I rebooted and the warnings continue.
>
> # df -i
> Filesystem Inodes IUsed IFree IUse% Mounted on
> /dev/sda3 4294967295 0 4294967295 0% /
> /dev/sda1 10040 40 10000 1% /boot
> shmfs 4294967295 1 4294967294 1% /dev/shm
>
> # df -k
> Filesystem 1k-blocks Used Available Use% Mounted on
> /dev/sda3 8650728 4170112 4480616 49% /
> /dev/sda1 38859 3733 33120 11% /boot
> shmfs 975488 0 975488 0% /dev/shm
On your /dev/sda3 partition, 4GB is in use but no inodes are reported as used.
What kind of filesystem are you using? Let me guess: we are not talking
about ext2, are we? The error message "No space left on device" is simply
the standard message text for the ENOSPC error code, which is generated by
the filesystem driver.
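To illustrate the point about ENOSPC: the message text comes from the C library's table of errno strings, not from Postfix itself. A minimal Python sketch follows; the helper name is an assumption made for this example.

```python
import errno
import os

def is_out_of_space(exc):
    """True if an OSError carries the ENOSPC error code."""
    return exc.errno == errno.ENOSPC

# Print the numeric code, its symbolic name, and the C library's message text.
print(errno.ENOSPC, errno.errorcode[errno.ENOSPC], os.strerror(errno.ENOSPC))

# Any failing system call that sets errno to ENOSPC yields the same text,
# whatever the underlying reason inside the filesystem driver was.
simulated = OSError(errno.ENOSPC, os.strerror(errno.ENOSPC), "maildrop/74326.3665")
print(is_out_of_space(simulated), simulated)
```

The application only ever sees the error code, so the log line by itself cannot say why the driver refused the request.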
Best regards,
Lutz
--
Lutz Jaenicke Lutz.J...@aet.TU-Cottbus.DE
http://www.aet.TU-Cottbus.DE/personen/jaenicke/
BTU Cottbus, Allgemeine Elektrotechnik
Universitaetsplatz 3-4, D-03044 Cottbus
On Tuesday 15 October 2002 03:50, you wrote:
> On Tue, Oct 15, 2002 at 03:34:27AM -0500, Bob Apthorpe wrote:
> > Yesterday /var/log/mail started filling with a repetitive series of
> > postdrop warnings:
> >
> > # excerpt from /var/log/mail (note repetition after 14 messages)
> > Oct 15 02:48:03 bastion postfix/postdrop[3665]: warning:
> > mail_queue_enter: create file maildrop/74326.3665: No space left on
> > device
[snip]
> > Oct 15 02:48:13 bastion postfix/postdrop[3627]: warning:
> > mail_queue_enter: create file maildrop/212653.3627: No space left on
> > device
>
> postdrop is called from the "sendmail" command to enter new mail into
> the maildrop directory.
> If I understood the structure correctly, postdrop is called from sendmail
> and is not under the control of the "master" process.
> Its actions therefore are not related to any action related with incoming
> mail via SMTP.
Ok; as you say below that would probably rule out procmail.
> In order to track down your problem you need to find out which processes
> are trying to deliver email by calling the "sendmail" command.
> (You are writing of procmail later on.)
It may be mailman, though mailman has been fairly stable and untouched for
months as well. I've cut back the frequency of qrunner; we'll see if that
changes the frequency of the warnings (apparently not.) 'lsof | egrep mail'
doesn't give me anything useful, but that's expected if the process causing
the trouble isn't persistent (e.g. invoked via crontab.) Realistically, it
could be anything given how much code is hardwired to invoke sendmail.
> > Apparently, I have plenty of disk and inodes (see mailing list archives.)
> > I have plenty of swap. I rebooted and the warnings continue.
> >
> > # df -i
> > Filesystem Inodes IUsed IFree IUse% Mounted on
> > /dev/sda3 4294967295 0 4294967295 0% /
> > /dev/sda1 10040 40 10000 1% /boot
> > shmfs 4294967295 1 4294967294 1% /dev/shm
> >
> > # df -k
> > Filesystem 1k-blocks Used Available Use% Mounted on
> > /dev/sda3 8650728 4170112 4480616 49% /
> > /dev/sda1 38859 3733 33120 11% /boot
> > shmfs 975488 0 975488 0% /dev/shm
>
> On your /dev/sda3 partition, you have 4GB used up but no inodes used.
> What kind of filesystem are you using? Let me guess: we are not talking
> about ext2, are we?
Sorry about that; the filesystem is ReiserFS. That clarifies the mysterious
inode stats somewhat.
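For reference, the figures that df prints can be read directly with statvfs(). The sketch below is a minimal example with a made-up helper name. The df -i output above suggests that ReiserFS reports placeholder inode counts, so on such a filesystem the inode fields may say little.

```python
import os

def usage(path):
    """Return block and inode figures for the filesystem holding path."""
    st = os.statvfs(path)
    return {
        # f_frsize is the unit in which f_blocks and f_bavail are counted
        "kib_total": st.f_blocks * st.f_frsize // 1024,
        "kib_avail": st.f_bavail * st.f_frsize // 1024,
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
    }

for key, value in usage("/").items():
    print(key, value)
```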
> The error message "no space left on device" is actually
> the textual output equivalent to ENOSPC, which is to be created by the
> filesystem driver.
Which leads one to ask if there's some part of the journal/transaction
log/other piece of hidden filesystem infrastructure that has been exhausted.
Meaning one needs to dig into the underlying mechanics of the filesystem to
find out what conditions lead to ENOSPC getting thrown.
I fear I'm operating far beyond my competence...
Thanks much for the quick and detailed reply,
-- Bob
> Thou art out of inodes.
No, you're not. I misread your df output :(
>
> > Oct 15 02:48:06 bastion postfix/postdrop[3418]: warning: mail_queue_enter:
> > create file maildrop/21640.3418: No space left on device
>
>
> > # df -i
> > Filesystem Inodes IUsed IFree IUse% Mounted on
> > /dev/sda3 4294967295 0 4294967295 0% /
You have plenty of free inodes.
> But the volume is half full.
What kind of FS are you using?
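One way to answer that question is to look the path up in /proc/mounts. This is a minimal sketch, assuming the usual "device mountpoint fstype options ..." layout of that file. The sample text is invented for illustration: its device names follow the df output above, but the filesystem types for /boot and /dev/shm are guesses.

```python
import os

def parse_mounts(text):
    """Return [(mount_point, fs_type), ...] from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            entries.append((parts[1], parts[2]))
    return entries

def fs_type_for(path, mounts):
    """Pick the fs type of the longest mount point that contains path."""
    path = os.path.abspath(path) + "/"
    best = ("", None)
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if path.startswith(prefix) and len(mount_point) >= len(best[0]):
            best = (mount_point, fs_type)
    return best[1]

# Hypothetical sample; on a live system read open("/proc/mounts").read().
SAMPLE = """\
/dev/sda3 / reiserfs rw 0 0
/dev/sda1 /boot ext2 rw 0 0
shmfs /dev/shm shm rw 0 0
"""

print(fs_type_for("/var/spool/postfix/maildrop", parse_mounts(SAMPLE)))
```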
--
Ralf Hildebrandt Ralf.Hil...@charite.de
Postfix Tips: http://www.arschkrebs.de/postfix/ Tel. +49 (0)30-450 570-155
May's Law: The quality of correlation is inversely proportional to the
density of control. (The fewer data points, the smoother the curves.)
> I admit it. I'm panicking.
Thou art out of inodes.
> Oct 15 02:48:06 bastion postfix/postdrop[3418]: warning: mail_queue_enter:
> create file maildrop/21640.3418: No space left on device
> # df -i
> Filesystem Inodes IUsed IFree IUse% Mounted on
> /dev/sda3 4294967295 0 4294967295 0% /
All your inodes are in use
> # df -k
> Filesystem 1k-blocks Used Available Use% Mounted on
> /dev/sda3 8650728 4170112 4480616 49% /
But the volume is half full.
Ergo: You have lots of small files lying around.
--
Ralf Hildebrandt Ralf.Hil...@charite.de
Postfix Tips: http://www.arschkrebs.de/postfix/ Tel. +49 (0)30-450 570-155
Real programmers never work 9 to 5. If any real programmers are around
at 9 am, it's because they were up all night.
Procmail has the ability to forward mail. I don't use procmail, so I don't know
how this is done. It is, however, not unlikely that procmail calls sendmail for
delivery to remote destinations.
> It may be mailman, though mailman has been fairly stable and untouched for
> months as well. I've cut back the frequency of qrunner; we'll see if that
> changes the frequency of the warnings (apparently not.) 'lsof | egrep mail'
> doesn't give me anything useful, but that's expected if the process causing
> the trouble isn't persistent (e.g. invoked via crontab.) Realistically, it
> could be anything given how much code is hardwired to invoke sendmail.
Hmm. What puzzles me is that you write that other email is still being
transferred normally.
> > > Apparently, I have plenty of disk and inodes (see mailing list archives.)
> > > I have plenty of swap. I rebooted and the warnings continue.
So you did already reboot??
I won't comment on your setup of having just one large partition for
everything. It will make it very difficult to solve the problem,
as problems with the filesystem will make everything unstable.
> Sorry about that; the filesystem is ReiserFS. That clarifies the mysterious
> inode stats somewhat.
>
> > The error message "no space left on device" is actually
> > the textual output equivalent to ENOSPC, which is to be created by the
> > filesystem driver.
>
> Which leads one to ask if there's some part of the journal/transaction
> log/other piece of hidden filesystem infrastructure that has been exhausted.
> Meaning one needs to dig into the underlying mechanics of the filesystem to
> find out what conditions lead to ENOSPC getting thrown.
In linux/fs/reiserfs/bitmap.c I found:
...
/* NO_MORE_UNUSED_CONTIGUOUS_BLOCKS should only mean something to
** the preallocation code. The rest of the filesystem asks for a block
** and should either get it, or know the disk is full. The code
** above should never allow ret == NO_MORE_UNUSED_CONTIGUOUS_BLOCK,
** as it doesn't send for_prealloc = 1 to do_reiserfs_new_blocknrs
** unless it has already successfully allocated at least one block.
** Just in case, we translate into a return value the rest of the
** filesystem can understand.
**
** It is an error to change this without making the
** rest of the filesystem understand NO_MORE_UNUSED_CONTIGUOUS_BLOCKS
** If you consider it a bug to return NO_DISK_SPACE here, fix the rest
** of the fs first.
*/
if (ret == NO_MORE_UNUSED_CONTIGUOUS_BLOCKS) {
#ifdef CONFIG_REISERFS_CHECK
    reiserfs_warning("reiser-2015: this shouldn't happen, may cause false out of disk space error");
#endif
    return NO_DISK_SPACE;
}
...
Whatever this means exactly, it seems that ENOSPC can also be returned for
other underlying problems...
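A quick way to see whether the filesystem refuses file creation in general, and with which error code, is to try it from a small script. This is a minimal sketch with a made-up helper name. It is probably best pointed at a scratch directory on the same filesystem rather than at the live maildrop directory, so that Postfix does not trip over the probe file.

```python
import errno
import os
import tempfile

def probe_create(directory):
    """Create, write and remove a small file; return None or the errno name."""
    try:
        fd, path = tempfile.mkstemp(prefix="probe.", dir=directory)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        os.write(fd, b"x" * 4096)
        os.fsync(fd)  # force the block allocation to happen now
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
        os.unlink(path)
    return None

# Demonstration against the system temp directory; substitute a scratch
# directory on the filesystem under suspicion.
print(probe_create(tempfile.gettempdir()))
```

A result of None while postdrop keeps logging ENOSPC would point toward something specific to the maildrop directory or to the way those files are created, rather than a filesystem-wide condition.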
Best regards,
Lutz
Thanks much to everyone for their help and patience. I'm pursuing this
as a filesystem issue now; I'll report back with an answer if I find
one simpler than 'backup, reformat, restore.'
Again, many thanks,
--
Bob Apthorpe <arcl...@jump.net>
On Tue, 15 Oct 2002 10:28:43 -0000, Andre wrote:
> Replying
> >
> > [...]
> >
> > > > Apparently, I have plenty of disk and inodes (see mailing list
> > > > archives.) I have plenty of swap. I rebooted and the warnings continue.
> > > >
> > > > # df -i
> > > > Filesystem Inodes IUsed IFree IUse% Mounted on
> > > > /dev/sda3 4294967295 0 4294967295 0% /
> > > > /dev/sda1 10040 40 10000 1% /boot
> > > > shmfs 4294967295 1 4294967294 1% /dev/shm
> > > >
> > > > # df -k
> > > > Filesystem 1k-blocks Used Available Use% Mounted on
> > > > /dev/sda3 8650728 4170112 4480616 49% /
> > > > /dev/sda1 38859 3733 33120 11% /boot
> > > > shmfs 975488 0 975488 0% /dev/shm
> > >
> > > On your /dev/sda3 partition, you have 4GB used up but no inodes used.
> > > What kind of filesystem are you using? Let me guess: we are not talking
> > > about ext2, are we?
> >
> > Sorry about that; the filesystem is ReiserFS. That clarifies the mysterious
> > inode stats somewhat.
>
> Take this problem to the ReiserFS mailing list; they may be able to help you.
>
> >
> > [...]
> >The glorious ReiserFS. Nothing but trouble with that crap.
>
> FUD much, Ralf?
>
> The only time I've seen this error from a ReiserFS filesystem was when
> the disk was failing (I've seen it once--the disk failed completely a
> couple weeks later...and ext3 behaved equally badly on the same disk).
We've seen it fail on working hardware. There's only one thing I
expect of a filesystem: Stability.
ReiserFS has always been flaky. One to two years ago you only had to
mention ReiserFS at my old company and everybody would just scream
"don't use it, I've lost a (partition|whole disk|computer) due to it
being unable to recover after problems".
Just a week ago a working system at bild.de blew up due to ReiserFS
being unable to recover the filesystem after a crash.
Also, the number of patches with which the ReiserFS developers flood the
lkml isn't really encouraging.
I wouldn't use ReiserFS anywhere I need to access the data reliably
(it's ok on Squid Caches).
--
Ralf Hildebrandt Ralf.Hil...@charite.de
Postfix Tips: http://www.arschkrebs.de/postfix/ Tel. +49 (0)30-450 570-155
Microsoft: A Proven Danger to National Security
http://www.infowarrior.org/articles/msdanger.pdf
> I have had corruption problems with both xfs and ext3 though.
We also had a corruption problem with ext3 two weeks ago. But at least
the filesystem check didn't destroy the rest of the filesystem.
--
Ralf Hildebrandt Ralf.Hil...@charite.de
Postfix Tips: http://www.arschkrebs.de/postfix/ Tel. +49 (0)30-450 570-155
"C makes it easy to shoot yourself in the foot. C++ makes it harder,
but when you do, it blows away your whole leg." -- Bjarne Stroustrup
FWIW, I've tried ReiserFS a few times since the developers declared it
"usable" (which happened, IIRC, some two years ago). Every time I did
that I ran into heaps of problems, ranging from minor annoyances like
"du" and "df" returning garbage, to utterly complete corruption, so bad
that no file was left intact --- all that without me actually trying
to make it blow up. Then I looked at the code, and I found it clumsy,
to put it politely. At some point, I did a search on a Linux forum,
and found out that a certain bug had been declared fixed and then
reappeared no fewer than *four* times. Then, I have colleagues and
friends, some of whom I trust to know what they're doing even more than
I trust myself, and they all claim to have had similar experiences. On this
very list, whenever the ReiserFS topic comes up, several people post
their horror stories. Well. To me, there is no uncertainty, and no
doubt about it; and the only fear I have is that somebody, some day,
might actually constrain me to use it. So, please, feel free to ignore
all this, and use ReiserFS on all your production servers if you feel
like it. But don't call the experience of all those people FUD. It's
insulting, and uncalled for.
Regards,
Liviu Daia
--
Dr. Liviu Daia e-mail: Liviu...@imar.ro
Institute of Mathematics web page: http://www.imar.ro/~daia
of the Romanian Academy PGP key: http://www.imar.ro/~daia/daia.asc