Postfix logging of rejects (and a general log format)

Michael Tokarev

unread,

Nov 29, 2001, 6:36:07 PM11/29/01

to postfix-users

Having in mind today's amount of junk mail it is necessary
to have ability to track some information about postfix's
action (a side note on this. In this week I received more
than 25,000 postmaster notifications about various rejects
and the like. Usual amount of such rejects here was about
50 per day). Postfix has very strong logging, but format
is good for humans, not for automatic inspection. And
besides that, logging format isn't very sutable for accounting
and even simple grepping *pairs* from/to in the logs.
(for example I need to know a list of recipients of one
sender after receiving a compilant that he was a spammer).

Some examples.

Nov 1 06:10:48 gate postfix/smtpd[26951]: reject: \
RCPT from unknown[195.68.19.113]: \
504 <habiba>: Helo command rejected: need fully-qualified hostname; \
from=<a24...@portugalmail.com> to=<m...@tls.msk.ru>

First of all, two points here.

Fixed information is at the end (from/to) while "variable-length"
is at the beginning. (Variable length: reject_maps_rbl uses DNS
TXT record that may be long and may include "; from=" pattern for
example). Better IMHO is to have the following:

Nov 1 06:10:48 gate postfix/smtpd[26951]: reject: \
from=<a24...@portugalmail.com> to=<m...@tls.msk.ru> \
RCPT from unknown[195.68.19.113]: \
504 <habiba>: Helo command rejected: need fully-qualified hostname

Note also that queue-id is missing but present in a previous line

Nov 1 06:10:38 gate postfix/smtpd[26951]: E9959380D: \
client=unknown[195.68.19.113]

and in a line from cleanup (header/body checks). Also, not all
information is present (notably HELO/EHLO argument that is also
important for filtering). What I suggest is:

Nov 1 06:10:48 gate postfix/smtpd[26951]: E9959380D: reject: \
client=unknown[195.68.19.113] \
helo=<habiba> \
from=<a24...@portugalmail.com> \
to=<m...@tls.msk.ru> \
state=RCPT: \
504 <habiba>: Helo command rejected: need fully-qualified hostname

Might be too verbose but it is worth it IMHO. If a size was given
in MAIL FROM, "size=xxx" might be useful too. For any informating
that is absent corresponding field will be omitted.

Next, for header/body checks:

Nov 1 23:58:53 gate postfix/cleanup[1330]: 065FB380D: reject: \
header X-Mailer: Mail Bomber; \
from=<a...@bobyr.ru> to=<ci...@tls.msk.ru>

The same applies: "variable part" (header in this case) should be at
the end. And as long as postfix now have more advanced IPC protocol,
this line should be better written as (finally):

Nov 1 23:58:53 gate postfix/cleanup[1330]: 065FB380D: reject: \
client=fish.zenon.net[213.189.198.214] \
helo=<fish.zenon.net> \
from=<a...@bobyr.ru> \
to=<ci...@tls.msk.ru> \
header X-Mailer: Mail Bomber

(note the client and helo args: them can be passed to cleanup
daemon). Or, another way around, when cleanup daemon uses IPC
to send complete info back to smtpd, something like:

Nov 1 23:58:53 gate postfix/smtpd[1330]: 065FB380D: reject: \
client=fish.zenon.net[213.189.198.214] \
helo=<fish.zenon.net> \
from=<a...@bobyr.ru> \
to=<ci...@tls.msk.ru> \
state=DATA \
554 Message content rejected;
header: X-Mailer: Mail Bomber

It will be very helpful too to have "restriction name" prepended
here as well, like "restriction=reject_maps_rbl:relays.ordb.org".

This is about rejects. The same partially applies to normal
log entries too. For example, I see no information about
HELO/EHLO argument -- might be useless for most people but
in today's unfriendly internet it may help somewhat. I see
tons of junk mails with HELO=aol.com or HELO=www.mail.ru --
having this info in logs will significantly help to set up
filters. This should be ok to log HELO/EHLO in the same
line as "client=name[ip.addr]":

Nov 1 23:58:51 gate postfix/smtpd[1329]: 065FB380D: \
client=fish.zenon.net[213.189.198.214] helo=<fish.zenon.net>

(for this to work, smtpd should be modified slightly to require
cleanup connection only after a MAIL FROM if that's not a case).

And the most troublestome part is logging of from/to/size info.
For example:

Nov 2 00:21:15 gate postfix/smtpd[1412]: A65CD380D: \
client=russian-caravan.cloud9.net[168.100.1.4]
Nov 2 00:21:16 gate postfix/cleanup[1385]: A65CD380D: \
message-id=<20011101212...@proven.weird.com>
Nov 2 00:21:16 gate postfix/qmgr[32700]: A65CD380D: \
from=<owner-pos...@postfix.org>, size=2918, nrcpt=1 (queue active)
Nov 2 00:21:16 gate postfix/smtp[1415]: A65CD380D: \
to=<m...@tls.msk.ru>, relay=mail.tls.msk.ru[192.168.1.1], delay=1, \
status=sent (250 Ok: queued as CCE868C02)

What I want to see is (last two lines):

Nov 2 00:21:16 gate postfix/qmgr[32700]: A65CD380D: nrcpt=1 (queue active)
Nov 2 00:21:16 gate postfix/smtp[1415]: A65CD380D: \
from=<owner-pos...@postfix.org>, size=2918, \
to=<m...@tls.msk.ru>, relay=mail.tls.msk.ru[192.168.1.1], delay=1, \
status=sent (250 Ok: queued as CCE868C02)

(this is just a move of from and size records into "sent" line). This
way, constant information will be multiplied in several lines if a
message is for many recipients, but this isn't that bad.

The last change is essential for me. Currently, there is NO way to
track down individual messages in logs. There is no defined "start"
of a message and no defined "end" of it. And most important benefit
here will be an ability to simple track some sender/rcpt combinations
using grep-like tools.

The question. Is it ok to change logs to look like described above?
This will simplify most log monitoring tools too but will break
compatibility with existing tools.

Comments?

Regards,
Michael.
-
To unsubscribe, send mail to majo...@postfix.org with content
(not subject): unsubscribe postfix-users

Michael Tokarev

unread,

Nov 29, 2001, 7:59:27 PM11/29/01

to postfi...@postfix.org

Matthias Andree wrote:
>
[]
> It'd help however if qmgr explicitly logged if a queue id expired by
> default, that'd simplify parsers big time already.

It already does so:

20010714

Logging: the queue manager now logs a "status=expired"
record when it returns a message that is too old. Files:
*qmgr/qmgr_active.c.

A funny example:

Nov 3 13:11:39 gate postfix/qmgr[32700]: ED4F4380C: \
from=<>, status=expired, returned to sender

BTW, it doesn't log recipient in this case.

Keith Woodworth

unread,

Nov 29, 2001, 7:22:18 PM11/29/01

to Michael Tokarev, postfix-users

On Fri, 30 Nov 2001, Michael Tokarev wrote:

|+The last change is essential for me. Currently, there is NO way to
|+track down individual messages in logs. There is no defined "start"
|+of a message and no defined "end" of it. And most important benefit
|+here will be an ability to simple track some sender/rcpt combinations
|+using grep-like tools.
|+
|+The question. Is it ok to change logs to look like described above?
|+This will simplify most log monitoring tools too but will break
|+compatibility with existing tools.

Since on the topic of logs and having used sendmail exclusively for the
last bunch of years and am used to the format of sendmail logs, how much
different are postfix logs compared to sendmail logs?

Having no experience with postfix (exploring the option of migrating over
at the moment) I cannot comment on the format but being able to easily
find items of interest (as you say above the start and stop of a message)
by grep'ing and other tools is a big benefit.

Thanks,
Keith

Michael Tokarev

unread,

Nov 29, 2001, 7:27:28 PM11/29/01

to postfix-users

Keith Woodworth wrote:
>
[]

> Since on the topic of logs and having used sendmail exclusively for the
> last bunch of years and am used to the format of sendmail logs, how much
> different are postfix logs compared to sendmail logs?

Well, I don't remember sendmail's logs either... But for a short
comparision -- postfix "prefer" to have more lines while sendmail
prefers to have more info in one line. I don't know if sendmail
logs both recipient and sender in one line but postfix definitely
doesn't. The same is true for "message lifetime" boundaries --
I don't know how to find this info in sendmail's logs but in postfix
it is almost impossible. Worse that postfix reuses queue-ids so
log-analisys tools can easily combine several different messages
together.

Regards,
Michael.

Wietse Venema

unread,

Nov 29, 2001, 7:47:15 PM11/29/01

to Postfix users

It is extremely unlikely that the frst implementation of a program
is the right one, and Postfix is not an exception.

The Postfix logging is still first-generation code that needs to
be replaced by more configurable mechanisms. Changing it from one
hard-coded logging format to another hard-coded logging format is
only a waste of time, in my opinion.

First-generation code that still needs to be replaced: the map
interface (multi attribute), syslogging (configurable), queue file
format (extensible without losing backwards compatibility), bounce
message (configurable), you get the idea. The internal protocol
revision was only the first in a long series of renovations.

Wietse

Matthias Andree

unread,

Nov 29, 2001, 7:49:34 PM11/29/01

to postfi...@postfix.org

Michael Tokarev <m...@tls.msk.ru> writes:

> Well, I don't remember sendmail's logs either... But for a short
> comparision -- postfix "prefer" to have more lines while sendmail
> prefers to have more info in one line.

Obviously, that's caused by the modular design: you don't usually have
multiple processes cooperate to assemble a /single/ log line.

> I don't know if sendmail logs both recipient and sender in one line
> but postfix definitely doesn't. The same is true for "message
> lifetime" boundaries -- I don't know how to find this info in
> sendmail's logs but in postfix it is almost impossible. Worse that
> postfix reuses queue-ids so log-analisys tools can easily combine
> several different messages together.

Well, will Postfix reuse a queue-id that is in use? I hope not, so one
should expect the scopes of a queue-id in the log (life ranges) do not
overlap. It's still a PITA because you need a stateful inspector for
your logs. Been there, done that for some log format other than
Postfix's (troll-ftpd IIRC), and it was no fun to implement.

It'd help however if qmgr explicitly logged if a queue id expired by
default, that'd simplify parsers big time already.

--
Matthias Andree

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." Benjamin Franklin

Michael Tokarev

unread,

Nov 29, 2001, 8:03:56 PM11/29/01

to postfi...@postfix.org

Matthias Andree wrote:
>
> Michael Tokarev <m...@tls.msk.ru> writes:
>
> > Well, I don't remember sendmail's logs either... But for a short
> > comparision -- postfix "prefer" to have more lines while sendmail
> > prefers to have more info in one line.
>
> Obviously, that's caused by the modular design: you don't usually have
> multiple processes cooperate to assemble a /single/ log line.

Not at all. The only "problem" I see is header/body checks w/o
client information (cooperation between cleanup and smtpd (and
qmqpd) in this case), but this is easily solvable now. In other
places I wrote (almost) all the needed information available in
internal structures (e.g. sender and size of a message available
for all sending/delivering agents).

Regards,
Michael.

mi...@netline.com

unread,

Nov 29, 2001, 8:16:00 PM11/29/01

to Matthias Andree, postfi...@postfix.org

/* Matthias Andree [m...@dt.e-technik.uni-dortmund.de] writes: */

>It'd help however if qmgr explicitly logged if a queue id expired by
>default, that'd simplify parsers big time already.
>

It got this behavior in July 2001. From the HISTORY file...

20010714
Logging: the queue manager now logs a "status=expired"
record when it returns a message that is too old. Files:
*qmgr/qmgr_active.c.

For what it's worth, a typical log entry looks like:

Nov 18 16:44:23 selanne postfix/qmgr[277]: 9294C108866: \
from=<sen...@netline.com>, status=expired, returned to sender

So if you are tracking the Qid's, figuring out the message details is
a snap.

--
do svidaniya,

~mitch

Keith Woodworth

unread,

Nov 30, 2001, 3:08:47 AM11/30/01

to Michael Tokarev, postfix-users

On Fri, 30 Nov 2001, Michael Tokarev wrote:

|+Well, I don't remember sendmail's logs either... But for a short
|+comparision -- postfix "prefer" to have more lines while sendmail
|+prefers to have more info in one line. I don't know if sendmail
|+logs both recipient and sender in one line but postfix definitely
|+doesn't. The same is true for "message lifetime" boundaries --
|+I don't know how to find this info in sendmail's logs but in postfix
|+it is almost impossible. Worse that postfix reuses queue-ids so
|+log-analisys tools can easily combine several different messages
|+together.

I just got postfix running on a test machine here and am running through
some basic stuff and see what you mean by logging in postfix. It will take
a bit of getting used to.

I wish I was a better coder as I'd try to tackle something like that.

Though it would be a little less messy to log more things on one line
rather than a connect from host line, then a client line, then message ID,
then a disconnect.

A quick comparison of logs and if you strip the 4 extra lines from postfix
logging like mentioned above would bring them closer to being the same but
not quite.

I dont know what you mean by message lifetime though? Messages that have
been expired out of the queue?

Keith

Ralf Hildebrandt

unread,

Nov 30, 2001, 3:14:13 AM11/30/01

to postfix-users

On Fri, Nov 30, 2001 at 12:08:47AM -0800, Keith Woodworth wrote:

> I wish I was a better coder as I'd try to tackle something like that.

Actually, you can use shell scripts like "show_all_entries_related_to" that
use grep and awk to filter out the Queids associated with a
sender/recipient/client and then display that.

> I dont know what you mean by message lifetime though? Messages that have
> been expired out of the queue?

Your average message takes only about 2 minutes to leave the system. That
means that given a small enough window of time (1d), all queueid's are unique.

--
Ralf Hildebrandt Tel. +49 (0)30-450 570-155
Fax. +49 (0)30-450 570-916
PP: MMDF gone mad with standards fever. Think "Brazil".

Michael Tokarev

unread,

Nov 30, 2001, 5:56:40 AM11/30/01

to postfix-users

Keith Woodworth wrote:
>
[]

> I dont know what you mean by message lifetime though? Messages that have
> been expired out of the queue?

There are different "paths" a message may be entered into postfix
(smtpd, qmqpd, local /usr/sbin/sendmap (pickup), local forwarding
and aliases, and internal bounces and double-bounces). There is
no single regexp for "start of message lifetime", moreover, things
like bounces have "unusual" start of life in logs. For locally
forwarded/aliased lines additional steps are required to connect
original and forwarded message (when owner-xxx alias exists or
.forward was used). Well, it IS possible to find the starting
point, but it is at least difficult. And the final point -- it
can be found reliable only if a message was sent or bounced
"correctly", or when it has only one recipient. For others,
correct "end of life" in logs will be either expiration or
reuse of the same queue id. In any case one should implement
non-trivial stateful parsing machine and count all possible
injection and delivery paths. (See for example comment by
Matthias Andree).

Regards,
Michael.

Matthias Andree

unread,

Nov 30, 2001, 8:17:16 AM11/30/01

to postfi...@postfix.org

mi...@netline.com writes:

> It got this behavior in July 2001. From the HISTORY file...
>
> 20010714
> Logging: the queue manager now logs a "status=expired"
> record when it returns a message that is too old. Files:
> *qmgr/qmgr_active.c.
>
> For what it's worth, a typical log entry looks like:
>
> Nov 18 16:44:23 selanne postfix/qmgr[277]: 9294C108866: \
> from=<sen...@netline.com>, status=expired, returned to sender
>
> So if you are tracking the Qid's, figuring out the message details is
> a snap.

That's not what I'm after, although it's useful.

Nope: qmgr does not log when smtp, local or another transport has
successfully delivered a message to the last recipient that was left to
do, so you end up tracking the recipients for that queue-id?

This is all trivial when a mail has one recipient, with multi-recipient
mail, you only know that a queue id is no longer in use when either
looking at mailq output (not possible when parsing log files at a later
point in time) or because some other software (postdrop, sendmail,
smtpd, qmqpd, whoever) reuses it.

What it does (snapshot-20011127):

Nov 30 14:04:09 emma1 postfix/pickup[13263]: 78433A200F: uid=500 from=<emma>
Nov 30 14:04:09 emma1 postfix/cleanup[13287]: 78433A200F: message-id=<200111301304...@emma1.emma.line.org>
Nov 30 14:04:09 emma1 postfix/qmgr[13264]: 78433A200F: from=<m...@dt.e-technik.uni-dortmund.de>, size=367, nrcpt=2 (queue active)
Nov 30 14:04:10 emma1 postfix/local[13288]: 78433A200F: to=<em...@emma1.emma.line.org>, relay=local, delay=1, status=sent ("|/usr/bin/procmail -t")
Nov 30 14:04:18 emma1 postfix/local[13289]: 78433A200F: to=<em...@emma1.emma.line.org>, relay=local, delay=9, status=sent ("|/usr/bin/procmail -t")

What I suggest: Add

Nov 30 14:04:18 emma1 postfix/qmgr[13264]: 78433A200F: status=done

Or something similar. That way, we'd definitely know when a queue-id is
no longer in use, so that context-aware log file analysis software can
know when a transaction is complete.

I'm not sure if this is complete, but it does just that:

*** ./src/qmgr/qmgr_active.c.old Tue Nov 27 00:09:02 2001
--- ./src/qmgr/qmgr_active.c Fri Nov 30 14:15:03 2001
***************
*** 471,478 ****
message->queue_id, message->queue_name);
msg_warn("%s: remove %s from %s: %m", myname,
message->queue_id, message->queue_name);
! } else if (msg_verbose) {
! msg_info("%s: remove %s", myname, message->queue_id);
}
}

--- 471,481 ----
message->queue_id, message->queue_name);
msg_warn("%s: remove %s from %s: %m", myname,
message->queue_id, message->queue_name);
! } else {
! if (msg_verbose) {
! msg_info("%s: remove %s", myname, message->queue_id);
! }
! msg_info("%s: status=done", message->queue_id);
}
}

--
Matthias Andree

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." Benjamin Franklin

Wietse Venema

unread,

Nov 30, 2001, 8:58:04 AM11/30/01

to Postfix users

Matthias Andree:

> Nope: qmgr does not log when smtp, local or another transport has
> successfully delivered a message to the last recipient that was left to
> do, so you end up tracking the recipients for that queue-id?

Queue ID recycling can be avoided to some extent.

This requires more (time) bits per queue ID. The queue ID length
can be kept in check by packing more bits per character (right now
Postfix uses A-F0-9 which packs only four bits per character; by
using a-zA-Z0-9 we can pack almost 6 bits per character).

This way we can add some 20 bits to the queue ID without making it
longer and ensure that a queue ID will not recycle in a week. Add
one character to the queue ID, and it will not recycle in a year.

Another approach is to log the message ID instead of the queue ID,
but that adds significantly to the logfile size.

Wietse

Ralf Hildebrandt

unread,

Nov 30, 2001, 9:13:10 AM11/30/01

to Postfix users

On Fri, Nov 30, 2001 at 08:58:04AM -0500, Wietse Venema wrote:

> This way we can add some 20 bits to the queue ID without making it
> longer and ensure that a queue ID will not recycle in a week. Add
> one character to the queue ID, and it will not recycle in a year.

For Germany, 90 days should suffice (maxmimum log retention time required by
the data protection laws)

--
Ralf Hildebrandt Tel. +49 (0)30-450 570-155

http://www.arschkrebs.de/ Fax. +49 (0)30-450 570-916
Qmail : a small office of neatly dressed clerks, delivering short
clipped remarks to queries, and handling mail with a rude
impersonality, except in the case of failure where they let their
hair down and have an after-hours beer and let you know about it,
pointing to the pertinent header sections.

Matthias Andree

unread,

Nov 30, 2001, 9:47:16 AM11/30/01

to postfi...@postfix.org

wie...@porcupine.org (Wietse Venema) writes:

> Matthias Andree:
> > Nope: qmgr does not log when smtp, local or another transport has
> > successfully delivered a message to the last recipient that was left to
> > do, so you end up tracking the recipients for that queue-id?
>
> Queue ID recycling can be avoided to some extent.

Sure it can, but is it necessary or trivial?

I assume at a given point in time, there is only one queue entry with
that ID, so the ID itself is unique as long as it exists.

Well, if it's a trivial change in one central place and it does not need
a queue conversion (so that a software that runs the longer IDs still
copes with old active queues), there are no objections. But don't spend
too much time on this.

I think it's sufficient if the life span of that queue ID is visible
from the logs:

* A queue ID starts to exist when a mail enters the queue,
postdrop/sendmail/smtpd/qmqpd,

* A queue ID ceases to exist when qmgr finally removes the
queue entry when all bounces and deliveries are complete and no
deferrals are left.

We know the creation of a queue ID, but we get to know the deletion only
in verbose mode. See my patch in this thread for a suggestion.

When the expiry of a queue id is known, the corresponding log entry is
like a "barrier". Before and after this expire log line "barrier" the
same queue ID refers to different mails, between to consecutive
"barriers" the same queue ID refers to the same mail. (Maybe the wording
is clumsy, I'm not a native English speaker. Please apologize.)

> Another approach is to log the message ID instead of the queue ID,
> but that adds significantly to the logfile size.

That approach would be a step back from the current situation.

1. Message-IDs are not unique:

a. Users provide Message-IDs. By recycling these from a public mailing
list, an attacker may hide in the logs.

b. Users provide Message-IDs with wrong domain names or even "localhost"
(I've seen NOCs send mail with these Message-IDs!). Clashes can
happen when two users with the same software write a mail at the same
time.

c. Some mailers don't let you configure the domain. It's common to see
Message-ID: <some-date-bas...@big-freemailer.domain>, like
Message-ID: <1234567...@gmx.net> -- with the domain part
obviously bogus when a private user emits such a Message-ID.

2. Message-IDs have no /reasonable/ length.

3. Logging message-ID paired with the queue-ID, so as to achieve a
stateless log, is just bloat. Stateful log analyzers can extract the
message-ID and associate it with a queue-ID if they want that
information, and it gets reliable without guessing with the barrier
proposed above.

--
Matthias Andree

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." Benjamin Franklin

Wietse Venema

unread,

Nov 30, 2001, 10:23:48 AM11/30/01

to Postfix users

Many people don't need the "log queue file deletion" record, so
what about making the logging configurable.

A lot of the Postfix logging is distributed over different processes
so as to make trouble shooting possible. smtpd logs connection info
so you have some clue of what happens even when no mail transaction
is completed; cleanup logs the message ID rather than leaving that
up to the qmgr, because a lot of time may pass before the qmgr gets
around to open the queue file; qmgr logs queueID and sender, so
you know it is not stuck; a delivery agent logs successful/failed
delivery itself rather than leaving that up to the qmgr, because
the qmgr process may no longer be around by that time.

It is possible to compress all this information into fewer records,
but that requires a stateful logging process thet flushes the buffer
when a message life ends or when certain checkpoints are passed.
I'm not sure the solution's complexity is worth the effort.

Wietse

Matthias Andree:

Matthias Andree

unread,

Dec 1, 2001, 6:46:09 PM12/1/01

to postfi...@postfix.org

wie...@porcupine.org (Wietse Venema) writes:

> Many people don't need the "log queue file deletion" record, so
> what about making the logging configurable.

Actually, that's a good plan, but I'd have to dig up how to do this from
the archives, and it would probably end up being a patch of more than
just -1 +4 lines.

> It is possible to compress all this information into fewer records,
> but that requires a stateful logging process thet flushes the buffer
> when a message life ends or when certain checkpoints are passed.
> I'm not sure the solution's complexity is worth the effort.

I think that complexity can be left up to "third parties". Logging early
is probably more useful (at least together with the "Big Picture") as
something goes wrong. You'd add that log daemon as single central point
of (logging) failure, which is not exactly enhancing robustness.