Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Bounced Mail Due to Postfix/Cyrus conflict

2 views
Skip to first unread message

Jon Rifkin

unread,
Feb 15, 2007, 4:19:13 PM2/15/07
to
INTRO

This message describes a problem between Postfix and Cyrus, and describes
a workaround which worked for us. The problem is caused by a
mis-communication between Postfix's lmtp client and Cyrus's lmtpd
daemon, and is triggered when Cyrus's lmtpd daemon has too many
'lockers' locking its database. The result is that Postfix fails
to deliver valid email, instead returning it to the sender.

I'm posting this to (1) help users who are experiencing similar
problems, and (2) alert Postfix and Cyrus developers to this problem.


FIRST SYMPTOM

For each returned email, our log showed lines like this,

Feb 13 13:05:48 HOST postfix/lmtp[11370]: DBC8F384FC: to=<...>, orig_to=<...>,
relay=public/lmtp[public/lmtp], delay=1, status=bounced
(host public/lmtp[public/lmtp] said: 250 2.1.5 ok (in reply to DATA command))

The mystery was, why is postfix/lmtp bouncing mail? Didn't it receive
an OK message ("250 2.1.5 ok")?


THE ACTUAL PROBLEM

Following the on-line advice, we turned on verbose messages for
lmtp, replacing "lmtp" with "lmtp -v" in the appropriate column in
/etc/postfix/master.cf, and then re-examined our logs.

The cause of our failed emails was then narrowed to the conversion
between Postfix's lmtp client, and Cyrus's lmtpd daemon. The problem
happened in two stages.

In the first stage, Cyrus's lmtpd would report a problem with too many
'lockers',

Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok
Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> SIZE=26455
Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...>
Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA
Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok
Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok
Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 354 go ahead
Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: . --C1>>
Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 350 lockers

As a result, Postfix defers the message for later delivery (note that below the
status says 'bounced', but later log entries (not shown) show that this messages
was retried and eventually delivered). Here's the bounce message give in this
stage,

Feb 13 13:05:51 HOST postfix/lmtp[11340]: 45ACD380F0: to=<...>,
relay=public/lmtp[public/lmtp], delay=5, status=bounced
(host public/lmtp[public/lmtp] said: 350 lockers (in reply to end of DATA command))


In the next stage, the conversation continues, and an attempt is made to
deliver a second email. Below is the rest of the conversation between lmtp
and lmtpd.

Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RSET --C2>>
Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 Ok <<R1--
Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> --C3>>
Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...> --C4>>
Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA --C5>>
Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok <<R2--
Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok <<R3--
Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok <<R4--
Feb 13 13:05:48 HOST postfix/lmtp[11370]: DBC8F384FC: to=<...>, orig_to=<...>,
relay=public/lmtp[public/lmtp], delay=1, status=bounced
(host public/lmtp[public/lmtp] said: 250 2.1.5 ok (in reply to DATA command))

The text like --C1>> and <<R1-- are the commands given by lmtp, and the
responses returned by lmtpd.

Here, Postfix/lmtp bounces the email because in response to its DATA
command (marked --C5>> above), it receives the reply "250 2.1.5 ok"
(marked <<R4-- above), but it rightfully expects instead "354 go ahead".

Postfix's lmtp gets the wrong response because the responses are out of
sync. You can see this by following the sequence of replies. The first
command from lmtp is the '.' command (--C1>> above) signaling the end of
input to the DATA command. The response to this <<R1--, although it
*appears* to be the repines to the RSET command. By following the
sequence you can see that lmtp thinks that "250 2.1.5 ok" is an invalid
response to the "DATA" command, but it's really the valid response to
the "RCPT TO:" command.

Note that last line above. It shows lmtp bouncing the email. While
this bounce looks similar to the first bounce, these types of bounces
were always returned to sender.


OUR WORKAROUND

Our workaround for this problem is relatively simple. We got Cyrus's
lmtpd client to stop issuing the 'lockers' message by turning off
'duplicatesuppression' (it is on by default). By setting

duplicatesuppression: 0

Duplicate suppression compares every incoming
email to a list of previous emails in the duplicate database, and
rejects those that are duplicates for any particular user. Apparently
this process is what was generating the 'lockers' messages. Since we
have no need of this feature, turning it off is an easy workaround for
us.


WHAT IS THE RIGHT FIX?

Should Postfix's lmtp client re-queue the messages that it currently
returns? Or, should Cyrus's lmtpd daemon do a better job with its error
messages back to Postfix's lmtp?


Wietse Venema

unread,
Feb 15, 2007, 4:29:32 PM2/15/07
to
Jon Rifkin:

> Feb 13 13:05:48 HOST postfix/lmtp[11370]: DBC8F384FC: to=<...>, orig_to=<...>,
> relay=public/lmtp[public/lmtp], delay=1, status=bounced
> (host public/lmtp[public/lmtp] said: 250 2.1.5 ok (in reply to DATA command))
>
> The mystery was, why is postfix/lmtp bouncing mail? Didn't it receive
> an OK message ("250 2.1.5 ok")?

The server does not speak LMTP.

Specifically, 250 is not a valid reply to the DATA command.

> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> SIZE=26455
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...>
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 354 go ahead
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: . --C1>>
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 350 lockers

The server does not speak LMTP.

Specifically, 350 is not a valid END-OF-DATA reply.

> Feb 13 13:05:51 HOST postfix/lmtp[11340]: 45ACD380F0: to=<...>,
> relay=public/lmtp[public/lmtp], delay=5, status=bounced
> (host public/lmtp[public/lmtp] said: 350 lockers (in reply to end of DATA command))

And Postfix rightfully bounces.


> In the next stage, the conversation continues, and an attempt is made to
> deliver a second email. Below is the rest of the conversation between lmtp
> and lmtpd.
>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RSET --C2>>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 Ok <<R1--
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> --C3>>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...> --C4>>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA --C5>>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok <<R2--
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok <<R3--
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok <<R4--

Now that is bogus, but considering the earlier incorrect
server responses nothing surprises me anymore.

Wietse

Victor Duchovni

unread,
Feb 15, 2007, 4:50:13 PM2/15/07
to
On Thu, Feb 15, 2007 at 04:19:13PM -0500, Jon Rifkin wrote:

> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> SIZE=26455
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...>
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 354 go ahead
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: . --C1>>
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 350 lockers

WTF? 3XX is not a valid response to ".". Whatever protocol this server
is speaking, it is not LMTP. Note (not English, but clear enough):

http://listas.softwarelivre.org/pipermail/postfix-br/2005-October/008456.html

Por exemplo:

<usuario1 em dominio.com.br>: host 127.0.0.1[127.0.0.1] said: 149 lockers (in reply to MAIL FROM command)

<usuario2 em dominio.com.br>: host 127.0.0.1[127.0.0.1] said: 142 lockers (in reply to MAIL FROM command)

<usuario3 em dominio.com.br>: host 127.0.0.1[127.0.0.1] said: 147 lockers (in reply to MAIL FROM command)

So Cyrus LMTP is reporting the lock count as an LMTP response! This is
totally bogus, obtain a fix for that software.


> OUR WORKAROUND
>
> Our workaround for this problem is relatively simple. We got Cyrus's
> lmtpd client to stop issuing the 'lockers' message by turning off
> 'duplicatesuppression' (it is on by default). By setting
>
> duplicatesuppression: 0
>
> Duplicate suppression compares every incoming
> email to a list of previous emails in the duplicate database, and
> rejects those that are duplicates for any particular user. Apparently
> this process is what was generating the 'lockers' messages. Since we
> have no need of this feature, turning it off is an easy workaround for
> us.

The lock count messages this issues are in violation of the LMTP protocol.
Obtain an updated Cyrus LMTP server that is not broken in this way.

> WHAT IS THE RIGHT FIX?
>
> Should Postfix's lmtp client re-queue the messages that it currently
> returns? Or, should Cyrus's lmtpd daemon do a better job with its error
> messages back to Postfix's lmtp?

Those are not LMTP error messages, they are garbage leaking out of the
application into the LMTP protocol stream. Obtain an updated Cyrus
LMTP server, and update Postfix to 2.3.7 to partially work-around getting
out of step with broken LMTP servers..

20061203

Bugfix (introduced with Postfix 2.2): with SMTP server
tarpit delays of smtp_rset_timeout or larger, the SMTP
client could get out of sync with the server while reusing
a connection. The symptoms were "recipient rejected .. in
reply to DATA". Fix by Victor Duchovni and Wietse. File:
smtp/smtp_proto.c.

The above fix is in Postfix 2.3.5, 2.3.6, 2.3.7 and recent 2.4 snapshots.

There was a different problem with session caching to broken servers in
Postfix 2.1 and earlier. Uprade to 2.3.7 or disable LMTP session caching.

--
Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

To unsubscribe from the postfix-users list, visit
http://www.postfix.org/lists.html or click the link below:
<mailto:majo...@postfix.org?body=unsubscribe%20postfix-users>

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.

Wietse Venema

unread,
Feb 15, 2007, 4:52:26 PM2/15/07
to
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> SIZE=26455
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...>
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 354 go ahead
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: . --C1>>
> Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 350 lockers

I suspect that "350 lockers" is not the only END_OF_DATA reply for this
recipient.

I suspect that the LMTP server sends a SECOND reply for the same recipient.

From this point on the client and server are out of sync:

> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RSET --C2>>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 Ok <<R1--

Note the reply of 2.1.5, which means "destination address valid".
For RSET, a reply of 2.0.0 would be appropriate.

This 250 2.1.5 is the SECOND reply to END_OF_DATA. At this point
the client and server are out of step by one response.

> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> --C3>>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...> --C4>>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA --C5>>
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok <<R2--

Here we have the 250 2.0.0 is the response to RSET.

But since the server earlier sent TWO replies to END_OF_DATA, the
client sees it as the response to MAIL FROM.

> Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok <<R3--
> Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok <<R4--
> Feb 13 13:05:48 HOST postfix/lmtp[11370]: DBC8F384FC: to=<...>, orig_to=<...>,
> relay=public/lmtp[public/lmtp], delay=1, status=bounced
> (host public/lmtp[public/lmtp] said: 250 2.1.5 ok (in reply to DATA command))

And all this makes perfectly sense, since the server sent TWO
replies at END_OF_DATA.

Wietse

0 new messages