Milter refuses the message, but sendmail creates a DSN anyway (with fixing patch)

billyli...@gmail.com

Apr 16, 2007, 5:04:46 PM
Hi all,

Quick summary of the problem: If the message transfer time to sendmail
acting as an SMTP server exceeds Timeout.queuewarn or
Timeout.queuereturn, and the message is refused by a milter, sendmail
improperly creates a Delivery Status Notification. Patch is included.

I'm running open source sendmail 8.13.8 as an inbound internet
gateway. I have my Timeout.queuewarn value set to 1 hour, and a
mimedefang milter that enforces various company-defined policies on
the inbound messages. Recently, we started doing business with a
company overseas who sends us large messages over a very slow
connection. In many cases, the messages are in transit for more than
the 1 hour value set above. Unfortunately (for the overseas company),
the content of the message causes mimedefang to tempfail it. That
shouldn't be a big deal, as the sender is compliant and retries the
message later. The log entries look like this:

Apr 15 12:46:54 dclnpsm1 sendmail[25939]: NOQUEUE: connect from somedomain.br [200.123.4.567]
Apr 15 12:47:02 dclnpsm1 sendmail[25939]: STARTTLS=server, relay=somedomain.br [200.123.4.567], version=TLSv1/SSLv3, verify=NOT, cipher=DHE-RSA-AES256-SHA, bits=256/256
Apr 15 12:47:02 dclnpsm1 sendmail[25939]: STARTTLS=server, cert-subject=, cert-issuer=, verifymsg=ok
Apr 15 12:47:14 dclnpsm1 sendmail[25939]: l3FHks7R025939: Got first header
...
Apr 15 14:33:05 dclnpsm1 sendmail[25939]: l3FHks7R025939: from=<som...@somedomain.br>, size=13738149, class=0, nrcpts=1, msgid=<longmess...@somedomain.br>, proto=ESMTP, daemon=dclnpsm1MTA, relay=somedomain.br [200.123.4.567]
Apr 15 14:33:07 dclnpsm1 sendmail[25939]: l3FHks7R025939: Milter: data, reject=451 4.3.2 Please try again later
Apr 15 14:33:07 dclnpsm1 mimedefang[8772]: l3FHks7R025939: Filter time is 1249ms
Apr 15 14:33:07 dclnpsm1 sendmail[25939]: l3FHks7R025939: to=<someo...@hertz.com>, delay=01:45:58, pri=13768149, stat=Please try again later
Apr 15 14:33:08 dclnpsm1 sendmail[25939]: l3FHks7R025939: l3FHks7S025939: sender notify: Warning: could not send message for past 1 hour


The confusion I have is in the last line. If my milter tempfailed the
message and thus never accepted responsibility for it, why is /my/ MTA
sending a delay notify? Shouldn't that be the responsibility of the
client?

I have since verified that this happens with all domains, with or
without TLS, and affects both Timeout.queuewarn and
Timeout.queuereturn. This is especially pernicious in the case of
tempfails mixed with queuereturns, as the sender would receive
multiple permanent failure DSNs, even though the message was still
being retried. I've also checked the 8.14 line of sendmail, and it
appears to have the same issue.

Recreating the condition in a testing environment is relatively easy:
create a milter that tempfails all messages, set your
Timeout.queuewarn* or Timeout.queuereturn* values to something
absurdly low (3 minutes), and your Timeout.data* to something absurdly
high (2 hours). Then send a message from off of the box, but wait
about 5 minutes before putting in the final '\r\n.\r\n'. The server
will fail the message, yet generate a DSN.

I was able to trace the delivery of the message through smtp_data()
down to where it is aborted. It looks like it doesn't do anything to
suppress these DSNs before it hands the envelope off to
dropenvelope(). I've included a patch that suppresses the DSNs for
the particular situation I'm experiencing, but I'd appreciate the
input of someone who is more well-versed with the sendmail code to
tell me if there is a better flag to be setting.

There is a comment in the code (about line 3451 of srvrsmtp.c) saying
that "We goose error returns by clearing error bit" but that doesn't
seem to be related. That section of code is also skipped if the
milter rejects the data chunk of the message.

Any input you have would be appreciated.

Thanks,
William Lieurance
The Hertz Corporation

--- sendmail/srvrsmtp.c 2007-04-16 15:54:52.000000000 -0500
+++ sendmail/srvrsmtp.c 2007-04-16 16:02:44.000000000 -0500
@@ -3110,6 +3110,7 @@
 	bool doublequeue;
 	ADDRESS *a;
 	ENVELOPE *ee;
+	ADDRESS *q;
 	char *id;
 	char *oldid;
 	char buf[32];
@@ -3416,6 +3417,16 @@
 	{
 		/* Log who the mail would have gone to */
 		logundelrcpts(e, e->e_message, 8, false);
+		/*
+		** If something above refused the message, we still haven't
+		** accepted responsibility for it. Don't send DSNs.
+		*/
+		for (q = e->e_sendqueue; q != NULL; q = q->q_next)
+		{
+			q->q_flags &= ~QPINGONSUCCESS;
+			q->q_flags &= ~QPINGONFAILURE;
+			q->q_flags &= ~QPINGONDELAY;
+		}
 		flush_errors(true);
 		buffer_errors();
 		goto abortmessage;
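
For readers without the sendmail source at hand, the effect of the loop in the patch can be sketched as a standalone fragment. This is only an illustration, not sendmail code: struct rcpt, the PING_ON_* values, OTHER_FLAG and suppress_dsn() are all hypothetical stand-ins for ADDRESS, the QPINGON* bits and the loop added to smtp_data(), and the real flag values may well differ.

```c
#include <stddef.h>

/* Hypothetical bit values; the real QPINGON* values may differ. */
#define PING_ON_SUCCESS	0x01UL
#define PING_ON_FAILURE	0x02UL
#define PING_ON_DELAY	0x04UL
#define OTHER_FLAG	0x08UL	/* stands in for any unrelated flag */

/* Simplified stand-in for one entry in the recipient list. */
struct rcpt
{
	unsigned long flags;
	struct rcpt *next;
};

/* Clear every notification bit on every recipient; leave other bits alone. */
void suppress_dsn(struct rcpt *head)
{
	struct rcpt *q;

	for (q = head; q != NULL; q = q->next)
		q->flags &= ~(PING_ON_SUCCESS | PING_ON_FAILURE | PING_ON_DELAY);
}
```

Clearing the three bits in one masked assignment is equivalent to the three separate statements in the patch. The recipient list itself is left intact, so anything that later walks it still sees every address.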

jma...@ttec.com

Apr 16, 2007, 11:06:43 PM
On Apr 16, 5:04 pm, billylieura...@gmail.com wrote:
> Hi all,

>
> The confusion I have is in the last line. If my milter tempfailed the
> message and thus never accepted responsibility for it, why is /my/ MTA
> sending a delay notify? Shouldn't that be the responsibility of the
> client?

Yes. In my opinion it shouldn't be sending any notification, even if it
accepted responsibility, but that's less clear. What happens to
messages that take over an hour to transmit and are accepted by the
milter (but not delivered immediately for some reason, such as the
downstream being inaccessible)? Do those trigger a notification?

I hope you sent a copy (please mark it as such) of the Usenet posting
to the sendmail bug email address listed at

http://www.sendmail.org/email-addresses.html

since it sounds like you found one, and that's where the people who
REALLY know the code live.

> Recreating the condition in a testing environment is relatively easy:
> create a milter that tempfails all messages, set your
> Timeout.queuewarn* or Timeout.queuereturn* values to something
> absurdly low (3 minutes), and your Timeout.data* to something absurdly
> high (2 hours). Then send a message from off of the box, but wait
> about 5 minutes before putting in the final '\r\n.\r\n'. The server
> will fail the message, yet generate a DSN.
>

I haven't tested. I wonder if success return receipts would be returned
if it hadn't timed out and the original message requested them.

> I was able to trace the delivery of the message through smtp_data()
> down to where it is aborted. It looks like it doesn't do anything to
> suppress these DSNs before it hands the envelope off to
> dropenvelope(). I've included a patch that suppresses the DSNs for
> the particular situation I'm experiencing, but I'd appreciate the
> input of someone who is more well-versed with the sendmail code to
> tell me if there is a better flag to be setting.

(not like I am overly familiar with that particular flow)

e->e_sendqueue = NULL;

That might also do the trick.

Res

Apr 16, 2007, 11:22:20 PM
8.13.8?

8.14.1 is current; please see if your problem exists on that version.


On Tue, 16 Apr 2007, billyli...@gmail.com wrote:

> Hi all,
>
> Quick summary of the problem: If the message transfer time to sendmail
> acting as an SMTP server exceeds Timeout.queuewarn or

--
Cheers
Res

Let Novell know what you think of their back door deal with the devil.
Sign the petition today: http://techp.org/p/1/

billyli...@gmail.com

Apr 17, 2007, 12:34:37 AM

Res wrote:

> 8.13.8?
>
> 8.14.1 is current; please see if your problem exists on that version.

(From above)

>> I've also checked the 8.14 line of sendmail, and it
>> appears to have the same issue.

I haven't done the same in-depth testing of 8.14, but it appears to
function the same way.

billyli...@gmail.com

Apr 17, 2007, 1:06:05 AM
jma...@ttec.com wrote:

> On Apr 16, 5:04 pm, billylieura...@gmail.com wrote:
> > Hi all,
>
> >
> > The confusion I have is in the last line. If my milter tempfailed the
> > message and thus never accepted responsibility for it, why is /my/ MTA
> > sending a delay notify? Shouldn't that be the responsibility of the
> > client?
>
> Yes. In my opinion it shouldn't be sending any notification, even if it
> accepted responsibility, but that's less clear. What happens to
> messages that take over an hour to transmit and are accepted by the
> milter (but not delivered immediately for some reason, such as the
> downstream being inaccessible)? Do those trigger a notification?

Yes, they do (just tested it; good thought btw.), and I think they
should. Once the message is accepted, even before it is tried
immediately, the originating client can cheerfully disconnect and
clear the message from its queue. The responsibility to either deliver
the message or let the sender know about a non-delivery would seem to
rest with the server that accepted the transaction (post DATA
command), and no longer with the one who has the option of deleting it
from its queue.

Once the message is /accepted/ as such the chunk of code in the patch
is skipped. It is tried transactionally, and possibly queued with no
modification to the flags regarding notification behavior. The result
in my testing was a delay DSN as the queue file was saved and the
envelope was dropped, which I think is the correct behavior.

>
> I hope you sent a copy (please mark it as such) of the Usenet posting
> to the sendmail bug email address listed at
>
> http://www.sendmail.org/email-addresses.html
>
> since it sounds like you found one, and that's where the people who
> REALLY know the code live.
>

I did. Thank you. :)

> > Recreating the condition in a testing environment is relatively easy:
> > create a milter that tempfails all messages, set your
> > Timeout.queuewarn* or Timeout.queuereturn* values to something
> > absurdly low (3 minutes), and your Timeout.data* to something absurdly
> > high (2 hours). Then send a message from off of the box, but wait
> > about 5 minutes before putting in the final '\r\n.\r\n'. The server
> > will fail the message, yet generate a DSN.
> >
>
> I haven't tested. I wonder if success return receipts would be returned
> if it hadn't timed out and the original message requested them.

Not real sure what you're suggesting here. I think that if a return
receipt was requested with a message that hit the patch, it wouldn't
actually be delivered locally, so the return receipt would not come
into play.

>
> > I was able to trace the delivery of the message through smtp_data()
> > down to where it is aborted. It looks like it doesn't do anything to
> > suppress these DSNs before it hands the envelope off to
> > dropenvelope(). I've included a patch that suppresses the DSNs for
> > the particular situation I'm experiencing, but I'd appreciate the
> > input of someone who is more well-versed with the sendmail code to
> > tell me if there is a better flag to be setting.
>
> (not like I am overly familiar with that particular flow)
>
> e->e_sendqueue = NULL;
>
> That might also do the trick.

It probably would, but I'm less comfortable with modifying the amount
of data in the envelope in case it is needed later (logging or other
future purposes). I also don't really want to have to play around
with NULLing plus free-ing plus protecting against dereferences later,
plus whatever other pointer stuff I have to do if setting flags gets
me where I need to be. Paths of least resistance and whatnot. ;)

jma...@ttec.com

Apr 17, 2007, 7:56:41 AM
On Apr 17, 1:06 am, billylieura...@gmail.com wrote:
> jmai...@ttec.com wrote:

> Yes, they do (just tested it; good thought btw.), and I think they
> should. Once the message is accepted, even before it is tried
> immediately, the originating client can cheerfully disconnect and
> clear the message from its queue. The responsibility to either deliver
> the message or let the sender know about a non-delivery would seem to
> rest with the server that accepted the transaction (post DATA
> command), and no longer with the one who has the option of deleting it
> from its queue.

My point is that it would seem more correct to judge queue lifetime
starting AFTER the final data block is received, when the message has
actually been queued for delivery. Until that point it hasn't.
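
To make the difference concrete, here is a small sketch of the two ways of measuring the age. It is not taken from sendmail: warning_due() and its parameters are hypothetical, and it simply assumes (as this thread suggests, but I have not verified in the source) that the age is currently counted from when the envelope was created rather than from the end of DATA.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
** Hypothetical helper: has a message been waiting longer than
** warn_after seconds, counting from start?
*/
bool warning_due(time_t start, time_t now, time_t warn_after)
{
	assert(warn_after > 0);
	return now - start > warn_after;
}
```

With the numbers from the original log (delay=01:45:58, which is 6358 seconds, against a one hour queuewarn), warning_due(created, created + 6358, 3600) is true when start is the envelope creation time. If start were instead the moment the final dot arrived, the age at that moment would be zero and warning_due(done, done, 3600) would be false, so no warning would be generated for time spent purely in transfer.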

>
> Once the message is /accepted/ as such the chunk of code in the patch
> is skipped. It is tried transactionally, and possibly queued with no
> modification to the flags regarding notification behavior. The result
> in my testing was a delay DSN as the queue file was saved and the
> envelope was dropped, which I think is the correct behavior.

The point is that the calculation of queue lifetime does not appear to
be based on receipt of the final '\r\n.\r\n' of SMTP DATA.

>
>
> Not real sure what you're suggesting here. I think that if a return
> receipt was requested with a message that hit the patch, it wouldn't
> actually be delivered locally, so the return receipt would not come
> into play.

Suppose the timeouts are not exceeded on a message that requests
delivery receipts, and the milter rejects or tempfails the message in
xxfi_eom().

Do the return receipts get sent (improperly)?


> > > input of someone who is more well-versed with the sendmail code to
> > > tell me if there is a better flag to be setting.
>
> > (not like I am overly familiar with that particular flow)
>
> > e->e_sendqueue = NULL;
>
> > That might also do the trick.
>
> It probably would, but I'm less comfortable with modifying the amount
> of data in the envelope in case it is needed later (logging or other
> future purposes).

It's logged already:

	if (aborting)
	{
		/* Log who the mail would have gone to */
		logundelrcpts(e, e->e_message, 8, false);

> I also don't really want to have to play around
> with NULLing plus free-ing

No free(), it's an rpool. milter-rrres has been doing this for a few
versions already.

> plus protecting against dereferences later,
> plus whatever other pointer stuff

e->e_sendqueue is a linked list, and all code which touches it should
expect it to be possibly NULL.

Setting it to NULL seems to me to render the message undeliverable,
and more importantly to avoid any return processing in dropenvelope(),
since the control variables in that function will remain "false".
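
The reasoning about the control variables can be shown with a toy version of such a scan. This is a deliberately simplified sketch and not the real dropenvelope(): struct rcpt, WANTS_DELAY_NOTICE and delay_notice_wanted() are hypothetical names, and the real function tracks more state than a single boolean.

```c
#include <stdbool.h>
#include <stddef.h>

#define WANTS_DELAY_NOTICE 0x01UL	/* hypothetical flag value */

/* Simplified stand-in for one entry in the recipient list. */
struct rcpt
{
	unsigned long flags;
	struct rcpt *next;
};

/* Walk the list; return true if any recipient asks for a delay notice. */
bool delay_notice_wanted(const struct rcpt *head)
{
	bool wanted = false;	/* stays false if the loop body never runs */
	const struct rcpt *q;

	for (q = head; q != NULL; q = q->next)
	{
		if (q->flags & WANTS_DELAY_NOTICE)
			wanted = true;
	}
	return wanted;
}
```

With a NULL head the loop body never executes, so the result is false no matter what flags the recipients carried before the list was dropped.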

> I have to do if setting flags gets
> me where I need to be. Paths of least resistance and whatnot. ;)

I suggested the other test cases to illustrate that there may indeed
be other bugs not addressed by setting flags.