Problems with post-processing and DSNs

Dr Dave

Aug 11, 2010, 3:29:52 PM
to Registry_Internet
There is an interesting discussion in the SMTP Interest Group
<ietf-sm...@imc.org>, started by Dave Crocker on 9 August 2010. The
initial subject was "Processing after the end of DATA": this
processing can take a long time, cause a timeout at the transmitter,
and lead to re-transmission of a duplicate message some time later. I
suggested that we can avoid the problem by doing the time-consuming
part of the processing (statistical spam filtering) AFTER the
acceptance of data, as recommended in our standard configuration for
receivers using our Registry. This led to an off-topic discussion of
the problems that might occur in sending a DSN for messages rejected
by a post-process spam filter, and a debate about whether those DSNs
are really necessary. This is a continuation of that discussion.

Дилян Палаузов wrote:
>
> On 10.08.2010 20:18, David MacQuigg wrote:
>> My system is too small to draw any conclusions, but my impression is
>> that we are overly-concerned about a very small fraction of the mailflow
>> - those few messages that are legit, but get a false reject from a
>> statistical filter. We really don't care about the proper SMTP response
>> if it is spam. The only advantage I see in running the spam filter
>> before ACCEPT is a few less messages at the bottom of my spam bucket. I
>> wish I had some numbers from a larger system.
>
> A user might have installed redirect for her incoming mails to another mail provider. The other provider might SMTP reject incoming spam. If there is spam for your user, your server does accept it, then sends it to the other mail provider, and that second server SMTP rejects it, then your server will have to send DSN for that SPAM (provided that you do not want to discard emails). With rejection-preferences similar on both servers, the amount of such DSNs will decrease, when both servers reject after DATA. You have no influence on what other servers do with the spam (smtp rejecting or inserting in spam folders).

We have quite a bit of influence. We insist that our recipients turn
off all spam filtering, chain-forwarding, SPF checks, whatever might
conflict with our acting as Receiver for their box67.com email
address. Typically, they will have a private mailbox known only to
our Receiver/Forwarder. Failures resulting from SMTP rejects, DSNs,
or any other problem at the recipient's mailbox are treated the same
as a mailbox being suddenly offline. Incoming mail to that address is
rejected until we can resolve the problem with the recipient.

>> Would be nice if SMTP had a 'conditional ACCEPT': "Sorry, your
>> transmitter ID 'yahoo.com' cannot be verified, Your message has been
>> routed to the recipient's quarantine." Imagine the effect that would
>> have on all the lame excuses for an invalid or hard-to-verify HELO name.
>
> What do you want to communicate to the sender with this conditional ACCEPT: "Dear Sender, your suspicious email has arrived. It might be read soon, it might be read late, or it might not be read at all."?

No, the message I suggested had a specific reason for the failure. It
should also have a link to a webpage with instructions on how to fix
the failure, like what we have now on REJECT messages:
http://open-mail.org/3strikes/REJECT.html

> The problem with the spam folders are the false positives. Many users do check the spam folders/quarantines not that often as their Inbox and with less attention. This might lead to reading a message too late, or even overseeing it. I prefer to immediately notify the sender that her message was not delivered, including in the SMTP response alternative ways to contact the recipient (and spamassassin-evaluation of the mail), rather than let the sender hope that her email was properly delivered, while there is a risk that the recipient oversees that message.

False rejects are indeed the main problem with modern email systems,
but this problem isn't solved by rejecting a small fraction of
messages with the highest spam scores. There are still plenty of
messages, almost all spam, that get a score low enough to be kept in
quarantine, and there is always a risk that the recipient will
overlook a legitimate message. The only good solution to the false
reject problem is to offer senders a way to avoid the statistical
filters entirely. That is what a reputation system does. When a
message comes from yahoo.com (or any other A-rated domain), it goes
straight to the recipient's inbox.

Dr Dave

Aug 14, 2010, 2:51:36 PM
to Registry_Internet
This is a copy of Dilyan's reply. I've copied it and added my replies
here to avoid cluttering the SMTP Interest Group with off-topic
posts. Please respond here, not in the SMTP mailing list.

Дилян Палаузов wrote:
> I do not think it is good to insist on such things. One might
> want to communicate to a sender, that her mailbox does not exist
> anymore and SMTP reject the mails from that sender. Your service
> will then conclude, that destination mailbox is suddenly offline,
> but it is not. ...

I may not have been clear. I was talking *only* about what happens on
the receive side, where we have no relationship with the sender.
Perhaps this diagram from http://open-mail.org/MHSmodels.html will
help.

          |-------- Recipient's Network ---------|
      /
 --> / --> Receiver/Forwarder ~~> MDA ==> Recipient
    /
 Border

When a message arrives at our Receiver, the order of processing is: 1)
IP blacklist, 2) HELO whitelist, 3) ACCEPT or REJECT, 4) Spam filter,
5) Forward to Recipient's Mail Delivery Agent (MDA). The first three
steps handle most of the mailflow without delay, and we don't have a
problem with SMTP timeouts from the sending side. The post-SMTP
processing can take as long as it needs. Only a small fraction of the
messages goes through the spam filter, and each message can have its
own low-priority filtering process.
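
The ordering above can be sketched roughly as follows. This is only an
illustration, not our actual Receiver code: the names Session,
smtp_decision and after_smtp are made up for the example, and it
assumes that a blacklisted IP is rejected while a sender that misses
the HELO whitelist is still accepted and filtered afterwards.

```python
from dataclasses import dataclass


@dataclass
class Session:
    client_ip: str   # address of the connecting transmitter
    helo_name: str   # name given in HELO/EHLO


def smtp_decision(session, ip_blacklist, helo_whitelist):
    """Steps 1-3: return (reply, needs_spam_filter) using only fast lookups."""
    if session.client_ip in ip_blacklist:
        return "REJECT", False          # assumed: blacklisted IPs are rejected
    if session.helo_name.lower() in helo_whitelist:
        return "ACCEPT", False          # whitelisted: forward without filtering
    return "ACCEPT", True               # assumed: accept, then filter after SMTP


def after_smtp(needs_spam_filter, score_message, forward, message):
    """Steps 4-5: runs after the SMTP reply, so it cannot time out the sender."""
    score = score_message(message) if needs_spam_filter else None
    forward(message, score)             # the score is only a tag on the message
    return score
```

The point of the split is that smtp_decision does nothing slow, so the
reply goes out at once; score_message (whatever statistical filter is
plugged in) is only called from after_smtp.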

The question is what to do about non-delivery notices (NDNs) that are
generated downstream (by the MDA, or perhaps some further downstream
agent incorrectly set up by one of our recipients). These NDNs will
come back to us, because we re-write the Return Address on every
message we forward. Our current procedure is to immediately suspend
the recipient's account, so we don't accept any more messages we might
not be able to deliver, while we contact the recipient to resolve the
problem. If the problem is not resolved in a few days, we look for
SPF records to see if we can discard it as a forgery, we look at the
message to make sure it is not obviously spam, then we send an NDN to
the original return address, or as a last resort to the postmaster at
the HELO domain. The NDN includes just 5KB of the headers and initial
text of the message, no attachments.
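
As a rough sketch of the last part of that procedure (what happens
once the problem has gone unresolved for a few days), something like
the following would do. The function names are hypothetical, the SPF
and "obviously spam" checks are passed in as plain booleans rather
than implemented, and discarding an obviously-spam message is my
reading of the procedure, not a separate rule.

```python
from email import message_from_bytes

MAX_EXCERPT_BYTES = 5 * 1024   # "just 5KB of the headers and initial text"


def choose_ndn_action(spf_says_forged, obviously_spam, return_address, helo_domain):
    """Decide what to do with an undeliverable message after the grace period."""
    if spf_says_forged or obviously_spam:
        return ("discard", None)
    if return_address:
        return ("send_ndn", return_address)
    # Last resort: the postmaster at the HELO domain.
    return ("send_ndn", "postmaster@" + helo_domain)


def ndn_excerpt(raw_message, limit=MAX_EXCERPT_BYTES):
    """Headers plus the first text part, cut to `limit` bytes, no attachments."""
    msg = message_from_bytes(raw_message)
    headers = "".join(f"{name}: {value}\n" for name, value in msg.items())
    body = ""
    for part in msg.walk():
        # Skip containers and anything that carries a filename (an attachment).
        if part.get_content_maintype() == "text" and part.get_filename() is None:
            payload = part.get_payload(decode=True) or b""
            # Simplification: decode as UTF-8 regardless of declared charset.
            body = payload.decode("utf-8", errors="replace")
            break
    return (headers + "\n" + body).encode("utf-8")[:limit]
```

For example, choose_ndn_action(False, False, "", "mail.example.org")
falls through to the postmaster address, which is the "last resort"
case.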

The objection to our procedure seems to be that we should be sending
NDNs on the messages that pass the IP blacklist, fail the HELO
whitelist, and get a high spam score. The assumption is that it is
our responsibility to notify the sender when a message will end up in
quarantine. First, we do not decide what goes in an individual
recipient's quarantine folder. All our Receiver does is tag the
message with a spam score. The recipient decides whether to
maintain a separate folder with high spam scores (a quarantine), where
to set the spam/ham threshold, and how much time to spend reviewing
messages with a high spam score. Second, this will require automation
of the sending of NDNs, which will inevitably lead to complaints about
backscatter.

While I think it would be a good thing to send a notice of some sort
when a message is not whitelisted, we cannot do that now because of
the "near ban" on NDNs, and the fact that SMTP has no "conditional
ACCEPT". There appears to be no good way to communicate with the
sender other than ACCEPT or REJECT.


>> >> Would be nice if SMTP had a 'conditional ACCEPT': "Sorry, your
>> >> transmitter ID 'yahoo.com' cannot be verified, Your message has been
>> >> routed to the recipient's quarantine." Imagine the effect that would
>> >> have on all the lame excuses for an invalid or hard-to-verify HELO
>> >> name.
>> >
>> > What do you want to communicate to the sender with this conditional
>> > ACCEPT: "Dear Sender, your suspicious email has arrived. It might be
>> > read soon, it might be read late, or it might not be read at all."?
>>
>> No, the message I suggested had a specific reason for the failure. It
>> should also have a link to a webpage with instructions on how to fix the
>> failure, like what we have now on REJECT messages:
>> http://open-mail.org/3strikes/REJECT.html
>
> The link above shall be send with SMTP permanent reject (not
> conditional accept).

We do send a message like this with REJECT. I'm saying it would be
nice if we could send a similar message with an ACCEPT. Note: a
conditional ACCEPT would not be the same as a temporary reject. We do
not want the sender to try again later. We want him to fix whatever
is wrong with his transmitter (bad HELO name, too much spam, whatever)
so he can get on our whitelist.

> My rewording does not change the meaning of David`s message -- the
> sender wants to know if the message is delivered, or not; and the
> message "Sorry, your transmitter ID 'yahoo.com' cannot be
> verified, Your message has been routed to the recipient's
> quarantine." does not clarify this.

Again, all we can do is ACCEPT or REJECT, and ACCEPT includes both
whitelisted and "quarantined" messages. The proposed conditional
ACCEPT would provide the sender with additional and valuable
information. It would also, in the long run, have a beneficial effect
on the reliability of email. As more senders become aware of how
their mail is handled, pressure will build on their email service
providers to make sure their outgoing mail is whitelisted.


> > > The problem with the spam folders are the false positives. Many
> > > users do check the spam folders/quarantines not that often as
> > > their Inbox and with less attention. This might lead to reading a
> > > message too late, or even overseeing it. I prefer to immediately
> > > notify the sender that her message was not delivered, including in
> > > the SMTP response alternative ways to contact the recipient (and
> > > spamassassin-evaluation of the mail), rather than let the sender
> > > hope that her email was properly delivered, while there is a risk
> > > that the recipient oversees that message.
>>
>> False rejects are indeed the main problem with modern email systems, but
>> this problem isn't solved by rejecting a small fraction of messages with
>> the highest spam scores. There are still plenty of messages, almost all
>> spam, that get a score low enough to be kept in quarantine, and there is
>> always a risk that the recipient will overlook a legitimate message. The
>> only good solution to the false reject problem is to offer senders a way
>> to avoid the statistical filters entirely. That is what a reputation
>> system does. When a message comes from yahoo.com (or any other A-rated
>> domain), it goes straight to our recipient's inbox.
>
> Informing the sender, that her message was evaluated as spam and
> is therefore returned, including information how to alternatively
> contact the recipient, is also good solution. Or rather: giving
> the mailbox owners the one [smtp spam reject] or the other [spam
> folders] option is a good solution.

Again, this is entirely up to the recipient. We don't manage their
spam folders.

> By the way, in the Russian newspapers you can read job offers for
> marketing agents. Their job is to send mails using free mail
> providers to a provided long lists of recipients by copy&paste the
> list of addresses [= to send spam]. Thus, if the message comes
> from an A-rated domain, it does not mean the message is ham.

A-rated domains like yahoo.com and most other large ESPs do not allow
spam to be sent from their transmitters. Domains like google.com,
that do have a problem, don't get an A rating.