SOLVED: rbl check being skipped - Postfix logs no error on NXDOMAIN, does on SERVFAIL

Stan Hoeppner

Jan 22, 2010, 7:18:00 AM
Stan Hoeppner put forth on 1/22/2010 1:28 AM:
> I've wondered for a couple of months why my rbl check is being skipped. I've
> not seen a spamhaus entry in my logs since Sept 25 '09. Interestingly, postgrey
> is being called now and then, and it is after the rbl check in main.cf. Any
> idea why my rbl check is being skipped? What have I screwed up to cause this?

Bad form replying to my own post but...

After a hint from Ralf, I started digging around and here is what I found:

1. Spamhaus has banned Google Public DNS resolver queries. I didn't know this
until today. If Postfix is using Google Public DNS resolvers, rbl queries to
zen.spamhaus.org fail but Postfix (Debian Lenny 2.5.5-1.1) logs NOTHING about
it. Not the query attempt, not the failure, zilch, nut'n. This explains why I
haven't seen any zen entries in my log since Sept 25 last year, apparently the
day I switched to Google DNS resolvers. A total lack of log entries makes
troubleshooting anything very difficult. Thanks to Ralf's off list suggestion,
I was able to start troubleshooting down the correct path.

2. For other dns resolvers that Spamhaus doesn't like, such as a few under the
CenturyLink umbrella (former Embarq/Sprint resolvers) an error is logged, such as:

Jan 22 05:27:53 greer postfix/smtpd[19251]: warning:
50.211.118.82.zen.spamhaus.org: RBL lookup error: Host or domain name not found.
Name service error for name=50.211.118.82.zen.spamhaus.org type=A: Host not
found, try again

3. Sometime between my switch to the Google resolvers and today, Spamhaus
decided to ban my previous Embarq resolvers. So, when I switched back to the
old ones, I got errors like that above, and my zen queries still failed. I dug
around through some very old paperwork and found a set of old Sprint resolvers
in Kansas City I'd never actually used which aren't banned by Spamhaus. Turns
out this is probably a good thing since the resolvers I found that work are also
closest physically and electrically, the primary being 4 hops and 35ms away, the
secondary 7 hops and 40ms away.

I'm glad I got this solved. I really wish that when I was using the Google
resolvers that Postfix would have been logging some kind of errors. If it had,
I'd have known I had a real problem much sooner. The total lack of log entries
for ~3 months is what finally jolted me to look into this. This is a sad state
of affairs. So the question at this point is, why didn't Postfix log any errors
when NXDOMAIN was returned, but did log errors when SERVFAIL was returned?

--
Stan

Mikael Bak

Jan 22, 2010, 8:50:00 AM
Stan Hoeppner wrote:
>
> 1. Spamhaus has banned Google Public DNS resolver queries.

Stan,
Do you have a good enough reason to not run your own name resolver on
your front MX machine?

IMO relying on third parties for DNS on an MX is bad design.

Mikael

Wietse Venema

Jan 22, 2010, 8:58:53 AM
Stan Hoeppner:

> 1. Spamhaus has banned Google Public DNS resolver queries. I
> didn't know this until today. If Postfix is using Google Public
> DNS resolvers, rbl queries to zen.spamhaus.org fail but Postfix
> (Debian Lenny 2.5.5-1.1) logs NOTHING about it. Not the query
> attempt, not the failure, zilch, nut'n. This explains why I

The query returns NXDOMAIN. No-one has asked me to log all the
NXDOMAIN results for DNSBL queries.

Wietse

With query through Google DNS the host is "not listed" in zen.spamhaus.org:

% dig @8.8.8.8 a 105.49.136.89.zen.spamhaus.org

; <<>> DiG 9.6.1-P1 <<>> @8.8.8.8 a 105.49.136.89.zen.spamhaus.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 50578
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;105.49.136.89.zen.spamhaus.org. IN A

;; AUTHORITY SECTION:
zen.spamhaus.org. 150 IN SOA need.to.know.only. hostmaster.spamhaus.org. 1001221345 3600 600 432000 150

;; Query time: 169 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Jan 22 08:48:32 2010
;; MSG SIZE rcvd: 112

With direct query, the host is listed as you can see for yourself.
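The name queried in the dig output above is the client's IPv4 address with its octets reversed, followed by the DNSBL zone. A minimal sketch of that construction, assuming IPv4 only; the function name `dnsbl_query_name` is made up for illustration and is not part of Postfix:

```python
import ipaddress


def dnsbl_query_name(client_ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNS name that is looked up to check an IPv4 client against a DNSBL."""
    # IPv4Address validates the input and raises ValueError on malformed addresses.
    addr = ipaddress.IPv4Address(client_ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# The client behind the dig query shown above.
print(dnsbl_query_name("89.136.49.105"))  # -> 105.49.136.89.zen.spamhaus.org
```

Passing the resulting name to dig or host against different resolvers is a quick way to compare their answers for the same client.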

Stan Hoeppner

Jan 22, 2010, 9:34:35 AM
Mikael Bak put forth on 1/22/2010 7:50 AM:

> Stan Hoeppner wrote:
>>
>> 1. Spamhaus has banned Google Public DNS resolver queries.
>
> Stan,
> Do you have a good enough reason to not run your own name resolver on
> your front MX machine?
>
> IMO relying on third parties for DNS on an MX is bad design.

Due to this fiasco I'm already looking into it. I'd never really considered it
an issue until now since it's such a light duty box. Not sure if I have enough
memory on the box right now to run a caching resolver. I may need to grab a
stick or two. It wouldn't be an issue except for the fact I recently added a
bunch of daemons to this box so I could decommission a _really old_ machine
(dual P166) that housed the mail store and file shares. That increased the
memory footprint quite a bit.

Suggestions for a lightweight local resolver daemon on Debian Lenny are welcome.
I've never actually used bind before and I've never been a dns admin. I have a
vague hazy memory of reading grumblings that bind may be a bit too "heavy" for
using as a local machine resolver.

--
Stan

Kenneth Marshall

Jan 22, 2010, 9:39:18 AM

pdns-recursor 3.1.7.2 is easy to configure/use and has a tuneable
resource footprint.

Cheers,
Ken

Noel Jones

Jan 22, 2010, 11:00:40 AM
On 1/22/2010 6:18 AM, Stan Hoeppner wrote:
>
> 1. Spamhaus has banned Google Public DNS resolver queries. I didn't know this
> until today. If Postfix is using Google Public DNS resolvers, rbl queries to
> zen.spamhaus.org fail but Postfix (Debian Lenny 2.5.5-1.1) logs NOTHING about
> it. Not the query attempt, not the failure, zilch, nut'n.

Nothing is logged because the DNS server gives an authoritative
"does not exist" answer. That's not an error, it is the
expected response when a client is not listed in an RBL.

It would be silly to log such events except under debug
conditions. At any rate, the log for this would look
completely normal; lookup performed, host not listed. The
logs would be indistinguishable from any other successful RBL
lookup of an unlisted client.

> 2. For other dns resolvers that Spamhaus doesn't like, such as a few under the
> CenturyLink umbrella (former Embarq/Sprint resolvers) an error is logged, such as:
>
> Jan 22 05:27:53 greer postfix/smtpd[19251]: warning:
> 50.211.118.82.zen.spamhaus.org: RBL lookup error: Host or domain name not found.
> Name service error for name=50.211.118.82.zen.spamhaus.org type=A: Host not
> found, try again

An error is logged because this DNS server returned an error.

Obviously this DNS server is configured differently WRT
spamhaus lookups.

> I'm glad I got this solved. I really wish that when I was using the Google
> resolvers that Postfix would have been logging some kind of errors. If it had,
> I'd have known I had a real problem much sooner. The total lack of log entries
> for ~3 months is what finally jolted me to look into this. This is a sad state
> of affairs. So the question at this point is, why didn't Postfix log any errors
> when NXDOMAIN was returned, but did log errors when SERVFAIL was returned?
>


Test RBL lookups with the published test addresses: 127.0.0.1
should never be listed, and 127.0.0.2 should always be listed.

$ host 1.0.0.127.zen.spamhaus.org
Host 1.0.0.127.zen.spamhaus.org not found: 3(NXDOMAIN)

$ host 2.0.0.127.zen.spamhaus.org
2.0.0.127.zen.spamhaus.org has address 127.0.0.2
2.0.0.127.zen.spamhaus.org has address 127.0.0.4
2.0.0.127.zen.spamhaus.org has address 127.0.0.10

-- Noel Jones

Victor Duchovni

Jan 22, 2010, 11:46:04 AM
On Fri, Jan 22, 2010 at 10:40:03AM -0600, Stan Hoeppner wrote:

> Kenneth Marshall put forth on 1/22/2010 8:39 AM:
>
> > pdns-recursor 3.1.7.2 is easy to configure/use and has a tuneable
> > resource footprint.
>
> Got her installed, configured, up and running. Let's see if this improves this
> spamhaus situation, and a handful a day of other dns related errors I've been
> getting during mail transactions. Those other errors may be normal, maybe not.
> This resolver should help me figure that out.
>
> I limited the cache to 65536 entries to start with to keep the ram footprint
> low.

You can probably drop it even lower to ~8K entries, without significant
impact on cache effectiveness, this is a single host cache for a low
query volume host, not a recursive cache for a large network.

--
Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.

Stan Hoeppner

Jan 22, 2010, 11:58:18 AM
Noel Jones put forth on 1/22/2010 10:00 AM:

> Nothing is logged because the DNS server gives an authoritative "does not
> exist" answer. That's not an error, it is the expected response when a
> client is not listed in an RBL.

Hi Noel,

I was not venting at Postfix, or Wietse, or any of the devs for that matter, as
much as I was venting at the situation. Wietse, Victor, my apologies if it
seemed I was venting at you. I was not.

My venting should be aimed at Spamhaus. What they've done here is the opposite
of transparency. In the case of Google DNS, Spamhaus has pulled something a bit
underhanded in my estimation. They don't want people using Google DNS to query
Spamhaus zones. That's fine. I have no problem with that. But the way in
which they have blocked access creates a silent discard on mail servers using
Google DNS, or at least Postfix (I can't speak for other MTAs in this regard).

What they should have done is reply with a code that actually generates a
visible log error, so an admin, such as myself, can actually see that something
is wrong. Instead, all I got from my logs was silence. Multiple months of that
deafening silence finally prompted my action as I knew there had to be something
wrong. My A/S special sauce is good, but it's not so darn good that I wouldn't
at least get one zen lookup in a few months. Thankfully it's good enough that
even without any dnsbls I've only been averaging about 1 spam/day in the inbox.
Getting zen lookups working again may not help much, but at least I'll get one
more shot at killing the junk before letting it through.

Anyway, I've got my own resolver up now on my Postfix MX. It appears to be working:

greer:/# host 2.0.0.127.zen.spamhaus.org
2.0.0.127.zen.spamhaus.org A 127.0.0.10
2.0.0.127.zen.spamhaus.org A 127.0.0.2
2.0.0.127.zen.spamhaus.org A 127.0.0.4

--
Stan

Mark Goodge

Jan 22, 2010, 12:07:53 PM
On 22/01/2010 16:58, Stan Hoeppner wrote:
>
> My venting should be aimed at Spamhaus. What they've done here is the opposite
> of transparency. In the case of Google DNS, Spamhaus has pulled something a bit
> underhanded in my estimation. They don't want people using Google DNS to query
> Spamhaus zones. That's fine. I have no problem with that. But the way in
> which they have blocked access creates a silent discard on mail servers using
> Google DNS, or at least Postfix (I can't speak for other MTAs in this regard).

They're not doing anything underhand. What they're doing to Google is
exactly the same as they do to any other DNS server which exceeds the
rate limit for the free lookup. This is documented on the Spamhaus
website, along with a note explicitly warning users of free public DNS
resolvers that they shouldn't use Spamhaus as it probably won't work.
And, after all, why should it? If something is being provided for free,
such as an open public DNS resolver, then the operators aren't going to
want to pay for commercial access to something that they can't recoup
money on by charging their own users.

If you're going to use a DNSBL, such as those provided by Spamhaus, then
you really ought to read the documentation first in order to avoid
obvious bear traps like the one you fell into. It's not the fault of
Spamhaus, Google or Postfix if people don't RTFM.

Mark

Larry Stone

Jan 22, 2010, 12:17:01 PM
On Fri, 22 Jan 2010, Stan Hoeppner wrote:

> My venting should be aimed at Spamhaus. What they've done here is the opposite
> of transparency. In the case of Google DNS, Spamhaus has pulled something a bit
> underhanded in my estimation. They don't want people using Google DNS to query
> Spamhaus zones. That's fine. I have no problem with that. But the way in
> which they have blocked access creates a silent discard on mail servers using
> Google DNS, or at least Postfix (I can't speak for other MTAs in this regard).

> What they should have done is reply with a code that actually generates a
> visible log error, so an admin, such as myself, can actually see that something
> is wrong. Instead, all I got from my logs was silence. Multiple months of that
> deafening silence finally prompted my action as I knew there had to be something
> wrong.

This is getting away from Postfix so I'll keep this part short but I'll
take the opposite side. For Spamhaus to reply with anything other than
NXDOMAIN risked some MTA rejecting the mail. For those resolvers they, for
whatever reason, do not want to serve, a response that says "accept the
mail" is the only logical response. Anything other than that or a specific
reject reason (as encoded in the A record of a positive answer) is undefined and could
cause some MTA to incorrectly reject the mail.

When I first set up asking RBL lists, I periodically checked the logs to
make sure they were working. Even today, I have a weekly cron job that
gives me a report of RBL effectiveness (it's real crude - a simple grep
piped to wc -l) and mails it to me. I don't trust that I have anything
setup correctly until I see proof in my logs.
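
The "grep piped to wc -l" report described above can be sketched in a few lines. This is an illustration under assumptions: it supposes that Postfix reject lines for DNSBL hits contain the text "blocked using <zone>", the sample log lines are made up (documentation addresses, hypothetical host name), and `count_dnsbl_hits` and `dnsbl_report` are names invented here.

```python
from typing import Iterable, List


def count_dnsbl_hits(lines: Iterable[str], zone: str = "zen.spamhaus.org") -> int:
    """Count log lines recording a rejection attributed to the given DNSBL zone."""
    needle = f"blocked using {zone}"  # assumed wording of the reject text
    return sum(1 for line in lines if needle in line)


def dnsbl_report(lines: Iterable[str], zone: str = "zen.spamhaus.org") -> str:
    """Summarise hits, flagging the suspicious case of no hits at all."""
    hits = count_dnsbl_hits(lines, zone)
    if hits == 0:
        return f"WARNING: no rejections via {zone} in this log; verify that lookups still work"
    return f"{hits} rejection(s) via {zone}"


# Made-up sample lines, for illustration only.
sample_log: List[str] = [
    "Jan 22 12:00:01 mx postfix/smtpd[100]: NOQUEUE: reject: RCPT from unknown[192.0.2.10]: "
    "554 5.7.1 Service unavailable; Client host [192.0.2.10] blocked using zen.spamhaus.org",
    "Jan 22 12:00:09 mx postfix/smtpd[101]: connect from unknown[198.51.100.7]",
    "Jan 22 12:01:30 mx postfix/smtpd[102]: NOQUEUE: reject: RCPT from unknown[203.0.113.5]: "
    "554 5.7.1 Service unavailable; Client host [203.0.113.5] blocked using zen.spamhaus.org",
]

print(dnsbl_report(sample_log))
print(dnsbl_report(["Jan 22 12:00:09 mx postfix/smtpd[101]: connect from unknown[198.51.100.7]"]))
```

A zero count over a long period is only a hint, not proof of a problem, but it is the signal that went unnoticed for months in this thread.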

-- Larry Stone
lsto...@stonejongleux.com

Stan Hoeppner

Jan 22, 2010, 12:33:52 PM
Mark Goodge put forth on 1/22/2010 11:07 AM:

> It's not the fault of
> Spamhaus, Google or Postfix if people don't RTFM.

I'll give you that. I'd been using zen for years, and sbl-xbl for years before
that. When I changed my resolvers to Google from my current provider's (for
performance reasons, and not just my MX), I didn't go to spamhaus.org to check
and make sure it was ok to do so. It never dawned on me that there would be a
problem. I guess because I'm not a dns monkey (not a racial slur, think
training monkeys) it just didn't occur to me that there would be a problem.

The most interesting part about this, actually, is that when I switched my
resolvers back today, I found Spamhaus was apparently blocking them also,
Centurylink's dsl customer resolvers. This happened within the past 3 months.
I don't know what the reason is, but I doubt it's based on query volume given
that these resolvers serve residential and small business dsl customers. They
were working fine before I switched to Google resolvers.

I think it's working again now, though I'll have to wait and see, given my mail
flow and the fact that zen and postgrey only get table scraps, and not many at
that. ;)

--
Stan

Noel Jones

Jan 22, 2010, 1:13:55 PM
On 1/22/2010 10:58 AM, Stan Hoeppner wrote:
> Noel Jones put forth on 1/22/2010 10:00 AM:
>
>> Nothing is logged because the DNS server gives an authoritative "does not
>> exist" answer. That's not an error, it is the expected response when a
>> client is not listed in an RBL.
>
> Hi Noel,
>
> I was not venting at Postfix, or Wietse, or any of the devs for that matter, as
> much as I was venting at the situation. Wietse, Victor, my apologies if it
> seemed I was venting at you. I was not.
>
> My venting should be aimed at Spamhaus. What they've done here is the opposite
> of transparency. In the case of Google DNS, Spamhaus has pulled something a bit
> underhanded in my estimation. They don't want people using Google DNS to query
> Spamhaus zones. That's fine. I have no problem with that. But the way in
> which they have blocked access creates a silent discard on mail servers using
> Google DNS, or at least Postfix (I can't speak for other MTAs in this regard).

First remember how RBLs work. An authoritative NXDOMAIN means
the site is not listed; any other answer means the site is
listed. No answer (timeout) is an error that can only mean
"try again". That doesn't leave any option for an automatic
"you're blacklisted" code.
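
The three cases just described can be written down as a small decision function. This is a simplified model of how an MTA might interpret a DNSBL reply, not Postfix's actual code; the names `Outcome` and `classify_dnsbl_reply` are hypothetical, and lumping every other reply code together with timeouts is an assumption of the sketch.

```python
from enum import Enum
from typing import List


class Outcome(Enum):
    NOT_LISTED = "accept the client; nothing unusual to log"
    LISTED = "client is listed; the rejection shows up in the log"
    TEMP_ERROR = "lookup failed; a warning shows up in the log"


def classify_dnsbl_reply(rcode: str, a_records: List[str]) -> Outcome:
    """Map the DNS reply for a DNSBL query to the outcome an MTA would act on."""
    if rcode == "NOERROR" and a_records:
        return Outcome.LISTED
    if rcode in ("NXDOMAIN", "NOERROR"):
        # NXDOMAIN, or an answer with no A records, both read as "not listed".
        return Outcome.NOT_LISTED
    # SERVFAIL, timeouts and anything else say nothing about the client itself.
    return Outcome.TEMP_ERROR


for rcode, answers in [
    ("NXDOMAIN", []),             # clean client, or a resolver the DNSBL has blocked
    ("NOERROR", ["127.0.0.2"]),   # listed client
    ("SERVFAIL", []),             # resolver-side failure
]:
    print(rcode, answers, "->", classify_dnsbl_reply(rcode, answers).name)
```

The first case is the whole problem in this thread: a blocked resolver and a genuinely clean client produce the same reply, so nothing downstream can tell them apart.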

When Spamhaus blocks a resolver, they answer that every host
is not listed via the normal NXDOMAIN. There are good reasons
to do this, but it doesn't make the job any easier from this
side of the fence.

Since they return the normal "not listed", no MTA or filter
will log anything unusual -- you just won't see any hits.

The up side is that it's unlikely that any MTA or filter will
mistakenly reject or delay mail. If spamhaus just didn't
answer, you would get timeouts in your log but high volume
sites could experience a denial of service if every mail
transaction suddenly took 30-60 seconds longer than normal.
If they instead listed everyone, that would create a worse problem.

I suspect your other provider did something manually to return
timeouts. While this logged the errors that finally brought
this to your attention, this has the very real potential to
cause problems, although it's unlikely that anyone with high
enough volume to suffer from this uses an external DNS. So
while it would be wrong for spamhaus to timeout on everyone,
it's not so bad for an ISP's DNS to return timeouts.

>
> What they should have done is reply with a code that actually generates a
> visible log error, so an admin, such as myself, can actually see that something
> is wrong.

As you see now, this is simply not possible with the current
implementation of RBLs. This isn't a postfix (or any MTA
specific) problem, but rather the way that *all* RBLs are
implemented since their invention.

For this to change, there would need to be an invention of an
agreed-upon method to signal the client that their query
succeeded, but is not honored for some reason. This is
unlikely to happen anytime soon since there is no obvious
technical solution, and it's not a problem the RBL operators
are particularly concerned about.


> Instead, all I got from my logs was silence. Multiple months of that
> deafening silence finally prompted my action as I knew there had to be something
> wrong.

If you're concerned, hack up a cron script to probe the test
addresses and mail yourself the output.

I think we've spent enough time on this.

-- Noel Jones
