Timeouts during cache cleaning and zone collection

Auer, Karl James

Jun 20, 2005, 5:05:06 AM

Hi there.

We are seeing a problem with BIND 9.3.0, compiled with threading on
Solaris, whereby the servers stop answering queries for a couple of
seconds. Queries in this interval time out. That is, they are not
answered slowly, they are not answered at all.
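
One way to confirm the distinction between "answered slowly" and "not answered at all" is to probe the server with a fixed timeout and record what comes back. The following is only a sketch: the function names, the 2-second timeout and the 0.5-second "slow" threshold are assumptions chosen for illustration, not values from this thread.

```python
import socket
import struct
import time


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, name: str, timeout: float = 2.0, port: int = 53):
    """Return the round-trip time in seconds, or None if no answer arrived."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes socket timeouts
            return None
        if reply[:2] != query[:2]:  # ID mismatch: not our answer
            return None
        return time.monotonic() - start


def classify(rtt, slow_threshold: float = 0.5) -> str:
    """Label one probe result as 'timeout', 'slow' or 'ok'."""
    if rtt is None:
        return "timeout"
    return "slow" if rtt >= slow_threshold else "ok"
```

Calling `classify(probe("192.0.2.53", "example.com"))` once a second (the address is a placeholder) and logging the result with a timestamp would show whether the gaps line up with cache cleaning or zone transfers.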

The servers do this a) when they clean their caches and b) when they are
downloading zones.

Archived messages on the matter of cache cleaning suggest that these
timeouts are normal for BIND, and that the only way to avoid them is to
turn cache cleaning off. I've tried setting the cleaning interval to
only a few minutes, but it just caused more timeouts - there seems to be
a sort of minimum interruption due to cache cleaning.

Of more concern are the interruptions due to zone downloads. We have a
(poorly designed) system whereby our zone files are generated from a
database if required; the zone files are completely rewritten, with a
new serial number, and a hidden master is reloaded. That causes a bunch
of notifies to hit the secondaries, which then reload those zones from
the hidden master. That process causes these timeouts. The secondaries
are the nameservers that field all our queries. Note that about 400
zones are downloaded, though most are very small or even empty. There
are a couple of larger ones, but even there we are only loading 50000 or
so entries.

We have not separated authoritative and caching nameserver functions.
The affected secondaries handle internal and external queries, but are
not under a particularly heavy load. Even if we did separate the
functions, it wouldn't help (I think) because the caching issue would
still be there on the caching servers, and the download issue would
still be there on the authoritative servers. We want to separate the
functions anyway, for all the usual reasons.

Note that this problem isn't new - it's just that our monitoring has
improved :-)

So my question: Is it normal for a BIND server to stop answering queries
during zone downloads? If not, what might be the problem here?

Regards, K.

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karl Auer (karl...@id.ethz.ch) Geschaeft/work +41- 1-6327531
Kommunikation, ETHZ RZ Privat/home +41-43-2660706
Eidgenoessische Technische Hochschule, Zuerich Fax +41- 1-6321225
Clausiusstrasse 59 CH-8092 ZUERICH Switzerland

Peter Dambier

Jun 20, 2005, 6:43:40 AM

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Karl,

The way BIND is designed, those moments when it goes "offline" for
cache cleaning and reloading zones are normal. Maybe that is the
reason why you should have more than one DNS server in the first place.

My own BIND with some 40 zones is not too annoying, but I do observe
a lot of timeouts and even give-ups when it looks for serial numbers on
its masters. It is in the logs but I do not notice it.

I do notice that it takes more time to start bind with every additional
zone it loads.

If you can, split the roles: authoritative servers for the outside and
caches for the inside.

Nevertheless, it is a good idea to load some zones on the inside resolvers:

1. Your own zones should be permanently on your resolvers. You know better
than the root-servers what belongs to you. Nobody can poison your cache
about information your server is authoritative for.

2. The root zone. It is the single most problematic point of failure. There
have been attacks on the root-servers. The root zone file really is not so
big at all. I do clone a.public-root.net for the "." zone. All the
public-root.net servers allow axfr transfer. If the root breaks then my
nameserver will continue running for at least two weeks. I know a lot of
other root-servers do not allow zone transfer for security reasons. What
security?

3. Companies you do business with. Especially banks know about security
problems with poisoned servers. Many think about publishing their zone
information and exchanging zone information with regular customers.

Windows allows 2 nameservers; other operating systems allow 3 nameservers,
so you should have 2 or 3 resolvers inside your company. Hide them
behind a firewall. Don't let normal workstations query outside nameservers.

With 2 or 3 nameservers, nobody will notice that one of them is
daydreaming from time to time.

If you arrange for server1 to ask outside1 (backup outside2, backup outside3),
server2 to ask server1 (backup outside2, backup outside3) and server3 to ask
server2 (backup outside3, backup outside1), then your servers will rarely
update at the same time. Nobody will notice the failure of a single server.

- --
Peter and Karin Dambier
Public-Root
Graeffstrasse 14
D-64646 Heppenheim
+49-6252-671788 (Telekom)
+49-6252-599091 (O2 Genion)
+1-360-226-6583-9738 (INAIC)
mail: pe...@peter-dambier.de
http://iason.site.voila.fr
http://www.kokoom.com/iason
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQFCtp3aPGG/Vycj6zYRArZlAJwO5TdSHp1C0fgf95qLNNKLXQKEIACffc0t
Asz9XF48Te9fcX5Q9q2Bhj0=
=u7Ml
-----END PGP SIGNATURE-----

Nod

Jun 21, 2005, 12:44:22 PM

On Mon, 20 Jun 2005 11:05:06 +0200, "Auer, Karl James" <karl...@id.ethz.ch>
wrote:

>Hi there.
>
>We are seeing a problem with BIND 9.3.0, compiled with threading on
>Solaris, whereby the servers stop answering queries for a couple of
>seconds. Queries in this interval time out. That is, they are not
>answered slowly, they are not answered at all.
>
>The servers do this a) when they clean their caches and b) when they are
>downloading zones.
>
>Archived messages on the matter of cache cleaning suggest that these
>timeouts are normal for BIND, and that the only way to avoid them is to
>turn cache cleaning off. I've tried setting the cleaning interval to
>only a few minutes, but it just caused more timeouts - there seems to be
>a sort of minimum interruption due to cache cleaning.
>

<snip>

I'm having a similar issue to this, namely the cache cleaning.
It seems that whenever the cache-cleaner runs, it blocks requests to the
nameserver. While the secondary cache will take over, most of the clients will
see a 5-6 second delay (Windows) before using the secondary server. Setting the
cache cleaner to run once a minute takes about 7-8 seconds to clean. Leaving it
at an hour means it will be cleaning for about 20 minutes.
Turning off the cleaner results in bind using up all the ram and getting killed
by the OS.
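
To put those two settings side by side, the share of time spent cleaning can be worked out directly from the figures above. This is only a back-of-the-envelope sketch: it assumes the server answers nothing while cleaning, that the interval is measured from start to start, and it takes 7.5 seconds as the midpoint of "7-8 seconds".

```python
def unavailable_fraction(cleaning_seconds: float, interval_seconds: float) -> float:
    """Fraction of wall-clock time spent cleaning, assuming one cleaning
    pass per interval and no queries answered during a pass."""
    return cleaning_seconds / interval_seconds


# One-minute interval, 7-8 second passes (7.5 s midpoint is an assumption).
per_minute = unavailable_fraction(7.5, 60)
# One-hour interval, passes of about 20 minutes.
per_hour = unavailable_fraction(20 * 60, 60 * 60)

print(f"1-minute interval: {per_minute:.1%} of the time spent cleaning")
print(f"1-hour interval:   {per_hour:.1%} of the time spent cleaning")
```

Under these assumptions the short interval reduces the total share of time lost, but it trades one long outage per hour for sixty short ones, each longer than the 5-6 second client failover delay mentioned above.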

The server in question serves about 11000 zones. Dual xeon 2.8, 4gb ram, FreeBSD
5.4 64 bit, MAXDSIZ, MAXSSIZ, and DFLDSIZ set to 2gb at the moment. Generally,
about 800-900 recursive clients at any given time.

If anyone has any ideas how to improve the cache cleaner performance, it would
be most appreciated. As it is now, the nameserver 'going away' happens far too
often, and is very noticeable.

Nod

Jun 21, 2005, 12:29:47 PM

On Mon, 20 Jun 2005 11:05:06 +0200, "Auer, Karl James" <karl...@id.ethz.ch>
wrote:

>Hi there.
>
>We are seeing a problem with BIND 9.3.0, compiled with threading on
>Solaris, whereby the servers stop answering queries for a couple of
>seconds. Queries in this interval time out. That is, they are not
>answered slowly, they are not answered at all.
>
>The servers do this a) when they clean their caches and b) when they are
>downloading zones.
>
>Archived messages on the matter of cache cleaning suggest that these
>timeouts are normal for BIND, and that the only way to avoid them is to
>turn cache cleaning off. I've tried setting the cleaning interval to
>only a few minutes, but it just caused more timeouts - there seems to be
>a sort of minimum interruption due to cache cleaning.
>

Joshua Coombs

Jun 21, 2005, 2:23:14 PM

> I'm having a similar issue to this, namely the cache cleaning.
> It seems that whenever the cache-cleaner runs, it blocks requests to the
> nameserver. While the secondary cache will take over, most of the clients will
> see a 5-6 second delay (Windows) before using the secondary server. Setting the
> cache cleaner to run once a minute takes about 7-8 seconds to clean. Leaving it
> at an hour means it will be cleaning for about 20 minutes.
> Turning off the cleaner results in bind using up all the ram and getting killed
> by the OS.
>
> The server in question serves about 11000 zones. Dual xeon 2.8, 4gb ram, FreeBSD
> 5.4 64 bit, MAXDSIZ, MAXSSIZ, and DFLDSIZ set to 2gb at the moment. Generally,
> about 800-900 recursive clients at any given time.
>
> If anyone has any ideas how to improve the cache cleaner performance, it would
> be most appreciated. As it is now, the nameserver 'going away' happens far too
> often, and is very noticeable.

This issue has bitten us pretty hard, as it causes freeradius to fail auth
attempts during bind's cleaning windows. This occurs whether or not
the freeradius box has multiple resolvers defined. Our solution:
we're moving away from bind wherever we can until a fix is tested.
We've had excellent luck with pdnsd; the latest dev builds are working
quite well.

Joshua Coombs

Nod

Jun 21, 2005, 4:28:39 PM

I've found somewhat of a workaround, but it involves setting max-cache-size
to about 16 megabytes. The CPU is constantly thrashing with a 1 minute cache
cleaning, but it doesn't go down for any perceptible time period. Generally the
cleaning takes less than 0.8 seconds, not enough to be complained about.

Not an ideal solution, but an immediate one.

JINMEI Tatuya / 神明達哉

Jun 21, 2005, 5:50:55 PM

>>>>> On Tue, 21 Jun 2005 16:44:22 GMT,
>>>>> no...@nospam.none (Nod) said:

> I'm having a similar issue to this, namely the cache cleaning.
> It seems that whenever the cache-cleaner runs, it blocks requests to the
> nameserver. While the secondary cache will take over, most of the clients will
> see a 5-6 second delay (Windows) before using the secondary server. Setting the
> cache cleaner to run once a minute takes about 7-8 seconds to clean. Leaving it
> at an hour means it will be cleaning for about 20 minutes.
> Turning off the cleaner results in bind using up all the ram and getting killed
> by the OS.

> The server in question serves about 11000 zones. Dual xeon 2.8, 4gb ram, FreeBSD
> 5.4 64 bit, MAXDSIZ, MAXSSIZ, and DFLDSIZ set to 2gb at the moment. Generally,
> about 800-900 recursive clients at any given time.

> If anyone has any ideas how to improve the cache cleaner performance, it would
> be most appreciated. As it is now, the nameserver 'going away' happens far too
> often, and is very noticeable.

As I said in a separate message, upgrading to latest versions would be
a good idea (if you're using an older version), particularly for this
type of issue. As also mentioned in the separate message, specifying
a build-time option for memory management sometimes improves the
performance.

And finally, if you're enabling threads, I'd suggest turning them off.
The current implementation of BIND9 threading is particularly
unfriendly to FreeBSD's thread support. It doesn't buy anything but
poor performance. (This will be improved in 9.4, but that cannot be a
short-term solution).

JINMEI, Tatuya
Communication Platform Lab.
Corporate R&D Center, Toshiba Corp.
jin...@isl.rdc.toshiba.co.jp

JINMEI Tatuya / 神明達哉

Jun 21, 2005, 5:44:06 PM

>>>>> On Mon, 20 Jun 2005 11:05:06 +0200,
>>>>> "Auer, Karl James" <karl...@id.ethz.ch> said:

> We are seeing a problem with BIND 9.3.0, compiled with threading on
> Solaris, whereby the servers stop answering queries for a couple of
> seconds. Queries in this interval time out. That is, they are not
> answered slowly, they are not answered at all.

This may be partly due to a bug: an inefficient algorithm for traversing
the DB (cache or authoritative). This bug was fixed in 9.3.1 with this
change:

1740. [bug] Replace rbt's hash algorithm as it performed badly
with certain zones. [RT #12729]

So upgrading to 9.3.1 would be a good idea.

Also, I've heard that some of the past performance problems were often
due to memory management issues, and can sometimes be improved by
building named with 'internal malloc'. You can do this, e.g., as
follows:

% ./configure --enable-threads STD_CDEFINES='-DISC_MEM_USE_INTERNAL_MALLOC=1'

JINMEI Tatuya / 神明達哉

Jun 21, 2005, 6:54:43 PM

>>>>> On Wed, 22 Jun 2005 06:44:06 +0900,
>>>>> JINMEI Tatuya <jin...@isl.rdc.toshiba.co.jp> said:

> Also, I've heard that some of the past performance problems were often
> due to memory management issues, and can sometimes be improved by
> building named with 'internal malloc'. You can do this, e.g., as
> follows:

> % ./configure --enable-threads STD_CDEFINES='-DISC_MEM_USE_INTERNAL_MALLOC=1'

To be clearer: '--enable-threads' is independent from internal
malloc...I just carelessly copied and pasted test inputs in my
environment to re-check the behavior.

% ./configure STD_CDEFINES='-DISC_MEM_USE_INTERNAL_MALLOC=1'

will simply be enough for the 'internal malloc' feature.

Nod

Jun 21, 2005, 11:28:13 PM

Just upgraded from 9.3.1rc1 to 9.3.1. Unfortunately, the cache cleaning still
results in no answer from the nameserver until it's done.
At the moment, my workaround is to set max-cache-size to 16 megs with 1
minute cleaning, and let it chew on the CPU.
Operating system choices aside, is there any reason to expect that moving to a
threaded nameserver would help overcome this issue?

JINMEI Tatuya / 神明達哉

Jun 22, 2005, 2:55:41 PM

>>>>> On Wed, 22 Jun 2005 03:28:13 GMT,
>>>>> no...@nospam.none (Nod) said:

> Just upgraded from 9.3.1rc1 to 9.3.1. Unfortunately, the cache cleaning still
> results in no answer from the nameserver until it's done.

Hmm...just to be sure, did you also enable
ISC_MEM_USE_INTERNAL_MALLOC? (I'm dubious about whether this takes a
dominant role for this particular issue, though).

Also, didn't you get *any* responses during the cache cleaning? Or
did you still see some responses (as well as no-answers)? In my
understanding, named should still be able to respond to queries even
during the cache cleaning while it can drop some of the queries due to
the additional cleaning task.

If you can see some responses, one additional possibility of tuning is
to reduce the load of each 'chunk' of the whole cleaning work. You
can do this by modifying the DNS_CACHE_CLEANERINCREMENT macro at line
48 of bind-9.3.1/lib/dns/cache.c:

#define DNS_CACHE_CLEANERINCREMENT 1000 /* Number of nodes. */

I've heard a report that changing this value to 200 could eliminate the
packet loss in some environments. (Unfortunately, there is no other
way than modifying the source code to tune this value at the moment).
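
Why a smaller chunk might help can be seen in a toy model of a single-threaded server that alternates one cleaning chunk with answering whatever queries arrived in the meantime. This is not BIND's code: the 1 ms cost per node, the 10000-node cache, the query spacing and the zero cost of answering are all made-up assumptions, and times are kept in integer microseconds so the arithmetic stays exact.

```python
def simulate(total_nodes, increment, node_cost_us, query_times_us):
    """Return the wait (in microseconds) of each query, in arrival order.

    The loop cleans `increment` nodes at a time and is deaf to queries
    during a chunk; queries that arrived meanwhile are answered as soon
    as the chunk ends. Answering is assumed to take no time.
    """
    now = 0
    waits = []
    pending = sorted(query_times_us)
    i = 0
    remaining = total_nodes
    while remaining > 0:
        chunk = min(increment, remaining)
        now += chunk * node_cost_us      # busy cleaning, nothing is answered
        remaining -= chunk
        while i < len(pending) and pending[i] <= now:
            waits.append(now - pending[i])
            i += 1
    # Queries arriving after the cleaning pass are answered immediately.
    waits.extend(0 for _ in pending[i:])
    return waits


queries = list(range(0, 10_000_000, 100_000))   # one query every 100 ms
for increment in (1000, 200):
    waits = simulate(10_000, increment, 1_000, queries)
    print(increment, "nodes per chunk -> worst wait", max(waits) / 1e6, "s")
```

In this model the total cleaning work is identical for both settings; only the longest uninterrupted stall changes, and it scales with the chunk size. Whether that translates into fewer dropped packets on a real server depends on factors the model leaves out.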

> Operating system choices aside, is there any reason to expect that moving to a
> threaded nameserver would help overcome this issue?

Yes, at least in theory, because with multiple threads (on multiple
processors) additional threads can continue accepting (and responding
to) queries while another thread is cleaning the cache. However,
I'd still not recommend enabling threads for the combination of
FreeBSD and BIND9. The overhead for the threads in this combination
is so bad that I suspect the possible benefit does not outweigh the
performance penalty.

For other OSes (with multiple processors), it may be possible to
mitigate this problem with multiple threads.

Nod

Jun 22, 2005, 7:40:57 PM

On Thu, 23 Jun 2005 03:55:41 +0900, JINMEI Tatuya / 神明達哉
<jin...@isl.rdc.toshiba.co.jp> wrote:

>>>>>> On Wed, 22 Jun 2005 03:28:13 GMT,
>>>>>> no...@nospam.none (Nod) said:
>
>> Just upgraded from 9.3.1rc1 to 9.3.1. Unfortunately, the cache cleaning still
>> results in no answer from the nameserver until it's done.
>
>Hmm...just to be sure, did you also enable
>ISC_MEM_USE_INTERNAL_MALLOC? (I'm dubious about whether this takes a
>dominant role for this particular issue, though).

No, we didn't use this particular option.

>
>Also, didn't you get *any* responses during the cache cleaning? Or
>did you still see some responses (as well as no-answers)? In my
>understanding, named should still be able to respond to queries even
>during the cache cleaning while it can drop some of the queries due to
>the additional cleaning task.

The server would respond to an 'rndc status', and showed about 30-40 recursive
clients. For a server normally serving 600+, I'd consider this to be
non-responsive. If it was answering DNS requests, I couldn't tell.

>
>If you can see some responses, one additional possibility of tuning is
>to reduce the load of each 'chunk' of the whole cleaning work. You
>can do this by modifying the DNS_CACHE_CLEANERINCREMENT macro at line
>48 of bind-9.3.1/lib/dns/cache.c:
>
>#define DNS_CACHE_CLEANERINCREMENT 1000 /* Number of nodes. */
>
>I've heard a report that changing this value to 200 could eliminate the
>packet loss in some environments. (Unfortunately, there is no other
>way than modifying the source code to tune this value at the moment).

I'll look into it; however, it seems like the cleaning method itself is
intrinsically flawed. As I see a cache, once new data comes into a full cache,
the old data should get 'pushed' out, instead of a periodic (garbage?) cleanup. If
I'm misunderstanding this, feel free to correct me.

>
>> Operating system choices aside, is there any reason to expect that moving to a
>> threaded nameserver would help overcome this issue?
>
>Yes, at least in theory, because with multiple threads (on multiple
>processors) additional threads can continue accepting (and responding
>to) queries while another thread is cleaning the cache. However,
>I'd still not recommend enabling threads for the combination of
>FreeBSD and BIND9. The overhead for the threads in this combination
>is so bad that I suspect the possible benefit does not outweigh the
>performance penalty.
>
>For other OSes (with multiple processors), it may be possible to
>mitigate this problem with multiple threads.
>
> JINMEI, Tatuya
> Communication Platform Lab.
> Corporate R&D Center, Toshiba Corp.
> jin...@isl.rdc.toshiba.co.jp

Linux is a possibility, but I'd hesitate to make such a drastic change for a
problem that's going to be the same. Multiple threads or not, the cache-cleaning
method just seems inefficient.

Niall O'Reilly

Jun 23, 2005, 5:57:33 AM

On 22 Jun 2005, at 19:55, JINMEI Tatuya / 神明達哉 wrote:

> I'd still not recommend enabling threads for the combination of
> FreeBSD and BIND9. The overhead for the threads in this combination
> is so bad that I suspect the possible benefit does not outweigh the
> performance penalty.
>
> For other OSes (with multiple processors), it may be possible to
> mitigate this problem with multiple threads.

Could people with information (positive or negative) on other OSs
please let me know, and I'll summarize to the list.

Best regards,

Niall O'Reilly

PGP key ID: AE995ED9 (see www.pgp.net)
Fingerprint: 23DC C6DE 8874 2432 2BE0 3905 7987 E48D AE99 5ED9

JINMEI Tatuya / 神明達哉

Jun 23, 2005, 3:33:06 PM

>>>>> On Wed, 22 Jun 2005 23:40:57 GMT,
>>>>> no...@nospam.none (Nod) said:

>> Also, didn't you get *any* responses during the cache cleaning? Or
>> did you still see some responses (as well as no-answers)? In my
>> understanding, named should still be able to respond to queries even
>> during the cache cleaning while it can drop some of the queries due to
>> the additional cleaning task.

> The server would respond to an 'rndc status', and showed about 30-40 recursive
> clients. For a server normally serving 600+, I'd consider this to be
> non-responsive. If it was answering DNS requests, I couldn't tell.

Okay, so what's actually happening is:

- if you execute 'rndc status' during the cleaning period, 'recursive
clients' are 30-40.
- if you execute 'rndc status' at other timing, 'recursive clients'
are usually over 600.

Is my understanding correct?
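
A comparison like this could be automated by capturing the output of 'rndc status' and pulling out the recursive client count. The sketch below is hedged accordingly: it assumes the output contains a line of the form "recursive clients: 35/1000" (current count first), the function names are invented here, and the 10% threshold is an arbitrary choice.

```python
import re

# Assumed line format: "recursive clients: <current>/<limit>[/<limit>]".
_RECURSIVE = re.compile(
    r"^recursive clients:[ \t]*(\d+)(?:/\d+)*[ \t]*$", re.MULTILINE
)


def recursive_clients(status_text: str):
    """Return the current recursive client count, or None if not found."""
    match = _RECURSIVE.search(status_text)
    return int(match.group(1)) if match else None


def looks_stalled(current: int, baseline: int, ratio: float = 0.1) -> bool:
    """True when the count has dropped below `ratio` of the usual level."""
    return current < baseline * ratio


# Illustrative text only, not captured from a real server.
sample = "number of zones: 11000\nrecursive clients: 35/1000\n"
count = recursive_clients(sample)
print(count, looks_stalled(count, baseline=600))
```

With the figures from this thread (30-40 clients against a usual 600+), the check flags the server as stalled; the text fed to `recursive_clients` would come from running 'rndc status' at intervals during and outside a cleaning pass.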

> I'll look into it, however it seems like the cleaning method itself is
> intrinsically flawed. As I see a cache, once new data comes into a full cache,
> the old data should get 'pushed' out, instead of a periodic (garbage?) cleanup. If
> I'm misunderstanding this, feel free to correct.

That depends on the meaning of 'full cache', 'new data' and 'the old
data'. When you specify max-cache-size, a busy cache will eventually
be 'full' in that the memory for the cache reaches the specified
maximum. Then some of the existing entries in the cache will be
removed ('pushed out') so that the cache can contain new entries with
the memory consumption under the maximum. This procedure runs
separately from the periodic cleaning, so we could say it happens
'instead of a periodic cleanup'.
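
The general idea of pushing entries out on insert, rather than sweeping periodically, can be sketched as a size-bounded cache. Note the assumptions: this example evicts the least recently used entry first, which is a choice made for illustration and not a description of which entries named actually removes; the class and method names are invented here.

```python
from collections import OrderedDict


class BoundedCache:
    """Toy cache that evicts on insert once a byte budget is exceeded."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self._entries = OrderedDict()   # name -> (value, size), oldest first

    def put(self, name, value, size: int) -> None:
        # Replacing an entry first releases the space it held.
        if name in self._entries:
            self.used -= self._entries.pop(name)[1]
        self._entries[name] = (value, size)
        self.used += size
        # Push out least recently used entries until back under the budget,
        # always keeping the entry that was just stored.
        while self.used > self.max_bytes and len(self._entries) > 1:
            _, (_, old_size) = self._entries.popitem(last=False)
            self.used -= old_size

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        self._entries.move_to_end(name)   # mark as recently used
        return entry[0]


cache = BoundedCache(max_bytes=100)
cache.put("a.example", "192.0.2.1", 40)
cache.put("b.example", "192.0.2.2", 40)
cache.get("a.example")                    # a.example is now the most recent
cache.put("c.example", "192.0.2.3", 40)   # over budget: b.example is evicted
```

The cost of each insert stays small and bounded, which is the property being asked for above; the trade-off is that a very small budget forces constant eviction and re-fetching.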

So, if you can live with a moderate setting of max-cache-size (I'm
afraid 16 megs are way too small for a busy cache server), that would
be the solution for you. (I guess in this case you don't have to set
the cleaning interval to such a small value as 1 minute). Relying on
a small max-cache-size has a different drawback, though: it tends to
make the CPU busy as you saw.

Whether or not the cleaning method is intrinsically flawed (even if it
is, ISC would need time to fix it and we'd need a workaround in the
meantime), I'm still interested in how the change of
DNS_CACHE_CLEANERINCREMENT works. So it would be great if you could
give it a try. You might also want to modify the cleaning interval to
a smaller value (say, 20 minutes). A moderate combination of
DNS_CACHE_CLEANERINCREMENT and the cleaning interval may reasonably
mitigate the inactivity.

Nod

Jun 24, 2005, 8:00:36 AM

On Fri, 24 Jun 2005 04:33:06 +0900, JINMEI Tatuya / 神明達哉
<jin...@isl.rdc.toshiba.co.jp> wrote:

>>>>>> On Wed, 22 Jun 2005 23:40:57 GMT,
>>>>>> no...@nospam.none (Nod) said:
>
>>> Also, didn't you get *any* responses during the cache cleaning? Or
>>> did you still see some responses (as well as no-answers)? In my
>>> understanding, named should still be able to respond to queries even
>>> during the cache cleaning while it can drop some of the queries due to
>>> the additional cleaning task.
>
>> The server would respond to an 'rndc status', and showed about 30-40 recursive
>> clients. For a server normally serving 600+, I'd consider this to be
>> non-responsive. If it was answering DNS requests, I couldn't tell.
>
>Okay, so what's actually happening is:
>
>- if you execute 'rndc status' during the cleaning period, 'recursive
> clients' are 30-40.
>- if you execute 'rndc status' at other timing, 'recursive clients'
> are usually over 600.
>
>Is my understanding correct?

Correct.

>> I'll look into it, however it seems like the cleaning method itself is
>> intrinsically flawed. As I see a cache, once new data comes into a full cache,
>> the old data should get 'pushed' out, instead of a periodic (garbage?) cleanup. If
>> I'm misunderstanding this, feel free to correct.
>
>That depends on the meaning of 'full cache', 'new data' and 'the old
>data'. When you specify max-cache-size, a busy cache will eventually
>be 'full' in that the memory for the cache reaches the specified
>maximum. Then some of the existing entries in the cache will be
>removed ('pushed out') so that the cache can contain new entries with
>the memory consumption under the maximum. This procedure runs
>separately from the periodic cleaning, so we could say it happens
>'instead of a periodic cleanup'.

It appears that when the cache is full, the cleaner runs anyway. I didn't see a
separate log entry that looked any different.

>
>So, if you can live with a moderate setting of max-cache-size (I'm
>afraid 16 megs are way too small for a busy cache server), that would
>be the solution for you. (I guess in this case you don't have to set
>the cleaning interval to such a small value as 1 minute). Relying on
>a small max-cache-size has a different drawback, though: it tends to
>make the CPU busy as you saw.

Sure does. 16 megs is way too small, but it does work around the lockups.

>
>Whether or not the cleaning method is intrinsically flawed (even if it
>is, ISC would need time to fix it and we'd need a workaround in the
>meantime), I'm still interested in how the change of
>DNS_CACHE_CLEANERINCREMENT works. So it would be great if you could
>give it a try. You might also want to modify the cleaning interval to
>a smaller value (say, 20 minutes). A moderate combination of
>DNS_CACHE_CLEANERINCREMENT and the cleaning interval may reasonably
>mitigate the inactivity.
>
> JINMEI, Tatuya
> Communication Platform Lab.
> Corporate R&D Center, Toshiba Corp.
> jin...@isl.rdc.toshiba.co.jp
>

I'll take a look.