BIND 9.3.2 on FreeBSD 6.1 out of control

patrick

Jan 2, 2007, 1:45:58 PM

I'm running BIND 9.3.2 on FreeBSD 6.1, and am noticing that it gets
out of control after running for a while.

  PID UID THR PRI NICE  SIZE   RES STATE   TIME   WCPU COMMAND
60480  53   1 132    0  195M  194M RUN    41.7H 75.54% named

After restarting it, its CPU usage goes back down to what it should
be, as does its memory usage. I really don't want to babysit this
process, so I'm trying to find the cause of this. I have
"max-cache-size" set to "150M" because, before I turned this on, the
process would just grow and grow until it hit FreeBSD's limit and
stop responding altogether, not to mention eating up as much
CPU time as it could.

I never had this problem at all with BIND 8, and am wondering if
there's something I'm doing wrong with BIND 9 to have this problem?
Has anyone else experienced this? On a related note, is there any way
to have BIND 9 keep its memory usage much lower automatically without
having to set an arbitrary/explicit max-cache-size? (i.e., have it clean
up its own cache for entries that haven't been used after a certain
period of time?)

Any help would be greatly appreciated.

Patrick


Mark Andrews

Jan 2, 2007, 5:59:38 PM

> I'm running BIND 9.3.2 on FreeBSD 6.1, and am noticing that it gets
> out of control after running for a while.
>
>   PID UID THR PRI NICE  SIZE   RES STATE   TIME   WCPU COMMAND
> 60480  53   1 132    0  195M  194M RUN    41.7H 75.54% named
>
> After restarting it, its CPU usage goes back down to what it should
> be, as does its memory usage. I really don't want to babysit this
> process, so I'm trying to find the cause of this. I have
> "max-cache-size" set to "150M" because, before I turned this on, the
> process would just grow and grow until it hit FreeBSD's limit and
> stop responding altogether, not to mention eating up as much
> CPU time as it could.
>
> I never had this problem at all with BIND 8, and am wondering if
> there's something I'm doing wrong with BIND 9 to have this problem?

BIND 8 has a smaller memory footprint.

> Has anyone else experienced this? On a related note, is there any way
> to have BIND 9 keep its memory usage much lower automatically without
> having to set an arbitrary/explicit max-cache-size? (ie. Have it clean
> up its own cache for entries that haven't been used after a certain
> period of time?)

It does do that, based on TTL. max-cache-size is an additional
constraint. You can also adjust maximum retention times using
max-cache-ttl and max-ncache-ttl.

195M with a 150M max-cache-size sounds about right. You have
150M of active cache and some overhead.
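
The TTL-based behaviour described in this reply can be pictured with a small toy model. This is only a sketch under stated assumptions, not named's implementation: the class `TtlCache`, its `max_ttl` parameter (standing in for max-cache-ttl, in seconds) and the injectable `clock` are all hypothetical names chosen for illustration.

```python
import time


class TtlCache:
    """Toy cache: entries expire by TTL; max_ttl caps retention (hypothetical names)."""

    def __init__(self, max_ttl, clock=time.monotonic):
        self.max_ttl = max_ttl      # plays the role of max-cache-ttl, in seconds
        self.clock = clock          # injectable clock so the sketch can be tested
        self.entries = {}           # name -> (value, absolute expiry time)

    def put(self, name, value, ttl):
        # Clamp the record's own TTL to the configured ceiling.
        effective = min(ttl, self.max_ttl)
        self.entries[name] = (value, self.clock() + effective)

    def get(self, name):
        # An expired entry is dropped as a side effect of the lookup.
        item = self.entries.get(name)
        if item is None:
            return None
        value, expires = item
        if self.clock() >= expires:
            del self.entries[name]
            return None
        return value

    def sweep(self):
        # Periodic cleaning pass: remove everything that has expired.
        now = self.clock()
        expired = [n for n, (_, exp) in self.entries.items() if now >= exp]
        for n in expired:
            del self.entries[n]
        return len(expired)
```

In this model a lower `max_ttl` shortens how long any record can stay cached, which limits growth without setting a hard size limit.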



> Any help would be greatly appreciated.
>
> Patrick
>
>

--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: Mark_A...@isc.org


patrick

Jan 3, 2007, 2:21:03 PM

But when it reaches that limit, is there any reason why the named
process starts eating up all CPU time? The memory size, I can handle
(and control) -- it's the unexplained CPU usage that concerns me.

This server is a master for 143 domains, and will likely take on many
more. Is there a way I can see what's in the cache? I'd just like to
get an idea of how much memory each record takes up so I can do some
math for future planning on resource requirements.

Patrick

patrick

Jan 7, 2007, 5:42:49 PM

So is no one else experiencing this sudden surge in CPU usage when
BIND 9 hits its max cache size?

Mark Andrews

Jan 7, 2007, 7:02:05 PM

> So is no one else experiencing this sudden surge in CPU usage when
> BIND 9 hits its max cache size?

When the cache hits its max size it triggers a cleaning
pass which randomly removes 25% of the RRsets in the cache
as well as cleaning those that have expired. That will
take some CPU time.

At some point it would be nice to make the overmemory removal
be LRU based rather than random.
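
The difference between the two policies can be made concrete with a toy comparison. This is only a sketch: the function names are made up, the cache is a plain dict keyed by name, `last_used` is a hypothetical map of last-access timestamps, and none of it reflects how named actually implements cleaning.

```python
import random


def random_purge(cache, fraction=0.25, rng=random):
    """Remove a random `fraction` of the keys from `cache` (a dict), in place."""
    victims = rng.sample(sorted(cache), int(len(cache) * fraction))
    for name in victims:
        del cache[name]
    return victims


def lru_purge(cache, last_used, fraction=0.25):
    """Remove the `fraction` of keys with the oldest last-use timestamps."""
    count = int(len(cache) * fraction)
    victims = sorted(cache, key=lambda name: last_used[name])[:count]
    for name in victims:
        del cache[name]
    return victims
```

With random removal a frequently used RRset is as likely to be dropped as one that is never asked for again, so popular names may have to be fetched again soon after a purge; the LRU variant drops the least recently used entries first.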

Mark

> On 1/3/07, patrick <gibbl...@gmail.com> wrote:
> > But when it reaches that limit, is there any reason why the named
> > process starts eating up all CPU time? The memory size, I can handle
> > (and control) -- it's the unexplained CPU usage that concerns me.
> >
> > This server is a master for 143 domains, and will likely take on many
> > more. Is there a way I can see what's in the cache? I'd just like to
> > get an idea of how much memory each record takes up so I can do some
> > math for future planning on resource requirements.
> >
> > Patrick

bert hubert

Jan 8, 2007, 4:23:40 AM

On Mon, Jan 08, 2007 at 11:02:05AM +1100, Mark Andrews wrote:
> When the cache hits its max size it triggers a cleaning
> pass which randomly removes 25% of the RRsets in the cache
> as well as cleaning those that have expired. That will
> take some CPU time.
>
> At some point it would be nice to make the overmemory removal
> be LRU based rather than random.

I can tell from first-hand experience that LRU-based removal works very
well; it is much advised.

PowerDNS doesn't go all the way, but simply shifts a cache entry to the back
of the 'to delete queue' whenever it is accessed.
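
The "move to the back of the delete queue on access" idea can be sketched as follows. This is a minimal illustration, not PowerDNS code: the class name `DeleteQueueCache` and the `max_entries` limit are assumptions made for the example.

```python
from collections import OrderedDict


class DeleteQueueCache:
    """Toy cache whose iteration order doubles as the 'to delete' queue."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # front = next entry to be deleted

    def get(self, name):
        if name not in self.entries:
            return None
        # An access shifts the entry to the back of the delete queue.
        self.entries.move_to_end(name)
        return self.entries[name]

    def put(self, name, value):
        self.entries[name] = value
        self.entries.move_to_end(name)
        # Over the limit: delete from the front of the queue.
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)
```

Because each access is a constant-time reorder, recently used entries survive a purge without the cache needing to track exact access times.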

Bert

--
http://www.PowerDNS.com Open source, database driven DNS Software
http://netherlabs.nl Open and Closed source services


patrick

Jan 8, 2007, 3:51:05 PM

Seems to me like this shouldn't take all that long though, should it?
What's happening for me is that when it hits max-cache-size, it
consumes the CPU indefinitely. The only way to get it to settle down
is to kill the process and restart it.

Out of curiosity, do most people set max-cache-size? If not, how is
BIND not indefinitely growing? If they do, how big is it usually set?
I'm getting the impression that most people are not experiencing the
same problems as me, but I have no idea what I could be doing wrong
because I'm not sure how everyone else has configured their servers.
My configuration file hasn't changed too much since upgrading from
BIND 8.

Patrick

On 1/7/07, Mark Andrews <Mark_A...@isc.org> wrote:
>
> > So is no one else experiencing this sudden surge in CPU usage when
> > BIND 9 hits its max cache size?
>
> When the cache hits its max size it triggers a cleaning
> pass which randomly removes 25% of the RRsets in the cache
> as well as cleaning those that have expired. That will
> take some CPU time.
>
> At some point it would be nice to make the overmemory removal
> be LRU based rather than random.
>

Eric J. Feldhusen

Jan 8, 2007, 4:43:43 PM

patrick wrote:
> Seems to me like this shouldn't take all that long though, should it?
> What's happening for me is that when it hits max-cache-size, it
> consumes the CPU indefinitely. The only way to get it to settle down
> is to kill the process and restart it.

I can't find the thread, but I think I remember reading something on
the FreeBSD lists about this problem with BIND 9.3.x and that it was a
FreeBSD-specific problem. You might find something on the FreeBSD lists
that could help.


--
Eric Feldhusen
Network Administrator http://www.remc1.org
er...@remc1.org
PO Box 270 (906) 482-4520 x239
809 Hecla St (906) 482-5031 fax
Hancock, MI 49930 (906) 370 6202 mobile


Mark Andrews

Jan 8, 2007, 6:29:39 PM

> Seems to me like this shouldn't take all that long though, should it?
> What's happening for me is that when it hits max-cache-size, it
> consumes the CPU indefinitely. The only way to get it to settle down
> is to kill the process and restart it.
>
> Out of curiosity, do most people set max-cache-size? If not, how is
> BIND not indefinitely growing? If they do, how big is it usually set?
> I'm getting the impression that most people are not experiencing the
> same problems as me, but I have no idea what I could be doing wrong
> because I'm not sure how everyone else has configured their servers.
> My configuration file hasn't changed too much since upgrading from
> BIND 8.
>
> Patrick

Every record has a TTL. Records are removed once that time
has expired either as a side effect of the lookup or in the
regular cleaning sweep. This is no different to BIND 8.

You could be seeing this bug, which is fixed in BIND 9.4.0rc1:

2116. [bug] 'rndc reload' could cause the cache to continually
be cleaned. [RT #16401]

If 'rndc reload/reconfig' (not rndc reload zone) was run
while named was cleaning due to memory constraints, the
cleaning could run continually.

If you are not running 'rndc reload', the cleaning should stop.

Mark

Thomas Schulz

Jan 9, 2007, 11:32:39 AM

In article <enub7h$2v57$1...@sf1.isc.org>, patrick <gibbl...@gmail.com> wrote:
>Seems to me like this shouldn't take all that long though, should it?
>What's happening for me is that when it hits max-cache-size, it
>consumes the CPU indefinitely. The only way to get it to settle down
>is to kill the process and restart it.
>
>Out of curiosity, do most people set max-cache-size? If not, how is
>BIND not indefinitely growing? If they do, how big is it usually set?
>I'm getting the impression that most people are not experiencing the
>same problems as me, but I have no idea what I could be doing wrong
>because I'm not sure how everyone else has configured their servers.
>My configuration file hasn't changed too much since upgrading from
>BIND 8.

I think that many, if not most, people do not set max-cache-size. If you
have enough memory and have limits set high enough that they are never
hit, BIND will reach a high but stable memory usage after running for
several days.

>Patrick
>
>On 1/7/07, Mark Andrews <Mark_A...@isc.org> wrote:
>>
>> > So is no one else experiencing this sudden surge in CPU usage when
>> > BIND 9 hits its max cache size?
>>
>> When the cache hits its max size it triggers a cleaning
>> pass which randomly removes 25% of the RRsets in the cache
>> as well as cleaning those that have expired. That will
>> take some CPU time.
>>
>> At some point it would be nice to make the overmemory removal
>> be LRU based rather than random.
>>
>> Mark
>>
>> > On 1/3/07, patrick <gibbl...@gmail.com> wrote:
>> > > But when it reaches that limit, is there any reason why the named
>> > > process starts eating up all CPU time? The memory size, I can handle
>> > > (and control) -- it's the unexplained CPU usage that concerns me.
>> > >
>> > > This server is a master for 143 domains, and will likely take on many
>> > > more. Is there a way I can see what's in the cache? I'd just like to
>> > > get an idea of how much memory each record takes up so I can do some
>> > > math for future planning on resource requirements.
>> > >
>> > > Patrick
>> --
>> Mark Andrews, ISC
>> 1 Seymour St., Dundas Valley, NSW 2117, Australia
>> PHONE: +61 2 9871 4742 INTERNET: Mark_A...@isc.org
>>
>
>


--
Tom Schulz
sch...@adi.com


Gushi

Jan 12, 2007, 2:38:58 PM

last pid: 75482;  load averages:  1.07,  1.02,  1.00    up 8+21:17:23  14:40:02
27 processes:  2 running, 25 sleeping
CPU states: 85.6% user,  0.0% nice, 12.5% system,  1.9% interrupt,  0.0% idle
Mem: 169M Active, 186M Inact, 104M Wired, 112M Buf, 1544M Free
Swap: 967M Total, 967M Free

  PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
  303 bind     125    0  162M  162M RUN    38.0H 93.41% 93.41% named

The above is with max-cache-size 2000000;

There is still a problem here.
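
For comparing the two max-cache-size values that appear in this thread, a small conversion helper may be useful. It is a sketch that assumes a bare number is interpreted as bytes and that the K/M/G suffixes are powers of 1024; the function name `size_spec_to_bytes` is hypothetical and not part of BIND.

```python
def size_spec_to_bytes(spec):
    """Convert a size such as '150M' or '2000000' to a byte count.

    Assumes a bare number means bytes and that K/M/G are powers of 1024.
    """
    multipliers = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = str(spec).strip().lower()
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


for value in ("150M", "2000000"):
    size = size_spec_to_bytes(value)
    print(f"{value}: {size} bytes = {size / 1024 ** 2:.1f} MiB")
```

Under those assumptions, 2000000 works out to roughly 1.9 MiB, far smaller than 150M, so such a cache would reach its limit very quickly. Whether that explains the CPU usage shown above is not established in the thread.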

-Dan

JINMEI Tatuya / 神明達哉

Jan 15, 2007, 7:16:49 AM

>>>>> On Mon, 8 Jan 2007 12:51:05 -0800,
>>>>> patrick <gibbl...@gmail.com> said:

> Seems to me like this shouldn't take all that long though, should it?
> What's happening for me is that when it hits max-cache-size, it
> consumes the CPU indefinitely. The only way to get it to settle down
> is to kill the process and restart it.

> Out of curiosity, do most people set max-cache-size? If not, how is
> BIND not indefinitely growing? If they do, how big is it usually set?
> I'm getting the impression that most people are not experiencing the
> same problems as me, but I have no idea what I could be doing wrong
> because I'm not sure how everyone else has configured their servers.
> My configuration file hasn't changed too much since upgrading from
> BIND 8.

I've often heard this type of symptom, especially on FreeBSD. One
common cause is that BIND9's cache cleaning (prior to 9.4) relies on
the system's malloc()/free() routines, which could run inefficiently on
FreeBSD (I'm not really sure if it's still the case for 6.1, but it
was at least for some 4.X versions); and one common workaround is to
enable BIND9's internal memory allocator by rebuilding BIND9 with:

% ./configure STD_CDEFINES='-DISC_MEM_USE_INTERNAL_MALLOC=1'

This should be very effective, particularly for the case of cache
cleaning, because it will most likely recycle pre-allocated memory
fragments without involving any system calls or possibly inefficient
standard libraries.

Notes:
- enabling the internal allocator may cause another performance
  problem if you also enable threads on a multi-processor machine
  (although it wouldn't matter much for FreeBSD because FreeBSD's
  thread support is not really friendly with BIND9 wrt performance,
  at least before 7.0)

- BIND 9.4 enables the internal allocator by default and removes the
  possible bottlenecks when using it with threads (and it works well
  with FreeBSD threads). So if you want to enable threads on a
  multi-processor server, you may want to try 9.4; however, 9.4's
  better support for threads requires a somewhat larger memory
  footprint, which is another tradeoff.

JINMEI, Tatuya
Communication Platform Lab.
Corporate R&D Center, Toshiba Corp.
jin...@isl.rdc.toshiba.co.jp


Matthew Schlosser

Jan 15, 2007, 5:17:27 PM

> I've often heard this type of symptom, especially on FreeBSD. One
> common cause is that BIND9's cache cleaning (prior to 9.4) relies on
> the system's malloc()/free() routines, which could run inefficiently on
> FreeBSD (I'm not really sure if it's still the case for 6.1, but it
> was at least for some 4.X versions); and one common workaround is to
> enable BIND9's internal memory allocator by rebuilding BIND9 with:
>
> % ./configure STD_CDEFINES='-DISC_MEM_USE_INTERNAL_MALLOC=1'
>
> This should be very effective, particularly for the case of cache
> cleaning, because it will most likely recycle pre-allocated memory
> fragments without involving any system calls or possibly inefficient
> standard libraries.
>
[snip]

Could this affect the problem I'm having (see message "Bind9 Crazy-high CPU
on Linux")? The symptoms are similar, but not the same.


JINMEI Tatuya / 神明達哉

Jan 15, 2007, 9:56:11 PM

>>>>> On Mon, 15 Jan 2007 16:17:27 -0600,
>>>>> "Matthew Schlosser" <mschl...@eschelon.com> said:

>> I've often heard this type of symptom, especially on FreeBSD. One
>> common cause is that BIND9's cache cleaning (prior to 9.4) relies on
>> the system's malloc()/free() routines, which could run inefficiently on
>> FreeBSD (I'm not really sure if it's still the case for 6.1, but it
>> was at least for some 4.X versions); and one common workaround is to
>> enable BIND9's internal memory allocator by rebuilding BIND9 with:
>>
>> % ./configure STD_CDEFINES='-DISC_MEM_USE_INTERNAL_MALLOC=1'
>>
>> This should be very effective, particularly for the case of cache
>> cleaning, because it will most likely recycle pre-allocated memory
>> fragments without involving any system calls or possibly inefficient
>> standard libraries.

> Could this affect the problem I'm having (see message "Bind9 Crazy-high CPU
> on Linux")? The symptoms are similar, but not the same.

Perhaps, but it also depends on the details of the problem and of your
environment, especially the memory usage when the problem occurs and
whether you specify max-cache-size (and, if so, the value of the
option).

It should also be noted that even if the use of the internal allocator
mitigates the problem it doesn't change the fact that cache cleaning
is a heavy task. So, while cleaning the cache, the CPU will still be
pretty busy (the busy period will probably be shortened), and some
incoming queries may still be dropped during the period. BIND 9.4
remedies the query drop issue further, by adjusting non-interruptible
cleaning periods (during which queries aren't processed) based on
average query rate.

In summary,

- the internal allocator on 9.3 may help, but the CPU will still be
  busy (though the busy period may be shorter) and queries may still
  be dropped during the busy period.
- 9.4 may also help, and hopefully there will be no query drops;
  however, the CPU will still be busy for some shorter period.

Gushi

Jan 16, 2007, 11:30:25 PM

I'll try this once BIND 9.4 settles into ports. Jinmei, thanks for all
the advice.

-Dan
