maximum number of FD events (64) received

Samer Khattab

Sep 27, 2010, 5:27:01 AM

to bind-...@lists.isc.org

Hi all,

I'm using BIND as a caching name server handling around 2000 requests per second, and recently the following messages have been showing up from time to time in general.log:

27-Sep-2010 10:45:47.639 sockmgr 0x2ad7af2f5010: maximum number of FD events (64) received
27-Sep-2010 10:45:47.872 sockmgr 0x2ad7af2f5010: maximum number of FD events (64) received

BIND 9.7.1-P2
RHEL 5.5 kernel 2.6.18-194.11.3.el5

What do these messages mean? Are they related to the system file descriptors?

Sergey V. Lobanov

Sep 27, 2010, 9:42:54 AM

to bind-...@lists.isc.org

Reconfigure BIND like this:

STD_CDEFINES='-DISC_SOCKET_MAXEVENTS=256' ./configure --your-options

Then recompile.

> _______________________________________________
> bind-users mailing list
> bind-...@lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind-users

--
wbr,
Sergey V. Lobanov

Samer Khattab

Sep 27, 2010, 10:16:17 AM

to ser...@lobanov.in, bind-...@lists.isc.org

Thanks Sergey,

I'd like to know one more thing, if you can help me.

Can this condition cause timeouts? Does it have any impact on performance?

JINMEI Tatuya / 神明達哉

Sep 28, 2010, 2:46:53 AM

to Samer Khattab, bind-...@lists.isc.org

At Mon, 27 Sep 2010 13:27:01 +0400,
Samer Khattab <skha...@gmail.com> wrote:

> I'm using Bind as a caching name server and serving around 2000 req per
> second, and recently have the following messages showing up from time to
> time in the general.log.
>
> 27-Sep-2010 10:45:47.639 sockmgr 0x2ad7af2f5010: maximum number of FD events
> (64) received
> 27-Sep-2010 10:45:47.872 sockmgr 0x2ad7af2f5010: maximum number of FD events
> (64) received
>
> BIND BIND 9.7.1-P2
> RHEL 5.5 kernel 2.6.18-194.11.3.el5
>
> What is the meaning of these messages ? Are they related to the system file
> descriptors ?

These logs are not (directly) related to file descriptors. They mean
epoll returned more socket events than the implementation normally
expects (which is 64). This is not necessarily an error because the
remaining events will be returned with the next call to epoll_wait().
However, the event loop should generally run pretty quickly, so it's
still an unexpected situation.
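
The batching behaviour described above can be reproduced outside BIND. The
following is only a minimal sketch, assuming Linux and Python's select.epoll;
it is not BIND's socket manager code. The names drain_in_batches and
READY_FDS are made up for illustration, and the printed text merely imitates
the log message when a wait call returns exactly the cap.

```python
import os
import select

MAX_EVENTS = 64   # mirrors the cap quoted in the log message
READY_FDS = 100   # hypothetical burst size, chosen only for illustration


def drain_in_batches(ready_fds=READY_FDS, max_events=MAX_EVENTS):
    """Return the number of events delivered by each epoll wait call."""
    pipes = [os.pipe() for _ in range(ready_fds)]
    ep = select.epoll()
    batch_sizes = []
    try:
        for r, w in pipes:
            ep.register(r, select.EPOLLIN)
            os.write(w, b"x")  # make every read end ready at the same time
        while True:
            # timeout 0: return immediately; at most max_events per call
            events = ep.poll(0, max_events)
            if not events:
                break
            if len(events) == max_events:
                # Imitation of the message; the real check lives in BIND.
                print(f"maximum number of FD events ({max_events}) received")
            for fd, _mask in events:
                os.read(fd, 1)  # consume the byte so the fd is no longer ready
            batch_sizes.append(len(events))
    finally:
        ep.close()
        for r, w in pipes:
            os.close(r)
            os.close(w)
    return batch_sizes


if __name__ == "__main__":
    print(drain_in_batches())
```

With more ready descriptors than the cap, no single call can return them all,
so the loop needs at least two calls, and nothing is lost in between: the
events left over are simply picked up by the following call.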

You may want to check overall stability of the server, e.g., in terms
of the ratio of server failures (SERVFAIL) that your server returns to
the clients, cache memory footprint, cache hit ratio, number of query
drops (if any), etc. If these are okay and you only see the log
messages occasionally, you can probably ignore them.
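
To judge whether the messages really are occasional, one option is to count
them per minute. This is a rough sketch that assumes the timestamp layout
shown in the log lines quoted in this thread; count_per_minute and SAMPLE are
hypothetical names, and the third sample line is invented to show a
non-matching entry.

```python
import re
from collections import Counter

# Captures "DD-Mon-YYYY HH:MM" from lines that carry the sockmgr message.
PATTERN = re.compile(
    r"(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}):\d{2}\.\d+ .*"
    r"maximum number of FD events \(\d+\) received"
)


def count_per_minute(lines):
    """Count 'maximum number of FD events' messages per minute."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


SAMPLE = [
    "27-Sep-2010 10:45:47.639 sockmgr 0x2ad7af2f5010: "
    "maximum number of FD events (64) received",
    "27-Sep-2010 10:45:47.872 sockmgr 0x2ad7af2f5010: "
    "maximum number of FD events (64) received",
    # Hypothetical unrelated line, not taken from any real log.
    "27-Sep-2010 10:46:00.000 some unrelated log entry",
]

if __name__ == "__main__":
    for minute, count in sorted(count_per_minute(SAMPLE).items()):
        print(minute, count)
```

The function accepts any iterable of lines, so an open log file object can be
passed in place of SAMPLE.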

Otherwise, if you use multiple threads on a multi-core machine and you
set max-cache-size to some finite value, you may be hit by a recently
found bug in the cache memory management, which can make a caching
server very busy. (but it's a wild guess: I've personally never seen
this bug trigger the log message in question). This bug will be fixed
in 9.7.2.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.

Dmitry Rybin

Dec 9, 2010, 3:55:03 AM

to bind-...@lists.isc.org

28.09.2010 10:46, JINMEI Tatuya / 神明達哉 wrote:

> These logs are not (directly) related to file descriptors. They mean
> epoll returned more socket events than the implementation normally
> expects (which is 64). This is not necessarily an error because the
> remaining events will be returned with the next call to epoll_wait().
> However, the event loop should generally run pretty quickly, so it's
> still an unexpected situation.
>
> You may want to check overall stability of the server, e.g., in terms
> of the ratio of server failures (SERVFAIL) that your server returns to
> the clients, cache memory footprint, cache hit ratio, number of query
> drops (if any), etc. If these are okay and you only see the log
> messages occasionally, you can probably ignore them.
>
> Otherwise, if you use multiple threads on a multi-core machine and you
> set max-cache-size to some finite value, you may be hit by a recently
> found bug in the cache memory management, which can make a caching
> server very busy. (but it's a wild guess: I've personally never seen
> this bug trigger the log message in question). This bug will be fixed
> in 9.7.2.

I have the same error after upgrading from 9.7.0-P1 to 9.7.2-P2:

Dec 9 11:40:03 thunderball named[13574]: 09-Dec-2010 11:40:03.719
general: info: sockmgr 0x101856f70: maximum number of FD events (64)
received

bind-9.7.2-P3, FreeBSD 8, vanilla source, built with:

$ STD_CDEFINES='-DFD_SETSIZE=16384' ./configure --enable-threads \
    --enable-largefile --enable-atomic --with-libxml2=yes
$ STD_CDEFINES='-DFD_SETSIZE=16384' make

=======================
Solution:
Add STD_CDEFINES='-DISC_SOCKET_MAXEVENTS=256' to both the configure and make commands.

--
Dmitry Rybin
Expert in emergency service recovery
Broadband Access Systems Division
IT Infrastructure Department
VimpelCom Group
Tel: +7(495) 7871000