Operation cancelled Error

Ben

May 23, 2012, 2:14:02 PM

to bind-...@lists.isc.org

Hi,

I am load testing BIND as a caching DNS server. For that I configured
one machine as the client and one as the server. I set up BIND as a
caching DNS server and set recursive-clients to 30000.

While running the load test from the client machine with resperf, I got
many errors in the named.run file, shown below. I checked at that time
and there was no high CPU or memory usage on the server or the client.
Why is the server not permitting these operations?

23-May-2012 23:30:12.085 error (operation canceled) resolving
'www.thethreadexchange.com/AAAA/IN': 192.33.14.30#53
23-May-2012 23:30:12.085 error (operation canceled) resolving
'c2.nstld.net/A/IN': 192.42.93.31#53
23-May-2012 23:30:12.085 error (operation canceled) resolving
'nothirst.com/A/IN': 192.54.112.30#53
23-May-2012 23:30:12.085 error (operation canceled) resolving
'172.153.42.186.in-addr.arpa/PTR/IN': 199.212.0.53#53
23-May-2012 23:30:12.085 error (operation canceled) resolving
'xxy.com/MX/IN': 192.12.94.30#53
23-May-2012 23:30:12.086 error (operation canceled) resolving
'192.140.138.187.in-addr.arpa/PTR/IN': 193.0.9.3#53
23-May-2012 23:30:12.086 error (operation canceled) resolving
'mail.n-u-c.ru/A/IN': 193.232.128.6#53
23-May-2012 23:30:12.086 error (operation canceled) resolving
'www.gayteacher.net/A/IN': 108.59.10.134#53
23-May-2012 23:30:12.086 error (operation canceled) resolving
'www.forever-christies.com/A/IN': 192.12.94.30#53
23-May-2012 23:30:12.086 error (operation canceled) resolving
'166.98.232.189.in-addr.arpa/PTR/IN': 200.3.13.10#53
23-May-2012 23:30:12.086 error (operation canceled) resolving
'89.140.112.200.in-addr.arpa/PTR/IN': 202.12.28.140#53
23-May-2012 23:30:12.086 error (operation canceled) resolving
'9z772drlt.89ys/A/IN': 192.228.79.201#53
23-May-2012 23:30:12.087 error (operation canceled) resolving
'video327.myfreecams.com/A/IN': 192.26.92.30#53
23-May-2012 23:30:12.087 error (operation canceled) resolving
'ns1.thny.bbc.co.uk/A/IN': 194.83.244.131#53
23-May-2012 23:30:12.087 error (operation canceled) resolving
'6.246.26.190.in-addr.arpa/PTR/IN': 200.3.13.10#53
23-May-2012 23:30:12.087 error (operation canceled) resolving
'instagram.com/A/IN': 192.54.112.30#53
23-May-2012 23:30:12.087 error (operation canceled) resolving
'acriacao.com/A/IN': 192.12.94.30#53
23-May-2012 23:30:12.087 error (operation canceled) resolving
'technologie.gazeta.pl/A/IN': 192.203.230.10#53

rndc status shows:

version: 9.7.3-P3-RedHat-9.7.3-8.P3.el6_2.2
CPUs found: 8
worker threads: 8
number of zones: 19
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is ON
recursive clients: 6400/29900/30000
tcp clients: 0/100
server is up and running
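
The three numbers on the recursive clients line are generally read as current clients, soft quota, and hard quota (the recursive-clients setting). A minimal Python sketch, assuming that field order and using a made-up helper name, parse_recursive_clients, shows how far the observed load is from the configured limit:

```python
import re

def parse_recursive_clients(status_text):
    """Return (current, soft, hard) from `rndc status` output, or None."""
    match = re.search(r"recursive clients:\s*(\d+)/(\d+)/(\d+)", status_text)
    if match is None:
        return None
    current, soft, hard = (int(g) for g in match.groups())
    return current, soft, hard

# Sample lines copied from the output above.
status = """\
recursive clients: 6400/29900/30000
tcp clients: 0/100
"""

current, soft, hard = parse_recursive_clients(status)
utilization = current / hard  # fraction of the hard quota in use
print(f"{current} of {hard} recursive clients in use ({utilization:.1%})")
```

With the values shown, roughly a fifth of the configured quota is in use, so the quota itself does not look like the limiting factor in this run.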

I constantly watched the rndc status output, and on the recursive
clients line the first value climbs to at most 6000-6500. Why does it
not go up to the maximum of 30000 that I defined?
rndc status shows 8 worker processes, but when I checked with pgrep
named it shows only a single instance. Should it show 8 instances?
Currently we use BIND as a caching name server, so why does rndc status
show the number of zones as 19?

Kindly guide me to resolve the above confusion.

Bind build info:
named -V
BIND 9.7.3-P3-RedHat-9.7.3-8.P3.el6_2.2 built with
'--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu'
'--target=x86_64-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr'
'--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin'
'--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include'
'--libdir=/usr/lib64' '--libexecdir=/usr/libexec'
'--sharedstatedir=/var/lib' '--mandir=/usr/share/man'
'--infodir=/usr/share/info' '--with-libtool' '--localstatedir=/var'
'--enable-threads' '--enable-ipv6' '--with-pic' '--disable-static'
'--disable-openssl-version-check' '--with-dlz-ldap=yes'
'--with-dlz-postgres=yes' '--with-dlz-mysql=yes'
'--with-dlz-filesystem=yes' '--with-gssapi=yes' '--disable-isc-spnego'
'--with-docbook-xsl=/usr/share/sgml/docbook/xsl-stylesheets'
'build_alias=x86_64-redhat-linux-gnu'
'host_alias=x86_64-redhat-linux-gnu'
'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS= -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
--param=ssp-buffer-size=4 -m64 -mtune=generic' 'CPPFLAGS= -DDIG_SIGCHASE'

From the client machine:

/usr/local/nom/bin/resperf -s 10.115.1.231 -d
/root/dnsperf_test_queries.tsv
DNS Resolution Performance Testing Tool
Nominum Version 2.0.0.0

[Status] Command line: resperf -s 10.115.1.231 -d
/root/dnsperf_test_queries.tsv
[Status] Sending
[Status] Reached 65536 outstanding queries
[Status] Waiting for more responses
[Status] Testing complete

Statistics:

Queries sent: 74038
Queries completed: 74038
Queries lost: 0
Run time (s): 100.000000
Maximum throughput: 2838.000000 qps
Lost at that point: 24.32%

Which configuration parameters are needed to increase the server's QPS?
Please suggest any fine tuning on the BIND or OS side.

Best Regards,
Ben

Ben

May 24, 2012, 7:43:19 AM

to bind-...@lists.isc.org

Hello,

Any reply please...

Regards,
Ben

Jeremy C. Reed

May 24, 2012, 8:54:47 AM

to Ben, bind-...@lists.isc.org

On Thu, 24 May 2012, Ben wrote:

> > version: 9.7.3-P3-RedHat-9.7.3-8.P3.el6_2.2
> > CPUs found: 8
> > worker threads: 8
> > number of zones: 19
> > debug level: 0
> > xfers running: 0
> > xfers deferred: 0
> > soa queries in progress: 0
> > query logging is ON
> > recursive clients: 6400/29900/30000
> > tcp clients: 0/100
> > server is up and running
> >
> >
> > i constanly watch rndc status command , and at recuresive-clients tab ,
> > first values increases maximum up to 6000-6500, why it is not going to
> > maximum which i define 30000..?

I don't know why it never reached the maximum. resperf should try to
scale up to attempting 100,000 questions in its last second. (At 60th
second I think; the final 40 seconds is waiting for responses.) It only
tries 74038 during its total time, but I am not sure what is limiting
it.
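
As a rough check on those numbers, here is a small Python sketch. It assumes resperf ramps linearly from 0 to 100,000 qps over 60 seconds (the figures mentioned above; treat them as assumptions about this run), and the function names are made up for illustration. Under that assumption a full ramp would send about 3 million queries, so 74038 queries sent suggests the ramp stopped early, which may be related to the "Reached 65536 outstanding queries" status line.

```python
import math

MAX_QPS = 100_000      # assumed peak rate at the end of the ramp
RAMP_SECONDS = 60.0    # assumed ramp duration

def queries_sent_by(t):
    """Cumulative queries sent after t seconds of a linear 0 -> MAX_QPS ramp."""
    slope = MAX_QPS / RAMP_SECONDS   # qps gained per second
    return 0.5 * slope * t * t       # area under the ramp (a triangle)

def time_to_send(n):
    """Seconds of ramp needed to send n queries (inverse of queries_sent_by)."""
    slope = MAX_QPS / RAMP_SECONDS
    return math.sqrt(2.0 * n / slope)

full_ramp_total = queries_sent_by(RAMP_SECONDS)     # total for a complete ramp
stop_time = time_to_send(74038)                     # from "Queries sent: 74038"
rate_at_stop = MAX_QPS / RAMP_SECONDS * stop_time   # offered rate at that moment

print(f"full ramp would send {full_ramp_total:,.0f} queries")
print(f"74038 queries are reached after about {stop_time:.1f} s")
print(f"offered rate at that point is about {rate_at_stop:,.0f} qps")
```

This is only arithmetic on an idealized ramp; it does not say what limited the run, only that sending appears to have stopped well before the 60-second mark.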

Maybe your datafile is not unique enough? Maybe your source port range
is not large enough? So then BIND 9 is matching existing requests and
dropping.

It depends a lot on the dataset. (I think I have seen around 17,000
queries with resperf and as low as 236 qps -- in this case it was
depending on number of ACLs.)

I don't know why you have the burst of "operation canceled". (The
ISC_R_CANCELED can happen from different problems.)

> > rndc status shows 8 worker process, when i checked by pgrep named , it
> > shows only single instance.so does it need to show 8 instance or ?

8 worker threads is different than 8 processes.
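
To illustrate the distinction: on Linux, a process's threads are listed under /proc/<pid>/task, while pgrep reports only process IDs. A minimal Python sketch (the helper name count_threads is made up here, and it assumes a Linux-style /proc) starts a few threads and shows that there is still a single process ID while the thread count grows:

```python
import os
import threading

def count_threads(pid):
    """Count threads of a process via /proc/<pid>/task; None if unavailable."""
    task_dir = f"/proc/{pid}/task"
    try:
        return len(os.listdir(task_dir))
    except OSError:
        return None  # not Linux, no such process, or no permission

# Start 8 idle threads that block until the event is set.
stop = threading.Event()
workers = [threading.Thread(target=stop.wait) for _ in range(8)]
for w in workers:
    w.start()

pid = os.getpid()
seen = count_threads(pid)  # main thread plus the 8 workers, on Linux
print(f"pid {pid}: {seen} threads, still one process")

# Release and clean up the worker threads.
stop.set()
for w in workers:
    w.join()
```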

> > Currently we use bind as caching name server , so why rndc status shows
> > number of zones 19..?

The 19 zones are built-in zones. (See the ARM for the list.)

By the way, to set some comparison maximum baseline you can try having
resperf query the built-in zones. (It won't be real recursive work, but
should show you some potential maximum qps.)

Jeremy C. Reed
ISC

Ben

May 25, 2012, 2:13:47 AM

to Jeremy C. Reed, bind-...@lists.isc.org

Hi Jeremy,

Thanks for your kind response.

> On Thu, 24 May 2012, Ben wrote:
>
>>> version: 9.7.3-P3-RedHat-9.7.3-8.P3.el6_2.2
>>> CPUs found: 8
>>> worker threads: 8
>>> number of zones: 19
>>> debug level: 0
>>> xfers running: 0
>>> xfers deferred: 0
>>> soa queries in progress: 0
>>> query logging is ON
>>> recursive clients: 6400/29900/30000
>>> tcp clients: 0/100
>>> server is up and running
>>>
>>>
>>> i constanly watch rndc status command , and at recuresive-clients tab ,
>>> first values increases maximum up to 6000-6500, why it is not going to
>>> maximum which i define 30000..?
> I don't know why it never reached the maximum. resperf should try to
> scale up to attempting 100,000 questions in its last second. (At 60th
> second I think; the final 40 seconds is waiting for responses.) It only
> tries 74038 during its total time, but I am not sure what is limiting
> it.
>
> Maybe your datafile is not unique enough? Maybe your source port range
> is not large enough? So then BIND 9 is matching existing requests and
> dropping.

My source port range is
cat /proc/sys/net/ipv4/ip_local_port_range
1024 65535
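
For reference, the size of that range can be worked out directly. The sketch below (made-up helper name; it assumes the two-number low/high format shown above, with both ends inclusive) counts the ports in the range:

```python
def ephemeral_port_count(range_text):
    """Number of ports in an inclusive 'low high' range string."""
    low, high = (int(part) for part in range_text.split())
    if low > high:
        raise ValueError(f"invalid port range: {range_text!r}")
    return high - low + 1

ports = ephemeral_port_count("1024 65535")
print(ports)  # 64512
```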

I downloaded the data file from the resperf provider's site.

> It depends a lot on the dataset. (I think I have seen around 17,000
> queries with resperf and as low as 236 qps -- in this case it was
> depending on number of ACLs.)

I am not using any additional ACLs for this test.

> I don't know why you have the burst of "operation canceled". (The
> ISC_R_CANCELED can happen from different problems.)

Please tell us what can cause the "operation canceled" errors in the
named.run log file.

>>> rndc status shows 8 worker process, when i checked by pgrep named , it
>>> shows only single instance.so does it need to show 8 instance or ?
> 8 worker threads is different than 8 processes.
>
>>> Currently we use bind as caching name server , so why rndc status shows
>>> number of zones 19..?
> The 19 zones are built-in zones. (See the ARM for the list.)
>
> By the way, to set some comparison maximum baseline you can try having
> resperf query the built-in zones. (It won't be real recursive work, but
> should show you some potential maximum qps.)
>

Is there anything we need to keep in mind regarding OS kernel tuning
parameters or the BIND configuration to achieve higher QPS?

By the way, what is the highest QPS benchmark for BIND on production
servers?

If anyone is getting high QPS with BIND on production servers, kindly
share your input.

> Jeremy C. Reed
> ISC
Regards,
Ben

Ben

May 25, 2012, 8:11:38 AM

to Jeremy C. Reed, bind-...@lists.isc.org

Hi,

I tried everything to avoid the current problem, but it is still the
same. Can we have information on why BIND shows the "operation canceled"
error in the named.run file? And why does BIND not use its full
capacity? When I run the load test while watching rndc status, the
recursive clients value only reaches 6000-6500 and then goes back to 0.

Is there anything remaining to configure in BIND, or is there an issue
in the OS?

Please suggest how I can solve this.

Regards,
Ben

Ben

May 27, 2012, 8:09:38 AM

to Jeremy C. Reed, bind-...@lists.isc.org

Dear ISC Team,

Any suggestions please.

Regards,
Ben

Ben

May 31, 2012, 10:08:21 AM

to Jeremy C. Reed, bind-...@lists.isc.org

Dear ISC Team,

Any input, please? If anything more is needed from my side, kindly let me know.

Best Regards,
Ben