Accelerating response of dns.resolver.query


Peter Václavek

Jan 25, 2013, 8:45:31 AM
to dnspyth...@googlegroups.com
Is there a way to accelerate getting answers from the resolver?
I need to run a script on 200,000 domains fetched from a database.
Every second counts :)



import dns.resolver

resolver = dns.resolver.Resolver()
resolver.lifetime = 1
resolver.timeout = 0.5

def getAAAArecord(domain):
    try:
        answers = resolver.query(domain, 'AAAA')
        return answers[0]
    except dns.exception.DNSException:
        # catch only DNS errors, not a bare except that
        # would also swallow KeyboardInterrupt etc.
        return "N/A"



        print "*** Getting AAAA NS records for", domain
        writeLog("*** Getting AAAA NS records for "+domain)
        try:
            answers = resolver.query(domain, 'NS')
            for ns in answers:
                ip_ns = getAAAArecord(str(ns))
                print ip_ns,' :: ', ns
                writeLog(str(ip_ns)+' :: '+str(ns))
                       
                sql = "INSERT INTO ns (id_domain,ns,ip_ns) VALUES ("+str(id_domain)+",'"+str(ns)+"','"+str(ip_ns)+"')"
                try:
                    cursor.execute(sql)
                    db.commit()
                except Exception:
                    db.rollback()
                       
            print
            writeLog("\n")
        except dns.resolver.NXDOMAIN:
            print "Error: NXDOMAIN - domain does not exist"
            writeLog("Error: NXDOMAIN - domain does not exist")
        except dns.exception.Timeout:
            print "Error: dns.exception.Timeout"
            writeLog("Error: dns.exception.Timeout")
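A side note on the database half of that loop: building the INSERT by string concatenation will break on any domain containing a quote character, and is open to SQL injection. Every DB-API 2.0 driver supports parameterized queries instead. A minimal sketch of the idea, using sqlite3 as a stand-in (the actual driver here appears to be MySQL-based, whose placeholder is `%s` rather than `?`; the `ns` table schema is guessed from the INSERT above):

```python
import sqlite3

# Stand-in for the poster's MySQL connection; sqlite3 follows the same
# DB-API 2.0 interface, only the placeholder syntax differs.
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("CREATE TABLE ns (id_domain INTEGER, ns TEXT, ip_ns TEXT)")

def insert_ns(id_domain, ns, ip_ns):
    # Parameterized query: the driver escapes the values, so a name
    # containing a quote cannot break the statement or inject SQL.
    sql = "INSERT INTO ns (id_domain, ns, ip_ns) VALUES (?, ?, ?)"
    try:
        cursor.execute(sql, (id_domain, ns, ip_ns))
        db.commit()
    except sqlite3.Error:
        db.rollback()

insert_ns(1, "ns1.example.com.", "2001:db8::1")
insert_ns(1, "o'reilly.example.", "N/A")  # embedded quote handled safely
print(cursor.execute("SELECT COUNT(*) FROM ns").fetchone()[0])  # -> 2
```

Besides the safety aspect, letting the driver bind the values also avoids rebuilding the SQL string 200,000 times.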

ajportier

Jan 25, 2013, 10:22:20 AM
to dnspyth...@googlegroups.com
I would look into making your lookup function thread-aware so you aren't single-threading the lookups (so long as there isn't a sequential dependency between transactions). I had a similar problem, and threading my lookups increased performance by 50% with a small pool of 6 simultaneous queries.

Peter Václavek

Jan 25, 2013, 3:55:50 PM
to dnspyth...@googlegroups.com

Would you be so kind as to write me some simple code, with threads, using queries similar to the ones I used? It would make it much easier for me to understand how to write multi-threaded scripts in Python.

ajportier

Jan 25, 2013, 4:17:46 PM
to dnspyth...@googlegroups.com
My script was for checking the return codes of a large list of records against a large list of servers. I tried to strip out the complicated bits and make this just a threaded lookup function. It may not run as-is; read up on the "threading" and "Queue" modules if you have problems.

import sys
import Queue
import threading
import dns.resolver

class ThreadDnsLookup(threading.Thread):
    ''' Thread-aware DNS lookup '''
    
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        # one resolver per worker, reused for every lookup
        resolver = dns.resolver.Resolver()
        while True:
            fqdn = self.queue.get()
            try:
                answers = resolver.query(fqdn, 'A')
                rrset = [rr.address for rr in answers.rrset]
                for rr in rrset:
                    print rr
                # do whatever other sequence dependent stuff here,
                # or save returns to global objects. You shouldn't
                # return from a threaded function because you have no
                # control over when it finishes.
            except dns.resolver.NXDOMAIN:
                print 'NXDOMAIN'
            except dns.resolver.Timeout:
                print 'Timeout'
            except dns.resolver.NoAnswer:
                print 'No Answer'
            
            # indicate that the lookup is complete
            self.queue.task_done()

if __name__ == '__main__':
    threadCount = 6
    fqdn_list = open(sys.argv[1], 'r')

    queue = Queue.Queue()
    for i in range(threadCount):
        t = ThreadDnsLookup(queue)
        t.setDaemon(True)
        t.start()

    # add each fqdn to check to the queue for work            
    for fqdn in fqdn_list:
        queue.put(fqdn.strip())
    
    # wait for all threads to finish    
    queue.join()


Peter Václavek

Jan 25, 2013, 7:10:00 PM
to dnspyth...@googlegroups.com
Many thanks, ajportier!
I have already written some test scripts and they are working, but I need to tune them a little more.

One more question. I made two versions of the script: one with a queue and one without.
I thought the queue was not mandatory, but without it the script was slower than the non-parallelized version. Why is that?
Could the reason be that with a queue the threads start "simultaneously" (they just wait for tasks to be added to the queue), whereas without it the threads end up being processed serially?
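For what it's worth, the usual cause of that slowdown is the start/join pattern: if each thread is joined right after it is started (or one thread is spawned per domain, so thread-creation overhead dominates), the lookups effectively run one at a time. The queue version is fast because a fixed pool of workers stays alive and pulls tasks as they arrive. A sketch of the difference, with the DNS call stubbed out so it runs offline (in the real script the stub would be `resolver.query(domain, 'AAAA')`):

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Stub lookup so this runs without network access; each call pretends
# to be a DNS query that takes 100 ms.
def lookup(domain):
    time.sleep(0.1)
    return domain, "2001:db8::1"

domains = ["d%d.example" % i for i in range(12)]

# Serialized pattern (equivalent to start(); join() per thread): ~1.2 s
start = time.time()
serial = [lookup(d) for d in domains]
serial_time = time.time() - start

# Pool of 6 workers pulling from an internal queue: ~0.2 s
start = time.time()
with ThreadPoolExecutor(max_workers=6) as pool:
    pooled = list(pool.map(lookup, domains))
pooled_time = time.time() - start

print(serial_time > pooled_time)  # the pool keeps 6 lookups in flight
```

`ThreadPoolExecutor` (standard library since Python 3.2; on Python 2 the same pattern is the Queue-plus-worker-threads code above) does internally exactly what the queue version does by hand.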