
asyncio - run coroutine in the background


Frank Millman

Feb 15, 2016, 1:36:00 AM
Hi all

Using asyncio, there are times when I want to execute a coroutine which is
time-consuming. I do not need the result immediately, and I do not want to
block the current task, so I want to run it in the background.

run_in_executor() can run an arbitrary function in the background, but a
coroutine needs an event loop. After some experimenting I came up with
this -

import asyncio

class BackgroundTask:

    async def run(self, coro, args, callback=None):
        loop = asyncio.get_event_loop()
        # schedule task_runner() in the default (thread pool) executor;
        # the returned future is deliberately not awaited, so run()
        # returns as soon as the job has been handed over
        loop.run_in_executor(None, self.task_runner, coro, args, callback)

    def task_runner(self, coro, args, callback):
        # this runs in a worker thread, which needs its own event loop
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

        fut = asyncio.ensure_future(coro(*args))
        if callback is not None:
            fut.add_done_callback(callback)

        loop.run_until_complete(fut)
        loop.close()

Usage -

bg_task = BackgroundTask()
args = (arg1, arg2 ...)
callback = my_callback_function
await bg_task.run(coro, args, callback)

Although the caller 'awaits' bg_task.run(), it returns almost immediately,
as run() merely schedules task_runner() via run_in_executor() and does not
wait for the task to finish.

Hope this is of some interest.

Frank Millman


Marko Rauhamaa

Feb 15, 2016, 1:54:52 AM
"Frank Millman" <fr...@chagford.com>:

> Using asyncio, there are times when I want to execute a coroutine which
> is time-consuming. I do not need the result immediately, and I do not
> want to block the current task, so I want to run it in the background.

You can't run code "in the background" using asyncio. Coroutines perform
cooperative multitasking in a single thread on a single CPU.

Parallel processing requires the use of threads or, often preferably,
processes.

To put it in another way, never run time-consuming code in asyncio.


Marko

Frank Millman

Feb 15, 2016, 2:17:01 AM
"Marko Rauhamaa" wrote in message news:8737sum...@elektro.pacujo.net...
No arguments there.

I started with a task that ran quickly, but as I added stuff it started to
slow down.

The execution of the task involves calling some existing functions, which
are themselves coroutines. As you have noted elsewhere, once you turn one
function into a coroutine, all calls higher up the chain have to be
coroutines as well.

The benefit of my class is that it enables me to take the coroutine and run
it in another thread, without having to re-engineer the whole thing.

Hope this makes sense.

Frank


Marko Rauhamaa

Feb 15, 2016, 2:34:48 AM
"Frank Millman" <fr...@chagford.com>:

> The benefit of my class is that it enables me to take the coroutine
> and run it in another thread, without having to re-engineer the whole
> thing.
>
> Hope this makes sense.

Sure.


Marko

Paul Rubin

Feb 15, 2016, 2:39:20 AM
"Frank Millman" <fr...@chagford.com> writes:
> The benefit of my class is that it enables me to take the coroutine
> and run it in another thread, without having to re-engineer the whole
> thing.

Threads in Python don't get you parallelism either, of course.

I haven't used async/await yet and it's looking painful. I've been
wanting to read this:

http://www.snarky.ca/how-the-heck-does-async-await-work-in-python-3-5

but I start to think it isn't all that great an approach to concurrency.

Frank Millman

Feb 15, 2016, 3:17:50 AM
"Paul Rubin" wrote in message
news:87h9ha8...@jester.gateway.pace.com...
>
> "Frank Millman" <fr...@chagford.com> writes:
> > The benefit of my class is that it enables me to take the coroutine
> > and run it in another thread, without having to re-engineer the whole
> > thing.
>
> Threads in Python don't get you parallelism either, of course.
>

Agreed. My class does not alter the total time taken, but it does free up
the original task to carry on with other work.

run_in_executor() uses threads by default, but it does allow you to specify
processes as an alternative.
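
For example, a sketch of the process-based variant (heavy_work is just a
placeholder for the real job):

import asyncio
from concurrent.futures import ProcessPoolExecutor

def heavy_work(n):
    # CPU-bound placeholder; runs in a separate process
    return sum(i * i for i in range(n))

async def caller():
    loop = asyncio.get_event_loop()
    # passing an executor instead of None overrides the default thread pool
    result = await loop.run_in_executor(ProcessPoolExecutor(), heavy_work, 10000000)
    print(result)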

> I haven't used async/await yet and it's looking painful. I've been
> wanting to read this:
>
> http://www.snarky.ca/how-the-heck-does-async-await-work-in-python-3-5
>
> but I start to think it isn't all that great an approach to concurrency.
>

Thanks for that link. I had a quick scan, and it looks interesting, but some
of it a bit above my head. I have bookmarked it, as I think that as my
understanding increases, I will gain more from it on each re-read.

Frank


Marko Rauhamaa

Feb 15, 2016, 6:05:44 AM
Paul Rubin <no.e...@nospam.invalid>:

> Threads in Python don't get you parallelism either, of course.

Ah, of course.

Processes it is, then.


Marko

Chris Angelico

Feb 16, 2016, 12:51:41 AM
On Mon, Feb 15, 2016 at 6:39 PM, Paul Rubin <no.e...@nospam.invalid> wrote:
> "Frank Millman" <fr...@chagford.com> writes:
>> The benefit of my class is that it enables me to take the coroutine
>> and run it in another thread, without having to re-engineer the whole
>> thing.
>
> Threads in Python don't get you parallelism either, of course.
>

They can. The only limitation is that, in CPython (and some others),
no two threads can concurrently be executing Python byte-code. The
instant you drop into a C-implemented function, it can release the GIL
and let another thread start running. Obviously this happens any time
there's going to be a blocking API call (eg if one thread waits on a
socket read, others can run), but it can also happen with
computational work:

import numpy
import threading

def thread1():
    arr = numpy.zeros(100000000, dtype=numpy.int64)
    while True:
        print("1: %d" % arr[0])
        arr += 1
        arr = (arr * arr) % 142957

def thread2():
    arr = numpy.zeros(100000000, dtype=numpy.int64)
    while True:
        print("2: %d" % arr[0])
        arr += 2
        arr = (arr * arr) % 142957

threading.Thread(target=thread1).start()
thread2()

This will happily keep two CPU cores occupied. Most of the work is
being done inside Numpy, which releases the GIL before doing any work.
So it's not strictly true that threading can't parallelise Python code
(and as mentioned, it depends on your interpreter - Jython can, I
believe, do true multithreading), but just that there are limitations
on what can execute concurrently.

ChrisA

Kevin Conway

Feb 16, 2016, 8:23:04 AM
If you're handling coroutines there is an asyncio facility for "background
tasks". The ensure_future [1] will take a coroutine, attach it to a Task,
and return a future to you that resolves when the coroutine is complete.
The coroutine you schedule with that function will not cause your current
coroutine to wait unless you await the future it returns.

[1]
https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future
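
A minimal sketch of that pattern (the names here are illustrative):

import asyncio

async def slow_job():
    await asyncio.sleep(10)          # stands in for the time-consuming work
    return 'done'

def on_done(fut):
    print('background result:', fut.result())

async def handler():
    # schedule slow_job() on the running loop, but don't await it
    task = asyncio.ensure_future(slow_job())
    task.add_done_callback(on_done)
    # handler() continues immediately; the loop runs slow_job() concurrently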

On Mon, Feb 15, 2016, 23:53 Chris Angelico <ros...@gmail.com> wrote:

> On Mon, Feb 15, 2016 at 6:39 PM, Paul Rubin <no.e...@nospam.invalid>
> wrote:
> > "Frank Millman" <fr...@chagford.com> writes:
> >> The benefit of my class is that it enables me to take the coroutine
> >> and run it in another thread, without having to re-engineer the whole
> >> thing.
> >
> > Threads in Python don't get you parallelism either, of course.
> >
>

Frank Millman

Feb 16, 2016, 8:52:49 AM
"Kevin Conway" wrote in message
news:CAKF=+dim8wzPRvm86_V2W5-XSop...@mail.gmail.com...
>
> If you're handling coroutines there is an asyncio facility for "background
> tasks". The ensure_future [1] will take a coroutine, attach it to a Task,
> and return a future to you that resolves when the coroutine is complete.
> The coroutine you schedule with that function will not cause your current
> coroutine to wait unless you await the future it returns.
>
> [1]
> https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future
>

Thank you Kevin!

That works perfectly, and is much neater than my effort.

Frank


Marko Rauhamaa

Feb 16, 2016, 9:17:23 AM
Kevin Conway <kevinjac...@gmail.com>:

> If you're handling coroutines there is an asyncio facility for
> "background tasks". The ensure_future [1] will take a coroutine,
> attach it to a Task, and return a future to you that resolves when the
> coroutine is complete.

Ok, yes, but those "background tasks" monopolize the CPU once they are
scheduled to run.

If your "background task" doesn't need a long time to run, just call the
function in the foreground and be done with it. If it does consume time,
you need to delegate it to a separate process so the other tasks remain
responsive.


Marko

Frank Millman

Feb 16, 2016, 9:36:34 AM
"Marko Rauhamaa" wrote in message news:87d1rwp...@elektro.pacujo.net...
I will explain my situation - perhaps you can tell me if it makes sense.

My background task does take a long time to run - about 10 seconds - but
most of that time is spent waiting for database responses, which is handled
in another thread.

You could argue that the database thread should rather be handled by another
process, and that is definitely an option if I find that response times are
affected.

So far my response times have been very good, even with database activity in
the background. However, I have not simulated a large number of concurrent
users. That could throw up the kinds of problem that you are concerned
about.

Frank


Kevin Conway

Feb 16, 2016, 9:55:10 AM
> Ok, yes, but those "background tasks" monopolize the CPU once they are
> scheduled to run.

This is true if the coroutines are cpu bound. If that is the case then a
coroutine is likely the wrong choice for that code to begin with.
Coroutines, in asyncio land, are primarily designed for io bound work.

> My background task does take a long time to run - about 10 seconds - but
> most of that time is spent waiting for database responses, which is handled
> in another thread.

Something else to look into is an asyncio driver for your database
connections. Threads aren't inherently harmful, but using them to achieve
async networking when running asyncio is a definite code smell since that
is precisely the problem asyncio is supposed to solve for.
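
For example, with PostgreSQL the third-party aiopg package wraps psycopg2
for asyncio; a rough sketch (dsn and sql are placeholders):

import aiopg

async def fetch(dsn, sql):
    pool = await aiopg.create_pool(dsn)
    async with pool.acquire() as conn:
        async with conn.cursor() as cur:
            await cur.execute(sql)
            rows = []
            async for row in cur:    # rows arrive without blocking the loop
                rows.append(row)
            return rows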

On Tue, Feb 16, 2016, 08:37 Frank Millman <fr...@chagford.com> wrote:

> "Marko Rauhamaa" wrote in message news:87d1rwp...@elektro.pacujo.net.
> ..
> I will explain my situation - perhaps you can tell me if it makes sense.
>
> My background task does take a long time to run - about 10 seconds - but
> most of that time is spent waiting for database responses, which is handled
> in another thread.
>
> You could argue that the database thread should rather be handled by
> another
> process, and that is definitely an option if I find that response times are
> affected.
>
> So far my response times have been very good, even with database activity
> in
> the background. However, I have not simulated a large number of concurrent
> users. That could throw up the kinds of problem that you are concerned
> about.
>
> Frank
>
>

Steven D'Aprano

Feb 16, 2016, 10:17:30 AM
On Wed, 17 Feb 2016 01:17 am, Marko Rauhamaa wrote:

> Ok, yes, but those "background tasks" monopolize the CPU once they are
> scheduled to run.

Can you show some code demonstrating this?



--
Steven

Frank Millman

Feb 16, 2016, 10:21:37 AM
"Kevin Conway" wrote in message
news:CAKF=+dhXZ=yax8STAWr_gjX3Tg8yUj...@mail.gmail.com...
>
> > My background task does take a long time to run - about 10 seconds - but
> > most of that time is spent waiting for database responses, which is
> > handled
> > in another thread.
>
> Something else to look into is an asyncio driver for your database
> connections. Threads aren't inherently harmful, but using them to achieve
> async networking when running asyncio is a definite code smell since that
> is precisely the problem asyncio is supposed to solve for.
>

Maybe I have not explained very well. I am not using threads to achieve
async networking. I am using asyncio in a client server environment, and it
works very well. If a client request involves a database query, I use a
thread to perform that so that it does not slow down the other users. I
usually want the originating client to block until I have a response, so I
use 'await'. However, occasionally the request takes some time, and it is
not necessary for the client to wait for the response, so I want to unblock
the client straight away, run the task in the background, and then notify
the client when the task is complete. This is where your suggestion of
'ensure_future' does the job perfectly.

I would love to drive the database asynchronously, but of the three
databases I use, only psycopg2 seems to have asyncio support. As my
home-grown solution (using queues) seems to be working well so far, I am
sticking with that until I start to experience responsiveness issues. If
that happens, my first line of attack will be to switch from threads to
processes.

Chris Angelico

Feb 16, 2016, 10:29:04 AM
On Wed, Feb 17, 2016 at 2:21 AM, Frank Millman <fr...@chagford.com> wrote:
> I would love to drive the database asynchronously, but of the three
> databases I use, only psycopg2 seems to have asyncio support. As my
> home-grown solution (using queues) seems to be working well so far, I am
> sticking with that until I start to experience responsiveness issues. If
> that happens, my first line of attack will be to switch from threads to
> processes.

And this is where we demonstrate divergent thought processes. *My*
first line of attack if hybrid async/thread doesn't work would be to
mandate a PostgreSQL backend, not to switch to hybrid async/process :)
Is the added value of "you get three options of database back-end"
worth the added cost of "but now my code is massively more complex"?

ChrisA

Frank Millman

Feb 16, 2016, 10:45:38 AM
"Chris Angelico" wrote in message
news:CAPTjJmqMiE4gROqNYVHwAhCn...@mail.gmail.com...
Then we will have to agree to diverge ;-)

If I ever get my app off the ground, it will be an all-purpose,
multi-company, multi-currency, multi-everything accounting/business system.

There is a massive market out there, and a large percentage of that is
Microsoft-only shops. I have no intention of cutting myself off from that
market before I even start.

I am very happy with my choice of 3 databases -

1. sqlite3 - ideal for demo purposes and for one-man businesses

2. Sql Server for those that insist on it

3. PostgreSQL for every one else, and my recommendation if asked

Frank



Marko Rauhamaa

Feb 16, 2016, 12:12:29 PM
Steven D'Aprano <st...@pearwood.info>:
Sure:

========================================================================
#!/usr/bin/env python3

import asyncio, time

def main():
    asyncio.get_event_loop().run_until_complete(asyncio.wait([
        background_task(),
        looping_task() ]))

@asyncio.coroutine
def looping_task():
    while True:
        yield from asyncio.sleep(1)
        print(int(time.time()))

@asyncio.coroutine
def background_task():
    yield from asyncio.sleep(4)
    t = time.time()
    while time.time() - t < 10:
        pass

if __name__ == '__main__':
    main()
========================================================================

which prints:

1455642629
1455642630
1455642631
1455642642 <============== gap
1455642643
1455642644
1455642645


Marko


Marko Rauhamaa

Feb 16, 2016, 12:15:44 PM
Marko Rauhamaa <ma...@pacujo.net>:

> Sure:

Sorry for the multiple copies.


Marko

Marko Rauhamaa

Feb 16, 2016, 12:20:25 PM
"Frank Millman" <fr...@chagford.com>:

> I would love to drive the database asynchronously, but of the three
> databases I use, only psycopg2 seems to have asyncio support.

Yes, asyncio is in its infancy. There needs to be a moratorium on
blocking I/O.


Marko

Robin Becker

Feb 16, 2016, 12:52:55 PM
I thought perhaps background jobs were sending them :)
--
Robin Becker

Paul Rubin

Feb 17, 2016, 11:38:38 PM
Marko Rauhamaa <ma...@pacujo.net> writes:
> @asyncio.coroutine
> def background_task(): ...
>     while time.time() - t < 10:
>         pass

Wait, that's a cpu-busy loop, you can't do that in cooperative
multitasking. Of course you need a wait there.
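
Something like this would let the other task keep running (a sketch, in
the same pre-await style as the original):

@asyncio.coroutine
def background_task():
    yield from asyncio.sleep(4)
    t = time.time()
    while time.time() - t < 10:
        # hand control back to the event loop on every iteration
        yield from asyncio.sleep(0)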

Marko Rauhamaa

Feb 18, 2016, 1:10:50 AM
Paul Rubin <no.e...@nospam.invalid>:
That was the very point: to demonstrate that coroutines monopolize the
CPU.


Marko

Paul Rubin

Feb 20, 2016, 2:40:21 AM
Unfortunately there appears to be no way to open a file in Linux without
at least potentially blocking (slow disk or whatever). You need
separate threads or processes to do the right thing.

Marko Rauhamaa

Feb 20, 2016, 3:13:22 AM
Paul Rubin <no.e...@nospam.invalid>:
I have been wondering about the same thing. It would appear that disk
I/O is considered nonblocking at a very deep level:

* O_NONBLOCK doesn't have an effect

* a process waiting for the disk to respond cannot receive a signal

* a process waiting for the disk to respond stays in the "ready" state

Note that

* most disk I/O operates on a RAM cache that is flushed irregularly

* memory mapping and swapping make disk I/O and RAM access two sides of
the same coin

* page faults can turn any assembly language instruction into a
blocking disk I/O operation

* ordinary disks don't provide for much parallelism; processes are
usually serialized for disk I/O

If the file system happens to be NFS, a networking issue may paralyze
the whole system.

...

On the networking side, there is also a dangerous blocking operation:
socket.getaddrinfo() (and friends). As a consequence, socket.bind(),
socket.connect() may block indefinitely. In fact, even asyncio's
BaseEventLoop.create_server() and BaseEventLoop.create_connection() may
block indefinitely without yielding.

SEE ALSO
getaddrinfo_a(3)


Marko

Paul Rubin

Feb 20, 2016, 3:37:32 AM
Marko Rauhamaa <ma...@pacujo.net> writes:
> It would appear that disk I/O is considered nonblocking at a very deep
> level:
> * O_NONBLOCK doesn't have an effect
> * a process waiting for the disk to respond cannot receive a signal
> * a process waiting for the disk to respond stays in the "ready" state

You can handle those issues with AIO. It's open(2) that seems to have
no asynchronous analog as far as I can tell. Actually, looking at the
AIO man pages, it appears that the Linux kernel currently doesn't
support it and it's instead simulated by a userspace library using
threads. I didn't realize that before. But AIO is at least specified
by POSIX, and there was some kernel work (io_setup(2) etc.) that may or
may not still be in progress. It also doesn't have an open(2) analog,
sigh.

> On the networking side, there is also a dangerous blocking operation:
> socket.getaddrinfo() (and friends). As a consequence, socket.bind(),
> socket.connect() may block indefinitely. In fact, even asyncio's
> BaseEventLoop.create_server() and BaseEventLoop.create_sonnection() may
> block indefinitely without yielding.

getaddrinfo is a notorious pain but I think it's just a library issue;
an async version should be possible in principle. How does Twisted
handle it? Does it have a version?

I've just felt depressed whenever I've looked at any Python async stuff.
I've written many Python programs with threads and not gotten into the
trouble that people keep warning about. But I haven't really understood
the warnings, so maybe they know something I don't. I just write in a
multiprocessing style, with every mutable object owned by exactly one
thread and accessed only by RPC through queues, sort of a poor man's
Erlang. There's a performance hit but there's a much bigger one from
using Python in the first place, so I just live with it.
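
A stripped-down sketch of that style - one thread owns the mutable state,
and everything else talks to it through queues:

import threading, queue

def counter_owner(requests):
    count = 0                        # mutable state owned by this thread only
    while True:
        op, reply = requests.get()
        if op == 'incr':
            count += 1
            reply.put(count)
        elif op == 'stop':
            break

requests = queue.Queue()
threading.Thread(target=counter_owner, args=(requests,)).start()

reply = queue.Queue()
requests.put(('incr', reply))        # RPC: send a request, block on the reply
print(reply.get())                   # -> 1
requests.put(('stop', None))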

Chris Angelico

Feb 20, 2016, 3:52:29 AM
On Sat, Feb 20, 2016 at 7:37 PM, Paul Rubin <no.e...@nospam.invalid> wrote:
> getaddrinfo is a notorious pain but I think it's just a library issue;
> an async version should be possible in principle. How does Twisted
> handle it? Does it have a version?

In a (non-Python) program of mine, I got annoyed by synchronous name
lookups, so I hacked around it: instead of using the regular library
functions, I just do a DNS lookup directly (which can then be
event-based - send a UDP packet, get notified when a UDP packet
arrives). Downside: Ignores /etc/nsswitch.conf and /etc/hosts, and
goes straight to the name server. Upside: Is able to do its own
caching, since the DNS library gives me the TTLs, but
gethostbyname/getaddrinfo won't.
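
In Python, the third-party dnspython package takes the same approach (and
exposes the TTLs); a rough sketch using its 1.x API:

import dns.resolver    # pip install dnspython

answer = dns.resolver.query('example.com', 'A')  # bypasses NSS and /etc/hosts
for rdata in answer:
    print(rdata.address, 'TTL:', answer.rrset.ttl)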

ChrisA

Marko Rauhamaa

Feb 20, 2016, 3:59:25 AM
Chris Angelico <ros...@gmail.com>:

> In a (non-Python) program of mine, I got annoyed by synchronous name
> lookups, so I hacked around it: instead of using the regular library
> functions, I just do a DNS lookup directly (which can then be
> event-based - send a UDP packet, get notified when a UDP packet
> arrives). Downside: Ignores /etc/nsswitch.conf and /etc/hosts, and
> goes straight to the name server. Upside: Is able to do its own
> caching, since the DNS library gives me the TTLs, but
> gethostbyname/getaddrinfo won't.

Ditto in a Python program of mine, although I don't bother with caching:
the DNS server is perfectly capable of caching the entries for me.


Marko

Chris Angelico

Feb 20, 2016, 4:02:33 AM
If you know you have a local DNS server, sure. Mine is written for a
generic situation where that can't be depended on, so it caches
itself. But it's no big deal.

ChrisA

Marko Rauhamaa

Feb 20, 2016, 4:28:54 AM
Paul Rubin <no.e...@nospam.invalid>:

> I've just felt depressed whenever I've looked at any Python async
> stuff. I've written many Python programs with threads and not gotten
> into the trouble that people keep warning about.

Programming-model-wise, asyncio is virtually identical with threads. In
each, I dislike the implicit state concept. I want the state to stand
out with big block letters.

> I've just felt depressed whenever I've looked at any Python async
> stuff. I've written many Python programs with threads and not gotten
> into the trouble that people keep warning about. But I haven't really
> understood the warnings, so maybe they know something I don't. I just
> write in a multiprocessing style, with every mutable object owned by
> exactly one thread and accessed only by RPC through queues, sort of a
> poor man's Erlang. There's a performance hit but there's a much bigger
> one from using Python in the first place, so I just live with it.

Good for you if you have been able to choose your own programming model.
Most people have to deal with a legacy mess. Also, maintainers who
inherit your tidy code might not be careful to ship only nonmutable
objects in the queues.

Your way of using threads works, of course, with the caveat that it is
not possible to get rid of a blocking thread from the outside. With
asyncio, you can at least cancel tasks.


Marko

Kevin Conway

Feb 20, 2016, 8:52:30 AM
> getaddrinfo is a notorious pain but I think it's just a library issue;
> an async version should be possible in principle. How does Twisted
> handle it? Does it have a version?

I think we're a little outside the scope of OP's question at this point,
but for the sake of answering this:

There are a few cases that I know of where Twisted uses the standard lib
socket DNS methods. One is when resolving names to IPv6 addresses [1] when
creating a client connection to a remote source. The other is in the
default DNS resolver that is installed in the reactor [2]. Creating client
connections allows the call to 'getaddrinfo' to block without mitigation.
The default DNS resolver, unfortunately, dispatches calls of
'gethostbyname' to a thread pool.

Without seeing the commit history, I'd assume the use of 'socket' and
threads by default is an artifact that predates the implementation of the
DNS protocol in Twisted. Twisted has, in 'twisted.names' [3], a DNS
protocol that uses UDP and leverages the reactor appropriately. Thankfully,
Twisted has a reactor method called 'installResolver' [4] that allows you
to hook in any DNS resolver implementation you want so you aren't stuck
using the default, threaded implementation.

As far as asyncio, it also defaults to an implementation that delegates to
an executor (default: threadpool). Unlike Twisted, though, it appears to
require a subclass of the event loop to override the 'getaddrinfo' method
[5].

[1]
https://github.com/twisted/twisted/blob/trunk/twisted/internet/tcp.py#L622
[2]
https://github.com/twisted/twisted/blob/trunk/twisted/internet/base.py#L257
[3] https://github.com/twisted/twisted/tree/trunk/twisted/names
[4]
https://github.com/twisted/twisted/blob/trunk/twisted/internet/base.py#L509
[5]
https://github.com/python/cpython/blob/master/Lib/asyncio/base_events.py#L572
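
So from application code, the non-blocking spelling is the loop's own
coroutine method, which dispatches to that executor under the hood - a
minimal sketch:

import asyncio, socket

async def resolve(host, port):
    loop = asyncio.get_event_loop()
    # coroutine wrapper around socket.getaddrinfo()
    infos = await loop.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return [sockaddr for _, _, _, _, sockaddr in infos]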

Martin A. Brown

Feb 20, 2016, 12:45:41 PM

Hello there,

I realize that this discussion of supporting asynchronous name
lookup requests in DNS is merely a detour in this thread on asyncio,
but I couldn't resist mentioning an existing tool.

>> getaddrinfo is a notorious pain but I think it's just a library
>> issue; an async version should be possible in principle. How
>> does Twisted handle it? Does it have a version?
>
>In a (non-Python) program of mine, I got annoyed by synchronous
>name lookups, so I hacked around it: instead of using the regular
>library functions, I just do a DNS lookup directly (which can then
>be event-based - send a UDP packet, get notified when a UDP packet
>arrives). Downside: Ignores /etc/nsswitch.conf and /etc/hosts, and
>goes straight to the name server. Upside: Is able to do its own
>caching, since the DNS library gives me the TTLs, but
>gethostbyname/getaddrinfo won't.

Another (non-Python) DNS name lookup library that does practically
the same thing (along with the shortcomings you mentioned, Chris:
no NSS nor /etc/hosts) is the adns library. Well, it is DNS, after
all.

http://www.gnu.org/software/adns/
https://pypi.python.org/pypi/adns-python/1.2.1

And, there are Python bindings. I have been quite happy using the
adns tools (and tools built on the Python bindings) for mass lookups
(millions of DNS names). It works very nicely.

Just sharing knowledge of an existing tool,

-Martin

--
Martin A. Brown
http://linux-ip.net/

Chris Angelico

Feb 20, 2016, 4:48:04 PM
On Sun, Feb 21, 2016 at 4:45 AM, Martin A. Brown <mar...@linux-ip.net> wrote:
> Another (non-Python) DNS name lookup library that does practically
> the same thing (along with the shortcomingsn you mentioned, Chris:
> no NSS nor /etc/hosts) is the adns library. Well, it is DNS, after
> all.
>
> http://www.gnu.org/software/adns/
> https://pypi.python.org/pypi/adns-python/1.2.1
>
> And, there are Python bindings. I have been quite happy using the
> adns tools (and tools built on the Python bindings) for mass lookups
> (millions of DNS names). It works very nicely.
>
> Just sharing knowledge of an existing tool,
>

Ultimately, anything that replaces a gethostbyname/getaddrinfo call
with an explicit DNS lookup is going to have the exact same benefit
and downside: lookups won't freeze the program, but you can't use
/etc/hosts any more. (Slightly sloppy language but that's how an end
user will see it.)

ChrisA