Subprocess puzzle and two questions

w...@mac.com

Nov 13, 2012, 10:34:55 PM
to pytho...@python.org, William R. Wing
I need to time the operation of a command-line utility (specifically nslookup) from within a Python program I'm writing. I don't want to use Python's timeit module because I'd like to avoid Python's subprocess creation overhead. That leads me to the standard UNIX time command. So, for example, in my bash shell, if I enter:

$ time nslookup www.es.net 8.8.4.4

I get:

Server: 8.8.4.4
Address: 8.8.4.4#53

Non-authoritative answer:
www.es.net canonical name = www3.es.net.
Name: www3.es.net
Address: 128.55.22.201

real 0m0.069s
user 0m0.006s
sys 0m0.004s

The first lines are the result of an nslookup of the IP address of "www.es.net" using the server at 8.8.4.4 (Google's secondary public DNS server).
The last three lines are what I'm after: the real elapsed wall-clock time, the time spent in user space and the time spent in kernel space.

However, if I try the same operation in the python interpreter using subprocess.Popen like so:

>>> import subprocess
>>> result = subprocess.Popen(['time', 'nslookup', 'www.es.net', '8.8.4.4'], shell = False, stdout = subprocess.PIPE, stderr = subprocess.PIPE).communicate()
>>> print result
('Server:\t\t8.8.4.4\nAddress:\t8.8.4.4#53\n\nNon-authoritative answer:\nwww.es.net\tcanonical name = www3.es.net.\nName:\twww3.es.net\nAddress: 128.55.22.201\n\n', ' 0.06 real 0.00 user 0.00 sys\n')

And the timing information I'm after has been truncated to two digits after the decimal. It appears that Popen is applying a default format. If I do explicit formatting:

>>> time = result[1].lstrip().split(' ')[0]
>>> formatted_time = '{: >7.3f}'.format(float(time))
>>> print formatted_time
0.060

I get three digits, BUT that third digit isn't real; the format operation has simply appended a zero. So:

1) how can I recover that third digit from the subprocess?
2) is there a more pythonic way to do what I'm trying to do?

python 2.7, OS-X 10.8.2

Thanks in advance -
Bill Wing

Roy Smith

Nov 13, 2012, 11:41:24 PM
In article <mailman.3664.1352867...@python.org>,
w...@mac.com wrote:

> I need to time the operation of a command-line utility (specifically
> nslookup) from within a python program I'm writing.

Ugh. Why are you doing this? Shelling out to nslookup is an incredibly
slow and clumsy way of doing name translation. What you really want to
be doing is calling getaddrinfo() directly.

See http://docs.python.org/2/library/socket.html#socket.getaddrinfo for
details.

William Ray Wing

Nov 14, 2012, 12:03:51 AM
to Roy Smith, pytho...@python.org
Because, unless I'm badly mistaken (very possible), getaddrinfo doesn't let me specify the server from which the name is returned. I'm really not after the name, what I'm REALLY after is the fact that a path exists to the name server I specify (and how long it takes to respond). In the "good old days" I would just have ping'd it, but these days more and more DNS boxes (and servers of all sorts) are shutting off their ping response.

Thanks, Bill

Kushal Kumaran

Nov 14, 2012, 1:55:19 AM
to pytho...@python.org
w...@mac.com writes:

> I need to time the operation of a command-line utility (specifically nslookup) from within a Python program I'm writing. I don't want to use Python's timeit module because I'd like to avoid Python's subprocess creation overhead. That leads me to the standard UNIX time command. So, for example, in my bash shell, if I enter:
>

It is unclear to me what overhead you are avoiding.
It is possible that the "time" invocation from the shell is invoking
your shell's builtin time implementation, and your python code is
running /usr/bin/time or /bin/time. You should see the same behaviour
from the shell if you run /bin/time or /usr/bin/time (whatever you have)
instead of just "time". subprocess.Popen should never modify the output
of programs it runs.
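
If the goal is per-command CPU times without depending on any particular "time" program's output format, the numbers can also be collected from Python itself. A minimal sketch, assuming Python 3 on a POSIX system (the resource module is Unix-only); the helper name time_command is made up for illustration, and the wall-clock figure here does include Python's own process-creation overhead:

```python
import resource
import subprocess
import time


def time_command(argv):
    """Run argv; return (real, user, sys, stdout) with times in seconds."""
    # RUSAGE_CHILDREN accumulates CPU time of children that have been
    # waited for, so the difference around run() belongs to this command.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    completed = subprocess.run(argv, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return real, user, system, completed.stdout
```

For example, time_command(['nslookup', 'www.es.net', '8.8.4.4']) would return the elapsed, user and system times as floats, at whatever resolution the OS reports.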

--
regards,
kushal

Tim Roberts

Nov 14, 2012, 2:17:14 AM
w...@mac.com wrote:
>...
>However, if I try the same operation in the python interpreter using subprocess.Popen like so:
>
>>>> import subprocess
>>>> result = subprocess.Popen(['time', 'nslookup', 'www.es.net', '8.8.4.4'], shell = False, stdout = subprocess.PIPE, stderr = subprocess.PIPE).communicate()
>>>> print result
>('Server:\t\t8.8.4.4\nAddress:\t8.8.4.4#53\n\nNon-authoritative answer:\nwww.es.net\tcanonical name = www3.es.net.\nName:\twww3.es.net\nAddress: 128.55.22.201\n\n', ' 0.06 real 0.00 user 0.00 sys\n')
>
>And the timing information I'm after has been truncated to two digits after the decimal. It appears that Popen is applying a default format.

No, that's silly. A few minutes' thought should have told you that. In
your standalone test, you are getting the "time" command that is built in
to bash. In the subprocess example, you've specified "shell = False", so
you are using the external "time" command (/usr/bin/time in my system), and
that command has a different output format. The csh "time" command is
different yet again.
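
To get bash's built-in "time" from Python, one option is to run bash explicitly. This is a sketch only, assuming Python 3.7+ and a bash on the PATH; bash_time is a hypothetical helper name, and TIMEFORMAT is the bash variable that controls the built-in's output:

```python
import os
import subprocess


def bash_time(argv):
    """Run argv under bash's built-in time; return (real, user, sys, stdout)."""
    # TIMEFORMAT asks bash for three decimals: elapsed, user CPU, system CPU.
    # LC_ALL=C keeps the decimal separator a plain ".".
    env = dict(os.environ, TIMEFORMAT="%3R %3U %3S", LC_ALL="C")
    proc = subprocess.run(
        # "$@" expands to the arguments after the "bash" placeholder ($0),
        # so the command is passed through without extra shell quoting.
        ["bash", "-c", 'time "$@"', "bash"] + list(argv),
        capture_output=True, text=True, env=env,
    )
    # The built-in writes its report to stderr after the command finishes,
    # so the report is the last line there.
    report = proc.stderr.strip().splitlines()[-1]
    real, user, system = (float(field) for field in report.split())
    return real, user, system, proc.stdout
```

This reproduces the three-decimal figures from the shell session; it says nothing about how much of that precision is meaningful.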

>1) how can I recover that third digit from the subprocess?

Do you actually believe that the third decimal place has any meaning at
all? It doesn't.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

Roy Smith

Nov 14, 2012, 9:22:57 AM
In article <mailman.3666.1352873...@python.org>,
Oh, my. You're using DNS as a replacement for ping? Fair enough. In
that case, all you really care about is that you can connect to port 53
on the server...

import socket
import time
s = socket.socket()
t0 = time.time()
s.connect(('8.8.8.8', 53))
t1 = time.time()
print "it took %f seconds to connect" % (t1 - t0)
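
A slightly more defensive variant of the same idea, as a sketch assuming Python 3; tcp_connect_time and its default two-second timeout are illustrative choices, not anything from the standard library:

```python
import socket
import time


def tcp_connect_time(host, port=53, timeout=2.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        # create_connection applies the timeout to the connect attempt;
        # the with-block closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None
```

Returning None on failure lets the caller tell "no path to the server" apart from "slow server" without catching exceptions itself.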

Chris Angelico

Nov 14, 2012, 9:40:05 AM
to pytho...@python.org
On Thu, Nov 15, 2012 at 1:22 AM, Roy Smith <r...@panix.com> wrote:
> Oh, my. You're using DNS as a replacement for ping? Fair enough. In
> that case, all you really care about is that you can connect to port 53
> on the server...
>
> import socket
> import time
> s = socket.socket()
> t0 = time.time()
> s.connect(('8.8.8.8', 53))
> t1 = time.time()
> print "it took %f seconds to connect" % (t1 - t0)

That assumes that (a) the remote server supports TCP for DNS (since
UDP is by far the more often used, some name servers don't bother
supporting TCP), and (b) that connection time for TCP is comparable to
ping or an actual DNS lookup. But in terms of approximating your
connection times, that's gotta be way better than shelling out to
several other processes.

ChrisA

w...@mac.com

Nov 14, 2012, 9:37:12 AM
to Roy Smith, pytho...@python.org, w...@mac.com

Now THAT looks better. Simpler, cleaner, (longer, taller, stronger, faster, cheaper… :-)

Thanks,
Bill

Roy Smith

Nov 14, 2012, 11:20:54 AM
I wrote:
>> Oh, my. You're using DNS as a replacement for ping? Fair enough. In
>> that case, all you really care about is that you can connect to port 53
>> on the server...
>>
>> s = socket.socket()
>> s.connect(('8.8.8.8', 53))

In article <mailman.3684.1352904...@python.org>,
Chris Angelico <ros...@gmail.com> wrote:
>That assumes that (a) the remote server supports TCP for DNS

This is true. I honestly don't know what percentage of DNS servers
out there only support UDP. The two I tried (Google's 8.8.8.8, and my
Apple TimeCapsule) both supported TCP, but that's hardly a
representative sample.

> and (b) that connection time for TCP is comparable to
> ping or an actual DNS lookup.

My first thought to solve both of these is that it shouldn't be too
hard to hand-craft a minimal DNS query and send it over UDP. Then, I
hunted around a bit and found that somebody had already done that, in
spades. Take a look at http://www.dnspython.org; it might be exactly
what's needed here.

Chris Angelico

Nov 14, 2012, 4:54:29 PM
to pytho...@python.org
On Thu, Nov 15, 2012 at 3:20 AM, Roy Smith <r...@panix.com> wrote:
> I wrote:
>>> Oh, my. You're using DNS as a replacement for ping? Fair enough. In
>>> that case, all you really care about is that you can connect to port 53
>>> on the server...
>>>
>>> s = socket.socket()
>>> s.connect(('8.8.8.8', 53))
>
> In article <mailman.3684.1352904...@python.org>,
> Chris Angelico <ros...@gmail.com> wrote:
>>That assumes that (a) the remote server supports TCP for DNS
>
> This is true. I honestly don't know what percentage of DNS servers
> out there only support UDP. The two I tried (Google's 8.8.8.8, and my
> Apple TimeCapsule) both supported TCP, but that's hardly a
> representative sample.

I don't know either; all I know is that DNSReport recommends
supporting TCP, and none of my DNS servers ever fail that check.

>> and (b) that connection time for TCP is comparable to
>> ping or an actual DNS lookup.
>
> My first thought to solve both of these is that it shouldn't be too
> hard to hand-craft a minimal DNS query and send it over UDP. Then, I
> hunted around a bit and found that somebody had already done that, in
> spades. Take a look at http://www.dnspython.org; it might be exactly
> what's needed here.

Yeah, that sounds like a good option. I'm slightly surprised that
there's no way with the Python stdlib to point a DNS query at a
specific server, but dnspython might be the solution. On the flip
side, dnspython is dauntingly large; it looks like a full
implementation of DNS, but I don't see a simple entrypoint that wraps
it all up into a simple function that can be bracketed with
time.time() calls (granted, I only skimmed the docs VERY quickly). So
it may be simpler to hand-craft an outgoing UDP packet once, save it
as a string literal, send that, and just wait for any response. That
eliminates all DNS protocolling and just times the round trip.
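
A rough sketch of that approach, assuming Python 3; the function names, the fixed query ID and the two-second timeout are arbitrary choices for illustration. It builds one standard A-record query by hand with struct, sends it over UDP, and times the wait for any reply without parsing it:

```python
import socket
import struct
import time


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of name."""
    # Header: ID, flags (only "recursion desired" set), one question,
    # and zero answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def udp_dns_round_trip(server, name, port=53, timeout=2.0):
    """Send one query to server; return (seconds, raw reply bytes)."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # recvfrom raises socket.timeout if no reply
        start = time.perf_counter()
        sock.sendto(packet, (server, port))
        reply, _ = sock.recvfrom(512)
        return time.perf_counter() - start, reply
```

Since build_query() is deterministic, its result could indeed be saved once as a bytes literal and reused for every probe.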

ChrisA

Roy Smith

Nov 14, 2012, 8:49:19 PM
In article <mailman.3700.1352930...@python.org>,
Chris Angelico <ros...@gmail.com> wrote:

> I'm slightly surprised that there's no way with the Python stdlib to
> point a DNS query at a specific server

Me too, including the "only slightly" part. The normal high-level C
resolver routines (getaddrinfo/getnameinfo, or even the old
gethostbyname series), don't expose any way to do that. You have to dig
quite far down in the resolver library stack to get to the point where
you can do that. The concept of not knowing or caring which specific
server has the data you need is quite deeply baked into the basic DNS
architecture.

Chris Angelico

Nov 14, 2012, 9:04:21 PM
to pytho...@python.org
Indeed. But Python boasts that the batteries are included, and given
the wealth of other networking facilities that are available, it is a
bit of a hole that you can't run DNS queries in this way.

Mind you, if Python's managed to get this far without it being a major
stumbling-block, that probably means that it's not a serious lack. And
I don't think many people write DNS *servers* in Python. (Most people
don't write DNS servers at all, since BIND exists. But I did exactly
that this week, since it would be easier than most other options.)

ChrisA

Roy Smith

Nov 14, 2012, 9:10:29 PM
In article <mailman.3707.1352945...@python.org>,
Chris Angelico <ros...@gmail.com> wrote:

> Indeed. But Python boasts that the batteries are included, and given
> the wealth of other networking facilities that are available, it is a
> bit of a hole that you can't run DNS queries in this way.

Think of the socket and struct modules as a pile of carbon rods and gobs
of zinc paste, from which you can assemble your own batteries, and make
them in exactly the shape and size you need.

Chris Angelico

Nov 14, 2012, 9:21:07 PM
to pytho...@python.org
Then assembly language is a pile of protons, neutrons, and electrons...

:)

ChrisA

Dave Angel

Nov 14, 2012, 9:55:20 PM
to Chris Angelico, pytho...@python.org
On 11/14/2012 09:21 PM, Chris Angelico wrote:
> Then assembly language is a pile of protons, neutrons, and electrons...

And real machine language (microcode) is a pile of quarks; fermions
versus bosons. But in recent years, you pretty much have to work at
Intel to see that part of the processor.



--

DaveA

Kushal Kumaran

Nov 14, 2012, 11:53:08 PM
to pytho...@python.org
Chris Angelico <ros...@gmail.com> writes:
> Indeed. But Python boasts that the batteries are included, and given
> the wealth of other networking facilities that are available, it is a
> bit of a hole that you can't run DNS queries in this way.
>
> Mind you, if Python's managed to get this far without it being a major
> stumbling-block, that probably means that it's not a serious lack. And
> I don't think many people write DNS *servers* in Python. (Most people
> don't write DNS servers at all, since BIND exists. But I did exactly
> that this week, since it would be easier than most other options.)
>

Indeed. Most people would prefer if random applications didn't make
their own decisions about using specific DNS servers. That way, the
users can make their own configuration choices (gai.conf, nsswitch.conf)
according to their site preferences.

If your application needs that level of control (if you're writing a
nslookup replacement for some reason, perhaps), dnspython
(www.dnspython.org) seems to have it.

--
regards,
kushal

Aahz

Nov 15, 2012, 12:42:58 AM
In article <mailman.3700.1352930...@python.org>,
From one of my scripts lying around:

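import dns.resolver  # provided by the dnspython package

domain = MAILTO.split('@',1)[1]  # MAILTO: an e-mail address string defined elsewhere in the script
server = str(dns.resolver.query(domain, 'MX')[0].exchange)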

You'll need to play around a bit to find out what that does, but it
should point you in the right direction.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"LL YR VWL R BLNG T S" -- www.nancybuttons.com

Nobody

Nov 15, 2012, 5:54:10 PM
On Wed, 14 Nov 2012 20:49:19 -0500, Roy Smith wrote:

>> I'm slightly surprised that there's no way with the Python stdlib to
>> point a DNS query at a specific server
>
> Me too, including the "only slightly" part. The normal high-level C
> resolver routines (getaddrinfo/getnameinfo, or even the old
> gethostbyname series), don't expose any way to do that.

That's because the high-level routines aren't tied to DNS.

gethostbyname() and getaddrinfo() use the NSS (name-service switch)
mechanism, which is configured via /etc/nsswitch.conf. Depending upon
configuration, hostnames can be looked up via a plain text file
(/etc/hosts), Berkeley DB files, DNS, NIS, NIS+, LDAP, WINS, etc. DNS is
just one particular back-end, which may or may not be used on any given
system.

If you specifically want to perform DNS queries, you have to use a
DNS-specific interface (e.g. the res_* functions described in the
resolver(3) manpage), or raw sockets, rather than a high-level interface
such as gethostbyname() or getaddrinfo().
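
The stdlib interface reflects this: socket.getaddrinfo() accepts a host and a port (plus family, type, protocol and flags), with no argument for choosing a name server. A small sketch, assuming Python 3; resolve is just an illustrative wrapper:

```python
import socket


def resolve(host):
    """Return the addresses the system's configured resolver gives for host."""
    # Restricting proto avoids one duplicate entry per socket type.
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Whatever back-end answers (hosts file, DNS, or something else) is decided by the system configuration, not by the caller.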

Roy Smith

Nov 15, 2012, 8:07:38 PM
In article <pan.2012.11.15....@nowhere.com>,
Nobody <nob...@nowhere.com> wrote:

> That's because the high-level routines aren't tied to DNS.

This is true.

> gethostbyname() and getaddrinfo() use the NSS (name-service switch)
> mechanism, which is configured via /etc/nsswitch.conf. Depending upon
> configuration, hostnames can be looked up via a plain text file
> (/etc/hosts), Berkeley DB files, DNS, NIS, NIS+, LDAP, WINS, etc.

Gethostbyname() long predates NSS. For that matter, I think it even
predates DNS (i.e. back to the days when /etc/hosts was the *only* way
to look up a hostname).

But, that's a nit.

Nobody

Nov 16, 2012, 7:17:26 PM
On Thu, 15 Nov 2012 20:07:38 -0500, Roy Smith wrote:

>> gethostbyname() and getaddrinfo() use the NSS (name-service switch)
>> mechanism, which is configured via /etc/nsswitch.conf. Depending upon
>> configuration, hostnames can be looked up via a plain text file
>> (/etc/hosts), Berkeley DB files, DNS, NIS, NIS+, LDAP, WINS, etc.
>
> Gethostbyname() long predates NSS.

Before NSS there was host.conf, which provided similar functionality
except that the set of mechanisms was fixed (they were built into libc
rather than being dynamically-loaded libraries) and it only applied to
hostnames (NSS is also used for getpwent(), getprotoent(), etc).

> For that matter, I think it even predates DNS (i.e. back to the days
> when /etc/hosts was the *only* way to look up a hostname).
>
> But, that's a nit.

Indeed; the main point is that gethostbyname() has never been specific to
DNS.
