Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

How to measure elapsed time under Windows?

10 views
Skip to first unread message

Grant Edwards

unread,
Feb 9, 2010, 11:10:56 AM2/9/10
to
What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:

start = time.clock()
[stuff being timed]
stop = time.clock()

delta = stop-start


However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?

Is there a way to lock the python process to a single core so
that time.clock() works right?

--
Grant Edwards grante Yow! If I felt any more
at SOPHISTICATED I would DIE
visi.com of EMBARRASSMENT!

Jean-Michel Pichavant

unread,
Feb 9, 2010, 1:23:21 PM2/9/10
to Grant Edwards, pytho...@python.org
Grant Edwards wrote:
> What's the correct way to measure small periods of elapsed
> time. I've always used time.clock() in the past:
>
> start = time.clock()
> [stuff being timed]
> stop = time.clock()
>
> delta = stop-start
>
>
> However on multi-processor machines that doesn't work.
> Sometimes I get negative values for delta. According to
> google, this is due to a bug in Windows that causes the value
> of time.clock() to be different depending on which core in a
> multi-core CPU you happen to be on. [insert appropriate
> MS-bashing here]
>
> Is there another way to measure small periods of elapsed time
> (say in the 1-10ms range)?
>
> Is there a way to lock the python process to a single core so
> that time.clock() works right?
>
>

Did you try with the datetime module ?

import datetime
t0 = datetime.datetime.now()
t1 = t0 - datetime.datetime.now()
t1.microseconds
Out[4]: 644114

JM

Grant Edwards

unread,
Feb 9, 2010, 2:04:01 PM2/9/10
to
On 2010-02-09, Jean-Michel Pichavant <jeanm...@sequans.com> wrote:
> Grant Edwards wrote:
>> What's the correct way to measure small periods of elapsed
>> time. I've always used time.clock() in the past:
>>
>> start = time.clock()
>> [stuff being timed]
>> stop = time.clock()
>>
>> delta = stop-start
>>
>> However on multi-processor machines that doesn't work.
>> Sometimes I get negative values for delta. According to
>> google, this is due to a bug in Windows that causes the value
>> of time.clock() to be different depending on which core in a
>> multi-core CPU you happen to be on. [insert appropriate
>> MS-bashing here]
>>
>> Is there another way to measure small periods of elapsed time
>> (say in the 1-10ms range)?
>>
>> Is there a way to lock the python process to a single core so
>> that time.clock() works right?
>>
>>
>
> Did you try with the datetime module ?

No. What mechanism does it use to get the current date/time?

> import datetime
> t0 = datetime.datetime.now()
> t1 = t0 - datetime.datetime.now()
> t1.microseconds
> Out[4]: 644114

That looks very broken to me. I need to measure stuff in the
1-20ms range, and the smallest value you can measure using the
method above appears to be 640ms. Thats almost 2/3 of a second.

--
Grant Edwards grante Yow! If our behavior is
at strict, we do not need fun!
visi.com

Grant Edwards

unread,
Feb 9, 2010, 4:19:05 PM2/9/10
to

Duh. It just occurred to me that was done interactively. I'll
give the datetime module a try once I reboot my test machine
back into Windows.

--
Grant Edwards grante Yow! Is this sexual
at intercourse yet?? Is it,
visi.com huh, is it??

Grant Edwards

unread,
Feb 9, 2010, 4:45:38 PM2/9/10
to
On 2010-02-09, Jean-Michel Pichavant <jeanm...@sequans.com> wrote:

Doesn't work. datetime.datetime.now has granularity of
15-16ms.

Intervals much less that that often come back with a delta of
0. A delay of 20ms produces a delta of either 15-16ms or
31-32ms

--
Grant Edwards grante Yow! I'm receiving a coded
at message from EUBIE BLAKE!!
visi.com

Grant Edwards

unread,
Feb 9, 2010, 5:00:50 PM2/9/10
to
On 2010-02-09, Grant Edwards <inv...@invalid.invalid> wrote:

time.time() exhibits the same behavior, so I assume that
datetime.datetime.new() ends up making the same libc/system
call as time.time(). From what I can grok of the datetime
module source code, it looks like it's calling gettimeofday().

I can't find any real documentation on the granularity of Win32
gettimeofday() other than a blog post that claims it is 10ms
(which doesn't agree with what my tests show).

--
Grant Edwards grante Yow! I feel better about
at world problems now!
visi.com

Gabriel Genellina

unread,
Feb 9, 2010, 6:43:51 PM2/9/10
to pytho...@python.org
En Tue, 09 Feb 2010 13:10:56 -0300, Grant Edwards
<inv...@invalid.invalid> escribi�:

> What's the correct way to measure small periods of elapsed
> time. I've always used time.clock() in the past:
>

> However on multi-processor machines that doesn't work.
> Sometimes I get negative values for delta. According to
> google, this is due to a bug in Windows that causes the value
> of time.clock() to be different depending on which core in a
> multi-core CPU you happen to be on. [insert appropriate
> MS-bashing here]

I'm not sure you can blame MS of this issue; anyway, this patch should fix
the problem:
http://support.microsoft.com/?id=896256

> Is there another way to measure small periods of elapsed time
> (say in the 1-10ms range)?

No that I know of. QueryPerformanceCounter (the function used by
time.clock) seems to be the best timer available.

> Is there a way to lock the python process to a single core so
> that time.clock() works right?

Interactively, from the Task Manager:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskman_assign_process.mspx

In code, using SetProcessAffinityMask and related functions:
http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx

--
Gabriel Genellina

Paul McGuire

unread,
Feb 10, 2010, 1:41:46 AM2/10/10
to
On Feb 9, 10:10 am, Grant Edwards <inva...@invalid.invalid> wrote:
> Is there another way to measure small periods of elapsed time
> (say in the 1-10ms range)?
>
On Feb 9, 10:10 am, Grant Edwards <inva...@invalid.invalid> wrote:
> Is there another way to measure small periods of elapsed time
> (say in the 1-10ms range)?
>

I made repeated calls to time.clock() in a generator expression, which
is as fast a loop I can think of in Python. Then I computed the
successive time deltas to see if any granularities jumped out. Here
are the results:

>>> import time
>>> from itertools import groupby
>>>
>>> # get about 1000 different values of time.clock()
>>> ts = set(time.clock() for i in range(1000))
>>>
>>> # sort in ascending order
>>> ts = sorted(ts)
>>>
>>> # compute diffs between adjacent time values
>>> diffs = [j-i for i,j in zip(ts[:-1],ts[1:])]
>>>
>>> # sort and group
>>> diffs.sort()
>>> diffgroups = groupby(diffs)
>>>
>>> # print the distribution of time differences in microseconds
>>> for i in diffgroups: print "%3d %12.6f" % (len(list(i[1])), i[0]*1e6)
...
25 2.234921
28 2.234921
242 2.514286
506 2.514286
45 2.793651
116 2.793651
1 3.073016
8 3.073016
6 3.352381
4 3.631746
3 3.911112
1 3.911112
5 4.190477
2 4.469842
1 6.146033
1 8.660319
1 9.777779
1 10.895239
1 11.174605
1 24.304765
1 41.904767

There seems to be a step size of about .28 microseconds. So I would
guess time.clock() has enough resolution. But also beware of the
overhead of the calls to clock() - using timeit, I find that each call
takes about 2 microseconds (consistent with the smallest time
difference in the above data set).

-- Paul

Message has been deleted

Paul McGuire

unread,
Feb 10, 2010, 8:37:22 AM2/10/10
to
On Feb 10, 2:24 am, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> On Tue, 9 Feb 2010 21:45:38 +0000 (UTC), Grant Edwards
> <inva...@invalid.invalid> declaimed the following in
> gmane.comp.python.general:

>
> > Doesn't work.  datetime.datetime.now has granularity of
> > 15-16ms.
>
> > Intervals much less that that often come back with a delta of
> > 0.  A delay of 20ms produces a delta of either 15-16ms or
> > 31-32ms
>
>         WinXP uses an ~15ms time quantum for task switching. Which defines
> the step rate of the wall clock output...
>
> http://www.eggheadcafe.com/software/aspnet/35546579/the-quantum-was-n...http://www.eggheadcafe.com/software/aspnet/32823760/how-do-you-set-ti...
>
> http://www.lochan.org/2005/keith-cl/useful/win32time.html
> --
>         Wulfraed         Dennis Lee Bieber               KD6MOG
>         wlfr...@ix.netcom.com     HTTP://wlfraed.home.netcom.com/

Gabriel Genellina reports that time.clock() uses Windows'
QueryPerformanceCounter() API, which has much higher resolution than
the task switcher's 15ms. QueryPerformanceCounter's resolution is
hardware-dependent; using the Win API, and a little test program, I
get this value on my machine:
Frequency is 3579545 ticks/sec
Resolution is 0.279365114840015 microsecond/tick

-- Paul

Grant Edwards

unread,
Feb 10, 2010, 11:15:22 AM2/10/10
to
On 2010-02-09, Gabriel Genellina <gags...@yahoo.com.ar> wrote:
> En Tue, 09 Feb 2010 13:10:56 -0300, Grant Edwards
><inv...@invalid.invalid> escribi?:

>
>> What's the correct way to measure small periods of elapsed
>> time. I've always used time.clock() in the past:
>>
>> However on multi-processor machines that doesn't work.
>> Sometimes I get negative values for delta. According to
>> google, this is due to a bug in Windows that causes the value
>> of time.clock() to be different depending on which core in a
>> multi-core CPU you happen to be on. [insert appropriate
>> MS-bashing here]
>
> I'm not sure you can blame MS of this issue; anyway, this
> patch should fix the problem:
> http://support.microsoft.com/?id=896256

I'm curious why it wouldn't be Microsoft's fault, because

A) Everything is Microsoft's fault. ;)

B) If a patch to MS Windows fixes the problem, how is it not a
problem in MS Windows?

>> Is there a way to lock the python process to a single core so
>> that time.clock() works right?
>
> Interactively, from the Task Manager:
> http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskman_assign_process.mspx

Thanks. That looks a bit easier than disabling the second core
(which is what I ended up doing).

> In code, using SetProcessAffinityMask and related functions:
> http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx

With help from google and some old mailing list posting I might
even try that.

--
Grant Edwards grante Yow! It's a lot of fun
at being alive ... I wonder if
visi.com my bed is made?!?

Gabriel Genellina

unread,
Feb 10, 2010, 5:37:36 PM2/10/10
to pytho...@python.org
En Wed, 10 Feb 2010 13:15:22 -0300, Grant Edwards
<inv...@invalid.invalid> escribi�:

> On 2010-02-09, Gabriel Genellina <gags...@yahoo.com.ar> wrote:
>> En Tue, 09 Feb 2010 13:10:56 -0300, Grant Edwards
>> <inv...@invalid.invalid> escribi?:
>>
>>> What's the correct way to measure small periods of elapsed
>>> time. I've always used time.clock() in the past:
>>> However on multi-processor machines that doesn't work.
>>> Sometimes I get negative values for delta. According to
>>> google, this is due to a bug in Windows that causes the value
>>> of time.clock() to be different depending on which core in a
>>> multi-core CPU you happen to be on. [insert appropriate
>>> MS-bashing here]
>>
>> I'm not sure you can blame MS of this issue; anyway, this
>> patch should fix the problem:
>> http://support.microsoft.com/?id=896256
>
> I'm curious why it wouldn't be Microsoft's fault, because
>
> A) Everything is Microsoft's fault. ;)
>
> B) If a patch to MS Windows fixes the problem, how is it not a
> problem in MS Windows?

I won't argue against A) because its truthness (?) is self-evident :)

99% of my code does not run in Python 3.x; I may fix it and it will
eventually run fine, but that doesn't mean it's *my* fault.

The original problem was with the RDTSC instruction on multicore CPUs;
different cores may yield different results because they're not
synchronized at all times.

Windows XP was launched in 2001, and the first dual core processors able
to execute Windows were AMD Opteron and IBM Pentium D, both launched
around April 2005 (and targeting the server market, not the home/desktop
market of Windows XP).
How could MS know in 2001 of a hardware issue that would happen four years
in the future?
Guido seems very jealous of his time machine and does not lend it to
anyone.

--
Gabriel Genellina

Grant Edwards

unread,
Feb 10, 2010, 5:40:34 PM2/10/10
to
On 2010-02-09, Gabriel Genellina <gags...@yahoo.com.ar> wrote:

> In code, using SetProcessAffinityMask and related functions:
> http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx

That solves the problem. If I don't set the process affinity
mask, I regularly get measurements that are off by 6ms. I
presume 6ms is the skew between the two cores' performance
counter values. I don't know if that difference is stable or
how it varies, but now it doesn't matter. :)

--

Tim Roberts

unread,
Feb 13, 2010, 3:29:45 PM2/13/10
to
"Gabriel Genellina" <gags...@yahoo.com.ar> wrote:
>
>The original problem was with the RDTSC instruction on multicore CPUs;
>different cores may yield different results because they're not
>synchronized at all times.

Not true. The synchronization issue has two causes: initial
synchronization at boot time, and power management making microscopic
adjustments to the clock rate. In Windows NT 4, Microsoft took extra pains
to adjust the cycle counters on multiprocessor computers during boot so
that the processors started out very close together. Once set, they
naturally remained in lock step, until aggressive power management because
more prevalent. In XP, they stopped trying to align at boot time.

>Windows XP was launched in 2001, and the first dual core processors able
>to execute Windows were AMD Opteron and IBM Pentium D, both launched
>around April 2005 (and targeting the server market, not the home/desktop
>market of Windows XP).
>How could MS know in 2001 of a hardware issue that would happen four years
>in the future?

No, you're underestimating the problem. The issue occurs just as much in
machines with multiple processor chips, which was supported clear back in
the original NT 3.1, 1992.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

Albert van der Horst

unread,
Feb 22, 2010, 12:52:48 PM2/22/10
to
In article <87404349-5d3a-4396...@f8g2000yqn.googlegroups.com>,
Paul McGuire <pt...@austin.rr.com> wrote:

>On Feb 10, 2:24=A0am, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
>> On Tue, 9 Feb 2010 21:45:38 +0000 (UTC), Grant Edwards
>> <inva...@invalid.invalid> declaimed the following in
>> gmane.comp.python.general:
>>
>> > Doesn't work. =A0datetime.datetime.now has granularity of

>> > 15-16ms.
>>
>> > Intervals much less that that often come back with a delta of
>> > 0. =A0A delay of 20ms produces a delta of either 15-16ms or
>> > 31-32ms
>>
>> =A0 =A0 =A0 =A0 WinXP uses an ~15ms time quantum for task switching. Whic=

>h defines
>> the step rate of the wall clock output...
>>
>> http://www.eggheadcafe.com/software/aspnet/35546579/the-quantum-was-n...h=
>ttp://www.eggheadcafe.com/software/aspnet/32823760/how-do-you-set-ti...
>>
>> http://www.lochan.org/2005/keith-cl/useful/win32time.html
>> --
>> =A0 =A0 =A0 =A0 Wulfraed =A0 =A0 =A0 =A0 Dennis Lee Bieber =A0 =A0 =A0 =
>=A0 =A0 =A0 =A0 KD6MOG
>> =A0 =A0 =A0 =A0 wlfr...@ix.netcom.com =A0 =A0 HTTP://wlfraed.home.netcom.=

>com/
>
>Gabriel Genellina reports that time.clock() uses Windows'
>QueryPerformanceCounter() API, which has much higher resolution than
>the task switcher's 15ms. QueryPerformanceCounter's resolution is
>hardware-dependent; using the Win API, and a little test program, I
>get this value on my machine:
>Frequency is 3579545 ticks/sec
>Resolution is 0.279365114840015 microsecond/tick

In Forth we add a small machine code routine that executes the
RDTSC instruction. (I used that to play music on a couple of
mechanical instruments in real time.)
It just counts the (3 Ghz) clock cycles in a 64 bit timer.
Subtract two samples and you're done.

Is there a mechanism in Python to do something similar,
embedded assembler or something?

(This is not a general solution, but at least it would work on
Windows, that is i86 only.)

>
>-- Paul


--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Grant Edwards

unread,
Feb 22, 2010, 12:45:04 PM2/22/10
to
On 2010-02-22, Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
> In article <87404349-5d3a-4396...@f8g2000yqn.googlegroups.com>,

>>Gabriel Genellina reports that time.clock() uses Windows'
>>QueryPerformanceCounter() API, which has much higher resolution
>>than the task switcher's 15ms. QueryPerformanceCounter's
>>resolution is hardware-dependent; using the Win API, and a
>>little test program, I get this value on my machine: Frequency
>>is 3579545 ticks/sec Resolution is 0.279365114840015
>>microsecond/tick
>
> In Forth we add a small machine code routine that executes the
> RDTSC instruction. (I used that to play music on a couple of
> mechanical instruments in real time.) It just counts the (3
> Ghz) clock cycles in a 64 bit timer.

That's what clock.clock() does, except that it converts it into
a floating point value in seconds.

> Subtract two samples and you're done.

Nope. It would fail the same way that clock.clock() does on a
multi-core Windows machine.

> Is there a mechanism in Python to do something similar,
> embedded assembler or something?

You'd get the same results as using clock.clock(). Just
different format/units.

> (This is not a general solution, but at least it would work on
> Windows, that is i86 only.)

It fails on Windows for the same reason that clock.clock()
fails: the counters read by the RDTSC instruction are not
synchronized between the different cores.

--
Grant Edwards grante Yow! I'm a nuclear
at submarine under the
visi.com polar ice cap and I need
a Kleenex!

Gabriel Genellina

unread,
Feb 24, 2010, 12:32:59 PM2/24/10
to pytho...@python.org
En Sat, 13 Feb 2010 17:29:45 -0300, Tim Roberts <ti...@probo.com> escribiᅵ:

> "Gabriel Genellina" <gags...@yahoo.com.ar> wrote:
>>
>> The original problem was with the RDTSC instruction on multicore CPUs;
>> different cores may yield different results because they're not
>> synchronized at all times.
>
> Not true. The synchronization issue has two causes: initial
> synchronization at boot time, and power management making microscopic
> adjustments to the clock rate. In Windows NT 4, Microsoft took extra
> pains
> to adjust the cycle counters on multiprocessor computers during boot so
> that the processors started out very close together. Once set, they
> naturally remained in lock step, until aggressive power management
> because
> more prevalent. In XP, they stopped trying to align at boot time.

Thanks for the detailed explanation!

--
Gabriel Genellina

0 new messages