What's the correct way to measure small periods of elapsed
time? I've always used time.clock() in the past:

start = time.clock()
[stuff being timed]
stop = time.clock()
delta = stop - start
However, on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
Google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]
Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?
Is there a way to lock the python process to a single core so
that time.clock() works right?
--
Grant Edwards grante Yow! If I felt any more
at SOPHISTICATED I would DIE
visi.com of EMBARRASSMENT!
Did you try with the datetime module ?
import datetime
t0 = datetime.datetime.now()
t1 = t0 - datetime.datetime.now()
t1.microseconds
Out[4]: 644114
JM
No. What mechanism does it use to get the current date/time?
> import datetime
> t0 = datetime.datetime.now()
> t1 = t0 - datetime.datetime.now()
> t1.microseconds
> Out[4]: 644114
That looks very broken to me. I need to measure stuff in the
1-20ms range, and the smallest value you can measure using the
method above appears to be 640ms. That's almost 2/3 of a second.
--
Grant Edwards grante Yow! If our behavior is
at strict, we do not need fun!
visi.com
Duh. It just occurred to me that that was done interactively. I'll
give the datetime module a try once I reboot my test machine
back into Windows.
--
Grant Edwards grante Yow! Is this sexual
at intercourse yet?? Is it,
visi.com huh, is it??
Doesn't work. datetime.datetime.now() has a granularity of
15-16ms.
Intervals much less than that often come back with a delta of
0. A delay of 20ms produces a delta of either 15-16ms or
31-32ms.
--
Grant Edwards grante Yow! I'm receiving a coded
at message from EUBIE BLAKE!!
visi.com
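Grant's granularity numbers are easy to reproduce. Here is a minimal
sketch (not from the thread; the helper name datetime_step is my own)
that measures the smallest observable step of datetime.datetime.now():

```python
import datetime

def datetime_step(samples=5000):
    """Smallest positive gap (in seconds) between distinct
    datetime.datetime.now() readings taken in a tight loop."""
    readings = sorted({datetime.datetime.now() for _ in range(samples)})
    gaps = [(b - a).total_seconds() for a, b in zip(readings, readings[1:])]
    return min(gaps)

print("datetime.now() step: %.6f s" % datetime_step())
```

On the Windows machines discussed in this thread it should print roughly
0.015-0.016 s; on most current systems it is in the microsecond range.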
time.time() exhibits the same behavior, so I assume that
datetime.datetime.now() ends up making the same libc/system
call as time.time(). From what I can grok of the datetime
module source code, it looks like it's calling gettimeofday().
I can't find any real documentation on the granularity of Win32
gettimeofday() other than a blog post that claims it is 10ms
(which doesn't agree with what my tests show).
--
Grant Edwards grante Yow! I feel better about
at world problems now!
visi.com
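Grant inferred the underlying system call by reading the datetime
source. Python 3.3+ will tell you directly via time.get_clock_info();
a small sketch (output varies by platform, so none is shown):

```python
import time

# Each entry reports the OS-level implementation and the advertised
# resolution of one of Python's clocks.
for name in ("time", "monotonic", "perf_counter"):
    info = time.get_clock_info(name)
    print("%-12s impl=%-25s resolution=%g s"
          % (name, info.implementation, info.resolution))
```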
> What's the correct way to measure small periods of elapsed
> time. I've always used time.clock() in the past:
>
> However on multi-processor machines that doesn't work.
> Sometimes I get negative values for delta. According to
> google, this is due to a bug in Windows that causes the value
> of time.clock() to be different depending on which core in a
> multi-core CPU you happen to be on. [insert appropriate
> MS-bashing here]
I'm not sure you can blame MS for this issue; anyway, this patch should fix
the problem:
http://support.microsoft.com/?id=896256
> Is there another way to measure small periods of elapsed time
> (say in the 1-10ms range)?
Not that I know of. QueryPerformanceCounter (the function used by
time.clock) seems to be the best timer available.
> Is there a way to lock the python process to a single core so
> that time.clock() works right?
Interactively, from the Task Manager:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskman_assign_process.mspx
In code, using SetProcessAffinityMask and related functions:
http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx
--
Gabriel Genellina
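Gabriel's SetProcessAffinityMask suggestion can be done from pure
Python with ctypes. The sketch below (the helper name pin_to_core is my
own) pins the current process to one core on Windows, with
os.sched_setaffinity as the Linux equivalent:

```python
import os
import sys

def pin_to_core(core=0):
    """Restrict the current process to a single logical CPU."""
    if sys.platform == "win32":
        import ctypes
        kernel32 = ctypes.windll.kernel32
        mask = 1 << core                     # one bit per logical processor
        handle = kernel32.GetCurrentProcess()
        if not kernel32.SetProcessAffinityMask(handle, mask):
            raise ctypes.WinError()
    else:
        # Linux, Python 3.3+; 0 means "this process"
        os.sched_setaffinity(0, {core})

pin_to_core(0)
```

After this, all time.clock()/QueryPerformanceCounter readings come from
the same core's counter, which is exactly what the OP needs.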
I made repeated calls to time.clock() in a generator expression, which
is as fast a loop as I can think of in Python. Then I computed the
successive time deltas to see if any granularities jumped out. Here
are the results:
>>> import time
>>> from itertools import groupby
>>>
>>> # get about 1000 different values of time.clock()
>>> ts = set(time.clock() for i in range(1000))
>>>
>>> # sort in ascending order
>>> ts = sorted(ts)
>>>
>>> # compute diffs between adjacent time values
>>> diffs = [j-i for i,j in zip(ts[:-1],ts[1:])]
>>>
>>> # sort and group
>>> diffs.sort()
>>> diffgroups = groupby(diffs)
>>>
>>> # print the distribution of time differences in microseconds
>>> for i in diffgroups: print "%3d %12.6f" % (len(list(i[1])), i[0]*1e6)
...
25 2.234921
28 2.234921
242 2.514286
506 2.514286
45 2.793651
116 2.793651
1 3.073016
8 3.073016
6 3.352381
4 3.631746
3 3.911112
1 3.911112
5 4.190477
2 4.469842
1 6.146033
1 8.660319
1 9.777779
1 10.895239
1 11.174605
1 24.304765
1 41.904767
There seems to be a step size of about 0.28 microseconds. So I would
guess time.clock() has enough resolution. But also beware of the
overhead of the calls to clock() - using timeit, I find that each call
takes about 2 microseconds (consistent with the smallest time
difference in the above data set).
-- Paul
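Paul's per-call overhead estimate is easy to reproduce with timeit. A
sketch, using time.perf_counter since time.clock was removed in Python
3.8 (on Windows, perf_counter is likewise backed by
QueryPerformanceCounter):

```python
import timeit

N = 100_000
# Total wall time for N bare clock() calls, divided out per call.
total = timeit.timeit("clock()",
                      setup="from time import perf_counter as clock",
                      number=N)
print("per-call overhead: %.3f microseconds" % (total / N * 1e6))
```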
Gabriel Genellina reports that time.clock() uses Windows'
QueryPerformanceCounter() API, which has much higher resolution than
the task switcher's 15ms. QueryPerformanceCounter's resolution is
hardware-dependent; using the Win API, and a little test program, I
get this value on my machine:
Frequency is 3579545 ticks/sec
Resolution is 0.279365114840015 microsecond/tick
-- Paul
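Paul's frequency and resolution numbers come from the Win32
QueryPerformanceFrequency call, which is also reachable from Python via
ctypes. A Windows-only sketch:

```python
import sys

def qpc_frequency():
    """Ticks per second of the Win32 performance counter (Windows only)."""
    if sys.platform != "win32":
        raise OSError("QueryPerformanceFrequency is a Win32 API")
    import ctypes
    freq = ctypes.c_int64()
    ctypes.windll.kernel32.QueryPerformanceFrequency(ctypes.byref(freq))
    return freq.value

if sys.platform == "win32":
    f = qpc_frequency()
    print("Frequency is %d ticks/sec" % f)
    print("Resolution is %.15g microsecond/tick" % (1e6 / f))
```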
I'm curious why it wouldn't be Microsoft's fault, because
A) Everything is Microsoft's fault. ;)
B) If a patch to MS Windows fixes the problem, how is it not a
problem in MS Windows?
>> Is there a way to lock the python process to a single core so
>> that time.clock() works right?
>
> Interactively, from the Task Manager:
> http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskman_assign_process.mspx
Thanks. That looks a bit easier than disabling the second core
(which is what I ended up doing).
> In code, using SetProcessAffinityMask and related functions:
> http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx
With help from google and some old mailing list posting I might
even try that.
--
Grant Edwards grante Yow! It's a lot of fun
at being alive ... I wonder if
visi.com my bed is made?!?
I won't argue against A) because its truthness (?) is self-evident :)
99% of my code does not run in Python 3.x; I may fix it and it will
eventually run fine, but that doesn't mean it's *my* fault.
The original problem was with the RDTSC instruction on multicore CPUs;
different cores may yield different results because they're not
synchronized at all times.
Windows XP was launched in 2001, and the first dual-core processors able
to execute Windows were the AMD Opteron and Intel Pentium D, both launched
around April 2005 (and targeting the server market, not the home/desktop
market of Windows XP).
How could MS know in 2001 of a hardware issue that would happen four years
in the future?
Guido seems very jealous of his time machine and does not lend it to
anyone.
--
Gabriel Genellina
> In code, using SetProcessAffinityMask and related functions:
> http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx
That solves the problem. If I don't set the process affinity
mask, I regularly get measurements that are off by 6ms. I
presume 6ms is the skew between the two cores' performance
counter values. I don't know if that difference is stable or
how it varies, but now it doesn't matter. :)
--
Not true. The synchronization issue has two causes: initial
synchronization at boot time, and power management making microscopic
adjustments to the clock rate. In Windows NT 4, Microsoft took extra pains
to adjust the cycle counters on multiprocessor computers during boot so
that the processors started out very close together. Once set, they
naturally remained in lock step, until aggressive power management became
more prevalent. In XP, they stopped trying to align at boot time.
>Windows XP was launched in 2001, and the first dual core processors able
>to execute Windows were the AMD Opteron and Intel Pentium D, both launched
>around April 2005 (and targeting the server market, not the home/desktop
>market of Windows XP).
>How could MS know in 2001 of a hardware issue that would happen four years
>in the future?
No, you're underestimating the problem. The issue occurs just as much in
machines with multiple processor chips, which were supported clear back in
the original NT 3.1 in 1993.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
In Forth we add a small machine code routine that executes the
RDTSC instruction. (I used that to play music on a couple of
mechanical instruments in real time.)
It just counts the (3 GHz) clock cycles in a 64-bit timer.
Subtract two samples and you're done.
Is there a mechanism in Python to do something similar,
embedded assembler or something?
(This is not a general solution, but at least it would work on
Windows, that is, x86 only.)
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
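To answer Albert's question: CPython has no embedded assembler, but
since Python 3.7 time.perf_counter_ns() gives a 64-bit integer tick
count much like RDTSC, with the OS (rather than your code) responsible
for keeping it consistent across cores. A sketch:

```python
import time

t0 = time.perf_counter_ns()          # 64-bit integer, in nanoseconds
total = sum(range(100_000))          # stand-in for the work being timed
t1 = time.perf_counter_ns()
print("elapsed: %d ns" % (t1 - t0))
```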
>>Gabriel Genellina reports that time.clock() uses Windows'
>>QueryPerformanceCounter() API, which has much higher resolution
>>than the task switcher's 15ms. QueryPerformanceCounter's
>>resolution is hardware-dependent; using the Win API, and a
>>little test program, I get this value on my machine: Frequency
>>is 3579545 ticks/sec Resolution is 0.279365114840015
>>microsecond/tick
>
> In Forth we add a small machine code routine that executes the
> RDTSC instruction. (I used that to play music on a couple of
> mechanical instruments in real time.) It just counts the (3
> GHz) clock cycles in a 64-bit timer.
That's what time.clock() does, except that it converts it into
a floating point value in seconds.
> Subtract two samples and you're done.
Nope. It would fail the same way that time.clock() does on a
multi-core Windows machine.
> Is there a mechanism in Python to do something similar,
> embedded assembler or something?
You'd get the same results as using time.clock(), just in a
different format and units.
> (This is not a general solution, but at least it would work on
> Windows, that is i86 only.)
It fails on Windows for the same reason that time.clock()
fails: the counters read by the RDTSC instruction are not
synchronized between the different cores.
--
Grant Edwards grante Yow! I'm a nuclear
at submarine under the
visi.com polar ice cap and I need
a Kleenex!
Thanks for the detailed explanation!
--
Gabriel Genellina