
more accurate than GetTickCount()


Kenji chan

Jun 4, 2002, 3:32:06 AM
I want a function which is more accurate than GetTickCount()
any idea?

Actually, the problem is when the drawing in OpenGL is very fast,
(dwTickNow -dwTickCount) will be zero

DWORD dwTickNow = GetTickCount();
if ((dwTickNow -dwTickCount) ==0) SleepEx(2000, 0); // problem here
sprintf(sz, "FPS %0.2f, %d c", (float)1000 /(dwTickNow -dwTickCount),
(int)f_R1 );
TextOut(hDC, 5, rect_GL.bottom - rect_GL.top -20,sz, strlen(sz));
dwTickCount = dwTickNow;

Alex Wenger

Jun 4, 2002, 4:14:18 AM
> Actually, the problem is when the drawing in OpenGL is very fast,
> (dwTickNow -dwTickCount) will be zero
>
> DWORD dwTickNow = GetTickCount();
> if ((dwTickNow -dwTickCount) ==0) SleepEx(2000, 0); // problem here
> sprintf(sz, "FPS %0.2f, %d c", (float)1000 /(dwTickNow -dwTickCount),
> (int)f_R1 );
> TextOut(hDC, 5, rect_GL.bottom - rect_GL.top -20,sz, strlen(sz));
> dwTickCount = dwTickNow;

if ((dwTickNow -dwTickCount) ==0) {
strcpy(sz, "FPS very fast"); // sz is a char buffer, so copy rather than assign
}
else {
sprintf(sz, "FPS %0.2f, %d c", (float)1000 /(dwTickNow -dwTickCount),
(int)f_R1 );
}

:-)


Kenji chan

Jun 4, 2002, 4:35:04 AM
But I want to make it more accurate

"Alex Wenger" <a.we...@gmx.de> wrote in message
news:adhsmr$3o1$03$1...@news.t-online.com...

Eigil Hysvær

Jun 4, 2002, 5:03:09 AM
"Kenji chan" <kenji...@bigpond.com> wrote in
news:YS_K8.4891$Hj3....@newsfeeds.bigpond.com:

> But I want to make it more accurate
>

We're off-topic now - so I'll be brief.
With win32 you can use performance counters, which are 64-bit integers.
API-functions are: QueryPerformanceFrequency and QueryPerformanceCounter.

Stephen Kellett

Jun 4, 2002, 6:19:58 AM
In message <Xns92236FFD05CD4ei...@194.19.1.61>, Eigil
Hysvær <eigil....@eudoramail.com> writes

I wrote a little program that called all of the various "get a time of
some sort or other" as many times as you wanted (I chose 100,000) and
then told you the average time per call.

QueryPerformanceCounter() was the slowest, by a big margin.
GetTickCount() was the fastest.

I'd love to know if there is something better (faster) than
GetTickCount().
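
A benchmark of this kind can be sketched in portable C. This is not the program described above, only an illustration under stated assumptions: average_call_seconds() and call_time() are hypothetical names, and the standard clock() function stands in as the timing source. On Windows one would pass a small wrapper around GetTickCount() or QueryPerformanceCounter() instead of call_time().

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: average cost, in seconds, of one call to fn()
   when it is called `calls` times in a tight loop. The figure includes
   the loop and function-pointer overhead. */
double average_call_seconds(void (*fn)(void), long calls)
{
    assert(fn != NULL && calls > 0);

    clock_t start = clock();
    for (long i = 0; i < calls; ++i)
        fn();
    clock_t end = clock();

    return ((double)(end - start) / CLOCKS_PER_SEC) / (double)calls;
}

/* Example function under test (an assumption): the standard C time() call. */
void call_time(void)
{
    (void)time(NULL);
}
```

Calling average_call_seconds(call_time, 100000) repeats the call 100,000 times, the count mentioned above, and returns the mean cost of one call.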

Stephen
--
Stephen Kellett http://www.objmedia.demon.co.uk
Object Media Limited C++/Java/Windows NT/Unix/X Windows/Multimedia
If you are suffering from RSI, contact me for advice.
Unsolicited email from spam merchants not welcome.

Jan Wielemaker

Jun 4, 2002, 6:35:23 AM
In article <OJA+WYGORJ$8E...@objmedia.demon.co.uk>, Stephen Kellett wrote:
> In message <Xns92236FFD05CD4ei...@194.19.1.61>, Eigil
> Hysvær <eigil....@eudoramail.com> writes
>>"Kenji chan" <kenji...@bigpond.com> wrote in
>>news:YS_K8.4891$Hj3....@newsfeeds.bigpond.com:
>>
>>> But I want to make it more accurate
>>>
>>We're off-topic now - so I'll be brief.
>>With win32 you can use performance counters, which are 64-bit integers.
>>API-functions are: QueryPerformanceFrequency and QueryPerformanceCounter.
>
> I wrote a little program that called all of the various "get a time of
> some sort or other" as many times as you wanted (I chose 100,000) and
> then told you the average time per call.
>
> QueryPerformanceCounter() was the slowest, by a big margin.
> GetTickCount() was the fastest.
>
> I'd love to know if there is something better (faster) than
> GetTickCount().

Well, if you want to sacrifice portability (but Windows isn't portable
anyhow :-), you should be able to use the pentium clock directly. Here
is some code to do this using gcc under Linux. With enough knowledge it
shouldn't be hard to convert this.

--- Jan

#include <SWI-Prolog.h>

#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define TWO_POT_32 (double)4294967296.0


foreign_t pl_pentium_clock(term_t time_term)
{
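unsigned int iax; /* low 32 bits of the counter (EAX) */
unsigned int idx; /* high 32 bits of the counter (EDX) */
double processor_time;

/* EAX and EDX are already outputs, so they must not also be listed as clobbers. */
asm volatile("rdtsc"
: "=a" (iax), "=d" (idx));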

processor_time = (double)iax + TWO_POT_32*(double)idx;

return PL_unify_float(time_term, processor_time);
}

install_t install()
{
PL_register_foreign("pentium_clock", 1, pl_pentium_clock, 0);
}

Ruud van Gaal

Jun 4, 2002, 7:27:29 AM
On Tue, 4 Jun 2002 11:19:58 +0100, Stephen Kellett
<sn...@objmedia.demon.co.uk> wrote:

>In message <Xns92236FFD05CD4ei...@194.19.1.61>, Eigil
>Hysvær <eigil....@eudoramail.com> writes
>>"Kenji chan" <kenji...@bigpond.com> wrote in
>>news:YS_K8.4891$Hj3....@newsfeeds.bigpond.com:
>>
>>> But I want to make it more accurate
>>>
>>We're off-topic now - so I'll be brief.
>>With win32 you can use performance counters, which are 64-bit integers.
>>API-functions are: QueryPerformanceFrequency and QueryPerformanceCounter.
>
>I wrote a little program that called all of the various "get a time of
>some sort or other" as many times as you wanted (I chose 100,000) and
>then told you the average time per call.
>
>QueryPerformanceCounter() was the slowest, by a big margin.
>GetTickCount() was the fastest.
>
>I'd love to know if there is something better (faster) than
>GetTickCount().

Even though 'slow' may be harmless if you only call it once per drawn
frame, you could also check out 'gettimeofday'.
I use that, and I think its performance is in between GetTickCount()
and QueryPC(), or so I've read once.


Ruud van Gaal
Free car sim: http://www.racer.nl/
Pencil art : http://www.marketgraph.nl/gallery/

Kenji chan

Jun 4, 2002, 7:50:24 AM
Good discussion.
In my case, I only need a more accurate tick count. When drawing is very
fast (the window is very small), more than one frame is probably drawn within
a single timer tick, which means that sometimes TickCount (Now) and
TickCount (Last) hold the same DWORD value.

After switching to QueryPerformanceCounter(), the drawing doesn't slow down,
or at least I can't feel any difference.

// Code here
char sz[256];
LARGE_INTEGER LInt, LFreq;
QueryPerformanceFrequency(&LFreq); // counter ticks per second, not always 1,000,000
QueryPerformanceCounter(&LInt);

__int64 i64_Interval = LInt.QuadPart - i64TickCount; // interval in counter ticks
i64TickCount = LInt.QuadPart;

if (i64_Interval == 0)
sprintf(sz, "FPS N/A");
else
sprintf(sz, "FPS %0.2f, %d c", (double)LFreq.QuadPart / i64_Interval, (int)f_R1);

Gernot Frisch

Jun 4, 2002, 8:14:06 AM
Have you tried timeGetTime()? It's declared in <mmsystem.h> (link with
winmm.lib). It's a multimedia timer.

Regards,
Gernot

__________________________
DREAM DESIGN ENTERTAINMENT
http://www.Dream-D-Sign.de
__________________________

"Kenji chan" <kenji...@bigpond.com> schrieb im Newsbeitrag
news:4K1L8.5055$Hj3....@newsfeeds.bigpond.com...

Ruud van Gaal

Jun 4, 2002, 8:25:21 AM
On Tue, 4 Jun 2002 17:32:06 +1000, "Kenji chan"
<kenji...@bigpond.com> wrote:

>I want a function which is more accurate than GetTickCount()
>any idea?
>
>Actually, the problem is when the drawing in OpenGL is very fast,
>(dwTickNow -dwTickCount) will be zero
>
> DWORD dwTickNow = GetTickCount();
> if ((dwTickNow -dwTickCount) ==0) SleepEx(2000, 0); // problem here
> sprintf(sz, "FPS %0.2f, %d c", (float)1000 /(dwTickNow -dwTickCount),
>(int)f_R1 );

BTW for FPS usage, I'd measure the time over a stretch of frames so the
FPS doesn't wiggle that much. So keep the time for the last 10 frames, for
example. In that case, the TickCount granularity doesn't matter as
much.

WTH

Jun 4, 2002, 8:37:25 AM
Cool man, thanks :)

WTH

"Jan Wielemaker" <j...@ct.localnet> wrote in message
news:slrnafp5o...@ct.localnet...

Peter van Merkerk

Jun 4, 2002, 9:21:44 AM
> I wrote a little program that called all of the various "get a time of
> some sort or other" as many times as you wanted (I chose 100,000) and
> then told you the average time per call.
>
> QueryPerformanceCounter() was the slowest, by a big margin.
> GetTickCount() was the fastest.
>
> I'd love to know if there is something better (faster) than
> GetTickCount().

It depends on your requirements. The question is if you really do need to
call the time measurement functions very frequently (like 100000 times a
second) and what resolution you need. For an FPS counter both
QueryPerformanceCounter() and timeGetTime() would do. Ironically the higher
the resolution the slower the function:

GetTickCount() is very fast.
On a 700 MHz P3 this function takes 13 ns or 9 clock cycles.
It offers only a low resolution so it isn't suitable for precision time
measurements. In spite of its speed there is little point in calling
GetTickCount() frequently (that is, more than 18 times a second) because its
resolution is too low to yield sensible information.

timeGetTime() is fast.
On a 700 MHz P3 this function takes 160 ns or 115 clock cycles.
After calling timeBeginPeriod(1), timeGetTime() offers millisecond
resolution. So it is a suitable choice for medium precision time
measurements (>5 ms).

QueryPerformanceCounter() is relatively slow.
On a 700 MHz P3 this function takes 3.5 us or 2500 clock cycles.
It offers better than microsecond resolution, but because it is relatively
slow it isn't a good choice if you want to measure durations of less than
100 us.

The Pentium rdtsc instruction allows for very high precision timing and has
little overhead. Unfortunately it requires some assembly programming. The
rdtsc instruction is a good (the only) choice if you want to measure events
that last less than 10 us. If you want to measure events with very short
duration, keep in mind that the measurement itself may affect the results.
Writing timing results to the output (screen, file...etc) may have great
impact on the timing itself. With the rdtsc instruction you can also see the
effects of cache misses and wrong branch predictions; therefore its results
should be interpreted very carefully, especially in cases where the measured
duration is less than a few hundred clock cycles.

--
Peter van Merkerk
merkerk(at)turnkiek.nl

Jem Berkes

Jun 4, 2002, 11:07:36 AM
> Well, if you want to sacrifice portability (but Windows isn't portable
> anyhow :-), you should be able to use the pentium clock directly.
> Here is some code to do this using gcc under Linux. With enough
> knowledge it shouldn't be hard to convert this.

This isn't specific to Windows, it's specific to Pentium and newer CPUs
(supports all the AMD line too). The key instruction is RDTSC (read time
stamp counter). The counter counts all CPU cycles since powerup, which
offers an incredible accuracy. The way to use it is to first calibrate it
to the high performance counter like other people have mentioned.

Then you dig into an RDTSC-based delay system. Believe it or not, with a
1.7 GHz machine (for example), the resolution is 1/(1.7*10^9) s = 0.6 ns
(better than a nanosecond).
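
The arithmetic above can be written out as a short C sketch. The function names cycle_time_ns() and cycles_to_ns() are hypothetical, and the sketch assumes a fixed clock frequency that is already known.

```c
#include <assert.h>

/* Duration of one clock cycle in nanoseconds for a given clock frequency. */
double cycle_time_ns(double freq_hz)
{
    assert(freq_hz > 0.0);
    return 1.0e9 / freq_hz;
}

/* Convert a cycle count to nanoseconds at the given clock frequency. */
double cycles_to_ns(double cycles, double freq_hz)
{
    return cycles * cycle_time_ns(freq_hz);
}
```

At 1.7e9 Hz one cycle comes out at roughly 0.59 ns. The same conversion relates the cycle counts and timings quoted earlier in the thread for the 700 MHz P3, where 9 cycles is about 13 ns.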

:)

--
Jem E. Berkes
Student IEEE (Winnipeg)

http://www.pc-tools.net/
Windows, Linux & UNIX software

fungus

Jun 4, 2002, 2:10:25 PM
Peter van Merkerk wrote:
>
> It depends on your requirements. The question is if you really do need to
> call the time measurement functions very frequently (like 100000 times a
> second) and what resolution you need. For an FPS counter both
> QueryPerformanceCounter() and timeGetTime() would do. Ironically the higher
> the resolution the slower the function:
>

For a game you should only be calling the timer function
once per frame so I doubt it will matter much.

> The Pentium rdtsc instruction allows for very high
> precision timing and has little overhead.
>

How do you get the clock frequency of the CPU?
Presumably you need a number to divide the number
of cycles by.

--
<\___/>
/ O O \
\_____/ FTB.

Peter van Merkerk

Jun 4, 2002, 3:03:57 PM
> How do you get the clock frequency of the CPU?
> Presumably you need a number to divide the number
> of cycles by.

True. You get the CPU frequency by counting the number of CPU cycles that
pass over a certain measuring interval and then divide the number of cpu
cycles by the measurement interval. This way you can measure the CPU
frequency:

__int64 PerformanceTimer::measureCPUFrequency(unsigned long aInterval)
{
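// Raise priority to avoid interference from other threads.
// SetThreadPriority() returns a BOOL, so save the old priority first.
int Priority = ::GetThreadPriority(::GetCurrentThread());
::SetThreadPriority(::GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);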

__int64 lStart, lEnd;
__int64 lPCStart, lPCEnd;

// Count number of processor cycles inside the specified interval
QueryPerformanceCounter((LARGE_INTEGER*)&lPCStart);
lStart = rdtsc();
Sleep(aInterval);
lEnd = rdtsc();
QueryPerformanceCounter((LARGE_INTEGER*)&lPCEnd);

// Restore thread priority.
::SetThreadPriority(::GetCurrentThread(), Priority);

// Calculate Rdtsc/PerformanceCounter ratio;
__int64 lDiff = lEnd - lStart;
__int64 lPCDiff = lPCEnd - lPCStart;

double lRatio = double(lDiff) / double(lPCDiff);
__int64 lPCFreq;
QueryPerformanceFrequency((LARGE_INTEGER*)&lPCFreq);

// Calculate CPU frequency.
m_CPUFreq = __int64(lPCFreq * lRatio);

return m_CPUFreq;
}
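
The calculation at the end of that function can be isolated as a pure function, which makes the arithmetic easy to check without any Windows calls. This is a sketch only: estimate_cpu_hz() and its parameter names are hypothetical, and the caller is assumed to have sampled both counters over the same interval.

```c
#include <assert.h>

/* Estimate the CPU frequency in Hz from two counters sampled over the
   same interval.
   tsc_delta: rdtsc cycles elapsed
   pc_delta:  performance counter ticks elapsed
   pc_freq:   performance counter ticks per second */
double estimate_cpu_hz(long long tsc_delta, long long pc_delta, long long pc_freq)
{
    assert(pc_delta > 0 && pc_freq > 0);

    /* CPU cycles per performance counter tick. */
    double ratio = (double)tsc_delta / (double)pc_delta;

    /* Ticks per second times cycles per tick gives cycles per second. */
    return ratio * (double)pc_freq;
}
```

Because only the ratio of the two deltas matters, the exact length of the Sleep() interval does not have to be known.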

Stephen Kellett

Jun 4, 2002, 4:13:07 PM
In message <3CFD02FA...@egg.chips.and.spam.com>, fungus
<sp...@egg.chips.and.spam.com> writes

>Peter van Merkerk wrote:
>>
>> It depends on your requirements. The question is if you really do need to
>> call the time measurement functions very frequently (like 100000 times a
>> second) and what resolution you need. For an FPS counter both
>> QueryPerformanceCounter() and timeGetTime() would do. Ironically the higher
>> the resolution the slower the function:
>>
>
>For a game you should only be calling the timer function
>once per frame so I doubt it will matter much.

I'm not writing a game. It does matter a lot to me.

>> The Pentium rdtsc instruction allows for very high
>> precision timing and has little overhead.
>>
>
>How do you get the clock frequency of the CPU?

It's in the NT registry (sorry I don't know about the Win9x line - I just
looked on a Win95 machine and if it's there it's in a different place)

HKEY_LOCAL_MACHINE\Hardware\description\system\centralprocessor\0\~MHz

My 1.1 GHz Athlon says 0x457 => 1111 MHz

There is probably a machine instruction to get you the exact speed.

Ratch

Jun 6, 2002, 9:34:54 PM
Do not the Win32 APIs, QueryPerformanceCounter, and
QueryPerformanceFrequency, use the Read Time Stamp Register? Anyway, here
is a good link on using RDTSC. Ratch

http://cedar.intel.com/software/idap/media/pdf/rdtscpm1.pdf


"Peter van Merkerk" <mer...@deadspam.com> wrote in message
news:adiesr$11nejl$1...@ID-133164.news.dfncis.de...

Peter van Merkerk

Jun 7, 2002, 5:25:06 AM
> Do not the Win32 APIs, QueryPerformanceCounter, and
> QueryPerformanceFrequency, use the Read Time Stamp Register?

No, at least not always. Just look at the frequency returned by
QueryPerformanceFrequency(). Long ago I saw it on one system to be equal
to the processor frequency. But on every other system I've checked it is either
1.19 MHz (which happens to be the frequency of one of the timer chips) or
3.5 MHz. Also, the number of cycles a QueryPerformanceCounter()
call takes suggests that a user-kernel mode switch is taking place,
probably to access hardware. The rdtsc instruction can be executed in user
mode, so if QueryPerformanceCounter() used this instruction it would
be a lot faster (comparable to the GetTickCount() function).

Pete

Jun 7, 2002, 7:36:57 AM
Here's the code that I use to read the Pentium TSC. I don't recall how
many clock cycles it uses, but I recall that it's very fast. Boosting
the thread priority during the benchmarking portion of the code would be
a good idea.

This code returns the Pentium TSC in microseconds, in double precision
floating point.

Best regards,
Pete

#include <windows.h>
#include <stdio.h>
#include <conio.h>
#include <lcttimer.h>

/* Function prototypes: */
double lct_TimerRead(void);
_int64 readTSC (void);

/****************************************************************************/
double lct_TimerRead(void)
{
_int64 time1;
_int64 time2;
double dProcessorSpeed;
static BOOL bInitialized;
static double dDiv;

if (bInitialized == FALSE)
{
/* NOTE: we're using Windows to try to sleep for a second here in order */
/* to get a reading on the processor speed. Windows' clock is accurate */
/* to 10 milliseconds so we should be accurate to within about 1% of true */
time1 = readTSC();
Sleep(1000);
time2 = readTSC();

/* The processor speed is the number of clock cycles in a second. */
/* Kept as a double: an int would overflow above about 2.1 GHz. */
dProcessorSpeed = (double)(time2 - time1);

/* Simplify conversion to microseconds. */
dDiv = (double)1000000 / dProcessorSpeed;

/* Do this only once. */
bInitialized = TRUE;
}

return (readTSC() * dDiv);
}
/****************************************************************************/
_int64 readTSC (void)
{
/* This code originally came from: */
/* http://www.ese-metz.fr/metz/personnel/dedu/docs/timer.html */

/* From the Intel Architecture Software Developer’s Manual Volume 2: */
/* Instruction Set Reference */
/* ------------------------------------------------------------------------- */
/* This instruction loads the current value of the processor’s time-stamp */
/* counter into the EDX:EAX registers. The time-stamp counter is contained */
/* in a 64-bit MSR. The high-order 32 bits of the MSR are loaded into the */
/* EDX register, and the low-order 32 bits are loaded into the EAX register. */
/* The processor increments the time-stamp counter MSR every clock cycle and */
/* resets it to 0 whenever the processor is reset. */

_int64 t;
unsigned int a,b;
unsigned int *c = (unsigned int *)&t;
_asm {
_emit 0x0f;
_emit 0x31;
mov a,eax;
mov b,edx;
}
c[0]=a;c[1]=b;
return t;
}
/****************************************************************************/

fungus

Jun 9, 2002, 11:32:19 AM
Peter van Merkerk wrote:
>
> No, at least not always. Just look at the frequency returned by
> QueryPerformanceFrequency(). Long ago I saw it on one system to be equal
> to the processor frequency. But on every other system I've checked it is either
> 1.19 MHz (which happens to be the frequency of one of the timer chips) or
> 3.5 MHz. Also, the number of cycles a QueryPerformanceCounter()
> call takes suggests that a user-kernel mode switch is taking place,
> probably to access hardware. The rdtsc instruction can be executed in user
> mode, so if QueryPerformanceCounter() used this instruction it would
> be a lot faster (comparable to the GetTickCount() function).
>

I've been investigating rdtsc recently and I've come
to the conclusion that it's almost useless.

Apart from the difficulty (impossibility?) of reliably
finding out the CPU clock frequency, there are a couple
of traps waiting for you.

Firstly, if you're on a laptop with power management
the CPU frequency can change depending on how often
you press keys, move the mouse, plug in the mains
transformer, etc.

Secondly, if you're on a multi CPU system your program
can switch between CPUs and return the rdtsc value from
a completely different chip.

Phil Frisbie, Jr.

Jun 10, 2002, 11:50:38 AM
fungus wrote:
>
> I've been investigating rdtsc recently and I've come
> to the conclusion that it's almost useless.
>
> Apart from the difficulty (impossibility?) of reliably
> finding out the CPU clock frequency, there are a couple
> of traps waiting for you.
>
> Firstly, if you're on a laptop with power management
> the CPU frequency can change depending on how often
> you press keys, move the mouse, plug in the mains
> transformer, etc.

That is a very good point.

> Secondly, if you're on a multi CPU system your program
> can switch between CPUs and return the rdtsc value from
> a completely different chip.

This would only be a problem with multiple threads, but most useful applications
are multithreaded.

For these reasons I have stopped using rdtsc except for pinpoint profiling of
code. QueryPerformanceCounter() is high enough resolution, and most applications
should not be calling it more than a couple of hundred times a second anyway.


Phil Frisbie, Jr.
Hawk Software
http://www.hawksoft.com

fungus

Jun 10, 2002, 12:59:02 PM
Phil Frisbie, Jr. wrote:
>
>>If you're on a multi CPU system your program
>>can switch between CPUs and return the rdtsc value from
>>a completely different chip.
>
>
> This would only be a problem with multiple threads
>

No, it can happen even with a single thread.
I since found out there's actually a function
SetThreadAffinityMask() to keep your thread
running on the same chip.

Whatever, rdtsc seems like a lot of trouble.

Ruud van Gaal

Jun 10, 2002, 3:37:24 PM
On Mon, 10 Jun 2002 16:59:02 GMT, fungus <sp...@egg.chips.and.spam.com>
wrote:

Like most undocumented features of the past stretch of years. Using
raw code like RDTSC just shortens the lifespan of your product. Great
for games! ;-)

Peter van Merkerk

Jun 10, 2002, 5:03:02 PM
> Like most undocumented features of the past stretch of years. Using
> raw code like RDTSC just shortens the lifespan of your product. Great
> for games! ;-)

The RDTSC instruction has been documented for at least seven years by now,
if not longer. The RDTSC instruction can be a useful tool in test programs
to measure how long an API call takes, or to measure the time it takes to
execute an algorithm that cannot be reliably or accurately profiled by a
profiling tool. For production code the RDTSC instruction is not a very good
idea for the reasons pointed out in previous posts. As often, it eventually
boils down to choosing the right tool for the job.

Ruud van Gaal

Jun 11, 2002, 6:37:04 AM

You're right there indeed. I do fear however that a lot of RDTSC
instructions are kept in production code, if only by mistake.
But in closed environments, it can be very useful.
