GetThreadTimes...

Jochen Kalmbach

unread,

Oct 21, 2004, 1:34:10 AM10/21/04

to

Hello,

I just wanted to "test" the GetThreadTimes function and I was surprised
by the result!

I did a small app, which creats two threads. This threads are only
waiting for each other on an event.

Two Events: A (signaled), B (non-signaled)
Thread1: WaitForEvent-A; Signal Event-B; loop;
Thread2: WaitForEvent-B; Signal Event-A; loop;

The main thread is just setting up the events and threads. The loop is
executed 1,000,000 times.

Now I measured the overall time in the main thread from the start of the
threads until the threads have finished the loops.

Now I also retrive the GetThreadTimes for the threads.
The surprising result is here:
<result>
Overal duration: 9109.691 ms

Reported time for thread 1
Kernel: 00:03.046
User: 00:00.781

Reported time for thread 2
Kernel: 00:01.812
User: 00:00.687
</result>

There was no other processes running!

So this leads me to some questions:

1. Where is the rest of the time?

Thread1+2 are using only 6.326 seconds
WHat about the rest of 2.7 seconds?
Is this really used for Task-Switching?

2. Why is the kernel time for thread 1 and 2 so different?
Normaly it should be the same, beacuse they execute
(almost) the same code.

Has someone any ideas?
Did I miss something?

Here is the code:

<code>
#include <windows.h>
#include <stdio.h>

HANDLE hWaitA;
HANDLE hWaitB;

DWORD counterA = 0;
DWORD counterB = 0;

DWORD __stdcall ThreadA(LPVOID)
{
while(counterA <= 1000000)
{
WaitForSingleObject(hWaitA, INFINITE);
SetEvent(hWaitB);
counterA++;
}
return 0;
}

DWORD WINAPI ThreadB(LPVOID)
{
while(counterB <= 1000000)
{
WaitForSingleObject(hWaitB, INFINITE);
SetEvent(hWaitA);
counterB++;
}
return 0;
}

int _tmain(int argc, _TCHAR* argv[])
{
DWORD id;
HANDLE hThread[2];
hWaitA = CreateEvent(NULL, FALSE, TRUE, NULL);
hWaitB = CreateEvent(NULL, FALSE, FALSE, NULL);

LARGE_INTEGER liStart, liEnd, liFreq;

QueryPerformanceCounter(&liStart);

hThread[0] = CreateThread(NULL, 0, ThreadA, 0, 0, &id);
hThread[1] = CreateThread(NULL, 0, ThreadB, 0, 0, &id);

WaitForMultipleObjects(2, hThread, TRUE, INFINITE);

QueryPerformanceCounter(&liEnd);
QueryPerformanceFrequency(&liFreq);

double ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) /
(double)liFreq.QuadPart;
printf("Duration: %.3f ms\n\n", ms);

for(DWORD i=0; i<2; i++)
{
FILETIME ftCreate, ftExit, ftKernel, ftUser;
GetThreadTimes(hThread[i], &ftCreate, &ftExit, &ftKernel, &ftUser);
printf("Reported time for thread %d\n", i+1);

SYSTEMTIME st;
FileTimeToSystemTime(&ftKernel, &st);
printf("Kernel: %2.2d:%2.2d.%3.3d\n", st.wMinute, st.wSecond,
st.wMilliseconds);

FileTimeToSystemTime(&ftUser, &st);
printf("User: %2.2d:%2.2d.%3.3d\n\n", st.wMinute, st.wSecond,
st.wMilliseconds);
}

return 0;
}
</code>

--
Greetings
Jochen

My blog about Win32 and .NET
http://blog.kalmbachnet.de/

Arkady Frenkel

unread,

Oct 21, 2004, 6:15:29 PM10/21/04

to

Jochen...
"Jochen Kalmbach" <nospam-Joch...@holzma.de> wrote in message
news:Xns95894D39CEEE9J...@127.0.0.1...

But what's about system it's a lot of threads running , see that with
Process Explorer from
sysinternals

> 2. Why is the kernel time for thread 1 and 2 so different?
> Normaly it should be the same, beacuse they execute
> (almost) the same code.
>

They similar in user mode but not in kernel ,that show that kernel threads
do all the business , but
why the are differ almost twice ?.... Maybe do your experiment few times to
see if really that
will be such in other tests
Arkady

Arkady Frenkel

unread,

Oct 21, 2004, 6:20:57 PM10/21/04

to

I mean if some interrupts happen in the time second thread run ( like timer
interrupt occasionally happen in the time of second thread and all
scheduling of system threads done on the account of second thread )
Arkady

"Arkady Frenkel" <arkadyf@hotmailxdotxcom> wrote in message
news:OXZwtt7t...@TK2MSFTNGP15.phx.gbl...

youhua.wang

unread,

Oct 21, 2004, 9:44:46 PM10/21/04

to

Hi , Jochen.

I have just run the sample. And here is my output:

RESULT 1:
Duration: 6739.888 ms

Reported time for thread 1

Kernel: 00:01.922
User: 00:01.011

Reported time for thread 2

Kernel: 00:01.922
User: 00:01.111

RESULT 2:

Duration: 5970.463 ms

Reported time for thread 1

Kernel: 00:01.642
User: 00:01.392

Reported time for thread 2

Kernel: 00:01.892
User: 00:01.001

And GetSystemTimeAdjustment() test result on my system:
SystemTimeIncreastmen: 10014 (us)
SystemTimeAdjustmen: 10014 (us)
SystemTimeAdjustAllow: 1

It seem very correct. I run in debug mode.
I still quit understand the concept of "kernel time".
Is it time that count from entry of each API call? Does time consumption of
"printf()" like CRT function belong to kernel time?

Is there document say that the resolution of GetThreadTimes is
"SystemTimeAdjustmen" ? Is that just your observation from
"UserTime/KernelTime values are always a multiple of "SystemTimeAdjustmen" "
? I doubt about this conclusion.

"Jochen Kalmbach" <nospam-Joch...@holzma.de> wrote in message
news:Xns95894D39CEEE9J...@127.0.0.1...

Jochen Kalmbach

unread,

Oct 22, 2004, 1:34:10 AM10/22/04

to

Hi youhua.wang,

> Is there document say that the resolution of GetThreadTimes is
> "SystemTimeAdjustmen" ? Is that just your observation from
> "UserTime/KernelTime values are always a multiple of
> "SystemTimeAdjustmen" " ? I doubt about this conclusion.

You can dig into the system via windbg/ida and then you will see the
following (XP-SP1):

kernel32:
GetThreadTimes calls NtQueryInformationThread with ThreadTimes. This will
then return an "THREAD_TIMES_INFORMATION" with the desired times and only
needs to be assigned to the provided FILETIME structures.

ntoskrnl:
NtQueryInformationThread then gets the Thread (TCB?) and retrives the
Kernel- and User-Times from the TCB and then it will multiply with the
"nt!KeMaximumIncrement" and return the result.

You can also dig into "GetSystemTimeAdjustment" and you will see that for
"lpTimeIncrement" the same value is retrived as which is used in the
multiply operation in "NtQueryInformationThread"...

Therefor my conclusion.

It seems that internally the user/kernel-time is always incremented with
every clock-cycle (and I guessed that it would also increment for every
thread-swtch... but this should result in an increased user/kernl-time
because these threads should never consume all his threads quantum...,
because it only singals an event and then blocks immediately again)

See:
http://undocumented.ntinternals.net/UserMode/Undocumented%20Functions/NT%
20Objects/Thread/NtQueryInformationThread.html
http://undocumented.ntinternals.net/UserMode/Undocumented%20Functions/NT%
20Objects/Thread/THREAD_INFORMATION_CLASS.html
http://undocumented.ntinternals.net/UserMode/Structures/THREAD_TIMES_INFO
RMATION.html

hector

unread,

Oct 22, 2004, 5:03:28 AM10/22/04

to

"Jochen Kalmbach" <nospam-Joch...@holzma.de> wrote in message
news:Xns95894D39CEEE9J...@127.0.0.1...

>

> So this leads me to some questions:
>
> 1. Where is the rest of the time?
>
> Thread1+2 are using only 6.326 seconds
> WHat about the rest of 2.7 seconds?
> Is this really used for Task-Switching?
>
> 2. Why is the kernel time for thread 1 and 2 so different?
> Normaly it should be the same, beacuse they execute
> (almost) the same code.
>
>
> Has someone any ideas?
> Did I miss something?
>

You were not calculating (printing) the times correctly.

Here is the correct way showing they do add up, including
comparing it against the process (main thread) times.

#include <afx.h>
#include <stdio.h>

HANDLE hWaitA;
HANDLE hWaitB;

DWORD counterA = 0;
DWORD counterB = 0;

DWORD __stdcall ThreadA(LPVOID)
{
while(counterA <= 1000000)
{
WaitForSingleObject(hWaitA, INFINITE);
SetEvent(hWaitB);
counterA++;
}
return 0;
}

DWORD WINAPI ThreadB(LPVOID)
{
while(counterB <= 1000000)
{
WaitForSingleObject(hWaitB, INFINITE);
SetEvent(hWaitA);
counterB++;
}
return 0;
}

// minus (diff) operator for FILETIME, return milliseconds
inline int operator -(const FILETIME &left, const FILETIME &right)
{
return int((*(__int64 *)&left - *(__int64 *)&right) / 10000);
}

int _tmain(int argc, _TCHAR* argv[])
{

DWORD id;
HANDLE hThread[2];
hWaitA = CreateEvent(NULL, FALSE, TRUE, NULL);
hWaitB = CreateEvent(NULL, FALSE, FALSE, NULL);

LARGE_INTEGER liStart, liEnd, liFreq;

QueryPerformanceCounter(&liStart);

hThread[0] = CreateThread(NULL, 0, ThreadA, 0, 0, &id);
hThread[1] = CreateThread(NULL, 0, ThreadB, 0, 0, &id);

printf("start...");
WaitForMultipleObjects(2, hThread, TRUE, INFINITE);
printf("done\n");

QueryPerformanceCounter(&liEnd);
QueryPerformanceFrequency(&liFreq);

double ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) /
(double)liFreq.QuadPart;
printf("Duration: %.3f ms\n\n", ms);

__int64 ktotal = 0;
__int64 utotal = 0;

FILETIME ftCreate, ftExit, ftKernel, ftUser;

for(DWORD i=0; i<2; i++)
{
GetThreadTimes(hThread[i], &ftCreate, &ftExit, &ftKernel, &ftUser);
printf("Reported time for thread %d\n", i+1);

__int64 ktime = *(__int64 *)&ftKernel;
__int64 utime = *(__int64 *)&ftUser;
printf(" kernel: %6d ms\n", int(ktime / 10000));
printf(" user : %6d ms\n", int(utime / 10000));
printf(" diff : %6d ms (Exit-Create)\n", int(ftExit-ftCreate));
ktotal += ktime;
utotal += utime;

}

printf("\n");
printf("Kernel+User Thread Totals:\n");
printf(" kernel: %6d ms\n", int(ktotal / 10000));
printf(" user : %6d ms\n", int(utotal / 10000));
printf(" total : %6d ms\n", int((ktotal+utotal) / 10000));

printf("\n");
printf("Reported time for process (main thread)\n");

GetProcessTimes(GetCurrentProcess(), &ftCreate, &ftExit, &ftKernel,
&ftUser);
__int64 utime = *(__int64 *)&ftUser;
__int64 ktime = *(__int64 *)&ftKernel;
printf(" kernel: %6d ms\n", int(ktime / 10000));
printf(" user : %6d ms\n", int(utime / 10000));
printf(" total : %6d ms\n", int((ktime+utime) / 10000));

return 0;
}

--
Hector Santos, Santronics Software, Inc.
http://www.santronics.com

Jochen Kalmbach

unread,

Oct 22, 2004, 5:59:34 AM10/22/04

to

Hi hector,

> You were not calculating (printing) the times correctly.

What is wrong with "FileTimeToSystemTime" ?
It outputs exactly the same values as your "hand-coding".

By the way: In the example there should be one line at the beginning:

SetProcessAffinityMask(GetCurrentProcess(), 1);

The goal for the whole thing was to show that the time-measurement for
"GetThreadTimes (and therfor also for "GetProcessTimes") is not very
correct... because all times only have a resolution of about 10-15 ms (see
my other post).

My main question is: How is the thread-time for user/kernel-mode
incremented?
I think, they are incremented in quantum-steps and therefor it is not very
accurate... but I couldn´t find an example to verify this...

Arkady Frenkel

unread,

Oct 24, 2004, 7:05:02 AM10/24/04

to

Yes , system receive interrupt each 10 ms in NT kernel Oses so that's the
delta for update s/w time variables.

In WinCE it's 1 ms and possible to see in Platform Builder sources how
Interrupt routine PeRPISR ( single for all interrupts in CEPC at least ) in
the case timer interrupt ocur , increment the variables used afterwards in
GetThreadTimes

Arkady

"Jochen Kalmbach" <nospam-Joch...@holzma.de> wrote in message

news:Xns958A7A3924169J...@127.0.0.1...

Jochen Kalmbach

unread,

Oct 25, 2004, 4:24:10 AM10/25/04

to

Hi Arkady,

> Yes , system receive interrupt each 10 ms in NT kernel Oses so that's
> the delta for update s/w time variables.

Now I found an example, in which we will see that the "counting" of
user/kernel-time is mainly "coincidence"...

The counting works this way:
Every time the timer-interrupt is received (on SP: 10ms, on DP: 15ms),
the values from the *current* running thread are incremented (either the
kernel or the user-times, depending on the *current* running mode).

Therefor you can produce an example, in which the thread is doing a time-
consuming calculation, but the "GetThreadTimes" returns 0 for user and 0
for kernel-times.
This is the case, if the thread *always* aborts his quantum (for example
by "Sleep"), befor the timer-interrupt is fired!

Therefor you need at least two threads (one which is doing the
"calculation" and one which is always ready to run).

Here is the output

<output>
Inc duration: 2.887 ms

Duration: 15689.486 ms

Reported time for thread 1 (Calc-Thread)
Kernel: 00:00.000
User: 00:00.000

Reported time for thread 2 (Idle-Thread)
Kernel: 00:13.015
User: 00:02.562
</output>

The calculation which is done needs 2.887 ms; this calculation is done
1000 times, so GetThreadTimes should report 2887 ms... but it reports 0
ms.

Example:

<code>
#include <windows.h>
#include <stdio.h>

DWORD loopCounter = 0;
DWORD loopCounterMax = 1000;
DWORD internalCounter = 0xFFF00000;

DWORD __stdcall CalcThread(LPVOID)
{
while(loopCounter <= loopCounterMax)
{
DWORD cnt = internalCounter;
while(cnt != 0) cnt++;
Sleep(1);
loopCounter++;
}
return 0;
}

DWORD WINAPI IdleThread(LPVOID)
{
while(loopCounter <= loopCounterMax)
{
Sleep(0); // just do something...
}
return 0;
}

int _tmain(int argc, _TCHAR* argv[])
{

// be sure we only use 1 processor!
SetProcessAffinityMask(GetCurrentProcess(), 1);

LARGE_INTEGER liStart, liEnd, liFreq;

// test, how much time the inc is using...
QueryPerformanceCounter(&liStart);
DWORD cnt = internalCounter;
while(cnt != 0) cnt++;

QueryPerformanceCounter(&liEnd);
QueryPerformanceFrequency(&liFreq);
double ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) /
(double)liFreq.QuadPart;

printf("Inc duration: %.3f ms\n\n", ms);
// test-end

DWORD id;
HANDLE hThread[2];

QueryPerformanceCounter(&liStart);

hThread[0] = CreateThread(NULL, 0, CalcThread, 0, 0, &id);
hThread[1] = CreateThread(NULL, 0, IdleThread, 0, 0, &id);

WaitForMultipleObjects(2, hThread, TRUE, INFINITE);

QueryPerformanceCounter(&liEnd);
QueryPerformanceFrequency(&liFreq);

ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) / (double)

liFreq.QuadPart;
printf("Duration: %.3f ms\n\n", ms);

FILETIME ftCreate, ftExit, ftKernel, ftUser;
for(DWORD i=0; i<2; i++)
{
GetThreadTimes(hThread[i], &ftCreate, &ftExit, &ftKernel, &ftUser);

printf("Reported time for thread %d\n", i+1);

SYSTEMTIME st;

FileTimeToSystemTime(&ftKernel, &st);
printf("Kernel: %2.2d:%2.2d.%3.3d\n", st.wMinute, st.wSecond,
st.wMilliseconds);

FileTimeToSystemTime(&ftUser, &st);
printf("User: %2.2d:%2.2d.%3.3d\n\n", st.wMinute, st.wSecond,
st.wMilliseconds);
}

return 0;
}
</code>

PS: Of course this example works *only* in debug mode, because the
optimizing compiler will remove the loop (:->)

Arkady Frenkel

unread,

Oct 25, 2004, 4:41:17 AM10/25/04

to

Thanks , Jochen !
Now try to change your_tmain setting
MMRESULT mm = timeBeginPeriod(1) on start and
timeEndPeriod(mm) on exit to see the difference
TIA
Arkady

"Jochen Kalmbach" <nospam-Joch...@holzma.de> wrote in message

news:Xns958D6A0D0EB62J...@127.0.0.1...

Jochen Kalmbach

unread,

Oct 25, 2004, 5:01:35 AM10/25/04

to

Hi Arkady,

> Now try to change your_tmain setting
> MMRESULT mm = timeBeginPeriod(1) on start and
> timeEndPeriod(mm) on exit to see the difference

Yes, I know the effect.... I just wanted to find out, who the
GetThreadTimes is internally working... and if the times are "really"
correct...

hector

unread,

Oct 25, 2004, 12:37:55 PM10/25/04

to

"Jochen Kalmbach" <nospam-Joch...@holzma.de> wrote in message

news:Xns958D70649FB2CJ...@127.0.0.1...

> Hi Arkady,
>
> > Now try to change your_tmain setting
> > MMRESULT mm = timeBeginPeriod(1) on start and
> > timeEndPeriod(mm) on exit to see the difference
>
> Yes, I know the effect.... I just wanted to find out, who the
> GetThreadTimes is internally working... and if the times are "really"
> correct...
>

Just like I said there were. :-)

hmmm, this new example is poor in trying to understand the thread and
synchronization concepts in Windows. You really learn nothing about
anything. Your first example is more relevant. Go with your first ideas
<g>.

Keep in mind that the quantum size and time slicing is based the total OP
codes. When you implement kernel event object, the OS preempts the threads
and the time slicing. i.e, it doesn't wait until 10/15 ms.

Also, there is a major difference between Sleep(1) and Sleep(0).

Sleep(0) is technically considered a "yield" Your IdleThread() isn't really
idle at all, but a high overhead context switching thread. This is
definitely one of the things you watch for and do not do. Just look at your
CPU at 99%

On the other hand, Sleep(1) will wait the quantum size (10/15ms) unless you
change the period as Arkady suggested.

Also, again by not using Kernel objects, you are telling windows "Hey, you
take care of the time slicing." So in theory, you will get closer to quantum
delta values. But that Sleep(0) changes everything too because your context
switching is high!

Finally again, you are losing conversion when you use
FileTimeToSystemTime(). Change your printout to using a 64bit dump and you
will see your CalcThread() does produce a non-zero values.

//SYSTEMTIME st;
//FileTimeToSystemTime(&ftKernel, &st);
//printf("Kernel: %2.2d:%2.2d.%3.3d\n", st.wMinute,
st.wSecond,st.wMilliseconds);

//FileTimeToSystemTime(&ftUser, &st);
//printf("User: %2.2d:%2.2d.%3.3d\n\n", st.wMinute,
st.wSecond,st.wMilliseconds);

__int64 ktime = *(__int64 *)&ftKernel;

__int64 utime = *(__int64 *)&ftUser;
printf(" kernel: %6d ms\n", int(ktime / 10000));
printf(" user : %6d ms\n", int(utime / 10000));

Anyway, I fail to see whats to be gain by your example. What are you trying
to show? How GetThreadTimes() works?

Well, I will take a SWAG (scientific wild ass guess).

Everything time the scheduler preempts a thread, it updates it thread timing
data! So if you have it waiting on kernel objects, you are not going to get
your simple 10/15 deltas.

Your first example is more education in understanding the thread timing.
Your second one, well, you can't trust it. :-) Just take a look at your
near 99% CPU with that Sleep(0) you are using.

See ya

Jochen Kalmbach

unread,

Oct 25, 2004, 1:31:48 PM10/25/04

to

Hi hector,

> hmmm, this new example is poor in trying to understand the thread and
> synchronization concepts in Windows. You really learn nothing about
> anything. Your first example is more relevant. Go with your first
> ideas <g>.

Which example do you mean?
The example which produces 0 ms time in GetThreadTimes is really good
for me...
But you are right: with just the example, you will learn nothing...
But I also looked at the ntoskrnl.exe in which context I saw the
behaviour... I just wanted an example to prove my "digging into nt".

> Also, there is a major difference between Sleep(1) and Sleep(0).

Yes, I know...

> Also, again by not using Kernel objects, you are telling windows "Hey,
> you take care of the time slicing." So in theory, you will get closer
> to quantum delta values. But that Sleep(0) changes everything too
> because your context switching is high!

In the last example, change Sleep(0); to ; and you will see the same
effect...

> Finally again, you are losing conversion when you use
> FileTimeToSystemTime(). Change your printout to using a 64bit dump and
> you will see your CalcThread() does produce a non-zero values.

No, please provide me a working example!
This migh be true if we would have times smaller than 10/15 ms... but as
you see, the smalles time is either 0 or 10/15 ms.

> Anyway, I fail to see whats to be gain by your example. What are you
> trying to show? How GetThreadTimes() works?

Yes. And that GetThreadTimes will produce wrong results in special
cases...

> Your first example is more education in understanding the thread
> timing.

I know the threads timing.... I wanted to know if GetThreadTimes
*always* produces a reliable result! And this I showed that it *does
not* produce reliable values.

Rhett Gong [MSFT]

unread,

Oct 28, 2004, 2:43:02 AM10/28/04

to

Hello Jochen,
GetThreadTimes calls ZwQueryInformationThread internally, which is similar
with this:
MOV EAX, 6Eh
LEA EDX, [ESP+4]
INT 2Eh
RET nn

Since you said GetThreadTimes does not return correct value, could you let
me know what value you are expecting from GetThreadTimes for these 2
threads?
Here is the output I get from your 2nd program: please help to clarify your
questions according this output. Thanks.
//--------------------------------------
Inc duration: 5.364 ms

Duration: 6144.289 ms

Reported time for thread 1

kernel: 0 ms
user : 4296 ms
Reported time for thread 2
kernel: 0 ms
user : 1722 ms
//---------------------------------------

Best regards,
Rhett Gong [MSFT]
Microsoft Online Partner Support

This posting is provided "AS IS" with no warranties, and confers no rights.
Please reply to newsgroups only. Thanks.

Jochen Kalmbach

unread,

Oct 28, 2004, 2:55:45 AM10/28/04

to

Hi Rhett,

> Inc duration: 5.364 ms

Please reduce the "inc-duration" to about 1-2 ms, to see the effect.

As I said:
The thread/process times are only updated if the timer-tick occurs. This is
normally on single processor systems 10ms and on dual processor systems
15ms.

If you now make an thread which *always* aborts its quantum before the 10
ms is over, then the thread times will *never* be incremented!

If you have created such an thread, then you can calculate as much as you
want (but always only for a short time), but your thread times will never
be increased.

My second examples tries to verify this.

You can also dig into the kernel to see that this is really the case...

hector

unread,

Oct 28, 2004, 10:59:19 AM10/28/04

to

"Jochen Kalmbach" <nospam-Joch...@holzma.de> wrote in message

news:Xns95905B0EB242CJ...@127.0.0.1...

> If you now make an thread which *always* aborts its quantum before the 10
> ms is over, then the thread times will *never* be incremented!

This is not true. Your example code and analysis faulty.

The following code below proves the timing is correct.

Example output #1: Using Kernel Event Synchronization

Clock interval : 15 ms
Main Thread Period: 1 ms
Thread A Period : 1 ms
Thread B Period : 1 ms
start...done

Reported time for process (main thread)

kernel: 3265 ms
user : 1171 ms
total : 4437 ms

Reported time for thread 1

kernel: 1640 ms
user : 562 ms
time : 4421 ms (Exit-Create)
ticks : 4422 ms

Reported time for thread 2

kernel: 1625 ms
user : 593 ms
time : 4421 ms (Exit-Create)
ticks : 4422 ms

Kernel+User Thread Totals:
kernel: 3265 ms
user : 1156 ms
total : 4421 ms

Duration: 4392 ms
Ticks : 4422 ms

As you can see the the summation of the kernel and user threads time MUST
equal or be very close to the main thread time. The above output uses a
period of 1 ms for each thread including the main thread.

Example output #2: No Kernel Event Sychronization

Clock interval : 15 ms
Main Thread Period: 1 ms
Thread A Period : 1 ms
Thread B Period : 1 ms
start...done

Reported time for process (main thread)

kernel: 1640 ms
user : 484 ms
total : 2125 ms

Reported time for thread 1

kernel: 843 ms
user : 234 ms
time : 2109 ms (Exit-Create)
ticks : 2110 ms

Reported time for thread 2

kernel: 796 ms
user : 234 ms
time : 2109 ms (Exit-Create)
ticks : 2110 ms

Kernel+User Thread Totals:
kernel: 1640 ms
user : 468 ms
total : 2109 ms

Duration: 2082 ms
Ticks : 2110 ms

As you can see the system runs faster and the TIMING if all threads are
equal or nearly the same.

As I said in an earlier post, your second example is wrong. Your first
example is closer to analysis the question "Does GetTThreadTime() return
good values!?"

I say it does and this code below proves it.

// File: D:\wc5beta\testthreadtime.cpp

#include <afx.h>
#include <conio.h>
#include <stdio.h>
#include <Mmsystem.h>
#pragma comment(lib,"Winmm.lib")

#define WaitObjectSleepTime INFINITE // play with 0 to X value
#define MainThreadResolution 1 // set main thread quantum to 1 ms
#define ThreadResolutionA 1 // set ThreadA quantum to 1 ms
#define ThreadResolutionB 1 // set ThreadB quantum to 1 ms
#define USE_THREAD_SYNC 1 // set 0 for no thread sync

static long MAXCOUNT = 1000000;
static long counterA = MAXCOUNT;
static long counterB = MAXCOUNT;

DWORD dwThreadTicks[2] = {0,0};
HANDLE hWaitA;
HANDLE hWaitB;

DWORD __stdcall ThreadA(LPVOID)
{
if (ThreadResolutionA) timeBeginPeriod(ThreadResolutionA);
dwThreadTicks[0] = GetTickCount();
while(counterA > 0)
{
#if USE_THREAD_SYNC > 0
WaitForSingleObject(hWaitA, INFINITE);
SetEvent(hWaitB);
#else
Sleep(0);
#endif

counterA--;
}
dwThreadTicks[0] = GetTickCount() - dwThreadTicks[0];
if (ThreadResolutionA) timeEndPeriod(ThreadResolutionA);
return 0;
}

DWORD WINAPI ThreadB(LPVOID)
{
if (ThreadResolutionB) timeBeginPeriod(ThreadResolutionB);
dwThreadTicks[1] = GetTickCount();

while(counterB > 0)
{
#if USE_THREAD_SYNC > 0
WaitForSingleObject(hWaitB, INFINITE);
SetEvent(hWaitA);
#else
Sleep(0);
#endif
counterB--;
}
dwThreadTicks[1] = GetTickCount() - dwThreadTicks[1];
if (ThreadResolutionB) timeEndPeriod(ThreadResolutionB);
return 0;
}

// minus (diff) operator, return milliseconds

inline int operator -(const FILETIME &left, const FILETIME &right)
{
return int((*(__int64 *)&left - *(__int64 *)&right) / 10000);
}

int _tmain(int argc, _TCHAR* argv[])
{
DWORD adj,clockInterval;
BOOL adjDisabled;
GetSystemTimeAdjustment( &adj,&clockInterval,&adjDisabled);
printf( "Clock interval : %d ms\n",clockInterval / 10000 );
printf( "Main Thread Period: %d ms\n",MainThreadResolution);
printf( "Thread A Period : %d ms\n",ThreadResolutionA);
printf( "Thread B Period : %d ms\n",ThreadResolutionB);

DWORD dwStart = GetTickCount();
LARGE_INTEGER liStart, liEnd, liFreq;

if (MainThreadResolution) {
timeBeginPeriod(MainThreadResolution);
}

QueryPerformanceCounter(&liStart);

DWORD id;
HANDLE hThread[2];

hWaitA = CreateEvent(NULL, FALSE, TRUE, NULL);
hWaitB = CreateEvent(NULL, FALSE, FALSE, NULL);

hThread[0] = CreateThread(NULL, 0, ThreadA, 0, 0, &id);
hThread[1] = CreateThread(NULL, 0, ThreadB, 0, 0, &id);

cprintf("start...");

#if 0
WaitForMultipleObjects(2, hThread, TRUE, INFINITE);
#else
while (WaitForMultipleObjects(2, hThread, TRUE, WaitObjectSleepTime) ==
WAIT_TIMEOUT) {
if (kbhit() && getch() == 27) {
counterA = 0;
counterB = 0;
Sleep(0); // Switch context to allow OS to stop threads
break;
}
Sleep(0);
}
#endif
cprintf("done\n");

FILETIME ftCreate, ftExit, ftKernel, ftUser;

printf("\n");

printf("Reported time for process (main thread)\n");

GetProcessTimes(GetCurrentProcess(), &ftCreate, &ftExit, &ftKernel,
&ftUser);

__int64 utime = *(__int64 *)&ftUser;
__int64 ktime = *(__int64 *)&ftKernel;
printf(" kernel: %6d ms\n", int(ktime / 10000));
printf(" user : %6d ms\n", int(utime / 10000));

printf(" total : %6d ms\n", int((ktime+utime) / 10000));

__int64 ktotal = 0;
__int64 utotal = 0;

for(DWORD i=0; i<2; i++)

{
GetThreadTimes(hThread[i], &ftCreate, &ftExit, &ftKernel, &ftUser);

printf("\n");

printf("Reported time for thread %d\n", i+1);

__int64 ktime = *(__int64 *)&ftKernel;
__int64 utime = *(__int64 *)&ftUser;
printf(" kernel: %6d ms\n", int(ktime / 10000));
printf(" user : %6d ms\n", int(utime / 10000));

printf(" time : %6d ms (Exit-Create)\n", int(ftExit-ftCreate));
printf(" ticks : %6d ms\n", dwThreadTicks[i]);

ktotal += ktime;
utotal += utime;

}

printf("\n");
printf("Kernel+User Thread Totals:\n");
printf(" kernel: %6d ms\n", int(ktotal / 10000));
printf(" user : %6d ms\n", int(utotal / 10000));
printf(" total : %6d ms\n", int((ktotal+utotal) / 10000));

if (MainThreadResolution) {
timeEndPeriod(MainThreadResolution);
}

QueryPerformanceCounter(&liEnd);
QueryPerformanceFrequency(&liFreq);

double ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) /
(double)liFreq.QuadPart;

DWORD dwTicks = GetTickCount() - dwStart;
printf("\n");
printf("Duration: %6.0f ms\n", ms);
printf("Ticks : %6d ms\n", dwTicks);

hector

unread,

Oct 28, 2004, 11:11:25 AM10/28/04

to

Jochen, this is a follow up and clarification:

"hector" <nos...@nospam.com> wrote in message
news:e$JSl9PvE...@TK2MSFTNGP11.phx.gbl...

> > If you now make an thread which *always* aborts its quantum before the
10
> > ms is over, then the thread times will *never* be incremented!
>

> This is not true. Your example code and analysis is faulty.
>
> [snip]

>
> As you can see the the summation of the kernel and user threads time MUST
> equal or be very close to the main thread time. The above output uses a
> period of 1 ms for each thread including the main thread.

In your hypothesis is that the GetThreadTimes() are incorrect, then to prove
this you have to "Add up" all secondard threads and try to match it with the
main thread.

If the SUMMATION of the Thread Times (Kernel+User) is equal to the Main
Thread Time, then your hypothesis is wrong.

If the SUMMATION is "significantly" less to the Main Thread Time, then your
hypothesis is correct.

I don't see the summations being "significantly" less than the main thread
time.

I highlite "significantly" to factor in the differences in the main thread
time calculation.

Run my code over and over and you will see the hypothesis is incorrect.

Jochen Kalmbach

unread,

Oct 28, 2004, 12:08:05 PM10/28/04

to

Hi hector,

> Run my code over and over and you will see the hypothesis is
> incorrect.

I can just say: you are wrong.

To prove this I made an example... but you have to "tune" it so it can
produce the desired result on your computer!

The assumtion is (and it is prooved via digging into the kernel):
Thread/Kernel-Times are only updated during the tick-interrupt.

The trick is very simple:
Create two threads:
- One thread is doing a very small calculation (if tick-count is 10 ms,
do a calculation of 1 ms).
- The second thread is just for filling the gap. This thread will run
all the time.

Now if you look at the time-slices, you would see the following:

T1: Thread1
T2: Thread2

T1(2ms) - T2(8ms);Count - T1(2ms) - T2(8ms);Count - T1(2ms) - T2
(8ms);Count

And this is exactly the times I get if I run my app on my computer (for
example for 130 seconds, the times from thread1 are either equal 0 or
about 30 ms... (which is due to teh fact that more than 2 threads are
running).

Please try to run my example again! And change "internalCounter" to an
value which produces and delay for 1-2 ms!!!

Then you will see an output like here:

<output>
Inc duration: 2.825 ms

Duration: 46884.327 ms

Reported time for thread 1

Kernel: 00:00.000
User: 00:00.000

Reported time for thread 2
Kernel: 00:33.546
User: 00:13.296

Press any key to continue
</output>

You see that thread1 has *no time*!!!! All the time was calculated to
thread2 (as I promised)!!!

If you do not belive this, then you should explain how GetThreadTimes is
working and how the counting is working!!!

Jochen Kalmbach

unread,

Oct 28, 2004, 1:26:17 PM10/28/04

to

Hi hector,

please let my example run at your PC, and adjust "internalCounter", so
it will display about 1-2 ms (duration).
And be sure that no other app is doing some work...

Then please provide the output....

<code>
#include <windows.h>
#include <stdio.h>

DWORD loopCounter = 0;
DWORD loopCounterMax = 1000;

DWORD internalCounter = 0xFF000000;

DWORD __stdcall CalcThread(LPVOID)
{
while(loopCounter <= loopCounterMax)
{
DWORD cnt = internalCounter;
while(cnt != 0) cnt++;
Sleep(1);
loopCounter++;
}
return 0;
}

DWORD WINAPI IdleThread(LPVOID)
{
while(loopCounter <= loopCounterMax)
{
Sleep(0); // just do something...
}
return 0;
}

int _tmain(int argc, _TCHAR* argv[])
{

// be sure we only use 1 processor!
SetProcessAffinityMask(GetCurrentProcess(), 1);

LARGE_INTEGER liStart, liEnd, liFreq;

// test, how much time the inc is using...
QueryPerformanceCounter(&liStart);
DWORD cnt = internalCounter;
while(cnt != 0) cnt++;

QueryPerformanceCounter(&liEnd);
QueryPerformanceFrequency(&liFreq);
double ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) /
(double)liFreq.QuadPart;

printf("Inc duration: %.3f ms\n\n", ms);
// test-end

DWORD id;
HANDLE hThread[2];

QueryPerformanceCounter(&liStart);

hThread[0] = CreateThread(NULL, 0, CalcThread, 0, 0, &id);
hThread[1] = CreateThread(NULL, 0, IdleThread, 0, 0, &id);

WaitForMultipleObjects(2, hThread, TRUE, INFINITE);

QueryPerformanceCounter(&liEnd);
QueryPerformanceFrequency(&liFreq);

ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) / (double)
liFreq.QuadPart;
printf("Duration: %.3f ms\n\n", ms);

FILETIME ftCreate, ftExit, ftKernel, ftUser;

for(DWORD i=0; i<2; i++)
{
GetThreadTimes(hThread[i], &ftCreate, &ftExit, &ftKernel, &ftUser);

printf("Reported time for thread %d\n", i+1);

SYSTEMTIME st;

FileTimeToSystemTime(&ftKernel, &st);
printf("Kernel: %2.2d:%2.2d.%3.3d\n", st.wMinute, st.wSecond,
st.wMilliseconds);

FileTimeToSystemTime(&ftUser, &st);
printf("User: %2.2d:%2.2d.%3.3d\n\n", st.wMinute, st.wSecond,
st.wMilliseconds);
}

return 0;
}
</code>

--

Rhett Gong [MSFT]

unread,

Oct 28, 2004, 11:13:48 PM10/28/04

to

Hi Jochen,
The behavior you are observing is correct. Windows will increase the CPU
usage for a thread *ONLY* when the scheduler interrupts a thread.

Fundamentally, Windows has no concept of time *other* than an interrupt
(the Time/Scheduling interrupt) that fires every 10 milliseconds.
- Windows uses the scheduler interrupt to update its own internal
clock.
- The CPU does not have a clock to tell "how much time since the
last interrupt"
* Some new CPUs *do* have an internal clock, but Windows can not
rely on new CPUs.
- QueryPerformanceCounter() relies on hardware inside the ACPI
chipset. This functionality only exists on ACPI hardware, so Windows can
not rely on it.

Basically, hardware limitations force this architecture. When a thread
calls Sleep(), Windows can not tell if it used 1 nanosecond or 9.99
milliseconds of its quantum.

Arkady Frenkel

unread,

Oct 30, 2004, 4:20:17 PM10/30/04

to

Hi , some comment in advance !
"Rhett Gong [MSFT]" <v-ra...@online.microsoft.com> wrote in message
news:#YcpSVWv...@cpmsftngxa10.phx.gbl...

> Hi Jochen,
> The behavior you are observing is correct. Windows will increase the CPU
> usage for a thread *ONLY* when the scheduler interrupts a thread.
>
> Fundamentally, Windows has no concept of time *other* than an interrupt
> (the Time/Scheduling interrupt) that fires every 10 milliseconds.
> - Windows uses the scheduler interrupt to update its own internal
> clock.
> - The CPU does not have a clock to tell "how much time since the
> last interrupt"
> * Some new CPUs *do* have an internal clock, but Windows can not
> rely on new CPUs.

That really strange because timer ( all new processors ( I mean by new -
Pentium II and newer have two timers ) so the number of channels doubled)
have three channels which can be used.
OTOH RT CMOS clock can be used for additional correctness
Arkady

Rhett Gong [MSFT]

unread,

Nov 2, 2004, 5:14:59 AM11/2/04

to

Windows can't rely on these new features of PII or above CPUs.
I am not sure what do you mean by "timer have three channels which can be
used." could you clarify it?

Arkady Frenkel

unread,

Nov 2, 2004, 5:59:06 AM11/2/04

to

"Rhett Gong [MSFT]" <v-ra...@online.microsoft.com> wrote in message

news:CRidVTMw...@cpmsftngxa10.phx.gbl...

> Windows can't rely on these new features of PII or above CPUs.

Why , sysenter( x86 ) /syscall ( AMD ) opcodes used in XP now instead of int
2e and sysenter appeared in Pentium II ?
System check if such sysenter exist in CPU and use it or
int 2e if no such . The same possible with timers IMHO

Arkady

Arkady Frenkel

unread,

Nov 2, 2004, 9:10:50 AM11/2/04

to

Oops ,forgot to mention about Programmable interval timer
on http://www.intel.com/design/archives/periphrl/docs/23124406.pdf

That have 3 outs ( 40h , 41h , 42h ) each on can be prorgammed
independantly.

Arkady

"Arkady Frenkel" <ark...@hotmailxdotx.com> wrote in message
news:cm7p9h$1t9$1...@home.itg.ti.com...

Gary Chanson

unread,

Nov 2, 2004, 10:58:58 AM11/2/04

to

"Arkady Frenkel" <ark...@hotmailxdotx.com> wrote in message

news:cm84h1$8pm$1...@home.itg.ti.com...

> Oops ,forgot to mention about Programmable interval timer
> on http://www.intel.com/design/archives/periphrl/docs/23124406.pdf
>
> That have 3 outs ( 40h , 41h , 42h ) each on can be prorgammed
> independantly.

PCs have always had those timers.

--
-GJC [MS Windows SDK MVP]
-Software Consultant (Embedded systems and Real Time Controls)
- http://www.mvps.org/ArcaneIncantations/consulting.htm
-gcha...@mvps.org

Arkady Frenkel

unread,

Nov 3, 2004, 2:14:13 AM11/3/04

to

That what I mean , that 3 channels always was there in 8253/8254/82c54
timers.
So even on old ( before 10 years computers , which have 1 timer and
not 2 ) possible to use periodically free channel ( like channel 2
which dedicated to loud-speaker to check time correctness )
Arkady

"Gary Chanson" <gcha...@No.Spam.TheWorld.net> wrote in message news:<OF5DhVPw...@TK2MSFTNGP12.phx.gbl>...

Gary Chanson

unread,

Nov 3, 2004, 2:47:32 AM11/3/04

to

"Arkady Frenkel" <ark...@hotmail.com> wrote in message
news:d0708a1.04110...@posting.google.com...

> That what I mean , that 3 channels always was there in 8253/8254/82c54
> timers.
> So even on old ( before 10 years computers , which have 1 timer and
> not 2 ) possible to use periodically free channel ( like channel 2
> which dedicated to loud-speaker to check time correctness )

I don't know about the original IBM PC, but as far back as the IBM AT
there were three timers. I have 15 year old documentation which shows three

Rhett Gong [MSFT]

unread,

Nov 3, 2004, 4:07:18 AM11/3/04

to

So far as I can tell, it is by design.
However, we are looking at continual improvement, and it's this kind of
feedback that let us know what things you're trying to do. Thanks for the
suggestion, I will submit your request to product group, or you can send
email to msw...@microsoft.com directly.

Arkady Frenkel

unread,

Nov 3, 2004, 8:03:29 AM11/3/04

to

Tnx , Rhett !
You can submit the proposition to use two timers ( channels ) because even
in non-RT system like XP that can serve different purposes for improving
timing
Arkady

"Rhett Gong [MSFT]" <v-ra...@online.microsoft.com> wrote in message

news:8cHhNSYw...@cpmsftngxa10.phx.gbl...

Rhett Gong [MSFT]

unread,

Nov 3, 2004, 10:17:48 PM11/3/04

to

OK, I will submit a propositionas you suggested. However, I can't
guarantee it will be used at this time. So please keep an eye out for it
in the future.

Thanks.