Real world "range" of Process/Thread IDs

gleaves@hotmail.com > <hugh<underbar>

Dec 18, 2009, 4:50:01 AM
I'm seeking info on the actual real-world numeric range of Process and
Thread IDs on Windows (including 64-bit W7, Server 2008 etc).

These are all defined as a 32-bit integer, and it is documented that both
processes and threads obtain the ID from a common pool, so their range will
be the same (i.e. there is no real difference, numerically, between them).

It seems to me that no running version of Windows ever generates a process
or thread ID that requires more than two bytes to store (try looking at Task
Manager, you never see anything with more than 4 decimal digits and certainly
never even close to 65,536).

Does anyone know for sure?
Did MS choose a 32-bit integer just for convenience, while other design factors
prevent it from ever needing that many bytes?

Since these IDs are reused, the system is also self-limiting in a way, and
can rarely (if ever?) get into a state where it would need to use more
than 2 bytes to hold the actual ID.

Any info much appreciated.

Hugo

Doron Holan [MSFT]

Dec 18, 2009, 12:49:14 PM
why does it matter? just store a PID or TID in a 4 byte field and you never
have to worry about it. I have seen PIDs get into the hundreds of thousands.

d

--

This posting is provided "AS IS" with no warranties, and confers no rights.

Remy Lebeau

Dec 18, 2009, 1:51:38 PM

"Hugo gle...@hotmail.com>" <hugh<underbar> wrote in message news:086BA9B6-AC45-4735...@microsoft.com...

> It seems to me that no running version of Windows, ever generates a
> process or thread ID that requires more than two bytes to store (try
> looking at Task Manager, you never see anything with more than 4
> decimal digits and certainly never even close to 65,536).

Have you ever seen a system with more than 65535 threads running simultaneously? Process and Thread IDs are reused after being released. Unless you can really hammer on the system, running thousands of processes with hundreds/thousands of threads each, you are not likely to see the IDs grow that high in most situations.

--
Remy Lebeau (TeamB)

Don Burn

Dec 18, 2009, 1:58:58 PM
But as Doron already pointed out, PID can exceed 64K so why in the world
would you want to construct a broken piece of code by assuming anything less
than the maximum?


--
Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr



gleaves@hotmail.com > <hugh<underbar>

Dec 18, 2009, 2:46:01 PM
Hi

Well there is a good reason, we want to decompose a thread-id into two
bytes, and use those to index into a 2D array of bytes, array size will just
be 64K.

If the ID really can ever exceed FFFFFFFF then our algorithm will not be
possible and it's important to us.

Now just because the data type used to represent a Process or Thread ID is
4 bytes, that does not mean it really does need 4 bytes or that cases can
arise that will require 4 bytes.

As Remy says, these IDs are reused so I cannot see any way that the
system would find it necessary to create an ID > FFFFFFFF.

Microsoft use a DWORD for a lot of stuff, it doesn't mean that every bit is
absolutely needed or will ever be used in practice.

I'm not arguing, but I am seeking hard facts, perhaps someone who knows the
system internals (Sysinternals ?) may be able to provide some concrete answer.

Thanks

Don Burn

Dec 18, 2009, 3:09:15 PM
Hugo,

Doron is part of the kernel group at Microsoft so he should know. I've
read the sources for Windows since NT 3.5 days, and I never saw a
restriction that would limit things to less than 64K, so you may get
away with it for a long time, but sooner or later you are likely to crash.


--
Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

Pavel A.

Dec 18, 2009, 3:09:20 PM
"Doron Holan [MSFT]" <doron...@online.microsoft.com> wrote in message
news:uaCfrpAg...@TK2MSFTNGP02.phx.gbl...

> why does it matter? just store a PID or TID in a 4 byte field and you
> never have to worry about it. I have seen PIDs get into the hundreds of
> thousands
>
> d

Doron, have you seen this on x86 or 64bit OS?
The "client ID" is a HANDLE (32 bits on x86) composed of process id and
thread id.
So, can either of them be > 16 bits on x86?
--pa

Remy Lebeau

Dec 18, 2009, 4:08:17 PM

"Don Burn" <bu...@stopspam.windrvr.com> wrote in message news:eETsqQBg...@TK2MSFTNGP06.phx.gbl...

> But as Doron already pointed out, PID can exceed 64K so why
> in the world would you want to construct a broken piece of code
> by assuming anything less than the maximum?

I never said to do anything like that.

--
Remy Lebeau (TeamB)

Remy Lebeau

Dec 18, 2009, 4:11:59 PM

"Hugo gle...@hotmail.com>" <hugh<underbar> wrote in message news:99342AF4-460D-4833...@microsoft.com...

> Well there is a good reason, we want to decompose a thread-id into
> two bytes, and use those to index into a 2D array of bytes, array size
> will just be 64K.

Why would you ever use a re-usable ID as an array index?

Since a PID can be higher than 65535, as Doron mentioned, you would need to decompose the value into 2 16-bit Word values instead of 2 8-bit Byte values, thus requiring a 4GB array. Sounds like you have a flawed code design.
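
To put numbers on the two table sizes mentioned here, a minimal C sketch of the arithmetic follows. The helper names (id_low_word, id_high_word, table_cells) are invented for illustration and are not Windows APIs; the two splitting helpers simply do what the LOWORD and HIWORD macros do for a 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of a 32-bit ID (the role LOWORD plays in the Windows headers). */
uint16_t id_low_word(uint32_t id)
{
    return (uint16_t)(id & 0xFFFFu);
}

/* High 16 bits of a 32-bit ID (the role HIWORD plays in the Windows headers). */
uint16_t id_high_word(uint32_t id)
{
    return (uint16_t)(id >> 16);
}

/* Number of one-byte cells in a square 2D table whose two indices
   are each key_bits wide. Hypothetical helper, for illustration only. */
uint64_t table_cells(unsigned key_bits)
{
    assert(key_bits <= 31);              /* keeps side * side within 64 bits */
    uint64_t side = (uint64_t)1 << key_bits;
    return side * side;
}
```

With two 8-bit indices, table_cells(8) gives 65,536 cells (the 64K array); with two 16-bit indices, table_cells(16) gives 4,294,967,296 one-byte cells, which is the 4GB figure.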

> If the ID really can ever exceed FFFFFFFF then our algorithm will
> not be possible and it's important us.

What are you trying to accomplish exactly? Perhaps a PID lookup table would be more suitable to your needs.

--
Remy Lebeau (TeamB)

m

Dec 18, 2009, 8:42:23 PM
In my experience, only the high bit of the DWORD carries any significance
and there are PIDs and TIDs that exceed 2^16

gleaves@hotmail.com > <hugh<underbar>

Dec 21, 2009, 6:27:01 AM
OK well here are some little known facts that I have managed to discover.

First, all PIDs & TIDs have last two bits set to zero, thus they can be
divided by four with no loss of data.

Thus, if we now still limit ourselves to 16 bits, we are able to deal with a
PID or TID as high as 262,140.

Second, it seems that even under deliberate stressing of the PID/TID table
in the OS, values can hardly, if ever, reach 300,000 and this is test code
designed purely and simply to force very high TID/PID values.

So I am satisfied that I can safely assume any pid or tid under any
circumstances on any current build of the OS, will never have a PID/TID
greater than 262,140 and can therefore always be represented in 16 bits.

Finally, it is easy to test for this being exceeded in very very rare cases
and just report an error, saying that some system limit has been reached etc
etc, restarting the app or something is all it would take to get back to a
workable PID/TID.
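
The scheme described in this post could be sketched roughly as follows in C. It is only an illustration under the post's own assumptions: that the low two bits of every ID are zero and that no ID exceeds 262,140. Neither is a documented guarantee. The names pack_id16, unpack_id16 and MAX_PACKABLE_ID are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest ID that still fits after dropping two bits: 0xFFFF * 4. */
#define MAX_PACKABLE_ID 262140u

/* Try to squeeze a 32-bit ID into 16 bits by dividing by four.
   Returns false (and leaves *out untouched) when the assumption does
   not hold: low bits set, or the value is too large. The caller is
   expected to report that case as an error. */
bool pack_id16(uint32_t id, uint16_t *out)
{
    assert(out != NULL);
    if ((id & 3u) != 0 || id > MAX_PACKABLE_ID)
        return false;
    *out = (uint16_t)(id >> 2);
    return true;
}

/* Recover the original ID from its packed form. */
uint32_t unpack_id16(uint16_t packed)
{
    return (uint32_t)packed << 2;
}
```

An ID of 262,140 packs to 0xFFFF and unpacks to the same value; 262,144 is rejected, as is any ID whose low two bits are not zero.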

This is called thinking out of the box incidentally, for those of an overly
conservative mindset.

Hugh

Leo Davidson

Dec 21, 2009, 7:12:28 AM
On Dec 21, 11:27 am, Hugo glea...@hotmail.com> <hugh<underbar> wrote:
> OK well here are some little known facts that I have managed to discover.

Unless those things are guaranteed by API contract (which AFAIK they
are not for process and thread IDs), you could be building something
which is broken by a new OS version (or even a hotfix release) that
changes the way the handles are allocated and reused.

It seems like you've decided on a solution too early in the process
and are now trying to convince yourself that it'll probably work. (And
it probably will work, almost all the time, on current versions of
Windows, but why build that when you could build something that will
work all the time with all future versions of Windows?)

As for "thinking outside the box," a big part of that is not getting
blinkered by a particular implementation that was thought of too early
in the design process.

Jeroen Mostert

Dec 21, 2009, 10:55:25 AM
On 2009-12-21 12:27, Hugo gle...@hotmail.com> <hugh wrote:
> OK well here are some little known facts that I have managed to discover.
>
> First, all PIDs & TIDs have last two bits set to zero, thus they can be
> divided by four with no loss of data.
>
This is a coincidence. See
http://blogs.msdn.com/oldnewthing/archive/2008/02/28/7925962.aspx. It is
true that kernel handles have the bottom two bits set to zero, for reasons
of extensibility
(http://blogs.msdn.com/oldnewthing/archive/2005/01/21/358109.aspx). This is
intentional. That this is true for process and thread identifiers is not
intentional and subject to change.

> So I am satisfied that I can safely assume any pid or tid under any
> circumstances on any current build of the OS, will never have a PID/TID
> greater than 262,140 and can therefore always be represented in 16 bits.
>

I look forward to you actually fixing the problem -- if you think changing
things so you store the full 32-bit value is hard *now*, wait until you have
to do it under stress, while your application is failing.

> Finally, it is easy to test for this being exceeded in very very rare cases
> and just report an error, saying that some system limit has been reached etc
> etc, restarting the app or something is all it would take to get back to a
> workable PID/TID.
>

I'm skeptical that restarting the app would somehow give it a much lower
PID/TID. If you ever encounter that situation, it's far more likely a
restart of the entire server is necessary for that. Try explaining that to
your customers under the guise of a system limit.

> This is called thinking out of the box incidentally, for those of an overly
> conservative mindset.
>

Your box is 16 bits large. If you were thinking out of it, you'd find a way
to shuffle 32 bits from here to there somehow. Or obviate the need for
shuffling it in the first place. Since we have no idea what your actual
problem is, it's a little disingenuous to imply that criticizing a flawed
design is "overly conservative". The flaw may be a reasonable trade-off, but
without further input it's more reasonable to assume that it isn't and
you're making a mistake.

--
J.

Keith Moore

Dec 21, 2009, 2:37:15 PM
On 12/21/2009 3:27 AM, Hugo gle...@hotmail.com> <hugh wrote:
> OK well here are some little known facts that I have managed to discover.
>
> First, all PIDs & TIDs have last two bits set to zero, thus they can be
> divided by four with no loss of data.
>

FYI: It wasn't always this way for NT-based kernels. IIRC the behavior
you describe was introduced in NT 4.0. Yes, that's ancient history, and
you probably don't need to support NT 4.0 (almost no one does).

My point: changes in the behavior of PID and TID allocation are not
merely theoretical; they have already happened once.

>
> Thus, if we now still limit ourselves to 16 bits, we are able to deal with a
> PID or TID as high as 262,140.
>
> Second, it seems that even under deliberate stressing of the PID/TID table
> in the OS, values can hardly, if ever, reach 300,000 and this is test code
> designed purely and simply to force very high TID/PID values.
>
> So I am satisfied that I can safely assume any pid or tid under any
> circumstances on any current build of the OS, will never have a PID/TID
> greater than 262,140 and can therefore always be represented in 16 bits.
>

If you expect your software will only run on today's typical laptop &
desktop machines, then you're probably right.

If you expect your software may run on future high-end servers with
100's of GB RAM and uptimes measured in months or years, then maybe not.

>
> Finally, it is easy to test for this being exceeded in very very rare cases
> and just report an error, saying that some system limit has been reached etc
> etc, restarting the app or something is all it would take to get back to a
> workable PID/TID.
>

To be clear, it may require a *reboot* to get the PID/TIDs back into
your assumed range.

If/when your customers run into this problem, it will be a fun problem
for your support team to reproduce.

>
> This is called thinking out of the box incidentally, for those of an overly
> conservative mindset.
>

If ever there was an area demanding conservative thinking, it's
kernel-mode programming.

Personally, I don't like using words such as "assume" and "probably"
when designing kernel-mode software, but if that works for you, then go
for it.

KM

Doron Holan [MSFT]

Dec 21, 2009, 5:01:56 PM
it is not out of the box thinking, it is making design decisions based on
internal/undocumented implementations and not the public contract which is
documented. using the ID as a flat index into an array has quite a few
problems, primarily being the assumed size of the array. what about the
sparseness of the array? you are assuming the IDs are going to be closely
clustered together, but there is no guarantee of that either. you could
create a sparse array (let's say create an array chunk of 4K which notes its
min/max and has a pointer to the next possible chunk). this way the upper bound
of the ID does not matter. or forgo the ID as index concept altogether and
use another data structure like a hash tree or self-balancing tree. that is
out of the box thinking.
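
The chunked sparse array suggested in this post might look roughly like the C sketch below. It is one possible reading, not anyone's production code: the names (Chunk, find_chunk, sparse_set, sparse_get, sparse_free) are invented for illustration, each chunk covers 4096 consecutive IDs with one byte per ID, and each chunk records only its minimum ID because the maximum follows from the fixed span.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SPAN 4096u            /* number of IDs covered by one chunk */

typedef struct Chunk {
    uint32_t min_id;                /* first ID covered, a multiple of CHUNK_SPAN */
    struct Chunk *next;             /* next chunk in the list, or NULL */
    uint8_t cells[CHUNK_SPAN];      /* one byte of payload per ID */
} Chunk;

/* Find the chunk covering id; allocate and link it when create is nonzero. */
Chunk *find_chunk(Chunk **head, uint32_t id, int create)
{
    assert(head != NULL);
    uint32_t min_id = id - (id % CHUNK_SPAN);
    for (Chunk *c = *head; c != NULL; c = c->next)
        if (c->min_id == min_id)
            return c;
    if (!create)
        return NULL;
    Chunk *c = calloc(1, sizeof *c);    /* cells start out as zero */
    if (c == NULL)
        return NULL;
    c->min_id = min_id;
    c->next = *head;
    *head = c;
    return c;
}

/* Store value for id. Returns 0 on success, -1 if allocation failed. */
int sparse_set(Chunk **head, uint32_t id, uint8_t value)
{
    Chunk *c = find_chunk(head, id, 1);
    if (c == NULL)
        return -1;
    c->cells[id - c->min_id] = value;
    return 0;
}

/* Read the value for id; IDs that were never set read back as 0. */
uint8_t sparse_get(Chunk **head, uint32_t id)
{
    Chunk *c = find_chunk(head, id, 0);
    return c != NULL ? c->cells[id - c->min_id] : 0;
}

/* Release every chunk and reset the list to empty. */
void sparse_free(Chunk **head)
{
    assert(head != NULL);
    while (*head != NULL) {
        Chunk *next = (*head)->next;
        free(*head);
        *head = next;
    }
}
```

Because chunks are only allocated for ranges that are actually touched, an ID anywhere in the 32-bit range costs one 4K chunk rather than forcing a larger table. The linear walk over chunks is the simplification here; a real design would likely index the chunks with something faster.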

d

--

This posting is provided "AS IS" with no warranties, and confers no rights.

Sergei Zhirikov

Dec 22, 2009, 5:55:39 PM
> Hugo gle...@hotmail.com> <hugh wrote:
> Second, it seems that even under deliberate stressing of the PID/TID table
> in the OS, values can hardly, if ever, reach 300,000 and this is test code
> designed purely and simply to force very high TID/PID values.

By the way, will the PID/TID be reused if the process/thread has exited,
but there are still open handles to it held by someone else?

If not, then a simple handle leak could cause high PID/TID values, right?

gleaves@hotmail.com > <hugh<underbar>

Dec 23, 2009, 3:06:02 AM
Well I appreciate the feedback on this question, and I do understand every
issue and concern that has been voiced. I was aware of these factors from the
outset, which is why I tried to get hard facts from experts here.

However, I still disagree with those who say (in effect) "Don't do this".
Obviously I can't disclose the underlying motivation but suffice to say it
has reduced a small CPU intensive bottleneck in an area that is otherwise
impossible to address; we are seeing a 50% drop in CPU on an operation that
is small but performed frequently in our case.

Now, many of the concerns are minor. After all, one of the reasons we have
beta evaluation in our industry is to ascertain the true risk as opposed to
an estimated risk, and in this case the risk is looking very very small.

If our code is beta tested and never has an issue in this area AND if we do
trap and log a fault when we see a PID/TID > 16bits AND if we specify the
code is designed to run on Version X of the OS etc etc, then many (not all)
of the concerns voiced here simply vanish.

Don't get me wrong, I accept that this is an undocumented "feature" and that
it may well change (hence we only run on Version X etc) but it boils down to
simply a risk, one of a great many in any project be they technical,
commercial, human etc.

We all use undocumented stuff all the time, there are many Win32 API calls
that have gaps or grey areas in their documentation, yet people run tests,
make checks, push code and then interpret that documentation.

Now I accept that my reliance on only 16-bits being used is more than an
interpretation, it is a decision pure and simple, but many functions do have
these odd areas and so we just try and see.

One needs to trade off the benefits against the costs and risks. Surely if
we find it impossible to make the code fail on test after test after test
then the risk is small? If that is so, then it slides down the list of risks.
Believe me, businesses and developers face risks all the time, and a large
risk is more worthy of effort than a tiny one.

Finally I have worked in this business for almost 40 years, coded in raw
machine code on 6502's, worked on high availability Stratus fault tolerant
machines running VOS (a derivative of Multics), developed compilers,
applications and so on, often carrying personal responsibility for significant
commercial risks and managing teams of sharp technical staff, so I am well
versed in this topic of risk.

I really don't think it is possible to develop non-trivial working software
systems when one relies solely on written documentation. It just doesn't
happen; we all have to "let me just code something to see if it is aligned
like that" or "well the docs don't say, but they do imply, let me run some
tests" etc.

How many libraries and products use NtXXX calls in user mode code? Lots, yet
these are not documented and "could change anytime" etc.

Hugo

gleaves@hotmail.com > <hugh<underbar>

Dec 23, 2009, 3:07:04 AM
Yes it could, which is discussed at length here:

http://winprogger.com/?p=29

Hugo

Richard Russell

Dec 23, 2009, 4:31:59 AM
On Dec 23, 8:06 am, Hugo glea...@hotmail.com> <hugh<underbar> wrote:
> We all use undocumented stuff all the time, there are many Win32 API calls
> that have gaps or grey areas in their documentation, yet people run tests,
> make checks, push code and then interpret that documentation.

Whilst I don't agree with all your points, I must say that this
comment is spot on. If Microsoft is concerned about people "making
design decisions based on internal/undocumented implementations and
not the public contract" they need to put much more effort into
correcting the errors and omissions in the Win32 API documentation.

Richard.
http://www.rtrussell.co.uk/
To reply by email change 'news' to my forename.

Pavel A.

Dec 23, 2009, 6:12:24 AM
"Richard Russell" <ne...@rtrussell.co.uk> wrote in message
news:0e7faef3-c674-495b...@k17g2000yqh.googlegroups.com...
....

> If Microsoft is concerned about people "making
> design decisions based on internal/undocumented implementations and
> not the public contract" they need to put much more effort into
> correcting the errors and omissions in the Win32 API documentation.

If you pay attention, this process has been going on all the time recently.
Many of the Nt APIs have been documented in MSDN and included
in SDK header files (winternl.h), and "community content" can be added to
the MSDN online documentation.
Even if they have not done all that entirely out of their good will, this is
still a great improvement.
But internal implementation details are not contractual. They *can* change.
Happy testing!
--pa

Alexander Grigoriev

Dec 23, 2009, 11:19:49 AM
There are APIs, and there are internal functions. A documented API is a
contract, it will be there to stay, but an undocumented internal function is
a private affair, which can go away with any update.

m

Dec 23, 2009, 8:06:45 PM
IMHO, with few exceptions, MSDN constitutes high quality documentation. It
is certainly much better than the documentation for most systems that I have
dealt with, but no documentation can ever write or test the code for you
nor, assuredly, can it be understood in isolation.

gleaves@hotmail.com > <hugh<underbar>

Dec 24, 2009, 4:14:01 AM
I agree fully, Microsoft's documentation is about as good as I have ever
seen, given the sheer scale of what is going on, it's a daunting undertaking.
I recall sitting at a Unix box for the first time some years ago and simply
typing "help", my downhill relationship with that OS and its docs began right
there and then.

H

Grzegorz Wróbel

Dec 26, 2009, 1:03:32 AM
Hugo gle...@hotmail.com> <hugh wrote:
> Hi
>
> Well there is a good reason, we want to decompose a thread-id into two
> bytes, and use those to index into a 2D array of bytes, array size will just
> be 64K.
>

So you want a hash table for running processes with a perfect hashing
function? That's rather not possible with a 64K sized array.

But there are other structures than hash tables, like BST or AVL trees, that you
could use for it, and if you are concerned about speed you could use a hash
table of BST trees for example. With a well designed hash function this
would combine the speed of a hash table with the capacity of a search tree.
A well performing hash function for such an approach would be
(LOWORD(id)+HIWORD(id))%0xFFFF
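
Written out in portable C, that hash expression could look like the sketch below. The function name id_bucket and the constant BUCKET_COUNT are made up for illustration; the masks and shifts stand in for the LOWORD and HIWORD macros so the snippet does not depend on the Windows headers.

```c
#include <assert.h>
#include <stdint.h>

/* The modulus from the expression above: 65535 buckets, indices 0..65534. */
#define BUCKET_COUNT 0xFFFFu

/* Map a 32-bit process or thread ID to a bucket index. */
uint32_t id_bucket(uint32_t id)
{
    uint32_t low  = id & 0xFFFFu;   /* what LOWORD(id) yields */
    uint32_t high = id >> 16;       /* what HIWORD(id) yields */
    uint32_t bucket = (low + high) % BUCKET_COUNT;
    assert(bucket < BUCKET_COUNT);
    return bucket;
}
```

Distinct IDs can share a bucket (0x00000001 and 0x00010000 both map to bucket 1), which is why each bucket still needs a secondary structure such as the search tree suggested here.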

> If the ID really can ever exceed FFFFFFFF then our algorithm will not be
> possible and it's important us.

Huh? process id can never exceed FFFFFFFFh as this is the maximum value
for a DWORD.

>
> Now just because the data types used to represent a Process or Thread ID is
> 4 bytes, that does not mean it really does need 4 bytes or that cases can
> arise that will require 4 bytes.
>

But why do you bother? Does it matter? Simply write your code well so it
will work with 4-byte IDs.

>
> I'm not arguing, but I am seeking hard facts, perhaps someone who know the
> system internals (Sysinternals ?) may be able to provide some concrete answer.
>

The facts are that process and thread IDs are DWORDs. Everything else is speculation.

I don't know the details of your project but it really sounds like you have
a very basic problem and, being unable to design your algorithm well, you
attempt to bend reality to fit your flawed design.


--
Grzegorz Wróbel
677265676F727940346E6575726F6E732E636F6D
