thread pool and CRT

Markus Mauhart

Feb 28, 2002, 10:17:50 AM

Since Win2k the SDK contains a ready-to-use thread pool based on I/O
completion ports (IOCP). But I wonder whether I can use the VC++ CRT inside
this pool's worker threads, because the CRT documentation says
"thread should be created with beginthread"
and the CreateThread() documentation says
"A thread that uses functions from the C run-time libraries should use
the beginthread and endthread C run-time functions for thread management
rather than CreateThread and ExitThread. Failure to do so results in small
memory leaks when ExitThread is called."

Jon Wiswall [MS]

Feb 28, 2002, 12:51:05 PM

The documentation is correct - endthread cleans up some state per-thread
created by the CRT as-needed. Threads created via QueueUserWorkItem (the
win2k+ threadpool) are started with CreateThread and never really "Exit"
(per-se, they do exit, but not until the pool is idle for a time) so you'll
leak that little bit.

You might consider switching from the CRT to using Win32 API functions
instead - the mapping is generally trivial (malloc -> new/HeapAlloc(),
fopen -> CreateFile, fread -> ReadFile, etc.) and you'll remove "yet another
layer" between your code and the actual OS.

--
Jon Wiswall - Microsoft
This posting is provided AS IS with no warranties, and confers no rights.

"Markus Mauhart" <markus....@nospamm.chello.at> wrote in message
news:eywqAxGwBHA.1480@tkmsftngp02...

Markus Mauhart

Feb 28, 2002, 1:21:49 PM

"Jon Wiswall [MS]" <jon...@microsoft.com> wrote in message news:O7LtoBIwBHA.2540@tkmsftngp04...

> The documentation is correct - endthread cleans up some state per-thread
> created by the CRT as-needed. Threads created via QueueUserWorkItem (the
> win2k+ threadpool) are started with CreateThread and never really "Exit"
> (per-se, they do exit, but not until the pool is idle for a time) so you'll
> leak that little bit.

Is there any documentation or whitepaper on the Win2k thread pool's timing
for worker thread exit? It makes a difference whether this little
memory leak occurs every minute or every hour.

> You might consider switching from the CRT to using Win32 API functions
> instead - the mapping is generally trivial (malloc -> new/HeapAlloc(),
> fopen -> CreateFile, fread -> ReadFile, etc.) and you'll remove "yet another
> layer" between your code and the actual OS.

Generally I like to use the Win32 API functions directly, but I also depend
on the C++ library (e.g. operator new() uses malloc()) and sometimes on the
C library.

Thinking about it a little more, the really interesting issue could be:
What about the (internal) thread pools used by COM/RPC, e.g. when calling
into the MTA?
Are they based on 'the' Win2k thread pool?
Do they use Win32 CreateThread() or the CRT's _beginthread()?
If they are based on CreateThread(), my MTA server would get the same
memory leak, so again I would be interested in the timings those thread
pools use to exit unnecessary worker threads.

Jon Wiswall [MS]

Feb 28, 2002, 5:13:20 PM

Memory leaks are always bad. Whether they happen a little or a lot. The
machine that I type this on has been up for four weeks at this point, and is
perfectly stable with no leaks. If I ran your app on my desktop (or worse,
as a system service) for a week and it steadily leaked a few bytes an hour
as threads came and went, I'd be quite annoyed.

Thread pool details - this is sort of a black box. I don't know, but I'm
pretty sure knowing wouldn't make a difference... My bet is that like most
thread pools, it's resistant to thrashing. So once you've added a thread to
the pool, it'll hang around a while before it goes away to avoid the
startup/teardown cycle the next time you get heavily loaded.

MTA/COM/RPC: Since these technologies work back to win9x, I'm pretty sure
they don't use QueueUserWorkItem... Assume they're "good citizens" and use
CreateThread.

I don't know the leak overhead. You can certainly run some timings of your
own, I'm sure - snapshot the memory status, create a thread, use some CRT
functions and clean up after them (things like "strtok" are prime candidates
for this), exit the thread, snapshot memory again. You should see the leak
pretty easily. Then you can do some testing with the thread pool - get a
count of threads for a baseline, queue a lot of userworkitems (make sure
they do interesting things, not just exit when run - Sleep() for a while or
something to simulate work), exit all the workitem threadprocs, and count
threads again. That'll tell you how many threads got created for your
simulated workload. Multiply that by the leak, and you'll get a rough
estimate of the overhead.

--
Jon Wiswall - Microsoft
This posting is provided AS IS with no warranties, and confers no rights.

"Markus Mauhart" <markus....@nospamm.chello.at> wrote in message

news:uzkViTIwBHA.1460@tkmsftngp05...

Markus Mauhart

Feb 28, 2002, 7:20:08 PM

"Jon Wiswall [MS]" <jon...@microsoft.com> wrote in message news:OB1JLUKwBHA.2112@tkmsftngp02...

> Memory leaks are always bad. Whether they happen a little or a lot. The
> machine that I type this on has been up for four weeks at this point, and is
> perfectly stable with no leaks. If I ran your app on my desktop (or worse,
> as a system service) for a week and it steadily leaked a few bytes an hour
> as threads came and went, I'd be quite annoyed.
>
> Thread pool details - this is sort of a black box. I don't know, but I'm
> pretty sure knowing wouldn't make a difference... My bet is that like most
> thread pools, it's resistant to thrashing. So once you've added a thread to
> the pool, it'll hang around a while before it goes away to avoid the
> startup/teardown cycle the next time you get heavily loaded.
>
> MTA/COM/RPC: Since these technologies work back to win9x, I'm pretty sure
> they don't use QueueUserWorkItem... Assume they're "good citizens" and use
> CreateThread.
>
> I don't know the leak overhead. You can certainly run some timings of your
> own, I'm sure - snapshot the memory status, create a thread, use some CRT
> functions and clean up after them (things like "strtok" are prime candidates
> for this), exit the thread, snapshot memory again. You should see the leak
> pretty easily. Then you can do some testing with the thread pool - get a
> count of threads for a baseline, queue a lot of userworkitems (make sure
> they do interesting things, not just exit when run - Sleep() for a while or
> something to simulate work), exit all the workitem threadprocs, and count
> threads again. That'll tell you how many threads got created for your
> simulated workload. Multiply that by the leak, and you'll get a rough
> estimate of the overhead.

Thanks for your detailed comment.
I'm not really worried about the Win2k thread pool's memory leak when used with the CRT.
I've just tested the timing: an idle Win2k thread pool waits about 45 s before closing
(almost all) unnecessary threads.
On NT, COM relies on RPC when proxying to and from MTAs, and NT RPC is definitely
not based on Win9x technology, hence I thought its thread cache could
use IOCP, possibly in the form of a Win2k thread pool.

Sean Kelly

Mar 22, 2002, 3:11:18 PM

"Jon Wiswall [MS]" <jon...@microsoft.com> wrote in message

news:O7LtoBIwBHA.2540@tkmsftngp04...

> The documentation is correct - endthread cleans up some state per-thread
> created by the CRT as-needed. Threads created via QueueUserWorkItem (the
> win2k+ threadpool) are started with CreateThread and never really "Exit"
> (per-se, they do exit, but not until the pool is idle for a time) so you'll
> leak that little bit.

Owch. I was just considering switching to BindIoCompletionCallback instead
of explicitly using IOCP and hadn't even thought of this. So threads are
really created using CreateThread? I don't suppose there's a way around
this?

> You might consider switching from the CRT to using Win32 API functions
> instead - the mapping is generally trivial (malloc -> new/HeapAlloc(),
> fopen -> CreateFile, fread -> ReadFile, etc.) and you'll remove "yet another
> layer" between your code and the actual OS.

I'm not sure that this is an acceptable restriction to put on people using
my library code. At the very least, C++ "new" frequently uses malloc(), and
portable C++ libraries occasionally use C functions as well.

Jon Wiswall [MS]

Mar 22, 2002, 7:26:54 PM

"Sean Kelly" <ske...@advent.com> wrote in message
news:#c4lB4d0BHA.1512@tkmsftngp05...

> Owch. I was just considering switching to BindIoCompletionCallback instead
> of explicitly using IOCP and hadn't even thought of this. So threads are
> really created using CreateThread? I don't suppose there's a way around
> this?

Let me hand-wave a little more and say "they're created in a similar way".
There's no way to tell QueueUserWorkItem how to start threads on its own.
There's nothing barring you from creating your own threadpool, however -
QUWI makes some "intelligent" guesses about how long your worker jobs are
going to last, coupled with some memory pressure and some stuff based
on how long the queue is.

> I'm not sure that this is an acceptable restriction to put on people using
> my library code. At the very least, C++ "new" frequently uses malloc(), and
> portable C++ libraries occasionally use C functions as well.

Things like malloc() probably don't create that extra bit of bookkeeping
that would leak, as on win32 you can watch them call right through to
HeapAlloc(). The places that do create this bookkeeping are CRT functions
that have state associated with them, like strtok, splitpath, a few others.
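
The hidden state mentioned above is easy to see with strtok: the first call receives the buffer, and later calls pass a null pointer and rely on a position the library remembered. A multithreaded CRT has to keep that position per thread, which is the kind of bookkeeping under discussion. The sketch below is portable standard C++, and the helper name split_with_strtok is hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Splits `text` on any character in `delims` using strtok.
// strtok writes into its input, so the work is done on a private copy.
std::vector<std::string> split_with_strtok(const std::string& text,
                                           const char* delims) {
    assert(delims != NULL);
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    // First call: hand strtok the buffer; it remembers where it stopped.
    char* token = std::strtok(&buffer[0], delims);
    while (token != NULL) {
        tokens.push_back(token);
        // Later calls: pass NULL and rely on the remembered position.
        token = std::strtok(NULL, delims);
    }
    return tokens;
}
```

Calling split_with_strtok("a,b,,c", ",") yields the three tokens "a", "b" and "c", since strtok skips runs of delimiters.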

Sean Kelly

Mar 28, 2002, 6:29:19 PM

"Jon Wiswall [MS]" <jon...@microsoft.com> wrote in message

news:#EooHwp1BHA.2544@tkmsftngp07...

> "Sean Kelly" <ske...@advent.com> wrote in message
> news:#c4lB4d0BHA.1512@tkmsftngp05...
> > Owch. I was just considering switching to BindIoCompletionCallback instead
> > of explicitly using IOCP and hadn't even thought of this. So threads are
> > really created using CreateThread? I don't suppose there's a way around
> > this?
>
> Let me hand-wave a little more and say "they're created in a similar way".
> There's no way to tell QueueUserWorkItem how to start threads on its own.
> There's nothing barring you from creating your own threadpool, however -
> QUWI makes some "intelligent" guesses about how long your worker jobs are
> going to last, coupled with some memory pressure and some stuff based
> on how long the queue is.

This is what I decided to do. I wrote a function that mimics
BindIoCompletionCallback but uses _beginthreadex internally.

> > I'm not sure that this is an acceptable restriction to put on people using
> > my library code. At the very least, C++ "new" frequently uses malloc(), and
> > portable C++ libraries occasionally use C functions as well.
>
> Things like malloc() probably don't create that extra bit of bookkeeping
> that would leak, as on win32 you can watch them call right through to
> HeapAlloc(). The places that do create this bookkeeping are CRT functions
> that have state associated with them, like strtok, splitpath, a few others.

Ah, this is good to know. I hadn't known the leak was caused by a specific
subset of the CRT functions. Still, this is library code and I decided to
be paranoid since I can't guarantee what will be done in projects it's
integrated with. Implementing the function on my own wasn't as difficult as
I had feared. Thanks for the response!

Sean

George M. Garner Jr.

Mar 29, 2002, 10:28:49 AM

Jon,

The MT CRT has two functions which the runtime uses to clean up per-thread
data: _getptd() and _freeptd(). These functions can also be used within the
work item entry point to clean up _tiddata before the system thread exits
the entry point. Unfortunately, these functions are not exported from the
MT DLL, so this approach only works with the static MT library. It would be
helpful if these two functions could be exported in the next release. It
would be even better if the cleanup code could be factored out of
_endthread() and _endthreadex() into a separate exported function. That way
_tiddata could remain opaque and ISVs would have a documented way to
clean up per-thread data created by system threads. Please consider this a
feature request.

Regards,

George.

Jeff Henkels

Mar 29, 2002, 4:20:12 PM

Note that the CRT source is included on the CD, so you could probably
recompile it with those functions exported.

"George M. Garner Jr." <gmga...@erols.com> wrote in message
news:e$5QYZz1BHA.1652@tkmsftngp03...

Markus Mauhart

Mar 29, 2002, 11:52:47 AM

"George M. Garner Jr." <gmga...@erols.com> wrote in message news:e$5QYZz1BHA.1652@tkmsftngp03...

> Jon,
>
> The MT CRT has two functions which the runtime uses to clean up per-thread
> data: _getptd() and _freeptd(). These functions can also be used within the
> work item entry point to clean up _tiddata before the system thread exits
> the entry point. Unfortunately, these functions are not exported from the
> MT DLL, so this approach only works with the static MT library. It would be

But don't you think that the CRT DLL's approach (automatic per-thread cleanup
in its entry point on DLL_THREAD_DETACH) is a good solution?