
Linux 2MB Limit on Thread Stacks


Paul Whittemore

Sep 8, 2002, 11:10:55 PM
I posted a similar message in comp.programming.threads, but in
hindsight, while the issues are comparing pthreads implementations on
various platforms, the key questions are all Linux-specific. (Nobody
responded there, yet at least, anyway...) Also, I understand more
about the Linux implementation now, and can hopefully provide more
specific information. First, something just seems dead wrong:

Under Linux, it seems to me that the pthread_attr_setstack() function
has a bug(?) in that it indirectly uses __pthread_max_stacksize to
validate (and limit) the stack size specified, even if the caller
passes in their OWN stack address.

It seems somewhat arbitrary if the pthread_attr_setstack() function is
used to pass an address in, since the caller is responsible for
ensuring that the memory exists for the full range specified. How can
it possibly be invalid (which is the error code returned)?

If the Linux implementation did not have this bug, I would not have a
showstopper problem. See the detailed part marked *** below for more.
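
A minimal sketch of how the reported limit could be probed, assuming a POSIX system with mmap(). The helper name probe_user_stack() is hypothetical and does not come from this thread. It only asks the library whether it accepts a caller-supplied stack of a given size; on the LinuxThreads builds discussed here a request above 2MB would reportedly be rejected (presumably with EINVAL), while a current glibc is expected to accept it.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: ask the pthreads library whether it will accept a
 * caller-supplied stack of `size` bytes. Returns 0 if accepted, the error
 * number from pthread_attr_init() or pthread_attr_setstack() (for example
 * EINVAL) if rejected, or -1 if the memory could not be obtained. */
int probe_user_stack(size_t size)
{
    void *stack = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc == 0) {
        /* The caller owns the memory, so only the library's own checks
         * can make this call fail. */
        rc = pthread_attr_setstack(&attr, stack, size);
        pthread_attr_destroy(&attr);
    }
    munmap(stack, size);
    return rc;
}
```

Calling probe_user_stack() with, say, 1MB and then 16MB shows at which size the library starts refusing the request.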

So the questions:

1. Under Linux (only, it seems), is the 2MB upper limit on the stack
size passed to pthread_attr_setstack() a hard limit, even if the
caller passes in their own memory for the stack? That is, is there
NOTHING short of changing the linuxthreads implementation that could
remove this limit?

2. Is there anything special about the stack memory used by a Linux
thread? I am hoping that it is possible to allocate the stack from
the heap, and to reassign the stack pointer myself whenever I feel
like it.

Update: I think I can probably answer my own question here. Yes, very
much so.

The Linux pthreads implementation uses the current stack
address to find the Linux pthreads per-thread data. So for example,
after a stack swap, any reference to pthread_self() from a thread that
has swapped stacks will crash under Linux. Any reference to errno
will crash. Etc.

So this makes it a much simpler question. What facilities or
mechanisms are available under the Linux version of pthreads for the
use of alternate stacks? What if I need to swap stacks somewhere
along the lines while the thread is running? Does that mean that I
can't make any runtime library calls (that might need thread-specific
data) under the Linux implementation? How could the designers ever
consider such a limitation acceptable? Why must it use memory beyond
the end of the user stack? This is a terrible implementation.

I remembered a global variable (__pthread_nonstandard_stacks) that
could be set to signal that alternate stacks were being used. A
simple boolean. That sounded promising, but I found the following
quote in this message:
http://groups.google.ca/groups?q=Linux+thread+%22user+stack%22+group:comp.*&hl=en&lr=&ie=UTF-8&selm=slrn8dtqgn.dmu.kaz%40ashi.FootPrints.net&rnum=2

> Note: once you use one or more non-standard
> stacks, LinuxThreads switches to a rather
> expensive implementation of thread_self() for
> all threads. It linearly searches through all the
> threads for the one whose stack contains the stack pointer.

Checking the source seems to confirm it. Unless I'm missing
something, this doesn't really support alternate user stacks. It
supports user-specified stacks that have been indicated as such with
pthread_attr_setstack...() routines on the thread creation. But
again, only up to the 2MB limit of pthread_attr_setstacksize(), even
if the caller passes in their own memory. It doesn't allow a thread
to switch to an alternate stack. I have one thread that switches
between hundreds, sometimes thousands of private stacks, representing
the contexts of various cooperative tasks.

I'd really like to use a heap-allocated block as a replacement for the
pthreads-allocated stack simply by updating the SP when I feel it is
appropriate to be running on a substitute stack. (We already have
68K, PPC, x86 and Sparc implementations of this for other platforms,
so don't worry about our ability to do this, but the 2MB limit under
Linux is a showstopper problem that will require something more.)
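
A minimal sketch of one thread hopping onto a heap-allocated stack, using the ucontext calls (getcontext, makecontext, swapcontext) rather than the hand-written SP update described above; that substitution, and the helper name run_on_private_stack(), are assumptions made for illustration. On a library that keeps the thread identity in a register, pthread_self() and errno are expected to keep working on the private stack. On a LinuxThreads build that derives the identity from the stack pointer, the same code would be expected to fail in the way described in this post.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t   main_ctx, task_ctx; /* saved contexts for the switch */
static pthread_t    owner;              /* thread that starts the task   */
static volatile int task_ok;            /* result reported by the task   */

/* Body of the cooperative task; it runs entirely on the private stack. */
static void task_body(void)
{
    errno = 0;                          /* touches thread-local data */
    task_ok = pthread_equal(pthread_self(), owner) != 0;
    /* Returning resumes main_ctx through uc_link. */
}

/* Hypothetical helper: run task_body() on a heap-allocated stack of
 * `stack_size` bytes. Returns 1 if pthread_self() still identified the
 * calling thread while on that stack, 0 if not, -1 on setup failure. */
int run_on_private_stack(size_t stack_size)
{
    void *stack = malloc(stack_size);
    if (stack == NULL)
        return -1;

    if (getcontext(&task_ctx) == -1) {
        free(stack);
        return -1;
    }
    task_ctx.uc_stack.ss_sp   = stack;
    task_ctx.uc_stack.ss_size = stack_size;
    task_ctx.uc_link          = &main_ctx;
    owner   = pthread_self();
    task_ok = 0;
    makecontext(&task_ctx, task_body, 0);

    /* Switch to the private stack; control returns here when the task ends. */
    int rc = swapcontext(&main_ctx, &task_ctx);
    free(stack);
    return rc == -1 ? -1 : task_ok;
}
```

A scheduler for many cooperative tasks would keep one ucontext_t and one stack block per task and swap between them instead of returning immediately.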

It looks to me like even the more expensive search for the correct
thread isn't going to find one unless I specified it BEFORE the thread
was CREATED. Correct? And that also limits me to a single specific
stack per thread? That would be okay too, if the pthreads lib didn't
have this arbitrary 2MB limit for user-supplied stack memory.

So all it triggers is the more lengthy linear search through all of
the stack regions known to the pthreads implementation. In my case,
the stack region wouldn't be known. So this isn't going to help in
the slightest, because it is not a stack region known to Linux
threads.
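
The linear search can be pictured with the sketch below. It is an illustration only, not the LinuxThreads source: struct stack_region and find_thread_by_sp() are hypothetical names. It shows why a stack pointer inside a heap block that was never registered cannot be matched to any thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one stack the thread library knows about. */
struct stack_region {
    uintptr_t low;    /* lowest address of the stack  */
    uintptr_t high;   /* one past the highest address */
    int       thread; /* index of the owning thread   */
};

/* Return the owning thread index for stack pointer `sp`, or -1 when no
 * registered region contains it. */
int find_thread_by_sp(const struct stack_region *regions, size_t count,
                      uintptr_t sp)
{
    for (size_t i = 0; i < count; i++) {
        if (sp >= regions[i].low && sp < regions[i].high)
            return regions[i].thread;
    }
    return -1;  /* unknown stack: the caller has no thread to report */
}
```

A library that does not handle the -1 case would go on to use a descriptor that does not exist, which is consistent with the SEGV described in the next paragraph.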

Another solution for me would be in how the pthreads lib handled that
not found case. Currently, it tends to SEGV on me. If it could call
a user-supplied callback function, or something similar, to ask the
application if it knows, my application would. It is always the main
thread. I only use alternate stacks in one thread. Easy one-line
implementation in my case. But the pthreads implementation doesn't
deal with the stack region not being known to it, and crashes instead.
Only on Linux, and probably only on Intel.

Is there any mechanism to replace pthread_self() with a wrapper that
allows me to provide an implementation if the one in the pthreads lib
can't figure it out?

Anyone know more? Hasn't anyone already run into this before? I know
alternate stacks are not often needed, as a percentage of
threads-based projects, but certainly in absolute numbers, I would
imagine that they turn up regularly.

____

The background is that I have a server application that uses
coroutine-like, non-preemptive multitasking services we developed
years ago. Originally for the Mac, which had no real multitasking, it
provided a high-performance way for us to simplify many parallel
"tasks" in our code, on the Mac, without introducing the added
complexity and synchronization issues associated with preemptive
threads. Since then, we've ported this server to Windows and OS X,
adding additional parallelism through the use of actual (real) native
worker *threads* that feed from the core *thread* that drives the
coroutine *tasks*.

The problem is that, like threads, these tasks require stacks. We
have traditionally just allocated them from the mainline stack (by
using alloca() or by triggering guard pages to commit the memory,
setting the SP and slicing it up). We found that in some cases it was
better for the OS to know that the memory was being used for stack.
It also just seems better if the OS knew it was stack. From the point
of view of our server though, we just need some memory to push data on
when we make a function call, etc. It could have come from the heap.
Later, under Windows, we used NT "fibers" which gave us a
system-supported way to pass a stack size for each task. As far as I
know, there is nothing like that under Linux or Mac OS X, our two
current target environments. (Solaris is next.)

Under OS X (BSD), this was solved by creating a second mainline thread
(a thread 1 to replace thread 0), thereby allowing me to specify a
stack that was large enough for all of our tasks, using
pthread_attr_setstack. This works like a charm. I'm quite pleased
that part went so smoothly.
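
A minimal sketch of that "second mainline thread" arrangement, assuming a POSIX system with mmap(). The names second_mainline() and run_mainline_on_stack() are hypothetical stand-ins, and the thread body only checks that it really is running inside the supplied block.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

struct stack_check {
    uintptr_t low, high;   /* bounds of the caller-supplied stack     */
    int       inside;      /* set to 1 if the thread runs inside them */
};

/* Stand-in for the real mainline: records whether a local variable
 * lives inside the supplied stack block. */
static void *second_mainline(void *arg)
{
    struct stack_check *chk = arg;
    uintptr_t here = (uintptr_t)&chk;   /* address of a local variable */
    chk->inside = (here >= chk->low && here < chk->high);
    return NULL;
}

/* Hypothetical helper: start second_mainline() on a private stack of
 * `size` bytes. Returns 1 if the thread ran on that stack, 0 if it did
 * not, and -1 on any error. */
int run_mainline_on_stack(size_t size)
{
    void *stack = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    struct stack_check chk = { (uintptr_t)stack, (uintptr_t)stack + size, 0 };
    pthread_attr_t attr;
    pthread_t tid;
    int ok = -1;

    if (pthread_attr_init(&attr) == 0) {
        if (pthread_attr_setstack(&attr, stack, size) == 0 &&
            pthread_create(&tid, &attr, second_mainline, &chk) == 0 &&
            pthread_join(tid, NULL) == 0)
            ok = chk.inside;
        pthread_attr_destroy(&attr);
    }
    munmap(stack, size);   /* the thread has been joined by this point */
    return ok;
}
```

In a real server the thread body would carve the large block into per-task stacks instead of returning immediately.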

*** However, under Linux, the pthread_attr_setstacksize() function has
a bug(?) in that it indirectly uses __pthread_max_stacksize to
validate (and limit) the stack size specified, even if the caller
passes in their OWN stack address. This just seems DEAD WRONG to me,
and a bug specific to the Linux threads implementation. It's probably
a result of factoring the newer setstack code so that
pthread_attr_setstacksize() is used as part of the implementation of
pthread_attr_setstack(). Of course, pthread_attr_setstacksize()
doesn't really know if a user stack address was also specified or not.
That's one of the benefits of the new setstack() interface -- it does
know. Still, pthread_attr_setstacksize() could be improved to know,
or the code could be restructured a little to factor a little less of
the pthread_attr_setstacksize() implementation as common code,
excluding that size check.

If anyone can explain why pthread_attr_setstack needs to limit
user-specified stacks to 2MB, I'd appreciate hearing why.

At any rate, I'm screwed in any attempts to maintain the current
design, even though it is supported by Mac/68K, Mac/PPC, OS X/PPC,
Win32/x86, and Solaris/Sparc. It also works fine under Linux if I
don't link with the pthreads lib. This is because it's the pthreads
functions that limit the stacks to 2MB. (They also corrupt the user
stack at the bottom of the 8MB region by using data below the SP for
their own data, e.g. pthread_self(), even if I don't create any
threads.)

Any constructive feedback would be very much appreciated.

Thanks,
Paul

Paul Whittemore

Sep 9, 2002, 7:12:57 PM
Paul Whittemore <usemyfi...@usemylastname.com> writes:
> So all it triggers is the more lengthy linear search through all of
> the stack regions known to the pthreads implementation. In my case,
> the stack region wouldn't be known. So this isn't going to help in
> the slightest, because it is not a stack region known to Linux
> threads.

On 09 Sep 2002 22:55:23 +0200, Andi Kleen <fre...@alancoxonachip.com>
wrote:
>AFAIK only the upcoming glibc 2.3 will address this problem properly.
>It will use a segment register for the thread local data and not
>require the stack pointer for pthread_self().

Ooh, that sounds interesting. Any known plans for 2.3, in terms of
bundling with specific GCC versions (e.g. 3.2), or distro versions or
anything like that?

I'll have to find out more about the new glibc. Thanks for that
pointer.

> Unfortunately this has other problems, but that's a different story...

Of course... :-) Isn't it always the way... Sigh. Hopefully those
particular issues won't matter to my software. (I'm an optimist at
heart.)

Thanks
Paul

Ulrich Weigand

Sep 9, 2002, 11:12:03 PM
Paul Whittemore <usemyfi...@usemylastname.com> writes:

>The Linux pthreads implementation uses the current stack
>address to find the Linux pthreads per-thread data. So for example,
>after a stack swap, any reference to pthread_self() from a thread that
>has swapped stacks will crash under Linux. Any reference to errno
>will crash. Etc.

Well, that's one method. The current libpthread has two completely
distinct ways to implement pthread_self(), depending on whether
FLOATING_STACKS is set or not when building glibc.

If FLOATING_STACKS is not set, it works as you described: every
thread stack is exactly 2 MB, and the value of the stack pointer
is used to identify the current thread. (Either by just masking
off the low bits, or else by searching a linear list.)
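
The masking variant can be sketched as plain address arithmetic. This is an illustration under stated assumptions, not the libpthread source: struct thread_descr and both function names are hypothetical, and the descriptor is assumed to sit at the very top of each 2 MB-aligned stack region.

```c
#include <stdint.h>

#define STACK_SIZE ((uintptr_t)2 * 1024 * 1024)   /* fixed 2 MB per thread */

/* Hypothetical per-thread record, assumed to sit at the very top of each
 * 2 MB-aligned stack region. */
struct thread_descr {
    int thread_id;
    int saved_errno;
};

/* Round a stack pointer up to the end of its 2 MB-aligned region. */
uintptr_t region_top(uintptr_t sp)
{
    return (sp | (STACK_SIZE - 1)) + 1;
}

/* Address where the descriptor would live for the stack containing `sp`. */
uintptr_t descr_address(uintptr_t sp)
{
    return region_top(sp) - sizeof(struct thread_descr);
}
```

A stack pointer inside an arbitrary heap block goes through the same arithmetic and yields an address where no descriptor was ever placed, which fits the crashes reported after a stack swap.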

If FLOATING_STACKS *is* set, however, an architecture-specific
'magic' way is used to identify the current thread, and thread
stacks are much more flexible, e.g. they can be larger (or
smaller!) than 2 MB, and do not need to start at 2 MB boundaries.
(Each thread stack still cannot be larger than the stack size
rlimit.)

Which of the two methods is used depends on how glibc was built.
On some platforms, FLOATING_STACKS is never set. On others,
like s390 (which I'm most familiar with), it is always set
(using e.g. access registers to implement pthread_self on s390).
On x86, things appear to be somewhat complex. FLOATING_STACKS
is either set or not (if it is set, segment register %gs is
used to implement pthread_self), depending on glibc version,
Linux kernel version, and CPU level.

If you build glibc 2.2.x against Linux kernel 2.4.x headers
for an i686 architecture, FLOATING_STACKS should be set, as
far as I can see. However, I seem to recall discussions on
linux-kernel that various kernels have / had various bugs
in segment register handling that might affect proper functioning
of linuxthreads FLOATING_STACKS on x86. I'm not sure what
the current status is.


>So this makes it a much simpler question. What facilities or
>mechanisms are available under the Linux version of pthreads for the
>use of alternate stacks? What if I need to swap stacks somewhere
>along the lines while the thread is running? Does that mean that I
>can't make any runtime library calls (that might need thread-specific
>data) under the Linux implementation? How could the designers ever
>consider such a limitation acceptable? Why must it use memory beyond
>the end of the user stack? This is a terrible implementation.

If FLOATING_STACKS are not used, you can never swap stacks;
e.g. every signal received would crash the application.

>Is there any mechanism to replace pthread_self() with a wrapper that
>allows me to provide an implementation if the one in the pthreads lib
>can't figure it out?

Not without rebuilding libpthread; (the equivalent of) pthread_self is
inlined all over the place.


--
Dr. Ulrich Weigand
wei...@informatik.uni-erlangen.de

Wolfram Gloger

Sep 10, 2002, 5:12:05 AM
Andi Kleen <fre...@alancoxonachip.com> writes:

> AFAIK only the upcoming glibc 2.3 will address this problem properly.
> It will use a segment register for the thread local data and not
> require the stack pointer for pthread_self().

glibc-2.2 already had this mechanism, too. But it was/is only enabled
on ix86 when you compile for _i686_ on Kernel-2.4 or higher. Many
distributions didn't do this; Debian only configures for i386, AFAIK.

> Sounds like a bug yes.
>
> I would write to the glibc lists on that (see http://sources.redhat.com/glibc/)
> and perhaps open a gnats bug for it.

I've already submitted a patch for this, see

http://sources.redhat.com/ml/libc-alpha/2002-09/msg00187.html

Let me know if this patch doesn't work, I've only done light testing.

Regards,
Wolfram.

Paul Whittemore

Sep 10, 2002, 1:07:55 PM
Thanks to Ulrich Weigand and Wolfram Gloger for their very helpful
replies. I think they have given me most of what I need to resolve
this issue, although their replies have left me with a couple of
smaller questions.

On 10 Sep 2002 03:12:03 GMT, Ulrich Weigand
<wei...@informatik.uni-erlangen.de> wrote:

>If FLOATING_STACKS *is* set, however, an architecture-specific
>'magic' way is used to identify the current thread, and thread
>stacks are much more flexible, e.g. they can be larger (or
>smaller!) than 2 MB, and do not need to start at 2 MB boundaries.
>(Each thread stack still cannot be larger than the stack size
>rlimit.)

This would not be a problem for me if there were some way to increase
the rlimit prior to the initialization of __pthread_max_stacksize.
Unfortunately, it is initialized prior to main() and prior to any
global class constructors (my normal platform-independent module
initialization method). It always uses the default (8MB) stack rlimit
even if I increase that as the first line of my code. Any ideas?

>Which of the two methods is used depends on how glibc was built.
>On some platforms, FLOATING_STACKS is never set. On others,
>like s390 (which I'm most familiar with), it is always set
>(using e.g. access registers to implement pthread_self on s390).
>On x86, things appear to be somewhat complex. FLOATING_STACKS
>is either set or not (if it is set, segment register %gs is
>used to implement pthread_self), depending on glibc version,
>Linux kernel version, and CPU level.

I think Windows uses a similar approach for its thread-local storage.
And I suspect that the PPC implementation on BSD (Mac OS X) uses a
similar register-based approach, which is why I'm not having any
trouble there.

>If you build glibc 2.2.x against Linux kernel 2.4.x headers
>for an i686 architecture, FLOATING_STACKS should be set, as
>far as I can see. However, I seem to recall discussions on
>linux-kernel that various kernels have / had various bugs
>in segment register handling that might affect proper functioning
>of linuxthreads FLOATING_STACKS on x86. I'm not sure what
>the current status is.

Question about i686: does that mean Pentium Pro or later? What would
this do to our requirements?

(I'm not concerned; I just need to understand it so I can explain to
our QA and documentation staff what the requirements are.)

To be honest, I don't have any problem specifying a more recent
machine. My application is a server application, typically running on
a dedicated machine, supporting thousands of users, sometimes valued
at more than a million dollars just for the hardware. If I say it
requires a Pentium-II, customers will laugh. :-)

>Not without rebuilding libpthread; (the equivalent of) pthread_self is
>inlined all over the place.

Okay, thanks for that answer.

On 10 Sep 2002 11:12:05 +0200, Wolfram Gloger

I figured that I could probably fix it myself, but of course I'd use
your specific changes in the hopes of confirming a "standard" fix.
Thanks very much for that patch.

But what would this mean for my build process? I don't want to tell
customers that they must update their system in any way. Is the
pthreads lib self-contained (static), or does linking with
libpthread.a merely provide the static stubs to the dynamic library,
libpthread.so?

I guess the question is, are these two libs *variants* of each other,
or is the dynamic lib required for an app that links with the .a file?

Still learning about dynamic libs under Linux (and other *nix
platforms).

Thanks for all your answers. You are most helpful.

Paul

Ulrich Weigand

Sep 10, 2002, 3:22:14 PM
Paul Whittemore <usemyfi...@usemylastname.com> writes:

>This would not be a problem for me if there were some way to increase
>the rlimit prior to the initialization of __pthread_max_stacksize.
>Unfortunately, it is initialized prior to main() and prior to any
>global class constructors (my normal platform-independent module
>initialization method). It always uses the default (8MB) stack rlimit
>even if I increase that as the first line of my code. Any ideas?

Well, the obvious hack would be to have a wrapper application
(or shell script) that increases the rlimit and then exec's
the real application ...
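
A minimal sketch of such a wrapper, assuming POSIX getrlimit(), setrlimit() and execv(). The function names and the idea of passing the server path as a parameter are illustrative assumptions, not part of any existing tool. Because the limit is raised before the exec, the thread library in the real application initializes under the new value.

```c
#include <sys/resource.h>
#include <unistd.h>

/* Raise the soft RLIMIT_STACK to at least `wanted` bytes, capped at the
 * hard limit. Never lowers an existing limit. Returns 0 on success and
 * -1 on error (errno is set by getrlimit or setrlimit). */
int raise_stack_limit(rlim_t wanted)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) == -1)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;                         /* already large enough */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;             /* cannot exceed the hard limit */
    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_STACK, &rl);
}

/* Raise the limit, then replace this process with the real server so
 * that its thread library starts up under the new limit. This function
 * only returns if something failed. */
int exec_with_stack_limit(rlim_t wanted, const char *path, char *const argv[])
{
    if (raise_stack_limit(wanted) == -1)
        return -1;
    execv(path, argv);
    return -1;                            /* execv only returns on error */
}
```

The wrapper's main() would call exec_with_stack_limit() with the desired size, the path of the real server binary, and its own argv.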

>Question about i686: does that mean Pentium Pro or later? What would
>this do to our requirements?

>(I'm not concerned; I just need to understand it so I can explain to
>our QA and documentation staff what the requirements are.)

Yes, that's Pentium Pro or later. However, this is most likely
not the problem; the problem is that glibc must have been *compiled*
for this CPU type. If it wasn't in any particular Linux
distribution, you'd have to ask your customers to replace
their system glibc and libpthread, which they probably
wouldn't like.

>But what would this mean for my build process? I don't want to tell
>customers that they must update their system in any way. Is the
>pthreads lib self-contained (static), or does linking with
>libpthread.a merely provide the static stubs to the dynamic library,
>libpthread.so?

I don't think you can easily use some other library instead of
the system libpthread. Because libpthread and glibc interact
closely, you'll need to have a libpthread that matches whatever
glibc your customers have; this would seriously restrict the
distributions you can support.

You could of course link your whole application statically,
against both glibc and libpthread. However, this is also not
really to be recommended, as the glibc authors will not
guarantee upwards compatibility for those applications (e.g.
because the formats of auxiliary files used by glibc may
change as the system glibc is upgraded to a newer version,
causing your statically linked application to break).

IMO the only realistic way would be to require your customers
to have a distribution which provides a version of libpthread
with FLOATING_STACKS. On recent RedHat releases (I think)
both versions are provided, and you can tell the linker with
an environment variable which one to use ...

Paul Whittemore

Sep 10, 2002, 6:25:11 PM
On 10 Sep 2002 19:22:14 GMT, Ulrich Weigand
<wei...@informatik.uni-erlangen.de> wrote:

>Paul Whittemore <usemyfi...@usemylastname.com> writes:
>
>>This would not be a problem for me if there were some way to increase
>>the rlimit prior to the initialization of __pthread_max_stacksize.
>>Unfortunately, it is initialized prior to main() and prior to any
>>global class constructors (my normal platform-independent module
>>initialization method). It always uses the default (8MB) stack rlimit
>>even if I increase that as the first line of my code. Any ideas?
>
>Well, the obvious hack would be to have a wrapper application
>(or shell script) that increases the rlimit and then exec's
>the real application ...

Good point. I hadn't considered using a separate external program to
do this. Also, I suppose the ulimit shell command could in fact be
that program... :-)

>>Question about i686: does that mean Pentium Pro or later? What would
>>this do to our requirements?
>
>>(I'm not concerned; I just need to understand it so I can explain to
>>our QA and documentation staff what the requirements are.)
>
>Yes, that's Pentium Pro or later. However, this is most likely
>not the problem; the problem is that glibc must have been *compiled*
>for this CPU type. If it wasn't in any particular Linux
>distribution, you'd have to ask your customers to replace
>their system glibc and libpthread, which they probably
>wouldn't like.

Thanks. I understood that it needed to be rebuilt. I just wanted to
confirm what the requirements would be (PPro, P2, etc) if I was using
i686 code. I thought it meant PPro but I wasn't certain about that.

>I don't think you can easily use some other library instead of
>the system libpthread. Because libpthread and glibc interact
>closely, you'll need to have a libpthread that matches whatever
>glibc your customers have; this would seriously restrict the
>distributions you can support.

Understood, and that is the main reason for my reluctance to use a
solution that requires a specially-built pthreads lib.

>You could of course link your whole application statically,
>against both glibc and libpthread. However, this is also not
>really to be recommended, as the glibc authors will not
>guarantee upwards compatibility for those applications (e.g.
>because the formats of auxiliary files used by glibc may
>change as the system glibc is upgraded to a newer version,
>causing your statically linked application to break).

Due to the nature of our server app, it's not unreasonable to specify
a stricter set of requirements than, say, a more common user-end tool
(such as our client software). But for reasons such as you mention,
I'd rather find a mechanism to allow dynamic linking, especially of
the glibc version.

>IMO the only realistic way would be to require your customers
>to have a distribution which provides a version of libpthread
>with FLOATING_STACKS. On recent RedHat releases (I think)
>both versions are provided, and you can tell the linker with
>an environment variable which one to use ...

Now that's the kind of info tidbit that I was hoping to hear. I think
I'll focus some research on FLOATING_STACKS now.

Thanks again,
Paul
