Why doesn't threading.join() return a value?

Roy Smith

Sep 2, 2011, 10:53:35 AM
I have a function I want to run in a thread and return a value. It
seems like the most obvious way to do this is to have my target
function return the value, the Thread object stash that someplace, and
return it as the return value for join().

Yes, I know there's other ways for a thread to return values (pass the
target a queue, for example), but making the return value of the
target function available would have been the most convenient. I'm
curious why threading wasn't implemented this way.
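
For reference, the queue approach mentioned above might look roughly like this. It is a minimal sketch in Python 3 (the module was spelled `Queue` in Python 2); the `compute` function and the variable names are made up for illustration.

```python
import queue
import threading


def compute(x, results):
    # Hypothetical worker: hand the result back through the queue
    # instead of relying on the function's return value.
    results.put(x * x)


results = queue.Queue()
worker = threading.Thread(target=compute, args=(7, results))
worker.start()
worker.join()            # join() itself returns None
value = results.get()    # the value travels through the queue instead
```

The cost is that the target has to know about the queue, which is the inconvenience the question is about.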

Steven D'Aprano

Sep 2, 2011, 11:45:22 AM
Roy Smith wrote:

Because then the code launching the thread would have to block, waiting
until the thread is completed, so it will have a result to return.


--
Steven

Seebs

Sep 2, 2011, 12:42:34 PM

Isn't "waiting until the thread is completed" sort of the point of join()?

-s
--
Copyright 2011, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

Adam Skutt

Sep 2, 2011, 2:01:17 PM

I assume it is because the underlying operating system APIs do not
support it. Windows and POSIX threads only support returning an
integer when a thread exits, similar to the exit code of a process.
More importantly, there's no way to tell whether the exit code of a
thread was set by user code or by the system. Even worse, some of
those integer values are reserved by some operating systems. If your
thread died via an exception, it still has an error code set by the
operating system. How are you going to distinguish those codes from
your own?

Adam

Alain Ketterlin

Sep 2, 2011, 2:23:28 PM
Adam Skutt <ask...@gmail.com> writes:

> On Sep 2, 10:53 am, Roy Smith <r...@panix.com> wrote:
>> I have a function I want to run in a thread and return a value.  It
>> seems like the most obvious way to do this is to have my target
>> function return the value, the Thread object stash that someplace, and
>> return it as the return value for join().
>>
>> Yes, I know there's other ways for a thread to return values (pass the
>> target a queue, for example), but making the return value of the
>> target function available would have been the most convenient.  I'm
>> curious why threading wasn't implemented this way.
>
> I assume it is because the underlying operating system APIs do not
> support it. Windows and POSIX threads only support returning an
> integer when a thread exits, similar to the exit code of a process.

Sorry, you're wrong, at least for POSIX threads:

void pthread_exit(void *value_ptr);
int pthread_join(pthread_t thread, void **value_ptr);

pthread_exit can pass anything, and that value will be retrieved with
pthread_join. Threads of a process share their address space, there is
no reason to restrict their return value to an int.

> More importantly, there's no way to tell whether the exit code of a
> thread was set by user code or by the system. Even worse, some of
> those integer values are reserved by some operating systems.

I'm not sure what you are talking about here. Maybe you confuse threads
with processes?

Re. the original question: since you can define your own Thread
subclass, with whatever attribute you want, I guess there was no need to
use join() to communicate the result. The Thread's run() can store its
result in an attribute, and the "client" can get it from the same
attribute after a successful call to join().
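
A minimal sketch of that idea, written for Python 3. The class name `SquareThread` and its attributes are hypothetical, chosen only for illustration.

```python
import threading


class SquareThread(threading.Thread):
    """Hypothetical Thread subclass that keeps its result in an attribute."""

    def __init__(self, x):
        super().__init__()
        self.x = x
        self.result = None   # filled in by run()

    def run(self):
        # Executes in the new thread; the caller reads self.result
        # only after join() has returned.
        self.result = self.x * self.x


t = SquareThread(6)
t.start()
t.join()
answer = t.result
```

Reading `t.result` before `join()` returns would race with `run()`, so the attribute is only meaningful after a successful join.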

-- Alain.

Adam Skutt

Sep 2, 2011, 2:53:43 PM
On Sep 2, 2:23 pm, Alain Ketterlin <al...@dpt-info.u-strasbg.fr>
wrote:

> Sorry, you're wrong, at least for POSIX threads:
>
> void pthread_exit(void *value_ptr);
> int pthread_join(pthread_t thread, void **value_ptr);
>
> pthread_exit can pass anything, and that value will be retrieved with
> pthread_join.

No, it can only pass a void*, which isn't much better than passing an
int. Passing a void* is not equivalent to passing anything, not even
in C. Moreover, specific values are still reserved, like
PTHREAD_CANCELED. Yes, it was strictly inappropriate for me to say
both return solely integers, but my error doesn't meaningfully alter my
description of the situation. The interface provided by the
underlying APIs is not especially usable for arbitrary data transfer.
Doubly so when we're discussing something like Python's threading
module.

> I'm not sure what you are talking about here. Maybe you confuse threads
> with processes?

Windows threads have exit codes, just like processes. At least one
code is reserved and cannot be used by the programmer.

Adam

Chris Torek

Sep 2, 2011, 4:14:56 PM
>On Sep 2, 2:23 pm, Alain Ketterlin <al...@dpt-info.u-strasbg.fr>
>wrote:
>> Sorry, you're wrong, at least for POSIX threads:
>>
>> void pthread_exit(void *value_ptr);
>> int pthread_join(pthread_t thread, void **value_ptr);
>>
>> pthread_exit can pass anything, and that value will be retrieved with
>> pthread_join.

In article <bf50c8e1-1476-41e1...@s12g2000yqm.googlegroups.com>


Adam Skutt <ask...@gmail.com> wrote:
>No, it can only pass a void*, which isn't much better than passing an
>int.

It is far better than passing an int, although it leaves you with
an annoying storage-management issue, and sidesteps any reasonable
attempts at type-checking (both of which are of course "par for
the course" in C). For instance:

    struct some_big_value {
        ... lots of stuff ...
    };
    struct some_big_value storage_management_problem[SIZE];
    ...
    void *func(void *initial_args) {
        ...
    #ifdef ONE_WAY_TO_DO_IT
        pthread_exit(&storage_management_problem[index]);
        /* NOTREACHED */
    #else /* the other way */
        return &storage_management_problem[index];
    #endif
    }
    ...
        int error;
        pthread_t threadinfo;
        pthread_attr_t attr;
        ...
        pthread_attr_init(&attr);
        /* set attributes if desired */
        error = pthread_create(&threadinfo, &attr, func, &args_to_func);
        if (error) {
            ... handle error ...
        } else {
            ...
            void *rv;
            /* pthread_join takes the pthread_t by value, not its address */
            error = pthread_join(threadinfo, &rv);
            if (error) {
                ... handle error ...
            } else if (rv == PTHREAD_CANCELED) {
                ... the thread was canceled ...
            } else {
                struct some_big_value *ret = rv;
                ... work with ret->field ...
            }
        }

(Or, do dynamic allocation, and have a struct with a distinguishing
ID followed by a union of multiple possible values, or a flexible
array member, or whatever. This means you can pass any arbitrary
data structure back, provided you can manage the storage somehow.)

>Passing a void* is not equivalent to passing anything, not even
>in C. Moreover, specific values are still reserved, like
>PTHREAD_CANCELED.

Some manual pages are clearer about this than others. Here is one
that I think is not bad:

The symbolic constant PTHREAD_CANCELED expands to a constant
expression of type (void *), whose value matches no pointer to
an object in memory nor the value NULL.

So, provided you use pthread_exit() "correctly" (always pass either
NULL or the address of some actual object in memory), the special
reserved value is different from all of "your" values.

(POSIX threads are certainly klunky, but not all *that* badly designed
given the constraints.)

>>Re. the original question: since you can define your own Thread
>>subclass, with whatever attribute you want, I guess there was no need to
>>use join() to communicate the result. The Thread's run() can store its
>>result in an attribute, and the "client" can get it from the same
>>attribute after a successful call to join().

For that matter, you can use the following to get what the OP asked
for. (Change all the instance variables to __-prefixed versions
if you want them to be Mostly Private.)

    import threading

    class ValThread(threading.Thread):
        "like threading.Thread, but the target function's return val is captured"
        def __init__(self, group=None, target=None, name=None,
                     args=(), kwargs=None, verbose=None):
            super(ValThread, self).__init__(group, None, name, None, None, verbose)
            self.value = None
            self.target = target
            self.args = args
            self.kwargs = {} if kwargs is None else kwargs

        def run(self):
            "run the thread"
            if self.target:
                self.value = self.target(*self.args, **self.kwargs)

        def join(self, timeout = None):
            "join, then return value set by target function"
            super(ValThread, self).join(timeout)
            return self.value
--
In-Real-Life: Chris Torek, Wind River Systems
Intel require I note that my opinions are not those of WRS or Intel
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html

Steven D'Aprano

Sep 2, 2011, 4:49:17 PM
Seebs wrote:

> On 2011-09-02, Steven D'Aprano <steve+comp....@pearwood.info>
> wrote:

[...]


>> Because then the code launching the thread would have to block, waiting
>> until the thread is completed, so it will have a result to return.
>
> Isn't "waiting until the thread is completed" sort of the point of join()?

Doh!

I mean, well done, you have passed my little test!

<wink>

--
Steven

Adam Skutt

Sep 2, 2011, 6:02:08 PM
On Sep 2, 4:14 pm, Chris Torek <nos...@torek.net> wrote:
> In article <bf50c8e1-1476-41e1-b2bc-61e329bfa...@s12g2000yqm.googlegroups.com>

> Adam Skutt  <ask...@gmail.com> wrote:
>
> >No, it can only pass a void*, which isn't much better than passing an
> >int.
>
> It is far better than passing an int, although it leaves you with
> an annoying storage-management issue, and sidesteps any reasonable
> attempts at type-checking (both of which are of course "par for
> the course" in C).

And when written out, makes it sound distinctly worse than passing an
int :p. And let's not kid ourselves, unless you're a C programmer, it
is distinctly worse than passing an int. Heck, your example (snipped)
goes out of your way to unnecessarily leverage the functionality
provided by pthreads.

> Some manual pages are clearer about this than others.  Here is one
> that I think is not bad:
>
>     The symbolic constant PTHREAD_CANCELED expands to a constant
>     expression of type (void *), whose value matches no pointer to
>     an object in memory nor the value NULL.
>
> So, provided you use pthread_exit() "correctly" (always pass either
> NULL or the address of some actual object in memory), the special
> reserved value is different from all of "your" values.

Unfortunately, I'm not sure all implementations behave that way. Not
that cancellation is really worth bothering with anyway, but it's a
pretty nasty corner case.

Adam

Roy Smith

Sep 2, 2011, 7:16:29 PM
In article
<5da6bf87-9412-46c4...@o15g2000vbe.googlegroups.com>,
Adam Skutt <ask...@gmail.com> wrote:

> On Sep 2, 10:53 am, Roy Smith <r...@panix.com> wrote:
> > I have a function I want to run in a thread and return a value.  It
> > seems like the most obvious way to do this is to have my target
> > function return the value, the Thread object stash that someplace, and
> > return it as the return value for join().
> >
> > Yes, I know there's other ways for a thread to return values (pass the
> > target a queue, for example), but making the return value of the
> > target function available would have been the most convenient.  I'm
> > curious why threading wasn't implemented this way.
>
> I assume it is because the underlying operating system APIs do not
> support it. Windows and POSIX threads only support returning an
> integer when a thread exits, similar to the exit code of a process.

But the whole point of higher level languages is to hide the warts of
the lower-level APIs they are built on top of. Just because a POSIX
thread can only return an int (actually, a void *) doesn't mean that
level of detail needed to be exposed at the Python threading library
level.

> More importantly, there's no way to tell whether the exit code of a
> thread was set by user code or by the system. Even worse, some of
> those integer values are reserved by some operating systems. If your
> thread died via an exception, it still has an error code set by the
> operating system. How are you going to distinguish those codes from
> your own?

I think you're talking about processes, not threads, but in any case,
it's a non-sequitur. Thread.join() currently returns None, so there's
no chance for confusion.

Chris Torek

Sep 2, 2011, 9:04:16 PM
In article <roy-030914.1...@news.panix.com>

Roy Smith <r...@panix.com> wrote:
>Thread.join() currently returns None, so there's
>no chance for [return value] confusion.

Well, still some actually. If you use my example code (posted
elsethread), you need to know:

- that there was a target function (my default return
value if there is none is None); and
- that the joined thread really did finish (if you pass
a timeout value, rather than None, and the join times
out, the return value is again None).

Of course, if your target function always exists and never returns
None, *then* there's no chance for confusion. :-)

Alain Ketterlin

Sep 3, 2011, 4:33:10 AM
Adam Skutt <ask...@gmail.com> writes:

> On Sep 2, 2:23 pm, Alain Ketterlin <al...@dpt-info.u-strasbg.fr>
> wrote:
>> Sorry, you're wrong, at least for POSIX threads:
>>
>> void pthread_exit(void *value_ptr);
>> int pthread_join(pthread_t thread, void **value_ptr);
>>
>> pthread_exit can pass anything, and that value will be retrieved with
>> pthread_join.
>
> No, it can only pass a void*, which isn't much better than passing an
> int.

We'll have to disagree. A void* simply can point to anything you want.
Since thread stacks disappear at end of thread, only dynamically
allocated memory can be used to store the result. That's why you get a
pointer. There is no restriction on that pointer provided it doesn't
point to memory that has been deallocated.

> Passing a void* is not equivalent to passing anything, not even in C.
> Moreover, specific values are still reserved, like PTHREAD_CANCELED.

Thread cancellation is program logic (pthread_cancel), it doesn't mean
your thread crashed, it means your program decided to cancel the thread.
If you still care about the return value after having called
pthread_cancel(),

> Yes, it was strictly inappropriate for me to say both return solely
> integers, but my error doesn't meaningfully alter my description of the
> situation. The interface provided by the underlying APIs is not
> especially usable for arbitrary data transfer.

Again, I may misunderstand your wording, but there is no "data transfer"
at all, since memory is shared between threads.

> Doubly so when we're discussing something like Python's threading
> module.

The OP was clearly discussing the case where a thread has a result, and
how to get it back. POSIX threads let you do that. There are of course
tons of other ways to do the same thing. Win32 will force you to use
some other way.

>> I'm not sure what you are talking about here. Maybe you confuse threads
>> with processes?
>
> Windows threads have exit codes, just like processes. At least one
> code is reserved and cannot be used by the programmer.

Is that STILL_ACTIVE that we are talking about? That's an artefact of
the design of GetExitCodeThread, which will return either the thread
exit code or its own error code. The python lib could easily hide this,
and use run()'s return value to store the (python) result somewhere.

-- Alain.

Alain Ketterlin

Sep 3, 2011, 5:00:22 AM
Alain Ketterlin <al...@dpt-info.u-strasbg.fr> writes:

>> Passing a void* is not equivalent to passing anything, not even in C.
>> Moreover, specific values are still reserved, like PTHREAD_CANCELED.
>
> Thread cancellation is program logic (pthread_cancel), it doesn't mean
> your thread crashed, it means your program decided to cancel the thread.
> If you still care about the return value after having called
> pthread_cancel(),

Sorry, forgot to end this sentence... What I mean is:

If you still care about the return value after having called
pthread_cancel(), your program logic is unnecessarily complex, and
you should find some other way to handle this case.

-- Alain.

Carl Banks

Sep 3, 2011, 8:04:42 AM
On Friday, September 2, 2011 11:01:17 AM UTC-7, Adam Skutt wrote:

> On Sep 2, 10:53 am, Roy Smith <r....@panix.com> wrote:
> > I have a function I want to run in a thread and return a value.  It
> > seems like the most obvious way to do this is to have my target
> > function return the value, the Thread object stash that someplace, and
> > return it as the return value for join().
> >
> > Yes, I know there's other ways for a thread to return values (pass the
> > target a queue, for example), but making the return value of the
> > target function available would have been the most convenient.  I'm
> > curious why threading wasn't implemented this way.
>
> I assume it is because the underlying operating system APIs do not
> support it.

Nope. This could easily be implemented by storing the return value in the Thread object.

It's not done that way probably because no one thought of doing it.


Carl Banks

Carl Banks

Sep 3, 2011, 8:27:51 AM
On Friday, September 2, 2011 11:53:43 AM UTC-7, Adam Skutt wrote:
> On Sep 2, 2:23 pm, Alain Ketterlin <al...@dpt-info.u-strasbg.fr>
> wrote:
> > Sorry, you're wrong, at least for POSIX threads:
> >
> > void pthread_exit(void *value_ptr);
> > int pthread_join(pthread_t thread, void **value_ptr);
> >
> > pthread_exit can pass anything, and that value will be retrieved with
> > pthread_join.
>
> No, it can only pass a void*, which isn't much better than passing an
> int. Passing a void* is not equivalent to passing anything, not even
> in C. Moreover, specific values are still reserved, like
> PTHREAD_CANCELED. Yes, it was strictly inappropriate for me to say
> both return solely integers, but my error doesn't meaningfully alter my
> description of the situation. The interface provided by the
> underlying APIs is not especially usable for arbitrary data transfer.

I'm sorry, but your claim is flat out wrong. It's very common in C programming to use a void* to give a programmer ability to pass arbitrary data through some third-party code.

The Python API itself uses void* in this way in several different places. For instance, take a look at the Capsule API (http://docs.python.org/c-api/capsule.html). You'll notice it uses a void* to let a user pass in opaque data. Another case is when declaring properties in C: it's common to define a single get or set function, and only vary some piece of data for the different properties. The API provides a void* so that the extension writer can pass arbitrary data to the get and set functions.


Carl Banks

Roy Smith

Sep 3, 2011, 12:51:03 PM
In article <j3rdg...@news6.newsguy.com>,
Chris Torek <nos...@torek.net> wrote:

> For that matter, you can use the following to get what the OP asked
> for. (Change all the instance variables to __-prefixed versions
> if you want them to be Mostly Private.)
>
>     import threading
>
>     class ValThread(threading.Thread):
>         "like threading.Thread, but the target function's return val is captured"
>         def __init__(self, group=None, target=None, name=None,
>                      args=(), kwargs=None, verbose=None):
>             super(ValThread, self).__init__(group, None, name, None, None,
>                                             verbose)
>             self.value = None
>             self.target = target
>             self.args = args
>             self.kwargs = {} if kwargs is None else kwargs
>
>         def run(self):
>             "run the thread"
>             if self.target:
>                 self.value = self.target(*self.args, **self.kwargs)
>
>         def join(self, timeout = None):
>             "join, then return value set by target function"
>             super(ValThread, self).join(timeout)
>             return self.value

Yeah, that's pretty much what I had in mind. I'm inclined to write up a
PEP proposing that this become the standard behavior of
threading.Thread. It seems useful, and I can't see any way it would
break any existing code.
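
For comparison, the standard library's `concurrent.futures` module (added in Python 3.2) already hands back a function's return value through a `Future`, without touching `threading.Thread` itself. A minimal sketch, with a made-up `compute` function:

```python
from concurrent.futures import ThreadPoolExecutor


def compute(x):
    # Hypothetical target: returns its result the ordinary way.
    return x * x


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(compute, 9)
    # result() blocks until the call finishes, much like join(),
    # then returns the value (or re-raises the target's exception).
    value = future.result()
```

This does not change what `Thread.join()` returns; it only shows one existing route to the same convenience.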
