Is pthread_cleanup_push/pop Supposed To Work On Mac OSX?

Keith

Jan 28, 2003, 4:25:37 PM

Here is a little program that doesn't work on the Darwin machine. It
will not call the "exit_routine". The program functions properly on
Sun, IRIX and Linux. I'd appreciate any help. Thanks.

/**********************/
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

void* routine() ;
void exit_routine() ;

pthread_t thread ;

int main() {

pthread_create( &thread, NULL, routine, (void*) NULL ) ;
usleep( 500000 );
pthread_cancel( thread ) ;
pthread_join( thread, (void*) NULL ) ;
}

void* routine() {

int execute = 1 ;
int ii = 0 ;

pthread_cleanup_push(exit_routine, (void*)NULL) ;
while ( 1 ) {
fprintf(stderr, "> %d\n", ii++ ) ;
usleep(100000);
}
pthread_cleanup_pop(execute) ;
}

void exit_routine() {
fprintf(stderr, "Routine successfully exited.\n");
}
/********************************/

Michael Fuhr

Jan 28, 2003, 6:28:10 PM

vet...@lincom-asg.com (Keith) writes:

> Here is a little program that doesn't work on the Darwin machine. It
> will not call the "exit_routine". The program functions properly on
> Sun, IRIX and Linux. I'd appreciate any help. Thanks.

I'm no thread expert but I'll give it a shot. Somebody please
correct my comments if I'm mistaken about anything.

> /**********************/
> #include <stdio.h>
> #include <unistd.h>
> #include <pthread.h>
>
> void* routine() ;
> void exit_routine() ;
>
> pthread_t thread ;
>
> int main() {
>
> pthread_create( &thread, NULL, routine, (void*) NULL ) ;
> usleep( 500000 );
> pthread_cancel( thread ) ;
> pthread_join( thread, (void*) NULL ) ;
> }

main() is supposed to return an integer but it doesn't return
anything. I'd suggest adding a statement like "return EXIT_SUCCESS"
or "return 0".

Your code never checks functions' return values to see if they were
successful or not. Perhaps that's just to simplify the example;
in the real code it would be a good idea to check the return value
of any function upon whose success your program depends.
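
A minimal sketch of that kind of checking, assuming nothing beyond the standard pthread calls already used in the program. The pthread functions report failure by returning an error number rather than by setting errno, so the return value itself can be passed to strerror(). The names check_pthread(), noop() and start_and_join() are made up for illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: report a failed pthread call.
   pthread functions return 0 on success or an error number on failure.
   The error number is handed back unchanged so callers can act on it. */
int check_pthread(int rc, const char *what)
{
    if (rc != 0)
        fprintf(stderr, "%s failed: %s\n", what, strerror(rc));
    return rc;
}

/* Trivial thread body used only to have something to create and join. */
static void *noop(void *arg)
{
    return arg;
}

/* Create one thread and join it, checking both calls.
   Returns 0 on success or the first error number encountered. */
int start_and_join(void)
{
    pthread_t t;
    int rc;

    rc = check_pthread(pthread_create(&t, NULL, noop, NULL), "pthread_create");
    if (rc != 0)
        return rc;
    return check_pthread(pthread_join(t, NULL), "pthread_join");
}
```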

> void* routine() {
>
> int execute = 1 ;
> int ii = 0 ;
>
> pthread_cleanup_push(exit_routine, (void*)NULL) ;
> while ( 1 ) {
> fprintf(stderr, "> %d\n", ii++ ) ;
> usleep(100000);
> }
> pthread_cleanup_pop(execute) ;
> }

routine() is supposed to return a void * but it doesn't return
anything. I'd suggest adding a statement like "return NULL".

Apparently Darwin doesn't treat fprintf() or usleep() as cancellation
points; FreeBSD 4.x behaves the same way. A workaround might be
to add a call to pthread_testcancel() in the while-loop.
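
A sketch of that workaround, under a few assumptions: the thread function and cleanup handler are given the signatures that pthread_create() and pthread_cleanup_push() expect, and the names run_cancel_demo() and cleanup_ran are invented here so the result can be inspected. The only change that matters for cancellation is the pthread_testcancel() call in the loop.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Set by the cleanup handler. It is read only after pthread_join(),
   which orders the access, so no extra locking is used here. */
static int cleanup_ran = 0;

static void exit_routine(void *arg)
{
    (void)arg;
    cleanup_ran = 1;
    fprintf(stderr, "Routine successfully exited.\n");
}

static void *routine(void *arg)
{
    int ii = 0;

    (void)arg;
    pthread_cleanup_push(exit_routine, NULL);
    while (1) {
        fprintf(stderr, "> %d\n", ii++);
        usleep(100000);
        pthread_testcancel();   /* explicit cancellation point */
    }
    pthread_cleanup_pop(1);     /* never reached; pairs with the push */
    return NULL;
}

/* Hypothetical driver: returns 1 if the cleanup handler ran,
   0 if it did not, and -1 if any pthread call failed. */
int run_cancel_demo(void)
{
    pthread_t thread;

    cleanup_ran = 0;
    if (pthread_create(&thread, NULL, routine, NULL) != 0)
        return -1;
    usleep(500000);
    if (pthread_cancel(thread) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;
    return cleanup_ran;
}
```

With deferred cancellation (the default), the pending cancel is acted on no later than the next pthread_testcancel() call, whether or not the platform treats usleep() or fprintf() as cancellation points.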

> void exit_routine() {
> fprintf(stderr, "Routine successfully exited.\n");
> }
> /********************************/

According to the specification at www.opengroup.org[1], usleep()
is required to be a cancellation point, so it would appear that
Darwin isn't compliant in this respect; you may wish to report this
as an implementation bug. Note that usleep() is marked as
"obsolescent," so it might be preferable to use nanosleep() instead
if your platform has it. Nanosleep() is also required to be
a cancellation point, and it does appear to be one on FreeBSD 4.x.
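
For illustration, a usleep() replacement built on nanosleep() might look like the sketch below. The name sleep_usec() is made up; the retry on EINTR is a design choice of this sketch rather than something the specification requires.

```c
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for `usec` microseconds using nanosleep().
   Returns 0 on success, or -1 with errno set if nanosleep() fails for
   a reason other than being interrupted by a signal. */
int sleep_usec(long usec)
{
    struct timespec req, rem;

    req.tv_sec = usec / 1000000L;
    req.tv_nsec = (usec % 1000000L) * 1000L;

    /* If a signal interrupts the sleep, resume with the time remaining. */
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;
    }
    return 0;
}
```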

According to the specification, fprintf() may be a cancellation
point but it's not required to be one. Although one might think
that calling fprintf() eventually implies a call to write(), which
is a required cancellation point, David Butenhof has pointed out[2]
that library functions such as fprintf() are permitted to disable
cancellation while calling other functions that are required
cancellation points.
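
To make the idea concrete, here is a sketch of how a library routine could disable cancellation around calls that would otherwise be cancellation points. This is not the code of any real stdio implementation; log_line_uncancellable() is a made-up name, and only pthread_setcancelstate() is a real interface.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical wrapper: print one line with cancellation disabled, so a
   pending cancel cannot be acted on inside fprintf() or any write() it
   performs. Returns 0 on success, -1 on failure. */
int log_line_uncancellable(const char *msg)
{
    int oldstate, ignored, rc;

    if (pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate) != 0)
        return -1;

    rc = fprintf(stderr, "%s\n", msg);

    /* Restore whatever state the caller had; a cancel that arrived in
       the meantime stays pending until the next cancellation point. */
    if (pthread_setcancelstate(oldstate, &ignored) != 0)
        return -1;

    return rc < 0 ? -1 : 0;
}
```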

[1] http://www.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_09.html
[2] http://groups.google.com/groups?selm=3577CD0E.42DEFF02%40zko.dec.com

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

David Butenhof

Jan 29, 2003, 6:42:33 AM

Keith wrote:

I don't see any programming errors, except you've assumed that Darwin
implements POSIX threads. Unfortunately, it does not.

I haven't really kept track of the Darwin versions underlying various Mac OS
X releases, but I know that for Mac OS X 10.2, substantial effort was
expended in improving the POSIX thread implementation in order to "mostly
conform" to the standard, whereas the previous version had been wildly out
of conformance. I presume that you HAVE the equivalent Darwin code, since I
believe earlier versions didn't even have pthread_cancel().

Cancellation remains extremely limited, though, apparently due to concerns
about breaking existing code that depended on the previous less conforming
implementation. (While I can see their point, there ARE ways to deal with
that discontinuity without preventing the creation of conforming
applications, and the developers had already allowed for this.)

As Michael Fuhr has suggested in another response, a solution in your test
case would be to add a pthread_testcancel() call in your loop. That may or
may not apply to your real code. And, by all means, file a bug report
against Darwin. This may be a deliberate bug instead of an accidental bug,
but it's still a bug... so "bug 'em" about it. ;-)

--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/

David Butenhof

Jan 29, 2003, 7:03:20 AM

Michael Fuhr wrote:

> vet...@lincom-asg.com (Keith) writes:
>
>> Here is a little program that doesn't work on the Darwin machine. It
>> will not call the "exit_routine". The program functions properly on
>> Sun, IRIX and Linux. I'd appreciate any help. Thanks.
>
> I'm no thread expert but I'll give it a shot. Somebody please
> correct my comments if I'm mistaken about anything.

"I'm not a thread expert, but I play one on TV". No, no, wait, that's
backwards. I AM a thread expert, but, alas, nobody's ever asked me to play
one on TV. Oh well.

I'll just add in a little commentary 'cause, well, that's what I do... but
you did well.

>> int main() {
>>
>> pthread_create( &thread, NULL, routine, (void*) NULL ) ;
>> usleep( 500000 );
>> pthread_cancel( thread ) ;
>> pthread_join( thread, (void*) NULL ) ;
>> }
>
> main() is supposed to return an integer but it doesn't return
> anything. I'd suggest adding a statement like "return EXIT_SUCCESS"
> or "return 0".

It's likely that he's implicitly returning the result of pthread_join(),
(since that's what's likely sitting in the "return value bucket"), which is
probably 0. But then, as you say, there's absolutely no guarantee.

> Your code never checks functions' return values to see if they were
> successful or not. Perhaps that's just to simplify the example;
> in the real code it would be a good idea to check the return value
> of any function upon whose success your program depends.

Absolutely -- that's always good advice.

>> void* routine() {
>>
>> int execute = 1 ;
>> int ii = 0 ;
>>
>> pthread_cleanup_push(exit_routine, (void*)NULL) ;
>> while ( 1 ) {
>> fprintf(stderr, "> %d\n", ii++ ) ;
>> usleep(100000);
>> }
>> pthread_cleanup_pop(execute) ;
>> }
>
> routine() is supposed to return a void * but it doesn't return
> anything. I'd suggest adding a statement like "return NULL".

Since the thread return value argument of pthread_join() is NULL, the
thread's final return is thrown away and makes no difference. I tend to
"toss a NULL into the void" (as it were) just for the heck of it, but
there's really no point.

> Apparently Darwin doesn't treat fprintf() or usleep() as cancellation
> points; FreeBSD 4.x behaves the same way. A workaround might be
> to add a call to pthread_testcancel() in the while-loop.

Yes indeed!

> According to the specification at www.opengroup.org[1], usleep()
> is required to be a cancellation point, so it would appear that
> Darwin isn't compliant in this respect; you may wish to report this
> as an implementation bug. Note that usleep() is marked as
> "obsolescent," so it might be preferable to use nanosleep() instead
> if your platform has it. Nanosleep() is also required to be
> a cancellation point, and it does appear to be one on FreeBSD 4.x.

This is a deliberate design decision in Mac OS X 10.2 out of foolish binary
compatibility concerns. (That is, there probably are broken programs out
there that depended on the lack of system cancellation, but that's no
excuse to prevent proper programs from using it.)

> According to the specification, fprintf() may be a cancellation
> point but it's not required to be one. Although one might think
> that calling fprintf() eventually implies a call to write(), which
> is a required cancellation point, David Butenhof has pointed out[2]
> that library functions such as fprintf() are permitted to disable
> cancellation while calling other functions that are required
> cancellation points.

OK, here's the commentary that's my only real excuse for this reply...

In practice, that's exactly what happens most often. The existing stdio code
isn't structured in a way that makes cancellation cleanup straightforward,
and it's easier to disable it entirely on the syscalls. But that's NOT the
(explicit) justification for making stdio functions "optional" cancellation
points in the standard.

Remember that the point of stdio is to buffer the I/O stream, cutting down
on the number of relatively expensive kernel I/O calls. Therefore, each
fprintf() call does not necessarily call write(). The original point of the
"may be a cancellation point" was really that the working group didn't want
to require that fprintf() test for cancellation when it DIDN'T call
write(). But it did make an awfully convenient excuse for "lazy" developers
to avoid the whole issue by disabling cancellation.

There are often other reasons. In Tru64 UNIX, for example, where
cancellation is an exception, handling cancel requires linking in the
exception support library, libexc. When libexc is included, the loader
can't ignore all the (often large) procedure descriptor information in the
file, which can affect activation time, libc initialization overhead, etc.
Dealing with exceptions inside libc would mean making ALL programs use
exceptions and carry around the extra weight. So libc (including all of
stdio) doesn't handle (or allow) cancellation in any routine large or
sufficiently complicated to require any form of cleanup.

This is one of those strategic decisions (like Apple's decision not to let
the developers do cancellation right) that makes sense at some time to
someone in a position to make the decision -- but often seems pretty stupid
to everyone else afterwards. Which is one major reason why I love to be the
one giving sage technical advice, and would hate to be the one actually
making the decision. ;-)

Michael Fuhr

Jan 29, 2003, 11:40:26 AM

David Butenhof <David.B...@hp.com> writes:

> Michael Fuhr wrote:
>
> > vet...@lincom-asg.com (Keith) writes:
> >
> >> Here is a little program that doesn't work on the Darwin machine. It
> >> will not call the "exit_routine". The program functions properly on
> >> Sun, IRIX and Linux. I'd appreciate any help. Thanks.
> >
> > I'm no thread expert but I'll give it a shot. Somebody please
> > correct my comments if I'm mistaken about anything.
>
> "I'm not a thread expert, but I play one on TV". No, no, wait, that's
> backwards. I AM a thread expert, but, alas, nobody's ever asked me to play
> one on TV. Oh well.
>
> I'll just add in a little commentary 'cause, well, that's what I do... but
> you did well.

Thanks.

> >> int main() {
> >>
> >> pthread_create( &thread, NULL, routine, (void*) NULL ) ;
> >> usleep( 500000 );
> >> pthread_cancel( thread ) ;
> >> pthread_join( thread, (void*) NULL ) ;
> >> }
> >
> > main() is supposed to return an integer but it doesn't return
> > anything. I'd suggest adding a statement like "return EXIT_SUCCESS"
> > or "return 0".
>
> It's likely that he's implicitly returning the result of pthread_join(),
> (since that's what's likely sitting in the "return value bucket"), which is
> probably 0. But then, as you say, there's absolutely no guarantee.

Sounds risky. I prefer to be explicit about the return value if
the function is defined to have one. Besides, returning something
shuts up compiler warnings such as "control reaches end of non-void
function" -- warnings that I almost always enable because they help
catch programming oversights.

[snip]

> >> void* routine() {
> >>
> >> int execute = 1 ;
> >> int ii = 0 ;
> >>
> >> pthread_cleanup_push(exit_routine, (void*)NULL) ;
> >> while ( 1 ) {
> >> fprintf(stderr, "> %d\n", ii++ ) ;
> >> usleep(100000);
> >> }
> >> pthread_cleanup_pop(execute) ;
> >> }
> >
> > routine() is supposed to return a void * but it doesn't return
> > anything. I'd suggest adding a statement like "return NULL".
>
> Since the thread return value argument of pthread_join() is NULL, the
> thread's final return is thrown away and makes no difference. I tend to
> "toss a NULL into the void" (as it were) just for the heck of it, but
> there's really no point.

I return something as a defensive programming practice: the caller
can discard the return value if it wants, but the called function
shouldn't depend on that since the caller could change its mind
(i.e., the programmer could change the calling code). I view it
as making the called function keep up its end of the contract,
regardless of what the caller decides to do. And again, returning
something shuts up compiler warnings.
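
As a sketch of what a caller sees if it does decide to look at the thread's result: pthread_join() stores the value the thread returned, or PTHREAD_CANCELED if the thread was canceled. The function names below are invented for the example.

```c
#include <pthread.h>
#include <unistd.h>

/* Loops until canceled; pthread_testcancel() guarantees a cancellation
   point even where usleep() is not one. */
static void *spin(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_testcancel();
        usleep(10000);
    }
    return NULL;   /* never reached */
}

/* Returns immediately, keeping its end of the contract. */
static void *finish(void *arg)
{
    return arg;
}

/* Hypothetical driver: returns 1 if the joined thread reports
   PTHREAD_CANCELED, 0 if it returned normally, -1 on error. */
int joined_as_canceled(int cancel_it)
{
    pthread_t t;
    void *result = NULL;

    if (pthread_create(&t, NULL, cancel_it ? spin : finish, NULL) != 0)
        return -1;
    if (cancel_it && pthread_cancel(t) != 0)
        return -1;
    if (pthread_join(t, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED;
}
```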

Alexander Terekhov

Jan 29, 2003, 12:17:53 PM

Michael Fuhr wrote:
[...]

> > >> int main() {
> > >>
> > >> pthread_create( &thread, NULL, routine, (void*) NULL ) ;
> > >> usleep( 500000 );
> > >> pthread_cancel( thread ) ;
> > >> pthread_join( thread, (void*) NULL ) ;
> > >> }
> > >
> > > main() is supposed to return an integer but it doesn't return
> > > anything. I'd suggest adding a statement like "return EXIT_SUCCESS"
> > > or "return 0".
> >
> > It's likely that he's implicitly returning the result of pthread_join(),
> > (since that's what's likely sitting in the "return value bucket"), which is
> > probably 0. But then, as you say, there's absolutely no guarantee.
>
> Sounds risky. I prefer to be explicit about the return value if
> the function is defined to have one. Besides, returning something
> shuts up compiler warnings such as "control reaches end of non-void
> function" -- warnings that I almost always enable because they help
> catch programming oversights.

Well,

C99:

"reaching the } that terminates the main function returns a
value of 0."

C++:

"If control reaches the end of main without encountering a
return statement, the effect is that of executing return 0;"

regards,
alexander.

David Butenhof

Jan 29, 2003, 1:14:54 PM

Alexander Terekhov wrote:

> C99:
>
> "reaching the } that terminates the main function returns a
> value of 0."

Interesting. I missed that. Well, it's consistent with the POSIX "passive
exit" when the last thread terminates.

James Dennett

Jan 29, 2003, 1:40:28 PM

David Butenhof wrote:
> Michael Fuhr wrote:
>
>
>>vet...@lincom-asg.com (Keith) writes:
>>
>>
>>>Here is a little program that doesn't work on the Darwin machine. It
>>>will not call the "exit_routine". The program functions properly on
>>>Sun, IRIX and Linux. I'd appreciate any help. Thanks.
>>
>>I'm no thread expert but I'll give it a shot. Somebody please
>>correct my comments if I'm mistaken about anything.
>
>
> "I'm not a thread expert, but I play one on TV". No, no, wait, that's
> backwards. I AM a thread expert, but, alas, nobody's ever asked me to play
> one on TV. Oh well.
>
> I'll just add in a little commentary 'cause, well, that's what I do... but
> you did well.
>
>
>>>int main() {
>>>
>>> pthread_create( &thread, NULL, routine, (void*) NULL ) ;
>>> usleep( 500000 );
>>> pthread_cancel( thread ) ;
>>> pthread_join( thread, (void*) NULL ) ;
>>>}
>>
>>main() is supposed to return an integer but it doesn't return
>>anything. I'd suggest adding a statement like "return EXIT_SUCCESS"
>>or "return 0".
>
>
> It's likely that he's implicitly returning the result of pthread_join(),
> (since that's what's likely sitting in the "return value bucket"), which is
> probably 0. But then, as you say, there's absolutely no guarantee.

I'm not a threads expert, but I do know something about
the C and C++ standards, and they both provide a guarantee
that falling off the end of main is equivalent to returning 0.

Back to the real world: C89/C90 did not guarantee this,
and so many current compilers don't. In any case, many
would say that it's good form to be explicit even though
the standards say you don't need to. And it avoids the
endless discussion in newsgroups:
"You fell off the end of main, that's undefined badness"
"No, it's well-defined by 9899:1999 and 14882:1998"
"Eh? It's surely not. Anyway, I don't trust that, or
my compiler doesn't play that way."
"OK, go ahead and write return 0; anyway, it doesn't hurt"
"Stupid picky language lawyer go home."

Not really about the threading question, but I don't like
to leave incorrect statements unchallenged.

-- James.

Alexander Terekhov

Jan 29, 2003, 2:07:45 PM

David Butenhof wrote:
>
> Alexander Terekhov wrote:
>
> > C99:
> >
> > "reaching the } that terminates the main function returns a
> > value of 0."
>
> Interesting. I missed that. Well, it's consistent with the POSIX "passive
> exit" when the last thread terminates.

I wish they had defined it as "reaching the } that terminates
the main function shall terminate the initial thread with the
effect of executing pthread_exit(0)". ;-)

regards,
alexander.

Michael Fuhr

Jan 29, 2003, 2:36:50 PM

Alexander Terekhov <tere...@web.de> writes:

Another message in this thread pointed out that earlier standards
didn't make such guarantees, so not all implementations currently
in use do. In any case, I'm on the "it's good form to have an
explicit return statement if the function is defined to return a
value" side of the debate, but then I'm the kind of guy who checks
the return value of close() and looks both ways when crossing a
one-way street.

But that's just my opinion. I could be wrong.

Alexander Terekhov

Jan 29, 2003, 3:34:00 PM

Michael Fuhr wrote:
[...]

> Another message in this thread pointed out that earlier standards
> didn't make such guarantees, so not all implementations currently
> in use do. In any case, I'm on the "it's good form to have an
> explicit return statement if the function is defined to return a
> value" side of the debate, but then I'm the kind of guy who checks
> the return value of close() and looks both ways when crossing a
> one-way street.

Well, I'd suggest that you look "^" as well. ;-)

http://groups.google.com/groups?selm=3M%2BzSQlY9149Ew7l%40romana.davros.org
(Subject: Re: [JW] implicit "return 0;" guaranteed for recursive calls to main()?)

regards,
alexander.

--
"The function main shall not be used (3.2) within a program." --14882:1998

David Butenhof

Jan 31, 2003, 8:13:57 AM

Alexander Terekhov wrote:

That could be a problem, since POSIX lacks Sun's "daemon thread" attribute.
That is, your change would rely on passive process exit rather than forcing
immediate exit, and some applications would simply cease to exit at all.

At least, for nonthreaded programs, it'd have the same effect as the C99
rule: the main thread is the only one, and its termination results in
passive process termination with the status 0.
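
A rough sketch of passive exit, assuming a POSIX system with fork() and pipes. A child process ends its initial thread with pthread_exit() while a worker thread is still running. The worker finishes later, the process then exits passively, and the parent observes the exit status. The names late_worker() and passive_exit_status() are hypothetical.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int pipe_fd[2];

/* Outlives the child's initial thread, then proves it ran by writing
   one byte to the pipe before returning. */
static void *late_worker(void *arg)
{
    (void)arg;
    usleep(100000);
    if (write(pipe_fd[1], "x", 1) != 1)
        return NULL;
    return NULL;
}

/* Hypothetical driver: returns the child's exit status if the worker
   still ran after the initial thread terminated, or -1 otherwise. */
int passive_exit_status(void)
{
    pid_t pid;
    int status;
    char byte = 0;

    if (pipe(pipe_fd) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        pthread_t t;

        close(pipe_fd[0]);
        if (pthread_create(&t, NULL, late_worker, NULL) != 0)
            _exit(99);
        pthread_exit(NULL);   /* ends only the initial thread */
    }

    close(pipe_fd[1]);
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        close(pipe_fd[0]);
        return -1;
    }
    if (read(pipe_fd[0], &byte, 1) != 1 || byte != 'x') {
        close(pipe_fd[0]);
        return -1;
    }
    close(pipe_fd[0]);
    return WEXITSTATUS(status);
}
```

Had the child returned from main() instead, the whole process would have exited at once and the worker would never have written its byte. A worker that never terminates is exactly the case where such a process would never exit.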