
pthread_atfork() and double-forking

Nathan J. Williams

Feb 20, 2003, 2:15:51 PM

The SUSv3 rules for fork() in a multi-threaded program say that the
child process may only perform async-signal-safe operations until an
exec*() call is made. As has been pointed out before, this makes
pthread_atfork() less useful than it might appear to be, because the
pthread_*() calls a library or application could want to make in the
'child' handlers are not async-signal-safe.

However, fork() itself is listed as async-signal-safe, so an
application author can conclude that calling fork() again from the
child is OK. There's nothing I've found that implies that
pthread_atfork() handlers are cleared in the child process, so it
seems like the required behavior is that the 'prepare' and 'parent'
handlers are executed again in the first child, and the 'child'
handlers are executed in the second child. This seems like a problem,
since the intended use of the 'prepare' and 'parent' handlers is to
acquire and release resources for the child, using the pthread_*()
calls that aren't safe at this point.
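
The sequence described above can be traced with a small C sketch. It is an illustration only, not part of the original post: the counters, struct counts, and the helper double_fork_trace() are hypothetical names, and it assumes an implementation that keeps atfork handlers registered in the child, which is what the reasoning above concludes the standard requires. The handlers only increment counters, so they stay within the async-signal-safe restriction.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Counters bumped by the atfork handlers; each process has its own copy. */
static volatile int n_prepare, n_parent, n_child;

static void on_prepare(void) { n_prepare++; }
static void on_parent(void)  { n_parent++; }
static void on_child(void)   { n_child++; }

struct counts { int prepare, parent, child; };

/* Snapshot the calling process's counters and write them to fd.
 * Uses only write(), so it is usable between fork() and _exit(). */
static int report(int fd)
{
    struct counts c;
    c.prepare = n_prepare;
    c.parent = n_parent;
    c.child = n_child;
    return write(fd, &c, sizeof c) == (ssize_t)sizeof c ? 0 : 1;
}

/* Hypothetical helper: registers the handlers, forks twice, and reports the
 * counters seen in the first child (after its own fork) and in the
 * grandchild. Call it once per process, since every call registers the
 * handlers again. Returns 0 on success, -1 on error. */
int double_fork_trace(struct counts *first_child, struct counts *grandchild)
{
    int fds[2];
    int status;
    int ok;
    pid_t pid;

    assert(first_child != NULL && grandchild != NULL);
    if (pipe(fds) != 0)
        return -1;
    if (pthread_atfork(on_prepare, on_parent, on_child) != 0)
        return -1;

    pid = fork();                  /* first fork: prepare, then parent/child */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* First child: only async-signal-safe calls from here on. */
        pid_t pid2 = fork();       /* prepare and parent run again, here */
        if (pid2 < 0)
            _exit(1);
        if (pid2 == 0)
            _exit(report(fds[1])); /* grandchild: child handler ran again */
        waitpid(pid2, NULL, 0);    /* grandchild's record is written first */
        _exit(report(fds[1]));
    }

    close(fds[1]);
    ok = read(fds[0], grandchild, sizeof *grandchild) == (ssize_t)sizeof *grandchild
      && read(fds[0], first_child, sizeof *first_child) == (ssize_t)sizeof *first_child;
    close(fds[0]);
    if (waitpid(pid, &status, 0) != pid || !ok)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Under that assumption the first child reports prepare=2, parent=1, child=1 and the grandchild reports prepare=2, parent=0, child=2: the 'prepare' and 'parent' handlers did run a second time, inside the first child.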

An application author can be aware of this and avoid double-forking if
they use pthread_atfork(). But what can they do if a library is
multi-threaded, unknown to them? What can a library author do to avoid
this double-forking problem if they want to use pthread_atfork() to
protect library state? Is this just another indication that
pthread_atfork() is half-baked?

- Nathan

David Butenhof

Feb 21, 2003, 10:01:05 AM
Nathan J. Williams wrote:

Hmm. Interesting point. There's always been a problem in the definition of
fork() and atfork handlers. The mechanism was invented as a pragmatic hack
by OSF for the DCE project, and it was simply a "best effort" sort of tool;
if you're really careful, maybe this'll work, if not, well we tried.
Arguably it never really belonged in a standard, but, pragmatically, people
still have the same needs and constraints. We can't do any better, but we
were (marginally) convinced that trying to allow for something like this
was better than ignoring the problem.

We missed a lot of details, though, including the fact that fork is
async-signal safe whereas very little one might really want to do in an
atfork prepare handler is async-signal safe.

SUSv3 TC1 (that is, the first technical corrigendum, or corrections, to XSH6,
aka Single UNIX Specification, Version 3, aka POSIX 1003.1-2001) addresses
this by specifying that implementations may ignore atfork handlers when
fork() is called from a signal handler.

This is a small provision, but at least it enables a solution.

However, as you point out, the solution isn't complete. Well, heck, we
always knew it wasn't complete, but even the "mop up" completely missed
this "shadow area" where we're also technically restricted to async-signal
safe functions but NOT inside a signal handler.

Making use of the TC1 allowance has always seemed awkward to me, since
there's really no concept of "signal state" in UNIX. If one were to add
some new process state that would be set between delivery of a signal and
sigreturn (or equivalent), then that same state could be set between fork
and exec in the child. Technically, however, that's not allowed by the
current standard.

Perhaps we should broaden the allowance from "in a signal handler" to "when
called from a region where use of non-async-signal safe code is
restricted". (I wish I could decide whether I'm joking...)

Removing the atfork handlers in the child, or precluding "double fork",
isn't really a solution. DESPITE the "async-safe to exec" provision (which
was really just something we missed when adding threads to POSIX), the
intent of atfork is that the cloned child could continue running the same
program and using threads. That means it must also be able to fork.

Some of us always knew that this idea could never really work reliably, but
on balance it got into the standard anyway. Only later did I notice the
restriction about what can be done between fork and exec. I actually tried
to get it changed when I found it, but as we discussed the issue in the
POSIX interpretations committee, we came around to the realization that
this was very nearly the provision that ought to be there. You're in
unknown and uncharted territory, and you're pretty much on your own. We're
rooting for you, and we hope you make it out safely... but if you fall into
a pit, well, we DID post "no hiking" signs at the entrance, and nobody's
going to come looking for you. ;-)

What all this means in practice is that no "strictly conforming" POSIX
application can do anything other than async-signal safe operations in an
atfork handler. Even the TC1 modification merely allows a more loosely
"conforming" application to code PARENT and PREPARE handlers that use
non-async-signal safe functions; but as you've pointed out the CHILD handler is
still restricted to async-signal safe functions. Which, pragmatically,
places the same restriction on the PARENT and PREPARE handlers since the
main purpose of the CHILD handler is to undo what the PREPARE handler did
to cloned resources like mutexes. (What it can't UNdo can't be done in the
first place.)
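
The PREPARE/PARENT/CHILD pairing referred to above usually looks something like the following C sketch. It is an illustration only; lib_lock, lib_init() and lib_fork_check() are hypothetical names. Note that the CHILD handler calls pthread_mutex_unlock(), which is not async-signal-safe, so this common pattern is exactly what a strictly conforming application cannot rely on.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock protecting some library-internal state. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;

/* PREPARE: take the lock so the state is consistent when it is cloned. */
static void lib_prepare(void)
{
    int rc = pthread_mutex_lock(&lib_lock);
    assert(rc == 0);
    (void)rc;
}

/* PARENT: release the lock taken by the prepare handler. */
static void lib_parent(void) { pthread_mutex_unlock(&lib_lock); }

/* CHILD: undo the prepare handler in the clone.
 * pthread_mutex_unlock() is NOT async-signal-safe, which is the problem
 * under discussion; this works on many systems but is not guaranteed. */
static void lib_child(void) { pthread_mutex_unlock(&lib_lock); }

/* Register the three handlers; returns 0 on success. */
int lib_init(void)
{
    return pthread_atfork(lib_prepare, lib_parent, lib_child);
}

/* Fork once and check, in the child, that the lock is available again.
 * Returns 0 if the child could take the lock, 1 if not, -1 on error. */
int lib_fork_check(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(pthread_mutex_trylock(&lib_lock) == 0 ? 0 : 1);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Whatever the PREPARE handler does has to be reversible by the CHILD handler, which is why the restriction on the one effectively carries over to the other.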

Really, the solution to all this is posix_spawn(). That is, there's no fork
or exec -- they're combined into an atomic operation that creates a new
process running a new program. If you want another parallel entity in the
same program, you create a thread, not a process. If you want a new
program, you spawn.
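
A minimal C sketch of the spawn approach, assuming posix_spawnp() is available on the target system. The helper name spawn_and_wait() is hypothetical, and the new program simply inherits the caller's environment through environ.

```c
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: start argv[0] (searched for on PATH) as a new
 * process running a new program, with no separate fork()/exec() step in
 * the caller, then wait for it. Returns the program's exit status, or -1
 * if it could not be started or did not exit normally. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    /* NULL file actions and attributes: use the defaults. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, on a system with sh on the PATH, calling spawn_and_wait() with the argument vector { "sh", "-c", "exit 7", NULL } should return 7.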

Unfortunately, posix_spawn is complicated and relatively new, and not widely
available. As it's an option in the standard, I don't know that it will
ever be widely available, much less widely used.

--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/

Patrick TJ McPhee

Feb 21, 2003, 3:26:40 PM
In article <Sar5a.124$BI5...@news.cpqcorp.net>,
David Butenhof <David.B...@hp.com> wrote:

% You're in
% unknown and uncharted territory, and you're pretty much on your own. We're
% rooting for you, and we hope you make it out safely...

What would one have to do to get _that_ into the standard?

--

Patrick TJ McPhee
East York Canada
pt...@interlog.com

Gerhard Wesp

Feb 24, 2003, 4:24:14 AM
David Butenhof <David.B...@hp.com> wrote:
> SUSv3 TC1 (that is, the first technical corrigenda, or corrections, to XSH6,
> aka Single UNIX Specification, Version 3, aka POSIX 1003.1-2001) addresses

These a.k.a.'s are a little confusing, especially for newbies like me :)

Are there any plans in the standards committees to try and unify the
nomenclature?

-Gerhard
--
| voice: +43 (0)676 6253725 *** web: http://www.cosy.sbg.ac.at/~gwesp/
|
| Passts auf, seid's vuasichdig, und lossds eich nix gfoin!
| -- Dr. Kurt Ostbahn

Alexander Terekhov

Feb 24, 2003, 4:43:53 AM

Gerhard Wesp wrote:
>
> David Butenhof <David.B...@hp.com> wrote:
> > SUSv3 TC1 (that is, the first technical corrigenda, or corrections, to XSH6,
> > aka Single UNIX Specification, Version 3, aka POSIX 1003.1-2001) addresses
>
> These a.k.a.'s are a little confusing, especially for newbies like me :)
>
> Are there any plans in the standards committees to try and unify the
> nomenclature?

AFAIK, SUS is nothing but a "profile" [and "branding"] thing.

http://www.opengroup.org/austin/faq.html
http://www.unix.org/version3/iso_std.html

regards,
alexander.

David Butenhof

Feb 24, 2003, 9:29:16 AM
Patrick TJ McPhee wrote:

> In article <Sar5a.124$BI5...@news.cpqcorp.net>,
> David Butenhof <David.B...@hp.com> wrote:
>
> % You're in
> % unknown and uncharted territory, and you're pretty much on your own. We're
> % rooting for you, and we hope you make it out safely...
>
> What would one have to do to get _that_ into the standard?

You mean, that phrase, which you quoted from my posting? That is, an
explicit warning that all this stuff is a major mess? The complication is
that POSIX covers a vast range from embedded realtime systems up. The
problem here exists only for complicated modular programming environments,
where state is held across a set of uncoordinated and semi-independent
facilities. Any implication that it also might unavoidably exist in simple
monolithic systems, or even in carefully coordinated complicated systems,
would be unacceptable.

David Butenhof

Feb 24, 2003, 10:17:03 AM
Alexander Terekhov wrote:

> Gerhard Wesp wrote:
>>
>> David Butenhof <David.B...@hp.com> wrote:
>> > SUSv3 TC1 (that is, the first technical corrigenda, or corrections, to
>> > XSH6, aka Single UNIX Specification, Version 3, aka POSIX 1003.1-2001)
>> > addresses
>>
>> These a.k.a.'s are a little confusing, especially for newbies like me :)
>>
>> Are there any plans in the standards committees to try and unify the
>> nomenclature?
>
> AFAIK, SUS is nothing but a "profile" [and "branding"] thing.

UNIX 98 is a BRAND for implementations verifiably conforming to the SUSv2
specification.

Many of the overlapping names are a matter of history and politics, but they
also signify the "approval stamp" of various segments of the industry.

POSIX was developed to standardize the UNIX interface -- but omitting all but
the basics from the various UNIX lineages. POSIX became an international
standard when the POSIX 1003.1-1990 specification was accepted and
published by the ISO/IEC international standards body, but using their own
naming: ISO/IEC 9945-1:1990. So we've got 2 names on that side of the
family.

Meanwhile, The Open Group (at the time known as X/Open, essentially meaning
"Open UNIX") built the XPG (X/Open Portability Guide) as something of a
profile on top of POSIX, but also to bring in additional divergent (and
sometimes redundant) historical functions from the System V and BSD lines.
The UNIX trademark went looking for a home in the early 90s, and ended up
with The Open Group. They tried to exploit it by unifying the divergent
lines, in the end identifying 1170 common interfaces, which became
"SPEC1170", the basis for the 4th edition of the X/Open Portability Guide
(XPG4). This was reinforced with a test system for the "UNIX 93" brand,
which authorized use of the "UNIX" trademark.

This was expanded to become XPG4.2, which was also called the Single UNIX
Specification (SUS) and designated by the UNIX 95 brand.

Meanwhile, POSIX developed realtime and threads (the 1003.1b and 1003.1c
amendments to 1003.1-1990), which were combined in the 1003.1-1996 issue of
POSIX. The Open Group responded with XPG5, bringing in both sets of
interfaces as well as a set of useful extensions. This of course became
SUSv2, and was designated by the UNIX 98 brand.

Then, in 2001, POSIX 1003.1-1996 came up for its 5 year renewal. POSIX and
The Open Group worked out a landmark deal to finally combine the two
separate documents in a single source file. Thus, POSIX 1003.1-2001 and
ISO/IEC 9945-1:2002 get added to the list of aliases, along with XPG6, and
the corresponding UNIX 03 brand.

Yes, there are a lot of names. Dropping any of them could be politically
contentious and also potentially confusing as various groups of people are
used to following one or more of these successions of specifications.

The distinction between the brand and the specifications is important.
Anyone can reference the specification, but the brand proves validated
conformance. The ISO name is important because it's part of the
international formal standards community. POSIX is important because it's a
long-standing pillar of the UNIX market. SUS remains something beyond basic
POSIX (many sections of the common document are still shaded as applying
only to SUS conformance, not strict POSIX conformance). The XPG name takes
a much more secondary significance now, and can mostly be ignored.

Patrick TJ McPhee

Feb 24, 2003, 4:57:58 PM
In article <0%p6a.249$0N1...@news.cpqcorp.net>,
David Butenhof <David.B...@hp.com> wrote:
% Patrick TJ McPhee wrote:
%
% > In article <Sar5a.124$BI5...@news.cpqcorp.net>,
% > David Butenhof <David.B...@hp.com> wrote:
% >
% > % You're in
% > % unknown and uncharted territory, and you're pretty much on your own. We're
% > % rooting for you, and we hope you make it out safely...
% >
% > What would one have to do to get _that_ into the standard?
%
% You mean, that phrase, which you quoted from my posting?

I meant those precise words, in particular the second sentence.
Perhaps something in the rationale:

Inclusion of the phrase `We're rooting for you, and we hope you make it
out safely' was considered by the standard developers. The intent of
the standard is to allow application developers to create correct, portable
applications, while the phrase primarily provides amusement to one
such developer. It was decided the developer should watch _This Hour
has 22 Minutes_ and read comic novels.
