
Solaris stat(2) not POSIX compliant?


Paul D. Smith

Jan 9, 2003, 10:17:48 PM
Hi all;

I'm having a serious problem on Solaris (tried on both 2.7 and 8) with
stat(2), SIGCHLD signal handlers, and the SA_RESTART flag.

The short of it is this: I have a program (GNU make, actually) which is
doing a lot of stat(2) system calls.

I also have installed a signal handler for SIGCHLD, and I've set the
SA_RESTART flag, something like this:

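struct sigaction sa;
sa.sa_handler = my_handler;
sigemptyset(&sa.sa_mask);   /* block no extra signals while the handler runs */
sa.sa_flags = SA_RESTART;   /* ask for interrupted system calls to be restarted */
sigaction(SIGCHLD, &sa, NULL);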

(of course I check error codes, etc.)

The problem is that if I'm running on an NFS-mounted filesystem, then
even with the SA_RESTART flag set, the stat(2) system call still fails
with EINTR. This is wreaking havoc on my code! Adding EINTR loops to
all my system calls is not very appetizing at all--that's what
SA_RESTART is supposed to fix for me.


According to POSIX, stat(2) is not allowed to fail with EINTR at all.
Further according to POSIX, the SA_RESTART flag is supposed to restart
all system calls which are interruptible (can fail with EINTR).

Neither of these appears to be true on Solaris: my stat(2) _is_ failing
with EINTR (the Solaris man pages say it can), and SA_RESTART is _not_
causing it to restart (the Solaris sigaction man page doesn't list
stat(2) as a system call that is restarted).

So, is Solaris simply not compliant with the POSIX spec here? Or am I
missing something?


Just as a point of order, this code works perfectly every single time on
Linux... :-/.


PS. By "POSIX" I'm actually looking at the SingleUNIX v2 spec, but I'm
pretty sure that the official POSIX spec has the same statements.

--
-------------------------------------------------------------------------------
Paul D. Smith <psm...@gnu.org> Find some GNU make tips at:
http://www.gnu.org http://make.paulandlesley.org
"Please remain calm...I may be mad, but I am a professional." --Mad Scientist

Paul Eggert

Jan 10, 2003, 1:33:11 AM
"Paul D. Smith" <psm...@gnu.org> writes:

> According to POSIX, stat(2) is not allowed to fail with EINTR at all.

stat(2) can fail with EINTR, since POSIX says that implementations
"may generate errors included in this list under circumstances other
than those described here"
<http://www.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_03.html#tag_02_03>.
There are some exceptions to this rule but stat(2) isn't one of them.


> Further according to POSIX, the SA_RESTART flag is supposed to restart
> all system calls which are interruptible (can fail with EINTR).

Yes, that's true.


> So, is Solaris simply not compliant with the POSIX spec here?

If you're using NFS, Solaris does not conform to POSIX. It never has,
and it probably never will. This is not the only problem area; there
are others (e.g., mmap and advisory locks).


> Just as a point of order, this code works perfectly every single time on
> Linux... :-/.

Score one for Linux. However, Solaris isn't alone: IRIX has the same
problem here, if memory serves. (Not too surprising given the shared
code base.)


> PS. By "POSIX" I'm actually looking at the SingleUNIX v2 spec, but I'm
> pretty sure that the official POSIX spec has the same statements.

That's obsolete now; the current spec is POSIX 1003.1-2001 (SuSv3)
<http://www.unix.org/version3/online.html>.

Rich Teer

Jan 10, 2003, 2:23:47 AM
On 9 Jan 2003, Paul Eggert wrote:

> > PS. By "POSIX" I'm actually looking at the SingleUNIX v2 spec, but I'm
> > pretty sure that the official POSIX spec has the same statements.
>
> That's obsolete now; the current spec is POSIX 1003.1-2001 (SuSv3)
> <http://www.unix.org/version3/online.html>.

True, but current versions of Solaris only claim
compliance with SUSv2.

--
Rich Teer

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.rite-online.net

Casper H.S. Dik

Jan 10, 2003, 4:38:47 AM
"Paul D. Smith" <psm...@gnu.org> writes:

>The problem is that if I'm running on an NFS-mounted filesystem, then
>even with the SA_RESTART flag set, the stat(2) system call still fails
>with EINTR. This is wreaking havoc on my code! Adding EINTR loops to
>all my system calls is not very appetizing at all--that's what
>SA_RESTART is supposed to fix for me.

Have you tried mounting with "nointr"?

This behaviour is there for a reason: all system calls over NFS
are considered blocking system calls (in general, system calls that
do file I/O are considered non-blocking).

It's an extremely useful feature; it allows you to ^C out of a non-responding
NFS server. You really don't want to restart system calls in that case.

Try the "nointr" mount option and see if that fixes your problem.

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

Joerg Schilling

Jan 10, 2003, 5:01:49 AM
In article <3e1e94a7$0$49100$e4fe...@news.xs4all.nl>,

Casper H.S. Dik <Caspe...@Sun.COM> wrote:
>"Paul D. Smith" <psm...@gnu.org> writes:
>
>>The problem is that if I'm running on an NFS-mounted filesystem, then
>>even with the SA_RESTART flag set, the stat(2) system call still fails
>>with EINTR. This is wreaking havoc on my code! Adding EINTR loops to
>>all my system calls is not very appetizing at all--that's what
>>SA_RESTART is supposed to fix for me.
>
>Have you tried mounting with "nointr"?
>
>This behaviour is there for a reason: all system calls over NFS
>are considered blocking system calls (in general, system calls that
>do file I/O are considered non-blocking)

A traditional UNIX decision:

All system calls to fast devices are high-priority calls and are not
interruptible by signals. A filesystem always lived on a disk, which (as
opposed to a TTY) is a fast device. IIRC, the first NFS implementation did not
include the intr mount option. It was added later for users' convenience.

>It's an extremely useful feature; it allows you to ^C out of a non-responding
>NFS server. You really don't want to restart system calls in that case.
>
>Try the "nointr" mount option and see if that fixes your problem.

I suspect that the SA_RESTART flag does not work for NFS because NFS does not
interrupt stat(2) at a point where syscall restarting is done. In any case, if
SA_RESTART would work like the OP requested, then it would undermine the "intr"
NFS mount option.

BTW: I see no reason why GNU make should catch signals except for the purpose
of working around a bug in bash(1), which unfortunately is installed as /bin/sh
on Linux. Bash does not handle process groups correctly, and for this reason
commands started via system(3) cannot be killed correctly via ^C on Linux.
This is a pain in the ass if you want to kill recursive make systems via ^C on
Linux.

--
EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
j...@cs.tu-berlin.de (uni) If you don't have iso-8859-1
schi...@fokus.fhg.de (work) chars I am J"org Schilling
URL: http://www.fokus.fhd.de/usr/schilling ftp://ftp.berlios.de/pub/schily

Paul Eggert

Jan 10, 2003, 8:33:23 PM
j...@cs.tu-berlin.de (Joerg Schilling) writes:

> if SA_RESTART would work like the OP requested, then it would
> undermine the "intr" NFS mount option.

I don't see why. If the program catches signals, and restarts 'stat'
by hand, that is entirely equivalent to having SA_RESTART behave the
way the OP requested. The only difference is that the C code gets
uglier and harder to read, which is what the OP was trying to avoid.

> I see no reason why GNU make should catch signals

GNU make needs to catch signals for lots of reasons. For example,
it uses signals to interrupt reads off its internal jobserver pipe,
which is a neat portability hack that more people should know about.
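
The general mechanism can be sketched as follows. This is not GNU make's
actual code, only a minimal illustration under stated assumptions: the names
on_sigchld and read_interrupted_by_sigchld are hypothetical, and the 200 ms
delay stands in for a child doing real work. A SIGCHLD handler installed
without SA_RESTART makes a blocking read() on an empty pipe fail with EINTR
when a child exits.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical handler: it does nothing, because the delivery of the
   signal is what interrupts the blocked read(). */
static void on_sigchld(int signo)
{
    (void)signo;
}

/* Hypothetical demo function.
   Returns 1 if read() was interrupted with EINTR by the child's exit,
   0 if read() returned some other way, -1 if setup failed. */
int read_interrupted_by_sigchld(void)
{
    struct sigaction sa, old;
    int fds[2];
    char byte;
    ssize_t n;
    int saved_errno;
    pid_t pid;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                      /* deliberately no SA_RESTART */
    if (sigaction(SIGCHLD, &sa, &old) == -1)
        return -1;

    if (pipe(fds) == -1) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }
    if (pid == 0) {
        /* Child: pause briefly so the parent is already blocked in
           read() when SIGCHLD arrives. The sketch relies on this delay;
           it does not close the race properly. */
        struct timespec delay = { 0, 200L * 1000L * 1000L };
        nanosleep(&delay, NULL);
        _exit(0);
    }

    /* Parent: the pipe is empty and its write end is still open here,
       so read() can neither return data nor see end-of-file. */
    n = read(fds[0], &byte, 1);
    saved_errno = errno;

    waitpid(pid, NULL, 0);                /* reap the child */
    close(fds[0]);
    close(fds[1]);
    sigaction(SIGCHLD, &old, NULL);       /* restore previous disposition */

    return (n == -1 && saved_errno == EINTR) ? 1 : 0;
}
```

Setting SA_RESTART on the same handler would be expected to restart the read()
instead, which is the tension with the stat(2) case discussed in this thread.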

Obviously it's a bug in Solaris that "stat" doesn't restart even if
SA_RESTART is in effect. I should submit a bug report to sunsolve,
for what it's worth.

Unfortunately, GNU make still has to deal with Solaris, warts and all.
