Or must each process somehow bracket its fgets( ) with explicit stream
locks and unlocks?
N869 seems to be silent on this issue and I don't know where else to
look, so if there's a reference somewhere that treats this issue, I'd
appreciate having that.
Thanks!
-- Ben
--
comp.lang.c.moderated - moderation address: cl...@plethora.net -- you must
have an appropriate newsgroups line in your header for your mail to be seen,
or the newsgroup name in square brackets in the subject line. Sorry.
A single stream or a single file?
> Is A's fgets( ) guaranteed to complete before B's fgets( ) begins?
> IOW, can both A and B count on receiving coherent lines when they
> are competing to fgets( ) from the same stream?
>
> Or must each process somehow bracket its fgets( ) with explicit stream
> locks and unlocks?
>
> N869 seems to be silent on this issue and I don't know where else to
> look, so if there's a reference somewhere that treats this issue, I'd
> appreciate having that.
The C standard is silent on issues of parallel processing. No, there
are no guarantees.
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Well, if they're two separate *processes* and not just threads, he
can at least be sure that the output will be coherent. There are
no guarantees as to what order they're read in, but the two
processes should each have their own offset stored in the FILE
struct, so each process will receive the exact same text.
If they are threads, then yes, he'll need to read the docs of
whatever library he's using for that to determine how things
work.
> Keith Thompson wrote:
>> The C standard is silent on issues of parallel processing. No, there
>> are no guarantees.
>
> Well, if they're two separate *processes* and not just threads, he
> can at least be sure that the output will be coherent.
What output? fgets() does input, not output.
> There are
> no guarantees as to what order they're read in, but the two
> processes should each have their own offset stored in the FILE
> struct, so each process will receive the exact same text.
There is no requirement to have an offset stored in the FILE
struct. I guess most implementations don't have it. Any
necessary offset is maintained by the OS.
There are streams where it is impossible to read any data twice.
Terminals, network sockets, pipes and some special devices like
random number generators belong to this category.
--
Greetings,
Jens Schmidt
By output, I meant the characters copied into the buffer pointed to
by the parameter passed to fgets. With respect to the various types
of files you mentioned, true enough; I assumed he was asking what
would happen if two processes were reading from a normal file at the
same time.
> Or must each process somehow bracket its fgets( ) with explicit stream
> locks and unlocks?
yeah. or avoid sharing the socket.
Applying the terminology of the C programming language, I think that is
impossible by definition, because a "stream" is what you get once you
fopen() something. I.e. if two processes fopen() the same thing, that's
two streams, not one.
For the rest of your questions the answer is: the C standard doesn't say
anything about processes, let alone concurrent ones.
> Or must each process somehow bracket its fgets( ) with explicit stream
> locks and unlocks?
It can't --- there is no such thing as a stream lock or unlock in
standard C.
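Outside standard C, POSIX does offer advisory record locks through
fcntl(), and those work between processes. A minimal sketch, assuming a
POSIX system and a descriptor opened O_RDWR (an exclusive F_WRLCK lock
needs write access). read_line_locked() and lock_whole_file() are
made-up names, not library functions. It bypasses stdio and reads one
byte at a time, so the shared file offset advances by exactly one line
per call:

```c
#include <fcntl.h>
#include <unistd.h>

/* Lock or unlock the whole file; blocks until the lock is granted. */
static int lock_whole_file(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;        /* F_WRLCK to lock, F_UNLCK to release */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/*
 * Hypothetical helper: read one line from fd under an exclusive lock.
 * Returns the number of bytes stored (0 at end of file), or -1 on error.
 * Assumes fd was opened O_RDWR so that F_WRLCK is permitted.
 */
ssize_t read_line_locked(int fd, char *buf, size_t size)
{
    size_t n = 0;
    ssize_t r = 0;

    if (size < 2 || lock_whole_file(fd, F_WRLCK) < 0)
        return -1;
    /* Unbuffered, byte-wise read: slow, but never reads past the newline. */
    while (n < size - 1 && (r = read(fd, buf + n, 1)) == 1) {
        if (buf[n++] == '\n')
            break;
    }
    buf[n] = '\0';
    if (lock_whole_file(fd, F_UNLCK) < 0 || r < 0)
        return -1;
    return (ssize_t)n;
}
```

The locks are advisory: they only help if every process that reads the
file goes through the same helper.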
The language doesn't say whether it's possible or not.
Under a Unix-like system, a process could call fork() to create a new
process, and both the parent and child process could share the same
stream. (I think; there are probably reasons why this is a bad idea.)
> For the rest of your questions the answer is: the C standard doesn't
> say anything about processes, let alone concurrent ones.
Exactly.
And the OP hasn't bothered to come back and tell us whether he
really means a single stream, a single file, or something else.
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
> And the OP hasn't bothered to come back and tell us whether he
> really means a single stream, a single file, or something else.
Keith and Francis answered the question completely so why any further
words? I wish I'd followed the discussion before today.
I should have posed the complete (but still imprecise) question:
-----
I have a process P that first fopen()s one single disk file F,
associating with F a FILE * fp; and then fork()s two child processes,
A and B. A and B each inherits one copy of fp. Both copies of fp point
to one single FILE struct containing (functionally) the single file
offset into F.
Now A and B call fgets() on fp such that B's call can arrive at
fgets() before A's call is completed; that is, in particular, before
the file offset is advanced to the next line.
There's no need for fgets() to deliver lines to A and B in any
particular order, only that fgets() deliver complete lines to A and B
and that the file offset always advance (never retreat) one complete
line at a time.
In these circumstances, is there any guarantee that both A and B will receive
complete, coherent lines from fgets()?
-----
I asked the question because once in a while there show up in
newsgroups, including I believe c.l.c., the claim that fread() and
friends operate in a way that is guaranteed atomic.
Thinking about this, and forgetting what the language spec says, I see
that there's no practical way for fgets() to make any such guarantee:
fgets() can't hold off other calls while the OS seeks the next
physical block for the text to complete the read of a line.
Conclusion: fgets(), more probably than not, does indeed deliver
complete lines; and does advance the file offset just the way I want.
But the behavior is purely accidental.
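One way to see how accidental: on a buffered stream, fgets() usually
makes the C library read a whole buffer from the descriptor, so the
kernel's file offset moves well past the line that was returned. A
rough sketch, assuming POSIX (fileno(), lseek());
offset_after_one_fgets() is a made-up helper name:

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict compilers */
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one line with fgets(), then report where
 * the kernel's file offset ended up. Returns -1 on failure.
 */
long offset_after_one_fgets(const char *path, char *line, int size)
{
    FILE *fp = fopen(path, "r");
    long off;

    if (fp == NULL)
        return -1;
    if (fgets(line, size, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    /* Ask the kernel, not stdio: ftell() would report the stream position. */
    off = (long)lseek(fileno(fp), 0, SEEK_CUR);
    fclose(fp);
    return off;
}
```

On the implementations I would expect, a small file is consumed
entirely by the first fgets(), so the reported offset is the file size
rather than the length of the first line. After fork(), process B would
then start reading wherever A's buffer fill left the shared offset.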
-- ben
> > Or must each process somehow bracket its fgets( ) with explicit stream
> > locks and unlocks?
>
> It can't --- there is no such thing as a stream lock or unlock in
> standard C.
Yes, thanks. I had the idea that flock() was standard.
-- ben
> I have a process P that first fopen()s one single disk file F,
> associating with F a FILE * fp; and then fork()s two child processes, A
> and B. A and B each inherits one copy of fp. Both copies of fp point to
> one single FILE struct containing (functionally) the single file offset
> into F.
It's more complex than that. Since you mention fork(), I suggest you
determine first which version of the Single UNIX(R) Specification you want
to code for, and then read the corresponding section on the "Interaction
of File Descriptors and Standard I/O Streams" (and the preceding
introduction to "Standard I/O Streams").
SUSv2 (UNIX 98):
http://www.opengroup.org/onlinepubs/007908775/xsh/stdio.html#tag_000_009_001
SUSv3 (UNIX 03 / POSIX 2004):
http://www.opengroup.org/onlinepubs/000095399/functions/xsh_chap02_05.html#tag_02_05_01
SUSv4 (POSIX 2008):
http://www.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05_01
I'll put a very crude personal summary here, reiterating some of my "18
Sep 2009 at 9:18 pm" comment under http://pl.atyp.us/wordpress/?p=2298.
This is very much not C89 or C99, so I apologize for posting in the wrong
group. You should really read the pages linked to above instead of what
follows.
1. The chain from standard I/O stream to UNIX file is:
stream -> file descriptor -> file description -> inode (file)
(For the necessity of the "stream -> file descriptor" reference, look at
the fileno() rationale under [0], for example).
The "stream" object usually lives in user space (libc) and certainly has a
file-position indicator. (See fseek() / fseeko() [1] and fsetpos() [2].)
It also provides user-space buffering, see setvbuf() [3].
The file descriptor has no "offset" notion associated with it. An example
for file descriptor level characteristics is FD_CLOEXEC, see fcntl() /
F_SETFD [4]. The file descriptor is the link between user-space and
kernel-space.
The file description ("open file") lives in kernel-space and contains a
file offset. See lseek() [5].
The inode *is* the file. Access rights are inode attributes, for example.
See fstat() [6].
2. Two file descriptors can perfectly well refer to the same file
description, both when those descriptors are available to the same process
or when they are available to distinct processes. (See dup() [7], or
open() [8] plus fork() [9].) Accesses through these two descriptors share
the file offset, because the file offset is maintained in the underlying,
common file description.
3. Two separate file descriptions can refer to the same inode. It happens
when two "unrelated" processes open() [8] the file. Modifications through
any of them will update the same "last data modification timestamp" [a],
for example.
4. If you don't go "above" file descriptor level, "sharing" on any level
is a little less problematic because everything is managed by the kernel.
Record locking is available via two interfaces (whose interaction is
unspecified), fcntl() / F_SETLK [4] and lockf() [b].
5. If at least one of the two accessors acts through a standard I/O
stream, you'll additionally have to follow the synchronization rules
linked to at the top. Even then there seem to be only output guarantees;
"It is implementation-defined whether, and under what conditions, all
input is seen exactly once."
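Points 2 and 3 can be checked with a small experiment. This is only a
sketch, assuming a POSIX system and an existing regular file of at
least nbytes bytes; compare_offsets() is a name invented for the
example:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical demo: read nbytes through one descriptor, then report
 * the offsets seen through a dup()ed descriptor (same file description)
 * and through an independently open()ed one (separate file description).
 * Returns 0 on success, -1 on failure.
 */
int compare_offsets(const char *path, size_t nbytes,
                    off_t *dup_off, off_t *open_off)
{
    char buf[64];
    int rc = -1;
    int fd = open(path, O_RDONLY);
    int fd_dup = (fd >= 0) ? dup(fd) : -1;
    int fd_new = open(path, O_RDONLY);

    if (fd >= 0 && fd_dup >= 0 && fd_new >= 0 && nbytes <= sizeof buf
        && read(fd, buf, nbytes) == (ssize_t)nbytes) {
        *dup_off = lseek(fd_dup, 0, SEEK_CUR);   /* follows the read */
        *open_off = lseek(fd_new, 0, SEEK_CUR);  /* still at the start */
        rc = 0;
    }
    if (fd >= 0)
        close(fd);
    if (fd_dup >= 0)
        close(fd_dup);
    if (fd_new >= 0)
        close(fd_new);
    return rc;
}
```

After reading 5 bytes, the dup()ed descriptor reports offset 5 and the
separately opened one reports 0. A descriptor inherited across fork()
behaves like the dup()ed one.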
If you want to parallelize the processing of the file, a single splitter
process (or thread) might prove workable. It could read bigger chunks,
"newline-align" them, then pass them on to workers via IPC.
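A bare-bones sketch of such a splitter, assuming POSIX and that each
worker is reachable through a descriptor (for instance the write end of
a pipe). It hands out single lines rather than bigger chunks to stay
short. split_lines() and write_all() are hypothetical names:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write all len bytes to fd, retrying after short writes. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical splitter: the only reader of `in`. Each complete line
 * goes to the next worker descriptor in round-robin order.
 * Returns the number of newline-terminated lines handed out, or -1.
 */
int split_lines(FILE *in, const int *worker_fds, int nworkers)
{
    char buf[4096];
    int cur = 0;
    int count = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        size_t len = strlen(buf);

        if (write_all(worker_fds[cur], buf, len) < 0)
            return -1;
        /* fgets() may return a long line in pieces; switch workers
           only once the newline has been seen. */
        if (len > 0 && buf[len - 1] == '\n') {
            cur = (cur + 1) % nworkers;
            count++;
        }
    }
    return ferror(in) ? -1 : count;
}
```

Because only one process ever touches the input stream, none of the
sharing rules above come into play for it.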
I apologize for any mistakes above, please anybody correct them. Thanks.
Happy new year,
lacos
[0] http://www.opengroup.org/onlinepubs/9699919799/functions/fileno.html
[1] http://www.opengroup.org/onlinepubs/9699919799/functions/fseek.html
[2] http://www.opengroup.org/onlinepubs/9699919799/functions/fsetpos.html
[3] http://www.opengroup.org/onlinepubs/9699919799/functions/setvbuf.html
[4] http://www.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
[5] http://www.opengroup.org/onlinepubs/9699919799/functions/lseek.html
[6] http://www.opengroup.org/onlinepubs/9699919799/functions/fstat.html
[7] http://www.opengroup.org/onlinepubs/9699919799/functions/dup.html
[8] http://www.opengroup.org/onlinepubs/9699919799/functions/open.html
[9] http://www.opengroup.org/onlinepubs/9699919799/functions/fork.html
[a] http://www.opengroup.org/onlinepubs/9699919799/basedefs/sys_stat.h.html
[b] http://www.opengroup.org/onlinepubs/9699919799/functions/lockf.html
It probably is - for some particular standard; it just isn't standard C.
Heh! Another accuritis sufferer :-)
> From: lcplben <b...@sellmycalls.com>
> Date: Thu, 31 Dec 2009 20:00:22 -0600 (CST)
> Message-ID: <clcm-20091231-0...@plethora.net>
> > I have a process P that first fopen()s one single disk file F,
> > associating with F a FILE * fp; and then fork()s two child processes, A
> > and B. A and B each inherits one copy of fp. Both copies of fp point to
> > one single FILE struct containing (functionally) the single file offset
> > into F.
> It's more complex than that. Since you mention fork(), I suggest you
> determine first which version of the Single UNIX(R) Specification you want
> to code for, and then read the corresponding section on the "Interaction
> of File Descriptors and Standard I/O Streams" (and the preceding
> introduction to "Standard I/O Streams").
> SUSv2 (UNIX 98): http://www.opengroup.org/onlinepubs/007908775/xsh/stdio.html#tag_000_...
> SUSv3 (UNIX 03 / POSIX 2004): http://www.opengroup.org/onlinepubs/000095399/functions/xsh_chap02_05...
> SUSv4 (POSIX 2008): http://www.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.ht...
> I'll put a very crude personal summary here, reiterating some of my "18
> Sep 2009 at 9:18 pm" comment under http://pl.atyp.us/wordpress/?p=2298.
> Happy new year,
> lacos
Laszlo, thank you very much for all of your research and your
perfectly à propos response. Your terrific post has answered my
question completely; it has revealed documentation that I didn't know
existed, and it has peeled away another layer of the onion that
obscures the file system at its core.
The answer is that the file system, even at the file-descriptor level,
makes no claim about the atomicity of fwrite( ) and fread( ).
But the docs suggest another approach that /is/ atomic. IEEE Std
1003.1-2008, your third reference above, says of a write( ) to a pipe:
O_NONBLOCK disabled, n <= PIPE_BUF
All n bytes are written atomically; write(2) may block if there
is not room for n bytes to be written immediately
So a read( ) on the other end of the pipe can easily be made atomic.
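A minimal sketch of the writer side, assuming POSIX. write_record() is
a made-up name; it asks fpathconf() for the pipe's PIPE_BUF value and
refuses anything larger, so each record goes out in one write() that
falls under the rule quoted above:

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one record down a pipe with a single
 * write(), but only if it is small enough for the PIPE_BUF rule.
 * Returns the write() result, or -1 with errno set to EMSGSIZE when
 * the record is too large (or the limit cannot be determined).
 */
ssize_t write_record(int pipe_fd, const void *rec, size_t len)
{
    long limit = fpathconf(pipe_fd, _PC_PIPE_BUF);

    if (limit < 0 || len > (size_t)limit) {
        errno = EMSGSIZE;
        return -1;
    }
    return write(pipe_fd, rec, len);
}
```

The sketch covers only the write side. Whether several competing
readers each get whole records is something I would check against the
read() page separately.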
Thank you again, lacos.
-- ben
No, they would be different streams backed by a shared file descriptor.
Scheduling and buffering render the semantics of using both concurrently
unpredictable, to say the least. Using POSIX file descriptors directly
instead of C streams eliminates the buffering issues, but not the
scheduling issues.
The following program illustrates my point:
| #include <stdio.h>
| #include <unistd.h>
|
| int
| main(void)
| {
|     pid_t pid;
|
|     printf("hello");
|     if ((pid = fork()) != 0) {
|         printf(" parent");
|     } else {
|         printf(" child");
|     }
|     printf("\n");
|     return 0;
| }
My understanding of §7.19.3 is that if stdout refers to an interactive
device, it must not be fully buffered, but may be either unbuffered or
line buffered, at the implementer's discretion. I believe (but am not
certain) that POSIX requires it to be line buffered; that is certainly
the case on all unices I've worked on. Hence, when this program is run
on a terminal on a UNIX-like system, the most commonly seen result is
either
| hello parent
| hello child
or
| hello child
| hello parent
Assuming a reasonable buffer size (i.e. more than 13 characters), there
are several other possibilities, including:
| hello parent
| child

| hello parent child
|

| child
| hello parent

| childhello parent
|
and, if fork() failed:
| hello parent
If stdout is unbuffered, the details of the printf() implementation come
into play. If printf() has its own internal buffer, the result will be
some variation of "hello" [ " parent" "\n" " child" ] "\n", where the
strings between the brackets can appear in any order. If printf() is
unbuffered, the result will be "hello" [ " parent\n" " child\n" ] where
the individual characters in the strings between the brackets can be
intermingled, but not reordered.
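The duplicated "hello" goes away if the buffer is emptied before
fork(), so the child has nothing to inherit. A sketch under some
assumptions of my own: hello_once() is a made-up name, it writes to a
stream passed in rather than to stdout, and the parent waits for the
child, which also pins down the order (the program above does neither):

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical variant of the program above. Returns 0 on success,
 * -1 if fork(), waitpid() or the final flush fails.
 */
int hello_once(FILE *out)
{
    pid_t pid;
    int status;

    fputs("hello", out);
    fflush(out);            /* nothing left in the buffer to duplicate */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        fputs(" child\n", out);
        fflush(out);
        _exit(0);           /* leave without running the parent's exit-time flushes */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;          /* otherwise the child has finished writing */
    fputs(" parent\n", out);
    return fflush(out) == 0 ? 0 : -1;
}
```

Called with a stream on a regular file, this leaves "hello child"
followed by " parent" on its own line, with "hello" appearing once.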
DES
--
Dag-Erling Smørgrav - d...@des.no
Thanks for the correction. I was speculating about something that's
off-topic for this newsgroup; it's not too surprising that I got it
wrong. My apologies.
This discussion should have been taken to comp.unix.programmer some
time ago.
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Not if the file is connected to a pipe or FIFO on a Unix system; nor
indeed for /dev/random, nor a terminal, nor many other things. And even
if it is a disk file, they aren't guaranteed to read the same data;
there could be other processes also modifying the file (or, indeed,
either of the original two could modify the file and the other would see
the modified version, not the data originally read from it).
Jonathan Leffler