builtin echo command redirection misbehaves in detached scripts when terminal is closed

Pierre-Philippe Coupard

Sep 9, 2007, 10:58:23 AM
to bug-...@gnu.org

Configuration Information [Automatically generated, do not change]:
Machine: i486
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='i486'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i486-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
-DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include
-I../bash/lib -g -O2
uname output: Linux akula 2.6.17.7 #4 Sat Sep 8 23:10:46 CEST 2007 i686
GNU/Linux
Machine Type: i486-pc-linux-gnu

Bash Version: 3.1
Patch Level: 17
Release Status: release

Description:
With Unix-98 ptys, the builtin echo command gets executed
even when writing to stdout or redirecting to stderr fails, and
the output gets written to the wrong file descriptor if any other
redirection is used in the script.

Repeat-By:
For example, with the following script:

while [ 1 ];do
echo Test1
echo Test2 >> file.txt
sleep 1
done

As expected, when this script is run in the background (&), the
console slowly fills with "Test1" lines, and the file.txt file slowly
fills with "Test2" lines.

Now exit the shell leaving the script running (don't simply close the
xterm, that'd kill the script. Type "exit"). Since the terminal has
closed, stdout is closed, so "echo Test1" should fail. It doesn't,
instead it writes "Test1" lines into whatever open file descriptor it
can find. In this case, file.txt starts filling up with

Test2
Test1
Test2
Test1
...

This does not happen with BSD-style ptys, because apparently when the
terminal is closed, the tty seen by the detached bash script stays
intact, and whatever is written to the now-closed terminal is simply
discarded by the kernel, so the script keeps seeing open stdout and
stderr file descriptors. In the case of Unix-98 ptys, this bug happens
because the tty file descriptors the bash script uses are really
closed.

This also does not happen with an external echo command: with
/bin/echo, the redirection fails and the command is not executed, as
expected.


Stephane Chazelas

Sep 9, 2007, 1:17:24 PM
to Pierre-Philippe Coupard, bug-...@gnu.org
On Sun, Sep 09, 2007 at 04:58:23PM +0200, Pierre-Philippe Coupard wrote:
[...]

> while [ 1 ];do
> echo Test1
> echo Test2 >> file.txt
> sleep 1
> done
>
> As expected, when this script is run in the background (&), the
> console slowly fills with "Test1" lines, and the file.txt file slowly
> fills with "Test2" lines.
>
> Now exit the shell leaving the script running (don't simply close
> the xterm, that'd kill the script. Type "exit"). Since the terminal has
> closed, stdout is closed, so "echo Test1" should fail. It doesn't,
> instead it writes "Test1" lines into whatever open file descriptor
> it can find. In this case, file.txt starts filling up with
>
> Test2
> Test1
> Test2
> Test1
[...]

Bonjour Pierre-Philippe,

can be reproduced with 3.2.25 and with:

bash -c 'trap "" PIPE; sleep 1; echo a; echo b > a' | :

It seems to be down to the usage of stdio.

According to ltrace, echo seems to be doing printf("Test1\n")
followed by fflush(stdout). When the write(2) underneath
fflush() fails, "Test1\n" remains in the stdio buffer.

Then bash does a dup2(open("file.txt"), fileno(stdout)) instead
of doing a stdio freopen(3), so the next fflush(3) flushes both
"Test1\n" and "Test2\n" to the now working stdout.

Maybe bash can't use freopen(3) as that would mean closing the
original fd. Best is probably not to use stdio at all.
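
To illustrate the "no stdio" option, here is a minimal sketch of an
echo-style helper that goes straight to the file descriptor. It is not
bash's code; the name echo_fd is made up for the example, and treating
a short write as a failure is a simplification.

```c
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not bash's code): write `s` and a newline to
 * `fd` with a single writev(2) call, bypassing stdio entirely.
 * Returns 0 if everything was written, -1 otherwise.  Nothing is
 * kept in a user-space buffer, so a failure leaves no stale data
 * behind for a later redirection to pick up. */
static int echo_fd(int fd, const char *s)
{
    struct iovec iov[2];
    size_t len = strlen(s);

    iov[0].iov_base = (void *)s;      /* the text itself */
    iov[0].iov_len  = len;
    iov[1].iov_base = (void *)"\n";   /* trailing newline */
    iov[1].iov_len  = 1;

    /* A short write is treated as a failure here for simplicity. */
    return writev(fd, iov, 2) == (ssize_t)(len + 1) ? 0 : -1;
}
```

With something like this, "echo Test1" on a dead stdout would simply
return an error, and a following "echo Test2 >> file.txt" would have
nothing left over to write.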

Note that zsh has the same problem, and AT&T ksh seems to have
an even worse problem (in the example above, it outputs "b\n"
twice). ash and pdksh are OK.

Best regards,
Stéphane


Andreas Schwab

Sep 9, 2007, 1:36:52 PM
to Stephane Chazelas, Pierre-Philippe Coupard, bug-...@gnu.org
Stephane Chazelas <Stephane...@yahoo.fr> writes:

> Bonjour Pierre-Philippe,
>
> can be reproduced with 3.2.25 and with:
>
> bash -c 'trap "" PIPE; sleep 1; echo a; echo b > a' | :

I get this:

$ bash -c 'trap "" PIPE; sleep 1; echo a; echo b > a' | :
bash: line 0: echo: write error: Broken pipe

and the file contains only one line.

Andreas.

--
Andreas Schwab, SuSE Labs, sch...@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Stephane Chazelas

Sep 9, 2007, 2:10:59 PM
to Andreas Schwab, Pierre-Philippe Coupard, bug-...@gnu.org
On Sun, Sep 09, 2007 at 07:36:52PM +0200, Andreas Schwab wrote:
> Stephane Chazelas <Stephane...@yahoo.fr> writes:
>
> > Bonjour Pierre-Philippe,
> >
> > can be reproduced with 3.2.25 and with:
> >
> > bash -c 'trap "" PIPE; sleep 1; echo a; echo b > a' | :
>
> I get this:
>
> $ bash -c 'trap "" PIPE; sleep 1; echo a; echo b > a' | :
> bash: line 0: echo: write error: Broken pipe
>
> and the file contains only one line.
[...]

Hi Andreas,

What OS and version of glibc? I do get the error message but I
get both a and b in the file.

That was on Linux, glibc 2.6.1.

--
Stéphane


Stephane Chazelas

Sep 9, 2007, 2:34:52 PM
to Andreas Schwab, Pierre-Philippe Coupard, bug-...@gnu.org
On Sun, Sep 09, 2007 at 07:10:59PM +0100, Stephane Chazelas wrote:
[...]

> What OS and version of glibc? I do get the error message but I
> get both a and b in the file.
>
> That was on Linux, glibc 2.6.1.
[...]

Actually,

bash -c 'echo a; echo b > a' >&-

is enough for me to reproduce the problem.

And that program below shows the same behavior when run as
./a.out >&-

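#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>   /* dup2() */

int main(void)
{
    printf("a\n");
    fflush(stdout);   /* fails when fd 1 is closed (>&-) */
    /* point fd 1 at the file "a", as a shell does for "> a" */
    dup2(open("a", O_WRONLY|O_CREAT, 0644), 1);
    printf("b\n");
    fflush(stdout);
    return 0;
}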

--
Stéphane


Pierre-Philippe Coupard

Sep 9, 2007, 2:44:25 PM
to Andreas Schwab, Stephane Chazelas, bug-...@gnu.org
Andreas Schwab wrote:
> I get this:
> $ bash -c 'trap "" PIPE; sleep 1; echo a; echo b > a' | :
> bash: line 0: echo: write error: Broken pipe
>
> and the file contains only one line.
>
> Andreas.
>
>
I did more tests, and this is what I came up with:

- akula, my bleeding edge box, is a Debian-unstable box upgraded
yesterday, Sept. 8, 2007. It runs linux-2.6.17.7, libc6-2.6.1

- kilo, my most up-to-date box where bash still seems to behave properly
with regard to that problem, is also a Debian-unstable box, upgraded on
May 1st, 2007. It runs linux-2.6.18, libc6-2.5

On both boxes, I tried Stephane's test line with bash-3.1.17 and bash-2.05b:

On both boxes, with bash-3.1.17, I get the "bash: line 0: echo: write
error: Broken pipe" message, and no error message with bash-2.05b.

On akula, both versions of bash generate a file with 2 lines.
On kilo, both versions of bash generate a correct file with 1 line.


I hope this helps. I'll keep the kilo box in "working bash state" if
anybody wants ssh access to it to test further.


Andreas Schwab

Sep 9, 2007, 3:03:13 PM
to Stephane Chazelas, Pierre-Philippe Coupard, bug-...@gnu.org
Stephane Chazelas <Stephane...@yahoo.fr> writes:

> That was on Linux, glibc 2.6.1.

Same.

Stephane Chazelas

Sep 9, 2007, 4:01:15 PM
to Pierre-Philippe Coupard, Andreas Schwab, bug-...@gnu.org
On Sun, Sep 09, 2007 at 08:44:25PM +0200, Pierre-Philippe Coupard wrote:
[...]
> - akula, my bleeding edge box, is a Debian-unstable box upgraded yesterday
> sept 8, 2007. It runs linux-2.6.17.7, libc6-2.6.1
>
> - kilo, my most up-to-date box where bash still seems to behave properly
> with regard to that problem, is also a Debian-unstable box, upgraded on may
> 1st, 2007. It runs linux2.6.18, libc6-2.5
>
> On both boxes, I tried Stephane's test line with bash-3.1.17 and
> bash-2.05b:
>
> On both boxes, with bash-3.1.17, I get the "bash: line 0: echo: write
> error: Broken pipe" message, and no error message with bash-2.05b.
>
> On akula, both versions of bash generate a file with 2 lines.
> On kilo, both version of bash generate a correct file with 1 line.
[...]

It would seem to be down to the version of glibc, where the behavior
of fflush() has apparently changed. But I can't explain why Andreas
doesn't get the same behavior as me with the same version of
glibc.

I tried on a glibc 2.3.4 and with the C file I see only "b" in
the "a" file. Same thing with Solaris and HPUX with the system's
libc. So that would confirm that the behavior changed in the
glibc (probably somewhere after 2.5). And the problem may be
more of a glibc problem than a bash problem.

I've checked SUSv3 and it doesn't say whether the output buffer
should be emptied after an unsuccessful fflush() (as older
glibc versions seemed to do, but as newer ones no longer seem to
do).

--
Stéphane


Andreas Schwab

Sep 9, 2007, 4:08:14 PM
to Stephane Chazelas, Pierre-Philippe Coupard, bug-...@gnu.org
Stephane Chazelas <Stephane...@yahoo.fr> writes:

> On Sun, Sep 09, 2007 at 07:10:59PM +0100, Stephane Chazelas wrote:
> [...]
>> What OS and version of glibc? I do get the error message but I
>> get both a and b in the file.
>>
>> That was on Linux, glibc 2.6.1.
> [...]
>
> Actually,
>
> bash -c 'echo a; echo b > a' >&-
>
> is enough for me to reproduce the problem.

Guess you have a buggy libc, then.

Stephane Chazelas

Sep 9, 2007, 5:18:07 PM
to Andreas Schwab, Pierre-Philippe Coupard, Dmitry Potapov, bug-...@gnu.org, 429...@bugs.debian.org
On Sun, Sep 09, 2007 at 10:08:14PM +0200, Andreas Schwab wrote:
> Stephane Chazelas <Stephane...@yahoo.fr> writes:
>
> > On Sun, Sep 09, 2007 at 07:10:59PM +0100, Stephane Chazelas wrote:
> > [...]
> >> What OS and version of glibc? I do get the error message but I
> >> get both a and b in the file.
> >>
> >> That was on Linux, glibc 2.6.1.
> > [...]
> >
> > Actually,
> >
> > bash -c 'echo a; echo b > a' >&-
> >
> > is enough for me to reproduce the problem.
>
> Guess you have a buggy libc, then.
[...]

I wouldn't be surprised if it has to do with the fix to debian
bug #429021. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=429021
(I'm CCing Dmitry who is the author of that change according to
bugs.debian.org)

I was testing with debian package 2.6.1-2 that includes Dmitry's
fix for that bug. I don't know if that fix is planned to be
included in the GNU tree; it doesn't seem to be in the glibc
CVS repository yet.

Now, I'm not sure if we can say that the new glibc behavior
observed is bogus (other than it's different from the behavior
observed in all the libcs I tried with). It is not a harmless
change, for sure, as it seems to have broken at least bash, zsh
and possibly ksh93.

Dmitry, you may find that whole thread at:
http://groups.google.com/group/gnu.bash.bug/browse_thread/thread/e311bdd4f945a21e/621b7189217760f1

Best regards,
Stéphane


Pierre-Philippe Coupard

Sep 9, 2007, 5:41:56 PM
to Stephane Chazelas, Andreas Schwab, Dmitry Potapov, bug-...@gnu.org, 429...@bugs.debian.org
The change is far from trivial or harmless, if it was intended. I had to
rebuild a custom server I run in a hurry because it was flooding an IRC
channel with log lines a backend bash script sent to stderr. And I can
think of plenty of ways to trash files with this bug.

Anyway, thanks a lot Stéphane and Andreas for testing this!

Eric Blake

Sep 9, 2007, 10:21:27 PM
to Stephane Chazelas, Pierre-Philippe Coupard, bug-...@gnu.org
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Stephane Chazelas on 9/9/2007 11:17 AM:


> can be reproduced with 3.2.25 and with:
>

> bash -c 'trap "" PIPE; sleep 1; echo a; echo b > a' | :
>
> It seems to be down to the usage of stdio.

Indeed. I raised this very bug several months ago:
http://lists.gnu.org/archive/html/bug-bash/2007-04/msg00070.html

where the cd builtin has the same issue. I also proposed several
approaches for fixing the issue.

- --
Don't work too hard, make some time for fun as well!

Eric Blake eb...@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG5Kom84KuGfSFAYARAgOfAKCiUMKGYRG8+xRJRIoxM5PSnkMoNACgqane
BFh6hhMjAibDs0PgSf032Xw=
=OF9G
-----END PGP SIGNATURE-----


Aurelien Jarno

Sep 9, 2007, 6:05:57 PM
to Stephane Chazelas, 429...@bugs.debian.org, Andreas Schwab, Pierre-Philippe Coupard, Dmitry Potapov, bug-...@gnu.org
Stephane Chazelas wrote:

> On Sun, Sep 09, 2007 at 10:08:14PM +0200, Andreas Schwab wrote:
>> Stephane Chazelas <Stephane...@yahoo.fr> writes:
>>
>>> On Sun, Sep 09, 2007 at 07:10:59PM +0100, Stephane Chazelas wrote:
>>> [...]
>>>> What OS and version of glibc? I do get the error message but I
>>>> get both a and b in the file.
>>>>
>>>> That was on Linux, glibc 2.6.1.
>>> [...]
>>>
>>> Actually,
>>>
>>> bash -c 'echo a; echo b > a' >&-
>>>
>>> is enough for me to reproduce the problem.
>> Guess you have a buggy libc, then.
> [...]
>
> I wouldn't be surprised if it has to do with the fix to debian
> bug #429021. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=429021
> (I'm CCing Dmitry who is the author of that change according to
> bugs.debian.org)
>

I can reproduce the "bug" with glibc from etch, or even from sarge, so I
really doubt that it comes from this change.

--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' aur...@debian.org | aure...@aurel32.net
`- people.debian.org/~aurel32 | www.aurel32.net


Stephane Chazelas

Sep 10, 2007, 3:41:51 AM
to Aurelien Jarno, Andreas Schwab, Pierre-Philippe Coupard, Dmitry Potapov, bug-...@gnu.org, 429...@bugs.debian.org
On Mon, Sep 10, 2007 at 12:05:57AM +0200, Aurelien Jarno wrote:
[...]

> >>> bash -c 'echo a; echo b > a' >&-
> >>>
> >>> is enough for me to reproduce the problem.

[both "a" and "b" seen in file "a".]

> >> Guess you have a buggy libc, then.
> > [...]
> >
> > I wouldn't be surprised if it has to do with the fix to debian
> > bug #429021. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=429021
> > (I'm CCing Dmitry who is the author of that change according to
> > bugs.debian.org)
> >
>
> I can reproduce the "bug" with glibc from etch, or even from sarge, so I
> really doubt that it comes from this change.

[...]

Hi Aurelien.

The reason I suspected that is that Andreas, with a glibc-2.6.1,
was not seeing the problem, so it could be a debian-specific
issue. Also Pierre-Philippe says it is not in debian
unstable from 1st of May 2007 (glibc-2.5 based). And the only
diff on libio/fileops.c in glibc-2.6.1-2 is that fix for 429021,
and the log for that bug talks of something very related.

I could not reproduce the problem with a glibc-2.3.4 on an old
RedHat system. That version of glibc was in between sarge's
(2.3.2) and etch's (2.3.6).

Andreas, could you please confirm which distribution of Linux
you have and which version of the libc package?

All in all, it would suggest that the change was introduced by
debian if not in the fix for 429021. To sum up:

fflush() seems to empty its buffer upon an unsuccessful call
(an fflush(3) where the write(2) fails) on
- debian unstable glibc 2.5 (according to Pierre-Philippe)
- Andreas' glibc 2.6.1
- Some RedHat glibc 2.3.4 (according to me)
- Solaris 7 system libc (not glibc)
- HPUX 11.11 system libc (not glibc)

And it seems not to empty it in
- debian unstable 2.6.1-2 (according to me and
Pierre-Philippe)
- debian etch (2.3.6?) according to Aurelien
- debian sarge (2.3.2?) according to Aurelien

Best regards,
Stéphane


Stephane Chazelas

Sep 10, 2007, 4:08:33 AM
to Dmitry Potapov, Andreas Schwab, Pierre-Philippe Coupard, bug-...@gnu.org, 429...@bugs.debian.org
On Mon, Sep 10, 2007 at 11:56:33AM +0400, Dmitry Potapov wrote:

> On Sun, Sep 09, 2007 at 10:18:07PM +0100, Stephane Chazelas wrote:
> > Now, I'm not sure if we can say that the new glibc behavior
> > observed is bogus (other than it's different from the behavior
> > observed in all the libcs I tried with).
>
> What libc have you tried?
>
> To me, the new behavior makes much more sense, as dropping buffer on
> error is really weird thing to do. I have looked at the source code of
> newlib and dietlibc, none of them drops buffer on error, and I am not
> aware about any other implementation of libc that does.

Hi Dmitry,

thanks for replying, I gave a list in another email. I tried on
Solaris 7 and HPUX and both seem to flush the buffer upon an
unsuccessful fflush().

> > It is not a harmless
> > change, for sure as it seems to have broken at least bash, zsh
> > and possibly ksh93.
>

> Unfortunately, you are right. I did not foresee that some shells may use
> "dup2(open("file.txt"), fileno(stdout))". It is a dirty hack, which may
> cause some other problems. Frankly, I am a bit surprised that bash uses
> printf instead of write(2). BTW, you cannot use 'printf' in signal
> handlers, so it seems that you cannot use 'echo' in trap commands too.
>
> Perhaps, we should rollback my patch and give some time for developers
> to fix their broken shells, but, in this case, what is actually broken
> are those shells, not libc!
[...]

On the other hand, how would you force the flush of the buffer?

And how would you redirect stdout? We can use freopen() instead
of the hack above for files, but not for pipes or arbitrary fds
(as in >&3). Eric Blake was suggesting to use freopen(NULL) (not
to fix that very problem but because of the fact that if you
reassign stdout to some resource of a different nature, you need
to tell stdio as stdio may need to operate differently), but
that's not very portable according to POSIX. Would freopen(NULL)
flush the output buffer?

You cannot simply assign stdout to some value returned by
fdopen() as that's not portable either...

--
Stéphane


Stephane Chazelas

Sep 10, 2007, 7:13:12 AM
to Dmitry Potapov, Andreas Schwab, Pierre-Philippe Coupard, bug-...@gnu.org, 429...@bugs.debian.org
On Mon, Sep 10, 2007 at 02:17:41PM +0400, Dmitry Potapov wrote:
[...]

> On Mon, Sep 10, 2007 at 09:08:33AM +0100, Stephane Chazelas wrote:
> > thanks for replying, I gave a list in another email. I tried on
> > Solaris 7 and HPUX and both seem to flush the buffer upon an
> > unsuccessful fflush()
>
> I see... I wonder how they work in regard of my original problem
> described in the Bug#429021, because it is possible to not discard data
> when write failed, but still clean buffer in fflush(). So, functions
> like fwrite, printf will not lose some previously written data on error,
> but fflush() will always have a clean output buffer at return, so
> it will not break existing software, which use dup2 trick.

I'll investigate this evening (BTW, it wasn't Solaris 7, but
Solaris 8).

> > On the other hand, how would you force the flush of the buffer?
>

> The flush means to _deliver_ data, which is impossible in this case.

Sorry, I meant flush() as in emptying the buffer (whether
flushing it to the fd or down the drain (discarding it)).

BTW, does anybody know why our emails don't seem to make it to
the bash mailing list anymore?

--
Stéphane


Dmitry Potapov

Sep 10, 2007, 4:06:14 AM
to Aurelien Jarno, Andreas Schwab, Pierre-Philippe Coupard, Stephane Chazelas, bug-...@gnu.org, 429...@bugs.debian.org
On Mon, Sep 10, 2007 at 12:05:57AM +0200, Aurelien Jarno wrote:
> > I wouldn't be surprised if it has to do with the fix to debian
> > bug #429021. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=429021
> > (I'm CCing Dmitry who is the author of that change according to
> > bugs.debian.org)
> >
>
> I can reproduce the "bug" with glibc from etch, or even from sarge, so I
> really doubt that it comes from this change.

I can NOT reproduce the problem with glibc from etch, and I do believe
that my patch caused the aforementioned problem, though I do not think
that the patch was incorrect, as the real bug lies inside those
shells.

Dmitry


Dmitry Potapov

Sep 10, 2007, 6:17:41 AM
to Stephane Chazelas, Andreas Schwab, Pierre-Philippe Coupard, bug-...@gnu.org, 429...@bugs.debian.org
Hi Stephane,

On Mon, Sep 10, 2007 at 09:08:33AM +0100, Stephane Chazelas wrote:
> thanks for replying, I gave a list in another email. I tried on
> Solaris 7 and HPUX and both seem to flush the buffer upon an
> unsuccessful fflush()

I see... I wonder how they behave with regard to my original problem
described in Bug#429021, because it is possible not to discard data
when a write fails, but still to clean the buffer in fflush(). That way,
functions like fwrite and printf would not lose previously written data
on error, but fflush() would always leave a clean output buffer on
return, so it would not break existing software that uses the dup2 trick.

> On the other hand, how would you force the flush of the buffer?

The flush means to _deliver_ data, which is impossible in this case.

> And how would you redirect stdout? We can use freopen() instead
> of the hack above for files, but not for pipes or arbitrary fds
> (as in >&3).

I see... POSIX has fdopen to create a stream based on the existing
file descriptor, but there is no function to change an existing
stream like 'stdout'. So, I don't know any other portable solution
except avoiding 'stdout'. For some implementations, you can just
assign any FILE pointer to stdout like this:

    FILE* out = fdopen(fd, mode);
    if (out != NULL)
    {
        fclose(stdout);
        stdout = out;
    }
    else
        report_error();

but in general it does not work, because stdout is not guaranteed to
be an assignable lvalue.

> Eric Blake was suggesting to use freopen(NULL) (not
> to fix that very problem but because of the fact that if you
> reassign stdout to some resource of a different nature, you need
> to tell stdio as stdio may need to operate differently), but
> that's not very portable according to POSIX. Would freopen(NULL)
> flush the output buffer?

In Glibc, freopen:

  if (filename == NULL && _IO_fileno (fp) >= 0)
    {
      fd = __dup (_IO_fileno (fp));
      if (fd != -1)
        filename = fd_to_filename (fd);
    }

Then it closes the original stream and opens a new one in
the same place. So I believe it should work with glibc,
provided that you call it after dup2 and that your
system has /proc, because fd_to_filename relies on it.

freopen in newlib does not do anything special about NULL,
so I believe it does not work with NULL.

Perhaps, freopen("/dev/stdout") is a more portable way to
do what you want.

Regards,
Dmitry


Chet Ramey

Sep 10, 2007, 9:18:55 AM
to Dmitry Potapov, Pierre-Philippe Coupard, Stephane Chazelas, 429...@bugs.debian.org, Andreas Schwab, bug-...@gnu.org, ch...@case.edu
Dmitry Potapov wrote:

>
> Unfortunately, you are right. I did not foresee that some shells may use
> "dup2(open("file.txt"), fileno(stdout))". It is a dirty hack, which may
> cause some other problems. Frankly, I am a bit surprised that bash uses
> printf instead of write(2). BTW, you cannot use 'printf' in signal
> handlers, so it seems that you cannot use 'echo' in trap commands too.

Luckily, neither of these things is true.

What's needed is a portable interface like BSD's fpurge(3).

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
Live Strong. No day but today.
Chet Ramey, ITS, CWRU ch...@case.edu http://cnswww.cns.cwru.edu/~chet/


Dmitry Potapov

Sep 10, 2007, 9:36:46 AM
to Stephane Chazelas, Andreas Schwab, Pierre-Philippe Coupard, bug-...@gnu.org, 429...@bugs.debian.org
Hello Stephane,

I was wrong to suggest freopen("/dev/stdout") in my previous mail.
It cannot be used to redirect stdout.

Regards,
Dmitry


Andreas Schwab

Sep 10, 2007, 9:53:41 AM
to chet....@case.edu, Pierre-Philippe Coupard, Stephane Chazelas, 429...@bugs.debian.org, bug-...@gnu.org, Dmitry Potapov, ch...@case.edu
Chet Ramey <chet....@case.edu> writes:

> What's needed is a portable interface like BSD's fpurge(3).

This is also available from glibc as __fpurge (likewise on Solaris).

Eric Blake-1

Sep 10, 2007, 10:29:43 AM
to Bug-...@gnu.org

> What's needed is a portable interface like BSD's fpurge(3).

Gnulib provides this[1]. Maybe you should consider using
gnulib to enhance the portability of future versions of bash.

[1] http://www.gnu.org/software/gnulib/MODULES.html#module=fpurge

--
Eric Blake

--
View this message in context: http://www.nabble.com/builtin-echo-command-redirection-misbehaves-in-detached-scripts-when-terminal-is-closed-tf4409627.html#a12594005
Sent from the Gnu - Bash mailing list archive at Nabble.com.

Chet Ramey

Sep 10, 2007, 11:57:34 AM
to Andreas Schwab, Pierre-Philippe Coupard, Stephane Chazelas, 429...@bugs.debian.org, bug-...@gnu.org, Dmitry Potapov, ch...@case.edu
Andreas Schwab wrote:

> Chet Ramey <chet....@case.edu> writes:
>
>> What's needed is a portable interface like BSD's fpurge(3).
>
> This is also available from glibc as __fpurge (likewise on Solaris).

Yes, though I have an aversion to calling functions with a `__' prefix
from user application code.

However:

"These functions are nonstandard and not portable."

It would be nice to have something standardized. I can certainly add
yet another configure test for this -- I just wish I didn't have to.

Stephane Chazelas

Sep 10, 2007, 12:39:09 PM
to Chet Ramey, Pierre-Philippe Coupard, 429...@bugs.debian.org, Andreas Schwab, bug-...@gnu.org, Dmitry Potapov, ch...@case.edu
On Mon, Sep 10, 2007 at 11:57:34AM -0400, Chet Ramey wrote:
> Andreas Schwab wrote:
> > Chet Ramey <chet....@case.edu> writes:
> >
> >> What's needed is a portable interface like BSD's fpurge(3).
> >
> > This is also available from glibc as __fpurge (likewise on Solaris).
>
> Yes, though I have an aversion to calling functions with a `__' prefix
> from user application code.
>
> However:
>
> "These functions are nonstandard and not portable."
>
> It would be nice to have something standardized. I can certainly add
> yet another configure test for this -- I just wish I didn't have to.
[...]

Note that zsh seems to have the same problem as bash here
(except that it uses fwrite + fputc instead of printf).

The problem I saw with ksh93 seems to be unrelated as ksh93
doesn't seem to be using stdio.

Dmitry, your t.c in the debian report gives:

On Solaris 8:

$ ./t
signal handler called, sig=2
error at num_bytes=15352
fputs: Interrupted system call
writer: num_bytes=80000 num_lines=10000
reader: num_bytes=74888 num_lines=9361
reader: number of missing bytes: 5112

On HPUX 11.11:

$ ./t
signal handler called, sig=2
error at num_bytes=16376
fputs: Interrupted system call
fclose: Interrupted system call
reader: num_bytes=71816 num_lines=8977
reader: number of missing bytes: 8184

So they don't seem to care either to retry and send the data
if the first write() fails.

With dietlibc:

$ ./t
signal handler called, sig=2
writer: num_bytes=80008 num_lines=10001
writer: expected num_bytes=80000 but was 80008
reader: num_bytes=80007 num_lines=10000
reader: number of missing bytes: -7

And dietlibc behaves the same as glibc patched with your
(Dmitry's) change upon the fflush. That is bash would misbehave
the same if linked against dietlibc.

I've also verified that if I revert your change and recompile
the glibc, bash's (and zsh's) problem goes away, so that would
confirm, if need be, that it was that fix that introduced the
change in behavior.

--
Stéphane


Dmitry Potapov

Sep 10, 2007, 1:25:26 PM
to Stephane Chazelas, Pierre-Philippe Coupard, 429...@bugs.debian.org, Chet Ramey, Andreas Schwab, ch...@case.edu, bug-...@gnu.org
On Mon, Sep 10, 2007 at 05:39:09PM +0100, Stephane Chazelas wrote:
> Dmitry, your t.c in the debian report gives:
>
> On Solaris 8:
[...]
> On HPUX 11.11:
[...]

>
> So they don't seem to care either to retry and send the data
> if the first write() fails.

Yes, it seems they purge all data in the IO buffer on error.

> With dietlibc:
>
> $ ./t
> signal handler called, sig=2
> writer: num_bytes=80008 num_lines=10001
> writer: expected num_bytes=80000 but was 80008
> reader: num_bytes=80007 num_lines=10000
> reader: number of missing bytes: -7
>
> And dietlibc behaves the same as glibc patched with your
> (Dmitry's) change upon the fflush.

No, glibc with my patch gives:

$ ./t
signal handler called, sig=2
error at num_bytes=69632
fputs: Interrupted system call
writer: num_bytes=80000 num_lines=10000
reader: num_bytes=80000 num_lines=10000

-7 indicates an error in dietlibc. Somehow, dietlibc does not take into
account that write(2) can write only part of data, and it should not be
considered as an error. But this bug in dietlibc is irrelevant to our
problem. Newlib should work as glibc with my patch, but I have not
tested it.
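
To make the point about partial writes concrete, this is roughly the
loop a libc (or any caller of write(2)) needs. It is a sketch only:
write_all is a made-up name and this is not the code of any of the
libcs discussed here.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write(2) until all `len` bytes
 * of `buf` have been written.  A short count is not an error, the
 * loop just continues with the remainder.  EINTR with nothing
 * written is retried.  Returns 0 on success, -1 on a real error
 * (errno is left as set by write). */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;     /* interrupted before any byte went out */
            return -1;
        }
        buf += n;             /* skip what was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```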

Dmitry


Stephane Chazelas

Sep 10, 2007, 1:36:51 PM
to Dmitry Potapov, Pierre-Philippe Coupard, 429...@bugs.debian.org, Chet Ramey, Andreas Schwab, ch...@case.edu, bug-...@gnu.org
On Mon, Sep 10, 2007 at 09:25:26PM +0400, Dmitry Potapov wrote:
[...]
> > With dietlibc:
> >
> > $ ./t
> > signal handler called, sig=2
> > writer: num_bytes=80008 num_lines=10001
> > writer: expected num_bytes=80000 but was 80008
> > reader: num_bytes=80007 num_lines=10000
> > reader: number of missing bytes: -7
> >
> > And dietlibc behaves the same as glibc patched with your
> > (Dmitry's) change upon the fflush.
>
> No, glibc with my patch gives:
[...]

Sorry for the misunderstanding, I meant "upon the fflush", as in
wrt the issue at stake, that is the fact that dietlibc doesn't
seem to empty the output buffer upon an unsuccessful fflush
either, which confirms what you suspected earlier through
reading the dietlibc code. I did not mean that "t" was behaving
the same in glibc and dietlibc. With the glibc, I obtain:

$ ~/t
signal handler called, sig=2
error at num_bytes=66560
fputs: Interrupted system call
reader: num_bytes=80000 num_lines=10000
writer: num_bytes=80000 num_lines=10000

And with your fix reverted:

.../glibc-2.6.1/build-tree/i386-libc$ LD_LIBRARY_PATH=$PWD ~/t
signal handler called, sig=2
error at num_bytes=66560
fputs: Interrupted system call
writer: num_bytes=80000 num_lines=10000
reader: num_bytes=78976 num_lines=9872
reader: number of missing bytes: 1024

as expected.

Best regards,
Stéphane

