
IFS handling and read


Юрий Пухальский

Nov 26, 2009, 3:02:52 PM
to bug-...@gnu.org
Good day!

There is a problem with the following code:

echo a:b|IFS=: read a b; echo $a

According to the standard, I expect field splitting to occur, thus
setting the variable a to the value "a". But it doesn't work, even with
the POSIX option. The other shells I can lay my hands on (the native shells
on HP-UX and AIX of different versions, ksh and zsh) work as expected.

(GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu))

——
«The good thing about standards is there are so many to choose from.»


Eric Blake

Nov 26, 2009, 3:09:48 PM
to Юрий Пухальский, bug-...@gnu.org

According to Юрий Пухальский on 11/26/2009 1:02 PM:


> Good day!
>
> There is a problem with the following code:
>
> echo a:b|IFS=: read a b; echo $a

This is E4 in the FAQ:
ftp://ftp.cwru.edu/pub/bash/FAQ

POSIX permits, but does not require, that the final element of a pipeline
be executed in a subshell. Bash uses the subshell, ksh does not.
Variable assignments in a subshell do not affect the parent.

Meanwhile, read obeys IFS according to POSIX, as shown by:

$ IFS=: read a b <<EOF
> 1:2
> EOF
$ echo $a
1
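
A minimal sketch of the two cases side by side, assuming bash with its default settings. The helper variables `piped` and `heredoc` are only there for illustration and are not part of the original report:

```shell
#!/usr/bin/env bash
# Case 1: read is the last element of a pipeline. In bash this element runs
# in a subshell, so the assignment is lost when the pipeline finishes.
unset a b
echo a:b | IFS=: read a b
piped=${a-unset}

# Case 2: read runs in the current shell and takes its input from a
# here-document, so the assignment survives.
unset a b
IFS=: read a b <<EOF
a:b
EOF
heredoc=${a-unset}

echo "pipeline:      a=$piped"
echo "here-document: a=$heredoc"
```

In a shell that runs the last pipeline element in the current environment (ksh, for example), the first case would report the value instead of `unset`.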

--
Don't work too hard, make some time for fun as well!

Eric Blake eb...@byu.net


Marc Herbert

Nov 30, 2009, 5:34:10 AM
to bug-...@gnu.org
Eric Blake wrote:

>
> This is E4 in the FAQ:
> ftp://ftp.cwru.edu/pub/bash/FAQ
>
> POSIX permits, but does not require, that the final element of a pipeline
> be executed in a subshell. Bash uses the subshell, ksh does not.
> Variable assignments in a subshell do not affect the parent.

I am regularly bitten by this. This is a major pain; it makes "read" very
inconvenient to use (whatever IFS is).

Could this be changed in the future?

Cheers,

Marc

Lhunath (Maarten B.)

Nov 30, 2009, 5:46:03 AM
to Marc Herbert, bug-...@gnu.org

Don't use pipelines to send streams to read. Use file redirection instead:

Instead of ''command | read var''
Use ''read var < <(command)''

I hardly see a need to change the existing implementation.
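
A short sketch of that redirection in a loop, assuming bash (process substitution is not part of POSIX sh). The `printf` call is a stand-in for whatever command produces the stream:

```shell
#!/usr/bin/env bash
# The loop body runs in the current shell because its input comes from a
# redirection, not from a pipe, so count and last are still set afterwards.
count=0
last=
while IFS= read -r line; do
    count=$((count + 1))
    last=$line
done < <(printf '%s\n' one two three)

echo "count=$count last=$last"
```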

Greg Wooledge

Nov 30, 2009, 8:13:29 AM
to Marc Herbert, bug-...@gnu.org
On Mon, Nov 30, 2009 at 11:46:03AM +0100, Lhunath (Maarten B.) wrote:
> Don't use pipelines to send streams to read. Use file redirection instead:
>
> Instead of ''command | read var''
> Use ''read var < <(command)''
>
> I hardly see a need to change the existing implementation.

Or for the original problem case, use a here string:

IFS=: read a b <<< "1:2"

Between process substitutions (the <(command) thing) and here strings,
you should be able to do all your reads without subshells.
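
For instance, a here string can split a value that is already in a variable. This is only a sketch, assuming bash; the sample record and the variable names are made up for illustration:

```shell
#!/usr/bin/env bash
# Here strings (<<<) are a bash extension: the word is expanded and fed to
# read on standard input, so read runs in the current shell.
record="alice:x:1000"    # hypothetical sample record
IFS=: read -r user pass uid <<< "$record"

echo "user=$user uid=$uid"
```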


Marc Herbert

Nov 30, 2009, 8:10:10 AM
to bug-...@gnu.org
Lhunath (Maarten B.) wrote:

> On 30 Nov 2009, at 11:34, Marc Herbert wrote:
>
>> Eric Blake wrote:
>>> This is E4 in the FAQ:
>>> ftp://ftp.cwru.edu/pub/bash/FAQ

> Instead of ''commands | read var''
> Use ''read var < <(commands)''


> I hardly see a need to change the existing implementation.

As mentioned in the FAQ, ''read var < <(commands)'' is not portable.

All alternatives in the FAQ (portable or not) are less readable than a
simple pipe. They are all more verbose and introduce an extra level of
nesting when you have only one "command". They all need to be read
"backwards" with respect to the execution flow. If you want to keep your
code readable, they practically all force you to define a function for
"commands" as soon as you have more than a few commands.

Every entry in an FAQ is by mere definition a problem that many people
wast... spend time on.

It is admittedly not a question of life or death but some other shells
apparently have it so why not bash? Just asking.

Lhunath (Maarten B.)

Nov 30, 2009, 8:38:07 AM
to Marc Herbert, bug-...@gnu.org

Let me try to guess what your definition of portability is by assuming it means "will run in any POSIX shell".

Firstly, if you are writing FOR the bash shell, you needn't worry about this type of portability. Putting bash in your hashbang means the script will only ever be interpreted by a bash shell, not any other POSIX shell.

Secondly, if you do decide that for some reason you want to have your script interpretable by other POSIX shells (which means you avoid all other bash-specific features, too) your concern over portability still does not warrant the implementation being changed, as POSIX does not require shells to avoid subshelling components of a pipeline. So you still can't rely on other non-bash shells that are POSIX-compliant to treat your script's implementation the same.

That said, process substitution is an excellent alternative in any case for the pipeline-to-read problem. It is clean and has no side effects. If your real issue is that many people struggle with this because they are newbies and haven't learned the intricacies of the shell yet, then surely this is not the first or biggest obstacle in that respect. Even (self-)proclaimed bash geniuses still fail at quoting expansions properly because they do not understand or appreciate the intricacies of word splitting and pathname expansion.

Chris F.A. Johnson

Nov 30, 2009, 9:21:31 AM
to bug-...@gnu.org
On Mon, 30 Nov 2009, Marc Herbert wrote:

> Lhunath (Maarten B.) wrote:
> > On 30 Nov 2009, at 11:34, Marc Herbert wrote:
> >
> >> Eric Blake wrote:
> >>> This is E4 in the FAQ:
> >>> ftp://ftp.cwru.edu/pub/bash/FAQ
>
> > Instead of ''commands | read var''
> > Use ''read var < <(commands)''
> > I hardly see a need to change the existing implementation.
>
> As mentioned in the FAQ, ''read var < <(commands)'' is not portable.
>
> All alternatives in the FAQ (portable or not) are less readable than a
> simple pipe. They are all more verbose and introduce an extra level of
> nesting when you have only one "command". They all need to be read
> "backwards" with respect to the execution flow. If you want to keep your
> code readable, they practically all force you to define a function for
> "commands" as soon as you have more than a few commands.
>
> Every entry in an FAQ is by mere definition a problem that many people
> wast... spend time on.
>
> It is admittedly not a question of life or death but some other shells
> apparently have it so why not bash? Just asking.

Why should it be the last element of a pipeline that is executed in
the current shell and not the first?

Suppose that I have a group of commands that sets some variables
and outputs information to the screen, for example (this is much
oversimplified):

{
    x=$(( $something * 2 ))
    printf "%d\n" "$x"
}

Now, I want to modify the output. I pipe it through a formatting
command:

{
    x=$(( $something * 2 ))
    printf "%d\n" "$x"
} | tr 0123456789 9876543210

All of a sudden, x is not set (or set to the wrong value). So it
should be the *first* command, not the last, that is executed in
the calling shell.
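
A runnable version of that scenario, as a sketch only. The starting value of `something` and the placeholder value of `x` are assumptions added for illustration, and the digits are spelled out for `tr` because some implementations reject a descending range such as 9-0:

```shell
#!/usr/bin/env bash
something=21             # hypothetical input value
x=unset-in-parent        # placeholder so the effect is visible afterwards

# The brace group is the first element of the pipeline, so it runs in a
# subshell and its assignment to x does not reach the calling shell.
{
    x=$(( something * 2 ))
    printf '%d\n' "$x"
} | tr 0123456789 9876543210

echo "x in the calling shell: $x"
```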

--
Chris F.A. Johnson, webmaster <http://woodbine-gerrard.com>
===================================================================
Author:
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
Pro Bash Programming: Scripting the GNU/Linux Shell (2009, Apress)


Chris F.A. Johnson

Nov 30, 2009, 9:56:07 AM
to bug-...@gnu.org
On Mon, 30 Nov 2009, Greg Wooledge wrote:

> On Mon, Nov 30, 2009 at 11:46:03AM +0100, Lhunath (Maarten B.) wrote:
> > Don't use pipelines to send streams to read. Use file redirection instead:
> >
> > Instead of ''command | read var''
> > Use ''read var < <(command)''
> >
> > I hardly see a need to change the existing implementation.
>
> Or for the original problem case, use a here string:
>
> IFS=: read a b <<< "1:2"
>
> Between process substitutions (the <(command) thing) and here strings,
> you should be able to do all your reads without subshells.

Or, to be portable, use a here document:

IFS=: read a b <<.
1:2
.

This works with the output of commands, too:

IFS=- read year month day <<.
$(date +%Y-%m-%d)
.
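
A self-contained sketch of the same idea that uses only POSIX sh features. Here `printf` stands in for `date` so that the result is fixed, and `EOF` is used as the delimiter instead of a lone dot:

```shell
# The command substitution is expanded inside the here-document, and read
# runs in the current shell, so year, month and day remain set afterwards.
IFS=- read year month day <<EOF
$(printf '%s\n' 2009-11-30)
EOF

echo "year=$year month=$month day=$day"
```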

Lhunath (Maarten B.)

Nov 30, 2009, 10:07:33 AM
to Chris F.A. Johnson, bug-...@gnu.org
On 30 Nov 2009, at 15:56, Chris F.A. Johnson wrote:
>
> On Mon, 30 Nov 2009, Greg Wooledge wrote:
>
>> On Mon, Nov 30, 2009 at 11:46:03AM +0100, Lhunath (Maarten B.) wrote:
>>> Don't use pipelines to send streams to read. Use file redirection instead:
>>>
>>> Instead of ''command | read var''
>>> Use ''read var < <(command)''
>>>
>>> I hardly see a need to change the existing implementation.
>>
>> Or for the original problem case, use a here string:
>>
>> IFS=: read a b <<< "1:2"
>>
>> Between process substitutions (the <(command) thing) and here strings,
>> you should be able to do all your reads without subshells.
>
> Or, to be portable, use a here document:
>
> IFS=: read a b <<.
> 1:2
> .
>
> This works with the output of commands, too:
>
> IFS=- read year month day <<.
> $(date +%Y-%m-%d)
> .

Note that 'read' is a bash feature; not a POSIX shell feature. In that sense, "read" alone is limiting your "portability". So portability in the meaning of POSIX is out of the question.

Perhaps you're talking about backward compatibility instead of portability, in which case the only compatibility gain you get from using the more verbose heredoc over the herestring is compatibility with pre-2.05b-alpha1 bash.

Hardly worth it.

Chris F.A. Johnson

Nov 30, 2009, 10:15:58 AM
to bug-...@gnu.org
On Mon, 30 Nov 2009, Lhunath (Maarten B.) wrote:

> On 30 Nov 2009, at 15:56, Chris F.A. Johnson wrote:
> >
> > On Mon, 30 Nov 2009, Greg Wooledge wrote:
> >
> >> On Mon, Nov 30, 2009 at 11:46:03AM +0100, Lhunath (Maarten B.) wrote:
> >>> Don't use pipelines to send streams to read. Use file redirection instead:
> >>>
> >>> Instead of ''command | read var''
> >>> Use ''read var < <(command)''
> >>>
> >>> I hardly see a need to change the existing implementation.
> >>
> >> Or for the original problem case, use a here string:
> >>
> >> IFS=: read a b <<< "1:2"
> >>
> >> Between process substitutions (the <(command) thing) and here strings,
> >> you should be able to do all your reads without subshells.
> >
> > Or, to be portable, use a here document:
> >
> > IFS=: read a b <<.
> > 1:2
> > .
> >
> > This works with the output of commands, too:
> >
> > IFS=- read year month day <<.
> > $(date +%Y-%m-%d)
> > .
>
> Note that 'read' is a bash feature; not a POSIX shell feature.

?????

The read command has been around since the early Bourne shells.

> In that sense, "read" alone is limiting your "portability". So
> portability in the meaning of POSIX is out of the question.

> Perhaps you're talking about backward compatibility instead of
> portability, in which case the only compatibility gain you get from
> using the more verbose heredoc over the herestring is compatibility
> with pre-2.05b-alpha1 bash.

> Hardly worth it.
>

--

Andreas Schwab

Nov 30, 2009, 10:16:21 AM
to Chris F.A. Johnson, bug-...@gnu.org
"Chris F.A. Johnson" <ch...@cfajohnson.com> writes:

> This works with the output of commands, too:
>
> IFS=- read year month day <<.
> $(date +%Y-%m-%d)
> .

The disadvantage is that the command is executed synchronously.
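
A rough way to see the difference, assuming bash (for `SECONDS` and process substitution). The `producer` function is a made-up stand-in for a slow command:

```shell
#!/usr/bin/env bash
# Hypothetical slow producer: emits one line, then keeps running for 2 s.
producer() { echo first; sleep 2; }

# Here-document plus command substitution: the shell has to collect all of
# the producer's output before read sees any of it.
SECONDS=0
read -r line_sync <<EOF
$(producer)
EOF
sync_elapsed=$SECONDS

# Process substitution: the producer runs concurrently, and read returns as
# soon as the first line arrives.
SECONDS=0
read -r line_async < <(producer)
async_elapsed=$SECONDS

echo "here-document:        '$line_sync' after ${sync_elapsed}s"
echo "process substitution: '$line_async' after ${async_elapsed}s"
```

The first read should only return once the producer has exited, while the second one returns while the producer is still sleeping.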

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Chet Ramey

Nov 30, 2009, 10:15:06 AM
to Lhunath (Maarten B.), Chris F.A. Johnson, bug-...@gnu.org, chet....@case.edu
Lhunath (Maarten B.) wrote:

> Note that 'read' is a bash feature; not a POSIX shell feature. In that sense, "read" alone is limiting your "portability". So portability in the meaning of POSIX is out of the question.

Pardon me? `read' is a feature of every historical shell and standardized
by Posix. The bash implementation is a superset of Posix.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU ch...@case.edu http://cnswww.cns.cwru.edu/~chet/


Lhunath (Maarten B.)

Nov 30, 2009, 10:19:37 AM
to chet....@case.edu, Chris F.A. Johnson, bug-...@gnu.org

My bad. I was under the impression `read` was a Bourne shell-only thing and not standardized under POSIX.

Marc Herbert

Nov 30, 2009, 11:21:33 AM
to bug-...@gnu.org
Chris F.A. Johnson wrote:

> Why should it be the last element of a pipeline that is executed in
> the current shell and not the first?


Because that's POSIX' choice?


Because the last element is the last one in the data stream. So it feels
more natural to get everything from the last element rather than side
effects from the first and stdout from the last.

> Suppose that I have a group of commands that sets some variables
> and outputs information to the screen, for example (this is much
> oversimplified):

Thanks for the example. I find this less common than using "read".

pk

Nov 30, 2009, 11:32:30 AM
to
Marc Herbert wrote:

> Chris F.A. Johnson wrote:
>> Why should it be the last element of a pipeline that is executed in
>> the current shell and not the first?
>
>
> Because that's POSIX' choice?

No, POSIX allows either behavior. In fact, it allows any behavior ranging
from running all parts in their own subshells, to running all parts in the
current shell.

pk

Nov 30, 2009, 11:35:04 AM
to
pk wrote:

>> Because that's POSIX' choice?
>
> No, POSIX allows either behavior. In fact, it allows any behavior ranging
> from running all parts in their own subshells, to running all parts in the
> current shell.

"...each command of a multi-command pipeline is in a subshell environment;
as an extension, however, any or all commands in a pipeline may be executed
in the current environment. All other commands shall be executed in the
current shell environment."
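
Since either behaviour is allowed, a script can probe which one the running shell implements. A minimal portable sketch; the variable names `probe` and `last_element` are invented for this example:

```shell
# If read runs in the current shell, probe is overwritten with "child";
# if it runs in a subshell, probe keeps its original value.
probe=parent
echo child | read probe

if [ "$probe" = child ]; then
    last_element="current shell"
else
    last_element="subshell"
fi

echo "last pipeline element runs in: $last_element"
```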

Greg Wooledge

Nov 30, 2009, 12:13:31 PM
to Marc Herbert, bug-...@gnu.org
On Mon, Nov 30, 2009 at 04:21:33PM +0000, Marc Herbert wrote:
> Chris F.A. Johnson wrote:
> > Why should it be the last element of a pipeline that is executed in
> > the current shell and not the first?

>
> Because that's POSIX' choice?

Because that's what Korn shell does. (But not pdksh, last time I checked.)


Jan Schampera

Nov 30, 2009, 3:13:00 PM
to Lhunath (Maarten B.), bug-...@gnu.org
Lhunath (Maarten B.) wrote:

> My bad. I was under the impression `read` was a Bourne shell-only
> thing and not standardized under POSIX.

(This is not aimed at you personally; I see it very often.)

It would be nice if people actually read POSIX before they talk about it.

Jan


Antonio Macchi

Dec 1, 2009, 3:35:20 AM
to gnu-ba...@moderators.isc.org
Юрий Пухальский wrote:
> Good day!
>
> There is a problem with the following code:
>
> echo a:b|IFS=: read a b; echo $a


This seems to work:

$ echo "a:b" | { IFS=":" read a b; echo $a; }
a
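
A slightly longer sketch of why the grouping helps and where it stops helping, assuming a shell such as bash that runs the group in a subshell. The initial value `before` is only there to make the effect visible:

```shell
# The braces make read and echo one pipeline element, so both run in the
# same (sub)shell and echo sees the values that read assigned.
a=before
echo "a:b" | { IFS=: read a b; echo "inside the group: a=$a b=$b"; }

# Once the pipeline has finished, the calling shell still has its own value.
echo "after the pipeline: a=$a"
```

Anything that needs the values therefore has to live inside the braces.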