IFS

Kenny McCormack

Feb 16, 2018, 12:26:54 PM

So, I want to set the current shell's parameters to be the same as those of
another process.

This works:

$ echo $BASH_VERSION
4.3.30(1)-release
$ IFS=$'\001';set -- $(tr '\0' '\001' < /proc/12345/cmdline);IFS=;echo $#
3
$

The output is the same number of params (plus 1, because of the usual argv[0])
as process 12345 has.

But then, what about this:

$ IFS=A;set -- aAbAc;echo $#
1
$ echo $1
a b c
$

Shouldn't $# be 3? And shouldn't $1 be either "a" or "aAbAc"? Where did
the spaces come from?

P.S. In fact, when I do:

$ echo "$1"
aAbAc
$

Hmmmm...

--
The last time a Republican cared about you, you were a fetus.

Kenny McCormack

Feb 16, 2018, 12:34:07 PM

In article <p6748o$hnk$1...@news.xmission.com>,
Kenny McCormack <gaz...@shell.xmission.com> wrote:
...

>P.S. In fact, when I do:
>
>$ echo "$1"
>aAbAc
>$
>
>Hmmmm...

Actually, some of this seems to be an artifact of not resetting IFS in this
example (as I did in the first example).

If I set IFS=, as I should have, then I get:

$ IFS=;echo $1
aAbAc
$

BTW, is there any difference (in practice) between IFS= and
IFS=<space><tab><newline> ?

--
Which of these is the crazier bit of right wing lunacy?
1) We've just had another mass shooting; now is not the time to be talking about gun control.

2) We've just had a massive hurricane; now is not the time to be talking about climate change.

Janis Papanagnou

Feb 16, 2018, 1:05:44 PM

On 16.02.2018 18:26, Kenny McCormack wrote:
> So, I want to set the current shell's parameters to be the same as that of
> another process.
>
> This works:
>
> $ echo $BASH_VERSION
> 4.3.30(1)-release
> $ IFS=$'\001';set -- $(tr '\0' '\001' < /proc/12345/cmdline);IFS=;echo $#
> 3
> $
>
> The output is the number of params (plus 1, because of the usual argv[0])
> as that of process 12345.
>
> But then, what about this:
>
> $ IFS=A;set -- aAbAc;echo $#
> 1
> $ echo $1
> a b c
> $

If the shell behaved the way you expect, the command line itself would be
tokenized as 'set -- a' 'b' 'c'

The point is that field splitting is applied only to the results of unquoted
expansions: ${}, $(), and $(()).

Try x=aAbAc and inspect $x (or set -- $x and inspect the arguments).
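
A minimal sketch of that suggestion, assuming a POSIX-style shell; the variable names literal_count and expanded_count are illustrative only and do not come from the original post:

```shell
# Sketch: literal words are never field-split; unquoted expansions are.
IFS=A
set -- aAbAc            # literal word: stays a single argument
literal_count=$#
x=aAbAc
set -- $x               # unquoted expansion: split on "A" into a, b, c
expanded_count=$#
unset IFS               # return to default splitting behavior
echo "literal: $literal_count, expanded: $expanded_count"
```

With IFS=A the literal word remains one argument, while the unquoted expansion of x is split into three.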

Janis

William Ahern

Feb 16, 2018, 4:45:12 PM

Kenny McCormack <gaz...@shell.xmission.com> wrote:
> So, I want to set the current shell's parameters to be the same as that of
> another process.
>
> This works:
>
> $ echo $BASH_VERSION
> 4.3.30(1)-release
> $ IFS=$'\001';set -- $(tr '\0' '\001' < /proc/12345/cmdline);IFS=;echo $#
> 3
> $
>
> The output is the number of params (plus 1, because of the usual argv[0])
> as that of process 12345.
>
> But then, what about this:
>
> $ IFS=A;set -- aAbAc;echo $#
> 1
> $ echo $1
> a b c
> $
>
> Shouldn't $# be 3? And shouldn't $1 be either "a" or "aAbAc"? Where did
> the spaces come from?

This is an odd bash field splitting issue specific to \001. I discovered it
years ago when testing the behavior of various shells when setting IFS to
\000, and then trying other control characters. I once tracked down the
relevant code but have long forgotten why \001 was treated differently.

This code prints 1

IFS="$(printf '\001')"
set -- $(printf "1${IFS}2${IFS}3")
printf "%d\n" $#

but this code prints 3

IFS="$(printf '\002')"
set -- $(printf "1${IFS}2${IFS}3")
printf "%d\n" $#

This code

I=0
while [ $I -le 32 ]; do
  IFS="$(printf "%b" "\0$(printf "%o" $I)")"
  set -- $(printf "1${IFS}2${IFS}3")
  printf '\\%.3o: %d\n' $I $#
  I=$(($I + 1))
done

splits the string into 3 fields for all control characters except
\000 (nul), \001 (soh), and \012 (nl).

For pdksh only \000 and \012 are exceptions. For zsh only \012. (\000 works
in zsh!) Why \000 and \012 might not work is understandable. \001 is a bash
quirk.

Kaz Kylheku

Feb 16, 2018, 4:58:04 PM

On 2018-02-16, William Ahern <wil...@25thandClement.com> wrote:
> This is an odd bash field splitting issue specific to \001. I discovered it
> years ago when testing the behavior of various shells when setting IFS to
> \000, and then trying other control characters. I once tracked down the
> relevant code but have long forgotten why \001 was treated differently.

Bash's expansion code uses internal hacks based on the hypothesis
"this character won't ever appear in user code/data, so we can use it as
a special marker".

Helmut Waitzmann

Feb 17, 2018, 12:48:44 AM

gaz...@shell.xmission.com (Kenny McCormack):

[…]

> $ IFS=A;set -- aAbAc;echo $#
> 1
> $ echo $1
> a b c
> $
>
> Shouldn't $# be 3?

Field splitting does not happen at the time the command line is
parsed. It takes place only when unquoted shell parameters are
evaluated (“$1” in that example).

The command line

“IFS=A; set -- aAbAc”

sets the positional parameters to be just one parameter having got
the value “aAbAc”. Therefore the command line

“echo "$#"”

yields 1.

In the command line

“echo $1”,

field splitting takes place on the value (“aAbAc”) of the unquoted
parameter expression “$1”, when it comes to constructing the
invocation arguments list of the “echo” utility. Therefore the
“echo” utility is given the four invocation arguments “echo”, “a”,
“b”, and “c”.

> And shouldn't $1 be either "a" or "aAbAc"? Where did the spaces
> come from?

The spaces come from the fact that “echo” outputs all of its
invocation arguments (except for the first) glued together with
spaces in between. Using the “bash” builtin “printf” rather than
“echo”, try the command lines

“printf '%q\n' "$1"”

and

“printf '%q\n' $1”.

The “bash” builtin “printf” format specifier “%q” tells “printf”
to output the corresponding parameter in a “bash”‐conformant
syntax that could be given to the “eval” builtin to get its value
back. It is also an excellent way to print any byte sequence in
an unambiguous way not only for “bash” but also for human readers.

> $ echo "$1"
> aAbAc
> $

Because “"$1"” is a quoted rather than an unquoted reference to
the value of the first positional parameter, there will be no
field splitting, thus “echo” will be invoked with the two
invocation arguments “echo” and “aAbAc”.
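
The quoted and unquoted cases described above can be checked side by side. This is only a sketch, assuming a POSIX-style shell; count_args is a hypothetical helper introduced here, not something from the thread:

```shell
count_args() { echo "$#"; }        # hypothetical helper: prints its argument count

IFS=A
set -- aAbAc                       # one positional parameter, value aAbAc
unquoted_count=$(count_args $1)    # $1 is split into a, b, c
quoted_count=$(count_args "$1")    # "$1" stays a single argument
unquoted_echo=$(echo $1)           # echo joins its three arguments with spaces
quoted_echo=$(echo "$1")           # echo receives the value unchanged
unset IFS                          # return to default splitting behavior
printf '%s\n' "$unquoted_count" "$quoted_count" "$unquoted_echo" "$quoted_echo"
```

The spaces in the unquoted echo output are added by echo itself when it joins its arguments; they are not part of the parameter's value.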

Helmut Waitzmann

Feb 17, 2018, 12:53:15 AM

gaz...@shell.xmission.com (Kenny McCormack):

> BTW, is there any difference (in practice) between IFS= and
> IFS=<space><tab><newline> ?

Using the “bash” builtin “printf”, try the following commands:

(
set '' a b c && shift &&
printf 'Number of positional parameters: %s\n' "$#" &&
printf '%s\n' 'The positional parameters, each' \
'of them in a line of its own:' &&
printf ' %q\n' "$@" &&
printf 'IFS=%q\n' "${IFS}" &&
printf 'The positional parameters, glued together\n'\
'by the first IFS character, if any (%q): %q\n' \
"${IFS+"${IFS%"${IFS#?}"}"}" "$*" &&
IFS= &&
printf 'IFS=%q\n' "${IFS}" &&
printf 'The positional parameters, glued together\n'\
'by the first IFS character, if any (%q): %q\n' \
"${IFS+"${IFS%"${IFS#?}"}"}" "$*"
)

printf '%s\n' '1 2 3' |
(
IFS= read one two three &&
printf '%q\n' "$one" "$two" "$three"
)

printf '%s\n' '1 2 3' |
(
read one two three &&
printf '%q\n' "$one" "$two" "$three"
)
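
A shorter sketch of the same comparison, assuming a POSIX-style shell and avoiding the bash-only “%q”; the variable names are illustrative. It contrasts an empty IFS with the default value of space, tab and newline:

```shell
words='1 2 3'

IFS=                                   # empty IFS: no field splitting at all
set -- $words
empty_ifs_count=$#                     # the whole string stays one field
set -- a b c
empty_ifs_star="$*"                    # joined with no separator

IFS=$(printf ' \t\nx'); IFS=${IFS%x}   # default IFS: space, tab, newline
set -- $words
default_ifs_count=$#                   # split on whitespace
set -- a b c
default_ifs_star="$*"                  # joined with the first IFS character (space)

printf '%s\n' "$empty_ifs_count" "$empty_ifs_star" \
  "$default_ifs_count" "$default_ifs_star"
```

The trailing x in the printf is there only to protect the newline from being stripped by command substitution; it is removed again with ${IFS%x}.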

Janis Papanagnou

Feb 17, 2018, 4:43:19 AM

On 17.02.2018 06:48, Helmut Waitzmann wrote:
> [...]

>
> Field splitting does not happen at the time the command line is
> parsed. It takes place only when unquoted shell parameters are
> evaluated (“$1” in that example).

Just to make it clear, it takes place not only when parameters are
expanded. As mentioned upthread, also with (unquoted) expansions
of $() - the OP's case that "worked" - and in $(()).
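
As a rough illustration, assuming a POSIX-style shell (the variable names cmdsub_count and arith_count are made up for this sketch), both kinds of expansion are split when left unquoted:

```shell
# Sketch: field splitting also applies to unquoted $() and $(()) results.
IFS=A
set -- $(echo aAbAc)     # command substitution output is split on "A"
cmdsub_count=$#
IFS=0
set -- $((100 + 1))      # the result 101 is split on "0" into "1" and "1"
arith_count=$#
unset IFS                # return to default splitting behavior
echo "command substitution: $cmdsub_count, arithmetic: $arith_count"
```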

Janis

>
> [...]

Kenny McCormack

Feb 17, 2018, 6:36:19 AM

In article <201802161...@kylheku.com>,
Kaz Kylheku <217-67...@kylheku.com> wrote:
>On 2018-02-16, William Ahern <wil...@25thandClement.com> wrote:
>> This is an odd bash field splitting issue specific to \001. I discovered it
>> years ago when testing the behavior of various shells when setting IFS to
>> \000, and then trying other control characters. I once tracked down the
>> relevant code but have long forgotten why \001 was treated differently.

I found Bill's post interesting, but not particularly relevant.

Because the thing is, it *DOES* work with IFS=Control/A. I know it does,
because I wrote a script that uses that trick and it has been running
correctly for years. The point here is that I'd be interested to hear
about quirks in using Control/A as the field separator, except for the fact
that it does work.

I was more curious about why it *DOESN'T* work with IFS=A.

>Bash's expansion code uses internal hacks based on the hypothesis
>"this character won't ever appear in user code/data, so we can use it as
>a special marker".

That all said, I'm not surprised to hear this.

--
"I have a simple philosophy. Fill what's empty. Empty what's full. And
scratch where it itches."

Alice Roosevelt Longworth

Helmut Waitzmann

Feb 17, 2018, 8:12:00 AM

William Ahern <wil...@25thandClement.com>:

> Kenny McCormack <gaz...@shell.xmission.com> wrote:

> This code
>
> I=0
> while [ $I -le 32 ]; do
> IFS="$(printf "%b" "\0$(printf "%o" $I)")"
> set -- $(printf "1${IFS}2${IFS}3")
> printf '\\%.3o: %d\n' $I $#
> I=$(($I + 1))
> done

… may set IFS to be the empty string, when

“[ $I -eq 10 ]”,

because command substitution may delete the trailing newline from
the output of the

“printf "%b" "\0$(printf "%o" $I)"”

command. Compare with the output of the following command:

(
  i=0 &&
  oldifs="${IFS-}" && ${IFS+:} unset oldifs &&
  while [ "$i" -le 32 ]
  do
    ifs="$(printf '%b.' '\'"$(printf '%.4o' "$i")")" &&
    ifs="${ifs%.}" &&
    printf '\n%s\n' \
      'Octal dump of "$ifs" (od -v -t o1):' &&
    printf '%s' "$ifs" | od -v -t o1 &&
    IFS="$ifs" &&
    set -- $(printf "1${IFS}2${IFS}3") &&
    IFS="${oldifs-}" && ${oldifs+:} unset IFS &&
    printf 'Number of positional parameters: %d\n' $#
    i=$(($i + 1))
  done
)

You might test it in various shells.
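
The newline-stripping effect on its own can be seen in a few lines. This is a sketch assuming a POSIX-style shell, with illustrative variable names:

```shell
# Sketch: command substitution removes trailing newlines from the output.
stripped=$(printf '\n')                  # the lone newline is removed
kept=$(printf '\n.'); kept=${kept%.}     # a guard character protects it
stripped_len=${#stripped}
kept_len=${#kept}
echo "stripped: $stripped_len, kept: $kept_len"
```

The guard character is the same trick used in the loop above: append “.” inside the substitution, then remove it with ${ifs%.}.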

Helmut Waitzmann

Feb 17, 2018, 8:13:43 AM

Janis Papanagnou <janis_pa...@hotmail.com>:

Yes. Thank you for your correction.

Barry Margolin

Feb 17, 2018, 7:51:27 PM

In article <p6943d$lvq$2...@news.xmission.com>,
gaz...@shell.xmission.com (Kenny McCormack) wrote:

> In article <201802161...@kylheku.com>,
> Kaz Kylheku <217-67...@kylheku.com> wrote:
> >On 2018-02-16, William Ahern <wil...@25thandClement.com> wrote:
> >> This is an odd bash field splitting issue specific to \001. I discovered it
> >> years ago when testing the behavior of various shells when setting IFS to
> >> \000, and then trying other control characters. I once tracked down the
> >> relevant code but have long forgotten why \001 was treated differently.
>
> I found Bill's post interesting, but not particularly relevant.
>
> Because the thing is, it *DOES* work with IFS=Control/A. I know it does,
> because I wrote a script that uses that trick and it has been running
> correctly for years. The point here is that I'd be interested to hear
> about quirks in using Control/A as the field separator, except for the fact
> that it does work.
>
> I was more curious about why it *DOESN'T* work with IFS=A.

Because it has nothing to do with IFS -- bash uses \001 internally in a
special way, and when you put this in IFS it coincidentally was the same
as the internal code.

IFS=A works for the normal use of IFS:

vars=aAbAcAd
IFS=A
set -- $vars
echo $#

prints 4

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Kenny McCormack

Feb 17, 2018, 9:11:46 PM

In article <barmar-B16DF1....@reader.eternal-september.org>,
Barry Margolin <bar...@alum.mit.edu> wrote:
>In article <p6943d$lvq$2...@news.xmission.com>,
> gaz...@shell.xmission.com (Kenny McCormack) wrote:
>
>> In article <201802161...@kylheku.com>,
>> Kaz Kylheku <217-67...@kylheku.com> wrote:
>> >On 2018-02-16, William Ahern <wil...@25thandClement.com> wrote:
>> >> This is an odd bash field splitting issue specific to \001. I
>> >> discovered it years ago when testing the behavior of various
>> >> shells when setting IFS to \000, and then trying other control
>> >> characters. I once tracked down the relevant code but have long
>> >> forgotten why \001 was treated differently.
>>
>> I found Bill's post interesting, but not particularly relevant.
>>
>> Because the thing is, it *DOES* work with IFS=Control/A. I know it
>> does, because I wrote a script that uses that trick and it has been
>> running correctly for years. The point here is that I'd be interested
>> to hear about quirks in using Control/A as the field separator,
>> except for the fact that it does work.
>>
>> I was more curious about why it *DOESN'T* work with IFS=A.
>
>Because it has nothing to do with IFS -- bash uses \001 internally in a
>special way, and when you put this in IFS it coincidentally was the same
>as the internal code.

OK, well, let's go with that.

If I understand where you're going with it, then this should print 3, but
it prints 1:

$ set -- "$(printf "A\001B\001C")";echo $#
1

(I.e., you seem to be arguing that because of the weird undocumented quirk
of bash using \001 internally, it should work w/o messing with IFS at all)

Yet, *this* works as expected:

$ oldIFS="$IFS";IFS=$'\001';set -- $(printf "A\001B\001C");IFS="$oldIFS";echo $#
3

So, which is it?

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/ForFoxViewers

Kenny McCormack

Feb 18, 2018, 3:27:09 AM

In article <p6ancu$ig0$1...@news.xmission.com>,
Kenny McCormack <gaz...@shell.xmission.com> wrote:
...

>$ set -- "$(printf "A\001B\001C")";echo $#
>1

Transcription error!

Should be:

$ set -- $(printf "A\001B\001C");echo $#
1

I had tested both with and without the quotes, but copy/pasted the wrong
test.

--
He continues to assert that 2 plus 2 equals 4, despite being repeatedly
told otherwise.