bash: how to split elements separated by a null character


Francis Moreau

Jun 30, 2011, 1:07:56 PM
Hello,

I have a file whose lines contain fields separated by null characters.

I'd like to parse those lines to retrieve each field.

Basically, I'd like to do this:

while IFS=$'\0' read a b c;
do
...
done <file

Could anybody give me some help here?

Thanks!

mhenn

Jun 30, 2011, 2:14:44 PM

It seems that this is not possible in normal shells.
Googling for "bash +ifs null" turned up this:

<http://unix.derkeiler.com/Newsgroups/comp.unix.shell/2007-05/msg00672.html>

and this

<http://unix.stackexchange.com/questions/7904/can-ifs-internal-field-separator-function-as-a-single-seperator-for-multiple-co>

where they suggest using sed or tr to translate the \0 characters into
something else that is parsable, or using zsh, which can set IFS to \0.
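
A minimal sketch of the translate-first approach in portable shell. It assumes the data never contains the ASCII unit separator (octal 037), which stands in for \0 here. The function name split_nul_fields and the three-field layout are only illustrative and do not come from the linked posts:

```shell
# Hypothetical helper: reads newline-terminated records whose fields are
# separated by NUL and prints the first three fields in angle brackets.
split_nul_fields() {
    us=$(printf '\037')        # assumed-safe stand-in delimiter
    tr '\0' '\037' |
    while IFS=$us read -r a b c; do
        printf '<%s> <%s> <%s>\n' "$a" "$b" "$c"
    done
}

printf 'a b\0c\0d e\n' | split_nul_fields
```

Because the stand-in delimiter is not whitespace, empty fields between two consecutive separators should be kept as empty strings rather than collapsed.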

Stephane CHAZELAS

Jun 30, 2011, 2:43:35 PM
2011-06-30, 20:14(+02), mhenn:

> On 30.06.2011 19:07, Francis Moreau wrote:
>> Hello,
>>
>> I have a file with lines having fields separated by null char.
>>
>> I'd like to parse those lines to retrieve each fields.
>>
>> Basicaly I'd like to do this:
>>
>> while IFS=$'\0' read a b c;
>> do
>> ...
>> done <file
>>
>> Could anybody give me some help here ?
>>
>> Thanks !
>
> It seems that this is not possible in normal shells.
[...]

In which way isn't zsh normal?

$'\0' is in zsh's default IFS.

$ echo 'a b\0\0c d\\\0e\0f g' | IFS=$'\0' read -A argv; printf '<%s>\n' "$@"
<a b>
<>
<c de>
<f g>

--
Stephane

goarilla

Jun 30, 2011, 3:14:40 PM

Or you could replace the \0 with something more manageable before
processing it with the shell.

Afterwards you can always change it back.

goarilla

Jun 30, 2011, 3:26:30 PM
On Thu, 30 Jun 2011 18:43:35 +0000, Stephane CHAZELAS wrote:

> 2011-06-30, 20:14(+02), mhenn:
>> On 30.06.2011 19:07, Francis Moreau wrote:
>>> Hello,
>>>
>>> I have a file with lines having fields separated by null char.
>>>
>>> I'd like to parse those lines to retrieve each fields.
>>>
>>> Basicaly I'd like to do this:
>>>
>>> while IFS=$'\0' read a b c;
>>> do
>>> ...
>>> done <file
>>>
>>> Could anybody give me some help here ?
>>>
>>> Thanks !
>>
>> It seems that this is not possible in normal shells.
> [...]
>
> In which way isn't zsh normal?
>

He probably wants to target the POSIX standard.

> $'\0' is in zsh's default IFS.
>
> $ echo 'a b\0\0c d\\\0e\0f g' | IFS=$'\0' read -A argv; printf '<%s>\n'

Can you explain this in more detail? From what I understand, the
third line of output should be <c \de>.

On my system it even outputs something more bizarre:

\u@\h:\w $ echo 'a b\0\0c d\\\0e\0f g' | IFS=$'\0' read -A argv; printf '<%s>\n'
<>

But as you can see, my environment isn't really set up for zsh anyway.

Stephane CHAZELAS

Jul 1, 2011, 2:22:15 AM
2011-06-30, 19:26(+00), goarilla:

> On Thu, 30 Jun 2011 18:43:35 +0000, Stephane CHAZELAS wrote:
>
>> 2011-06-30, 20:14(+02), mhenn:
>>> On 30.06.2011 19:07, Francis Moreau wrote:
>>>> Hello,
>>>>
>>>> I have a file with lines having fields separated by null char.
>>>>
>>>> I'd like to parse those lines to retrieve each fields.
>>>>
>>>> Basicaly I'd like to do this:
>>>>
>>>> while IFS=$'\0' read a b c;
>>>> do
>>>> ...
>>>> done <file
>>>>
>>>> Could anybody give me some help here ?
>>>>
>>>> Thanks !
>>>
>>> It seems that this is not possible in normal shells.
>> [...]
>>
>> In which way isn't zsh normal?
>
> He probably wants to target the POSIX standard.

If he were, he wouldn't be using $'...', which is not POSIX yet.
Note that zsh aims to be POSIX conformant when called as sh, just
like bash. Some earlier versions of Mac OS X even had zsh as
their /bin/sh.

>
>> $'\0' is in zsh's default IFS.
>>
>> $ echo 'a b\0\0c d\\\0e\0f g' | IFS=$'\0' read -A argv; printf '<%s>\n'
>
> can you explain this elaborately
> from what i understand the output of
> line 3 should be <c \de>

No, you'd need "read -r" for that. Like in every POSIX shell,
backslash is meant to escape the IFS characters, itself and the
line terminator unless -r is passed.
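
To illustrate the difference, here is a small sketch using a simplified input line of my own (not the original command) that contains a single literal backslash:

```shell
# The input line is:  c d\e   (printf turns \\ into one backslash)
printf 'c d\\e\n' | { read line;    printf '<%s>\n' "$line"; }   # backslash consumed
printf 'c d\\e\n' | { read -r line; printf '<%s>\n' "$line"; }   # backslash kept
```

The first command should print <c de> and the second <c d\e>.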

> on my system it even outputs something more bizarre
>
> \u@\h:\w $ echo 'a b\0\0c d\\\0e\0f g' | IFS=$'\0' read -A argv; printf '<%s>\n'
> <>

You forgot the "$@" (short for "$argv[@]") from my original posting.
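
A short sketch of why that matters, using made-up positional parameters rather than the result of the read above: with no arguments, printf fills %s with an empty string, and with "$@" it reuses the format once per argument.

```shell
printf '<%s>\n'              # no arguments: prints a single <>
set -- 'a b' '' 'c de'       # illustrative positional parameters
printf '<%s>\n' "$@"         # format is applied once per argument
```

The second printf should print <a b>, <> and <c de> on separate lines.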

> but as you can see my environment isn't really set up for zsh anyway

[...]

I can see PS1 is in your environment, which is wrong because its
syntax is shell-specific, but it's a common mistake, made even
by many OS distributions.

--
Stephane
