
Split line on tabs


Bit Twister

Oct 10, 2017, 7:34:51 PM
Spent way too many hours on this.

I can not get this script to split a string on the tab character.
Would some kind person show me where I screwed up.


#!/bin/bash
#**********************************************
#*
#* quick kludge to change
#* from wd1 wd2 remaining string
#* to _wd1_wd2_ remaining string
#*
#**********************************************
set -u

# _in_fn="/local/doc/unix.help"
_in_fn="unix.help"
_tab_fn="unix.help.tab"
_tmp_fn="unix.help.tmp"
_ch1=""
_ch2=""
_tail=""

echo 'line 1 echo ${stringZ:7:3}' > $_in_fn
echo 'line 2_ text string' >> $_in_fn
echo '_line 3 text string' >> $_in_fn
echo 'line4' >> $_in_fn
echo '_line 5_ text string' >> $_in_fn

cp --force /dev/null $_tmp_fn
unexpand --tabs=2 $_in_fn > $_tab_fn

while read -r line ; do
  set -- $(IFS=$'\t' ; echo $line)
  _wd="$(echo $1 | tr ' ' '_')"
  if [ $# -ne 0 ] ; then
    _tail=""
    if [ $# -gt 1 ] ; then
      shift
      _tail="\t$@"
    fi

    _ch1=${_wd:0:1}
    _ch2=${_wd:${#_wd}-1:1}
    if [ "$_ch1" != "_" ] ; then
      _wd="_${_wd}"
    fi
    if [ "$_ch2" != "_" ] ; then
      _wd="${_wd}_"
    fi

    echo -e ${_wd}${_tail} >> $_tmp_fn
  fi
done < $_tab_fn

cat $_tmp_fn

#**********************************************

Janis Papanagnou

Oct 10, 2017, 8:24:39 PM
On 11.10.2017 01:34, Bit Twister wrote:
> Spent way too many hours on this.
>
> I can not get this script to split a string on the tab character.
> Would some kind person show me where I screwed up.

I'm not sure what's the intention of your code below. If I want to
split strings on TAB in shell I'd probably just do

IFS=$'\t' read -A line # ksh
IFS=$'\t' read -a line # bash

used as in this ksh test case:

$ printf "abc\tdef ghi\tjkl mno\tpqr\n" |
IFS=$'\t' read -A line && set -- "${line[@]}" && printf "'%s'\n" "$@"
'abc'
'def ghi'
'jkl mno'
'pqr'

or use another tool like awk:
$ printf "abc\tdef ghi\tjkl mno\tpqr\n" |
awk -F$'\t' -vOFS=$'\n' '$1=$1'

depending on the application context.
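
For bash, a minimal sketch of the same test case. It assumes bash (not ksh), where each part of a pipeline normally runs in a subshell, so the line is fed to read with a here-string to keep the array in the current shell. The array name `fields` is only illustrative.

```shell
#!/bin/bash
# Split one TAB-separated line into an array with bash's "read -a".
# A here-string feeds read in the current shell, so the array is still
# set afterwards (with a pipe, bash would run read in a subshell).
IFS=$'\t' read -r -a fields <<< $'abc\tdef ghi\tjkl mno\tpqr'

# Print one quoted field per line; spaces inside a field are preserved.
printf "'%s'\n" "${fields[@]}"
```

The IFS assignment is a prefix to read, so it applies to that one command only and the shell's own IFS is left alone.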

Janis

Bit Twister

Oct 10, 2017, 8:56:29 PM
On Wed, 11 Oct 2017 02:24:34 +0200, Janis Papanagnou wrote:
>
> I'm not sure what's the intention of your code below. If I want to
> split strings on TAB in shell I'd probably just do

Intention of code was in the header. Note added characters on/in word(s)
before the tab.

Marek Novotny

Oct 10, 2017, 9:22:01 PM
Did you try putting $line in quotes, like this: "$line"?

I created a text file with a tab in it and used the following to read
line by line and translate a tab into an underscore.

#!/bin/bash

filename=$1

while IFS= read -r line
do
  echo "$line" | tr '\t' '_'
done < "$filename"

## END

Is that ultimately what you wanted to do?

also

cat file.txt | tr '\t' '_'

--
Marek Novotny
https://github.com/marek-novotny

Bit Twister

Oct 10, 2017, 9:34:32 PM
On Tue, 10 Oct 2017 20:21:52 -0500, Marek Novotny wrote:
>
> Did you try putting $line in quotes, like this: "$line"?
>
> I created a text file with a tab in it and used the following to read
> line by line and translate a tab into an underscore.

Hehehe, just save the file and run it.
It has the self contained test cases.

Janis Papanagnou

Oct 10, 2017, 9:51:15 PM
On 11.10.2017 02:56, Bit Twister wrote:
> On Wed, 11 Oct 2017 02:24:34 +0200, Janis Papanagnou wrote:
>>
>> I'm not sure what's the intention of your code below. If I want to
>> split strings on TAB in shell I'd probably just do
>
> Intention of code was in the header. Note added characters on/in word(s)
> before the tab.

I had read it but it was still obscure to me what you intend to do here.
Your stated requirements were "split a string on the tab character" and
"Split line on tabs", respectively. - Or is it something else you want?

Maybe (taken your kludge literally),

from=$'\twd1\twd2\t'
to=${from//$'\t'/_}

(just guessing).
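
A runnable sketch of that guess, assuming bash and writing the TAB pattern as an ANSI-C quoted string. The variable names are the ones used above.

```shell
#!/bin/bash
# Replace every TAB in a string with an underscore using bash's
# ${var//pattern/replacement} expansion.
from=$'\twd1\twd2\t'
to=${from//$'\t'/_}
printf '%s\n' "$to"
```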

Janis

Bit Twister

Oct 10, 2017, 9:56:02 PM
On Wed, 11 Oct 2017 03:51:13 +0200, Janis Papanagnou wrote:
>
> I had read it but it was still obscure to me what you intend to do here.
> Your stated requirements were "split a string on the tab character" and
> "Split line on tabs", respectively. - Or is it something else you want?

Example string in my /local/bin/unix.help file

go inside and poke around mock -r mageia-cauldron-i586 --shell

To become:

_go_inside_and_poke_around_ mock -r mageia-cauldron-i586 --shell

Just extract the script and run it in your tmp directory.

Marek Novotny

Oct 10, 2017, 10:09:54 PM
made a text file with this one line inside.

go inside and poke around

and used this to generate the output you mentioned:

echo _"$(cat file.txt | tr ' ' '_')"_

But if I understand what you're saying, you intend to split the line of
text by the tab and have the underscores affect the first half and not
the second half. Is that correct?

Janis Papanagnou

Oct 10, 2017, 10:11:15 PM
On 11.10.2017 03:55, Bit Twister wrote:
> On Wed, 11 Oct 2017 03:51:13 +0200, Janis Papanagnou wrote:
>>
>> I had read it but it was still obscure to me what you intend to do here.
>> Your stated requirements were "split a string on the tab character" and
>> "Split line on tabs", respectively. - Or is it something else you want?
>
> Example string in my /local/bin/unix.help file

(I don't have such a file available.)

>
> go inside and poke around mock -r mageia-cauldron-i586 --shell
^^^^^
Is there a TAB between the left and right part?

>
> To become:
>
> _go_inside_and_poke_around_ mock -r mageia-cauldron-i586 --shell

If you want to replace spaces with underscores in the first field of a
TAB-separated string (and prepend/append another underscore) you can try
this fragment

IFS=$'\t' read -r first rest &&
printf "%s\t%s\n" "_${first// /_}_" "${rest}"

If it's something else you want I'll bite.
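
As a sketch of how that fragment could be applied to every line of a file, assuming bash: the function name `mark_keywords` is hypothetical, and the sample line assumes a single TAB between the keywords and the command.

```shell
#!/bin/bash
# Hypothetical helper: read lines from stdin, turn the spaces of the first
# TAB-separated field into underscores, wrap that field in underscores,
# and pass the rest of the line through untouched.
mark_keywords() {
    local first rest
    while IFS=$'\t' read -r first rest; do
        printf '%s\t%s\n' "_${first// /_}_" "$rest"
    done
}

printf 'go inside and poke around\tmock -r mageia-cauldron-i586 --shell\n' |
    mark_keywords
```

Note that an empty input line comes out as two underscores followed by a TAB, and a first field that already starts or ends with an underscore gets a second one.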

Janis

Bit Twister

Oct 10, 2017, 10:19:02 PM
Here is the before and after desired results.

go inside and poke around mock -r mageia-cauldron-i586 --shell
_go_inside_and_poke_around_ mock -r mageia-cauldron-i586 --shell


Just extract the script and run it in tmp.
It creates the test cases and shows the results.

Bit Twister

Oct 10, 2017, 10:34:52 PM
On Wed, 11 Oct 2017 04:11:12 +0200, Janis Papanagnou wrote:
> On 11.10.2017 03:55, Bit Twister wrote:
>> On Wed, 11 Oct 2017 03:51:13 +0200, Janis Papanagnou wrote:
>>>
>>> I had read it but it was still obscure to me what you intend to do here.
>>> Your stated requirements were "split a string on the tab character" and
>>> "Split line on tabs", respectively. - Or is it something else you want?
>>
>> Example string in my /local/bin/unix.help file
>
> (I don't have such a file available.)
>
>>
>> go inside and poke around mock -r mageia-cauldron-i586 --shell
> ^^^^^
> Is there a TAB between the left and right part?
>
>>
>> To become:
>>
>> _go_inside_and_poke_around_ mock -r mageia-cauldron-i586 --shell
>
> If you want to replace underscores in the first word of a TAB-separated
> string (and prepending/appending another underscore) you can try this
> fragment
>
> IFS=$'\t' read -r first rest &&
> printf "%s\t%s\n" "_${first// /_}_" "${rest}"
>

Well shuckey dern, what an elegant solution using printf.


After adding in a blank line in test case I get.

$ t
_line_1_ echo ${stringZ:7:3}
_line_2__ text string
__line_3_ text string

_line4_
__line_5__ text string

Will have to add some code to keep from having double underscores.

Thank you and Marek for your time.

Ben Bacarisse

Oct 11, 2017, 5:53:13 AM
By all means thank others who have helped, but I don't think the
solution you are pleased with was from Marek -- not unless the quoting
has got all messed up.

--
Ben.

Bit Twister

Oct 11, 2017, 9:22:54 AM
On Wed, 11 Oct 2017 10:53:07 +0100, Ben Bacarisse wrote:
> Bit Twister <BitTw...@mouse-potato.com> writes:
>
>> On Wed, 11 Oct 2017 04:11:12 +0200, Janis Papanagnou wrote:
>>> On 11.10.2017 03:55, Bit Twister wrote:
>>>
>>> IFS=$'\t' read -r first rest &&
>>> printf "%s\t%s\n" "_${first// /_}_" "${rest}"
>>>
>>
>> Well shuckey dern, what an elegant solution using printf.

>>
>> Will have to add some code to keep from having double underscores.
>>
>> Thank you and Marek for your time.
>
> By all means thank others who have helped, but I don't think the
> solution you are pleased with was from Marek -- not unless the quoting
> has got all messed up.

Nope, check it out, or go check the parent post. "you" was to Janis. :)

After a little research the magic was in the ${first// /_} parameter expansion, not printf.


Current working solution:
#!/bin/bash
#********************************************************
#*
#* quick kludge to change
#* from keywords wd1 wd2 command and arguments
#* to _wd1_wd2_ command and arguments
#*
#********************************************************
set -u

_cmd=""
_in_fn="/local/doc/unix.help"
_in_fn="unix.help"
_keywd=""
_tab_fn=unix.help.tab
_tmp_fn=unix.help.tmp
_wd=""

if [ "$_in_fn" = "unix.help" ] ; then   # create test case input file
  echo 'line 1 echo ${stringZ:7:3}' > $_in_fn
  echo 'line 2_ text string' >> $_in_fn
  echo '_line 3 text string' >> $_in_fn
  echo "" >> $_in_fn
  echo 'line4' >> $_in_fn
  echo '_line 5_ text string' >> $_in_fn
fi
cp --force /dev/null $_tmp_fn
unexpand --tabs=2 $_in_fn > $_tab_fn

while IFS=$'\t' read -r _keywd _cmd ; do
  if [ -n "$_keywd" ] ; then
    _wd="${_keywd#_}"      # strip leading underscore
    _keywd="${_wd%_}"      # strip trailing underscore
    printf "%s\t%s\n" "_${_keywd// /_}_" "${_cmd}" >> $_tmp_fn
  else
    printf "\n" >> $_tmp_fn
  fi
done < $_tab_fn            # closing lines assumed, following the first script

cat $_tmp_fn

Ed Morton

Oct 11, 2017, 9:42:05 AM
I can't tell from the above what a sample input file would look like, nor what
the expected output would be, nor what problem you're having but this Q/A
describes the usual problem when trying to read tab-separated data with a shell
script and has a solution for it so hopefully it'll be useful:

https://stackoverflow.com/q/4622355/1745001

If not then please just post concise, testable sample input, expected output and
a clear statement of what you're trying to do and the problem you're having so
we can help you.

Obviously you shouldn't be trying to manipulate text with a shell loop anyway of
course, that's what awk is for. See https://unix.stackexchange.com/a/169765/133219.

Regards,

Ed.

Janis Papanagnou

Oct 11, 2017, 10:07:09 AM
On 11.10.2017 04:34, Bit Twister wrote:
> On Wed, 11 Oct 2017 04:11:12 +0200, Janis Papanagnou wrote:
[...]
>>
>> If you want to replace underscores in the first word of a TAB-separated
>> string (and prepending/appending another underscore) you can try this
>> fragment
>>
>> IFS=$'\t' read -r first rest &&
>> printf "%s\t%s\n" "_${first// /_}_" "${rest}"
>>
>
> Well shuckey dern, what an elegant solution using printf.

There are two effective parts here; a) the TAB-separated read, leaving
the rest of the line intact, and b) (as I see downthread you noticed),
the replacement variable substitution. The printf is only to re-create
(and show) the resulting (appropriately TAB-separated) string; it can
be embedded in a res=$(...) context, for example, or the variable parts
directly assigned to a variable (i.e. without the need of printf).
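
To illustrate the last point, a small sketch assuming bash. The names `res` and `res2` are only for the example, and the input line is made up.

```shell
#!/bin/bash
# Read the keyword part and the rest of a TAB-separated line.
IFS=$'\t' read -r first rest <<< $'line 1\techo hello'

# Direct assignment: no printf and no subshell needed.
res="_${first// /_}_"$'\t'"${rest}"

# The same string built with printf inside a command substitution.
res2=$(printf '%s\t%s' "_${first// /_}_" "${rest}")

printf '%s\n' "$res"
```

Both variables end up holding the same TAB-separated string.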

Janis

> [...]


Ed Morton

Oct 11, 2017, 10:35:50 AM
On 10/10/2017 9:11 PM, Janis Papanagnou wrote:
> On 11.10.2017 03:55, Bit Twister wrote:
>> On Wed, 11 Oct 2017 03:51:13 +0200, Janis Papanagnou wrote:
>>>
>>> I had read it but it was still obscure to me what you intend to do here.
>>> Your stated requirements were "split a string on the tab character" and
>>> "Split line on tabs", respectively. - Or is it something else you want?
>>
>> Example string in my /local/bin/unix.help file
>
> (I don't have such a file available.)
>
>>
>> go inside and poke around mock -r mageia-cauldron-i586 --shell
> ^^^^^
> Is there a TAB between the left and right part?
>
>>
>> To become:
>>
>> _go_inside_and_poke_around_ mock -r mageia-cauldron-i586 --shell
>
> If you want to replace underscores in the first word of a TAB-separated
> string (and prepending/appending another underscore) you can try this
> fragment
>
> IFS=$'\t' read -r first rest &&
> printf "%s\t%s\n" "_${first// /_}_" "${rest}"

Just be aware that'll compress tabs around empty fields and treat the whole
tab-null-tab string as if it was a single tab, e.g. with bash:

$ IFS=$'\t' read -r first second rest <<< "$(printf 'field1\t\tfield3\n')"
$ printf '<%s> ' "$first" "$second" "$rest"
<field1> <field3> <> $

when the output we'd have liked/expected would have been:

<field1> <> <field3> $

idk if that'll be a problem for the OP or not but the solution unfortunately is
non-trivial, see https://stackoverflow.com/q/4622355/1745001. Hence awk... :-).
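
One possible workaround, sketched here under assumptions and not taken from the linked answer: only IFS whitespace characters (space, TAB, newline) are collapsed, so translating the TABs to a non-whitespace delimiter before the read keeps empty fields. The sketch assumes bash and that the ASCII unit separator (octal 037) never occurs in the data.

```shell
#!/bin/bash
line=$'field1\t\tfield3'

tab=$'\t'
us=$'\037'                       # ASCII unit separator, assumed unused in the data

# Swap TABs for the unit separator, then split on that instead.
converted=${line//$tab/$us}
IFS=$us read -r first second rest <<< "$converted"

printf '<%s> ' "$first" "$second" "$rest"
echo
```

With this input the second field stays empty instead of being skipped.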

Ed

Bit Twister

Oct 11, 2017, 10:38:08 AM
On Wed, 11 Oct 2017 08:41:56 -0500, Ed Morton wrote:
> On 10/10/2017 6:34 PM, Bit Twister wrote:
>> Spent way too many hours on this.
>>

<snip>

>
> I can't tell from the above what a sample input file would look like,

I posted the code to create the input file. See

>> echo 'line 1 echo ${stringZ:7:3}' > $_in_fn
>> echo 'line 2_ text string' >> $_in_fn
>> echo '_line 3 text string' >> $_in_fn
>> echo 'line4' >> $_in_fn
>> echo '_line 5_ text string' >> $_in_fn


> nor what the expected output would be,

I had it in the header. See

>> #* quick kludge to change
>> #* from wd1 wd2 remaining string
>> #* to _wd1_wd2_ remaining string


> nor what problem you're having

Posted after the subject. See

>> I can not get this script to split a string on the tab character.
>> Would some kind person show me where I screwed up.



> but this Q/A describes the usual problem when trying to read
> tab-separated data with a shell script and has a solution for it so
> hopefully it'll be useful:
> https://stackoverflow.com/q/4622355/1745001

It is now, but the stackoverflow links I found on google did not work at
oh-dark-thirty in the morning. :(
Same links are working as I type this.


> If not then please just post concise, testable sample input,
> expected output and a clear statement of what you're trying to do
> and the problem you're having so we can help you.

Sorry, I thought it would be more productive to provide the script,
which creates the test case and dumps the output, and had assumed
the header of the script indicated the desired output.


> Obviously you
> shouldn't be trying to manipulate text with a shell loop anyway of
> course, that's what awk is for.

Well, I did try a few awk runs, and checked out some sed examples.
Could not do what I wanted.

> See https://unix.stackexchange.com/a/169765/133219.

only if it is up when I wanted access.

If you are the author of stackexchange or an awk wizard, paste the
following in a terminal
_in_fn="unix.help"
_tab_fn=unix.help.tab
_tmp_fn=unix.help.tmp

echo 'line 1 echo ${stringZ:7:3}' > $_in_fn
echo 'line 2_ text string' >> $_in_fn
echo '_line 3 text string' >> $_in_fn
echo "" >> $_in_fn
echo 'line4' >> $_in_fn
echo '_line 5_ text string' >> $_in_fn
unexpand --tabs=2 $_in_fn > $_tab_fn

and see if
$ cat $_tab_fn
line 1 echo ${stringZ:7:3}
line 2_ text string
_line 3 text string

line4
_line 5_ text string

Your one line awk commands here
comes out looking like
$ cat $_tmp_fn
_line_1_ echo ${stringZ:7:3}
_line_2_ text string
_line_3_ text string

_line4_
_line_5_ text string

Bit Twister

Oct 11, 2017, 10:51:01 AM
Yeah, but I had tried to use echo instead of printf and wound up with
all underscores where the tab should have been. :(

Then again that may have been a side effect using
unexpand $_in_fn > $_tab_fn
instead of
unexpand --tabs=2 $_in_fn > $_tab_fn

I made all sorts of attempts before posting my last attempt.
It would be nice to know exactly why the original script did not work
with the set command. I have had cases where processing variable field
lines will not work well using
while IFS=$'\t' read -r _keywd _cmd ; do

Ed Morton

Oct 11, 2017, 10:52:51 AM
I'm not sure if that last text is your desired output or bad output but is this
what you're trying to do?

$ awk 'BEGIN{FS=OFS="\t"} NF{gsub(/ /,"_",$1); gsub(/^_|_$/,"",$1); $1="_"$1"_"}
1' unix.help.tab
_line_1_ echo ${stringZ:7:3}
_line_2_ text string
_line_3_ text string

_line4_
_line_5_ text string


With tabs changed to `#`s for visibility:

$ cat unix.help.tab | tr '\t' '#'
line 1#echo ${stringZ:7:3}
line 2_##text string
_line 3### text string

line4
_line 5_####text string

$ awk 'BEGIN{FS=OFS="\t"} NF{gsub(/ /,"_",$1); gsub(/^_|_$/,"",$1); $1="_"$1"_"}
1' unix.help.tab | tr '\t' '#'
_line_1_#echo ${stringZ:7:3}
_line_2_##text string
_line_3_### text string

_line4_
_line_5_####text string
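
The same awk program spread over several lines with comments, as a sketch wrapped in a hypothetical shell function `wrap_keywords` that reads stdin. The sample input is made up and uses a single TAB per line.

```shell
#!/bin/bash
# Hypothetical wrapper around the awk one-liner, reading from stdin.
wrap_keywords() {
    awk '
    BEGIN { FS = OFS = "\t" }      # split and rejoin on single TABs
    NF {                           # only lines that have at least one field
        gsub(/ /, "_", $1)         # spaces to underscores in field 1
        gsub(/^_|_$/, "", $1)      # drop an existing leading or trailing underscore
        $1 = "_" $1 "_"            # add exactly one underscore at each end
    }
    1                              # print every line, changed or not
    '
}

printf 'line 1\techo hi\n_line 5_\ttext string\n\nline4\n' | wrap_keywords
```

Empty lines fail the NF test and are printed unchanged, which is why no double underscores appear for them.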

Regards,

Ed.

Bit Twister

Oct 11, 2017, 11:03:18 AM
On Wed, 11 Oct 2017 09:35:41 -0500, Ed Morton wrote:
> On 10/10/2017 9:11 PM, Janis Papanagnou wrote:

>>
>> If you want to replace underscores in the first word of a TAB-separated
>> string (and prepending/appending another underscore) you can try this
>> fragment
>>
>> IFS=$'\t' read -r first rest &&
>> printf "%s\t%s\n" "_${first// /_}_" "${rest}"
>
> Just be aware that'll compress tabs around empty fields and treat the whole
> tab-null-tab string as if it was a single tab. idk if that'll be a
> problem for the OP or not

It's not. It is a desired feature in this case. The 1,500+ lines in
the input file had the keywords separated from the command by a
variable number of spaces.

Ed Morton

Oct 11, 2017, 11:09:39 AM
I see from another comment you just made that you actually WANT the sequences of
tabs in the input compressed in the output. OK, just change FS to say "1 or more
tabs" instead of just "1 tab":

$ awk 'BEGIN{FS="\t+"; OFS="\t"} NF{gsub(/ /,"_",$1); gsub(/^_|_$/,"",$1);
$1="_"$1"_"} 1' unix.help.tab
_line_1_ echo ${stringZ:7:3}
_line_2_ text string
_line_3_ text string

_line4_
_line_5_ text string

Regards,

Ed.

Janis Papanagnou

Oct 11, 2017, 11:13:50 AM
On 11.10.2017 16:35, Ed Morton wrote:
> On 10/10/2017 9:11 PM, Janis Papanagnou wrote:
>> On 11.10.2017 03:55, Bit Twister wrote:
>>> On Wed, 11 Oct 2017 03:51:13 +0200, Janis Papanagnou wrote:
>>>>
>>>> I had read it but it was still obscure to me what you intend to do here.
>>>> Your stated requirements were "split a string on the tab character" and
>>>> "Split line on tabs", respectively. - Or is it something else you want?
>>>
>>> Example string in my /local/bin/unix.help file
>>
>> (I don't have such a file available.)
>>
>>>
>>> go inside and poke around mock -r mageia-cauldron-i586 --shell
>> ^^^^^
>> Is there a TAB between the left and right part?
>>
>>>
>>> To become:
>>>
>>> _go_inside_and_poke_around_ mock -r mageia-cauldron-i586 --shell
>>
>> If you want to replace underscores in the first word of a TAB-separated
>> string (and prepending/appending another underscore) you can try this
>> fragment
>>
>> IFS=$'\t' read -r first rest &&
>> printf "%s\t%s\n" "_${first// /_}_" "${rest}"
>
> Just be aware that'll compress tabs around empty fields and treat the whole
> tab-null-tab string as if it was a single tab, e.g. with bash:

Different application case. The OP wanted the first field to be processed,
and the rest of the line be unchanged. In the given case I don't think that
two TABs would be something else than a separator, say, one TAB a separator
the second TAB a prefix part of the value. YMMV. Not worth a discussion, IMO.

Janis

Janis Papanagnou

Oct 11, 2017, 11:18:51 AM
On 11.10.2017 16:50, Bit Twister wrote:
> [...]
>
> I made all sorts of attempts before posting my last attempt.
> It would be nice to know exactly why the original script did not work
> with the set command. I have had cases where processing variable field
> lines will not work well using
> while IFS=$'\t' read -r _keywd _cmd ; do

I would need to see those concrete cases to identify what is/was wrong.

(If you do massive text processing it's probably better [as I think was
already suggested in this thread] to use something like awk.)

Janis

Ed Morton

Oct 11, 2017, 11:19:02 AM
Yeah looks like you're right. I couldn't find the sample input, output, or
requirements in the question so thought it was probably about the usual issue
around reading tab-separated data with shell but it turns out that just for once
the default shell behavior is actually desirable!

Ed.

Ben Bacarisse

Oct 11, 2017, 11:25:44 AM
Bit Twister <BitTw...@mouse-potato.com> writes:

> On Wed, 11 Oct 2017 10:53:07 +0100, Ben Bacarisse wrote:
>> Bit Twister <BitTw...@mouse-potato.com> writes:
>>
>>> On Wed, 11 Oct 2017 04:11:12 +0200, Janis Papanagnou wrote:
>>>> On 11.10.2017 03:55, Bit Twister wrote:
>>>>
>>>> IFS=$'\t' read -r first rest &&
>>>> printf "%s\t%s\n" "_${first// /_}_" "${rest}"
>>>>
>>>
>>> Well shuckey dern, what an elegant solution using printf.
>
>>>
>>> Will have to add some code to keep from having double underscores.
>>>
>>> Thank you and Marek for your time.
>>
>> By all means thank others who have helped, but I don't think the
>> solution you are pleased with was from Marek -- not unless the quoting
>> has got all messed up.
>
> Nope, check it out, or go check the parent post. "you" was to Janis.
> :)

Ah, sorry. The usual phrase is "thank you" so I thought you were
thanking Marek with "and" being an editing error (mainly because I make
so many of those). "Thanks to you and Marek" would have been
unambiguous, but I should not have defaulted to assuming an error -- I
should have tried to find a reading that was the intended one. Sorry
about that.

<snip>
--
Ben.

Ed Morton

Oct 11, 2017, 11:26:39 AM
The original script said:

set -- $(IFS=$'\t' ; echo $line)

The `echo $line` will convert all tabs to blank chars as a result of you not
quoting `$line` because given `line="foo<tab>bar"` when you do `echo $line` that
is seen as `echo "foo" "bar"`. Compare with `echo "$line"` which is seen as
`echo "foo<tab>bar"`. So your set is trying to use tab-separated fields when
there are no tabs in the output of the echo command.

There may be other issues, idk..
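
A small sketch that makes the effect visible, assuming bash. The sample line and the variable names are made up for the illustration.

```shell
#!/bin/bash
line=$'go inside\tmock -r x'

# Original form: $line is unquoted inside the subshell, where IFS is a TAB,
# so it is split on the TAB and echo rejoins the pieces with a space.
flattened=$(IFS=$'\t'; echo $line)

# The outer "set --" then splits that TAB-free text on every space.
set -- $flattened
broken_count=$#

# Splitting $line itself with IFS set to TAB in the current shell.
oIFS=$IFS
IFS=$'\t'
set -- $line
IFS=$oIFS
fixed_count=$#

printf 'echo output: %s\n' "$flattened"
printf 'fields via echo: %d, fields via IFS split: %d\n' "$broken_count" "$fixed_count"
```

The first count is the number of space-separated words, the second the number of TAB-separated fields.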

Ed.

Ben Bacarisse

Oct 11, 2017, 11:36:25 AM
Ben Bacarisse <ben.u...@bsb.me.uk> writes:

> Bit Twister <BitTw...@mouse-potato.com> writes:
>
>> On Wed, 11 Oct 2017 10:53:07 +0100, Ben Bacarisse wrote:
>>> Bit Twister <BitTw...@mouse-potato.com> writes:
>>>
>>>> On Wed, 11 Oct 2017 04:11:12 +0200, Janis Papanagnou wrote:
>>>>> On 11.10.2017 03:55, Bit Twister wrote:
>>>>>
>>>>> IFS=$'\t' read -r first rest &&
>>>>> printf "%s\t%s\n" "_${first// /_}_" "${rest}"
>>>>>
>>>>
>>>> Well shuckey dern, what an elegant solution using printf.
>>
>>>>
>>>> Will have to add some code to keep from having double underscores.
>>>>
>>>> Thank you and Marek for your time.
>>>
>>> By all means thank others who have helped, but I don't think the
>>> solution you are pleased with was from Marek -- not unless the quoting
>>> has got all messed up.
>>
>> Nope, check it out, or go check the parent post. "you" was to Janis.
>> :)
>
> Ah, sorry.

Here's a stab at it in Perl by way of apology!

perl -pe 's/(.*)\t/ "_" . $1 =~ s| |_|gr . "_\t" /e'

It probably does not cover the corner cases, but I fancied trying a
nested substitution

<snip>
--
Ben.

Ed Morton

Oct 11, 2017, 11:40:40 AM
Not only doesn't it cover any corner cases, it doesn't produce the output you
wanted from the sample input you provided:

$ perl -pe 's/(.*)\t/ "_" . $1 =~ s| |_|gr . "_\t" /e' unix.help.tab
_line_1_ echo ${stringZ:7:3}
_line_2_ _ text string
__line_3 _ text string

line4
__line_5_ _ text string



Bit Twister

Oct 11, 2017, 11:49:49 AM
On Wed, 11 Oct 2017 17:18:47 +0200, Janis Papanagnou wrote:
> On 11.10.2017 16:50, Bit Twister wrote:
>> [...]
>>
>> I made all sorts of attempts before posting my last attempt.
>> It would be nice to know exactly why the original script did not work
>> with the set command. I have had cases where processing variable field
>> lines will not work well using
>> while IFS=$'\t' read -r _keywd _cmd ; do
>
> I would need to see those concrete cases to identify what is/was wrong.

Not so much wrong, as having to put about 13 variables in the while line
to get all the fields from the longest string and then decide where the
last field was filled in on short strings.
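
One way around a long variable list, sketched under the assumption that bash arrays are acceptable: read each line into an array and ask the array how many fields it holds. The function name `show_last_field` and the sample data are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report the field count and the last field of each
# TAB-separated line on stdin, however many fields a line has.
show_last_field() {
    local -a fields
    local n
    while IFS=$'\t' read -r -a fields; do
        n=${#fields[@]}
        (( n > 0 )) || continue          # skip empty lines
        printf '%d fields, last: %s\n' "$n" "${fields[n-1]}"
    done
}

printf 'a\tb\tc\td\nx\ty\n\nsolo\n' | show_last_field
```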

> (If you do massive text processing it's probably better [as I think was
> already suggested in this thread] to use something like awk.)

Thanks to Ed for his time and yours.

I now have 5 more entries in my unix.help file. :)

Bit Twister

Oct 11, 2017, 12:26:27 PM
On Wed, 11 Oct 2017 10:26:31 -0500, Ed Morton wrote:
> The original script said:
>
> set -- $(IFS=$'\t' ; echo $line)
>
> The `echo $line` will convert all tabs to blank chars as a result of
> you not quoting `$line` because given `line="foo<tab>bar"` when you
> do `echo $line` that is seen as `echo "foo" "bar"`. Compare with
> `echo "$line"` which is seen as `echo "foo<tab>bar"`. So your set is
> trying to use tab-separated fields when there are no tabs in the
> output of the echo command.

Ah, I see says the blind man. Is there a way to make the inline set
parse on a character. Example for =

line="wd1 wd3 = wd4"
set -- $(IFS='=' ; echo "$line")
echo "($1) and ($2)" to get (wd1 wd3) and (wd4)



I have been doing single first field processing and never knew
what I was doing. Example parsing field=value with
set -- $(IFS='=' ; echo $line)
gave me value in $2. Had line="wd1 wd2 = value" then I now see I would
get wd2. :(

I am sooooo glad all the system configuration files I munge after
install are that simple.

grep 'set --' * | grep IFS | wc -l
141
times I have been lucky until this script.

Thank you for making me see the light. :)

Bit Twister

Oct 11, 2017, 12:32:07 PM
On Wed, 11 Oct 2017 16:36:20 +0100, Ben Bacarisse wrote:
> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>
>> Bit Twister <BitTw...@mouse-potato.com> writes:
>>>
>>> Nope, check it out, or go check the parent post. "you" was to Janis.
>>> :)
>>
>> Ah, sorry.

Don't worry about it. Ed's reply seems to indicate it was me posting
the perl attempt.


> Here's a stab at it in Perl by way of apology!
>
> perl -pe 's/(.*)\t/ "_" . $1 =~ s| |_|gr . "_\t" /e'
>
> It probably does not cover the corner cases, but I fancied trying a
> nested substitution

As you see from Ed's reply it did not quite work as intended but
I would not mind having a working perl example.

I have done a few small programs in perl but nothing like what we are
doing here.

Ed Morton

Oct 11, 2017, 1:02:11 PM
On 10/11/2017 11:26 AM, Bit Twister wrote:
> On Wed, 11 Oct 2017 10:26:31 -0500, Ed Morton wrote:
>> The original script said:
>>
>> set -- $(IFS=$'\t' ; echo $line)
>>
>> The `echo $line` will convert all tabs to blank chars as a result of
>> you not quoting `$line` because given `line="foo<tab>bar"` when you
>> do `echo $line` that is seen as `echo "foo" "bar"`. Compare with
>> `echo "$line"` which is seen as `echo "foo<tab>bar"`. So your set is
>> trying to use tab-separated fields when there are no tabs in the
>> output of the echo command.
>
> Ah, I see says the blind man. Is there a way to make the inline set
> parse on a character. Example for =
>
> line="wd1 wd3 = wd4"
> set -- $(IFS='=' ; echo "$line")
> echo "($1) and ($2)" to get (wd1 wd3) and (wd4)


Sure:

line="wd1 wd3 = wd4"
oIFS="$IFS"; IFS='='; set -- $line; IFS="$oIFS"
echo "($1) and ($2)"

will output

(wd1 wd3 ) and ( wd4)

YMMV with what will happen given globbing chars and matching file names, etc.
within $line.
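
A sketch of guarding against the globbing part, assuming the script does not already run with `set -f` (the `set +f` at the end would otherwise turn globbing back on). The sample line with a `*` is made up.

```shell
#!/bin/bash
line="wd1 * = wd4"

oIFS=$IFS
IFS='='
set -f              # no pathname expansion while $line is unquoted
set -- $line
set +f
IFS=$oIFS

first=$1
second=$2
printf '(%s) and (%s)\n' "$first" "$second"
```

Without `set -f`, the `*` could be replaced by the names of files in the current directory before `set --` sees it.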

You should really consider using arrays instead of overwriting the positional
parameters in your scripts though:

line="wd1 wd3 = wd4"
oIFS="$IFS"; IFS='='; a=( $line ); IFS="$oIFS"
echo "(${a[0]}) and (${a[1]})"
(wd1 wd3 ) and ( wd4)

but you should also probably be using awk so that may be a moot point.

Ed.

Bit Twister

Oct 11, 2017, 1:38:18 PM
On Wed, 11 Oct 2017 12:02:02 -0500, Ed Morton wrote:
> On 10/11/2017 11:26 AM, Bit Twister wrote:
>>
>> Ah, I see says the blind man. Is there a way to make the inline set
>> parse on a character. Example for =
>>
>> line="wd1 wd3 = wd4"
>> set -- $(IFS='=' ; echo "$line")
>> echo "($1) and ($2)" to get (wd1 wd3) and (wd4)
>
>
> Sure:
>
> line="wd1 wd3 = wd4"
> oIFS="$IFS"; IFS='='; set -- $line; IFS="$oIFS"
> echo "($1) and ($2)"
>
> will output
>
> (wd1 wd3 ) and ( wd4)

I agree that is "inline" but I was trying to not change IFS outside
of the set command.
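
One way to keep the IFS change local to a single command, sketched with read and a here-string (bash assumed) instead of set. The names `key` and `value` are illustrative, and the two trimming lines are an addition to remove the blanks around the `=`.

```shell
#!/bin/bash
line="wd1 wd3 = wd4"

# IFS is changed for this one read command only.
IFS='=' read -r key value <<< "$line"

# Trim the blanks left around the "=" with plain parameter expansion.
key=${key%"${key##*[![:space:]]}"}          # drop trailing whitespace
value=${value#"${value%%[![:space:]]*}"}    # drop leading whitespace

echo "($key) and ($value)"
```

After the read, the shell's IFS is whatever it was before, so nothing needs to be saved and restored.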

Added your example to my help file.

> but you should also probably be using awk

Thinking that would be a bit ugly from a code maintenance standpoint.
Majority of time I am changing values for different lines.
Worst case so far is changing a same variable name value under
different sections. Example of just one change in a file with several other
directives to be changed. Simple example:

[Server]
blksize=4k

[Client]
blksize=2g <----- was 4k

[Loging]
blksize=4k
--------------------------

cat /etc/my.cnf to see how tweaking several parameters would be
pretty awk intensive coding wise.

Ed Morton

Oct 11, 2017, 2:55:22 PM
On 10/11/2017 12:38 PM, Bit Twister wrote:
> On Wed, 11 Oct 2017 12:02:02 -0500, Ed Morton wrote:
>> On 10/11/2017 11:26 AM, Bit Twister wrote:
>>>
>>> Ah, I see says the blind man. Is there a way to make the inline set
>>> parse on a character. Example for =
>>>
>>> line="wd1 wd3 = wd4"
>>> set -- $(IFS='=' ; echo "$line")
>>> echo "($1) and ($2)" to get (wd1 wd3) and (wd4)
>>
>>
>> Sure:
>>
>> line="wd1 wd3 = wd4"
>> oIFS="$IFS"; IFS='='; set -- $line; IFS="$oIFS"
>> echo "($1) and ($2)"
>>
>> will output
>>
>> (wd1 wd3 ) and ( wd4)
>
> I agree that is "inline" but I was trying to not change IFS outside
> of the set command.
>
> Added your example to my help file.
>
>> but you should also probably be using awk
>
> Thinking that would be a bit ugly from a code maintenance standpoint.

I get the impression you have been grossly misinformed about awk.

> Majority of time I am changing values for different lines.
> Worst case so far is changing a same variable name value under
> different sections. Example of just one change in a file with several other
> directives to be changed. Simple example:
>
> [Server]
> blksize=4k
>
> [Client]
> blksize=2g <----- was 4k
>
> [Loging]
> blksize=4k
> --------------------------
>
> cat /etc/my.cnf to see how tweaking several parameters would be
> pretty awk intensive coding wise.
>

I don't have a /etc/my.cnf file but not at all - any text manipulation you can
do in shell will be clearer, simpler, orders of magnitude more efficient and
better in almost every other way if you do it in awk instead of shell because
that's not what shell was designed to do and the guys who designed shell also
designed awk for shell to call to perform tasks like this.

If you post a specific question related to your last comment above with concise,
testable sample input and expected output then I'm sure someone can show you how
to do whatever it is you want to do.

Ed.

Ed Morton

Oct 11, 2017, 3:00:23 PM
Here's an example given one possible interpretation of your requirements for the
above:

$ cat file
[Server]
blksize=4k

[Client]
blksize=4k

[Loging]
blksize=4k

$ awk -v RS= -v ORS='\n\n' '/^\[Client\]/{sub(/=.*/,"=2g")}1' file
[Server]
blksize=4k

[Client]
blksize=2g

[Loging]
blksize=4k

Like I said, post a complete question if you'd like more help.
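
A different take on the same task, as a sketch only: it tracks the current section line by line instead of relying on blank lines between sections, and keeps whatever spacing follows the `=`. The function name `set_ini_value` is hypothetical. It assumes the key contains no regex metacharacters and the value no backslashes.

```shell
#!/bin/bash
# Hypothetical helper: set_ini_value SECTION KEY VALUE < file
# Assumes KEY has no regex metacharacters and VALUE has no backslashes.
set_ini_value() {
    awk -v section="$1" -v key="$2" -v value="$3" '
        # Remember which [section] we are in.
        /^\[.*\]$/ { current = substr($0, 2, length($0) - 2) }

        # Inside the wanted section, rewrite only the matching key and
        # keep everything up to and including the "=" and its blanks.
        current == section && ($0 ~ ("^[ \t]*" key "[ \t]*=")) {
            match($0, /=[ \t]*/)
            $0 = substr($0, 1, RSTART + RLENGTH - 1) value
        }

        { print }
    '
}

printf '[Server]\nblksize = 4k\n\n[Client]\nblksize = 4k\n\n[Loging]\nblksize = 4k\n' |
    set_ini_value Client blksize 2g
```

Only the blksize line under [Client] changes; the other two sections pass through as they are.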

Ed.

Bit Twister

Oct 11, 2017, 3:53:20 PM
On Wed, 11 Oct 2017 13:55:12 -0500, Ed Morton wrote:
> On 10/11/2017 12:38 PM, Bit Twister wrote:
>>
>> Thinking that would be a bit ugly from a code maintenance standpoint.
>
> I get the impression you have been grossly misinformed about awk.

Not "grossly misinformed" more like grossly ignorant of the possibilities.
All the awk I have seen on the net so far has been pretty cryptic one-liners.

>
>> Majority of time I am changing values for different lines.
>> Worst case so far is changing a same variable name value under
>> different sections. Example of just one change in a file with several other
>> directives to be changed. Simple example:
>>
>> [Server]
>> blksize = 4k
>>
>> [Client]
>> blksize = 2g <----- was 4k
>>
>> [Loging]
>> blksize = 4k
>> --------------------------
>>
>> cat /etc/my.cnf to see how tweaking several parameters would be
>> pretty awk intensive coding wise.
>>
>
> I don't have a /etc/my.cnf file



Sorry, my mistake, another bad assumption upon my part. Now that I
look, what is Microsoft OS name for your "Windows NT 6.1" install?

I do not have that version in my unix.help file.

$ uh _nt_
_Windows_NT_5.0_ Windows 2000
_Windows_NT_5.1_ Windows XP Home
_Windows_NT_/20060308_ Windows XP Pro SP2




> but not at all - any text manipulation you can
> do in shell will be clearer, simpler, orders of magnitude more efficient and
> better in almost every other way if you do it in awk instead of shell because
> that's not what shell was designed to do and the guys who designed shell also
> designed awk for shell to call to perform tasks like this.
>
> If you post a specific question related to your last comment above
> with concise, testable sample input and expected output then I'm
> sure someone can show you how to do whatever it is you want to do.

The above example was where I wanted to change only one directive out
of three.

A quick search of awk case got me
https://www.gnu.org/software/gawk/manual/html_node/Switch-Statement.html

which is pretty readable. I'll need to start playing with awk.

Ben Bacarisse

unread,
Oct 11, 2017, 3:53:47 PM10/11/17
to
In a reply to me, "you" would seem to mean me, but I provided no sample
input. I based the code on a verbal description from Janis that
elicited a positive response from the OP. I've gone back to look over
the thread and I still can't find a better description of what's wanted.

I later saw a remark about not getting duplicate underscores in which
case

perl -pe 's/(.*?\t)/ $1 =~ s!^_*|[\t _]+!_!gr /e'

might suit (though it will still reflect whatever misconceptions I had
about the task before), and the basic idea is now getting lost in the
details. (It also substitutes only up to the first tab, which I think
has also been talked about.)

The main value of the example was in the technique not the details. It
allows a replacement to be done in a part of a line matched by another
RE. That was an interesting idea to me. It may be that the technique
can't do what's wanted, but that is by no means obvious (to me!) at this
point.

<snip>
--
Ben.

Ed Morton

unread,
Oct 11, 2017, 4:10:43 PM10/11/17
to
On 10/11/2017 2:53 PM, Bit Twister wrote:
> On Wed, 11 Oct 2017 13:55:12 -0500, Ed Morton wrote:
>> On 10/11/2017 12:38 PM, Bit Twister wrote:
>>>
>>> Thinking that would a bit ugly from a code maintenance stand point.
>>
>> I get the impression you have been grossly misinformed about awk.
>
> Not "grossly misinformed" more like grossly ignorant of the possibilities.
> All the awk I have see on the net so far were pretty cryptic one liners.
>
>>
>>> Majority of time I am changing values for different lines.
>>> Worst case so far is changing a same variable name value under
>>> different sections. Example of just one change in a file with several other
>>> directives to be changed. Simple example:
>>>
>>> [Server]
>>> blksize = 4k
>>>
>>> [Client]
>>> blksize = 2g <----- was 4k
>>>
>>> [Loging]
>>> blksize = 4k
>>> --------------------------
>>>
>>> cat /etc/my.cnf to see how tweaking several parameters would be
>>> pretty awk intensive coding wise.
>>>
>>
>> I don't have a /etc/my.cnf file
>
>
>
> Sorry, my mistake, another bad assumption upon my part. Now that I
> look, what is Microsoft OS name for your "Windows NT 6.1" install?

I'm sorry I don't even know what that question means.

>
> I do not have that version in my unix.help file.
>
> $ uh _nt_
> _Windows_NT_5.0_ Windows 2000
> _Windows_NT_5.1_ Windows XP Home
> _Windows_NT_/20060308_ Windows XP Pro SP2
>
>
>
>
>> but not at all - any text manipulation you can
>> do in shell will be clearer, simpler, orders of magnitude more efficient and
>> better in almost every other way if you do it in awk instead of shell because
>> that's not what shell was designed to do and the guys who designed shell also
>> designed awk for shell to call to perform tasks like this.
>>
>> If you post a specific question related to your last comment above
>> with concise, testable sample input and expected output then I'm
>> sure someone can show you how to do whatever it is you want to do.
>
> The above example was where I wanted to change only one directive out
> of three.
>
> A quick search of awk case got me
> https://www.gnu.org/software/gawk/manual/html_node/Switch-Statement.html
>
> which is pretty readable. I'll need to start playing with awk.

switch statements are gawk-specific and don't actually add much value (I don't
recall ever having used one in the past 25 years of using awk whereas I've used
them thousands of times in shell, C, etc.).

Learn the awk paradigm first, not how to make awk constructs fit the paradigm
you use in other tools. Otherwise you risk being like the C programmer who
learns how to write C-like procedural programs in C++ but never learns the OO
programming, etc. that C++ makes available.

For example this in shell:

read line
case $line in
*foo* ) do_X ;;
*bar* ) do_Y ;;
* ) do_Z ;;
esac

could be written in GNU awk using a switch statement but it's likely that this
would be a more idiomatic awk approach:

/foo/ { do_X; next }
/bar/ { do_Y; next }
{ do_Z }
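
As a runnable sketch of the pattern/action form above (the printed labels are
placeholders standing in for do_X, do_Y and do_Z, and the sample lines are
invented for the illustration):

```shell
# Classify each input line the idiomatic awk way: one pattern/action
# pair per case, with "next" playing the role of shell's ";;".
out=$(printf 'a foo line\na bar line\nsomething else\n' |
  awk '
    /foo/ { print "X: " $0; next }   # stands in for do_X
    /bar/ { print "Y: " $0; next }   # stands in for do_Y
          { print "Z: " $0 }         # stands in for do_Z (the default)
  ')
printf '%s\n' "$out"
```

Without the "next" statements, a line containing both foo and bar would run
both actions, and every line would also reach the last one.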

Regards,

Ed.

Ed Morton

unread,
Oct 11, 2017, 4:12:39 PM10/11/17
to
Sorry, I thought you were the OP.

Ed.

Janis Papanagnou

unread,
Oct 11, 2017, 4:56:56 PM10/11/17
to
On 11.10.2017 22:10, Ed Morton wrote:
>> [...]
>
> switch statements are gawk-specific and don't actually add much value (I don't
> recall ever having used one in the past 25 years of using awk whereas I've
> used them thousands of times in shell, C, etc.).

You should be aware of the fact that GNU awk's 'switch' statement (unlike C)
supports regexps; a useful property. And you can also compare strings (which
needs bulky if-cascades in C); another useful property (if compared to C).

>
> [...]
>
> For example this in shell:
>
> read line
> case $line in
> *foo* ) do_X ;;
> *bar* ) do_Y ;;
> * ) do_Z ;;
> esac
>
> could be written in GNU awk using a switch statement but it's likely that this
> would be a more idiomatic awk approach:

This is certainly correct, but you can't assume shell's 'read' above in the
general case. Then your code below would rather become something like

line ~ /foo/ { do_X; next }
line ~ /bar/ { do_Y; next }
{ do_Z }

or (using switch)

switch (line) {
case /foo/: do_X; break;
case /bar/: do_Y; break;
default: do_Z;
}

The point to observe is that the switch argument need only be evaluated once,
whereas in if-cascades or pattern/action pairs you evaluate the variable with
every comparison. Conceptually 'switch' is a special case relying on a single
variable, and conceptually it is (in my book) also clearer (and a bit less
error prone) to express such comparisons using 'switch' in cases where you
intend to compare the same variable against different values.

Janis

Bit Twister

unread,
Oct 11, 2017, 5:20:40 PM10/11/17
to
On Wed, 11 Oct 2017 15:10:35 -0500, Ed Morton wrote:
> On 10/11/2017 2:53 PM, Bit Twister wrote:

>>
>> Sorry, my mistake, another bad assumption upon my part. Now that I
>> look, what is Microsoft OS name for your "Windows NT 6.1" install?
>
> I'm sorry I don't even know what that question means.

That is ok, I have my Usenet client (slrn) configured to give me a
clue as to what OS and Usenet client the poster is using.

Ok, a little googling suggests you are running Windows 7 Ultimate
Does that sound close to what OS you are running Thunderbird 52.3.0 on?


> switch statements are gawk-specific and don't actually add much
> value (I don't recall ever having used one in the past 25 years of
> using awk whereas I've used them thousands of times in shell, C,
> etc.).
>
> Learn the awk paradigm first, not how to make awk constructs fit the paradigm
> you use in other tools. Otherwise you risk being like the C programmer who
> learns how to write C-like procedural programs in C++ but never learns the OO
> programming, etc. that C++ makes available.

I understand, but all I can do is hunt through documents looking for
the statement in awk that I use in bash to see how to use it. :-/


> For example this in shell:
>
> read line
> case $line in
> *foo* ) do_X ;;
> *bar* ) do_Y ;;
> * ) do_Z ;;
> esac
>
> could be written in GNU awk using a switch statement but it's likely
> that this would be a more idiomatic awk approach:
>
> /foo/ { do_X; next }
> /bar/ { do_Y; next }
> { do_Z }

Yes, saw that during my Google search, but it is going to take a while
for that 'idiomatic' approach to become habit.

I like to do defensive coding, from your example
awk -v RS= -v ORS='\n\n' '/^\[Client\]/{

I worry that the code might not work if there is no blank line above
the [section] id. My bash script would case on the first character of
the line.

Then it is a simple case on the first word to get the correct section
to start modifying until I hit the next [.

Once I get a nice/complex awk script working I'll copy it to an
awk_skeleton file, which becomes my skeleton for starting a new script.

Bit Twister

unread,
Oct 11, 2017, 5:35:10 PM10/11/17
to
As a coder, I want "at a glance" easy to read code. When I worked for a
telephone company, the QA person would give me a lower grade in code
reviews because I used switch instead of nested ifs.

Ed Morton

unread,
Oct 11, 2017, 7:06:49 PM10/11/17
to
On 10/11/2017 3:56 PM, Janis Papanagnou wrote:
> On 11.10.2017 22:10, Ed Morton wrote:
>>> [...]
>>
>> switch statements are gawk-specific and don't actually add much value (I don't
>> recall ever having used one in the past 25 years of using awk whereas I've
>> used them thousands of times in shell, C, etc.).
>
> You should be aware of the fact that GNU awk's 'switch' statement (unlike C)
> supports regexps; a useful property. And you can also compare strings (which
> needs bulky if-cascades in C); another useful property (if compared to C).

Yes, I'm aware, I just don't see the benefits as outweighing the loss in
portability to other awks or simply the need for me (or anyone else reading the
code) to have to be aware of the syntax and semantics of switch statements.
Again, I understand the benefits but just don't find they're enough to use the
constructs. Consider this:

$ cat tst.awk
{
switch ($0) {
case /foo/: { print "got foo" }
case /bar/: { print "got bar" }
}
}

$ echo 'bar' | awk -f tst.awk
got bar

$ echo 'foo' | awk -f tst.awk
got foo
got bar

The first result makes sense but why the second? That seems very
counter-intuitive since `bar` isn't present and that isn't the behavior I'd get
from a similar statement in C or shell nor is it the behavior I'd get from awk
with the normal construct:

$ cat tst2.awk
/foo/ { print "got foo" }
/bar/ { print "got bar" }

$ echo 'bar' | awk -f tst2.awk
got bar

$ echo 'foo' | awk -f tst2.awk
got foo

I honestly don't know the answer and, most importantly, I don't have to think
about it since I don't use the construct :-).

Ed.

Ed Morton

unread,
Oct 11, 2017, 7:16:17 PM10/11/17
to
On 10/11/2017 4:20 PM, Bit Twister wrote:
> On Wed, 11 Oct 2017 15:10:35 -0500, Ed Morton wrote:
>> On 10/11/2017 2:53 PM, Bit Twister wrote:
>
>>>
>>> Sorry, my mistake, another bad assumption upon my part. Now that I
>>> look, what is Microsoft OS name for your "Windows NT 6.1" install?
>>
>> I'm sorry I don't even know what that question means.
>
> That is ok, I have my Usenet client (slrn) configured to give me a
> clue as to what OS and Usenet client the poster is using.
>
> Ok, a little googling suggests you are running Windows 7 Ultimate
> Does that sound close to what OS you are running Thunderbird 52.3.0 on?

Windows 7 Home Premium and doing all my work in cygwin.

>> switch statements are gawk-specific and don't actually add much
>> value (I don't recall ever having used one in the past 25 years of
>> using awk whereas I've used them thousands of times in shell, C,
>> etc.).
>>
>> Learn the awk paradigm first, not how to make awk constructs fit the paradigm
>> you use in other tools. Otherwise you risk being like the C programmer who
>> learns how to write C-like procedural programs in C++ but never learns the OO
>> programming, etc. that C++ makes available.
>
> I understand, but all I can do is hunt through documents looking for
> the statement in awk that I use in bash to see how to use it. :-/

No, you can ask questions here or in stackoverflow about "how do I do this in
awk". The answer will usually involve the same or equivalent constructs to those
you use in bash for the same task.

>> For example this in shell:
>>
>> read line
>> case $line in
>> *foo* ) do_X ;;
>> *bar* ) do_Y ;;
>> * ) do_Z ;;
>> esac
>>
>> could be written in GNU awk using a switch statement but it's likely
>> that this would be a more idiomatic awk approach:
>>
>> /foo/ { do_X; next }
>> /bar/ { do_Y; next }
>> { do_Z }
>
> Yes, saw that during my goggle search, but it is going to take awhile
> to find/get that 'idiomatic' approch to become habit.
>
> I like to do defensive coding, from your example
> awk -v RS= -v ORS='\n\n' '/^\[Client\]/{
>
> I worry that the code might not work if there is no blank line above
> the [section] id. My bash script would case on the first character of
> the line.

It would fail if there was no blank line BELOW the 2nd line. If it's possible
that blank line doesn't exist then you can use this approach instead:

awk 'f{sub(/=.*/,"=2g"); f=0} /^\[Client\]/{f=1} 1' file

Just like programming in bash, there are many ways to accomplish a task in awk
and the right one depends on many factors with a big one obviously being what
can your input file look like.
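
For the more general case raised earlier (change one key in one named section,
whether or not blank lines separate the sections), one possible sketch is to
track the current section name. The variable names sect, key and val are my
own, and it assumes key=value lines with no spaces around the "=":

```shell
# Sample input without blank lines between sections.
conf='[Server]
blksize=4k
[Client]
blksize=4k
[Loging]
blksize=4k'

out=$(printf '%s\n' "$conf" |
  awk -v sect='Client' -v key='blksize' -v val='2g' '
    # A line starting with "[" opens a new section: remember its name.
    /^\[/ { cur = substr($0, 2, index($0, "]") - 2) }
    # Inside the wanted section, rewrite the line that starts with key=.
    cur == sect && index($0, key "=") == 1 { $0 = key "=" val }
    { print }
  ')
printf '%s\n' "$out"
```

Only the blksize line under [Client] changes; the other sections pass through
untouched. Input with "blksize = 4k" (spaces around the "=") would need the
match and the replacement adjusted.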

> Then it is a simple case on the first word to get the correct section
> to start modifying until I hit the next [.
>
> Once I get a nice/complex awk script working I'll copy it to a
> awk_skeleton file which become my skeleton for starting a new script.

That's reasonable I guess as long as you just keep doing the same things over
and over again (but then why not write a common script?).

Ed.

Ed Morton

unread,
Oct 11, 2017, 7:18:33 PM10/11/17
to
On 10/11/2017 6:16 PM, Ed Morton wrote:
> On 10/11/2017 4:20 PM, Bit Twister wrote:
<snip>
>> I understand, but all I can do is hunt through documents looking for
>> the statement in awk that I use in bash to see how to use it.  :-/

Lost a "not" below, fixed now:

> No, you can ask questions here or in stackoverflow about "how do I do this in
> awk". The answer will usually _NOT_ involve the same or equivalent constructs to those

Bit Twister

unread,
Oct 11, 2017, 8:50:24 PM10/11/17
to
On Wed, 11 Oct 2017 18:16:09 -0500, Ed Morton wrote:
> On 10/11/2017 4:20 PM, Bit Twister wrote:
>>
>> Ok, a little googling suggests you are running Windows 7 Ultimate
>> Does that sound close to what OS you are running Thunderbird 52.3.0 on?
>
> Windows 7 Home Premium and doing all my work in cygwin.

Thank you.



>> Once I get a nice/complex awk script working I'll copy it to a
>> awk_skeleton file which become my skeleton for starting a new script.
>
> That's reasonable I guess as long as you just keep doing the same things over
> and over again (but then why not write a common script?).

awk_skeleton will just be a template file. The copy will be hacked to
do its new task/job, which can be completely different.

Kenny McCormack

unread,
Oct 11, 2017, 9:45:30 PM10/11/17
to
In article <slrnottf2d.c...@wb.home.test>,
Bit Twister <BitTw...@mouse-potato.com> wrote:
>On Wed, 11 Oct 2017 18:16:09 -0500, Ed Morton wrote:
>> On 10/11/2017 4:20 PM, Bit Twister wrote:
>>>
>>> Ok, a little googling suggests you are running Windows 7 Ultimate
>>> Does that sound close to what OS you are running Thunderbird 52.3.0 on?
>>
>> Windows 7 Home Premium and doing all my work in cygwin.
>
>Thank you.

Why should you care?

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/FreeCollege

Bit Twister

unread,
Oct 11, 2017, 10:44:59 PM10/11/17
to
On Thu, 12 Oct 2017 01:45:26 +0000 (UTC), Kenny McCormack wrote:
> In article <slrnottf2d.c...@wb.home.test>,
> Bit Twister <BitTw...@mouse-potato.com> wrote:
>>On Wed, 11 Oct 2017 18:16:09 -0500, Ed Morton wrote:
>>> On 10/11/2017 4:20 PM, Bit Twister wrote:
>>>>
>>>> Ok, a little googling suggests you are running Windows 7 Ultimate
>>>> Does that sound close to what OS you are running Thunderbird 52.3.0 on?
>>>
>>> Windows 7 Home Premium and doing all my work in cygwin.
>>
>>Thank you.
>
> Why should you care?

1. just wanted to get the OS to put it in my unix.help file.
2. I normally glance at it when I see the poster using google's news
client but the OS is not Microsoft.

If the user's OS is Linux and the news client is knode, I have to warn them
that my scripts may not run, because knode screws up the code when the
user tries to paste/run whatever I provided.

If I see the Win 7 OS and the user is having grief trying to dual/multi
boot one or more Linux installs, I can recommend installing Win 7 in
a VirtualBox guest and having only the Linux boot loader. I know that works
because that is my setup.

I click the desktop shortcut and about 16 seconds later I am using Win 7.

Janis Papanagnou

unread,
Oct 11, 2017, 10:47:35 PM10/11/17
to
On 12.10.2017 01:06, Ed Morton wrote:
> On 10/11/2017 3:56 PM, Janis Papanagnou wrote:
>> On 11.10.2017 22:10, Ed Morton wrote:
>>>> [...]
>>>
>>> switch statements are gawk-specific and don't actually add much value (I don't
>>> recall ever having used one in the past 25 years of using awk whereas I've
>>> used them thousands of times in shell, C, etc.).
>>
>> You should be aware of the fact that GNU awk's 'switch' statement (unlike C)
>> supports regexps; a useful property. And you can also compare strings (which
>> needs bulky if-cascades in C); another useful property (if compared to C).
>
> Yes, I'm aware, I just don't see the benefits as outweighing the loss in
> portability to other awks or simply the need for me (or anyone else reading
> the code) to have to be aware of the syntax and semantics of switch statements.

Portability can often be an issue, I agree. WRT the switch syntax, well, given
your statement it seems to be personal preference. I don't see much difference
in learning syntactical constructs of any programming languages, specifically
not in awk, which does not have a big set of control constructs. But YMMV, of course.

I used GNU awk's 'switch' in the past, and while I often missed and cursed C
for the inability to compare strings in 'switch' statements I appreciated
its availability in GNU awk all the more, and also its logical extension to
regexps.
Erm, in C there's also the 'break' statement, and if omitted in a 'switch'
branch the fall-through effect (that you observe in GNU awk as well) becomes
visible. It seems that the GNU awk people implemented 'switch' the same way
as in C so that folks who don't want new syntax and semantics here won't
have to adapt to something new. Actually, in GNU awk, only the possibility
to define /.../ and "..." values to compare against is an extension, and a
straightforward and obvious one I'd say.

To prevent the fall-through effect in both, C and GNU awk, you have to use
an explicit 'break' statement. Personally - having learned Pascal before C -
I was repelled by that ballast, but for most people coming from the C language
family with its, erm, comparatively ugly syntax it should not be an issue;
they already know it from either C, or C++, or Java, etc. etc. - and from gawk.
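
For illustration, here is a sketch of the earlier tst.awk with the 'break'
statements added. It assumes a gawk with 'switch' enabled (the default since
gawk 4.0, as far as I know); other awks will reject the syntax, hence the
guard around the call:

```shell
# gawk-only: each case ends with "break", so a matching line runs
# exactly one branch instead of falling through to the next.
if command -v gawk >/dev/null 2>&1; then
  out=$(printf 'foo\nbar\nbaz\n' |
    gawk '{
      switch ($0) {
      case /foo/: print "got foo"; break
      case /bar/: print "got bar"; break
      default:    print "got neither"
      }
    }')
else
  out='gawk not installed'
fi
printf '%s\n' "$out"
```

With the breaks in place, the input line foo no longer prints "got bar" as
well.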

You also mention shell behaviour here; note that shell syntax is influenced
by Algol 68; Algol 'case' constructs are only marginally comparable, but
(as in Pascal) they don't require 'break's. In shell there's the ';;' symbol
to 'break' a branch; as a small syntactical token it is less bulky compared
to C's (and gawk's) 'break', but it's there. There's also a fall-through
token ';&' invented by Kornshell decades ago (probably also available in
bash meanwhile, haven't tried) to achieve what you get in C and GNU awk,
i.e. the behaviour you have been puzzled about.

(We've been digressing it seems. :-)

Janis

Ed Morton

unread,
Oct 12, 2017, 12:33:00 AM10/12/17
to
On 10/11/2017 9:47 PM, Janis Papanagnou wrote:
<snip>
> Erm, in C there's also the 'break' statement, and if omitted in a 'switch'
> branch the fall-through effect

Ugh of course. I only wrote C daily for 30 years. Didn't engage my brain before
posting that...

Like I say, I really just don't see enough benefits to bother with gawk's switch
statement and you're right, it's just a preference.

Ed.

Thomas 'PointedEars' Lahn

unread,
Oct 19, 2017, 7:26:47 AM10/19/17
to
Bit Twister wrote:
^^^^^^^^^^^
It is considered polite here to post using your real name.

> On Tue, 10 Oct 2017 21:09:47 -0500, Marek Novotny wrote:
>> On 2017-10-11, Bit Twister <BitTw...@mouse-potato.com> wrote:
>> But if I understand what you're saying, you intend to split the line of
>> text by the tab and have the underscores affect the first half and not
>> the second half. Is that correct?
>
> Here is the before and after desired results.
>
> go inside and poke around mock -r mageia-cauldron-i586 --shell
> _go_inside_and_poke_around_ mock -r mageia-cauldron-i586 --shell

awk -F $'\t' '{ gsub(" ", "_", $1); printf "_%s_\t%s\n", $1, $2; }'

--
PointedEars

<https://github.com/PointedEars> | <http://PointedEars.de/wsvn/>
Twitter: @PointedEars2 | Please do not cc me./Bitte keine Kopien per E-Mail.

Bit Twister

unread,
Oct 19, 2017, 7:57:37 AM10/19/17
to
On Thu, 19 Oct 2017 13:26:41 +0200, Thomas 'PointedEars' Lahn wrote:
> Bit Twister wrote:
> ^^^^^^^^^^^
> It is considered polite here to post using your real name.
>
>> On Tue, 10 Oct 2017 21:09:47 -0500, Marek Novotny wrote:
>>> On 2017-10-11, Bit Twister <BitTw...@mouse-potato.com> wrote:
>>> But if I understand what you're saying, you intend to split the line of
>>> text by the tab and have the underscores affect the first half and not
>>> the second half. Is that correct?
>>
>> Here is the before and after desired results.
>>
>> go inside and poke around mock -r mageia-cauldron-i586 --shell
>> _go_inside_and_poke_around_ mock -r mageia-cauldron-i586 --shell
>
> awk -F $'\t' '{ gsub(" ", "_", $1); printf "_%s_\t%s\n", $1, $2; }'

Solutions have already been posted to the thread.
Yours does not work if there is more than one tab in the line.

Thomas 'PointedEars' Lahn

unread,
Oct 19, 2017, 8:28:36 AM10/19/17
to
Bit Twister wrote:
^^^^^^^^^^^
Which part of “real name” did you not understand?
That does not mean that they are better.

<https://en.wikipedia.org/wiki/Dunning–Kruger_effect>

> Yours does not work if there are more than one tab in the line.

Your example input did not include such a case. My solution can be easily
adapted.
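
One possible adaptation, as a sketch only: set both FS and OFS to a tab and
touch nothing but $1, so any number of later tabs survive. The two sub()
calls are my own addition to avoid doubled underscores when the key already
starts or ends with one, and the sample lines are taken from earlier in the
thread:

```shell
# Rewrite only the first tab-separated field; later fields and their
# tabs pass through untouched because FS and OFS are both a tab.
out=$(printf 'go inside and poke around\tmock -r mageia-cauldron-i586\t--shell\n_line 3\ttext string\n' |
  awk 'BEGIN { FS = OFS = "\t" }
       NF {
         gsub(/ /, "_", $1)      # spaces in the key become underscores
         sub(/^_*/, "_", $1)     # exactly one leading underscore
         sub(/_*$/, "_", $1)     # exactly one trailing underscore
       }
       { print }')
printf '%s\n' "$out"
```

The NF guard leaves empty lines alone instead of turning them into a bare
underscore.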

Next time, do your own homework.

Score adjusted.

Kaz Kylheku

unread,
Oct 19, 2017, 3:05:30 PM10/19/17
to
Plus, a few extra keystrokes to eliminate a gratuitous Bash extension:

awk -v FS='\t' ...

Kenny McCormack

unread,
Oct 19, 2017, 4:31:11 PM10/19/17
to
In article <201710191...@kylheku.com>,
$ printf "one\ttwo\tthree\n"|awk '-F\t' '{ for (i=1; i<=NF; i++) print i,$i}'
1 one
2 two
3 three
$

--
To my knowledge, Jacob Navia is not a Christian.

- Rick C Hodgin -

Kaz Kylheku

unread,
Oct 19, 2017, 5:03:07 PM10/19/17
to
On 2017-10-19, Kenny McCormack <gaz...@shell.xmission.com> wrote:
> In article <201710191...@kylheku.com>,
> Kaz Kylheku <217-67...@kylheku.com> wrote:
>>On 2017-10-19, Bit Twister <BitTw...@mouse-potato.com> wrote:
>>> On Thu, 19 Oct 2017 13:26:41 +0200, Thomas 'PointedEars' Lahn wrote:
>>>> awk -F $'\t' '{ gsub(" ", "_", $1); printf "_%s_\t%s\n", $1, $2; }'
>>>
>>> Solutions have already been posted to the thread.
>>> Yours does not work if there are more than one tab in the line.
>>
>>Plus, a few extra keystrokes to eliminate a gratuitous Bash extension:
>>
>> awk -v FS='\t' ...
>>
>
> $ printf "one\ttwo\tthree\n"|awk '-F\t' '{ for (i=1; i<=NF; i++) print i,$i}'
> 1 one
> 2 two
> 3 three

Good one; in fact POSIX says (right as the first item under OPTIONS)
that -F sepstring "shall be equivalent to -V FS=sepstring".
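
A quick way to see the equivalence for yourself, with the option spelled in
lowercase (-v) and an invented sample line:

```shell
# Both spellings set FS to a tab before the first record is read.
a=$(printf 'one\ttwo three\tfour\n' | awk -F '\t' '{ print NF ": " $2 }')
b=$(printf 'one\ttwo three\tfour\n' | awk -v FS='\t' '{ print NF ": " $2 }')
printf '%s\n%s\n' "$a" "$b"
```

Both commands report three fields and keep "two three" together as the second
one, with no shell-specific $'...' quoting needed.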

Janis Papanagnou

unread,
Oct 19, 2017, 6:53:48 PM10/19/17
to
s/Bash/{Ksh,Zsh,Bash}/

Janis

Thomas 'PointedEars' Lahn

unread,
Oct 21, 2017, 8:06:49 PM10/21/17
to
As you have realized in the meantime, “-F” is a feature of POSIX awk, so
there is no need for the more obscure “-v FS=…” (note that the option name
“-v” must be *lowercase*, different to your self-correction).

<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html>

$'\t' is longer than the *here*-sufficient '\t', and it does not appear to
be part of the POSIX Shell Grammar, but it is _not_ a Bash extension.
Rather, it may be one of the features that Bash inherited from KornShell93:

<http://web.archive.org/web/20151025145158/http://www2.research.att.com:80/sw/download/man/man1/ksh.html>
(archived version of what is unfortunately still dead-linked on
<http://www.kornshell.com/docs>)

or e.g. <https://docs.oracle.com/cd/E36784_01/html/E36870/ksh-1.html>

(it is not obvious; search for “$'”).

POSIX1-2008 allows the syntax $'…' in that it specifies:

,-
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06>
|
| […]
|
| If an unquoted '$' is followed by a character that is not one of the
| following:
|
| - A numeric character
| - The name of one of the special parameters (see Special Parameters)
| - A valid first character of a variable name
| - A <left-curly-bracket> ( '{' )
| - A <left-parenthesis>
|
| the result is unspecified.

--
PointedEars

Thomas 'PointedEars' Lahn

unread,
Oct 21, 2017, 8:22:27 PM10/21/17
to
As you have realized in the meantime, “-F” is a feature of POSIX awk, so
there is no need for the more obscure “-v FS=…” (note that the option name
“-v” must be *lowercase*, different to your self-correction).

<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html>

$'\t' is longer than the *here*-sufficient '\t', and it does not appear to
be part of the POSIX Shell Grammar, but it is _not_ a Bash extension.
Rather, it may be one of the features that Bash inherited from KornShell93:

<http://web.archive.org/web/20151025145158/http://www2.research.att.com:80/sw/download/man/man1/ksh.html>
(archived version of what is unfortunately still dead-linked on
<http://www.kornshell.com/docs>)

or e.g. <https://docs.oracle.com/cd/E36784_01/html/E36870/ksh-1.html>

It is not obvious, but if you search for “$'” in the manpage, you find under
“Parameter Expansion”:

| The following variables are used by the shell:
|
| […]
| TIMEFORMAT
| […] If unset, the default value
| $'\nreal\t%2lR\nuser\t%2lU\nsys\t%2lS', is used.

indicating that KornShell93 supports this. If you look further, you find

| Quoting
|
| […] A single quoted string preceded by an unquoted $ is processed as an
| ANSI-C string except for the following:
|
| \0 Causes the remainder of the string to be ignored.
| \E Equivalent to the escape character (ascii 033),
| \e Equivalent to the escape character (ascii 033),
| \cx Expands to the character control-x.
| \C[.name.]
| Expands to the collating element name.
|
| […] A $ in front of a double quoted string will be ignored in the "C" or
| "POSIX" locale, and may cause the string to be replaced by a locale
| specific string otherwise.

The latter is a very useful extension as it makes it easy to make
shellscripts locale-aware.

See also:

<https://www.gnu.org/software/gettext/manual/html_node/Preparing-Shell-Scripts.html#Preparing-Shell-Scripts>


POSIX1-2008 allows the syntaxes $'…' and $"…" in that it specifies:

,-
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06>
|
| […]
|
| If an unquoted '$' is followed by a character that is not one of the
| following:
|
| - A numeric character
| - The name of one of the special parameters (see Special Parameters)
| - A valid first character of a variable name
| - A <left-curly-bracket> ( '{' )
| - A <left-parenthesis>
|
| the result is unspecified.

--
PointedEars

Twitter: @PointedEars2

Pyt T.

unread,
Oct 23, 2017, 12:37:19 PM10/23/17
to
On Sun, 22 Oct 2017 02:22:21 +0200, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
> ..
> $'\t' is longer than the *here*-sufficient '\t', and it does not appear to
> be part of the POSIX Shell Grammar, but it is _not_ a Bash extension.
> Rather, it may be one of the features that Bash inherited from KornShell93:

ANSI C strings
http://www.gnu.org/software/bash/manual/html_node/ANSI_002dC-Quoting.html#ANSI_002dC-Quoting
The book "the new korn shell command and programming language" writes
"ANSI C Strings are available only on versions of ksh newer than the
11/16/88 version".

Janis Papanagnou

unread,
Oct 23, 2017, 12:56:38 PM10/23/17
to
On 23.10.2017 18:37, Pyt T. wrote:
> On Sun, 22 Oct 2017 02:22:21 +0200, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
>> ..
>> $'\t' is longer than the *here*-sufficient '\t', and it does not appear to
>> be part of the POSIX Shell Grammar, but it is _not_ a Bash extension.
>> Rather, it may be one of the features that Bash inherited from KornShell93:
>
> ANSI C strings
> http://www.gnu.org/software/bash/manual/html_node/ANSI_002dC-Quoting.html#ANSI_002dC-Quoting

Yes it's a bash extension, but neither a bash-specific extension nor a
feature invented by bash.

> The book "the new korn shell command and programming language" writes
> "ANSI C Strings are available only on versions of ksh newer than the
> 11/16/88 version".

Note that this may be relevant on historic Unixes and commercial Unixes
that only support an old ksh88. The 1993 version of ksh (which
has been commonly available for quite a long time now, and is also available
in some prominent commercial Unixes) supports it.
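
Where $'...' cannot be relied on (an old ksh88, or a plain POSIX sh), a tab
can still be had portably, for example via printf. A sketch, with variable
names of my own choosing:

```shell
# Build a literal tab without ANSI-C quoting.
tab=$(printf '\t')

# Split one line on tabs only; spaces inside a field are preserved.
out=$(printf 'abc\tdef ghi\tjkl mno\n' | {
  IFS=$tab read -r f1 f2 f3
  printf '%s|%s|%s\n' "$f1" "$f2" "$f3"
})
printf '%s\n' "$out"
```

One caveat: a tab counts as IFS whitespace, so consecutive tabs collapse and
empty fields are lost with this approach.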

Janis

Pyt T.

unread,
Oct 23, 2017, 1:20:07 PM10/23/17
to
I think the modern commercial Unixes ship with a version of ksh88 newer
than 11/16/88 nowadays so for this feature ksh93 is not necessary.

Kaz Kylheku

unread,
Oct 23, 2017, 2:20:44 PM10/23/17
to
Anything in Bash that is not in POSIX is a "Bash extension", regardless
of what else has it, and what had it first.

Thomas 'PointedEars' Lahn

unread,
Oct 23, 2017, 5:10:22 PM10/23/17
to
Pyt T. wrote:
^^^^^^
It is considered polite here to post using one’s real name.
^^
Which part of “KornShell93” did you not get?

Thomas 'PointedEars' Lahn

unread,
Oct 23, 2017, 5:10:37 PM10/23/17
to
Kaz Kylheku wrote:

> Anything in Bash that is not in POSIX is a "Bash extension", regardless
> of what else has it, and what had it first.

Nonsense.

Kaz Kylheku

unread,
Oct 23, 2017, 6:06:41 PM10/23/17
to
On 2017-10-23, Kaz Kylheku <217-67...@kylheku.com> wrote:
> Anything in Bash that is not in POSIX is a "Bash extension", regardless
^ documented as a feature
[of course]

Janis Papanagnou

unread,
Oct 23, 2017, 6:22:13 PM10/23/17
to
On 23.10.2017 23:10, Thomas 'PointedEars' Lahn wrote:
> Pyt T. wrote:
[...]
>>
>> ANSI C strings
>> http://www.gnu.org/software/bash/manual/html_node/ANSI_002dC-Quoting.html#ANSI_002dC-Quoting
>> The book "the new korn shell command and programming language" writes
>> "ANSI C Strings are available only on versions of ksh newer than the
>> 11/16/88 version".
> ^^
> Which part of “KornShell93” did you not get?

The general problem with this view is that between version "11/16/88" and
version "12/28/93" there have been features added with some patch levels.
So the [pathological] phrase "newer than the 11/16/88 version" does not
guarantee that it's not already been added in a ksh88 version before ksh93
and does not necessarily imply non-availability in some ksh88 versions.

In this concrete case, though, it is documented that ANSI-strings had been
added with the "12/28/93" version.

Janis

Janis Papanagnou

unread,
Oct 23, 2017, 6:22:28 PM10/23/17
to
On 23.10.2017 19:19, Pyt T. wrote:
>>[...]
>
> I think the modern commercial Unixes ship with a version of ksh88 newer
> than 11/16/88 nowadays so for this feature ksh93 is not necessary.

Unfortunately this is not guaranteed. You may find an older ksh88, or a
ksh88 version with extensions, or a ksh88 as default but ksh93 shipped
as well, etc.
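
Since the installed ksh cannot be assumed, a script could probe for the feature at run time. The following is only a sketch under that assumption; the function name supports_ansi_c_quoting is hypothetical.

```shell
# Hypothetical check: succeeds only if this shell expands $'\t' to a tab.
# A shell without ANSI-C quoting sees a literal "$" followed by the
# single-quoted two-character string \t.
supports_ansi_c_quoting() {
    [ "$(printf '%s' $'\t')" = "$(printf '\t')" ]
}

if supports_ansi_c_quoting; then
    echo "ANSI-C quoting is available"
else
    echo "no ANSI-C quoting here; use printf to build the tab instead"
fi
```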

Janis

Pyt T.

unread,
Oct 24, 2017, 1:20:20 AM10/24/17
to
Yes, I should have known.

Pyt T.

unread,
Oct 24, 2017, 1:34:51 AM10/24/17
to
On Mon, 23 Oct 2017 23:10:15 +0200, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
> Pyt T. wrote:
> ^^^^^^
> It is considered polite here to post using one’s real name.

I like your input but please stay on topic and do not make assumptions about
my name.

Thomas 'PointedEars' Lahn

unread,
Nov 10, 2017, 5:08:42 PM11/10/17
to
Pyt T. wrote:

> On Mon, 23 Oct 2017 23:10:15 +0200, Thomas 'PointedEars' Lahn
> <Point...@web.de> wrote:
>> Pyt T. wrote:
>> ^^^^^^
>> It is considered polite here to post using one’s real name.
>
> I like your input but please stay on topic

I did. You did not.

> and do not make assumptions about my name.

It is not an assumption. While “Pyt” might be a real first name, “T.” is
not a proper last name.

Eric

unread,
Nov 11, 2017, 4:40:07 AM11/11/17
to
On 2017-11-10, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
> Pyt T. wrote:
>
>> On Mon, 23 Oct 2017 23:10:15 +0200, Thomas 'PointedEars' Lahn
>> <Point...@web.de> wrote:
>>> Pyt T. wrote:
>>> ^^^^^^
>>> It is considered polite here to post using one’s real name.
>>
>> I like your input but please stay on topic
>
> I did. You did not.
>
>> and do not make assumptions about my name.
>
> It is not an assumption. While “Pyt” might be a real first name, “T.” is
> not a proper last name.

You really need to read this:

http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

Eric
--
ms fnd in a lbry

Thomas 'PointedEars' Lahn

unread,
Nov 11, 2017, 7:37:42 AM11/11/17
to
</killfile>
[You are a known troll at the bottom of my scorefile,
but for clarification I’ll bite this time.]

I already know this. If you had read what you cite, you would find that the
condition here is not described therein.

As the “.” in “T.” shows, it is *obviously* an abbreviation because the
paranoid poster did not want to reveal their last name, which they *have*
(otherwise this would look *quite* differently). If “Pyt T.” is even the
abbreviation of a real name.

HTH

<killfile>

Jim Beard

unread,
Nov 11, 2017, 9:33:49 AM11/11/17
to
On Sat, 11 Nov 2017 13:37:35 +0100, Thomas 'PointedEars' Lahn wrote:

> </killfile>
>
> Eric wrote:
>
>> On 2017-11-10, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
>>> Pyt T. wrote:
>>>> and do not make assumptions about my name.
>>> It is not an assumption. While “Pyt” might be a real first name, “T.”
>>> is not a proper last name.
>>
>> You really need to read this:
>>
>> http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
>
> [You are a known troll at the bottom of my scorefile,
> but for clarification I’ll bite this time.]
>
> I already know this. If you had read what you cite, you would find that
> the condition here is not described therein.
>
> As the “.” in “T.” shows, it is *obviously* an abbreviation because the
> paranoid poster did not want to reveal their last name, which they
> *have* (otherwise this would look *quite* differently). If “Pyt T.” is
> even the abbreviation of a real name.

Within the United States, your "real name" is the name you use as your
real name.

You can use as your entire name the capital letter T followed by a
period, and if you routinely use that as your name it is your name,
legally.

If it differs from your birth certificate name (or any other name you use
or have used -- yes, you can use more than one name) you can go to court
and have a judge decree it to be your legal name, but that is
superfluous. It merely makes your name recognized in an official public
record, but the name is no less valid if that official public record does
not exist.

Cheers!

jim b.

--
UNIX is not user-unfriendly, it merely expects users to be computer-friendly.

Eric

unread,
Nov 11, 2017, 10:40:08 AM11/11/17
to
On 2017-11-11, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
></killfile>
>
> Eric wrote:
>
>> On 2017-11-10, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
>>> Pyt T. wrote:
>>>> and do not make assumptions about my name.
>>> It is not an assumption. While “Pyt” might be a real first name, “T.” is
>>> not a proper last name.
>>
>> You really need to read this:
>>
>> http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
>
> [You are a known troll at the bottom of my scorefile,
> but for clarification I’ll bite this time.]

If you think I am a troll then I have a nice bridge to sell you :-)

In fact I don't post to provoke a response, I post when I have something
to say.

> I already know this. If you had read what you cite, you would find that the
> condition here is not described therein.

Admittedly the 40 points do not include one about punctuation. It is a
pity that the comments which were formerly there are gone. They included
other points, including the problems of people with names like
O'Donnell. In any case, to quote from just below the list, "This list is
by no means exhaustive."

> As the “.” in “T.” shows, it is *obviously* an abbreviation because the
> paranoid poster did not want to reveal their last name, which they *have*
> (otherwise this would look *quite* differently). If “Pyt T.” is even the
> abbreviation of a real name.
>
><killfile>

There is nothing obvious about it. It appears to conform to a well-known
convention for abbreviations, but that is not evidence that it *is* an
abbreviation.

In general it seems that you are very fond of rules, and of pointing out
other people's breaches of them, regardless of whether a rule is
universal or even applicable, and regardless of common sense.

Oh, and you read your killfile!

Happy November,

Ivan Shmakov

unread,
Nov 11, 2017, 11:32:28 AM11/11/17
to
>>>>> Jim Beard <jim....@verizon.net> writes:
>>>>> On Sat, 11 Nov 2017 13:37:35 +0100, Thomas 'PointedEars' Lahn wrote:

[...]

>> As the “.” in “T.” shows, it is *obviously* an abbreviation because
>> the paranoid poster did not want to reveal their last name, which
>> they *have* (otherwise this would look *quite* differently). If
>> “Pyt T.” is even the abbreviation of a real name.

> Within the United States, your “real name” is the name you use as
> your real name.

> You can use as your entire name the capital letter T followed by a
> period, and if you routinely use that as your name it is your name,
> legally.

> If it differs from your birth certificate name (or any other name you
> use or have used – yes, you can use more than one name) you can go to
> court and have a judge decree it to be your legal name, but that is
> superfluous. It merely makes your name recognized in an official
> public record, but the name is no less valid if that official public
> record does not exist.

… And as Usenet itself has originated in the US, and the “Big 8”
newsgroups still seem largely US-centric, I’d say that the
established legal practices there /are/ relevant to the matters
of netiquette.

(Yet it makes one wonder if Thomas could teach one Mr. T some good
manners. Or if George Sand was the epitome of rudeness.)

To think of it, my /real/ “real” name is written in Cyrillic
script, second name first, and has one more word at the end.
Should I feel shame of /not/ using it for 15 years now while
contributing to free software, Usenet, and a few random Web
resources?

(And I consider myself lucky enough not to be one named ప్లేటో, 太郎
or 刘, for either of these names would surely have brought all
kind of joy to an average Usenet Joe.)

--
FSF associate member #7257 np. The Longest Day — Out of the Shadows

Pyt T.

unread,
Nov 12, 2017, 9:06:02 AM11/12/17
to
On Fri, 10 Nov 2017 23:08:39 +0100, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
> Pyt T. wrote:
>
>> On Mon, 23 Oct 2017 23:10:15 +0200, Thomas 'PointedEars' Lahn
>> <Point...@web.de> wrote:
>>> Pyt T. wrote:
>>> ^^^^^^
>>> It is considered polite here to post using one’s real name.
>>
>> I like your input but please stay on topic
>
> I did. You did not.

Huh?

RFC1855 says:
"Messages and articles should be brief and to the point. Don't
wander off-topic, don't ramble and don't send mail or post
messages solely to point out other people's errors in typing
or spelling. These, more than any other behavior, mark you
as an immature beginner."


>> and do not make assumptions about my name.
>
> It is not an assumption. While “Pyt” might be a real first name, “T.” is
> not a proper last name.

What do you mean with "real name"? Is "Thomas 'PointedEars' Lahn" your
real name?

Thomas 'PointedEars' Lahn

unread,
Nov 12, 2017, 10:43:43 AM11/12/17
to
Jim Beard wrote:

> Within the United States, your "real name" is the name you use as your
> real name.

That is simply untrue. It is also irrelevant.

<https://en.wikipedia.org/wiki/Legal_name#United_States>

IANAL.

Thomas 'PointedEars' Lahn

unread,
Nov 12, 2017, 10:57:22 AM11/12/17
to
Pyt T. wrote:

> On Fri, 10 Nov 2017 23:08:39 +0100, Thomas 'PointedEars' Lahn
> <Point...@web.de> wrote:
>> Pyt T. wrote:
>>> On Mon, 23 Oct 2017 23:10:15 +0200, Thomas 'PointedEars' Lahn
>>> <Point...@web.de> wrote:
>>>> Pyt T. wrote:
>>>> ^^^^^^
>>>> It is considered polite here to post using one’s real name.
>>>
>>> I like your input but please stay on topic
>>
>> I did. You did not.
>
> Huh?
>
> RFC1855 says:
> "Messages and articles should be brief and to the point. Don't
> wander off-topic, don't ramble and don't send mail or post
> messages solely to point out other people's errors in typing
> or spelling. These, more than any other behavior, mark you
> as an immature beginner."

But I did not do any of those things. *You* made it all off-topic when you
did not refer at all to the *on-topic* content of my posting further below.

>>> and do not make assumptions about my name.
>>
>> It is not an assumption. While “Pyt” might be a real first name, “T.” is
>> not a proper last name.
>
> What do you mean with "real name"?

The name that can be found in your legal documents, i.e. your birth
certificate, identity card, driver’s license, passport aso. If you have
second, third aso. names (as opposed to only a first name), they are
optional here (because storage and display space is limited), unless you
think it helps using them to distinguish your postings from other postings
of people with similar names. Many people use initials for their second,
third aso. name in order to keep their real name short.

> Is "Thomas 'PointedEars' Lahn" your real name?

Read my signature. Also, Google is your friend. [psf 6.1] „Thomas Lahn“ is
my real name. “PointedEars” is my nickname, under which I want to be known
(for rather obvious reasons) and I am known on the Net; it is how I can be
distinguished on the Net from other Thomas Lahns. It is therefore customary
to insert one’s nickname this way, if one wants to use any. Another
possibility is to append the nickname in parentheses.

Did you really not know that?

Thomas 'PointedEars' Lahn

unread,
Nov 12, 2017, 11:09:11 AM11/12/17
to
Since we are off-topic anyway…

Ivan Shmakov wrote:

>>>>>> Jim Beard <jim....@verizon.net> writes:
>>>>>> On Sat, 11 Nov 2017 13:37:35 +0100, Thomas 'PointedEars' Lahn wrote:

<http://www.netmeister.org/news/learn2quote.html>

> > Within the United States, your “real name” is the name you use as
> > your real name.
> > [more irrelevant nonsense]
>
> … And as Usenet itself has originated in the US, and the “Big 8”
> newsgroups still seem largely US-centric, I’d say that the
> established legal practices there /are/ relevant to the matters
> of netiquette.

Ex falso quodlibet. Also, you are not at all in a position to teach others
manners as you are willfully ignoring Usenet conventions, starting with
your irritating use of prefixes which are supposed to indicate quotation
levels.

> (Yet it makes wonder if Thomas could teach one Mr. T some good
> manners. Or if George Sand was the epitome of rudeness.)

Both are real names in the way this term is meant.

Besides, Mr. T’s name does not end with a period, and he is not likely to
post on Usenet; neither is George Sand, for a different reason.

You are just trolling again. I can promise you that if you keep this up,
you will end in at least one killfile – is that *really* what you want?

Score adjusted, F’up2 poster

Ivan Shmakov

unread,
Nov 12, 2017, 11:24:47 AM11/12/17
to
>>>>> Thomas 'PointedEars' Lahn <Point...@web.de> writes:
>>>>> Jim Beard wrote:

>> Within the United States, your "real name" is the name you use as
>> your real name.

> That is simply untrue.

The Wikipedia article section you've referenced below seems to
support Jim's position. For instance:

Most state courts have held that a legally assumed name (i. e., for
a non-fraudulent purpose) is a legal name and usable as their true
name, though assumed names are often not considered the person's
technically true name.

> It is also irrelevant.

I'd think it would've been beneficial to the community if you'd
taken the effort to explain the reasoning behind your statement.

(Just a suggestion, mind you. It isn't my intent to teach
anyone manners in this group, or other comp.* ones I follow, for
we have more than enough volunteers on that task already.)

> <https://en.wikipedia.org/wiki/Legal_name#United_States>

> IANAL.

--
FSF associate member #7257 http://am-1.org/~ivan/

Pyt T.

unread,
Nov 12, 2017, 11:58:42 AM11/12/17
to
On Sun, 12 Nov 2017 16:57:16 +0100, Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
> Pyt T. wrote:
>> On Fri, 10 Nov 2017 23:08:39 +0100, Thomas 'PointedEars' Lahn
>> <Point...@web.de> wrote:
>>> Pyt T. wrote:
>>>> On Mon, 23 Oct 2017 23:10:15 +0200, Thomas 'PointedEars' Lahn
>>>> <Point...@web.de> wrote:
>>>>> Pyt T. wrote:
>>>>> ^^^^^^
>>>>> It is considered polite here to post using one’s real name.
>>>>
>>>> I like your input but please stay on topic
>>>
>>> I did. You did not.
>>
>> Huh?
>>
>> RFC1855 says:
>> "Messages and articles should be brief and to the point. Don't
>> wander off-topic, don't ramble and don't send mail or post
>> messages solely to point out other people's errors in typing
>> or spelling. These, more than any other behavior, mark you
>> as an immature beginner."
>
> But I did not do any of those things. *You* made it all off-topic when you
> did not refer at all to the *on-topic* content of my posting further below.

This is a group about unix shell(s) and you start commenting about the
"From" field in my message(s) which is pretty off topic here.
Then you ask a question that was already "answered" in
<osl8bv$jfj$1...@gioia.aioe.org>.

>>>> and do not make assumptions about my name.
>>> It is not an assumption. While “Pyt” might be a real first name, “T.” is
>>> not a proper last name.
>>
>> What do you mean with "real name"?
>
> The name that can be found in your legal documents, ..
> ..
>> Is "Thomas 'PointedEars' Lahn" your real name?
>
> Read my signature. Also, Google is your friend. [psf 6.1] „Thomas Lahn“ is

I see "PointedEars" in your sig, and I do not use twitter. I do not see
what your signature has to do with your real name except that you put
that nick in between your first and last name, but to know that I
also have to inspect your "From" header field.

> my real name. “PointedEars” is my nickname, under which I want to be known
> (for rather obvious reasons) and I am known on the Net; it is how I can be
> distinguished on the Net from other Thomas Lahns. It is therefore customary
> to insert one’s nickname this way, if one wants to use any. Another
> possibility is to append the nickname in parentheses.
>
> Did you really not know that?

I never felt the need to google for Thomas Lahn. Actually, most of the
time the headers from usenet messages do not interest me except the
subject and maybe the date and sometimes the From field may be fun to
watch, I know there are people using that field for kill files.

You consider it polite here (this group? usenet? internet?) to post using one’s
real name, why? Nicknames are very, very common on usenet. I admit "Pyt
T." is a nickname. If you want to know for whatever reason my real name
just send me a mail, my address is in the "From" field.

Thomas 'PointedEars' Lahn

unread,
Nov 12, 2017, 1:02:27 PM11/12/17
to
Pyt T. wrote:

> On Sun, 12 Nov 2017 16:57:16 +0100, Thomas 'PointedEars' Lahn
> <Point...@web.de> wrote:
>> *You* made it all off-topic when you did not refer at all to the
>> *on-topic* content of my posting further below.
>
> This is a group about unix shell(s) and you start commenting about the
> "From" field in my message(s) which is pretty off topic here.

This was *just one line* of my posting. (OK, two, with the markers.)

> Then you ask a question that was already "answered" in
> <osl8bv$jfj$1...@gioia.aioe.org>.

Further below in my posting I had asked (you) – sarcastically –
the *rhetorical* question:

,-<news:1997000.J...@PointedEars.de>
|
| Which part of “KornShell93” did you not get?

(“get” = “understand”)

Because I had just said before:

,-<news:4240212.m...@PointedEars.de>
|
| […] Rather, it may be one of the features that Bash inherited from
| KornShell93:
^^
You attempted to refute that by quoting, ridiculously

,-<news:osl5rk$f1k$1...@gioia.aioe.org>
|
| […] "ANSI C Strings are available only on versions of ksh newer than the
^^^^^^^^^^^^^^
| 11/16/88 version".
^^^^^^^^^^
93 > 88. Learn to read.

> [trolling]

Score adjusted.

Pyt T.

unread,
Nov 12, 2017, 2:20:05 PM11/12/17
to
Yes, Janis Papanagnou's replies made it all clear for me.

>> [trolling]

I did not start discussing From fields.

> Score adjusted.

Jim Beard

unread,
Nov 12, 2017, 5:19:14 PM11/12/17
to
On Sun, 12 Nov 2017 16:24:40 +0000, Ivan Shmakov wrote:

>>>>>> Thomas 'PointedEars' Lahn <Point...@web.de> writes:
>>>>>> Jim Beard wrote:
>
> >> Within the United States, your "real name" is the name you use as
> >> your real name.
>
> > That is simply untrue.
>
> The Wikipedia article section you've referenced below seem to support
> Jim's position. For instance:
>
> Most state courts have held that a legally assumed name (i. e., for
> a non-fraudulent purpose) is a legal name and usable as their true
> name, though assumed names are often not considered the person's
> technically true name.
<snip>
The WIKI article basically agrees with me. The equivocation that
"assumed names are often not considered the person's technically true
name" is an equivocation, as they may be used at will and will be
recognized in a court of law.

U.S. law does prohibit using an assumed name for criminal purpose (e.g.
to commit fraud) and by implication prohibits using a trademarked name as
your own, though you may respond to a lawsuit by the trademark holder by
proving there can be no confusion between your use of the name and the
trademark. Some attempts to prove that fail, and the usual result is a
cease and desist order against future use and may include award of
damages (civil damages, not criminal) or conviction of a crime (if one is
involved, such as fraud).

Even if a name is found illegal, it will be recognized by a court as your
name (one among at least one and possibly multiple others) at the time
you used it.

The phrase "Most state courts have held" is significant, as each state
within the United States may create its own law on such matters. This is
usually not of consequence, as you may adopt the name in a state where it
is legal and the requirement of "equal treatment under the law" is
interpreted by Federal courts as requiring other states to recognize it.

Additional pettifogging on the subject is possible, and may be lucrative
for lawyers involved.

Thomas 'PointedEars' Lahn

unread,
Nov 13, 2017, 2:37:57 PM11/13/17
to
Jim Beard wrote:

[Quotation fixed]

> On Sun, 12 Nov 2017 16:24:40 +0000, Ivan Shmakov wrote:
>> Thomas 'PointedEars' Lahn <Point...@web.de> writes:
>>> Jim Beard wrote:
>> >> Within the United States, your "real name" is the name you use as
>> >> your real name.
>> > That is simply untrue.
>>
>> The Wikipedia article section you've referenced below seem to
>> support Jim's position. For instance:
>>
>> Most state courts have held that a legally assumed name (i. e., for
>> a non-fraudulent purpose) is a legal name and usable as their true
>> name, though assumed names are often not considered the person's
>> technically true name.
> <snip>
> The WIKI

_Wikipedia_

[Only people who have no clue what they are talking about say “Wiki”
(that’s just the umbrella term; Wikipedia is *a* wiki) or even “WIKI”
(it is _not_ an acronym).]

> article basically agrees with me.

No, basically it *disagrees* with you.

> The equivocation that "assumed names are often not considered the person's
> technically true name" is an equivocation, as they may be used at will and
> will be recognized in a court of law.
>
> U.S. law does prohibit using an assumed name for criminal purpose (e.g.
> to commit fraud) and by implication prohibits using a trademarked name as
> your own, [tl;dr]

IOW, an “assumed name” in this sense is _not_ a *legal* name, let alone a
*real* name (which was the *actual* subject). Thanks in advance for paying
attention (and stopping to post off-topic).

F‘up2 poster


Head shaking,