Extra lines

solitary....@gmail.com

Apr 14, 2015, 8:22:56 PM
I've been enjoying using AWK for cleaning up
text and in the process learning a new
language and techniques.

I was doing that today and ran into the
following:

---
c:\Users\Steve>echo 123 | gawk 'gsub(/^[ ]*/,"") | gawk '{print "|" $0 "|"}'
|123

c:\Users\Steve>
---

I believe the extra line is caused by echo.
But why is the second pipe not printed?

Thanks, Steve

Steve Graham

Apr 14, 2015, 8:24:23 PM
I am using GNU Awk 3.0.4 on a Windows 7x64 system.

Steve

Ed Morton

Apr 14, 2015, 11:59:12 PM
You're missing a closing quote on your first awk command. You do not need 2
separate awk commands and a pipe anyway but it's not clear what you're trying to
do (you seem to be trying to remove some leading spaces but there would be no
leading spaces in the above so idk) so post the real expected input and desired
output.

Ed.

Steve Graham

Apr 15, 2015, 11:16:28 AM
On Tuesday, April 14, 2015 at 8:59:12 PM UTC-7, Ed Morton wrote:
Thanks, Ed. I'm trying to remove leading spaces and verify that by printing '|' before and after $0.

Here's another one:

C:\Users\Steve>echo 123 | gawk 'gsub(/^[ ]*/,"") {print "|" $0 "|"}'
|123
C:\Users\Steve>

I'the tried typing echo followed by 5 spaces and 123 and get 4 spaces and 123. So the echo does produce leading spaces.

Where is the final pipe going?

Steve

Steve Graham

Apr 15, 2015, 11:17:02 AM
That should read "I've", not "I'the".

Kenny McCormack

Apr 15, 2015, 11:59:10 AM
In article <cb59ebcc-58dc-4729...@googlegroups.com>,
Steve Graham <solitary....@gmail.com> wrote:
...
>Here's another one:
>
>C:\Users\Steve>echo 123 | gawk 'gsub(/^[ ]*/,"") {print "|" $0 "|"}'
>|123
>C:\Users\Steve>
>
>I'the tried typing echo followed by 5 spaces and 123 and get 4 spaces and 123.
>So the echo does produce leading spaces.

My testing suggests otherwise.

>Where is the final pipe going?

There are so many things wrong with this that it is hard to know where to
start. Basically, gawk is unusable under DOS/Windows because of all the
command line quoting issues. Actually, it is usable if you always put your
program (script) in a file, then invoke it with "-f", but that largely
eliminates the benefit of using (g)awk. I.e., the main point is to be able
to do it "on-the-fly"/on the command line.

Here are just a few things to look out for:

1) The character "^" is special on the command line (it is the "escape"
character, like the "\" (backslash) in Unix). It needs to be
escaped (i.e., doubled).

2) The character "|" is special on the command line and needs to be
escaped.

3) It is not clear whether either or both of single or double quotes
work in DOS/Windows the way they do in Unix. The problem is that
it is very application-specific (unlike in Unix where it is
shell-specific - thus pretty much application-independent). This
usually ends up meaning that it depends on which software
development toolchain was used to build the particular version of
GAWK.EXE for DOS/Windows that you are using. Out of curiosity,
which build of GAWK.EXE for DOS/Windows are you using?

4) A lot (but not all) of these problems go away if you use TAWK (I.e.,
AWKW.EXE) instead of GAWK when working under DOS/Windows.
Unfortunately, TAWK is no longer being sold on the open market...

5) When you do: echo 123 | ...
the command after the pipe symbol only sees "123" (without the
quotes), not any of the leading and trailing spaces. As usual, it
is pretty easy to work around this issue on Unix, but difficult in
DOS/Windows.

6) Installing Cygwin on your system will go some way to making your
life better, but the best is to install something called "Virtual
Box", which allows you to run Linux in a virtual machine on your PC.

--
"Every time Mitt opens his mouth, a swing state gets its wings."

(Should be on a bumper sticker)

Steve Graham

Apr 15, 2015, 1:01:44 PM
On Wednesday, April 15, 2015 at 8:59:10 AM UTC-7, Kenny McCormack wrote:
> In article <cb59ebcc-58dc-4729...@googlegroups.com>,
Thanks for the comments. Putting it into a script worked in Windows. But I'm like you and would rather do AWK on the fly.

C:\Users\Steve>echo 123 | gawk '1'
123

C:\Users\Steve>echo 123 | gawk '{print $0}'
123

C:\Users\Steve>echo 123 | gawk 'gsub(/^[ ]*/,"") {text = $0; text = ("B" text "A"); print text}'
A123

I'm missing something.

Steve

pop

Apr 15, 2015, 1:56:18 PM
Steve Graham wrote on 4/15/2015 12:01 PM:
> On Wednesday, April 15, 2015 at 8:59:10 AM UTC-7, Kenny McCormack wrote:
<snip>
>
> Thanks for the comments. Putting it into a script worked in Windows. But I'm like you and would rather do AWK on the fly.
>
> C:\Users\Steve>echo 123 | gawk '1'
> 123
>
> C:\Users\Steve>echo 123 | gawk '{print $0}'
> 123
>
> C:\Users\Steve>echo 123 | gawk 'gsub(/^[ ]*/,"") {text = $0; text = ("B" tex
> t "A"); print text}'
> A123
>
> I'm missing something.
>
> Steve
>
To be perfectly correct in windows you would do:
echo 123 | gawk "gsub(/^[ ]*/,\"\") {text = $0; text = (\"B\" text
\"A\"); print text}"
outputs: B123 A
(mind the email line wrapping)
you need the double quotes since there are several windows special
characters in the line which aren't hidden by single quotes; namely, ^ (
) which may not be seen by gawk. None of my gawk versions will accept
single quotes so I can't test escaping windows specials with ^

HTH
pop (Mark)

Steve Graham

Apr 15, 2015, 2:15:19 PM
Thanks, pop

C:\Users\Steve>echo 123 | gawk "gsub(/^[ ]*/,\"\") {text = $0; text = (\"B\" text \"A\"); print text}"
A123

It's almost like it's not seeing the "B".

Steve
C:\Users\Steve>

Kaz Kylheku

Apr 15, 2015, 2:44:35 PM
On 2015-04-15, pop <p_...@hotmail.com> wrote:
> Steve Graham wrote on 4/15/2015 12:01 PM:
>> On Wednesday, April 15, 2015 at 8:59:10 AM UTC-7, Kenny McCormack wrote:
><snip>
>>
>> Thanks for the comments. Putting it into a script worked in Windows. But I'm like you and would rather do AWK on the fly.
>>
>> C:\Users\Steve>echo 123 | gawk '1'
>> 123
>>
>> C:\Users\Steve>echo 123 | gawk '{print $0}'
>> 123
>>
>> C:\Users\Steve>echo 123 | gawk 'gsub(/^[ ]*/,"") {text = $0; text = ("B" tex
>> t "A"); print text}'
>> A123
>>
>> I'm missing something.
>>
>> Steve
>>
> To be perfectly correct in windows you would do:
> echo 123 | gawk "gsub(/^[ ]*/,\"\") {text = $0; text = (\"B\" text
> \"A\"); print text}"

To be perfectly correct in windows, you have to know exactly how the given
program parses the command line, which it receives as a single character
string. CMD.EXE does some processing on it, such as processing the ^ escaping.

If gawk (or any other program) is linked to the Microsoft C library (e.g. MinGW
port) and uses main as the startup function, then the string is split on
spaces and tabs by the startup code in the library.

Other rules:

Double quotes prevent splitting. Single quotes are not recognized.
Backslashes escape literal double quotes. Backslashes also escape backslashes,
but in a quirky way.

I *think* it's something like this: runs of backslashes not followed by a
quote are literal; they are not escapes:

abc\\\ d\\\ "\\\ \\x" \\\ -> {abc\\\} {d\\\} {\\\ \\x} {\\\}

Runs of backslashes immediately followed by a double quote *are* escapes.
In this situation, the quote itself may or may not be escaped.
In the following example, the "x y" variants demonstrate unescaped quotes:

\" \\"x y" \\\" \\\\"x y" -> {"} {\x y} {\"} {\\x y}

The caret character is *not* recognized; that is something processed by
CMD.EXE.

Aha, here is a reference (lovely MSDN URL):

https://msdn.microsoft.com/en-us/library/windows/desktop/17w5ykft%28v=vs.85%29.aspx

This URL will likely break in time; the topic title to search for is "Parsing C++
Command-Line Arguments".
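
The splitting rules described above can be sketched as a small awk function, wrapped in a POSIX shell function here so it can be tried on any Unix-like system. This is only an illustration of the rules as stated in this post, not the Microsoft startup code itself: the name msvcrt_split is made up, and the sketch ignores the special handling of the program name, the doubled-quote rule inside quoted strings, and wildcard expansion.

```shell
# Sketch of the argument-splitting rules described above, written in awk.
# It reads one command line per input line and prints each argument in braces.
# The function name msvcrt_split is hypothetical.
msvcrt_split() {
    awk '
    # Return string s repeated k times.
    function rep(s, k,    r) {
        r = ""
        while (k-- > 0) r = r s
        return r
    }

    # Split line into out[1..n] and return n.
    function split_args(line, out,    n, i, c, arg, inq, have, bs, len) {
        n = 0; arg = ""; inq = 0; have = 0
        len = length(line)
        i = 1
        while (i <= len) {
            c = substr(line, i, 1)
            if (c == "\\") {
                # Count the whole run of backslashes.
                bs = 0
                while (i <= len && substr(line, i, 1) == "\\") { bs++; i++ }
                if (i <= len && substr(line, i, 1) == "\"") {
                    # Run followed by a quote: each pair yields one backslash.
                    arg = arg rep("\\", int(bs / 2))
                    if (bs % 2) {
                        arg = arg "\""      # odd run: the quote is literal
                    } else {
                        inq = !inq          # even run: the quote is a delimiter
                    }
                    i++
                } else {
                    # Run not followed by a quote: all backslashes are literal.
                    arg = arg rep("\\", bs)
                }
                have = 1
            } else if (c == "\"") {
                inq = !inq; have = 1; i++
            } else if ((c == " " || c == "\t") && !inq) {
                # Unquoted whitespace ends the current argument, if any.
                if (have) { out[++n] = arg; arg = ""; have = 0 }
                i++
            } else {
                arg = arg c; have = 1; i++
            }
        }
        if (have) out[++n] = arg
        return n
    }

    {
        n = split_args($0, args)
        for (j = 1; j <= n; j++) printf "%s{%s}", (j > 1 ? " " : ""), args[j]
        printf "\n"
    }'
}

# The two example lines from this post:
printf '%s\n' 'abc\\\ d\\\ "\\\ \\x" \\\' | msvcrt_split
printf '%s\n' '\" \\"x y" \\\" \\\\"x y"' | msvcrt_split
```

Fed the two example lines, the sketch reproduces the brace-delimited results shown above.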

Steve Graham

Apr 15, 2015, 3:15:02 PM
On Wednesday, April 15, 2015 at 11:44:35 AM UTC-7, Kaz Kylheku wrote:
Thanks, Kaz. Too convoluted for my tastes.

I have git installed on my machine. When I try the following under git bash and awk or gawk, it gives the proper result:
echo 123 | gawk '{print $) 7}'
1237

When I try the same with gawk under the cmd.exe, I get:
echo 123 | gawk '{print $0 7}'
723

This is version GNU Awk 3.0.4

Guess I will stick with bash/(gawk or awk). Or stay away from cmd.exe for such.


Steve

Ed Morton

Apr 16, 2015, 1:04:26 AM
On 4/15/2015 2:15 PM, Steve Graham wrote:
<snip>
> I have git installed on my machine. When I try the following under git bash and awk or gawk, it gives the proper result:
> echo 123 | gawk '{print $) 7}'

I assume you meant $0, not $).

> 1237
>
> When I try the same with gawk under the cmd.exe, I get:
> echo 123 | gawk '{print $0 7}'
> 723

My guess is that that is related to Windows appending a carriage return char to the end
of your text lines and so after $0 is printed the cursor is moved back to the
start of the line and then the 7 is printed, visually overwriting the 1. Pipe
the output to `cat -v` to check:

echo 123 | gawk '{print $0 7}' | cat -v
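
The effect can be reproduced in a POSIX shell without Windows. This is a minimal sketch that assumes the input line ends in CR LF, as text written into a pipe by cmd.exe's echo does; the function names are made up for the illustration.

```shell
# Simulate the CRLF-terminated line that arrives through the pipe.
crlf_line() { printf '123\r\n'; }

# As in the thread: the CR stays at the end of $0, so the 7 lands after it.
# On a terminal this displays as 723.
append_raw() { awk '{ print $0 7 }'; }

# Strip a trailing CR first, then append.
append_clean() { awk '{ sub(/\r$/, ""); print $0 7 }'; }

crlf_line | append_raw | od -c    # the dump shows \r between the 3 and the 7
crlf_line | append_clean          # prints 1237
```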

>
> This is version GNU Awk 3.0.4
>
> Guess I will stick with bash/(gawk or awk). Or stay away from cmd.exe for such.

Yes that is the right approach.

Ed.

>
> Steve
>

Steve Graham

Apr 16, 2015, 11:50:06 AM
Someone suggested I upgrade gawk. I did to 3.1.6. Thanks.

Ed: Thanks for the suggestion of checking with 'cat -v'. It did reveal a ctrl-M.

With the upgrade, I am no longer able to use '' to surround the script: I must use "". This is under CMD.EXE . Running your script example, under CMD.EXE I get

123 7^M

Under sh using " I get

7^M

Under sh using ' I get

1237^M

How very odd. Never seen this amount of system specificity before.

Steve

Janis Papanagnou

Apr 16, 2015, 12:01:56 PM
On 16.04.2015 17:50, Steve Graham wrote:
[...]
>>>
>>> Guess I will stick with bash/(gawk or awk). Or stay away from cmd.exe for such.

To me, this was the only noteworthy and sensible part of the thread.

>>> Steve
>
> Someone suggested I upgrade gawk. I did to 3.1.6. Thanks.

Gee! 3.1.6 is *old*. If you upgrade gawk go for a 4.x release.

>
> With the upgrade, I am no longer able to use '' to surround the script: I must use "". This is under CMD.EXE . Running your script example, under CMD.EXE I get

Better - don't - run - anything - under - cmd!

Why did you fall back to that route?

Janis

Kenny McCormack

Apr 16, 2015, 12:51:08 PM
In article <mgomdj$vls$1...@news.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>On 16.04.2015 17:50, Steve Graham wrote:
>[...]
>>>>
>>>> Guess I will stick with bash/(gawk or awk). Or stay away from cmd.exe
>>>> for such.
>
>To me, this was the only noteworthy and sensible part of the thread.

(rest of anti-Windows advocacy - snipped - not because I don't agree with
it, but because we've all heard it before - in fact, a few posts back, I
did exactly that)

Which all really boils down to: Can GAWK be used under Windows?

I mean, if you can't use it, reasonably natively, under the default shell
(the much reviled, but very widely deployed, CMD.EXE), then can you, in any
meaningful way, say that it is usable? I say, no you cannot.

Or, to ask this another way, what alternative do you, Janis, recommend?

Do you recommend Cygwin (which is the best, although clearly not the only,
way to get a normal (Unix/sh/bash) shell under Windows) ?

Or do you, as I do, recommend Virtual Box, which gets you the whole
enchilada - not just a shell?

--
The motto of the GOP "base": You can't *be* a billionaire, but at least you
can vote like one.

Kaz Kylheku

Apr 16, 2015, 12:55:17 PM
On 2015-04-15, Steve Graham <solitary....@gmail.com> wrote:
> On Wednesday, April 15, 2015 at 11:44:35 AM UTC-7, Kaz Kylheku wrote:
>> Aha, here is a reference (lovely MSDN URL):
>>
>> https://msdn.microsoft.com/en-us/library/windows/desktop/17w5ykft%28v=vs.85%29.aspx
>>
>> This URL will likely break in; the topic title search for is "Parsing C++
>> Command-Line Arguments".
>
> Thanks, Kaz. Too convoluted for my tastes.

It is even more convoluted than that. The above are just the rules for how programs
based on the Microsoft C library parse their command line into arguments. If you're
creating that command line in CMD.EXE, then you have to be aware of how CMD.EXE
munges things.

CMD.EXE processes only double quotes, not single quotes. When it sees a double quote,
it goes into quote mode, which means that special characters are not special, until the
next double quote.

These double quotes are *not* removed.

Special characters are %, ^, (, ), | and a few others. When these are outside of quotes,
they have to be escaped if you don't want their special meaning to apply. For instance,
the ampersand here means to run two commands in sequence:

C:\> echo foo & echo bar
foo
bar

You can escape that with quotes, but then the quotes get echoed:

C:\> echo foo "&" echo bar
foo "&" echo bar

(The echo command, clearly, is *not* an external program that is written using Microsoft's
C library and a main() function, because such a program would not see the quotes in
its argv[] array.)

You can escape with the caret. As you can see, it *is* consumed, unlike double quotes:

C:\> echo foo ^& echo bar
foo & echo bar

Double quotes can be escaped with carets like other special characters, to
suppress their meaning. Note that % is a special case: variable expansion
happens before quote processing, so it takes place between quotes whether or
not they are escaped, and either way the quotes are retained:

C:\> echo ^"foo %PATH% bar^"
"foo [snip dump of PATH] bar"

C:\> echo "foo %PATH% bar"
"foo [snip dump of PATH] bar"

That's not all there is to it, but enough to get a basic understanding.

> I have git installed on my machine. When I try the following under git bash and awk or gawk, it gives the proper result:
> echo 123 | gawk '{print $) 7}'

Typo; should be $0

> 1237
> When I try the same with gawk under the cmd.exe, I get:
> echo 123 | gawk '{print $0 7}'
> 723

Here, the single quote means nothing to CMD.EXE. None of the characters are special, so the
gawk executable receives the command line as a single character string, quotes and all:

'{print $0 7}'

Something in the gawk program itself is dealing with the single quotes somehow,
according to some rules. Those rules are different from the Microsoft conventions
for parsing command lines, which do not recognize single quotes.

The output 723 looks like it could be result of 123 being printed, followed by
a carriage return (but not a newline). It's hard to understand why that would
happen; why would 123 be printed with no newline.

> This is version GNU Awk 3.0.4

I suspect that the gawk that you're accessing from cmd.exe and the one from
bash are not the same program.

> Guess I will stick with bash/(gawk or awk). Or stay away from cmd.exe for such.

CMD.EXE offers a poor language if you're using external tools which involve
writing in an entire programming language which is embedded in a command line
argument. It's mainly suitable for passing simple tokens to dumb,
non-programmable tools.

If you launch awk from cmd.exe, you're probably best off in putting the script in
a file and using "awk -f file".
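
A minimal sketch of the script-in-a-file approach, written for a POSIX shell so it can be tried anywhere. The file name trim.awk and its location are made up for the illustration, and the CR handling assumes input that arrives with CR LF line endings.

```shell
# Hypothetical script file; the name trim.awk is only for illustration.
script="${TMPDIR:-/tmp}/trim.awk"

cat > "$script" <<'EOF'
# Drop a trailing CR (left behind when input arrives through a cmd.exe pipe),
# drop leading spaces, then bracket the result so stray whitespace is visible.
{
    sub(/\r$/, "")
    sub(/^[ ]+/, "")
    print "|" $0 "|"
}
EOF

# Under cmd.exe the equivalent call would be: echo     123 | gawk -f trim.awk
printf '     123\r\n' | awk -f "$script"    # prints |123|
```

Since the program text never passes through the command line, none of the CMD.EXE or run-time library quoting rules apply to it.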

Janis Papanagnou

Apr 16, 2015, 1:19:29 PM
On 16.04.2015 18:51, Kenny McCormack wrote:
> In article <mgomdj$vls$1...@news.m-online.net>,
> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>> On 16.04.2015 17:50, Steve Graham wrote:
>> [...]
>>>>>
>>>>> Guess I will stick with bash/(gawk or awk). Or stay away from cmd.exe
>>>>> for such.
>>
>> To me, this was the only noteworthy and sensible part of the thread.
>
> (rest of anti-Windows advocacy - snipped - not because I don't agree with
> it, but because we've all heard it before - in fact, a few posts back, I
> did exactly that)
>
> Which all really boils down to: Can GAWK be used under Windows?

Luckily, with the workaround of option -f, yes. This is not ideal, granted,
but hey, we're on a non-Unix platform, thus may accept self-restrictions to
some degree.

>
> I mean, if you can't use it, reasonably natively, under the default shell
> (the much reviled, but very widely deployed, CMD.EXE), then can you, in any
> meaningful way, say that it is usable? I say, no you cannot.
>
> Or, to ask this another way, what alternative do you, Janis, recommend?

Frankly, I am not in the position to suggest WinDOS folks what they should
do. (Specifically not if they think cmd.exe is a big tool.)

>
> Do you recommend Cygwin (which is the best, although clearly not the only,
> way to get a normal (Unix/sh/bash) shell under Windows) ?

Well, this is at least what *I* do to get something more appropriate. But
this is hardly an option that I would seriously suggest for the "normal"
users[*] on that platform. Why? For one, because the installation tool is
a pain. Then you get (even with the basic installation level) far more
than a user who is only interested in, say, awk, really needs.
Usually Cygwin doesn't even satisfy me; I continue installing other things
as well, like ksh.

>
> Or do you, as I do, recommend Virtual Box, which gets you the whole
> enchilada - not just a shell?

I cannot comment on that; never used it.

Janis

[*] Oh, I like the ambiguity of putting "normal" [users] into quotes. :-)

Kaz Kylheku

Apr 16, 2015, 2:06:21 PM
On 2015-04-16, Kenny McCormack <gaz...@shell.xmission.com> wrote:
> In article <mgomdj$vls$1...@news.m-online.net>,
> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>>On 16.04.2015 17:50, Steve Graham wrote:
>>[...]
>>>>>
>>>>> Guess I will stick with bash/(gawk or awk). Or stay away from cmd.exe
>>>>> for such.
>>
>>To me, this was the only noteworthy and sensible part of the thread.
>
> (rest of anti-Windows advocacy - snipped - not because I don't agree with
> it, but because we've all heard it before - in fact, a few posts back, I
> did exactly that)
>
> Which all really boils down to: Can GAWK be used under Windows?
>
> I mean, if you can't use it, reasonably natively, under the default shell
> (the much reviled, but very widely deployed, CMD.EXE), then can you, in any
> meaningful way, say that it is usable? I say, no you cannot.

If gawk is ported using a run-time library which follows the Microsoft
conventions for processing arguments (or is that library itself, as in the
MinGW case), then there is a reasonably well defined way to predict what
arguments it is actually getting, from a given CMD.EXE syntax.

Give me a Unix command line invoking awk, and I can probably translate it
to CMD.EXE which targets such an awk port. (The only question is then: how
"reasonably native" is the transliteration.)

For instance, multi-line script argument:

$ awk '/foo/ {
print
}'

In CMD.EXE:

C:\>c:\Cygwin\bin\gawk.exe ^"BEGIN {^
More? print

C:\>c:\Cygwin\bin\gawk.exe ^"/foo/ {^
More? print^
More? }^"
foobar
foobar
baz
xyzzy
football
football
gawk: cmd. line:1: (FILENAME=- FNR=4) fatal: error reading input file `-': Interrupted system call

The last thing there occurred when I hit ctrl-Z.

Explanation:

1. We need double quotes around the argument, so that the run-time library for the
Cygwin gawk.exe will recognize it as a single argument.

2. The CMD.EXE program will also recognize the quotes. It will preserve them in the
command line, but turn off its recognition of special characters. This is a problem!
We need the recognition of the ^ (caret escape) character. Therefore, we
escape the quotes with caret to turn off their CMD.EXE meaning.

3. We can escape the linefeed at the end of a line with caret. This allows the
command line to continue over multiple lines. CMD.EXE asks "More? " which is
not a yes/no question, but a prompt for more of the command line.

Ed Morton

Apr 16, 2015, 2:41:02 PM
If I can liken Windows to Hell, and I think I can, that's a little like
descending into Hell and being surprised by all the flames...

Ed.

Kaz Kylheku

Apr 16, 2015, 3:04:22 PM
On 2015-04-16, Ed Morton <morto...@gmail.com> wrote:
> If I can liken Windows to Hell, and I think I can, that's a little like
> descending into Hell and being surprised by all the flames...

The surprise is entirely legitimate if your descent into Windows indicates that
hell has frozen over.

Joep van Delft

Apr 16, 2015, 4:32:43 PM
On Wed, 15 Apr 2015 10:01:43 -0700 (PDT)
Steve Graham <solitary....@gmail.com> wrote:

> I'm missing something.
>
> Steve

Yes. A proper, predictable shell. Give Cygwin a spin if you are tied
to MS, it will make your life better.

Kind regards,

Joep

Kenny McCormack

Apr 19, 2015, 7:12:18 PM
In article <mgoquv$np$1...@news.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>On 16.04.2015 18:51, Kenny McCormack wrote:
>> In article <mgomdj$vls$1...@news.m-online.net>,
>> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>>> On 16.04.2015 17:50, Steve Graham wrote:
>>> [...]
>>>>>>
>>>>>> Guess I will stick with bash/(gawk or awk). Or stay away from cmd.exe
>>>>>> for such.
>>>
>>> To me, this was the only noteworthy and sensible part of the thread.
>>
>> (rest of anti-Windows advocacy - snipped - not because I don't agree with
>> it, but because we've all heard it before - in fact, a few posts back, I
>> did exactly that)
>>
>> Which all really boils down to: Can GAWK be used under Windows?
>
>Luckily, with the workaround of option -f, yes. This is not ideal, granted,
>but hey, we're on a non-Unix platform, thus may accept self-restrictions to
>some degree.

Yes, when working under Windows, you pretty much gotta use "-f".
But what I've been arguing all along is that that pretty much takes all the
fun out of it. Which is why I say it is (for all intents and purposes) not
usable under DOS/Windows.

>>
>> I mean, if you can't use it, reasonably natively, under the default shell
>> (the much reviled, but very widely deployed, CMD.EXE), then can you, in any
>> meaningful way, say that it is usable? I say, no you cannot.
>>
>> Or, to ask this another way, what alternative do you, Janis, recommend?
>
>Frankly, I am not in the position to suggest WinDOS folks what they should
>do. (Specifically not if they think cmd.exe is a big tool.)

No one will crime you for expressing an opinion. Come on, we know you've
got one!

>>
>> Do you recommend Cygwin (which is the best, although clearly not the only,
>> way to get a normal (Unix/sh/bash) shell under Windows) ?
>
>Well, this is at least what *I* do to get something more appropriate. But
>this is hardly an option that I would seriously suggest for the "normal"
>users[*] on that plattform. Why? For one, because the installation tool is
>a pain. Then you will buy (even with the basic installation level) far too
>much what a user who is only interested in, say, awk, really doesn't need.
>Usually Cygwin doesn't even satisfy me; I continue installing other things
>as well, like ksh.

OK

>> Or do you, as I do, recommend Virtual Box, which gets you the whole
>> enchilada - not just a shell?
>
>I cannot comment on that; never used it.

Look into it. It makes Windows usable...

>Janis
>
>[*] Oh, I like the ambiguity of putting "normal" [users] into quotes. :-)
>

Indeed.

--
Watching ConservaLoons playing with statistics and facts is like watching a
newborn play with a computer. Endlessly amusing, but totally unproductive.

Steve Graham

Apr 21, 2015, 12:34:23 PM
Thanks to all who helped and/or expressed an opinion.

Steve

David Thompson

Apr 26, 2015, 7:17:29 AM
On Wed, 15 Apr 2015 15:59:09 +0000 (UTC), gaz...@shell.xmission.com
(Kenny McCormack) wrote:
<snip: gawk under Windows is hard>
> Here are just a few things to look out for:
<agree with most, but>
>
> 5) When you do: echo 123 | ...
> the command after the pipe symbol only sees "123" (without the
> quotes), not any of the leading and trailing spaces. As usual, it
> is pretty easy to work around this issue on Unix, but difficult in
> DOS/Windows.
>
Nope. *One* space (or certain special chars!) following/terminating
the name echo is swallowed, but any and all remaining spaces (and
doublequotes) are kept. Unless the remainder is empty, or *all* spaces,
or the special value ON or OFF (in either letter case) plus optional spaces,
in which case it displays or changes (batch) *command* echoing instead.

Contrast Unix, where the shell tokenizes the command line by whitespace
for echo as for all commands (and splits the results of expansions using
variable IFS, which defaults to whitespace) except where quoted ( ' " \ ),
and echo then joins the arguments back together with a single space, so
echo 123     456 | cat gets you 123 456
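
A short sketch of that Unix behaviour, assuming any POSIX shell; the two function names are made up so the cases can be compared side by side.

```shell
# Unquoted: the shell splits on whitespace, so echo receives two arguments
# and joins them with a single space.
unquoted() { echo 123     456; }

# Quoted: echo receives one argument with the spaces intact.
quoted() { echo '123     456'; }

unquoted | cat    # prints: 123 456
quoted | cat      # prints: 123     456
```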

David Thompson

Apr 26, 2015, 7:20:01 AM
On Thu, 16 Apr 2015 16:55:08 +0000 (UTC), Kaz Kylheku
<k...@kylheku.com> wrote:

> On 2015-04-15, Steve Graham <solitary....@gmail.com> wrote:
<snip: much about Windows>
> > When I try the same with gawk under the cmd.exe, I get:
> > echo 123 | gawk '{print $0 7}'
> > 723
>
> Here, the single quote means nothing to CMD.EXE. None of the characters are special, so the
> gawk executable receives the command line as a single character string, quotes and all:
>
> '{print $0 7}'
>
> Something in the gawk program itself is dealing with the single quotes somehow,
> according to some rules. Those rules are different from the Microsoft conventions
> for parsing command lines, which do not recognize single quotes.
>
> The output 723 looks like it could be result of 123 being printed, followed by
> a carriage return (but not a newline). It's hard to understand why that would
> happen; why would 123 be printed with no newline.
>
Not hard if it's MinGW ...

> > This is version GNU Awk 3.0.4
>
... at least this oldish version which I also have (works well enough
for me and I've never bothered to update). When reading from a *file*
-- either gawk "/p/{a}" <data or gawk "/p/{a}" data -- this build, I
believe using MSVCRT, reads (Windows-standard) CRLF as LF and breaks
the line there given default RS, and when writing to a file it writes
LF as CRLF; the same things happen for (compiled) user programs not
using the "b" modifier in fopen, or (nonstandard) setmode() on STD*.
But when reading from a *pipe* it breaks on LF and leaves CR in the
data of $0, where it causes exactly the trouble seen by the OP here.

I *can* fix this with gsub(/\r/,v) where v is set to the empty string
(I prefer that to backslashing two doublequotes) but since I have
other MinGW tools present I usually just pipe through grep "" which
normalizes linebreaks to LF without changing anything else.
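
A sketch of the unset-variable trick, shown in a POSIX shell so it can be checked anywhere. The point is that the program text contains no double quotes at all, so under cmd.exe it could be wrapped in plain double quotes with nothing backslashed. The variable name v is arbitrary and only has to be one the program never assigns; the function name is made up.

```shell
# The awk program text: v is never assigned, so it evaluates to the empty
# string and gsub deletes every CR.
prog='{ gsub(/\r/, v); print $0 7 }'

strip_cr_noquotes() { awk "$prog"; }

# Simulated CRLF input, as it would arrive through a cmd.exe pipe.
printf '123\r\n' | strip_cr_noquotes    # prints 1237
```

Under cmd.exe the same program would presumably be written as gawk "{ gsub(/\r/, v); print $0 7 }", assuming a gawk build that follows the Microsoft argument conventions.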

> If you launch awk from cmd.exe, you're probably best off in putting the script in
> a file and using "awk -f file".

That is often good for the script, but doesn't help with the data.