
Perl or Sed?


Jay Sherman

Nov 21, 1994, 3:22:41 PM
Here's a quick example that should do what you want:

#--------------------- cut here -------------------------------------------
#!/bin/ksh
# Replaces more than one space between words with one, and puts two spaces on
# the end of each line. * selects every file in that dir, so watch out where
# you test it.
for file in *
do
    cat $file | sed "s/  */ /g" | awk '{print $0"  "}' > /tmp/file.tmp
    cat /tmp/file.tmp > $file
done
#--------------------- cut here -------------------------------------------

Randal L. Schwartz

Nov 22, 1994, 9:37:02 PM

Yes, it's that time of the week again. This week's winner of my
frequently awarded "useless use of cat award" goes to jay, as in:

>>>>> "Jay" == Jay Sherman <usai...@dawn.mmm.com> writes:

Jay> Here's a quick example that should do what you want:
Jay> #--------------------- cut here -------------------------------------------
Jay> #!/bin/ksh
Jay> # Replaces more than one space between words with one, and puts two spaces on
Jay> # the end of each line. * selects every file in that dir, so watch out where
Jay> # you test it.
Jay> for file in *
Jay> do
Jay> cat $file | sed "s/  */ /g" | awk '{print $0"  "}' > /tmp/file.tmp

This one here. sed can read $file just fine.

Jay> cat /tmp/file.tmp > $file
Jay> done
Jay> #--------------------- cut here -------------------------------------------

Remember.... "cat something | blah blah" is almost always not needed.
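
A minimal sketch of the point, assuming a throwaway sample.txt in a scratch directory (the file, its contents, and the use of mktemp are made up for illustration and are not part of Jay's script). All three spellings hand sed the same bytes; only the first starts an extra process.

```shell
# Work in a scratch directory so nothing real is touched.
# mktemp -d is an assumption: widely available, but not in the original post.
workdir=$(mktemp -d)
printf 'one   two    three\nfour  five\n' > "$workdir/sample.txt"

# The award-winning form: cat only copies the file into the pipe.
with_cat=$(cat "$workdir/sample.txt" | sed 's/  */ /g')

# sed opens the file itself when given a file name.
with_arg=$(sed 's/  */ /g' "$workdir/sample.txt")

# Or let the shell attach the file to sed's standard input.
with_redirect=$(sed 's/  */ /g' < "$workdir/sample.txt")

printf '%s\n' "$with_arg"
rm -rf "$workdir"
```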

Just another Unix hacker (since 1977),
--
Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying
Email: <mer...@stonehenge.com> Snail: (Call) PGP-Key: (finger mer...@ora.com)
Phrase: "Welcome to Portland, Oregon ... home of the California Raisins!"

Lance F. Larsen-HO-77163U(MT4973)0000

Nov 23, 1994, 1:17:18 PM
In article <MERLYN.94N...@linda.teleport.com>,
Randal L. Schwartz <mer...@stonehenge.com> wrote:
>
>Remember.... "cat something | blah blah" is almost always not needed.

Not necessary for the code, but frequently helpful to the person
who has to read it and follow it, especially when that is not their
primary line of work. Starting a string of commands through which
you pipe data with "cat <file>" makes it easier to read (for me
anyway) and I will always sacrifice elegance for clarity.

Just another tech writer who occasionally picks up a [shell]toolbox. 8^)

Lance F. Larsen l...@quartet.att.com

Kevin Darcy

Nov 23, 1994, 10:33:00 PM
In article <MERLYN.94N...@linda.teleport.com>,
Randal L. Schwartz <mer...@stonehenge.com> wrote:
>
>Yes, it's that time of the week again. This week's winner of my
>frequently awarded "useless use of cat award" goes to jay, as in:
>
>>>>>> "Jay" == Jay Sherman <usai...@dawn.mmm.com> writes:
>
>Jay> Here's a quick example that should do what you want:
>Jay> #--------------------- cut here -------------------------------------------
>Jay> #!/bin/ksh
>Jay> # Replaces more than one space between words with one, and puts two spaces on
>Jay> # the end of each line. * selects every file in that dir, so watch out where
>Jay> # you test it.
>Jay> for file in *
>Jay> do
>Jay> cat $file | sed "s/  */ /g" | awk '{print $0"  "}' > /tmp/file.tmp
>
>This one here. sed can read $file just fine.

For that matter, the "awk" is also redundant.

sed -e's/  */ /g' -e's/$/  /' $file > /tmp/file.tmp
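
Dropped back into the loop, a hedged sketch might look like this. The scratch directory, the sample file names, and mktemp are assumptions for illustration, and quoting "$file" is an addition so that names containing spaces survive.

```shell
# Scratch area: sample documents in docs/, temporary output next to it.
workdir=$(mktemp -d)
mkdir "$workdir/docs"
printf 'a   b\nc    d  e\n' > "$workdir/docs/first.txt"
printf 'x  y\n' > "$workdir/docs/second file.txt"
tmp="$workdir/file.tmp"

for file in "$workdir"/docs/*
do
    # First expression squeezes runs of spaces; second appends two spaces.
    sed -e 's/  */ /g' -e 's/$/  /' "$file" > "$tmp"
    # Copy back with cat so the original file keeps its inode and permissions.
    cat "$tmp" > "$file"
done

# Mark each line end with | so the trailing spaces are visible.
result=$(sed 's/$/|/' "$workdir/docs/first.txt")
result2=$(sed 's/$/|/' "$workdir/docs/second file.txt")
printf '%s\n' "$result" "$result2"
rm -rf "$workdir"
```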

--------------------------------------------------------------------------------
ke...@cfc.com <-- (ASCII only please) | Kevin Darcy, UNIX Systems Admin (CFC)
ke...@tech.mis.cfc.com <-- (mute | Technical Services
Voice: (810) 759-7140 NeXTmail | Chrysler Corporation
Fax: (810) 758-8173 welcome) | Center Line, Michigan, MIS Complex
--------------------------------------------------------------------------------

Harald Hanche-Olsen

Nov 24, 1994, 10:49:19 AM
In article <CzqG4...@nntpa.cb.att.com> l...@danmark.mt.att.com (Lance
F. Larsen-HO-77163U(MT4973)0000) writes:


lfl> In article <MERLYN.94N...@linda.teleport.com>,
lfl> Randal L. Schwartz <mer...@stonehenge.com> wrote:
>>
>> Remember.... "cat something | blah blah" is almost always not needed.

lfl> Not necessary for the code, but frequently helpful to the person
lfl> who has to read it and follow it, especially when that is not their
lfl> primary line of work. Starting a string of commands through which
lfl> you pipe data with "cat <file>" makes it easier to read (for me
lfl> anyway) and I will always sacrifice elegance for clarity.

You mean readability, not elegance, right?

All the shells I am able to lay my hands on (sh, es, bash, csh)
understand

<something blah blah

Is that readable enough?
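
For instance, a sketch with a hypothetical words.txt (file name and contents made up for illustration): the redirection comes before the command name, so the data source still sits at the left end of the pipeline and no cat process is needed.

```shell
workdir=$(mktemp -d)
printf 'pear\napple\npear\nfig\n' > "$workdir/words.txt"

# The input file is named first, then each stage, reading left to right.
< "$workdir/words.txt" sort | uniq > "$workdir/unique.txt"

unique=$(cat "$workdir/unique.txt")
printf '%s\n' "$unique"
rm -rf "$workdir"
```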

- Harald

Alan Robson

Nov 25, 1994, 10:27:15 PM
Lance F. Larsen-HO-77163U(MT4973)0000 (l...@danmark.mt.att.com) wrote:
: In article <MERLYN.94N...@linda.teleport.com>,
: Randal L. Schwartz <mer...@stonehenge.com> wrote:
: >
: >Remember.... "cat something | blah blah" is almost always not needed.

: Not necessary for the code, but frequently helpful to the person
: who has to read it and follow it, especially when that is not their
: primary line of work. Starting a string of commands through which
: you pipe data with "cat <file>" makes it easier to read (for me
: anyway) and I will always sacrifice elegance for clarity.


Quite right. I feel much the same. Intellectually I know that

prog1 file | prog2

or prog1 < file | prog2

is the best way but I seldom use it because it FEELS ugly. There is a
definite psychologically uncomfortable itch to it - it seems to jump
from left to right and back again - whereas

cat file | prog1 | prog2

has no such ugly jumps in it and flows smoothly from left to right all
the way along the line.

It's silly, and I'm quite prepared to stay here in a minority of one,
but I can't help the way my subconscious works, and it just plain
doesn't like the efficient way and it does like the (to me) pretty way.

Sorry purists - but I will continue to cat files into pipelines even
though I know it's "wrong".

And in the end, the only real criterion is a pragmatic one. If it works,
it is right. Who cares about programmatical political correctness in
something you'll probably only run once anyway? Isn't that what many
pipelines are for?

--
Best wishes,

Alan
----
_
Alan Robson tri...@iconz.co.nz o( )
The Internet Company of New Zealand / /\

Howard Fear

Nov 28, 1994, 6:24:59 PM
There are other ways to get program flow:

In article <3b69uk$c0l@status>, tri...@iconz.co.nz (Alan Robson) writes:
|> Lance F. Larsen-HO-77163U(MT4973)0000 (l...@danmark.mt.att.com) wrote:
|> : In article <MERLYN.94N...@linda.teleport.com>,
|> : Randal L. Schwartz <mer...@stonehenge.com> wrote:
|> : >
|> : >Remember.... "cat something | blah blah" is almost always not needed.
|>
|> : Not necessary for the code, but frequently helpful to the person
|> : who has to read it and follow it, especially when that is not their
|> : primary line of work. Starting a string of commands through which
|> : you pipe data with "cat <file>" makes it easier to read (for me
|> : anyway) and I will always sacrifice elegance for clarity.
|>
|> Quite right. I feel much the same. Intellectually I know that
|> prog1 file | prog2
|> or
|> prog1 < file | prog2
|> is the best way but I seldom use it because it FEELS ugly. There is a
|> definite psychologically uncomfortable itch to it - it seems to jump
|> from left to right and back again - whereas
|> cat file | prog1 | prog2
|> has no such ugly jumps in it and flows smoothly from left to right all
|> the way along the line.

How about some variation of:
< file ( prog1 | prog2 )
or
( prog1 | prog2 ) < file
Both of which indicate that file is the input to the entire
pipeline.
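
A sketch of the trailing form, with tr and sort standing in for prog1 and prog2 and a made-up input file. Only the trailing form is shown because, as far as I can tell, bash and other POSIX-style shells accept a leading redirection on simple commands only and treat `< file ( ... )` as a syntax error, so that variation may depend on the shell. Braces give the same grouping without parentheses.

```shell
workdir=$(mktemp -d)
printf 'b\na\nc\n' > "$workdir/letters.txt"

# Subshell form: the redirection applies to the whole pipeline in parentheses.
( tr '[:lower:]' '[:upper:]' | sort ) < "$workdir/letters.txt" > "$workdir/out1"

# Brace form: the same grouping, run in the current shell.
{ tr '[:lower:]' '[:upper:]' | sort; } < "$workdir/letters.txt" > "$workdir/out2"

upper_sorted=$(cat "$workdir/out1")
upper_sorted_braces=$(cat "$workdir/out2")
printf '%s\n' "$upper_sorted"
rm -rf "$workdir"
```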

--
Howard Fear email1: howar...@stortek.com
StorageTek email2: h...@blackcat.stortek.com
(303)673-5170 (303)467-0706

Ron Wigmore

Nov 29, 1994, 1:00:28 PM
Howard Fear (h...@darkstar.stortek.com) wrote:
: There are other ways to get program flow:

My question is along the same lines as this "Useless Use Of Cat ..."
thread, so I'll ask my questions as part of this thread:

How do I get a file into a pipe and have it processed "in parallel"
or multiple times?

eg. How do I merge the following three pipes into one

sort foo | cut -c1-40 | (lotta stuff done to the 'left side') > bar
sort foo | cut -c21-60 | (lotta stuff done to the 'middle') >> bar
sort foo | cut -c41-80 | (lotta stuff done to the 'right side') >> bar

I know about using "sub-pipes" (dunno if they have a proper name) such
as:

sort foo | (echo "Sorted listing on `date`\n" ; tr '/' '@') | pg

would use 'pg' to display the sorted file foo with a heading, where
the '/' in the date have been changed to '@'.
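
The construct is usually called command grouping (a subshell, when written with parentheses). Here is a self-contained sketch of the same idea, with printf in place of the non-portable `echo "...\n"` and a fixed heading instead of `date` so the result is predictable; the file contents are made up. Note that the heading goes straight to the output and never passes through tr, which only sees the sorted lines arriving on the pipe.

```shell
workdir=$(mktemp -d)
printf 'c/3\na/1\nb/2\n' > "$workdir/foo"

# The group writes a heading first; tr then inherits the pipe as its standard
# input and translates the sorted lines that follow.
sort "$workdir/foo" | ( printf 'Sorted listing\n\n'; tr '/' '@' ) > "$workdir/report"

report=$(cat "$workdir/report")
printf '%s\n' "$report"
rm -rf "$workdir"
```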

For a specific example, I want to produce a summary of su history
by reading the /var/adm/sulog file ONCE, and have it produce a report
such as "10 failed su's to root
50 failed su's in total
20 successful su's to root
200 successful su's in total
1 failed su to root by a non-system group account
1 successful su to root by a non-system group account"

I can produce this report easily enough by reading the 'sulog' file
for each summary line of the report, but is it possible to construct a
complex pipe that would only read the 'sulog' file ONCE and using nested
pipes, sub-pipes (insert proper term here) to have the report produced in
one pass?

Hopefully this is a challenging question. I know I could simply use a

while read line
do
blah, blah, blah
done < /var/adm/sulog

type of routine, but the real question is, how can you construct a pipe
to read the 'sulog' file only ONCE and have it process the entire file
multiple times?

Ron,,,

Wout Mertens

Nov 30, 1994, 7:43:02 AM
Randal L. Schwartz (mer...@stonehenge.com) wrote:

: Yes, it's that time of the week again. This week's winner of my
: frequently awarded "useless use of cat award" goes to jay, as in:

[stuff deleted]
: Remember.... "cat something | blah blah" is almost always not needed.

I hope that in this case it is justified?

echo
echo Your favorite Logged In Ones:
echo
if [ `who|cut -c1-8|sort|uniq|cat /dev/stdin .favlist \
|sort|uniq -d|tee /dev/tty|wc -l` -eq 0 ]; then
^^^^^^ I split the above line into two lines...
echo Nobody
fi

with .favlist being a file that has the usernames of my favorite etcs.
Problem is, I keep getting fork: device temp. unavailable

Do you know why?

Wout.

---
Wout Mertens | Always bring your towel... You might need it!
a.k.a. Weird Squirrel |

Harald Hanche-Olsen

Nov 30, 1994, 11:28:27 AM
In article <3bfq7s$19...@hermes.acs.ryerson.ca> rwig...@acs.ryerson.ca (Ron Wigmore) writes:

:> How do I get a file into a pipe and have it processed "in parallel"
:> or multiple times?

:> eg. How do I merge the following three pipes into one

:> sort foo | cut -c1-40 | (lotta stuff done to the 'left side') > bar
:> sort foo | cut -c21-60 | (lotta stuff done to the 'middle') >> bar
:> sort foo | cut -c41-80 | (lotta stuff done to the 'right side') >> bar

Wasn't there a program named tpipe or teepipe or something like that
posted to comp.sources.unix a long time ago? It should almost solve
your problem, except for putting all the outputs sequentially into the
`bar' file.
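
Failing that, a similar effect can be sketched with tee and named pipes. This is a different technique swapped in for illustration, not tpipe itself; mkfifo, the file names, and the column ranges are all assumptions. Each branch writes to its own file and the pieces are concatenated into bar afterwards, which sidesteps the ordering problem.

```shell
workdir=$(mktemp -d)
printf 'bbb 222 yyy\naaa 111 xxx\n' > "$workdir/foo"

mkfifo "$workdir/left.fifo" "$workdir/middle.fifo"

# One reader per branch, started in the background before the writer.
cut -c1-3 < "$workdir/left.fifo" > "$workdir/left.out" &
cut -c5-7 < "$workdir/middle.fifo" > "$workdir/middle.out" &

# foo is read and sorted once; tee copies the stream to both named pipes,
# and the right-hand columns are cut from tee's standard output.
sort "$workdir/foo" | tee "$workdir/left.fifo" "$workdir/middle.fifo" \
    | cut -c9-11 > "$workdir/right.out"
wait

# Assemble the pieces in a fixed order.
cat "$workdir/left.out" "$workdir/middle.out" "$workdir/right.out" > "$workdir/bar"
result=$(cat "$workdir/bar")
printf '%s\n' "$result"
rm -rf "$workdir"
```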

- Harald

Barry Margolin

Dec 1, 1994, 3:57:04 AM
In article <3bfq7s$19...@hermes.acs.ryerson.ca> rwig...@acs.ryerson.ca (Ron Wigmore) writes:
>but the real question is, how can you construct a pipe
>to read the 'sulog' file only ONCE and have it process the entire file
>multiple times?

You can't. Pipelines are purely linear.

This is why programs like awk and perl were invented. They can process the
standard input and maintain the state necessary to merge multiple results
of processing the input.
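
As a rough sketch of that approach for the su report: assuming sulog lines of the form `SU date time +/- tty from-to`, where `+` marks success and `-` failure (the format and the sample records below are assumptions, so check them against the local /var/adm/sulog), a single awk pass can keep several counters at once.

```shell
workdir=$(mktemp -d)
# Hypothetical sample records standing in for /var/adm/sulog.
printf '%s\n' \
    'SU 11/28 09:01 + ttyp0 alice-root' \
    'SU 11/28 09:05 - ttyp1 bob-root' \
    'SU 11/28 09:07 - ttyp1 bob-carol' \
    'SU 11/28 09:30 + ttyp2 carol-dave' \
    'SU 11/28 10:12 - ttyp3 mallory-root' > "$workdir/sulog"

# One pass over the file; each counter is state kept between lines.
report=$(awk '
    $4 == "-" { failed++;    if ($6 ~ /-root$/) failed_root++ }
    $4 == "+" { succeeded++; if ($6 ~ /-root$/) succeeded_root++ }
    END {
        printf "%d failed su attempts to root\n", failed_root
        printf "%d failed su attempts in total\n", failed
        printf "%d successful su attempts to root\n", succeeded_root
        printf "%d successful su attempts in total\n", succeeded
    }
' "$workdir/sulog")

printf '%s\n' "$report"
rm -rf "$workdir"
```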
--

Barry Margolin
BBN Internet Services Corp.
bar...@near.net

Brian Blackmore

Dec 5, 1994, 1:30:51 PM
Wout Mertens (wmer...@eduserv.rug.ac.be) wrote:

: Randal L. Schwartz (mer...@stonehenge.com) wrote:

: : Yes, it's that time of the week again. This week's winner of my
: : frequently awarded "useless use of cat award" goes to jay, as in:
: [stuff deleted]
: : Remember.... "cat something | blah blah" is almost always not needed.

: I hope that in this case it is justified?

: echo
: echo Your favorite Logged In Ones:
: echo
: if [ `who|cut -c1-8|sort|uniq|cat /dev/stdin .favlist \
: |sort|uniq -d|tee /dev/tty|wc -l` -eq 0 ]; then
: ^^^^^^ I split the above line into two lines...
: echo Nobody
: fi

Yours is a "cat something something", which is not the same as a
"cat something". Concatenating files together is what cat was originally
written for. There are rare exceptions where it is worth using cat with
one (or even no) arguments; these almost always involve a cat that is NOT
in a simple pipeline, and they only serve to prove the rule.
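
A small sketch of that legitimate use, with `-` standing for standard input (a portable spelling of /dev/stdin) and made-up data in place of who and .favlist. It relies on neither list containing duplicates of its own.

```shell
workdir=$(mktemp -d)
printf 'alice\ncarol\n' > "$workdir/favlist"

# printf stands in for "who | cut -c1-8 | sort | uniq".
# cat joins its standard input (-) with the favorites file; after sorting,
# uniq -d prints only the names that occur in both lists.
logged_in_favorites=$(printf 'bob\nalice\ndave\n' \
    | cat - "$workdir/favlist" | sort | uniq -d)

printf '%s\n' "$logged_in_favorites"
rm -rf "$workdir"
```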

--
Brian Blackmore. Only 385 days left to next Christmas.
