SHELLdorado Newsletter 1/2005

Heiner Steven

Apr 30, 2005, 11:43:22 AM

SHELLdorado Newsletter 1/2005 - April 30th, 2005

================================================================
The "SHELLdorado Newsletter" covers UNIX shell script related
topics. To subscribe to this newsletter, leave your e-mail
address at the SHELLdorado home page:

http://www.shelldorado.com/

View previous issues at the following location:

http://www.shelldorado.com/newsletter/

"Heiner's SHELLdorado" is a place for UNIX shell script
programmers providing

Many shell script examples, shell scripting tips & tricks,
a large collection of shell-related links & more...
================================================================

Contents

o Shell Tip: How to read a file line-by-line
o Shell Tip: Print a line from a file given its line number
o Shell Tip: How to convert upper-case file names to lower-case
o Shell Tip: Speeding up scripts using "xargs"
o Shell Tip: How to avoid "Argument list too long" errors

-----------------------------------------------------------------
>> Shell Tip: How to read a file line-by-line
-----------------------------------------------------------------

Assume you have a large text file, and want to process it
line by line. How could you do it?

file=/etc/motd
for line in `cat $file` # WRONG
do
echo "$line"
done

...is no solution, because the variable "line" will in turn
contain each (whitespace-delimited) *word* of the file, not
each line. The "while" command is a better candidate for
this job:

file=/etc/motd
while read line
do
echo "$line"
done < "$file"

Note that the "read" command automatically processes its
input: it removes leading whitespace from each line, and
concatenates a line ending with "\" with the one following.
The following commands suppress this behaviour:

file=/etc/motd
OIFS=$IFS; IFS= # Change input field separator
while read -r line
do
echo "$line"
done < "$file"
IFS=$OIFS # Restore old value

There still is one disadvantage to this loop: it's slow. If
the processing consists of string manipulations, consider
replacing the loop completely e.g. with an AWK script.

Portability:
"read -r" is available with ksh, ksh93, bash, zsh,
POSIX, but not with older Bourne Shells (sh).
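To make the difference between the two loops concrete, here is a
minimal sketch. The function names print_lines_cooked and
print_lines_raw are made up for this illustration, and "printf" is
used instead of "echo" only so that the output does not depend on how
a particular "echo" treats backslashes.

```shell
# Plain "read": strips leading whitespace and treats "\" specially.
print_lines_cooked() {
    while read line
    do
        printf '%s\n' "$line"
    done < "$1"
}

# "IFS= read -r": each line is passed through unchanged.
print_lines_raw() {
    while IFS= read -r line
    do
        printf '%s\n' "$line"
    done < "$1"
}
```

Running both functions on a file that contains indented lines or
backslashes shows which parts the plain "read" drops.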

-----------------------------------------------------------------
>> Shell Tip: Print a line from a file given its line number
-----------------------------------------------------------------

Regular expressions can be very powerful, and there are many
tools (like "egrep") that let you use them on any file. But
what if we simply want to get the 5th line of a file? No
elaborate regular expression required:

lineno=5
sed -n "${lineno}p"

prints the 5th line without involving ^.[]*$ or other
meta-characters resembling the noise of a defective serial
interface. "sed -n" means: do not automatically print each
line. "5p" indicates: print line 5. We have to use
"${lineno}p" here instead of "$linenop", because otherwise
the shell would try to expand the variable "$linenop", not
knowing that "p" is an "sed" command.

This could be improved upon. Assume the input file is
"/usr/dict/words", and consists of 25143 lines. The "sed"
command above would not only dutifully print line 5, but
also continue to read the following 25138 lines, doing what
it was told to do: ignore them. The following command makes
"sed" stop reading after line 5:

lineno=5
sed -n "${lineno}{p;q;}"
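Wrapped in a small function, the idea might look like the following
sketch. The name nth_line and the argument check are additions for
illustration only; without a file argument the function reads
standard input, just like the bare "sed" command.

```shell
# Print line number $1 of the given file(s), or of standard input.
nth_line() {
    lineno=$1
    shift
    case $lineno in
        ''|*[!0-9]*)
            echo "nth_line: not a line number: $lineno" >&2
            return 1 ;;
    esac
    # Print the requested line, then quit without reading the rest
    sed -n "${lineno}{p;q;}" "$@"
}
```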

So you think you have a better solution for this problem?
Prove it: send me your suggestion
(heiner...@shelldorado.com, closing date: 2005-05-31),
and I'll measure the speed of all contributions on a Linux
and a Solaris system. The fastest (or most elegant) solution
using only POSIX shell commands will be published in the
next SHELLdorado Newsletter.

-----------------------------------------------------------------
>> Shell Tip: How to convert upper-case file names to lower-case
-----------------------------------------------------------------

Admit it: you sometimes copy files from an operating system
with a name ending in *indows. A frequent annoyance is file
names IN ALL UPPER CASE.

The following command renames them to contain only lower
case characters:

for file in *
do
lcase=`echo "$file" | tr '[A-Z]' '[a-z]'`

# Does the target file exist already? Do not
# overwrite it:
[ -f "$lcase" ] && continue

# Are old and new name different?
[ x"$file" = x"$lcase" ] && continue # no change

mv "$file" "$lcase"
done

The KornShell (and ksh93) has the useful "typeset -l"
option, which will automatically convert the contents of a
variable to lower case:

$ typeset -l lcase=ABCDE
$ echo "$lcase"
abcde

Changing the above loop to use "typeset -l" is left as an
exercise for the reader.

-----------------------------------------------------------------
>> Shell Tip: Speeding up scripts using "xargs"
-----------------------------------------------------------------

The essential part of writing fast scripts is avoiding
external processes.

for file in *.txt
do
gzip "$file"
done

is much slower than just

gzip *.txt

because the former code may need many "gzip" processes for a
task the latter command accomplishes with only one external
process. But how could we build a command line like the one
above when the input files come from a file, or even
standard input? A naive approach could be

gzip `cat textfiles.list archivefiles.list`

but this command can easily run into an "Argument list too
long" error, and doesn't work with file names containing
embedded whitespace characters. A better solution is using
"xargs":

cat textfiles.list archivefiles.list | xargs gzip

The "xargs" command reads its input line by line, and builds
a command line by appending each line to its arguments
(here: "gzip"). Therefore the input

a.txt
b.txt
c.txt

would result in "xargs" executing the command

gzip a.txt b.txt c.txt

"xargs" also takes care that the resulting command line does
not get too long, and therefore avoids "Argument list too
long" errors.
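To see how "xargs" groups its input, it can help to put "echo" in
front of the real command, so that the command lines are printed
instead of executed. The sketch below uses the standard "-n" option to
force small batches; the function name preview_batches is made up,
and the file names are assumed to contain no blanks, quotes or
backslashes.

```shell
# Print the "gzip" command lines "xargs" would build from the names
# on standard input, with at most $1 file names per command line.
preview_batches() {
    xargs -n "$1" echo gzip
}
```

With the three names from the example as input, "preview_batches 3"
prints the single command line shown above, while "preview_batches 2"
prints one line for a.txt and b.txt and a second one for c.txt.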

-----------------------------------------------------------------
>> Shell Tip: How to avoid "Argument list too long" errors
-----------------------------------------------------------------

Oh no, there it is again: the system's spool directory is
almost full (4018 files); old files need to be removed, and
all useful commands only print the dreaded "Argument list
too long":

$ cd /var/spool/data
$ ls *
ls: Argument list too long
$ rm *
rm: Argument list too long

So what exactly in the character '*' is too long? Well, the
current shell does the useful work of converting '*' to a
(large) list of files matching that pattern. This is not the
problem. Afterwards, it tries to execute the command (e.g.
"/bin/ls") with the file list using the system call
execve(2) (or a similar one). This system call has a
limitation for the maximum number of bytes that can be used
for arguments and environment variables(*), and fails.

It's important to note that the limitation is on the side of
the system call, not the shell's internal lists.

To work around this problem, we'll use shell-internal
functions, or ways to limit the number of files directly
specified as arguments to a command.

Examples:

o Don't specify arguments, to get the (hopefully) useful
default:

$ ls

o Use shell-internal functionality ("echo" and "for" are
shell-internal commands):

$ echo *
file1 file2 [...]

$ for file in *; do rm "$file"; done # be careful!

o Use "xargs"

$ ls | xargs rm # careful!

$ find . -type f -size +100000 -print | xargs ...

o Limit the number of arguments for a command:

$ ls [a-l]*
$ ls [m-z]*

Using these techniques should help you get around the
problem.
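A related option, not listed above, is to let "find" do the batching
itself with the "-exec ... {} +" form specified by POSIX. The
following is only a sketch: the function name remove_matching is
invented, older "find" implementations may lack the "+" form, and
"find" descends into subdirectories as well.

```shell
# Remove regular files below directory $1 whose names match the
# shell pattern $2. The pattern is expanded by "find", not by the
# shell, so no long argument list is ever built by the caller.
remove_matching() {
    find "$1" -type f -name "$2" -exec rm -f -- {} +
}
```

The pattern has to be quoted when calling the function, for example
remove_matching /var/spool/data 'old.*', so that the shell passes it
on unexpanded.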

---
(*) Parameter ARG_MAX, often 128K (Linux) or 1 or 2 MB
(Solaris).
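The limit on a particular system can usually be queried with
"getconf". A minimal sketch (the variable name arg_max is arbitrary):

```shell
# Ask the system for its limit on arguments plus environment (bytes)
arg_max=$(getconf ARG_MAX)
printf 'ARG_MAX is %s bytes\n' "$arg_max"
```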

----------------------------------------------------------------
If you want to comment on this newsletter, have suggestions for
new topics to be covered in one of the next issues, or even want
to submit an article of your own, send an e-mail to

mailto:heiner...@shelldorado.com

================================================================
To unsubscribe, send a mail with the body "unsubscribe" to
newsl...@shelldorado.com
================================================================

Ed Morton

May 2, 2005, 9:43:49 AM

Heiner Steven wrote:
> SHELLdorado Newsletter 1/2005 - April 30th, 2005

Are you asking us to review these suggestions or are they part of a FAQ
(or some other list) that's previously been reviewed or something else?
Assuming the former:

<snip>

> -----------------------------------------------------------------
> >> Shell Tip: How to read a file line-by-line
> -----------------------------------------------------------------

<snip>

> file=/etc/motd
> OIFS=$IFS; IFS= # Change input field separator
> while read -r line
> do
> echo "$line"
> done < "$file"
> IFS=$OIFS # Restore old value

Or, preferably:

file=/etc/motd
while IFS= read -r line
do
echo "$line"
done < "$file"

<snip>

> -----------------------------------------------------------------
> >> Shell Tip: Print a line from a file given its line number
> -----------------------------------------------------------------

<snip>

> lineno=5
> sed -n "${lineno}{p;q;}"
>
> So you think you have a better solution for this problem?
> Prove it: send me your suggestion
> (heiner...@shelldorado.com, closing date: 2005-05-31),
> and I'll measure the speed of all contributions on a Linux
> and a Solaris system. The fastest (or most elegant) solution
> using only POSIX shell commands will be published in the
> next SHELLdorado Newsletter.

We saw in a different thread recently (http://tinyurl.com/87c8d) about a
similar problem that both of these might be faster than the sed
approach in this case:

head -n "${lineno}" | tail -1
awk -vln="${lineno}" 'NR==ln{print;exit}'

>
> -----------------------------------------------------------------
> >> Shell Tip: How to convert upper-case file names to lower-case
> -----------------------------------------------------------------

<snip>

> Admit it: you sometimes copy files from an operating system
> with a name ending in *indows. A frequent annoyance are file
> names IN ALL UPPER CASE.
>
> The following command renames them to contain only lower
> case characters:
>
> for file in *
> do
> lcase=`echo "$file" | tr '[A-Z]' '[a-z]'`

This doesn't seem worth mentioning as I'd imagine it's the first thing a
newbie would do anyway.

<snip>

> -----------------------------------------------------------------
> >> Shell Tip: How to avoid "Argument list too long" errors
> -----------------------------------------------------------------

<snip>

> Examples:
>
> o Don't specify arguments, to get the (hopefully) useful
> default:
>
> $ ls
>
> o Use shell-internal functionality ("echo" and "for" are
> shell-internal commands):
>
> $ echo *
> file1 file2 [...]
>
> $ for file in *; do rm "$file"; done # be careful!
>
> o Use "xargs"
>
> $ ls | xargs rm # careful!
>
> $ find . -type f -size +100000 -print | xargs ...
>
> o Limit the number of arguments for a command:
>
> $ ls [a-l]*
> $ ls [m-z]*
>
> Using this techniques should help getting around the
> problem.

Again, none of these except the "xargs" one, seem worth mentioning as
they're all obvious even to a newcomer.

Ed.

Markus Gyger

May 2, 2005, 10:42:11 AM

Ed Morton writes:
> file=/etc/motd
> while IFS= read -r line
> do
> echo "$line"
> done < "$file"

Note that you don't have to use "" after <, > or = if you
are using variables only (e.g. a=$b$c or <$file is fine).

Markus

Heiner Steven

May 2, 2005, 12:37:21 PM

Ed Morton wrote:

> Heiner Steven wrote:
>
>> SHELLdorado Newsletter 1/2005 - April 30th, 2005

[...]

> Are you asking us to review these suggestions or are they part of a FAQ
> (or some other list) that's previously been reviewed or something else?

Following the previous issues of the newsletter I got many
comments suggesting to also include tips for shell scripting
beginners, not only for experienced script programmers.

Recently I gained some experience with novice script programmers,
and the tips below cover the most frequently asked questions.
I know that we in comp.unix.shell have a higher knowledge
level, and also have Joe's excellent FAQ, but nevertheless
I hope some beginners find urgent questions answered in the
newsletter.

Of course I always welcome comments, so here we go...

> Assuming the former:
>
> <snip>
>
>> -----------------------------------------------------------------
>> >> Shell Tip: How to read a file line-by-line
>> -----------------------------------------------------------------
>
> <snip>
>
>> file=/etc/motd
>> OIFS=$IFS; IFS= # Change input field separator
>> while read -r line
>> do
>> echo "$line"
>> done < "$file"
>> IFS=$OIFS # Restore old value
>
>
> Or, preferably:
>
> file=/etc/motd
> while IFS= read -r line
> do
> echo "$line"
> done < "$file"

I remember having had problems with a particular shell
interpreting the construct "while IFS= read ...", because
"IFS=" was taken as a literal command. I couldn't reproduce
the problem with the shells in my reach.

It works with

Solaris 8
/bin/sh
/bin/ksh Version M-11/16/88i
/bin/bash 2.03.0(1)-release
/usr/dt/bin/dtksh Version M-12/28/93d (ksh93)

Linux
bash BASH 3.00.0(1)-release
pdksh @(#)PD KSH v5.2.14 99/07/13.2
ksh93 Version M 1993-12-28 p
zsh 4.2.1
ash 1.6.1

Does anybody have an example of a shell where it does not
work?

[...]

>> -----------------------------------------------------------------
>> >> Shell Tip: Print a line from a file given its line number
>> -----------------------------------------------------------------
>
> <snip>
>
>> lineno=5
>> sed -n "${lineno}{p;q;}"

[...]

> We saw in a different thread recently (http://tinyurl.com/87c8d) about a
> similair problem that both of these might be faster than the sed
> approach in this case:
>
> head -n "${lineno}" | tail -1
> awk -vln="${lineno}" 'NR==ln{print;exit}'

I missed that thread, and acknowledge that it has been handled
extensively. You even measured the run times of the suggestions.
Luckily I did not promise a prize for the fastest example,
otherwise somebody could just have cut'n'pasted from your
posting ;-)

>> -----------------------------------------------------------------
>> >> Shell Tip: How to convert upper-case file names to lower-case
>> -----------------------------------------------------------------

[...]

> This doesn't seem worth mentioning as I'd imagine it's the first thing a
> newbie would do anyway.

Maybe not interesting for a frequent comp.unix.shell reader,
but sometimes an insurmountable problem for a novice programmer.

>> -----------------------------------------------------------------
>> >> Shell Tip: How to avoid "Argument list too long" errors
>> -----------------------------------------------------------------
>
> <snip>
>
>> Examples:
>>
>> o Don't specify arguments, to get the (hopefully) useful
>> default:
>>
>> $ ls
>>
>> o Use shell-internal functionality ("echo" and "for" are
>> shell-internal commands):
>>
>> $ echo *
>> file1 file2 [...]
>>
>> $ for file in *; do rm "$file"; done # be careful!
>>
>> o Use "xargs"
>>
>> $ ls | xargs rm # careful!
>>
>> $ find . -type f -size +100000 -print | xargs ...
>>
>> o Limit the number of arguments for a command:
>>
>> $ ls [a-l]*
>> $ ls [m-z]*
>>
>> Using this techniques should help getting around the
>> problem.

> Again, none of these except the "xargs" one, seem worth mentioning as
> they're all obvious even to a newcomer.

I disagree here. A newcomer usually doesn't know that the
"Argument list too long" error is due to externally executed
commands, and that shell-internal commands do not have the
limitation. For others the footnote not cited here listing common
argument length limitations could be interesting.

What do others think about this? Should I stop posting the
irregular newsletter here, or only post it when advanced topics are covered?

Heiner
--
___ _
/ __| |_ _____ _____ _ _ Heiner STEVEN <heiner...@nexgo.de>
\__ \ _/ -_) V / -_) ' \ Shell Script Programmers: visit
|___/\__\___|\_/\___|_||_| http://www.shelldorado.com/

Michael Tosch

May 2, 2005, 1:35:03 PM

?
/bin/sh takes IFS= but not -r
/usr/xpg4/bin/sh takes also -r

You have two more good reasons to leave the sed solution:
- sed for printing a single line was not slower than awk
- the proposed awk solution is less portable

Yep. This is an FAQ, and the answer is useful.

>
> How do others think about this? Should I stop posting the
> irregular newletter here, or only when advanced topics are covered?
>

This newsgroup should not become "advanced-only"!
Keep on posting, but please put the keyword FAQ into the Subject:
SHELLdorado FAQ Newsletter ...

--
Michael Tosch @ hp : com

Janis Papanagnou

May 2, 2005, 1:37:39 PM

Heiner Steven wrote:
> Ed Morton wrote:
>> Heiner Steven wrote:
>>
>>> file=/etc/motd
>>> OIFS=$IFS; IFS= # Change input field separator
>>> while read -r line
>>> do
>>> echo "$line"
>>> done < "$file"
>>> IFS=$OIFS # Restore old value
>>
>>
>>
>> Or, preferably:
>>
>> file=/etc/motd
>> while IFS= read -r line
>> do
>> echo "$line"
>> done < "$file"
>
>
> I remember having had problems with a particular shell
> interpreting the construct "while IFS= read ...", because
> "IFS=" was taken as a literal command. I couldn't reproduce
> the problem with the shells in my reach.

> Does anybody have an example of a shell where it does not
> work?

I recently had a (yet unsolved) problem with these two forms - though
in another context: a test operator - which behave differently on two
different ksh93 versions (q and, I think, g) on Linux. So there may be
problems, but I cannot be more precise at the moment since the problem
might as well be on my side. I resorted to the store/restore approach.

> How do others think about this? Should I stop posting the
> irregular newletter here, or only when advanced topics are covered?

I think it's good to have a newsletter. While I'd appreciate a higher
level - for reasons of self interest ;-) - I don't mind if you post
it as you've done, even if I share Ed's opinion that it might not hit
the audience in this newsgroup perfectly well.

Janis

Ed Morton

May 2, 2005, 1:56:18 PM

Heiner Steven wrote:
<snip>

> How do others think about this? Should I stop posting the
> irregular newletter here, or only when advanced topics are covered?

I personally don't mind the posting at all, I just don't recall ever
seeing this newsletter before so I wasn't sure where you were coming
from in posting it. I just did a google search and found that the
previous newsletter was posted in Jan 2003 which is probably why I don't
remember seeing it.

I don't mean this as a put down in any way but I'm still not 100% clear
what the purpose is of the newsletter as a whole or why those particular
5 subjects were addressed. Are they questions you've seen asked
frequently somewhere or was there some other criteria used in choosing them?

Ed.

Heiner Steven

May 2, 2005, 2:14:21 PM

Ed Morton wrote:

[...]

> I don't mean this as a put down in any way but I'm still not 100% clear
> what the purpose is of the newsletter as a whole or why those particular
> 5 subjects were addressed. Are they questions you've seen asked
> frequently somewhere or was there some other criteria used in choosing
> them?

Maybe the selection of subjects seemed a little arbitrary,
but they are the questions I know by personal experience
are asked most frequently from shell scripting beginners.
In fact, some of them I've personally been asked (and answered)
several times already.

Other issues of the newsletter often addressed more
advanced topics, from writing shell-only CGI-scripts, over
setting a timeout for commands, to writing a small example
HTTP server using ksh. [http://www.shelldorado.com/newsletter/]

The feedback was generally positive, but many people asked
for inclusion of more basic topics. Maybe I'll find the
right mixture of easy and more advanced tips in some
of the newsletters to come ;-)

In either case I welcome feedback and suggestions.

matt_left_coast

May 2, 2005, 2:20:38 PM

Ed Morton wrote:

>
>
> Heiner Steven wrote:
> <snip>
>> How do others think about this? Should I stop posting the
>> irregular newletter here, or only when advanced topics are covered?
>
> I personally don't mind the posting at all, I just don't recall ever
> seeing this newsletter before so I wasn't sure where you were coming
> from in posting it. I just did a google search and found that the
> previous newsletter was posted in Jan 2003 which is probably why I don't
> remember seeing it.
>
> I don't mean this as a put down in any way but I'm still not 100% clear
> what the purpose is of the newsletter as a whole or why those particular
> 5 subjects were addressed.

Personally I think it obvious. The line
"Many shell script examples, shell scripting tips & tricks,
a large collection of shell-related links & more" talks about "tips and
tricks". Clearly the newsletter puts out different "tips and tricks".

> Are they questions you've seen asked
> frequently somewhere or was there some other criteria used in choosing
> them?
>
> Ed.

I have to ask you, does everything someone offers as useful or interesting
have to be an answer to a question? Is it beyond comprehension that someone
may post something strictly because they think it interesting or
helpful???

Ed Morton

May 2, 2005, 2:35:15 PM

Michael Tosch wrote:
> Heiner Steven wrote:
>
>> Ed Morton wrote:

<snip>

>>>> -----------------------------------------------------------------
>>>> >> Shell Tip: Print a line from a file given its line number
>>>> -----------------------------------------------------------------
>>> <snip>
>>>
>>>> lineno=5
>>>> sed -n "${lineno}{p;q;}"
>> [...]
>>
>>> We saw in a different thread recently (http://tinyurl.com/87c8d)
>>> about a similair problem that both of these might be faster than the
>>> sed approach in this case:
>>>
>>> head -n "${lineno}" | tail -1
>>> awk -vln="${lineno}" 'NR==ln{print;exit}'

<snip>

> You have two more good reasons to leave the sed solution:
> - sed for printing a single line was not slower than awk

We hadn't seen the results either way on that, hence "might be faster".
FWIW, a quick test of printing the 5th line out of 100000 on Solaris gave:

$ lineno=5
$ time sed -n "${lineno}{p;q;}" file
All work and no play makes Jack a dull boy

real 0m0.05s
user 0m0.00s
sys 0m0.03s
$ time gawk -vln="${lineno}" 'NR==ln{print;exit}' file
All work and no play makes Jack a dull boy

real 0m0.09s
user 0m0.01s
sys 0m0.05s
$ time head -n "${lineno}" file | tail -1
All work and no play makes Jack a dull boy

real 0m0.04s
user 0m0.01s
sys 0m0.05s

while printing the 99995th gave:

$ lineno=99995
$ time sed -n "${lineno}{p;q;}" file
All work and no play makes Jack a dull boy

real 0m0.48s
user 0m0.26s
sys 0m0.22s
$ time gawk -vln="${lineno}" 'NR==ln{print;exit}' file
All work and no play makes Jack a dull boy

real 0m0.61s
user 0m0.52s
sys 0m0.06s
$ time head -n "${lineno}" file | tail -1
All work and no play makes Jack a dull boy

real 0m6.00s
user 0m1.37s
sys 0m9.91s

so the head...tail has the expected problem when the line number crawls
up, but there doesn't seem to be much to choose between the sed and awk
versions (all awks except old awk had similar times). Given that, I'd
stick with the sed solution since it's shorter to type.
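For anyone who wants to repeat the comparison on their own system,
here is a rough sketch of the three candidates as functions, plus a
helper that builds a numbered test file. All four function names are
invented for this example, and "tail -n 1" is used in place of the
older "tail -1" spelling.

```shell
# Three ways to print line $1 of file $2.
line_by_sed()  { sed -n "${1}{p;q;}" "$2"; }
line_by_awk()  { awk -v ln="$1" 'NR == ln { print; exit }' "$2"; }
line_by_head() { head -n "$1" "$2" | tail -n 1; }

# Write $1 numbered lines ("line 1", "line 2", ...) to file $2.
make_test_file() {
    awk -v n="$1" 'BEGIN { for (i = 1; i <= n; i++) print "line " i }' > "$2"
}
```

In shells where "time" is a keyword (bash, ksh, zsh), each function
can be timed directly, e.g. "time line_by_sed 99995 file".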

> - the proposed awk solution is less portable

If portability to old awk is the main concern, then modify it to:

awk 'NR=='"$lineno"'{print;exit}'

but all newer awks support the "-v" option as shown above so I'd
recommend sticking with that.

<snip>

> This newsgroup should not become "advanced-only"!

I certainly didn't mean to imply that it should and I apologise if it
came off that way.

Ed.

Stephane CHAZELAS

May 2, 2005, 2:55:00 PM

2005-04-30, 17:43(+02), Heiner Steven:

> SHELLdorado Newsletter 1/2005 - April 30th, 2005

You may prefer to cross post, instead of multi posting (to cus
and cuq).

[...]

> file=/etc/motd
> for line in `cat $file` # WRONG
> do
> echo "$line"
> done
>
> ...is no solution, because the variable "line" will in turn
> contain each (whitespace-delimited) *word* of the file, not
> each line.

Not to speak of filename generation.

> The "while" command is a better candidate for
> this job:
>
> file=/etc/motd
> while read line
> do
> echo "$line"
> done < "$file"
>
> Note that the "read" command automatically processes its
> input: it removes leading whitespace from each line, and
> concatenates a line ending with "\" with the one following.
> The following commands suppress this behaviour:
>
> file=/etc/motd
> OIFS=$IFS; IFS= # Change input field separator
> while read -r line
> do
> echo "$line"
> done < "$file"
> IFS=$OIFS # Restore old value

This doesn't restore the previous value if IFS was previously
unset.
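One way to handle the unset case is to record whether IFS was set at
all before changing it. A minimal sketch (the helper names save_ifs
and restore_ifs and the two variables they use are made up here):

```shell
# Remember whether IFS is set at all, and its value if it is.
save_ifs() {
    if [ "${IFS+set}" = set ]; then
        saved_ifs=$IFS
        ifs_was_set=yes
    else
        ifs_was_set=no
    fi
}

# Put IFS back exactly as save_ifs found it.
restore_ifs() {
    if [ "$ifs_was_set" = yes ]; then
        IFS=$saved_ifs
    else
        unset IFS
    fi
}
```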

Also note that commands in the loop have their stdin affected.

As others pointed out:

while IFS= read -r line <&3; do
printf '%s\n' "$line" # echo should be banished
done 3< "$file"
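To illustrate why the extra file descriptor matters, here is a sketch
in which the loop body reads from standard input itself. The function
name pair_lines is invented for the example.

```shell
# Pair each line of file $1 with one line read from standard input.
# The file is read on descriptor 3, so the inner "read" still sees
# the caller's standard input rather than the file.
pair_lines() {
    while IFS= read -r line <&3
    do
        IFS= read -r answer || answer=
        printf '%s=%s\n' "$line" "$answer"
    done 3< "$1"
}
```

Had the file been redirected to standard input instead, the inner
"read" would have consumed every second line of the file.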

> There still is one disadvantage to this loop: it's slow. If
> the processing consists of string manipulations, consider
> replacing the loop completely e.g. with an AWK script.
>
> Portability:
> "read -r" is available with ksh, ksh93, bash, zsh,
> POSIX, but not with older Bourne Shells (sh).

I'd rather say "but not with older sh's (such as the Bourne shell)".
Actually, the Bourne Shell from SVR4.2 supports read -r
(according to Sven Mascheck, but I've never come across such a
shell). ash doesn't support -r but newer so-called POSIX
conformant shells based on it do (such as dash or newer BSD
shells).

Note that in the Bourne shell, "IFS= read line" sets IFS not only
for the duration of "read" but also afterwards. Also, in the
Bourne shell, redirected loops are run in a subshell (use
exec 3< "$file" to prevent that).

For shells that don't support read -r:

sed 's/\\/&&/g' < "$file" | while IFS= read line
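Spelled out as a complete loop, that could look like the sketch below
(the function name print_lines_compat is made up). Because of the
pipe, the loop runs in a subshell, so variables set inside it are not
visible afterwards.

```shell
# Print file $1 line by line using only plain "read" (no -r).
# Doubling each backslash first keeps "read" from interpreting it.
print_lines_compat() {
    sed 's/\\/&&/g' < "$1" | while IFS= read line
    do
        printf '%s\n' "$line"
    done
}
```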

But I think all that is in the Unix FAQ.

> -----------------------------------------------------------------
> >> Shell Tip: Print a line from a file given its line number
> -----------------------------------------------------------------

[...]

> lineno=5
> sed -n "${lineno}{p;q;}"

Or sed "$lineno!d;q"

> -----------------------------------------------------------------
> >> Shell Tip: How to convert upper-case file names to lower-case
> -----------------------------------------------------------------
>
> Admit it: you sometimes copy files from an operating system
> with a name ending in *indows. A frequent annoyance are file
> names IN ALL UPPER CASE.

The obvious solution is to use dedicated tools such as zsh's zmv:

autoload -U zmv # in ~/.zshrc
zmv '*[[:upper:]]*' '${(L)f}' # rename all files

zmv '^*[[:lower:]]*' '${(L)f}' # rename only the all ucase ones

>
> The following command renames them to contain only lower
> case characters:
>
> for file in *
> do
> lcase=`echo "$file" | tr '[A-Z]' '[a-z]'`

printf '%s\n' "$file" | ...

>
> # Does the target file exist already? Do not
> # overwrite it:
> [ -f "$lcase" ] && continue

"[ -f" is for exists *and is regular file*, why not renaming the
other types of files?

>
> # Are old and new name different?
> [ x"$file" = x"$lcase" ] && continue # no change
>
> mv "$file" "$lcase"

mv -- "$file...
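Folding these remarks into the loop gives something like the
following sketch. The function name lowercase_names is made up, the
POSIX character classes replace the A-Z ranges, and file names are
assumed not to end in a newline (command substitution would strip it).

```shell
# Rename the files in the current directory to lower-case names.
lowercase_names() {
    for file in *
    do
        lcase=$(printf '%s\n' "$file" | tr '[:upper:]' '[:lower:]')

        # Nothing to do if the name is already lower case
        [ "$file" = "$lcase" ] && continue

        # Never overwrite an existing file, whatever its type
        [ -e "$lcase" ] && continue

        # "--" protects against names starting with "-"
        mv -- "$file" "$lcase"
    done
}
```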

[...]

> because the former code may need many "gzip" processes for a
> task the latter command accomplishes with only one external
> process. But how could we build a command line like the one
> above when the input files come from a file, or even
> standard input? A naive approach could be
>
> gzip `cat textfiles.list archivefiles.list`
>
> but this command can easily run into an "Argument list too
> long" error, and doesn't work with file names containing
> embedded whitespace characters.

That depends on what value you gave to IFS.

With
IFS='
'; set -f

There shouldn't be problems with filenames with blanks (there
should be with filenames with newlines though).
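As a sketch, the two settings can be confined to a subshell so that
they don't leak into the rest of the script. The function name
run_with_list is invented for this example, and nothing here protects
against "Argument list too long".

```shell
# Run the command given as $2... with the lines of file $1 appended
# as arguments. The "( ... )" body keeps the changes local.
run_with_list() (
    list=$1
    shift
    # Split on newlines only
    IFS='
'
    # Disable filename generation
    set -f
    "$@" $(cat "$list")
)
```

For the example above this would be called as
run_with_list textfiles.list gzip.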

> A better solution is using
> "xargs":

Which works only if you use quoting properly (and various
implementations of xargs understand different quotings). It
doesn't work if textfiles.list contains one filename per line
(unless filenames are known not to contain any blanks or quotes
or backslashes).

zsh's zargs would be a better solution. Recent versions of ksh93
have command -x.

[...]

> cat textfiles.list archivefiles.list | xargs gzip
>
> The "xargs" command reads its input line by line, and build
> a command line by appending each line to its arguments
> (here: "gzip"). Therefore the input

No, not /each line/.

Also note the problem fixed with xargs -r in some xargs
implementations (gzip called even if the file list is empty).
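One possible workaround for a list with one filename per line is to
backslash-escape every character before handing the list to xargs,
relying on xargs treating a backslash as quoting the next character.
This is only a sketch, the function name escape_for_xargs is made up,
and filenames containing newlines remain unsupported.

```shell
# Turn "one file name per line" into input that "xargs" parses as one
# argument per line, by putting a backslash before every character.
escape_for_xargs() {
    sed 's/./\\&/g'
}
```

It would be used as
escape_for_xargs < textfiles.list | xargs gzip.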

> -----------------------------------------------------------------
> >> Shell Tip: How to avoid "Argument list too long" errors
> -----------------------------------------------------------------
>
> Oh no, there it is again: the system's spool directory is
> almost full (4018 files); old files need to be removed, and
> all useful commands only print the dreaded "Argument list
> too long":
>
> $ cd /var/spool/data
> $ ls *
> ls: Argument list too long
> $ rm *
> rm: Argument list too long

ls ./* rm ./*
or
ls -- * rm -- *

[...]

> $ echo *
> file1 file2 [...]
>
> $ for file in *; do rm "$file"; done # be careful!

rm -- "$file"
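As a complete sketch (the function name remove_files_in is invented),
with the loop confined to a subshell so that the "cd" does not affect
the caller:

```shell
# Remove all regular files in directory $1, one "rm" per file.
# "--" keeps names such as "-rf" from being taken as options.
remove_files_in() (
    cd -- "$1" || exit 1
    for file in *
    do
        [ -f "$file" ] || continue
        rm -- "$file"
    done
)
```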

>
> o Use "xargs"
>
> $ ls | xargs rm # careful!
>
> $ find . -type f -size +100000 -print | xargs ...
>
> o Limit the number of arguments for a command:
>
> $ ls [a-l]*
> $ ls [m-z]*

and zsh's zargs:

autoload -U zargs # in ~/.zshrc
zargs ./* -- rm

or ksh93 command -x:
command -x rm ./*

Or use zsh's builtin rm:

zmodload zsh/files
rm ./*

--
Stéphane

Ed Morton

May 2, 2005, 2:47:38 PM

matt_left_coast wrote:

> Ed Morton wrote:
<snip>

>>I don't mean this as a put down in any way but I'm still not 100% clear
>>what the purpose is of the newsletter as a whole or why those particular
>>5 subjects were addressed.
>
>
> Personally I think it obvious. The line
> "Many shell script examples, shell scripting tips & tricks,
> a large collection of shell-related links & more" talks about "tips and
> tricks". Clearly the newsletter puts out different "tips and tricks".

That text described what the web site contains, not what the newsletter
contains. I completely understand the purpose of the web site and I've
even referred to it in postings here. I'm just not clear on the purpose
of the newsletter. If it's to address specific questions raised
somewhere, that's fine. If it's to serve as a reminder that the web site
exists, that's fine, too. If it's to get feedback from us before adding
the tips to the web site, that's also fine. Whatever... it'd just be
nice to know the intent.

>>Are they questions you've seen asked
>>frequently somewhere or was there some other criteria used in choosing
>>them?
>>
>>Ed.
>
>
>
> I have to ask you, does everything someone offers as usefull or interesting
> have to be an answer to a question? Is it beyond comprehension that someone
> may post something strictly because they think it interesting or
> helpfull???

Well, no, but should we all go off and write up half a dozen things we
think are useful or interesting and just post them without explaining
why we feel it's appropriate to post those specific items?

If we do that, should we (or anyone else!) feel surprised or offended if
someone asks for some context?

Ed.

Stephane CHAZELAS

May 2, 2005, 3:01:17 PM

2005-05-2, 14:42(+00), Markus Gyger:

[...]

Yes, you do for bash and ksh and it won't harm anyway:

$ echo a > '*'
$ echo b > 'b'
$ echo c > '[ab]'
$ ls -l
total 12
-rw-r--r-- 1 chazelas chazelas 2 May 2 19:56 *
-rw-r--r-- 1 chazelas chazelas 2 May 2 19:56 [ab]
-rw-r--r-- 1 chazelas chazelas 2 May 2 19:56 b
$ a='[ab]' bash -c 'cat < $a'
b
$ a='[ab]' ksh -c 'cat < $a'
c
$ a='[ab]' ksh -ic 'cat < $a'
b
$ a='*' bash -c 'cat < $a'
bash: $a: ambiguous redirect
$ a='*' zsh -c 'setopt globsubst multios; cat < $a'
a
c
b
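
A minimal sketch of the quoted form, for comparison with the unquoted runs above. The scratch directory from "mktemp -d" and the variable name "quoted_result" are assumptions made for this illustration only; "mktemp" itself is widely available but not required by POSIX.

```shell
# Scratch directory (assumption: mktemp -d is available on this system)
dir=$(mktemp -d) || exit 1

# Same three files as in the transcript above
echo a > "$dir/*"
echo b > "$dir/b"
echo c > "$dir/[ab]"

a="$dir/[ab]"

# Quoted: the redirection target is used literally, so no shell
# gets a chance to treat [ab] as a pattern.
quoted_result=$(cat < "$a")
printf '%s\n' "$quoted_result"    # prints: c

rm -rf "$dir"
```

With the quotes in place the result should not depend on which shell runs the script, or on whether it is interactive.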

--
Stéphane

matt_left_coast

May 2, 2005, 3:35:36 PM

Ed Morton wrote:

>
>
> matt_left_coast wrote:
>
>> Ed Morton wrote:
> <snip>
>>>I don't mean this as a put down in any way but I'm still not 100% clear
>>>what the purpose is of the newsletter as a whole or why those particular
>>>5 subjects were addressed.
>>
>>
>> Personally I think it obvious. The line
>> "Many shell script examples, shell scripting tips & tricks,
>> a large collection of shell-related links & more" talks about "tips and
>> tricks". Clearly the newsletter puts out different "tips and tricks".
>
> That text described what the web site contains, not what the newsletter
> contains. I completely understand the purpose of the web site and I've
> even referred to it in postings here.

Let's see, a newsletter from a web site that contains "Many shell script
examples, shell scripting tips & tricks, a large collection of
shell-related links & more" is to contain something DIFFERENT? Anyhow, a
quick browse through the content would convince anyone with even half a
brain that it contains "tips & tricks" just like the web site in the link.

> I'm just not clear on the purpose
> of the newsletter. If it's to address specific questions raised
> somewhere, that's fine.

Again why does it have to address specific questions? Why can't someone or
some organization just post something they find interesting or useful when
it comes to UNIX shell scripting. I find nothing off topic in the post and
wonder why you are having such an issue?

> If it's to serve as a reminder that the web site
> exists, that's fine, too.

That is one of the purposes I think the OP was intending.

> If it's to get feedback from us before adding
> the tips to the web site, that's also fine. Whatever... it'd just be
> nice to know the intent.

Where did he ask for feedback? Again, it is clear to me that the poster was
posting something ON TOPIC that they thought was of interest to the group.
At NO POINT did they ask for feedback. The fact that it is a *NEWS*letter
tends to rule out that they are asking for feedback. "News" implies they
are giving out information.

Why does the PURPOSE have to be approved by YOU? Are you saying that any of
the content was OFF TOPIC for "comp.unix.shell"?

>
>>>Are they questions you've seen asked
>>>frequently somewhere or was there some other criteria used in choosing
>>>them?
>>>
>>>Ed.
>>
>>
>>
>> I have to ask you, does everything someone offers as useful or
>> interesting have to be an answer to a question? Is it beyond
>> comprehension that someone may post something strictly because they think
>> it interesting or helpful???
>
> Well, no,

then what is your problem? Are you saying that "tips and tricks" about UNIX
shell commands is off topic in "comp.unix.shell"?

> but should we all go off and write up half a dozen things we
> think are useful or interesting and just post them without explaining
> why we feel it's appropriate to post those specific items?

Why would posting interesting "tips and tricks" about Unix shell commands
be inappropriate? Is this group called "questions and answers only"? It
should be obvious why posting half a dozen "tips and tricks" about UNIX
shell commands would be appropriate for comp.unix.shell. Unless you think
this group is about something other than discussing UNIX shell stuff.
Considering they are about Unix shells, they were all ON TOPIC. I don

>
> If we do that, should we (or anyone else!) feel surprised or offended if
> someone asks for some context?

As long as the "context" is about UNIX shells why would anyone (AKA you) be
so offended??? The post was totally ON TOPIC for "comp.unix.shell" and I
for one appreciate the post.

>
> Ed.

Michael Tosch

May 2, 2005, 4:02:57 PM

Stephane CHAZELAS wrote:
> 2005-04-30, 17:43(+02), Heiner Steven:
>

...

>> The following commands suppress this behaviour:
>>
>> file=/etc/motd
>> OIFS=$IFS; IFS= # Change input field separator
>> while read -r line
>> do
>> echo "$line"
>> done < "$file"
>> IFS=$OIFS # Restore old value
>
>
> This doesn't restore the previous value if IFS was previously
> unset.
>
> Also note that commands in the loop have their stdin affected.
>
> As others pointed out:
>
> while IFS= read -r line <&3; do
> printf '%s\n' "$line" # echo should be banished
> done 3< "$file"

Let me translate for the newcomers:

while IFS= read -r line; do
echo "$line" # simple echo may remain
read dummy
done < "$file"

has the problem that the 2nd read command reads from the same file as
the loop's read.
Your trick with reading from &3 (or higher) prevents that.

This leads to the following solution to the FAQ
Why does rsh (or remsh) disturb a "while read" loop?

rsh (or remsh) "steals" some characters from stdin.

Preferred solution:

while IFS= read -r line <&3; do
rsh remote-host "echo '$line'"
done 3< "$file"

The other solution is

while IFS= read -r line; do
rsh -n remote-host "echo '$line'"
done < "$file"
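
The effect can be reproduced without a remote host. The following is only a sketch: "greedy" is a hypothetical stand-in for rsh (any command that reads its standard input behaves the same way), and the temporary file and counter names are made up for the illustration.

```shell
file=$(mktemp) || exit 1
printf '%s\n' one two three > "$file"

# Hypothetical stand-in for rsh: consumes whatever is on stdin.
greedy() { cat > /dev/null; }

# Loop reads from stdin: greedy swallows the remaining lines,
# so the body runs only once.
count_stdin=0
while IFS= read -r line; do
    greedy
    count_stdin=$((count_stdin + 1))
done < "$file"

# Loop reads from descriptor 3: the body's stdin is independent of the
# file. It is pointed at /dev/null here only so that the sketch never
# waits on a terminal.
count_fd3=0
while IFS= read -r line <&3; do
    greedy
    count_fd3=$((count_fd3 + 1))
done 3< "$file" < /dev/null

printf 'stdin loop: %s, fd 3 loop: %s\n' "$count_stdin" "$count_fd3"
# prints: stdin loop: 1, fd 3 loop: 3

rm -f "$file"
```

The descriptor 3 variant has the advantage that it does not rely on every command in the loop body having an option like "-n".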

Ed Morton

May 2, 2005, 4:09:25 PM

matt_left_coast wrote:
<snip>

> Let's see, a newsletter from a web site that contains "Many shell script
> examples, shell scripting tips & tricks, a large collection of
> shell-related links & more" Is to contain something DIFFERENT? Anyhow, a
> quick browse through the content would convince anyone with even half a
> brain that it contains "tips & tricks" just like the web site in the link.

I didn't ask what it contains, I asked what the purpose was of creating
and posting it.

<snip>

> Again why does it have to address specific questions? Why can't someone or
> some organization just post something they find interesting or useful when
> it comes to UNIX shell scripting.

a) Because it may be wrong (or non-optimal), which is fine as long as
you're soliciting feedback.
b) Because if we all did that it'd flood the NG with trivia.

> I find nothing off topic in the post and
> wonder why you are having such an issue?

Post the quote where I said it was off topic. Perhaps paranoia is
creeping in, but you're starting to sound very familiar "matt".

<snip>

> Where did he ask for feedback?

Here:

> send me your suggestion
> (heiner...@shelldorado.com,

and here:

> If you want to comment on this newsletter, have suggestions for
> new topics to be covered in one of the next issues, or even want
> to submit an article of your own, send an e-mail to
>
> mailto:heiner...@shelldorado.com

<snip>

> Why does the PURPOSE have to be approved by YOU? Are you saying that any of
> the content was OFF TOPIC for "comp.unix.shell"?

Post the quote where I said I had to approve it.
Post the quote where I said it was off topic.

<snip>

> then what is your problem? Are you saying that "tips and tricks" about UNIX
> shell commands is off topic in "comp.unix.shell"?

Post the quote where I said it was off topic.

<snip>

> As long as the "context" is about UNIX shells why would anyone (AKA you) be
> so offended???

Post the quote where I said I was offended.

You're not listening. This is my last response to you in this thread.

Ed.

Heiner Steven

May 2, 2005, 4:16:35 PM

Stephane CHAZELAS wrote:

[...]

> As others pointed out:
>
> while IFS= read -r line <&3; do
> printf '%s\n' "$line" # echo should be banished
> done 3< "$file"

"printf" has its own share of problems, one of which
I ran into recently:

printf "%d\n" 020 080

shows different behaviour for different operating systems,
and even different versions thereof.

020 is correctly (per POSIX) interpreted as an octal number (16 decimal),
but "080" can result either in an "invalid number" error,
or in a silent conversion to decimal, resulting in the
output "80".
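
One way to sidestep the difference, sketched here under the assumption that the values are non-negative digit strings meant to be read as decimal: strip the leading zeros before handing the number to "%d". Note that this deliberately changes the meaning of "020" (it becomes 20, not octal 16). The function name "to_decimal" is made up for the illustration.

```shell
# Hypothetical helper: print a zero-padded digit string as a decimal number.
# Uses the global variables n and zeros (POSIX sh has no "local").
to_decimal() {
    n=$1
    zeros=${n%%[!0]*}       # the run of leading zeros, if any
    n=${n#"$zeros"}         # remove that run
    printf '%d\n' "${n:-0}" # an all-zero input becomes 0
}

to_decimal 020    # prints: 20
to_decimal 080    # prints: 80
to_decimal 000    # prints: 0
```

Only POSIX parameter expansion is used, so the sketch does not depend on how a particular "printf" treats a leading zero.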

[...]

> Note that in the Bourne Shell, IFS= read line, sets IFS not only
> for the duration of "read" [...]

That's the problem I once had with this approach, thank
you Stephane for reminding me ;-)

[...many valid comments omitted...]

>> -----------------------------------------------------------------
>> >> Shell Tip: How to avoid "Argument list too long" errors
>> -----------------------------------------------------------------
>>
>> Oh no, there it is again: the system's spool directory is
>> almost full (4018 files); old files need to be removed, and
>> all useful commands only print the dreaded "Argument list
>> too long":
>>
>> $ cd /var/spool/data
>> $ ls *
>> ls: Argument list too long
>> $ rm *
>> rm: Argument list too long

> ls ./*     rm ./*
> or
> ls -- *    rm -- *

This should be used to make the command work with names containing
e.g. leading dashes. (As you probably know) it does not help
with the "Argument list too long" error.

Stephane CHAZELAS

May 2, 2005, 4:44:06 PM

2005-05-02, 22:16(+02), Heiner Steven:

> Stephane CHAZELAS wrote:
>
> [...]
>> As others pointed out:
>>
>> while IFS= read -r line <&3; do
>> printf '%s\n' "$line" # echo should be banished
>> done 3< "$file"
>
> "printf" has its own share of problems, one of which
> I ran into recently:
>
> printf "%d\n" 020 080
>
> shows different behaviour for different operating systems,
> and even different versions thereof.
>
> 020 is correctly (per POSIX) interpreted as an octal number (16 decimal),
> but "080" can result either in an "invalid number" error,
> or in a silent conversion to decimal, resulting in the
> output "80".

Yes, you also get troubles with printf '\0351 \351\n'

But printf '%s\n' "$var" is not troublesome (except on Solaris 7
/bin/printf where it SEGVs if $var is too long for instance)
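
A small sketch of that point, with two values that "echo" implementations are known to treat differently (a leading "-n", and a backslash sequence). The variable names are assumptions made for the example.

```shell
# A value that some echo implementations take as an option
var='-n'
dash_out=$(printf '%s\n' "$var")

# A value containing a backslash sequence that some echo
# implementations would expand
var='a\tb'
backslash_out=$(printf '%s\n' "$var")

# With '%s\n' the data is passed through unchanged in both cases.
printf '%s\n' "$dash_out" "$backslash_out"
# prints:
# -n
# a\tb
```

The variable goes into an argument, never into the format string, which is what keeps the output predictable.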

> [...]
>> Note that in the Bourne Shell, IFS= read line, sets IFS not only
>> for the duration of "read" [...]
>
> That's the problem I once had with this approach, thank
> you Stephane for reminding me ;-)

But the Bourne shell doesn't have "-r" (except in that legendary
version).

IFS= read -r line

is POSIX syntax and is valid in every POSIX shell.
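
For comparison, a sketch of what the two parts buy you, assuming a POSIX shell and a one-line scratch file made with "mktemp". The variable names "plain" and "raw" are made up for the example.

```shell
file=$(mktemp) || exit 1

# One line with leading/trailing blanks and a literal backslash
printf '%s\n' '   indented \t text   ' > "$file"

# Default read: surrounding whitespace is stripped and the
# backslash is consumed as an escape character.
read plain < "$file"

# IFS= keeps the surrounding whitespace, -r keeps the backslash.
IFS= read -r raw < "$file"

printf '[%s]\n' "$plain" "$raw"
# prints:
# [indented t text]
# [   indented \t text   ]

rm -f "$file"
```

Because the assignment "IFS=" is a prefix to "read" only, a POSIX shell leaves IFS unchanged for the rest of the script.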

[...]

>> ls ./*     rm ./*
>> or
>> ls -- *    rm -- *
>
> This should be used to make the command work with names containing
> e.g. leading dashes. (As you probably know) it does not help
> with the "Argument list too long" error.

[...]

Yes, but as rm * is invalid (strictly speaking) and dangerous,
its usage shouldn't be advertised.
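
A minimal sketch of the "./" form with a file name that starts with a dash. The scratch directory and the file names "-x" and "plain" are assumptions made for the illustration; a bare "rm *" in the same directory would pass "-x" to rm as an option instead of as a file name.

```shell
dir=$(mktemp -d) || exit 1

# One ordinary name and one that looks like an option
: > "$dir/-x"
: > "$dir/plain"

(
    cd "$dir" || exit 1
    # Every expanded name starts with "./", so none of them
    # can be mistaken for an option.
    rm ./*
)

remaining=$(ls -A "$dir")
printf 'remaining: [%s]\n' "$remaining"    # prints: remaining: []

rmdir "$dir"
```

As noted earlier in the thread, this only addresses the leading-dash problem; it does nothing about "Argument list too long".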

--
Stéphane

matt_left_coast

May 2, 2005, 4:56:36 PM

Ed Morton wrote:

>
>
> matt_left_coast wrote:
> <snip>
>> Let's see, a newsletter from a web site that contains "Many shell script
>> examples, shell scripting tips & tricks, a large collection of
>> shell-related links & more" Is to contain something DIFFERENT? Anyhow, a
>> quick browse through the content would convince anyone with even half a
>> brain that it contains "tips & tricks" just like the web site in the link.
>
> I didn't ask what it contains, I asked what the purpose was of creating
> and posting it.
>

Is it off topic?????? No, then it is APPROPRIATE to post here and he does
not need to explain why he posted it.

> <snip>
>> Again why does it have to address specific questions? Why can't someone
>> or some organization just post something they find interesting or useful
>> when it comes to UNIX shell scripting.
>
> a) Because it may be wrong (or non-optimal),

How so? The post was clearly on topic for this group.

> which is fine as long as
> you're soliciting feedback.

How so??? Is it off topic? Where is it written that posts to comp.unix.shell
require soliciting feedback? Of course you post TWO places where he
solicits feedback below, so, what is your problem? According to you, he is
soliciting feedback so it should be OK!

> b) Because if we all did that it'd flood the NG with trivia.

Funny, I thought it was on topic discussion of UNIX shells. How is discussing
UNIX shells in comp.unix.shell "flooding the NG with trivia" Or are you
saying that posts about "tips and tricks" about UNIX shells are trivia,
that nobody gains by them?

>
> I find nothing off topic in the post and
>> wonder why you are having such an issue?
>
> Post the quote where I said it was off topic.

If it is not off topic, why are you having an issue with it?

> Perhaps paranoia is
> creeping in, but you're starting to sound very familiar "matt".

Yes, only a paranoid person would go off ranting about flooding "the NG with
trivia" over an ON TOPIC POST. But I sense you are getting desperate to
maintain your position as the king expert of comp.unix.shell and will now
attempt to discredit the messenger rather than take a close look at your
behavior.

>
> <snip>
>> Where did he ask for feedback?
>
> Here:
>
> > send me your suggestion
> > (heiner...@shelldorado.com,
>
> and here:
>
> > If you want to comment on this newsletter, have suggestions for
> > new topics to be covered in one of the next issues, or even want
> > to submit an article of your own, send an e-mail to
> >
> > mailto:heiner...@shelldorado.com
>

Since he is clearly asking for feedback what is your problem? you stated
that it would be OK if he solicits feedback, he clearly did, so it must be
OK by your logic.

> <snip>
>> Why does the PURPOSE have to be approved by YOU? Are you saying that any
>> of the content was OFF TOPIC for "comp.unix.shell"?
>
> Post the quote where I said I had to approve it.
> Post the quote where I said it was off topic.

then there is no issue with his post. You don't need to approve it and it is
on topic, there is no problem.

>
> <snip>
>> then what is your problem? Are you saying that "tips and tricks" about
>> UNIX shell commands is off topic in "comp.unix.shell"?
>
> Post the quote where I said it was off topic.

The only reason why his post would not be appropriate, would be if it were
not "on topic". Since you seem to think his post was not appropriate, then
it must not be on topic.

>
> <snip>
>> As long as the "context" is about UNIX shells why would anyone (AKA you)
>> be so offended???
>
> Post the quote where I said I was offended.

It has been my experience that when people talk about things being
"appropriate" or not, they are offended. You said he needed to explain why
his post was appropriate, I am saying that it was appropriate because it
was ON TOPIC.

As of yet, I have seen no reason why he should have to explain why "tips and
tricks" about UNIX shells are appropriate to post to comp.unix.shell.

>
> You're not listening. This is my last response to you in this thread.

Ahhhh, the "I lost the debate so I will go off on a rant, then plug my
fingers in my ears and sing "LA-La-la-la"." whine. How childish of you.

The fact of the matter is, you were going off on the OP because he did not
explain why his post was appropriate. But you can't give a single reason
why posting "tips and tricks" about UNIX shells would NOT be appropriate.

>
> Ed.

Sven Mascheck

May 3, 2005, 8:09:22 AM

Stephane CHAZELAS wrote:

> Actually, the Bourne Shell from SVR4.2 supports read -r

i.e., the system shell on UnixWare and OpenUnix 8