
ksh 1, perl 2 - ksh or perl for scripting?


Alan Harder

Dec 21, 1992, 9:52:37 AM

Hi, all. We are currently trying to decide if we should move from
/bin/sh as our language for production scripting to ksh, or if we
should move to perl instead. Is anyone out there using perl as their
production scripting language of choice?

-Alan Harder
a...@math.ams.org

My opinions are not the opinions of the American Mathematical Society.
Did you think they were?

Tom Christiansen

Dec 21, 1992, 12:44:03 PM
From the keyboard of a...@ulysses.mr.ams.com (Alan Harder):
:Hi, all. We are currently trying to decide if we should move from
:/bin/sh as our language for production scripting to ksh, or if we
:should move to perl instead. Is anyone out there using perl as their
:production scripting language of choice?

Certainly. Many shops use perl for various purposes, including
install scripts, test drivers, database interfaces, menu systems,
and sysadmin tools.

[The following text contains pieces of semi-canned prose which some of
you may have seen before.]

What you're really asking is whether you should use perl or ksh for your
scripting. The problem with that question is that not all problems require
the same solution. For simple command-oriented tasks, a shell script
works just fine. It's faster to write, and sometimes faster to execute.
The built-in test, expr, and other features of ksh will give you a
performance win in this area.

Perl and ksh do not address precisely the same problem set. There is some
overlap, but in general, ksh is an interactive shell and command
language, whereas perl is not a shell but much more a general-purpose
programming language. Perl fills the gap between sh and C, and indeed
extends into those languages' problem domains as well. Ksh does not.

Shell programming is inherently cumbersome at expressing certain kinds of
algorithms. Most of us have written, or at least seen, shell scripts from
hell. While often touted as one of UNIX's strengths because they're
conglomerations of small, single-purpose tools, these shell scripts
quickly grow so complex that they're cumbersome and hard to understand,
modify, and maintain. After a certain point of complexity, the strength of
the UNIX philosophy of having many programs that each does one thing well
becomes its weakness.

The big problem with piping tools together is that there is only one
pipe. This means that several different data streams have to get
multiplexed into a single data stream, then demuxed on the other end of
the pipe. This wastes processor time as well as human brain power.

For example, you might be shuffling through a pipe a list of filenames,
but you also want to indicate that certain files have a particular
attribute, and others don't. (E.g., certain files are more than ten
days old.) Typically, this information is encoded in the data stream
by appending or prepending some special marker string to the filename.
This means that both the pipe feeder and the pipe reader need to know
about it. Not a pretty sight.
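
To make that concrete, here is a minimal sketch in perl of carrying the
attribute alongside the filename as ordinary data, with no marker string
for a downstream reader to strip (the ten-day cutoff and the "(old)" tag
are illustrative):

    # Record each file's "older than ten days" attribute in a hash,
    # instead of encoding it into the data stream itself.
    foreach $file (<*>) {
        $old{$file} = (-M $file > 10);   # -M: age in days since modification
    }
    foreach $file (sort keys %old) {
        printf "%s%s\n", $file, $old{$file} ? " (old)" : "";
    }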

Because perl is one program rather than a dozen others (sh, awk, sed, tr,
wc, sort, grep, ...), it is usually clearer to express yourself in perl
than in sh and allies, and often more efficient as well. You don't need
as many pipes, temporary files, or separate processes to do the job. You
don't need to go shoving your data stream out to tr and back, and to sed
and back, and to awk and back, and to sort and back, and then to sed and
back again. Doing so can often be slow, awkward, and/or confusing.

Anyone who's ever tried to pass command line arguments into a sed script
of moderate complexity or above can attest to the fact that getting the
quoting right is not a pleasant task. In fact, quoting in general in the
shell is just not a pleasant thing to code or to read.

In a heterogeneous computing environment, the available versions of many
tools vary too much from one system to the next to be utterly reliable.
Does your sh understand functions on all your machines? What about your
awk? What about local variables? It is very difficult to do complex
programming without being able to break a problem up into subproblems of
lesser complexity. You're forced to resort to using the shell to call
other shell scripts, letting UNIX's power of spawning processes serve as
your subroutine mechanism, which is inefficient at best. That means your
script will require several separate scripts to run, and getting all these
installed, working, and maintained on all the different machines in your
local configuration is painful. With perl, all you need do is get
it installed on the system -- which is really pretty easy thanks to
Larry's Configure program -- and after that you're home free.

Shell scripts can seldom hope to approach a perl program's speed.
In fact, perl programs are often faster than a C program, at least one
that hasn't been highly tuned. In general, if you have a perl and a C expert
working on the same problem, you'll get the perl code to within a factor of
2 to 3 of the C code's speed, although for some problems it does much better
than that. The next release of perl will also be substantially faster
(current figures indicate that 25% faster is not unlikely) than the current one.

Besides being faster, perl is a more powerful tool than sh, sed, or awk.
I realize these are fighting words in some camps, but so be it. There
exists a substantial niche between shell programming and C programming
that perl conveniently fills. Tasks of this nature seem to arise with
extreme frequency in the realm of systems administration. Since a system
administrator almost invariably has far too much to do to devote a week to
coding up every task before him in C, perl is especially useful for him.
Larry Wall, perl's author, has been known to call it "a shell for C
programmers." I like to think of it as a "BASIC for UNIX." I realize
that this carries both good and bad connotations. So be it.

In what ways is perl more powerful than the individual tools? This list
is pretty long, so what follows is not necessarily an exhaustive list.
To begin with, you don't have to worry about arbitrary and annoying
restrictions on string length, input line length, or number of elements in
an array. These are all virtually unlimited, i.e., limited only by your
system's address space and virtual memory size.

Perl's regular expression handling is far and away the best I've ever
seen. For one thing, you don't have to remember which tool wants which
particular flavor of regular expressions, or lament the fact that one
tool doesn't allow (..|..) constructs, or +'s, or \b's, or whatever. With
perl, it's all the same, and as far as I can tell, a proper superset of
all the others.
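
For instance, here is a minimal sketch of a grep-like filter using
alternation, +, and \b all in one pattern (the pattern itself is
illustrative), where various sed and awk flavors would each balk at one
construct or another:

    while (<STDIN>) {
        print if /\b(warn|err)[a-z]+/;   # word boundary, alternation, +
    }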

Perl has a fully functional symbolic debugger (written, of course, in
perl) that is an indispensable aid in debugging complex programs. Neither
the shell nor sed/awk/sort/tr/... have such a thing. There've been folks
who've switched over to doing all their major production scripting in Perl
just so that they have access to a real debugger.
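
For the record, the debugger is reached with perl's -d switch (the script
name below is illustrative):

    perl -d myscript.pl     # run a script under the symbolic debugger
    perl -de 0              # debug the trivial program "0", i.e. get an
                            # interactive perl session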

Perl has a loop control mechanism that's more powerful even than C's. You
can do the equivalent of a break or continue (last and next in perl) of
any arbitrary loop, not merely the nearest enclosing one. You can even do
a kind of continue that doesn't trigger the re-initialization part of a
loop, something you want to do from time to time.
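
A minimal sketch of that labeled control flow (the label and conditions
are illustrative); the continue-without-reinitialization is what perl
spells "redo":

    OUTER: for ($i = 0; $i < 10; $i++) {
        for ($j = 0; $j < 10; $j++) {
            next OUTER if $i * $j > 50;    # "continue" the outer loop
            last OUTER if $i + $j == 13;   # "break" out of both loops
        }
    }
    # redo restarts the current loop body from the top without
    # running the loop's re-initialization step.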

Perl's data types and operators are richer than the shells' or awk's,
because you have scalars, numerically-indexed arrays (lists), and
string-indexed (hashed) arrays. Each of these holds arbitrary data
values, including floating point numbers, for which built-in mathematical
subroutines and power operators are available. It can handle
binary data of arbitrary size.
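
A minimal sketch of the three (names and values illustrative):

    $area  = 3.14159 ** 2;               # scalar, using the power operator
    @files = ('a.c', 'b.c', 'c.c');      # numerically-indexed array (list)
    %mtime = ('a.c', 12.5, 'b.c', 3);    # string-indexed (hashed) array
    print $mtime{'a.c'}, "\n";           # prints 12.5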

As in lisp, you can generate strings, perhaps with sprintf(), and
then eval them. That way you can generate code on the fly. You can even
do lambda-type functions that return newly-created functions that you can
call later. The scoping of variables is dynamic, fully recursive subroutines
are supported, and you can pass or return any type of data into or out
of your subroutines.
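
For example, a minimal sketch of code generated on the fly (the operator
and operands are illustrative):

    $op = '*';
    $code = sprintf('$result = 6 %s 7;', $op);
    eval $code;                # compile and run the generated statement
    print $result, "\n";       # prints 42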

You have a built-in automatic formatter for generating pretty-printed
forms with automatic pagination and headers and center-justified and
text-filled fields like "%(|fmt)s" if you can imagine what that would
actually be were it legal.
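
In concrete terms, that's a format declaration plus a write call; here's a
minimal sketch with left-, center-, and right-justified fields (the field
names and values are illustrative):

    # N.B. the lone terminating period must be in column 1 in a real script.
    format STDOUT =
    @<<<<<<<<<<<<  @|||||||||  @>>>>>>
    $file,         $type,      $size
    .
    ($file, $type, $size) = ('report.txt', 'text', 1024);
    write;    # emits one formatted line, with automatic pagination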

There's a mechanism for writing suid programs that can be made more secure
than even C programs thanks to an elaborate data-tracing mechanism that
understands the "taintedness" of data derived from external sources. It
won't let you do anything really stupid that you might not have thought of.

You have access to just about any system-related function or system call,
like ioctl's, fcntl, select, pipe and fork, getc, socket and bind and
connect and accept, and indirect syscall() invocation, as well as things
like getpwuid(), gethostbyname(), etc. You can read in binary data laid
out by a C program or system call using structure-conversion templates.

At the same time you can get at the high-level shell-type operations like
the -r or -w tests on files or `backquote` command interpolation. You can
do file-globbing with the <*.[ch]> notation or do low-level readdir()s as
suits your fancy.
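
A minimal sketch mixing the two levels (the command and glob pattern are
illustrative):

    $host = `hostname`; chop($host);     # `backquote` interpolation
    foreach $f (<*.[ch]>) {              # file-globbing
        print "$host: $f is writable\n" if -w $f;   # shell-style -w test
    }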

Dbm files can be accessed using simple array notation. This is really
nice for dealing with system databases (aliases, news, ...), efficient
access mechanisms over large data-sets, and for keeping persistent data.
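
For instance, a minimal sketch, assuming a dbm library is available on
your system (the database name and key are illustrative):

    dbmopen(%DB, 'testdb', 0666) || die "dbmopen: $!";
    $DB{'last_run'} = time;         # persists across invocations
    print $DB{'last_run'}, "\n";
    dbmclose(%DB);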

Perl is extensible, and with the next release, will be embeddable.
People link it with their own libraries for accessing curses, database
access routines, network management routines, or X graphics libraries.
Perl's namespace is more flexible than C's, having a sort of package
notation for public and private data and code as well as
module-initialization routines.
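
A minimal sketch of that package notation (names illustrative; perl 5
spells the separator ::, where perl 4 wrote a '):

    package Counter;
    $count = 0;                     # lives in Counter's namespace
    sub bump { return ++$count; }

    package main;
    print Counter::bump(), "\n";    # prints 1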

Don't be dismayed by the apparent complexity of what I've just discussed.
Perl is actually very easy to learn because so much of it derives from
existing tools. It's like interpreted C with sh, sed, awk, and a lot
more built in to it. There's a very considerable quantity of code out
there already written in perl, including libraries to handle things
you don't feel like reimplementing.

Don't give up your shell programming. You'll want it for writing
makefiles and making shell callouts in perl, if for no other reason. :-)
My personal rule of thumb is usually that if I can do the task in under a
dozen or so lines of shell code that is straightforward and clean and not
too slow, then I use the Bourne shell, otherwise I use perl, except for
the occasional task for which C is very clearly the optimal solution.

--tom
--
Tom Christiansen tch...@convex.com convex!tchrist


Emacs is a fine operating system, but I still prefer UNIX. -me

Stephen O. Lidie

Dec 21, 1992, 1:06:10 PM
In article <ASH.92De...@ulysses.mr.ams.com> a...@ulysses.mr.ams.com (Alan
Harder) writes:

Sometimes even I don't use Perl..... but mostly I do. Gave up all
shells/sed/awk/grep/egrep/fgrep long ago.....

Why I use Perl:
. Perl outperforms ksh and friends essentially all the time
. Perl has no hidden arbitrary limitations - too many scripts
fail due to line length problems, core dumps, etc.
. I only have to know one language and one regular expression
language.
. I like lists

Summary: Perl is my production scripting language. Perhaps a bit
'notationally cluttered', but far better than what else is out there. Now,
I can't wait for Perl 5 'cause I need user-definable TYPEs/structures!

SOL

william E Davidsen

Dec 23, 1992, 11:04:43 AM
In article <ASH.92De...@ulysses.mr.ams.com>, a...@ulysses.mr.ams.com (Alan Harder) writes:
|
| Hi, all. We are currently trying to decide if we should move from
| /bin/sh as our language for production scripting to ksh, or if we
| should move to perl instead. Is anyone out there using perl as their
| production scripting language of choice?

You should definitely move!

Now, as to where, I prefer ksh for most problems, although there are a
number of things which work more easily in perl. I find that ksh does
redirection and argument truncation in a way which is easier to enter
and read (for me) than perl. Things like ${name%.*} are trivial to write
in ksh, but somewhat more verbose in perl. For these reasons I find I
use perl very sparingly, possibly only a few times a month.
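
To make the comparison concrete, a minimal sketch of one perl spelling of
ksh's ${name%.*} (the variable names are illustrative):

    $name = 'report.txt';
    ($base = $name) =~ s/\.[^.]*$//;   # ksh: base=${name%.*}
    print $base, "\n";                 # prints "report"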

Certainly the retraining will be faster if you go to ksh, since you
can use 100% of the things you already know, and can use the same shell
for programming and interactive use, which speeds learning.

Perl is more powerful, but seems to provide too many solutions for my
taste, certainly for simple problems. I like a language where there is
one clear way to solve a problem rather than one which has many ways,
since I find that programmers tend to waste a lot of time looking for
the single elegant way to solve the problem, rather than finding one way
which is clearly best and then using it. This is particularly true of
really clever perl programmers I have known, who seem to feel the need
to write "impressive" code instead of code which just works.

--
bill davidsen, GE Corp. R&D Center; Box 8; Schenectady NY 12345
Keyboard controller has been disabled, press F1 to continue.

Larry Wall

Dec 28, 1992, 6:58:15 PM
In article <1992Dec23.1...@crd.ge.com> davi...@crd.ge.com (bill davidsen) writes:
: This is particularly true of
: really clever perl programmers I have known, who seem to feel the need
: to write "impressive" code instead of code which just works.

You can prove anything by counting idiots.

Larry Wall
lw...@netlabs.com

william E Davidsen

Dec 29, 1992, 3:15:05 PM

I don't think it's true that all good perl programmers are idiots.
They tend to be a bit strange, I agree...

This problem appears in most languages in which there is more than one
reasonable way to solve a problem. Programmers get thinking about the
code and decide they could do it a bit {faster,smaller,better,cleaner}
and go rewrite the code.

Larry Wall

Dec 31, 1992, 3:37:24 PM
In article <1992Dec29....@crd.ge.com> davi...@crd.ge.com (bill davidsen) writes:

: In article <1992Dec28.2...@netlabs.com>, lw...@netlabs.com (Larry Wall) writes:
: | In article <1992Dec23.1...@crd.ge.com> davi...@crd.ge.com (bill davidsen) writes:
: | : This is particularly true of
: | : really clever perl programmers I have known, who seem to feel the need
: | : to write "impressive" code instead of code which just works.
: |
: | You can prove anything by counting idiots.
:
: I don't think it's true that all good perl programmers are idiots.

I don't think you think it's true that all good perl programmers are idiots.

: They tend to be a bit strange, I agree...

*I'm* never strange...

: This problem appears in most languages in which there is more than one
: reasonable way to solve a problem. Programmers get thinking about the
: code and decide they could do it a bit {faster,smaller,better,cleaner}
: and go rewrite the code.

Nothing the matter with that. We all optimize for various and sundry
reasons, and those reasons can change over time. I don't think there's
even anything terribly wrong with optimizing for impressive obfuscation
when the fit takes you. It's only when the fit takes you inappropriately
that it can be considered antisocial. These are the folks I was thinking
of.

I would argue that these same people tend to obfuscate their C, their sh,
their nroff, too. There are, in fact, multiple ways to solve problems
in these languages too, though I won't quarrel with you if you want
to assert that Perl gives you a slightly richer basic toolset than C,
and that this may have some influence on a programmer's thinking. The
linguists have been debating this question longer than most of us are old.

I do quarrel with logic that says, "Stupid people are associated with X,
therefore X is stupid." Stupid people are associated with everything.

Mind you, I'm not accusing you of expressing exactly that sentiment,
but it was close enough that I could trot out the epigram that had been
burning a hole in my noggin for several days. :-)

Larry

Bennett Todd @ Salomon Brothers Inc., NY

Jan 4, 1993, 10:46:12 PM
Actually, when it comes to clarity of code, I think Perl wins over sh or C,
precisely because there are more ways to do a thing. I try to make code
visually clear [at least to me:-]. I try to adjust the functional
organization and layout of the source code to reflect the logical
description of what it's doing; the more choices I've got in flow-of-control
constructs the better. I find the combination of Perl's named blocks, the
``next LABEL'' and ``last LABEL'', and the pattern matching against the
default match space $_, to work particularly well for many programs I write.
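
A minimal sketch of that combination (the label and patterns are
illustrative):

    LINE: while (<STDIN>) {
        next LINE if /^\s*#/;      # patterns match against $_ by default
        last LINE if /^__END__$/;
        print;                     # print defaults to $_ too
    }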

I'm also a big fan of top-down design and bottom-up implementation, and I
find Perl's package system of namespace management to be very comfortable
and natural for the job. I can implement reusable packages to implement the
trickier lower-level functionality, starting with a spec for the interface
to the package. Then I can debug the package with simple test drivers, or
even calling it interactively from ``perl -de 0'' in a JOVE interactive
shell window. Once the package is all set I can use its external entry
points as a new collection of primitives to write the next higher level of
functionality. I've enjoyed good code reuse with this approach.

-Bennett
b...@sbi.com

Chris Fedde

Jan 5, 1993, 3:25:22 PM
In article <8...@cyclone.sbi.com> b...@sbi.com (Bennett Todd @ Salomon Brothers Inc., NY ) writes:
>Actually, when it comes to clarity of code, I think Perl wins over sh or C,
[...]

I don't mean to pick on Mr Todd specifically, nor do I mean
to slight the value of any particular scripting language, but...

Always use /bin/sh or compiled images for system tools.

did I make my point?
chris

Chris Fedde, USWEST MRG, ch...@mrg.uswest.com, voice +13037842823

Tom Christiansen

Jan 5, 1993, 5:31:46 PM
From the keyboard of ch...@engineer.mrg.uswest.com (Chris Fedde):
:Always use /bin/sh or compiled images for system tools.
:
: did I make my point?

Of course not. You merely made an assertion. Without logic
to explain your position, it carries no weight.

--tom
--
Tom Christiansen tch...@convex.com convex!tchrist

If you have ever seen the grim word "login:" on a screen, your mind
is now a wholly-owned subsidiary of The Death Star.
John Woods in <14...@ksr.com> of comp.unix.bsd
