Some alternative choices I am aware of, to call commands such as:
/bin/ls
/opt/myapps/mybin/myprog
are:
1. Put all directories in the PATH
PATH=/bin:/usr/bin:/opt/myapps/mybin
ls /a/b/c/d
myprog -a aaa -b bbb
I prefer this method. It makes for less cluttered looking code than
if the full pathname is specified, and is less cryptic than seeing lots
of variables, like $LS, as used in choice 2.
In Perl, running in taint-checking mode (perl -T), the PATH must be
nailed down in this way, not inherited from the environment.
Of course, I could still nail down the PATH, but also call programs
using choice 2 or 3.
2. Define all invoked commands at the top
LS="/bin/ls"
MYPROG="/opt/myapps/mybin/myprog"
$LS /a/b/c/d
$MYPROG -a aaa -b bbb
I think this is widely practiced.
An advantage is that you can test for the existence of executables,
before trying to call them:
[[ -x $MYPROG ]] || { print -u2 "$MYPROG: No such executable";
exit 1; }
(Admittedly, I practice this choice in Perl, for safe maintenance.
But far less often in shell scripts, since shells have far fewer
built-ins, and peppering code with the likes of $LS and $CAT looks
ugly.)
3. Spell out the full pathname
/bin/ls /a/b/c/d
/opt/myapps/mybin/myprog -a aaa -b bbb
This avoids the ugliness of variables like $LS, but having to spell
out the (right) pathname for every command (and know which ones are
built-ins, so have no pathname) seems plain awkward.
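The existence test from choice 2 can be applied to several tools in one
pass. The following is only a sketch, written in portable sh rather than
ksh; the function name require_executables is made up for illustration,
and /bin/sh is a stand-in path assumed to exist on the system.

```shell
# Hypothetical helper: succeed only if every argument passes the -x test.
require_executables() {
    for _tool in "$@"; do
        if [ ! -x "$_tool" ]; then
            echo "$_tool: No such executable" >&2
            return 1
        fi
    done
    return 0
}

_sh="/bin/sh"    # stand-in path, assumed to exist
if require_executables "$_sh"; then
    echo "all tools present"
fi
```

In a real script the failure branch would typically exit with a non-zero
status instead of carrying on.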
A separate issue is whether production scripts should inherit the PATH
from their (possibly uncontrolled) environment, or nail it down,
either explicitly like in choice 1, or implicitly as in:
. /opt/myapps/mybin/myenv
(which presumably defines PATH for any caller).
Regards,
Clyde
Clyde Ingram wrote:
> What are readers views of shell scripts calling other programs by
> specifying absolute pathname?
>
> Some alternative choices I am aware of, to call commands such as:
>
> /bin/ls
> /opt/myapps/mybin/myprog
>
> are:
>
>
> 1. Put all directories in the PATH
>
> PATH=/bin:/usr/bin:/opt/myapps/mybin
>
> ls /a/b/c/d
> myprog -a aaa -b bbb
I usually do that for scripts that only I use since I can always easily
debug problems in my own scripts and this is a quick and easy way to get
at where the tools should be.
<snip>
> 2. Define all invoked commands at the top
>
> LS="/bin/ls"
> MYPROG="/opt/myapps/mybin/myprog"
>
> $LS /a/b/c/d
> $MYPROG -a aaa -b bbb
>
> I think this is widely practiced.
> An advantage is that you can test for the existence of executables,
> before trying to call them:
Exactly. I do this in scripts that many people are going to use since
they need to be more robust and provide good error detection and error
messages.
The naming convention that most people use is to reserve all-upper-case
names for exported variables, so your variable names should really be
something like:
ls="/bin/ls"
myprog="/opt/myapps/mybin/myprog"
$ls /a/b/c/d
$myprog -a aaa -b bbb
but that introduces a potential problem because if you forget to put the
"$" in when using those variables, e.g.:
ls="/bin/ls"
myprog="/opt/myapps/mybin/myprog"
ls /a/b/c/d
myprog -a aaa -b bbb
then you find yourself either calling the wrong versions of "ls" and
"myprog" or finding them inexplicably missing. Even worse, you may
initially be calling the correct versions and it's 3 years down the road
before the wrong version appears earlier in your PATH and then you have
to try to figure out what suddenly went wrong in your previously working
script when you didn't change the script!
To get around this, you need to adopt a different naming convention for
your non-exported variables. I always prefix mine with underscore, e.g.:
_ls="/bin/ls"
_myprog="/opt/myapps/mybin/myprog"
$_ls /a/b/c/d
$_myprog -a aaa -b bbb
but you could come up with something else if you don't like that.
<snip>
> 3. Spell out the full pathname
>
> /bin/ls /a/b/c/d
> /opt/myapps/mybin/myprog -a aaa -b bbb
>
> This avoids the ugliness of variables like $LS, but having to spell
> out the (right) pathname for every command (and know which ones are
> built-ins, so have no pathname) seems plain awkward.
Don't do that since there will be times when you want to call the same
command in multiple places in one script so then you'd be duplicating
the path information and making the script harder to maintain.
Another option you didn't mention is to create a variable for the path
to each tool, e.g.:
lsBin="/bin"
myprogBin="/opt/myapps/mybin"
${lsBin}/ls /a/b/c/d
${myprogBin}/myprog -a aaa -b bbb
I wouldn't do that myself since it can also result in some duplication
and it'd be easy to forget the bin prefix and end up calling the wrong tool.
A final option would be to create a function for each external tool, e.g.
function ls { /bin/ls "$@"; }
function myprog { /opt/myapps/mybin/myprog "$@"; }
ls /a/b/c/d
myprog -a aaa -b bbb
and then your only potential problem would be if you forgot to create a
function for the tool, then you'd just be calling whichever version is
first in your PATH.
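One way to catch a forgotten wrapper is to ask the shell how it would
resolve each name. This is a rough sketch only: unwrapped_tools and mysh
are hypothetical names, the POSIX name() form is used instead of the
"function" keyword, and it relies on "command -v" printing a bare name
for a function but a pathname for an external command.

```shell
# Wrapper in the style shown above; /bin/sh is used so the sketch runs anywhere.
mysh() { /bin/sh "$@"; }

# Hypothetical check: print every name that has no function wrapper,
# i.e. that resolves to a pathname or to nothing at all.
unwrapped_tools() {
    for _name in "$@"; do
        case $(command -v "$_name") in
            */*|'') echo "$_name" ;;
        esac
    done
}

unwrapped_tools mysh cat
```

Any name printed by the last line would still be looked up through PATH
when called.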
Probably the best solution overall is to define a meaningful variable
for the path to the bin of each cluster of tools you use and then use
that in the function definitions so that if, say, all of your printer
management tools moved to a different bin, you'd just change the
variable, e.g.:
prtBin="/usr/spool/bin"
dataBin="/home/bill/bin"
calendarBin="/home/bill/bin"
function prt { ${prtBin}/prt "$@"; }
function prtstat { ${prtBin}/prtstat "$@"; }
function prtcancel { ${prtBin}/prtcancel "$@"; }
function dump { ${dataBin}/dump "$@"; }
function delete { ${dataBin}/delete "$@"; }
function agenda { ${calendarBin}/agenda "$@"; }
> A separate issue is whether production scripts should inherit the PATH
> from their (possibly uncontrolled) environment, or nail it down,
> either explicitly like in choice 1, or implicitly as in:
>
> . /opt/myapps/mybin/myenv
>
> (which presumably defines PATH for any caller).
I tend to just inherit the caller's PATH so I can take advantage of the
system administrators having set some default tool dirs. I could
see not wanting to do that for some applications though.
Ed.
I prefer this, too.
In fact it is most readable and most portable.
>
>
> 2. Define all invoked commands at the top
>
> LS="/bin/ls"
> MYPROG="/opt/myapps/mybin/myprog"
>
> $LS /a/b/c/d
> $MYPROG -a aaa -b bbb
>
> I think this is widely practiced.
You are right.
There must be bad shell courses out there...
> An advantage is that you can test for the existence of executables,
> before trying to call them:
>
> [[ -x $MYPROG ]] || { print -u2 "$MYPROG: No such executable";
> exit 1; }
The existence of executable tools like ls, awk, sed can often be taken
for granted.
On the other hand, most differences are in the tools' behavior,
arguments, and capabilities.
>
> (Admittedly, I practice this choice in Perl, for safe maintenance.
> But far less often in shell scripts, since shells have far fewer
> built-ins, and peppering code with the likes of $LS and $CAT looks
> ugly.)
After reading a book about quality, you might want to go for "safe
maintenance".
After a while you will find that your code gets more and more ugly,
and lacks quality.
>
>
> 3. Spell out the full pathname
>
> /bin/ls /a/b/c/d
> /opt/myapps/mybin/myprog -a aaa -b bbb
>
> This avoids the ugliness of variables like $LS, but having to spell
> out the (right) pathname for every command (and know which ones are
> built-ins, so have no pathname) seems plain awkward.
... so is ugly as well. In addition: inflexible.
>
>
> A separate issue is whether production scripts should inherit the PATH
> from their (possibly uncontrolled) environment, or nail it down,
> either explicitly like in choice 1, or implicitly as in:
>
> . /opt/myapps/mybin/myenv
if myenv is common to many scripts.
>
> (which presumably defines PATH for any caller).
>
If security matters, nail it down (PATH and IFS).
The normal method is to prepend it:
PATH=/bin:/usr/bin:/opt/myapps/mybin:$PATH
If you *want* the users to be able to influence it, append instead:
PATH=${PATH}:/bin:/usr/bin:/opt/myapps/mybin
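The difference between the two orders can be seen with a throwaway
directory holding a fake "ls". This is a sketch under assumptions: it
relies on mktemp -d being available (common, but not required by POSIX),
on a real ls already being reachable through PATH, and demo_dir is just
an illustrative name standing in for the script's own directories.

```shell
# Assumes mktemp -d is available.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho fake ls\n' > "$demo_dir/ls"
chmod +x "$demo_dir/ls"

# Subshells keep the PATH changes local to each lookup.
prepended=$(PATH="$demo_dir:$PATH"; command -v ls)
appended=$(PATH="$PATH:$demo_dir"; command -v ls)

echo "prepended: $prepended"   # the fake ls in $demo_dir
echo "appended:  $appended"    # whichever ls was already first in PATH

rm -rf "$demo_dir"
```

With the directory prepended, the script's copy wins; with it appended,
anything earlier in the caller's PATH wins.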
--
Michael Tosch
IT Specialist
HP Managed Services Germany
Phone +49 2407 575 313
On SUSv3 conformant systems, in a SUSv3 conformant script
interpreted by a SUSv3 conformant shell,
PATH=$(command -p getconf PATH)${PATH+:$PATH}
export PATH
ensures that all the utilities specified at
http://www.opengroup.org/onlinepubs/007904975/utilities/contents.html
exist and conform to that specification.
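The ${PATH+:$PATH} part only adds the colon and the old value when PATH
was set, which avoids a trailing colon (an empty PATH entry is treated as
the current directory). A small sketch of that expansion, using the
made-up variable names std and old so that the real PATH is left alone:

```shell
std=/bin:/usr/bin     # stand-in for $(command -p getconf PATH)

unset old
echo "old unset: $std${old+:$old}"    # old unset: /bin:/usr/bin

old=/opt/myapps/mybin
echo "old set:   $std${old+:$old}"    # old set:   /bin:/usr/bin:/opt/myapps/mybin
```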
[...]
>
> If security matters, nail it down (PATH and IFS).
> The normal method is to prepend it:
>
> PATH=/bin:/usr/bin:/opt/myapps/mybin:$PATH
>
> If you *want* the users to take influence, append it:
>
> PATH=${PATH}:/bin:/usr/bin:/opt/myapps/mybin
IFS is not a problem. Depending on the shell/the script there
may be problems with ENV, BASH_ENV, FIGNORE, SHELLOPTS, ARGV0, HOME,
ZDOTDIR, FPATH, LANG, LC_*, TMOUT (funny with bash and ksh93),
LD_PRELOAD, SHLIB_PATH, LD_LIBRARY_PATH, all sorts of other
dynamic linker variables, STTY, TMPPREFIX... some of which you
can't do anything against (as it's too late when the script is
started).
~$ FIGNORE='!(..)' ksh93 -c 'echo rm -rf *'
rm -rf ..
If a user wants to break a script, he'll always be able to do
so, he can edit the script and put garbage in it.
I think it's enough to only fix what the user might have
reasonably changed, for the rest, the user is to be blamed if
the script failed because of an unexpected value for a variable.
--
Stéphane ["Stephane.Chazelas" at "free.fr"]
Some clarifications:
IFS:
affects: very early Bourne shells (others ignore the IFS
variable found in their environment on startup)
effect: on those shells, syntax parsing, word splitting...
example:
$ IFS=i sh -c exit
runs "ex" on the "t" file.
ENV:
affects: pdksh, ksh88, zsh in sh or ksh emulation, some shells
based on ash.
effect: sources the given script, command substitution expanded
in ENV value. If the value expands to the path of a fifo, the
shell is blocked.
example:
$ ENV='$(echo foo >&2)' ksh -c :
foo
BASH_ENV:
affects: bash
effect: same as above
FIGNORE:
affects: ksh93
effect: change the filename generation behavior
example:
$ FIGNORE='!(..)' ksh93 -c 'echo rm -rf *.*'
rm -rf ..
SHELLOPTS:
affects: bash
effect: change the shell options
example:
$ SHELLOPTS=noexec:verbose bash -c 'echo foo'
echo foo
ARGV0:
affects: zsh
effect: changes the emulation mode
example:
$ ARGV0=csh zsh -c 'a=/et*; echo "$a"'
/etc
HOME:
affects: zsh
effect: it's the place where ".zshenv" file is found if $ZDOTDIR
is not set.
$ echo echo foo > /tmp/.zshenv
$ HOME=/tmp zsh -c :
foo
ZDOTDIR:
affects: zsh
effect: see above
FPATH:
affects: zsh, ksh
effect: same as PATH except that's for library functions
example:
$ echo echo foo > /tmp/zmv
$ FPATH=/tmp zsh -c 'autoload zmv; zmv a b'
foo
LANG, LC_...:
affects: most modern shells, ksh93 badly
effect: changes the sort order, the charset/language used for
messages, the displayed time format, the "ls -l" output format,
the numeric format (breaks ksh93 scripts that use floating point
arithmetic)...
example:
$ date +%B
January
$ LC_TIME=fr sh -c '[ "$(date +%B)" = January ] || echo We are not in January'
We are not in January
$ LC_NUMERIC=fr_FR ksh93 -c 'echo $((3.14159))'
ksh93: line 1: 3.14159: arithmetic syntax error
TMOUT:
affects: bash, ksh93
effect: "read" fails if it takes more than $TMOUT seconds to complete
example:
$ TMOUT=1 ksh93 -c '(sleep 2; echo a) | (read a; echo b$a)'
b
TMOUT, PPID, HISTCMD, MAILCHECK, LINENO, OPTIND, RANDOM, SECONDS...
affects: ksh93, pdksh for some
effect: ksh93 returns immediately with an error if the value is
not a valid arithmetic expression.
example:
$ RANDOM=++ ksh93 -c 'echo foo'
ksh93: ++: more tokens expected
LD_PRELOAD, LD_LIBRARY_PATH...
affects: every non statically linked shell
effect: shells rely on functions from libc or other libraries,
they can be replaced by other ones this way. Other side effects,
with other variables, depending on the system.
example:
$ LD_TRACE_LOADED_OBJECTS=1 sh -c :
libdl.so.2 => /lib/libdl.so.2 (0x40021000)
libc.so.6 => /lib/libc.so.6 (0x40024000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
STTY:
affects: zsh
effect: change the terminal settings (runs stty before each
command)
example:
$ STTY=-g zsh -c :
500:5:bf:8a3b:3:1c:8:15:4:0:1:0:11:13:1a:0:12:f:17:16:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0
TMPPREFIX:
affects: zsh
effect: change the path where temporary files are created (for
here documents/strings and =(...))
example:
$ TMPPREFIX=/ zsh -c 'cat <<< foo'
zsh: permission denied
TMPDIR:
affects: pdksh
effect: same as above, except that pdksh reverts to the system
default tmp dir if it is unable to create a tmpfile in $TMPDIR
(but it may still fail for file paths too long for instance).
OPTIND:
affects: zsh (a bug)
effect: getopts fails
example:
$ OPTIND=4 zsh -c 'getopts a var -a; echo "<$var>"'
<>
PS4:
affects: most shells
effect: change the display for xtracing, and command
substitution is performed.
$ PS4='$(()' pdksh -cx :
pdksh: no closing quote
EXECSHELL:
affects: pdksh
effect: command used to run commands that return ENOEXEC (valid
scripts without a shebang)
example:
$ echo echo b > a; chmod +x a
$ EXECSHELL=echo pdksh -c ./a
./a
POSIXLY_CORRECT:
affects: pdksh,
CDPATH:
affects: most recent shells
effect: cd'ing to a directory may no longer fail, cd may output
unexpected strings.
example:
$ mkdir /tmp/A /tmp/B
$ cd /tmp/B
$ CDPATH=/tmp bash -c 'cd A && pwd'
/tmp/A
/tmp/A
PATH:
affects: every shell
effect: well known
That list is probably not exhaustive.
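For the variables that can still be dealt with once the script is
running, a reset at the top of the script might look like the sketch
below. The function name sanitize_env is hypothetical, and the choice of
which variables to clear is an assumption made for illustration, not a
recommendation from the list above.

```shell
# Hypothetical helper; which variables to reset is a judgment call.
sanitize_env() {
    unset CDPATH          # cd no longer searches other directories or prints the target
    unset ENV BASH_ENV    # too late for this shell, but not for shells it starts
}

CDPATH=/tmp               # simulate an inherited value
sanitize_env
echo "CDPATH is ${CDPATH-unset}"    # CDPATH is unset
```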
> LANG, LC_...: affects: most modern shells, ksh93 badly
> effect: [...] (breaks ksh93 script that use floating point arithmetic)...
> $ LC_NUMERIC=fr_FR ksh93 -c 'echo $((3.14159))'
$ LC_NUMERIC=fr_FR ksh93 -c 'echo $((3,14159))'
3,14159
In general this is certainly a feature if setting LC_NUMERIC,
it's a more general issue:
$ LC_NUMERIC=en_US locale -c LC_NUMERIC | sed -ne '2p'
.
$ LC_NUMERIC=fr_FR locale -c LC_NUMERIC | sed -ne '2p'
,
But the comma also is an operator in arithmetical expressions
(and that's why zsh intentionally ignores LC_NUMERIC here).
Yes but the point is that a script written as:
#! /usr/local/bin/ksh93
typeset -F pi=3.14159265359
echo "cos(15°) = $((cos(15 * pi / 180)))"
will only work in locales where the decimal_point is ".".
So, you have to fix LC_NUMERIC/LC_ALL first in your script. In
other languages, you have to say so when you want to use
localisation; in shells, it's the opposite. It only makes sense
to use localisation in the shell for user interaction, so it's
up to the programmer to decide when to use it.
So that script should be written:
#! /usr/local/bin/ksh93
LC_ALL=C command typeset -F pi=3.14159265359
echo "cos(15°) = $((cos(15 * pi / 180)))"
~$ LC_ALL=C ./a
cos(15°) = 0.965925826289
~$ LANG=fr_FR ./a
cos(15°) = 0,965925826289
Or you should put at the top of each ksh93 script that has to
cope with localisation:
#! /usr/local/bin/ksh93
if [[ ${LC_ALL+set} ]]; then
set_user_locale="LC_ALL='$LC_ALL'"
else
set_user_locale="unset LC_ALL"
fi
c_locale() { LC_ALL=C; }
user_locale() { eval "$set_user_locale"; }
# then
c_locale
typeset -F pi=3.14159265359
user_locale
echo "cos(15°) = $((cos(15 * pi / 180)))"
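A related approach, for shells without floating point arithmetic, is to
scope the C locale to the one command that does the computation by
prefixing it with the assignment. This is a sketch only: c_double is a
hypothetical helper, and awk stands in for the ksh93 arithmetic used
above.

```shell
# Hypothetical helper: double a number, computing and printing in the C locale.
# The LC_ALL=C prefix applies to this awk invocation only.
c_double() {
    LC_ALL=C awk -v x="$1" 'BEGIN { printf "%.3f\n", x * 2 }'
}

c_double 1.5    # 3.000
```

The rest of the script keeps whatever locale the user had set.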
> In general this is certainly a feature if setting LC_NUMERIC,
> it's a more general issue:
>
> $ LC_NUMERIC=en_US locale -c LC_NUMERIC | sed -ne '2p'
> .
> $ LC_NUMERIC=fr_FR locale -c LC_NUMERIC | sed -ne '2p'
> ,
[...]
Also note: locale decimal_point
(at least with POSIX locale)
> So, you have to fix LC_NUMERIC/LC_ALL first in your script.
I differ a bit...
> In other languages, you have to tell it when you want to use
> localisation, in shells, that's the contrary. It only makes sense
> to use localization in the shell for user interaction, so, that's
> up to the programmer to decide when to use it.
LANG is the problem (not only) here. In many cases, LC_CTYPE
would be completely sufficient. LC_MESSAGES might serve
almost all of the remaining people. LANG in turn includes
LC_COLLATE and LC_NUMERIC (which is problematic).
> if [[ ${LC_ALL+set} ]]; then
> set_user_locale="LC_ALL='$LC_ALL'"
And you _never_ should have LC_ALL being set...
I guess a script (usually) really shouldn't try to fix
wrong locale settings...
When I have LANG, LC_NUMERIC set to fr_FR@euro, I tell the
applications that, from my point of view as a user, decimal numbers
have to be displayed to me or read from me as 3,14. That's not an
incorrect setting. What is incorrect is to use the value of
LC_NUMERIC in a program for anything other than user interaction
(AFAIUI).
> When I have LANG, LC_NUMERIC set to fr_FR@euro, I tell the
> applications, that, from my user point of view, decimal numbers
> have to be displayed or read from me as 3,14.
I thought about actively setting LC_NUMERIC only for the applications
where it's actually needed. But no, that doesn't make sense at all.