
Learning Bash Shell Scripting


kaushal

Jul 14, 2017, 10:35:16 AM
Hi,

I am new to Bash shell scripting. Any guides to start learning bash shell scripting?

Any help will be highly appreciable.

Regards,

Kaushal

Kenny McCormack

Jul 14, 2017, 10:56:03 AM
In article <5ab4d37c-3d5b-4885...@googlegroups.com>,
The best advice is probably this: Don't.

Unless you have a specific need/reason to do it.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/RoyDeLoon

Kaz Kylheku

Jul 14, 2017, 11:02:30 AM
On 2017-07-14, Kenny McCormack <gaz...@shell.xmission.com> wrote:
> In article <5ab4d37c-3d5b-4885...@googlegroups.com>,
> kaushal <kaushal...@gmail.com> wrote:
>>Hi,
>>
>>I am new to Bash shell scripting. Any guides to start learning bash shell
>>scripting?
>>
>>Any help will be highly appreciable.
>>
>>Regards,
>>
>>Kaushal
>
> The best advice is probably this: Don't.
>
> Unless you have a specific need/reason to do it.

Addendum: if those reasons are along the lines of "intellectual curiosity" and
"enlightenment", you can save a lot of time by believing those who say
that those hopes will ultimately be disappointed.

Bit Twister

Jul 14, 2017, 1:12:06 PM
On Fri, 14 Jul 2017 07:35:12 -0700 (PDT), kaushal wrote:
> Hi,
>
> I am new to Bash shell scripting.
> Any guides to start learning bash shell scripting?

Here is a snippet from my URL list. The exclamation mark and
following text are comments. I have not checked the links in years.

http://tldp.org/LDP/abs/html/index.html ! bash advanced documentation
http://tldp.org/LDP/Bash-Beginners-Guide/html/ ! document
http://bahmanm.com/blogs/command-line-options-how-to-parse-in-bash-using-getopt ! document
http://cfaj.freeshell.org/shell ! bash script tips usage doc
http://cfajohnson.com/shell/?2004-05-22_shell_websites ! bash doc reference
http://cli.learncodethehardway.org/bash_cheat_sheet.pdf ! command document cmd line
http://members.iinet.net/~herman546/p20/GRUB2%20Bash%20Commands.html ! document grub2 document bootable usb
http://mywiki.wooledge.org/BashFAQ/050 ! bash variable expansion document
http://mywiki.wooledge.org/BashGuide ! document
http://mywiki.wooledge.org/FullBashGuide ! document
http://spin.atomicobject.com/2011/03/30/parsing-arguments-in-bash-with-getopts/ ! document command (best)
https://www.sourceware.org/autobook/autobook/autobook_119.html#Test ! bash documentation
http://www.freeos.com/guides/lsst/ ! bash Linux Shell Scripting Tutorial documentation
http://www.gnu.org/software/bash/manual/html_node/Bash-Variables.html ! document
http://www.howtoforge.com/detailed-error-handling-in-bash ! document sqlite trap
http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html ! basic shell bash doc

> Any help will be highly appreciable.

My first recommendation is to QUIT using G2/1.0 and use a real Usenet client.

Helmut Waitzmann

Jul 14, 2017, 9:01:25 PM
kaushal <kaushal...@gmail.com>:
> I am new to Bash shell scripting. Any guides to start learning
> bash shell scripting?

I suggest first to learn unix, particularly features like
processes and the process' environment (invocation arguments,
environment variables, the working directory, the umask, signal
dispositions, the file descriptor table), files, directories,
directory entries (hard links), FIFOs, file descriptors,
credentials (real and effective user and group ids), file access
permissions, the controlling terminal and job control, the system
calls “fork()”, “execve()”, “_exit()”, “wait()”, “open()”,
“close()”, “dup()”, “dup2()”, “stat()”, “lstat()”, “link()”,
“unlink()”, “rename()”, “symlink()”, “chdir()”, “mkdir()”,
“rmdir()”, “kill()”, “sigaction()”, “umask()”, “chmod()”,
“chown()”, “read()”, “write()”, “utime()”, “getpgrp()”,
“setpgid()”, “tcgetpgrp()”, “tcsetpgrp()”.

After that, read the shell's manual page.

You will recognize many of the shell's features as equivalents
of those unix features and system services, and you will see what
a shell's command line is, and how it relates to, and differs
from, a program's invocation arguments.

This will save you from many pitfalls.
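As a rough illustration (an editor's sketch, not part of the original post; the file and directory names are made up), many everyday shell constructs map directly onto the system calls listed above:

```shell
#!/bin/sh
# Each construct below corresponds to one or more of the system calls
# named above (a sketch; the file and directory names are arbitrary).

dir=/tmp/demo.$$
mkdir -- "$dir"            # mkdir()
cd -- "$dir"               # chdir()

umask 077                  # umask()
printf 'hello\n' > f.txt   # open() with O_CREAT|O_TRUNC, write(), close()

exec 3< f.txt              # open(), entered into the fd table as fd 3
read -r line <&3           # read() from fd 3
exec 3<&-                  # close() fd 3

ls -l f.txt > /dev/null    # the shell fork()s, execve()s ls, then wait()s

cd /                       # chdir() away before removing the directory
rm -r -- "$dir"            # unlink() the file, rmdir() the directory
printf '%s\n' "$line"
```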

Chris F.A. Johnson

Jul 15, 2017, 1:08:09 AM
On 2017-07-14, Bit Twister wrote:
> On Fri, 14 Jul 2017 07:35:12 -0700 (PDT), kaushal wrote:
>> Hi,
>>
>> I am new to Bash shell scripting.
>> Any guides to start learning bash shell scripting?
>
> Here is a snippet from my urls list. The exclamation mark and
> following text are comments. I have not checked the links in years.
>
> http://tldp.org/LDP/abs/html/index.html ! bash advanced documentation

DON'T use this; it is full of bad advice and even outright errors.


--
Chris F.A. Johnson

Janis Papanagnou

Jul 15, 2017, 3:05:59 AM
The shell is a high-level language abstraction layer over the Unix
system calls. It opens up a lot of pitfalls by design, of its own,
that do not stem from the Unix system interface. It is certainly
necessary to know the basic concepts of Unix. But while it may be
helpful to know some of these Unix system calls, it's not necessary
to know them just for the purpose of learning to program in shell
script. (It's probably much better to get a good book about
(structured) programming instead, if the OP isn't already proficient
in it.)

My (additional) suggestion is to not focus on specific bash tutorials,
but to use sources that differentiate between bash specifics, POSIX
commons, shells that extend POSIX in common ways WRT their feature sets
(ksh, zsh, bash), and shells that go beyond that common feature set in
ways very helpful to programming (like ksh). For the latter I'd suggest
the books by Rosenblatt/Robbins, and Bolsky/Korn (for ksh/POSIX).

Janis

kaushal

Jul 17, 2017, 8:01:49 AM
> I suggest first to learn unix, particularly features like
> processes and the process' environment (invocation arguments,
> environment variables, the working directory, the umask, signal
> dispositions, the file descriptor table), files, directories,
> directory entries (hard links), FIFOs, file descriptors,
> credentials (real and effective user and group ids), file access
> permissions, the controlling terminal and job control, the system
> calls “fork()”, “execve()”, “_exit()”, “wait()”, “open()”,
> “close()”, “dup()”, “dup2()”, “stat()”, “lstat()”, “link()”,
> “unlink()”, “rename()”, “symlink()”, “chdir()”, “mkdir()”,
> “rmdir()”, “kill()”, “sigaction()”, “umask()”, “chmod()”,
> “chown()”, “read()”, “write()”, “utime()”, “getpgrp()”,
> “setpgid()”, “tcgetpgrp()”, “tcsetpgrp()”.
>

Hi Helmut,

Please point me to any tutorials or books to learn unix and the various system calls you mentioned.

Regards,

Kaushal

Janis Papanagnou

Jul 17, 2017, 9:09:55 AM
You can always use the 'man' command to get information about the system
calls. An excellent book is W. Richard Stevens' "Advanced Programming in
the UNIX Environment". (But, as previously mentioned, for shell scripting
it's really not necessary to learn the Unix system calls.)

Janis

>
> Regards,
>
> Kaushal
>

applemcg

Jul 17, 2017, 1:47:29 PM
Learn how to use functions.

Postpone learning "scripts"; i.e., since you're at the command line, use functions as commands.
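For example (a sketch of the idea; the function name and body are made up, not from the post):

```shell
#!/bin/sh
# A small helper defined as a function instead of a script; once it
# has proven useful it can be appended to a sourced library file.
# "lsnew" is a made-up name for illustration.
lsnew () {
    # list the most recently modified entries of a directory
    # $1: directory (default .), $2: how many entries (default 5)
    ls -1t -- "${1:-.}" | head -n "${2:-5}"
}

# Used like any other command:
lsnew /tmp 3
```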

Janis Papanagnou

Jul 17, 2017, 2:46:37 PM
This statement sounds very odd. Functions *are* _part_ of the shell
language. The _contents_ of functions *are* shell commands, external
commands, and shell control constructs. To be of any use, you have to
persist any function definition: either in a script, or in a function
library (e.g. a specific directory), or (worst) all in a monolithic
profile without structure or topical separation. Modern functions[*]
even allow scopes similar to those of script programs (WRT local
variables, signal traps, getopt processing, etc.), and even [old style]
shell functions (now "POSIX functions") have a lot of similarities when
you call your code (e.g. parameter passing, exit status, etc.). Even on
the command line in interactive use (i.e. besides shell programs stored
in scripts or functions) you can [in ksh] interactively write scripts
on the fly (Esc, v) in an editor to be executed immediately; the
function is then available in the history file for reuse.

Janis

[*] Here "modern" means "1988", the date of their definition in ksh,
followed by long-lasting reluctance to adopt them in standards and other shells.

Helmut Waitzmann

Jul 18, 2017, 8:58:24 AM
Janis Papanagnou <janis_pa...@hotmail.com>:
Yes, that's absolutely correct. I didn't mean to bash Unix.
What I wanted to say:

If one knows Unix, one will avoid many mistakes when programming
for Unix, may it be when using the system calls or when using a
shell.

The shell heavily depends on the features of the Unix system (or
POSIX standard). I think, it's important to know them, when
programming for the shell.

Some common pitfalls, I've seen:

The shell passes to (non-built-in) utilities an invocation
arguments' list, not a command line (like MSDOS did). Many
people don't distinguish the shell's command line from the
invocation arguments' list, causing many pitfalls.

* Gather some strings into /one/ shell variable without doing
proper quoting and “eval”uating, for example

    parameters=
    while ...
    do
        # compute and assign a value to "$parameter"
        ...
        parameters="$parameters $parameter"
    done
    some_utility $parameters

Pitfalls:

Gathering arguments into one shell variable by just gluing
them together with spaces in between is not a reliable way of
gathering elements of an argument list.

An unquoted variable expansion using word splitting at “"$IFS"”
characters is not a reliable way to expand a part of a command
line into an argument list.

Modern shells have got a concept of array variables to
circumvent that problem. For example, with a “bash”:

    declare -a parameters
    while ...
    do
        # compute and assign a value to "$parameter"
        ...
        parameters=("${parameters[@]}" "$parameter")
    done
    some_utility "${parameters[@]}"
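Since bash 3.1 there is also the append assignment `+=`, which grows the array in place (a side note of mine, not from the original post):

```shell
#!/bin/bash
# Re-exec under bash if this was started with a different /bin/sh.
[ -n "$BASH_VERSION" ] || exec bash "$0" "$@"

declare -a parameters=()
for parameter in 'first value' 'second value'; do
    parameters+=("$parameter")   # appends one element, quoting preserved
done
# Every element remains exactly one argument, embedded spaces included:
printf '<%s>\n' "${parameters[@]}"
```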


* Building a shell command line to let a shell invoke a utility
with an argument list using the “-c” option or using the “eval”
command without doing proper quoting first, for example

    variable='some arbitrary invocation argument'
    su -- root -c 'a_command '"$variable"

calls for trouble. (A decade ago, Debian's “su”, when invoked
with arguments to be passed to the shell, did exactly this: the
arguments were glued onto the invoked shell's command line, with
spaces in between, without proper quoting, rather than just passed
to the shell as invocation arguments.)

Pitfall:

Putting a shell parameter value into a command line by just
gluing it into the command line separated by spaces is not a
reliable way of building a command line: it will be “eval”uated
by the shell rather than just put into the argument list.
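One robust way around this (an editor's sketch, not from the original post) is to keep the variable out of the code string entirely and hand it to the inner shell as a positional parameter:

```shell
#!/bin/sh
variable='some arbitrary; $(dangerous) argument'

# Unsafe: gluing the value into the code string lets the inner shell
# parse it as shell syntax:
#     sh -c 'a_command '"$variable"

# Safer: the code string is a fixed literal, and the value arrives as
# "$1", untouched by the inner shell's parser. (Shown here with sh -c;
# the same pattern applies to su -c and eval.)
sh -c 'printf "%s\n" "$1"' sh "$variable"
```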

* Using command substitution to build a list of parameters,
splitting the output of the substitution command at “"$IFS"”
characters, for example

    for file in $( find ... )
    do
        ...
    done

Pitfall: the same as with gathering values into one shell
variable and using it unquoted to be split at “"$IFS"”
characters.

If one knows that Unix file names may contain any character
except ASCII NUL, and that utilities get their parameters via
argument lists rather than via a command line (see the system
call “execve()”), then it's obvious that the shown examples
will fail, depending on the actual contents of the command
substitution or variable expansion.
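Two standard idioms avoid that splitting entirely (an editor's sketch, not from the original post):

```shell
#!/bin/sh
# Idiom 1: let find run the per-file command itself; file names are
# passed as real execve() arguments, so no word splitting occurs.
find . -name '*.txt' -exec printf 'found: %s\n' {} \;

# Idiom 2 (bash): read NUL-delimited names, NUL being the one byte
# that cannot occur in a file name:
#     find . -name '*.txt' -print0 |
#         while IFS= read -r -d '' file; do
#             printf 'found: %s\n' "$file"
#         done
```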

Another common cause of errors is to think of file descriptor
redirection as a kind of diversion:

* Doing cascades of redirections in the wrong sequence, for
example redirecting standard output as well as standard error
into a logfile:

    some command 2>&1 > logfile

If one knows that redirection is merely copying entries of the
process' file descriptor table by means of the “dup2()” system
call, then it's obvious that the correct sequence is

    some command > logfile 2>&1

* Creating an access-restricted inode by first creating it and
afterwards restricting its access permissions via “chmod()”,
for example

    (
        set -C &&
        if : > a_private_file
        then
            chmod -- go= a_private_file
        fi
    )

creates a race condition: There is a small time interval, in
which an attacker could open the created file before its access
permissions are restricted to the file's owner.

The correct way to do this would be

    (
        umask -- go= &&
        set -C &&
        if : > a_private_file
        then
            chmod -- go= a_private_file
        fi
    )

Pitfall:

Restricting the access permissions of an inode does not close
already opened accesses to it, which wouldn't have been granted,
had it got the restrictive access permissions in the first
place.

In Unix, access permissions are checked when opening a file,
rather than when reading or writing to it afterwards. One who
doesn't know the former might assume the latter.

Note: If the directory, in which the file is to be created, has
got a default access control list, then the umask will be
ignored, and there might be no reliable (free of race
conditions) way to create an access restricted file in that
directory /using the shell/.

Reason: Using the shell, one can't specify the “mode” parameter
of the “open()” or “creat()” system call. This is a limitation
of the shell, which does not stem from the system.
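Where the goal is simply a private scratch file, `mktemp(1)` sidesteps the race entirely (a side note of mine: it creates the file atomically with mode 0600 inside the `mkstemp()` call, not in shell code):

```shell
#!/bin/sh
# The file exists with owner-only permissions from the very first
# instant; there is no window for another user to open it.
private_file=$(mktemp) || exit 1
ls -l -- "$private_file"       # shows -rw------- from the start
printf 'secret\n' > "$private_file"
rm -f -- "$private_file"
```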

* The (shell's) current working directory is not an implicit
prepending of the value of the shell variable "PWD" to pathnames
not starting with a slash but rather is an (additional, besides
"/") entry point in the file hierarchy, used in pathname
resolution for pathnames not starting with a slash:

    (
        set -x &&
        mkdir -- dir &&
        {
            mkdir -- dir/subdir1 &&
            (
                cd dir/subdir1 &&
                printf 'PWD=%s\n' "$PWD" && pwd && pwd -P &&
                printf '%s\n' 'Hello, world!' > hello.txt &&
                mv -- ../subdir1 ../subdir2 &&
                {
                    # Note, that the following command succeeds:
                    cat -- hello.txt
                    # Note, that the following command fails:
                    cat -- "${PWD%/}"/hello.txt
                    printf 'PWD=%s\n' "$PWD"
                    pwd
                    pwd -P
                }
            )
            rm -rf -- dir
        }
    )

> It is certainly necessary to know the basic concepts of
> Unix. But while it is maybe helpful to know some of these Unix
> system calls it's not necessary to know them just for purpose of
> learning to program in shell script.

The shells' manual pages---this is my impression---are written for
people who already know how to program for Unix/POSIX.

Knowing, how the Unix system works and what it offers to the
applications (the system calls), is helpful to circumvent many
pitfalls.

> (It's probably much better to get a good book about (structured)
> programming instead (if the OP isn't already proficient in it).

Yes, structured programming is recommended for programming in any
programming language.

applemcg

Jul 18, 2017, 10:08:15 AM
Let's start by agreeing on the use of command history. When I find I've constructed a rather long-winded command, I frequently do this:

$ newFunction () { !568; }

where command # 568 was one I'd come to after some trial and error. Then:

$ declare -f newFunction | tee -a $HOME/bin/publiclib

where publiclib is the one thing I "source" in my profile.

The nice thing about "tee -a" is that, when the library is sourced, the last appended instance of a definition is the one that takes effect. You might notice the library may get cluttered with multiple instances of a function. I've written a function, named "lib_crunch", which does this:

1. source the library
2. collect a list of functions in the library, then
3. declare -f "this list" into a temp file.
4. compare the list collected against the current list, and
5. if identical, copy the temp file to the library.
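A minimal sketch of what such a `lib_crunch` could look like (my reconstruction from the five steps above, not Marty's actual code; steps 4 and 5 are approximated by a syntax check before installing, and it assumes a clean session where `declare -F` lists essentially the library's functions):

```shell
#!/bin/bash
[ -n "$BASH_VERSION" ] || exec bash "$0" "$@"

# Rewrite a function library so each function appears once, keeping
# the last (i.e. effective) definition of each name.
lib_crunch () {
    local lib=${1:-$HOME/bin/publiclib} tmp funcs
    tmp=$(mktemp) || return 1
    # 1. source the library; for duplicates the last definition wins
    . "$lib" || { rm -f -- "$tmp"; return 1; }
    # 2. collect the list of defined functions
    funcs=$(declare -F | awk '{print $3}')
    # 3. write their (single, current) definitions to a temp file
    declare -f $funcs > "$tmp" || { rm -f -- "$tmp"; return 1; }
    # 4./5. install only if the result is syntactically sound
    bash -n "$tmp" && cp -- "$tmp" "$lib"
    rm -f -- "$tmp"
}
```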

You can always repair a function by:

$ ${EDITOR:-vi} +/newFunction/ $HOME/bin/publiclib

which itself might be a function:

$ efun () { ${EDITOR:-vi} +/$1/ $HOME/bin/publiclib; }

used:

$ efun newFunction

Depending on the length of the command line to be captured, and the presence of what looks like a function argument, I'll do this:

$ set newFunction; ${EDITOR:-vi} +/$1/ $HOME/bin/publiclib;

so then I might rewrite the function to supply a convenient default, so
the absence of an argument doesn't cause the thing to blow up. Here's a frequent first fix:

$ efun () { set ${1:-efun}; ${EDITOR:-vi} +/$1/ $HOME/bin/publiclib ; }

A short-hand would have been to omit the "set" and put the argument expansion right where it's used. I find the leading "set" to be both documentation and defensive programming.

I wholly endorse (and use) your recommendations about command line function writing and history. I could get carried away here, but thought a brief context-setting would be helpful.

-- Marty
http://alum.mit.edu/www/mcgowan

Helmut Waitzmann

Jul 18, 2017, 10:09:28 AM
kaushal <kaushal...@gmail.com>:
Maurice J. Bach: "The design of the UNIX® operating system".

It explains the concepts of unix with many small code snippets,
written in the C language, that help explain what the unix
kernel does.

It helped me a lot to understand how a shell uses the kernel via
the system calls to accomplish its tasks.

R. Stevens' "Advanced Programming in the UNIX Environment" gives a
deeper knowledge how to program using the system calls. But I
agree with Janis: That's not necessary to understand how to use
the shell.

> and the various system calls you mentioned.

As Janis already pointed out, the manual pages are the reference
to the system calls.

The manual pages are accessible with the "man" command. See its
manual page, "man(1)", i.e. the command

man man

for more information how to use it.

On my linux system, the system calls are in section 2 of the
manual pages.

Also, if you want to stay POSIX compatible when writing shell
scripts, POSIX.1-2008, simultaneously IEEE Std 1003.1™-2008 and
The Open Group Technical Standard Base Specifications, Issue 7
(<http://pubs.opengroup.org/onlinepubs/9699919799/mindex.html>,
especially
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/toc.html>)
is your reference.

Janis Papanagnou

Jul 18, 2017, 10:16:31 AM
(I didn't mean that you meant to bash Unix. - Intended pun, BTW?)

> What I wanted to say:
>
> If one knows Unix, one will avoid many mistakes when programming
> for Unix, may it be when using the system calls or when using a
> shell.

Yes, if you're on a Unix box (and not working in "GUI-click mode"
only) you should know the basic Unix concepts (and you named some
above; specifically file system, processes, IPC, signals, etc.).

>
> The shell heavily depends on the features of the Unix system (or
> POSIX standard). I think, it's important to know them, when
> programming for the shell.

What I wanted to point out is that (besides knowing the Unix basics)
you should learn the shell language and shell programming paradigms
not from the Unix system calls but from shell books or tutorials.
Some shell books also describe the basic Unix concepts to a degree
necessary to understand the related shell concepts. So, depending on
the source of knowledge, even learning Unix independently may not be
necessary (though probably advisable anyway if you work on Unix).

>
> Some common pitfalls, I've seen:
>
> The shell passes to (non-built-in) utilities an invocation
> arguments' list, not a command line (like MSDOS did). Many
> people don't distinguish the shell's command line from the
> invocation arguments' list, causing many pitfalls.
>
> * Gather some strings into /one/ shell variable without doing
> proper quoting and “eval”uating, for example
>
> parameters=
> while ...
> do
>     # compute and assign a value to "$parameter"
>     ...
>     parameters="$parameters $parameter"
> done
> some_utility $parameters
>
> Pitfalls:
>
> Gathering arguments into one shell variable by just glueing
> them together with spaces in between is not a reliable way of
> gathering elements of an argument list.

Gathering arguments is not the big problem; usually expanding them is.
In any case it's a shell [programming] issue (not a Unix issue).

>
> An unquoted variable expansion using word splitting at “"$IFS"”
> characters is not a reliable way to expand a part of a command
> line into an argument list.

Yes.

>
> Modern shells have got a concept of array variables to
> circumvent that problem. For example, with a “bash”:

Arrays help to keep groups of arguments semantically together, but
they do not solve the expansion problem, specifically not the buffer
limit that you may reach with long argument lists if you use an
external command. IFS and quoting are basic, crucial shell knowledge.
You learn all that from learning the shell (not from the unrelated
system calls, and also not from the exec() family, since only the
shell can tell you that it has a limit; otherwise you could only guess).
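That limit, and the standard workaround, can both be seen from the shell (a side note of mine, not part of the original post):

```shell
#!/bin/sh
# The kernel's ceiling on the combined size of argv[] and environ
# for a single execve():
getconf ARG_MAX

# xargs splits an overlong argument list across several invocations,
# each kept below that ceiling:
printf '%s\n' one two three | xargs printf 'arg: %s\n'
```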

(Note that the shell also has its own abstractions of concepts that
you find on OS-level, and you'd certainly get confused if you read
the corresponding system call and system data definitions instead of
the shell description.)

>
> declare -a parameters
> while ...
> do
>     # compute and assign a value to "$parameter"
>     ...
>     parameters=("${parameters[@]}" "$parameter")
> done
> some_utility "${parameters[@]}"
>
>
> * Building a shell command line to let a shell invoke a utility
> with an argument list using the “-c” option or using the “eval”
> command without doing proper quoting first, for example
>
> variable='some arbitrary invocation argument'
> su -- root -c 'a_command '"$variable"
>
> calls for trouble. (A decade ago, Debians “su”, when invoked
> with arguments to be passed to the shell, did exactly this: The
> arguments were glued to the invoked shell's command line, with spaces in
> between, without proper quoting, rather than just passed to the
> shell as invocation arguments.)
>
> Pitfall:
>
> Putting a shell parameter value into a command line by just
> glueing it into the command line separated by spaces is not a
> reliable way of building a command line: It will be “eval”uated
> by the shell rather than just put into the argument list.

Again, a shell issue. - Though an advanced issue; I hope a beginner
would not root-su (probably even to other machines) without knowing
a lot of Unix.

>
> * Using command substitution to build a list of parameters,
> splitting the output of the substitution command at “"$IFS"”
> characters, for example
>
> for file in $( find ... )
> do
> ...
> done

A bad code pattern. Yes, you should learn your shell and make such
code safe. I think a good book on shell programming should address
that. (From a Unix system-level book you generally can't reliably
infer any shell behaviour, and specifically not in this case.)

>
> Pitfall: the same as with gathering values into one shell
> variable and using it unquoted to be split at “"$IFS"”
> characters.
>
> If one knows, that Unix file names may contain any character
> except ASCII NUL,

(except NUL and '/')

> and, that utilities get their parameters by
> argument lists rather than by a command line (see the system
> call “execve()”), then it's obvious, that the shown examples
> will fail, depending on the actual contents of the command
> substitution or variable expansion.
>
> Another common cause of errors is to think of file descriptor
> redirection as kind of a divertion:
>
> * Doing cascades of redirections in the wrong sequence, for
> example redirecting standard output as well as standard error
> into a logfile:
>
> some command 2>&1 > logfile

Again, that's shell semantics; you have to know what the FDs are
and, in addition, that in the shell they are evaluated from left
to right. You get those semantics from shell books, certainly not
from Unix system calls (where the above syntax has no meaning).

>
> If one knows, that redirection is merely copying entries of the
> process' file descriptor table by means of the “dup2()” system
> call, then it's obvious, that the correct sequence will be
>
> some command > logfile 2>&1
>

[ skipping more of your pitfall samples; I hope my point is clear ]

> [...]
>
>> It is certainly necessary to know the basic concepts of
>> Unix. But while it is maybe helpful to know some of these Unix
>> system calls it's not necessary to know them just for purpose of
>> learning to program in shell script.
>
> The shells' manual pages---this is my impression---are written for
> people which already know how to program for Unix/POSIX.

I think they are written for folks who should at least be proficient
in using Unix on the command line (not necessarily programming for
Unix). They are specifically not tutorials. We largely agree on that.

>
> Knowing, how the Unix system works and what it offers to the
> applications (the system calls), is helpful to circumvent many
> pitfalls.

We disagree here, as you've seen with my comments above.

>
>> (It's probably much better to get a good book about (structured)
>> programming instead (if the OP isn't already proficient in it).
>
> Yes, structured programming is recommended for programming in any
> programming language.

With the exception of Intercal[*], I think. ;-)

Janis

[*] https://en.wikipedia.org/wiki/INTERCAL

Hongyi Zhao

Jul 21, 2017, 10:50:06 PM
On Tue, 18 Jul 2017 14:57:57 +0200, Helmut Waitzmann wrote:

> some command 2>&1 > logfile

I often use the following form for simplicity:

some command &> logfile
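A portability note (mine, not the poster's): `&>` is a bash extension, not POSIX; a strictly POSIX shell parses `cmd &> logfile` as the background command `cmd &` followed by `> logfile`. The portable spelling remains the long form:

```shell
#!/bin/sh
# Portable form, valid in every POSIX shell:
ls /nonexistent_path_12345 > logfile 2>&1

# bash-only shorthand with the same effect:
#     ls /nonexistent_path_12345 &> logfile

grep -q 'nonexistent_path_12345' logfile && echo 'stderr captured'
rm -f -- logfile
```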

Regards
--
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.

Helmut Waitzmann

Jul 25, 2017, 5:31:08 PM
Janis Papanagnou <janis_pa...@hotmail.com>:
> On 18.07.2017 14:57, Helmut Waitzmann wrote:
>> Janis Papanagnou <janis_pa...@hotmail.com>:

>>> Shell is a high level language abstraction layer from the Unix system
>>> calls. It opens a lot of pitfalls by design and by herself that do
>>> not stem from the Unix system interface.
>>
>> Yes, that's absolutely correct. I didn't mean to bash Unix.
>
> (I didn't mean that you meant to bash Unix. - Intended pun, BTW?)

Yes, but an innocent one, neither Bash nor me bashing Unix.

> Some shell books also describe the basic Unix concepts to a degree
> necessary to understand the related shell concepts.

If there are such books available, then why aren't they read and
understood (see below)?

[…]

Often shell script programmers need to compute a command line and
then save it into a shell variable for later (re)use. But that's
not easy. One can't tell the shell to store a command line into a
shell variable rather than parse it: if a shell reads a command
line, it will parse and execute it.
Of course one can store an arbitrary character string into a shell
variable, for example the variable “commandline”. Then, if one
wants the shell to execute that string as a command line, one simply
writes the command

“eval "$commandline"”.

For example, the simple command

(0)

“printf '%s\n' \
'Saving the shell'\''s positional parameters in a' \
'variable is the big problem; expanding them is easy.'
”,

when invoked, would give the following output:

“Saving the shell's positional parameters in a
variable is the big problem; expanding them is easy.
”

Now, what would one have to write into a command line, that would
store this command (0) into the shell variable “commandline” in
order to execute it by

“eval "$commandline"”?

Just to copy and paste

“commandline=printf '%s\n' \
    'Saving the shell'\''s positional parameters in a' \
    'variable is the big problem; expanding them is easy.'
”
won't work: The shell's parser sees “commandline=printf”, which
is an assignment to the environment variable “commandline”
preceding the simple command “'%s\n' 'Saving the shell'\''s
positional ...”, which is not the “printf”-command, that was
intended.

To get the command line stored into the variable, one would have
to write

“commandline='printf '\''%s\n'\'' \
'\''Saving the shell'\''\'\'''\''s positional parameters in a'\'' \
'\''variable is the big problem; expanding them is easy.'\'''
”.

This is painful at least to do.
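bash can generate that quoting mechanically (a side note of mine, not from the original post): its `printf %q` emits each argument quoted so that it survives exactly one round of shell parsing, which is precisely what “eval” applies:

```shell
#!/bin/bash
[ -n "$BASH_VERSION" ] || exec bash "$0" "$@"

# Build the eval-safe command line without writing the quoting by hand;
# this reproduces the effect of command (0) above.
commandline=$(printf '%q ' printf '%s\n' \
    "Saving the shell's positional parameters in a" \
    'variable is the big problem; expanding them is easy.')

eval "$commandline"
```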

Note: There are many different command lines, that will execute
exactly the same “printf” command, when given to “eval”. What they
have got in common, however, is the invocation arguments list,
which is computed by the shell after parsing the command line and
consists of the following elements: “printf”, “%s\n”,
“Saving the shell's positional parameters in a”, and
“variable is the big problem; expanding them is easy.”.

And there is good news: This argument list can be stored
in some variables and reused for later execution without the need
of an extra level of quoting, for example:

Storing an argument list into the variables “arg0”, “arg1”,
... “arg<n-1>”, unsetting the variable “arg<n>” and storing the
number of variables (i.e.: <n>) into the variable “argc”:


    argc=0 &&
    for value in printf '%s\n' \
        'Saving the shell'\''s positional parameters in a' \
        'variable is the big problem; expanding them is easy.'
    do
        eval 'arg'"${argc}"'="$value"'
        argc=$((argc+1))
    done
    eval 'unset arg'"${argc}"


Retrieving:


    set ''; shift; n=0
    while eval 'test -n "${arg'"$n"'+defined}"'
    do
        eval 'arg="$arg'"$n"'"'
        set '' "$@" "$arg"; shift
        n=$((n+1))
    done


Executing:

“"$@"”

> Gathering arguments is not the big problem; usually expanding
> them is.

Yes, indeed: gathering arguments into the shell's positional
parameters (“"$@"”) is not a big problem.

But because it's tedious, many shell script programmers prefer
simpler but error-prone techniques, and therefore I don't
agree completely.

Problems arise, if one wants to store the positional parameters in
one shell variable rather than in a variable of its own for each
one to be able to free the positional parameters for another
purpose, then later invoke the stored values.


(1)

If one wants to save them into the variable “parameters”, one
could try the variable assignment

“parameters="$*"”.

Then the variable expansion “"$parameters"” will have the
following value (all in one line with single blanks in between):

“printf %s\n Saving the shell's positional parameters in a variable is the big problem; expanding them is easy.”

In this variable value, the shell cannot tell the difference
between, for example, the space character before respectively
after the word “Saving”, because the information about which of
the white space characters should separate the parameters and
which of them should remain in the parameter, is already lost. To
get it right, the former should separate the third from the second
invocation argument, the latter should be retained as part of the
third invocation argument. So neither of the following expansion
variants will work:


(1.1)

The command line

“${parameters}”

will invoke a command equivalent to the command line

“printf %s\\n Saving the shell\'s positional parameters in a \
variable is the big problem\; expanding them is easy.”,

which will split the variable value using the “"$IFS"” characters,
thus output each of the words in a line of its own:

“Saving
the
shell's
positional
parameters
in
a
variable
is
the
big
problem;
expanding
them
is
easy.
”


(1.2)

The command line

“"${parameters}"”

will invoke a command equivalent to the following command line
(all in one line with single blanks in between):

“'printf %s\n Saving the shell'\''s positional parameters in a variable is the big problem; expanding them is easy.'”,

which will fail, because there is no utility named

“printf %s\n Saving...”.


(1.3)

With the command line

“eval "${parameters}"”,

the shell will try to parse the variable value as the following
command line (all in one line with single blanks in between):

“printf %s\n Saving the shell's positional parameters in a variable is the big problem; expanding them is easy.”

and will fail at the apostrophe of the “shell's” genitive case.

> In any way it's a shell [programming] issue (no Unix issue).

Yes, it's all about translating (as if) the shell command line into
an (“execve()”-like) invocation arguments list, which makes the
parsing of the variable's value necessary, see
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tag_17_06>:

However, all of the standard utilities, including the regular
built-ins […], but not the special built-ins described
in Special Built-In Utilities, shall be implemented in a manner
so that they can be accessed via the exec family of functions as
defined in the System Interfaces volume of POSIX.1-2008 and can
be invoked directly by those standard utilities that require it
(env, find, nice, nohup, time, xargs).

And, indeed, for many standard utilities, the only way for
them to be accessed is via the exec family of functions.

That is, the shell must construct an invocation arguments list
from the command line.

And of course, shell scripts that gather a command line into one
variable by themselves must do it in a way that allows the shell
to construct an invocation arguments list out of the variable
value afterwards.

Apparently most shell script programmers never learn that, and I
guess that is because they don't even know of an invocation
arguments list, let alone the difference between it and the
command line. (That was the problem with Debian's “su” about a
decade ago: it merely glued the given arguments together with
spaces in between into the command line, like (1) above, which was
then parsed by the invoked shell, as in example (1.3).)

[Array variables in modern shells]

> Arrays help to keep groups of arguments semantically together,
> but they do not solve the expansion problem,

In the following example, the expansion problem is solved using an
array with “bash”:


(2)

The invocation arguments list could be gathered using an array by
doing

“declare -a parameters
parameters=(printf '%s\n' \
'Saving the shell'\''s positional parameters in a' \
'variable is the big problem; expanding them is easy.')
”.

The array variable contains an invocation arguments list rather
than a command line. It can be invoked by the command line

“"${parameters[@]}"”,

which will neither break the parameters at internal white space
nor confuse the shell's parser at the apostrophe or the semicolon,
because the parser won't look at the contents of the array
elements. The shell will simply pass them unmodified as the
invocation arguments list to the “execve()” system call (if it's
an external utility, or behave, as if it did) or process it by
itself (if it's a built-in).
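A minimal, self-contained check of variant (2) with “bash” (the
example sentence is the one from the text):

```shell
#!/usr/bin/env bash
# The array preserves each argument verbatim, including embedded
# blanks, the apostrophe, and the semicolon.
declare -a parameters
parameters=(printf '%s\n' \
    "Saving the shell's positional parameters in a" \
    'variable is the big problem; expanding them is easy.')

# Invoke the stored invocation arguments list.
"${parameters[@]}"
```

Nothing in the array elements is re-parsed; the shell only expands
each element to exactly one invocation argument.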


(3)

With “bash”, there is another way to get it right without using
arrays: The elements of the invocation arguments list could be
glued together with spaces in between, but properly quoted by
means of the formatting directive “%q” of the “bash”-built-in
“printf” and assigned to the shell variable “commandline”:

“commandline="$(printf '%q ' printf '%s\n' \
'Saving the shell'\''s positional parameters in a' \
'variable is the big problem; expanding them is easy.')"”


The variable “commandline” will then have the following contents
(all in one line with single spaces in between):

“printf %s\\n Saving\ the\ shell\'s\ positional\ parameters\ in\ a variable\ is\ the\ big\ problem\;\ expanding\ them\ is\ easy. ”

It could be given to the shell's “eval” command:

“eval "''${commandline}"”.

The “eval” command will see the following command line (all in one line with
single spaces in between):

“''printf %s\\n Saving\ the\ shell\'s\ positional\ parameters\ in\ a variable\ is\ the\ big\ problem\;\ expanding\ them\ is\ easy. ”,

which is equivalent (that is, it produces the same invocation
arguments list, when parsed and evaluated) though not equal to the
original command line (0).
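Variant (3), run end to end with “bash”:

```shell
#!/usr/bin/env bash
# Quote each argument with the bash-specific %q directive, glue the
# results with spaces into one scalar variable ...
commandline="$(printf '%q ' printf '%s\n' \
    "Saving the shell's positional parameters in a" \
    'variable is the big problem; expanding them is easy.')"

# ... and later let eval re-parse it; the %q quoting guarantees that
# the re-parsed result is the original invocation arguments list.
eval "''${commandline}"
```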


(4)

And finally, there is a way (though not a built-in one), to get it
right with a POSIX system, as well:

As the POSIX “printf” doesn't have the “%q” formatting directive,
a function, say, “quote_words”, is to be written, which takes each
of its positional parameters, replaces each apostrophe in it by
the sequence “'\''”, then encloses the positional parameter in
apostrophes, glues all this translated positional parameters into
one variable with spaces in between, and finally outputs that
variable to standard output.

It can then be used as a replacement for “printf '%q '” in (3)
above.
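One possible POSIX implementation of the “quote_words” function
described above (the name and interface are taken from the text;
note that, because command substitution strips trailing newlines,
arguments ending in newlines are not preserved exactly):

```shell
#!/bin/sh
# quote_words: apostrophe-quote each positional parameter (turning
# every embedded ' into the sequence '\'') and print them glued
# together with spaces, ready for later re-parsing by eval.
quote_words() {
    result=''
    for word
    do
        quoted=$(printf '%s\n' "$word" | sed "s/'/'\\\\''/g")
        result="${result}'${quoted}' "
    done
    printf '%s\n' "$result"
}

commandline="$(quote_words printf '%s\n' \
    "Saving the shell's positional parameters in a" \
    'variable is the big problem; expanding them is easy.')"
eval "${commandline}"
```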


> specifically not the buffer limit that you may reach with long
> argument lists if you use an external command.

This is another problem: the limited length of the invocation
arguments list (see below).

> IFS and quoting is basic, crucial shell knowledge. You learn
> all that from learning the shell

As there are many shell scripts, and applications using the shell,
in the wild that get it wrong in the manner of (1.1) or (1.3)
above, are you going to say that their developers lacked basic,
crucial shell knowledge? You may be right.

I can think of

* the options loop in “/usr/bin/ps2pdfwr” (like 1.1):

      OPTIONS="-P- -dSAFER"
      while true
      do
          case "$1" in
          -?*) OPTIONS="$OPTIONS $1" ;;
          *)   break ;;
          esac
          shift
      done

      […]

      exec "$GS_EXECUTABLE" $OPTIONS ...

The arguments are glued together in a partial command line
without proper quoting and later split using “"$IFS"”.
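A hypothetical fix in the manner of (2), assuming “bash” were
acceptable for the script: make OPTIONS an array, so that each
gathered option stays exactly one invocation argument even if it
contains blanks (GS_EXECUTABLE is taken from the original script):

```shell
#!/usr/bin/env bash
# Gather the leading options into an array instead of a scalar.
OPTIONS=(-P- -dSAFER)
while true
do
    case "$1" in
    -?*) OPTIONS+=("$1") ;;
    *)   break ;;
    esac
    shift
done

# Later: each array element becomes one invocation argument.
# exec "$GS_EXECUTABLE" "${OPTIONS[@]}" ...
```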

* the “/etc/init.d/dovecot” shell script (like 1.1):

      DAEMON_ARGS=""

      […]

      PIDBASE=${PIDBASE:-`
          sed -r "s/^[ \t]*base_dir[ \t]*=[ \t]*([^ \t]*)/\1/;t;d" \
          ${CONF}`}
      PIDFILE=${PIDBASE:-/var/run/dovecot}/master.pid

      […]

      start-stop-daemon --nicelevel 19 --start --quiet \
          --pidfile $PIDFILE --exec $DAEMON -- \
          -c ${CONF} $DAEMON_ARGS \
          || return 2


* the corrupted Debian “su” about a decade ago (like 1.3).

> Again, a shell issue. - Though an advanced issue; I hope a beginner
> would not root-su (probably even to other machines) without knowning
> a lot of Unix.

The cause of the error was, not to understand the difference
between the invocation arguments (here: of the shell) in the
System Interfaces volume of POSIX.1-2008 and the shell's command
line. That is basic and crucial knowledge.

The problem with “su” was not at the su user's side. It was in
the “su” program itself: Neither a shell beginner nor a shell
expert could use it in a reliable way.

My impression of many shell script programmers' shell knowledge is:
“Invocation arguments? Never heard of them. Constructing a command
line? Sorry, I neither know how to do this, nor do I understand
the explanation in my shell book.”

Are there books for shell programming, that address the solutions
(0) and (4), and, with “bash” and similar modern shells, (2) and
(3)?

[The size limit of the invocation arguments list]

> (not from the unrelated system calls, also not from the exec()
> family, since only the shell can tell you that it has a limit,
> otherwise you could only guess).

I guess the shell's limit (as long as there is virtual memory)
will be at least as wide as the “execve()” system call's limit.
Therefore, I think, the shell won't help with this problem. If
large invocation argument lists are a problem, there is GNU xargs,
which by means of the option “--null” is capable of processing
arbitrary invocation arguments read from standard input (for
example, arguments containing white space, backslashes,
apostrophes or quotation marks).
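For example (assuming GNU xargs is installed), NUL-delimited
arguments pass through a pipe unchanged:

```shell
#!/bin/sh
# Each argument is terminated by a NUL byte on standard input, so
# blanks, apostrophes, quotation marks and backslashes survive.
printf '%s\0' 'an argument with blanks' "it's quoted" 'a\backslash' |
    xargs --null printf '[%s]\n'
```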

> (Note that the shell has also own abstractions of concepts that you
> find on OS-level, and you'd get certainly confused if you read the
> corresponding system call and system data definitions instead of
> the shell description.

… not instead of the shell description, but as well as the shell
description. As the quotation above from the POSIX standard
shows, understanding invocation arguments lists is crucial, when
programming shell scripts.

The “su” developers wouldn't have done it wrong, had they read the
system call and system data definitions as well as the shell's
manual page or
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html>,
because “su” uses the “exec” family of functions to invoke the
shell.