
lndir in shell


Kaz Kylheku

6 Apr 2018, 00:36:24
The following is intended to replicate the salient feature of
the famous "lndir" utility from X11.

lndir fromdir todir creates a mirror of the fromdir directory
structure rooted at todir, and populates the mirror with symlinks
to the original files.

The links are relative if the fromdir argument is relative.

lndir is very useful; with lndir you can build code in a
build directory separate from the source tree. (Even code
which has no build support for this whatsoever.)

The shell dialect is POSIX; hence no "local var=$1" is used.

Any comments welcome.

lndir()
{
  fromdir=$1
  todir=$2
  abs=${fromdir%${fromdir#/}}

  find "$fromdir" \( -type f -o -type d \) | while read frompath ; do
    topath=${frompath#$fromdir}
    topath=${topath#/}
    [ -n "$topath" ] || topath="."
    if [ -f "$frompath" ] ; then
      if [ $abs ] ; then
        ln -sf "$frompath" "$todir/$topath"
      else
        old_IFS=$IFS
        IFS=/
        set -- $todir/$topath
        IFS=$old_IFS
        dots=""
        while [ $# -gt 0 ] ; do
          [ $1 = "." ] || dots="$dots../"
          shift
        done
        ln -sf "$dots$frompath" "$todir/$topath"
      fi
    else
      mkdir -p "$todir/$topath"
    fi
  done
}

--
TXR Programming Language: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1

Thomas 'PointedEars' Lahn

6 Apr 2018, 01:28:14
Kaz Kylheku wrote:

> The shell dialect is POSIX; hence no "local var=$1" is used.

That is no reason to introduce global variables. Use a subshell in the
function to define subshell-local ones.

> Any comments welcome.
>
> find "$fromdir" \( -type f -o -type d \) | while read frompath ; do

Never ever do this.

In particular:

1. Avoid using “read” in a shell-script loop. That is very inefficient by
comparison to the alternatives, such as awk(1), which is specified in
POSIX.1-2008.

2. Invoking “read” without options is a recipe for disaster, as e.g.
backslash in the input line acts as an escape character.

3. find(1) has a POSIX-compliant “-exec” predicate; use it:

find "$fromdir" \( -type f -o -type d \) -exec sh -c '…' sh '{}' \;
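To illustrate point 2, here is a minimal sketch of how a plain "read" alters a line compared with "IFS= read -r". The helper names naive_read and safe_read are made up for this example and do not come from the thread.

```shell
# Hypothetical helpers: each reads one line from stdin and prints it in brackets.
naive_read() { read line;         printf '<%s>\n' "$line"; }
safe_read()  { IFS= read -r line; printf '<%s>\n' "$line"; }

# The input line has leading and trailing blanks and a literal backslash.
printf '%s\n' '  a\tb  ' | naive_read   # prints <atb>
printf '%s\n' '  a\tb  ' | safe_read    # prints <  a\tb  >
```

The plain form drops the backslash and trims the surrounding blanks; neither form copes with a newline inside a file name.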

RTFM, STFW.

Also, you can test, by attempting it, whether the available cp(1) is GNU-
like and supports “-l” (create hardlinks instead of copying) or “-s” (create
symlinks instead of copying), which should make this a lot less complex and
more efficient.
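A minimal sketch of such a probe, assuming mktemp(1) is available (it is common but not specified by POSIX.1-2008). The function name cp_makes_symlinks is made up for this example.

```shell
# Hypothetical probe: succeeds only if "cp -s" exists and really creates a symlink.
cp_makes_symlinks() (
    tmp=$(mktemp -d) || exit 2
    : > "$tmp/src"
    # An absolute source name is used, as GNU cp requires for -s in general.
    cp -s "$tmp/src" "$tmp/dst" 2>/dev/null && [ -L "$tmp/dst" ]
    status=$?
    rm -rf "$tmp"
    exit "$status"
)

if cp_makes_symlinks; then
    echo 'cp -s is usable here'
else
    echo 'fall back to ln -s'
fi
```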

--
PointedEars
<https://github.com/PointedEars> | <http://PointedEars.de/wsvn/>
Twitter: @PointedEars2
Please do not cc me. /Bitte keine Kopien per E-Mail.

Janis Papanagnou

6 Apr 2018, 11:15:38
On 06.04.2018 06:36, Kaz Kylheku wrote:
> The following is intended to replicate the salient feature of
> the famous "lndir" utility from X11.
>
> lndir fromdir todir creates a mirror of the fromdir directory
> structure rooted at todir, and populates the mirror with symlinks
> to the original files.
>
> The links are relative if the fromdir argument is relative.
>
> lndir is very useful; with lndir you can build code in
> build directory separate from the source tree. (Even code
> which has no build support for this whatsoever.)
>
> The shell dialect is POSIX; hence no "local var=$1" is used.
>
> Any comments welcome.

Useful. Thanks!

>
> lndir()
> {
> fromdir=$1
> todir=$2
> abs=${fromdir%${fromdir#/}}
[...]

> if [ $abs ] ; then

Is there a reason you didn't quote this one?
I suspect you'd get issues with spaces here.

Janis

[...]


Janis Papanagnou

6 Apr 2018, 11:17:40
Forget my comment. You're just testing non-empty.

> Janis

Kaz Kylheku

6 Apr 2018, 11:50:21
Yes; abs is expected to contain only one of two values: empty or /.

(In the assignment where abs is set, whitespace is taken care of due to
the treatment of that context; we don't have to quote $bar in foo=$bar.)

But now I'm not so sure about the <X> argument position of ${param%<X>}.
Though we are basically delimited by the surrounding syntax, it
doesn't seem guaranteed that splitting and re-combining won't occur
which will turn multiple spaces into one.

> I suspect you'd get issues with spaces here.

Will investigate.

Janis Papanagnou

6 Apr 2018, 12:00:15
On 06.04.2018 17:50, Kaz Kylheku wrote:
[...]
>
> Yes; abs is expected to contain only one of two values: empty or /.
>
> (In the assignment where abs is set, whitespace is taken care of due to
> the treatment of that context; we don't have to quote $bar in foo=$bar,
>
> But now I'm not so sure about the <X> argument position of ${param%<X>}.

That should be safe, so...

> Though we are basically delimited by the surrounding syntax, it
> doesn't seem guaranteed that splitting and re-combining won't occur
> which will turn multiple spaces into one.
>
>> I suspect you'd get issues with spaces here.
>
> Will investigate.

... I suppose that's fine.

Janis

Kaz Kylheku

6 Apr 2018, 13:20:50
On 2018-04-06, Janis Papanagnou <janis_pa...@hotmail.com> wrote:
> On 06.04.2018 17:50, Kaz Kylheku wrote:
> [...]
>>
>> Yes; abs is expected to contain only one of two values: empty or /.
>>
>> (In the assignment where abs is set, whitespace is taken care of due to
>> the treatment of that context; we don't have to quote $bar in foo=$bar,
>>
>> But now I'm not so sure about the <X> argument position of ${param%<X>}.
>
> That should be safe, so...

For the purposes of just reducing to the leading slash, it doesn't
matter.

I have used that ${var%${var#pat}} type trick before though.

Hmm, in Bash it seems well-behaved:

$ spcvar="a b"
$ var=${spcvar% b}
$ echo "<$var>"
<a >

The spaces in ${spcvar% b} are preserved. I think that's true of any
expansion which takes place there. As if the % } are, effectively, a
pair of quotes.
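As a minimal sketch of the ${var%${var#pat}} trick applied to the leading slash: the function name leading_slash is made up for this example, and the inner expansion is quoted here so that it is taken literally rather than as a pattern.

```shell
# Hypothetical helper: print "/" if $1 is an absolute path, nothing otherwise.
leading_slash() {
    # ${1#/} is $1 without one leading slash; removing that as a suffix
    # of $1 leaves only the slash, or the empty string for a relative path.
    printf '%s\n' "${1%"${1#/}"}"
}

leading_slash /usr/src    # prints /
leading_slash src/lib     # prints an empty line
leading_slash '/a  b'     # prints /
```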

Ian Zimmerman

6 Apr 2018, 13:34:52
On 2018-04-06 07:28, Thomas 'PointedEars' Lahn wrote:

> 3. find(1) has a POSIX-compliant "-exec" predicate; use it:

This is the best way when the command is short (fits on a single line)
and needs no parametrization other than the file name from find.
Otherwise the awk way is more manageable.

Also, quoting from the GNU find manpage:

The string `{}' is replaced by the current file name being processed
everywhere it occurs in the arguments to the command, not just in
arguments where it is alone, as in some versions of find.

It is not clear if the GNU behavior here is specified by POSIX (and I am
too lazy to check).

There already is (or was) existing shell code for this task: look for
symlink-tree (or so) in the supporting scripts for GNU automake.
Myself, if I wanted to avoid depending on lndir or symlink-tree I'd do
this with perl.

--
Please don't Cc: me privately on mailing lists and Usenet,
if you also post the followup to the list or newsgroup.
To reply privately _only_ on Usenet and on broken lists
which rewrite From, fetch the TXT record for no-use.mooo.com.

Kaz Kylheku

6 Apr 2018, 14:12:38
On 2018-04-06, Ian Zimmerman <i...@no-use.mooo.com> wrote:
> This is the best way when the command is short (fits on a single line)

Using -exec is advantageous when we can directly execute
some utility. This is better than reading the paths into the
shell, and constructing that command.

Using -exec is inefficient in this situation:

find ... -exec sh -c 'script ... "{}" ...' \;

Compared to the read loop:

find ... | while read path ; do script ... "$path" ... ; done

The read loop is cheaper because it doesn't launch a new shell for each
repetition of script.

The difference will be particularly emphasized if script doesn't itself
spawn any external commands.

The script *per se* is equally slow whether it is executed as 'sh -c
script' or as the body of the loop. So the multiple spawns of the
shell only add overhead.

The interpolation of {} is fragile.

Of course, so is reading the output of find, but less so.

If a filename looks like abc$def, and is output by find and read into a
variable called path, then the quoting in "$path" safely interpolates it.

If find traverses the name abc$def and we insert it into
a shell script template as "{}", the quoting doesn't help;
we get the expansion "abc$def" where $def is expanded as a variable.

A smartly designed mechanism in find would be to use an environment
variable, like, say:

-execenv foo command ... \;

In this fantasy predicate, find will bind the traversed name to the
"foo" environment variable, which is passed down to command
and can be referenced in it with few or no issues.

(Of course, if find were intelligently designed, it would use square
brackets for grouping and exec termination would be something other
than the semicolon.)

I think the env utility can implement -execenv:

-exec env foo={} command ... \;

if {} will interpolate in the foo={} argument, it will do so without
any expansion issues. The env utility will, I think, cleanly treat any
characters after the = sign as contents for the variable.

However, env is a separate executable, so more overhead.

(The find implementor could recognize env and optimize it out.)

> and needs no parametrization other than the file name from find.
> Otherwise the awk way is more manageable.

I didn't use awk because this is going into a configure script.

$ grep awk configure Makefile
$

No awk in there now; so going from "no awk" to "some awk" is a big
dependency increment.

I don't see how awk would really help here all that much.

Awk doesn't have support for doing the "find" job, so find would
still be involved. The logic where we split the path along slashes
and count the components that aren't "." to generate the "../.."
could perhaps look better in Awk.

But then, issuing the mkdir -p and ln -s commands in Awk would be
uglified with system().

All in all, I don't envision that it would be a win.
>
> Also, quoting from the GNU find manpage:
>
> The string `{}' is replaced by the current file name being processed
> everywhere it occurs in the arguments to the command, not just in
> arguments where it is alone, as in some versions of find.

If it's replaced where it is alone, that's a safe mechanism, provided
that it's processed as a pure argument by the command that is spawned.

Thinking about this gives me the following idea:

Note this behavior:

$ sh -c 'echo $3' a b c d
d

A script passed with -c to the shell seems to have access to arguments.

If all the relevant aspects of that are portable, we can do:

find ... -exec sh -c 'script' -- {} \;

The {} is now a lone argument so works with any find. It is free of
expansion issues; the script accesses it as a positional parameter.
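A minimal sketch of that idea, assuming mktemp(1) is available. The helper name list_names is made up for this example, and "sh" rather than "--" is used as the placeholder that ends up in $0.

```shell
# Hypothetical helper: print the basename of every regular file under $1.
# The file name arrives as $1 of the inner shell and is never parsed as code.
list_names() {
    find "$1" -type f -exec sh -c 'printf "%s\n" "${1##*/}"' sh {} \;
}

tmp=$(mktemp -d) || exit 1
: > "$tmp/abc\$HOME"       # a file literally named abc$HOME
list_names "$tmp"          # prints abc$HOME, unexpanded
rm -rf "$tmp"
```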

> There already is (or was) existing shell code for this task: look for
> symlink-tree (or so) in the supporting scripts for GNU automake.
> Myself, if I wanted to avoid depending on lndir or symlink-tree I'd do
> this with perl.

If you wanted to avoid depending on lndir or symlink-tree, you'd
have a sort of psychological inconsistency if you didn't also want to
avoid depending on perl. :)

Helmut Waitzmann

6 Apr 2018, 15:47:55
Kaz Kylheku <157-07...@kylheku.com>:
> The following is intended to replicate the salient feature of
> the famous "lndir" utility from X11.
>
> lndir fromdir todir creates a mirror of the fromdir directory
> structure rooted at todir, and populates the mirror with symlinks
> to the original files.
>
> The links are relative if the fromdir argument is relative.
>
> lndir is very useful; with lndir you can build code in
> build directory separate from the source tree. (Even code
> which has no build support for this whatsoever.)
>
> The shell dialect is POSIX; hence no "local var=$1" is used.

You could use

lndir() ( )

rather than

lndir() { ;}

to obviate the "local var" command, should it be necessary.
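A minimal sketch of the difference, using a made-up function strip_slash whose body is a subshell: its assignments do not leak into the caller, at the cost of one fork per call.

```shell
# Hypothetical example: ( ) instead of { } makes every variable local.
strip_slash() (
    path=${1%/}            # assigned inside the subshell only
    printf '%s\n' "$path"
)

path=untouched
strip_slash src/           # prints src
printf '%s\n' "$path"      # prints untouched
```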

> lndir()
> {
> fromdir=$1
> todir=$2
> abs=${fromdir%${fromdir#/}}
>
> find "$fromdir" \( -type f -o -type d \) | while read frompath ; do

What about inodes in the "$fromdir" tree, that are neither regular
files nor directories, for example, symbolic links?

find ... | while IFS= read -r frompath
...
done

will fail, if there are any newlines in any pathname component.
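A minimal sketch of the failure, assuming mktemp(1) is available. The function names count_by_read and count_by_exec are made up for this example; both count the regular files below a directory.

```shell
# Hypothetical helpers: count regular files under $1 in two ways.
count_by_read() {
    find "$1" -type f | {
        n=0
        while IFS= read -r p; do n=$((n + 1)); done
        echo "$n"
    }
}
count_by_exec() {
    # Prints one number per invoked shell; a single batch is assumed here.
    find "$1" -type f -exec sh -c 'echo "$#"' sh {} +
}

nl=$(printf '\nx'); nl=${nl%x}     # a single newline character
tmp=$(mktemp -d) || exit 1
: > "$tmp/a${nl}b"                 # one file whose name contains a newline
count_by_read "$tmp"               # prints 2: the one name is read as two lines
count_by_exec "$tmp"               # prints 1
rm -rf "$tmp"
```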

A way to deal with this for example would be

find ... -exec sh -c -- '
for frompath
do
...
done' sh '{}' +

but that would make counting the depth of the "$todir/$topath" by

> old_IFS=$IFS
> IFS=/
> set -- $todir/$topath
> IFS=$old_IFS
> dots=""
> while [ $# -gt 0 ] ; do
> [ $1 = "." ] || dots="$dots../"
> shift
> done

impossible, because the positional parameters would be already in
use for the pathnames given to the shell. (One could write a
shell function to gather positional parameters in one shell
variable in a way that could be used with "eval" to restore the
positional parameters, but I don't think it's worth while doing
that, see below.)

Also, it might not work with a "$todir" containing any ".." or
symlink pathname components.

To obviate this problem, I would abolish the "todir" parameter,
requesting that the invoker always changes to the "$todir"
directory prior to invoking lndir() and therefore supplies a
(relative or absolute) "fromdir" parameter, that reaches the
source directory from starting at that then current (i.e. intended
"$todir") directory.

Otherwise, lndir() would have to canonicalize the "$todir"
parameter, i.e., expanding symbolic links, resolving ".." and
removing "." path components.

Now, if "$todir" is always ".", the commands

> old_IFS=$IFS
> IFS=/
> set -- $todir/$topath
> IFS=$old_IFS
> dots=""
> while [ $# -gt 0 ] ; do
> [ $1 = "." ] || dots="$dots../"
> shift
> done

could be replaced by

dots="$(
export LC_COLLATE LC_CTYPE
LC_COLLATE=POSIX; LC_CTYPE=POSIX
printf %s "$topath" |
tr -cd -- / |
tr -- / \\n |
sed -e s/\^/../ | tr -- \\n /
)"

or, if one likes to avoid calling programs (i.e. fork() and
execve(), assuming, that "false" is a shell built-in),

path="$topath" && dots= &&
while basename="${path##*/}"
      path="${path%"$basename"}"
      ${path:+:} false
do
  path="${path%/}"
  dots=../"$dots"
done
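The same count can be sketched in a more compact form. This is an illustration only, not a replacement proposed in the thread: the function name dots_for is made up, and the sketch assumes a relative path below the current directory without doubled slashes.

```shell
# Hypothetical helper: print one "../" for every "/" in the relative path $1,
# i.e. the prefix that leads from the link's directory back to the start.
dots_for() (
    rest=$1 dots=
    while [ "$rest" != "${rest#*/}" ]; do   # true while a slash remains
        rest=${rest#*/}                     # drop the first component
        dots=../$dots
    done
    printf '%s\n' "$dots"
)

dots_for file.c            # prints an empty line
dots_for src/lib/file.c    # prints ../../
```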

> topath=${frompath#$fromdir}
> topath=${topath#/}
> [ -n "$topath" ] || topath="."
> if [ -f "$frompath" ] ; then
> if [ $abs ] ; then
> ln -sf "$frompath" "$todir/$topath"

"ln" might fail, if the explicit end-of-options delimiter "--" is
not used, depending on the values of "$frompath" and
"$todir/$topath".

"ln" won't do, what was intended, when "$todir/$topath" is an
existing directory, but "$frompath" is not.

> else
> old_IFS=$IFS
> IFS=/
> set -- $todir/$topath
> IFS=$old_IFS
> dots=""
> while [ $# -gt 0 ] ; do
> [ $1 = "." ] || dots="$dots../"
> shift
> done
> ln -sf "$dots$frompath" "$todir/$topath"

(Same caveats here.)

> fi
> else
> mkdir -p "$todir/$topath"

(end-of-option delimiter recommended here, too)

> fi
> done
> }


WARNING: The following code has not been tested. There is no
error checking for the case, that

* either the source tree is (a part of) the destination tree (then
the files in the source tree might be replaced by dangling symbolic
links referring to themselves),

* or the destination tree is (a part of) the source tree (then it
might recurse till the file system is exhausted).

* Also, as far as I know, there is no way for a utility called from
"find" via "-exec" to tell "find", that it stop traversing the
file hierarchy and terminate.

To test this function, the "ln" and "mkdir" commands could be
replaced by

( set -- ln -sf -- "${1:?}" "$2"/ ; printf %s\\n "$*" )

( set -- ln -sf -- "$4$1" "$2"/ ; printf %s\\n "$*" )

( set -- mkdir -p -- "$topath" ; printf %s\\n "$*" )

respectively, in order to only print, what would be done
otherwise.

lndir()
{
  # Invocation:
  #   lndir fromdir

  # "$1" shall be (a symbolic link to) a directory:
  if ! test -d "${1:?}"
  then
    printf >&2 "%s: Not a directory.\\n" "$1"
    return 1
  fi

  # FIXME: What about symbolic links in the
  # "$1" tree?

  set -- "${1%/}"/
  find "$1" \( -type f -o -type d \) \
    -exec sh -c -- '
      set -e
      fromdir="${1:?}"
      shift
      if test " $1" = " $fromdir"
      then
        # This is the already existing root of the tree.
        # Nothing to do.
        shift
      fi
      abs="${fromdir%"${fromdir#/}"}"
      if ${abs:+:} false
      then
        # "$fromdir" is an absolute pathname.
        symlink()
        {
          # Invocation:
          #   symlink frompath tosubdir

          ln -sf -- "${1:?}" "$2"/
        }
      else
        # "$fromdir" is a relative pathname starting at the
        # current directory.
        symlink()
        {
          # Invocation:
          #   symlink frompath tosubdir

          # (ab-)use the positional parameters for local
          # named parameters:
          set -- "$1" "$2" "${2#.}" ""
          # "$@" = frompath tosubdir tosubdirrest dots
          while
            ${3:+:} false
          do
            set -- "$1" "$2" "${3%/*}" "$4"../
          done
          ln -sf -- "$4$1" "$2"/
        }
      fi
      # Let "$fromdir" have a trailing slash:
      fromdir="${fromdir%/}/"
      for frompath
      do
        topath="${frompath#"$fromdir"}"
        # "$topath" is not empty
        if test -d "$frompath"
        then
          # "$frompath" is (a symbolic link to) a directory.
          # Treat symbolic links to directories like directories:
          mkdir -p -- "$topath"
        else
          # "$frompath" is not (a symbolic link to) a directory.
          # Symlink it:
          tosubdir=./"$topath"
          symlink "$frompath" "${tosubdir%/*}"
        fi
      done' sh "$1" '{}' +
}

Thomas 'PointedEars' Lahn

6 Apr 2018, 15:53:21
Kaz Kylheku wrote:

> Using -exec is inefficient in this situation:
>
> find ... -exec sh -c 'script ... "{}" ...' \;

I did not say that you should do *that*. Read more carefully.

> Compared to the read loop:
>
> find ... | while read path ; do script ... "$path" ... ; done

This can be mitigated by using “+” instead of “\;”. I agree with Ian that,
in general, a long list of strings is probably better processed with awk
(AISB).

However, letting find(1) handle the filenames has the advantage that
filenames can contain newline, which is more difficult to process with awk,
and in a strictly POSIX compliant way requires the use of an external
command *before* awk (namely, printf(1)) anyway.

Since you are attempting to create a *drop-in replacement*, a *general*
solution that *always* *works*, IMHO “-exec” is the way to go despite its
being less efficient.

> The interpolation of {} is fragile.
>
> Of course, so is reading the output of find, but less so.

Utter nonsense.

Helmut Waitzmann

6 Apr 2018, 16:01:20
Ian Zimmerman <i...@no-use.mooo.com>:
> On 2018-04-06 07:28, Thomas 'PointedEars' Lahn wrote:
>
>> 3. find(1) has a POSIX-compliant "-exec" predicate; use it:

[...]

> Also, quoting from the GNU find manpage:
>
> The string `{}' is replaced by the current file name being processed
> everywhere it occurs in the arguments to the command, not just in
> arguments where it is alone, as in some versions of find.
>
> It is not clear if the GNU behavior here is specified by POSIX (and I am
> too lazy to check).

I'm sorry to tell, it's permitted by POSIX. I don't think, this
has been a good decision.

(Look for "{}" in
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html#tag_20_47_05>.)

> There already is (or was) existing shell code for this task: look for
> symlink-tree (or so) in the supporting scripts for GNU automake.
> Myself, if I wanted to avoid depending on lndir or symlink-tree I'd do
> this with perl.

On my Debian system there is a "lndir" binary executable.

Also, GNU cp has got the "--symbolic-link" option:

`-s'
`--symbolic-link'
Make symbolic links instead of copies of non-directories.
All source file names must be absolute (starting with `/')
unless the destination files are in the current directory.
This option merely results in an error message on systems
that do not support symbolic links.


Thomas 'PointedEars' Lahn

6 Apr 2018, 16:04:38
Ian Zimmerman wrote:

> On 2018-04-06 07:28, Thomas 'PointedEars' Lahn wrote:
>> 3. find(1) has a POSIX-compliant "-exec" predicate; use it:
>
> This is the best way when the command is short (fits on a single line)
> and needs no parametrization other than the file name from find.

NAK.

> Otherwise the awk way is more manageable.

Yes, but that should not be the only criterion for a solution. In fact, I
find it more important that a solution *works* than that it is easy to
maintain or aesthetically pleasing, because otherwise it is not a solution
at all.

> Also, quoting from the GNU find manpage:
>
> The string `{}' is replaced by the current file name being processed
> everywhere it occurs in the arguments to the command, not just in
> arguments where it is alone, as in some versions of find.
>
> It is not clear if the GNU behavior here is specified by POSIX (and I am
> too lazy to check).

No, that is the behavior that POSIX.1-2008 describes as “implementation-
dependent”; what the GNU find manpage says is the case “in some versions
of find” is what is required by POSIX instead. Hence my suggestion.

FYI: In GNU find, “{}” within an argument to “-exec” can be written “\{\}”
to escape it, and not expand it.

You might find this bookmarklet/search engine entry useful:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/%s.html

I use “posix-utils” as keyword for it.

> There already is (or was) existing shell code for this task: look for
> symlink-tree (or so) in the supporting scripts for GNU automake.

Thanks.

> Myself, if I wanted to avoid depending on lndir or symlink-tree I'd do
> this with perl.

I would use mc(1) as I have done before.

Thomas 'PointedEars' Lahn

6 Apr 2018, 16:11:31
Helmut Waitzmann wrote:

> Kaz Kylheku <157-07...@kylheku.com>:
>> The shell dialect is POSIX; hence no "local var=$1" is used.
>
> You could use
>
> lndir() ( )
>
> rather than
>
> lndir() { ;}
>
> to obviate the "local var" command, should it be necessary.

Fascinating, the former is in fact POSIX-compliant. Until now, I have used
(and meant) the equivalent of

lndir ()
{
(

exit …
)

return $?
}

instead. Thank you.

<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_05>

> [tl;dr]

Kaz Kylheku

6 Apr 2018, 18:19:22
On 2018-04-06, Helmut Waitzmann <nn.th...@xoxy.net> wrote:
> Ian Zimmerman <i...@no-use.mooo.com>:
>> On 2018-04-06 07:28, Thomas 'PointedEars' Lahn wrote:
>>
>>> 3. find(1) has a POSIX-compliant "-exec" predicate; use it:
>
> [...]
>
>> Also, quoting from the GNU find manpage:
>>
>> The string `{}' is replaced by the current file name being processed
>> everywhere it occurs in the arguments to the command, not just in
>> arguments where it is alone, as in some versions of find.
>>
>> It is not clear if the GNU behavior here is specified by POSIX (and I am
>> too lazy to check).
>
> I'm sorry to tell, it's permitted by POSIX. I don't think, this
> has been a good decision.
>
> (Look for "{}" in
><http://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html#tag_20_47_05>.)
>
>> There already is (or was) existing shell code for this task: look for
>> symlink-tree (or so) in the supporting scripts for GNU automake.
>> Myself, if I wanted to avoid depending on lndir or symlink-tree I'd do
>> this with perl.
>
> On my Debian system there is a "lndir" binary executable.

Yes; it's from the X11 project and dates back to the 1980's.

I seem to recall the folklore that the X11 people developed that in
order to build in a separate directory, made necessary by the
source code being NFS-mounted (either mounted read-only, or simply not
wanting to write over NFS).

All the C utilities I have ever cloned in shell code have been from X11,
by weird luck.

Well, all *two* of them.

The other one is resize: the incredibly useful little bugger which talks
to your TTY to figure out how big the screen is, and then plants those
numbers into the "struct termios" in the TTY driver.

Ironically, you do *not* need resize on a modern X11 system because of all
the integration with the POSIX tty/pty plumbing. When you resize a
graphical terminal, the kernel gets informed, the slave TTY is updated,
userland gets the SIGWINCH and all. Nobody ever has to do
"eval $(resize)" in xterm, Gnome terminal, rxvt or what have you.

Where resize is useful is embedded systems with a serial console.

And those buggers, by and large, don't have X11 packaged.

Kaz Kylheku

6 Apr 2018, 18:41:11
On 2018-04-06, Helmut Waitzmann <nn.th...@xoxy.net> wrote:
> You could use
>
> lndir() ( )
>
> rather than
>
> lndir() { ;}
>
> to obviate the "local var" command, should it be necessary.

[ ... ]

> find ... | while IFS= read -r frompath
> ...
> done
>
> will fail, if there are any newlines in any pathname component.
>
> A way to deal with this for example would be
>
> find ... -exec sh -c -- '
> for frompath
> do
> ...
> done' sh '{}' +
>
> but that would make counting the depth of the "$todir/$topath" by
>
>> old_IFS=$IFS
>> IFS=/
>> set -- $todir/$topath
>> IFS=$old_IFS
>> dots=""
>> while [ $# -gt 0 ] ; do
>> [ $1 = "." ] || dots="$dots../"
>> shift
>> done
>
> impossible, because the positional parameters would be already in
> use for the pathnames given to the shell.

Not so!!!

By the time we get to this piece of code, the positional parameters
we want have been copied to local variables and we can clobber
them.

Each -exec spawns a new shell with new positional parameters; they don't
matter; nothing sees them after the script finishes.
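A minimal sketch of that pattern: the argument is copied to a variable first, after which the positional parameters are free to be reused for splitting. The function name depth_of is made up for this example, and "set -f" is an addition to keep the unquoted expansion from globbing.

```shell
# Hypothetical helper: print the number of slash-separated components in $1.
depth_of() (
    path=$1        # save the argument first ...
    IFS=/
    set -f         # no pathname expansion while splitting
    set -- $path   # ... then clobber the positional parameters
    echo "$#"
)

depth_of a/b/c       # prints 3
depth_of 'a/*/c'     # prints 3
```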

Look, the following works for me. It is a trivial transformation: moving
exactly the same body of code into a '...' literal handed off to sh -c.

I added the -- arguments to the ln and mkdir commands, and put the body
into a subshell. Now we don't just have locals, but are exporting
them, so we have to be tidy.

For my purposes (small number of files) this grossly inefficient
monstrosity would work:

#!/bin/sh

lndir()
(
  export fromdir=$1
  export abs=${fromdir%${fromdir#/}}
  export todir=$2

  find "$fromdir" \( -type f -o -type d \) -exec sh -c \
    'frompath=$0
     topath=${frompath#$fromdir}
     topath=${topath#/}
     [ -n "$topath" ] || topath="."
     if [ -f "$frompath" ] ; then
       if [ $abs ] ; then
         ln -sf -- "$frompath" "$todir/$topath"
       else
         old_IFS=$IFS
         IFS=/
         set -- $todir/$topath
         IFS=$old_IFS
         dots=""
         while [ $# -gt 0 ] ; do
           [ $1 = "." ] || dots="$dots../"
           shift
         done
         ln -sf -- "$dots$frompath" "$todir/$topath"
       fi
     else
       mkdir -p -- "$todir/$topath"
     fi' {} \;
)

However, what seems dodgy to me here is the business of assuming that

frompath=$0

will access the {} argument which immediately follows 'script'
in the sh -c 'script' {} line.

I don't have a lot of shells installed on this Ubuntu thing here.
It works whether I substitute "dash" or "bash" for "sh".

Thomas 'PointedEars' Lahn

6 Apr 2018, 22:55:46
This is well-documented, POSIX-compliant behavior. We have discussed it
here recently.

But it is why I suggested that you use “sh” for the second non-option
argument of “sh -c”, and '{}' for the *third* one instead.

Then you can use “+” instead of “\;”, and skip “set” (which you are using
in an unsafe way anyway, and the “while” loop appears pointless *here*),
whereas sh(1) will be invoked with as many positional arguments (i.e. file
paths) as possible, and thus as few times as possible, increasing
efficiency.
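A minimal sketch of the difference in the number of shells started, assuming mktemp(1) is available. The function name count_shells is made up for this example.

```shell
# Hypothetical helper: count how many times find starts sh.
# $1: directory to scan, $2: the -exec terminator, either ';' or '+'
count_shells() {
    find "$1" -type f -exec sh -c 'echo run' sh {} "$2" | wc -l
}

tmp=$(mktemp -d) || exit 1
: > "$tmp/a"; : > "$tmp/b"; : > "$tmp/c"
count_shells "$tmp" ';'    # 3: one shell per file
count_shells "$tmp" '+'    # 1: one shell for the whole batch
rm -rf "$tmp"
```

Some wc(1) implementations pad the number with blanks, so compare it numerically rather than as a string.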

It is also not a good idea to always overwrite existing symlinks.

Finally, note that, as a quirk of the ln(1) utility because it accepts the
target directory for the symlink as the second non-option argument, creating
a symlink in an attempt to overwrite a symlink for a directory will create
the symlink in the symlinked directory instead; AFAIK, you have to rm(1) the
symlink for the directory first:

1.

'- foo -------> bar
'- ...

2. ln -sf baz foo

3.

:
'- foo -------> bar
:- ...
'- baz ------> baz

Compare:

4. rm foo/baz
rm foo

5.

:
'-

6. ln -sf baz foo

:
'- foo -------> baz

(I have encountered this problem repeatedly when trying to modify symlinks
to directories stored in “/usr/local/”.)

Note that neither GNU ln(1)’s “-n” option nor another option that makes this
easier is specified in POSIX.1-2008.
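The steps above can be replayed in a scratch directory. This is a minimal sketch, assuming mktemp(1) is available; the function name relink_demo is made up, and baz does not need to exist for the links to be created.

```shell
# Hypothetical replay of the quirk inside the directory given as $1.
relink_demo() (
    cd "$1" || exit 1
    mkdir bar
    ln -s bar foo       # foo -> bar
    ln -sf baz foo      # foo names a directory, so the link lands inside it
    ls bar              # prints baz
    rm foo              # remove the old link itself first
    ln -s baz foo       # now foo -> baz
)

tmp=$(mktemp -d) || exit 1
relink_demo "$tmp"
rm -rf "$tmp"
```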

Helmut Waitzmann

8 Apr 2018, 08:48:55
Thomas 'PointedEars' Lahn <Point...@web.de>:

> Finally, note that, as a quirk of the ln(1) utility because it
> accepts the target directory for the symlink as the second
> non-option argument, creating a symlink in an attempt to
> overwrite a symlink for a directory will create the symlink in
> the symlinked directory instead; AFAIK, you have to rm(1) the
> symlink for the directory first:
>
> 1.
>
> '- foo -------> bar
> '- ...
>
> 2. ln -sf baz foo
>
> 3.
>
> :
> '- foo -------> bar
> :- ...
> '- baz ------> baz
>
> Compare:
>
> 4. rm foo/baz
> rm foo
>
> 5.
>
> :
> '-
>
> 6. ln -sf baz foo
>
> :
> '- foo -------> baz

[…]

> Note that neither GNU ln(1)’s “-n” option nor another option
> that makes this easier is specified in POSIX.1-2008.

In this special use case of the OP, where the basename of the
symlink's target equals the basename of the symlink's pathname,
i.e.

ln -fs source_dir/the_basename target_dir/the_basename

one could instead do

ln -fs source_dir/the_basename target_dir/

which will remove the already existing symlink
"target_dir/the_basename" before creating the new one.

(But that has already been written and you could have read it,
hadn't you "tl;dr". RTEMs.)
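A minimal sketch of the trailing-slash form, assuming mktemp(1) is available. The function name link_into and the directory name "old" are made up for this example; source_dir does not need to exist for the link to be created.

```shell
# Hypothetical helper: create a link named like the basename of $1
# inside the existing directory $2, replacing any link already there.
link_into() {
    ln -fs -- "$1" "$2"/
}

tmp=$(mktemp -d) || exit 1
mkdir "$tmp/old" "$tmp/target_dir"
ln -s ../old "$tmp/target_dir/the_basename"   # stale link that resolves to a directory
link_into source_dir/the_basename "$tmp/target_dir"
ls "$tmp/old"       # prints nothing: the stale link was replaced, not followed
rm -rf "$tmp"
```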

Helmut Waitzmann

8 Apr 2018, 10:36:35
Kaz Kylheku <157-07...@kylheku.com>:
> On 2018-04-06, Helmut Waitzmann <nn.th...@xoxy.net> wrote:

>> find ... -exec sh -c -- '
>> for frompath
>> do
>> ...
>> done' sh '{}' +
>>
>> but that would make counting the depth of the "$todir/$topath" by
>>
>>> old_IFS=$IFS
>>> IFS=/
>>> set -- $todir/$topath
>>> IFS=$old_IFS
>>> dots=""
>>> while [ $# -gt 0 ] ; do
>>> [ $1 = "." ] || dots="$dots../"
>>> shift
>>> done
>>
>> impossible, because the positional parameters would be already in
>> use for the pathnames given to the shell.
>
> Not so!!!
>
> By the time we get to this piece of code, the positional parameters
> we want have been copied to local variables and we can clobber
> them.

As Thomas pointed out already, if one uses a "+" rather than a ";"
to terminate the "-exec" predicate, then there are many frompaths
passed to the invoked shell via the invocation arguments, which
can be accessed by the shell's positional parameters.

[…]

> find "$fromdir" \( -type f -o -type d \) -exec sh -c \
> 'frompath=$0
[…]
> fi' {} \;

[…]

> However, what seems dodgy to me here is the business of assuming that
>
> frompath=$0
>
> will access the {} argument which immediately follows 'script'
> in the sh -c 'script' {} line.

Yes. I think so, too.

As Thomas already wrote: You can avoid that by supplying a
command_name, for example, "sh", before the "{}"; then the
pathnames found by "find" will be passed to the shell as
additional invocation arguments and can be accessed by the shell
via the positional parameters starting with "$1".
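A minimal sketch of how the operands after the command string are assigned. The function name show_args is made up for this example.

```shell
# Hypothetical helper: show what the inner shell sees as $0, $1 and $#.
show_args() {
    sh -c 'printf "%s|%s|%s\n" "$0" "$1" "$#"' "$@"
}

show_args first second    # prints first|second|1
show_args sh one two      # prints sh|one|2
```

Without a command_name, the first pathname would land in $0 and be skipped by "for frompath" and "$@".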

> I don't have a lot of shells installed on this Ubuntu thing here.
> It works whether I substitute "dash" or "bash" for "sh".

I've seen a shell (don't remember which shell at which unix
system, maybe at Solaris), that changed its behavior according to
the command_name, that was given to it: If the command_name
started with a "-", it would behave as a login shell.

I think it would be best to always let the command_name be equal
to the invocation name. In your usecase:

find ... -exec \
  sh -c 'for frompath; do ...' sh '{}' +
  ^^    ^^^^^^^^^^^^^^^^^^^^^^ ^^
  |     ` command_string       |
  |                            ` command_name
  ` invocation_name

Ian Zimmerman

9 Apr 2018, 14:15:25
On 2018-04-06 18:12, Kaz Kylheku wrote:

> > Myself, if I wanted to avoid depending on lndir or symlink-tree I'd
> > do this with perl.
>
> If you wanted to avoid depending on lndir or symlink-tree, you'd
> have a sort of psychological inconsistency if you didn't also want to
> avoid depending on perl. :)

Well, avoiding unnecessary dependencies is one metric to decide which
solution to prefer. There are other metrics which complement it.

When I wrote the above paragraph, I had a strong feeling that in fact _I
had_ written such a program, but I couldn't locate it then. Now I can.
I dropped a copy here:

https://very.loosely.org/paste/symlink-farm

I think it handles at least a few more corner cases than the automake
one. Didn't compare with yours or with lndir.

Kaz Kylheku

9 Apr 2018, 15:38:10
On 2018-04-09, Ian Zimmerman <i...@no-use.mooo.com> wrote:
> On 2018-04-06 18:12, Kaz Kylheku wrote:
>
>> > Myself, if I wanted to avoid depending on lndir or symlink-tree I'd
>> > do this with perl.
>>
>> If you wanted to avoid depending on lndir or symlink-tree, you'd
>> have a sort of psychological inconsistency if you didn't also want to
>> avoid depending on perl. :)
>
> Well, avoiding unnecessary dependencies is one metric to decide which
> solution to prefer. There are other metrics which complement it.
>
> When I wrote the above paragraph, I had a strong feeling that in fact _I
> had_ written such a program, but I couldn't locate it then.

Maybe it's made its way into CPAN.

> Now I can.
> I dropped a copy here:
>
> https://very.loosely.org/paste/symlink-farm
>
> I think it handles at least a few more corner cases than the automake
> one. Didn't compare with yours or with lndir.

By the way, I have a specific use case in mind for this that I'm not
discussing, because I posted the code to gather comments, not to defend
the coding choices that are right in my specific requirements.

Some of the comments have been useful; others understandably apply in
broader circumstances (useful in the future, or to someone else).

E.g: performance is totally irrelevant well into the foreseeable future
due to the small size of the affected subtree; there won't be spaces,
control characters or shell-meta characters in the filenames; and
nothing will look like a command line option, since the from-dir is a
source file sub-tree in a software project.

(Well, performance is never "totally irrelevant"; because what that
literally means is that it's okay for what ought be a three second
operation to take three months. But YKWIM.)

Thomas 'PointedEars' Lahn

9 Apr 2018, 19:45:49
Kaz Kylheku wrote:

> By the way, I have a specific use case in mind for this that I'm not
> discussing, because I posted the code to gather comments, not to defend
> the coding choices that are right in my specific requirements.

Your solution cannot be discussed without knowing its use-case. As you did
not state any, it stood to reason that you are trying to write a drop-in
replacement.

Thanks for wasting my time. Score adjusted.