In article <2015052407...@chaz.gmail.com>,
Stephane Chazelas <stephane...@gmail.com> wrote:
> isempty() (
> [ -d "$1" ] || exit 2
> content=$(ls -ALq -- "$1") || exit 2
> [ -z "$content" ]
> )
>
> would be POSIX (2008 for -A) and relatively reliable.
But it launches two subshells plus an external command, so it's quite
slow.
Also, a note on quoting: under POSIX, the value in a plain variable
assignment is not subject to field splitting or pathname expansion
(globbing), so the unquoted command substitution above is safe as it
stands. Quoting it is harmless, though, and it does matter in other
contexts (such as arguments to 'export' or 'local' in some shells), so
as a defensive habit I'd still write it quoted:
content="$(ls -ALq -- "$1")"
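How a given shell treats an unquoted command substitution in an assignment is easy to check empirically. Below is a minimal sketch, assuming mktemp is available (it is not POSIX); the names tmp and v are made up for this example.

```shell
# Scratch directory with a few files that '*' could match.
tmp=$(mktemp -d) || exit 1
: > "$tmp/a"
: > "$tmp/b"
cd "$tmp" || exit 1

# Unquoted command substitution on the right-hand side of an assignment.
v=$(printf '%s' '*')

# Quote the expansion here, so only the assignment itself is under test.
printf '%s\n' "$v"

# Clean up the scratch directory.
cd / && rm -rf "$tmp"
```

POSIX specifies no field splitting or pathname expansion for assignments, so a conforming shell should print a literal * here rather than the file names.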
> [...]
> > dirisempty() {
> > cd -P -- "$1" || return 2
>
> Note that you don't need to be able to cd to a directory to list
> its content (all you need is read permission).
>
> If you want to check if the directory called "-" is empty,
> you'll have to write it dirisempty ./- (same problem with other
> values like -1, -2, +1 in some shells)
>
> Note that -P is a POSIX addition, if you're going to use that
> (which you need on POSIX shells), then you might as well use [
> instead of test.
Yes, I write for current, pure POSIX shells. I've been working on a
rather ambitious general-purpose POSIX shell library project. It needs a
reliable and fast way to test for an empty directory. I'd rather avoid
using a subshell or external command if at all possible.
'test' and '[' are completely equivalent so there is no functional
difference. But I find 'test' more legible. I also find the way that the
'[' command tries to masquerade as syntax rather evil, because it
involves rather annoying mandatory spaces where you wouldn't expect them
in syntax. Matter of taste, I suppose.
> A few notes:
>
> - zsh and shells based on the Forsyth shell (pdksh, mksh, posh,
> probably oksh, the original Minix shell, don't know about the
> current) never expand . nor .. in their globs, so .* will expand
> to .* in an empty dir in those shells
Whoops, missed that one. You're absolutely right.
(The current Minix default shell is a rather broken version of the
Almquist shell. I've confirmed it to work on there, though.)
So this means there are two possible outcomes to test for, the other one
being:
    test "$#" -eq 2 \
    && test "$1" = '.*' \
    && test ! -L '.*' \
    && test ! -e '.*' \
    && test "$2" = '*' \
    && test ! -L '*' \
    && test ! -e '*'
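To see which of the two outcomes a particular shell produces, something like the following sketch can be run. It assumes mktemp is available (not POSIX); the function name show_empty_glob is made up for this example.

```shell
# Print the positional parameters that '.* *' yields in an empty directory.
# Runs in a subshell so the caller's directory and settings are untouched.
show_empty_glob() (
    tmp=$(mktemp -d) || exit 1
    cd "$tmp" || exit 1
    LC_ALL=C
    set +f                 # make sure globbing is enabled
    set -- .* *
    printf '%s' "$#:"
    printf ' <%s>' "$@"
    printf '\n'
    cd / && rmdir "$tmp"
)

show_empty_glob
```

A shell that expands . and .. should report three parameters (., .. and a literal *), while one that never matches them should report two (a literal .* and a literal *).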
> - unless you fix the locale to C, you've got no guarantee that .
> sorts before ..
Is there actually *any* locale where a single character sorts after a
double of the same character? I find it hard to imagine.
I suppose it's better to be safe than sorry, though, so I'll set LC_ALL
to C and restore it after.
> - Not all file systems have . and .. entries and not all OSes
> fake one if they're missing, so same as above, .* may expand to
> .*
Do you know any examples of systems that don't fake them? Not that it
matters now, because it's covered.
> - if you've got search but not read permission to the directory,
> that glob will expand to .* *.
We need read permission anyway, so I'll just test for that.
> - test "$x" = whatever may fail in some non-POSIX shell if $x is
> ! or (... for instance.
Yes. Argh. I keep forgetting that.
(In my library I've got strcmp() that deals with this correctly.)
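One way to sidestep the problem is to compare strings with 'case' instead of 'test'. The sketch below is only an illustration of that idea, not the library function mentioned above; the name str_eq is hypothetical.

```shell
# str_eq A B: succeed if the two strings are identical.
# A quoted pattern in 'case' is matched literally, so operands such as
# '!', '(' or '*' cannot be mistaken for operators or glob patterns.
str_eq() {
    case $1 in
    ( "$2" ) return 0 ;;
    esac
    return 1
}

if str_eq '!' '!'; then echo "equal"; fi
if ! str_eq 'abc' '*'; then echo "different"; fi
```

Since 'case' is shell syntax rather than a command, this also avoids any fork or external command.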
> > && test ! -e '*'
>
> test ! -e '*' will return false if * is a symlink to a
> non-existent or inaccessible (as in a directory you don't have
> search access for) file.
Great catch, thank you.
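The effect is easy to reproduce in isolation. A minimal sketch, assuming mktemp is available (not POSIX); the variable names and the no-such-target path are made up for this example.

```shell
# Create a symlink literally named '*' that points at nothing.
tmp=$(mktemp -d) || exit 1
ln -s "$tmp/no-such-target" "$tmp/*"

# -e follows the link and finds nothing; -L looks at the link itself.
if test -e "$tmp/*"; then e_sees=yes; else e_sees=no; fi
if test -L "$tmp/*"; then l_sees=yes; else l_sees=no; fi
printf '%s\n' "-e sees it: $e_sees" "-L sees it: $l_sees"

rm -rf "$tmp"
```

So a check based on -e alone would call this directory empty, even though it contains an entry.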
> You'd want to test for ! -L '*' as well (though if you don't
> have search access to the dir and end up using a solution that
> doesn't involve cd, that won't help).
>
> > case "$?" in
> > ( 0 ) cd "$OLDPWD" && return 0 ;;
> > ( 1 ) cd "$OLDPWD" && return 1 ;;
> > esac
>
> cd "$OLDPWD" gives you no guarantee to return to where you were,
> for instance if any of the path components have been renamed
> (even before that function was called).
>
> Best is to use cd in a subshell here.
I care quite a lot about performance, so I'd really rather not. But I
must agree -- I can't find any other way around this than using a
subshell.
So now it looks like a tradeoff: either accept that it's impossible to
return to a nonexistent PWD or accept slow performance.
How likely is it really that your PWD doesn't exist? Perhaps trying to
defend against this is futile. After all, you're hardly going to delete
your own PWD without expecting breakage -- and if it's possible for
something else to delete or manipulate your PWD while you're working in
it, you've already got bigger problems to worry about than a failure to
return to your nonexistent PWD. So I tend to doubt it's worth the
performance tradeoff. At least the function tests and exits on failure
to return to $OLDPWD.
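For reference, the failure mode under discussion can be reproduced with a sketch like this one. It assumes mktemp is available (not POSIX) and that the system allows renaming a directory that is some process's working directory; the function name oldpwd_demo and the directory names are made up.

```shell
# Show that 'cd "$OLDPWD"' fails once the old directory has been renamed.
# Runs in a subshell so the caller's working directory is untouched.
oldpwd_demo() (
    tmp=$(mktemp -d) || exit 1
    mkdir "$tmp/work"
    cd "$tmp/work" || exit 1
    mv "$tmp/work" "$tmp/renamed"    # rename the directory we are standing in
    cd / || exit 1                   # OLDPWD now holds the stale path
    if cd "$OLDPWD" 2>/dev/null; then
        echo "returned to $PWD"
    else
        echo "cannot return: $OLDPWD no longer exists"
    fi
    cd / && rm -rf "$tmp"
)

oldpwd_demo
```

OLDPWD is just a remembered path string, so it goes stale as soon as any component of that path is renamed.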
Two new versions are below. The first one still does not use any
subshell or external command. It's getting kind of absurdly long, but
it's still generally about six times faster than anything using a
subshell or external command (on bash, only about three times faster).
Note: on zsh, the first version only works in POSIX/'emulate sh' mode
(which is all I personally care about anyway). Without it, zsh will exit
on non-matching glob patterns, or in an interactive shell, it will abort
execution and leave you with LC_ALL=C in the environment.
Also, this function does not work with 'set -e' (errexit) active. That
option makes it very hard to distinguish between 'false' and 'error'
exit statuses of commands, because any non-zero exit status is treated
as a fatal error. Ugly hacks would be needed to cope with it.
The second version uses a subshell, so it is shorter and probably
safer, but (as said) much slower. One big advantage is that there is no
need to restore any variable or setting. For zsh, it activates POSIX
emulation within the subshell, so it works on default zsh too.
I'd be very interested to hear about it if either version below breaks
on some POSIX shell or system in some way.
- M.
First version (no subshell):
dirisempty() {
    cd -P -- "$1" || return 2
    # Unreadable directory: go back first so the caller's PWD is not changed.
    test -r '.' || { cd "$OLDPWD"; return 2; }
    # Enforce C locale to ensure correct sorting.
    if test "${LC_ALL+set}" = 'set'; then
        _saveLC_ALL="$LC_ALL"
    else
        unset -v _saveLC_ALL
    fi
    LC_ALL=C
    # Temporarily enable globbing if it's disabled;
    # set positional parameters to directory contents.
    case "$-" in
    ( *f* ) set +f; set -- .* *; set -f ;;
    ( * ) set -- .* * ;;
    esac
    # Restore locale.
    if test "${_saveLC_ALL+set}" = 'set'; then
        LC_ALL="$_saveLC_ALL"
    else
        unset -v LC_ALL
    fi
    # Test for number and content of parameters
    # corresponding to an empty directory.
    {
        test "$#" -eq 3 \
        && test "$1" = '.' \
        && test "$2" = '..' \
        && test "$3" = '*' \
        && test ! -L '*' \
        && test ! -e '*'
    } || {
        test "$#" -eq 2 \
        && test "$1" = '.*' \
        && test ! -L '.*' \
        && test ! -e '.*' \
        && test "$2" = '*' \
        && test ! -L '*' \
        && test ! -e '*'
    }
    # Return to the original directory, passing on the result.
    case "$?" in
    ( 0 ) cd "$OLDPWD" && return 0 ;;
    ( 1 ) cd "$OLDPWD" && return 1 ;;
    esac
    echo "dirisempty: fatal error: 'test' or 'cd' failed" 1>&2
    case "$-" in ( *i* ) return 3 ;; ( * ) exit 3 ;; esac
}
Second version (subshell):
dirisempty() (
    cd -P -- "$1" && test -r '.' || exit 2
    # Apply compatible settings within subshell.
    [ -n "$ZSH_VERSION" ] && emulate sh || POSIXLY_CORRECT=y
    set +e +f
    LC_ALL=C
    # Set positional parameters to directory contents.
    set -- .* *
    # Test for number and content of parameters
    # corresponding to an empty directory.
    {
        test "$#" -eq 3 \
        && test "$1" = '.' \
        && test "$2" = '..' \
        && test "$3" = '*' \
        && test ! -L '*' \
        && test ! -e '*'
    } || {
        test "$#" -eq 2 \
        && test "$1" = '.*' \
        && test ! -L '.*' \
        && test ! -e '.*' \
        && test "$2" = '*' \
        && test ! -L '*' \
        && test ! -e '*'
    }
)
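For anyone who wants to try this on their own shell, here is a rough sketch of a test script. It assumes mktemp is available (not POSIX). To run on its own it carries a condensed copy of the subshell version, without the zsh line; the check helper, the failures counter and the directory names are made up for this example.

```shell
# Condensed copy of the subshell version, repeated only so this runs alone.
dirisempty() (
    cd -P -- "$1" && test -r . || exit 2
    LC_ALL=C
    set +e +f
    set -- .* *
    {
        test "$#" -eq 3 && test "$1" = . && test "$2" = .. \
            && test "$3" = '*' && test ! -L '*' && test ! -e '*'
    } || {
        test "$#" -eq 2 \
            && test "$1" = '.*' && test ! -L '.*' && test ! -e '.*' \
            && test "$2" = '*' && test ! -L '*' && test ! -e '*'
    }
)

# check EXPECTED DIR: compare dirisempty's exit status with EXPECTED.
check() {
    dirisempty "$2"
    got=$?
    if test "$got" -eq "$1"; then
        echo "ok   $2 -> $got"
    else
        echo "FAIL $2 -> $got (expected $1)"
        failures=$((failures + 1))
    fi
}

failures=0
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/empty" "$tmp/file" "$tmp/dot" "$tmp/star" "$tmp/link"
: > "$tmp/file/a"                            # ordinary file
: > "$tmp/dot/.hidden"                       # dotfile only
: > "$tmp/star/*"                            # file literally named '*'
ln -s "$tmp/no-such-target" "$tmp/link/*"    # dangling symlink named '*'

check 0 "$tmp/empty"
check 1 "$tmp/file"
check 1 "$tmp/dot"
check 1 "$tmp/star"
check 1 "$tmp/link"
check 2 "$tmp/missing"                       # cd fails; its message goes to stderr
rm -rf "$tmp"
echo "failures: $failures"
```

The cases cover the traps raised in this thread: a dotfile-only directory, a file named '*', and a dangling symlink named '*'. It does not cover unreadable directories or the 'set -e' problem.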