Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

The First Pure Shell Contest (PUSH): relativepath

9 views
Skip to first unread message

Jens Schweikhardt

unread,
Aug 19, 2011, 1:23:19 PM8/19/11
to

The First Pure Shell Contest (PUSH)

== Motivation ==

The shell is not known for powerful string processing. Seasoned
programmers use various combinations of sed, awk, cut and others to
perform the more sophisticated transformations, or sacrifice portability
by using their shell's extensions, like arrays. But calls to external
utilities and use of shell extensions are often not needed. Knowing what
the POSIX shell can do may turn a slow non-portable script into a
blazingly fast and portable hacker's delight. The goal is to show the
absurdity of forking and nonportable constructs when a nice and small
pure shell solution solves the problem just as well.

== Task ==

Write a POSIX shell function named 'relativepath' which takes as
arguments two absolute canonicalized pathnames, and stores the relative
path from the first to the second in the variable "result" or "." when
the arguments are the same. Example: relativepath /foo/bar /foo/baz sets
result to "../baz". A canonicalized pathname is one that contains no "."
or ".." directories and no occurrences of "//"; only the root directory
may end with "/". If an argument is relative or noncanonical, the
function's exit status shall be 1 and the "result" variable shall
contain the string "<%s> not absolute or canonical" where %s is replaced
with the argument in question. For successful invocations the exit
status shall be 0.

More formally, relativepath must successfully pass the unit_test function
in evaluate.sh below.

== Entry format ==

Entries must consist of a single file implementing the relativepath function.
The first line should be a comment identifying the submitter.

The code should be indented with four spaces (no tabs) as shown in the
sample entry below to

1) keep it readable
2) make it easy to determine a "statement count"
3) avoid writing one long line to improve performance measurements

Entries will be stripped of comments and leading whitespace before
evaluation so that well commented entries suffer no performance penalty,
however slight, due to parsing more characters. The strip_comments
function is quite simplistic: comments must start a line or begin
with " # [A-Z]", i.e. space, hash, space, uppercase letter. It is
your responsibility to verify that this particular way of preprocessing
leaves your entry functional.

== Sample Entry ==

------------------------------------------------------------------------
# J. Random Hacker (or Anonymous if you prefer it.)
helperfunc() {
printf '42\n'
}
relativepath() {
# Note: not considered exhaustive!
# See <URL:http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html>
helperfunc
for foo in bar baz; do # Another comment.
something > output
done
while test ...; do
somethingelse < input
done
case word in
...)
foo
bar
;;
...)
baz
esac
if test ...; then
when true
elif [ ... ]; then
when true
else
when false
fi
command arg1 arg2 ...
builtin arg1 arg2 ...
pipe1 |
pipe2 |
pipe3
a &&
b &&
c ||
d
bg1 &
bg2 &
wait
empty= # One assignment per line.
result=...
name1=val1 \
name2=val2 \
oneshotcmd with 2 variable assignments
eval "
code as above indented an additional level, e.g.
case word in
pat1)
foo
bar
;;
pat2)
baz
esac
"
trap "
code as above indented an additional level, e.g.
case word in
pat1)
foo
bar
;;
pat2)
baz
esac
" SIG
}
------------------------------------------------------------------------

Slight variations are acceptable, but try to stick to a general "one
command per line" principle and do not attempt to outsmart the
simplistic statement counter described below.

== How to submit your entry ==

Submit entries via email to schweikh AT schweikhardt DOT net
(Followup-To: poster is set). The "Subject:" should contain the string
"relativepath entry 1" (without the quotes). Each contestant may submit
up to three entries; adjust the number in the subject accordingly. A
submission with the same number replaces an earlier submission. The
contest closes September 1st 2011, 00:00:01 UTC. By submitting an entry
the submitter assures that the submission is original work and agrees
that his/her work may be modified and published in usenet newsgroups and
other media, free of charge.

== Entry Evaluation ==

Entries compete in three categories:

1) Least statement count (tiebreaker: character count, then performance)
2) Least character count (tiebreaker: statement count, then performance)
3) Best performance (tiebreaker: statement count, then character count)

Entries are evaluated by running

$ env -i BC=$(which bc) PRINTF=printf SORT=$(which sort) RUNS=10 LOOPS=1000 ./evaluate.sh ./yourentry
Statements: 999 Characters: 9999 Performance: 999999 # J. Random Hacker

Lower numbers are better. The evaluate.sh script is attached below. The
current version of evaluate.sh can be downloaded from
<URL:http://www.schweikhardt.net/evaluate.sh>. You are encouraged to use
evaluate.sh to test your entry. env is the POSIX utility,
<URL:http://pubs.opengroup.org/onlinepubs/9699919799/utilities/env.html>.
The number of RUNS and LOOPS may differ but will be the same for all
entries. If printf is not built-in to your shell, you can provide an
absolute path name, e.g. PRINTF=/usr/bin/printf.

The statement count is computed by the statement_count function. In words:

2 for each conditional control structure (if, elif, while)
1 for each for or case
1 for each break or continue
1 for each case label
1 for each name=value or name= Put them on separate lines. See sample entry.
1 for each built-in
1 for each function call
eval: 1 + Statement count of argument
trap: 2 + Statement count of action argument
here-docs: 1 per line; lines include terminator

The character count is computed by the character_count function. In words:

* Whitespace and semicolons are free.
* Comments on a line by themselves are free.
* Comments on a code line with '#' preceded by whitespace are free.

The performance is the sum of all fields of the POSIX 'times' utility,
converted to milliseconds, after running a loop with a suitable number
of iterations. Several runs are performed and the minimum number is used
as the actual performance. Note that performance measurement is
impossible if your shell's times command uses a non-standard output
format. It must look like this

zsh/bash/ash $ times
0m0.20s 0m0.18s
10m2.81s 1m58.70s

Additional digits of precision are okay, but these formats are not:

ksh93 $ times
user 0m1.20s
sys 0m0.30s

pdksh $ times
Shell: 0.10s user 0.20s system
Kids: 0.20s user 0.50s system


== Other tidbits ==

1 Make no assumptions about the value of $0. Your script is run (sourced)
using an arbitrary file name.
2 Do not assume that $PWD is a writable directory; use the /tmp directory
if you need to write and read back files.
3 You can assume that 'test' and '[' are built-ins. If this is not true
for your shell you may set PATH with env -i PATH=... to the
directory with the 'test' and '[' executables for the sole purpose of
using these external commands. It must run without PATH on a shell where
these are built-ins (i.e. on the contest evaluation machine).
4 Your entry should work in the general case, not just for the
particular invocations tested by the test_suite function.

VERBOTEN! NOT ALLOWED! DON'T EVEN THINK OF IT!

5 Invoking external commands with $(command) or `command`. This should
fail anyway due to PATH being unset.
6 Invoking an external with an absolute name with /path/to/command.
7 Abusing any variable used by evaluate.sh
8 (Re)defining or interfering with unit_test, performance_test,
statement_count or character_count to improve evaluation results.
9 Setting PATH (except for 3 above).

Entries doing so will be disqualified.

== Announcement of Winners ==

Winners will be announced during September 2011 in comp.unix.shell.

For each of the three categories, the best three submissions are awarded
the gold, silver and bronze medal, respectively. An entry may win in
more than one category. The original (indented and commented) winning
entries are posted to comp.unix.shell, naming the submitter as stated in
the entry's "# J. Random Hacker" comment.

== Prizes ==

Fame and admiration on the 'Net, especially in comp.unix.shell. Maybe an
offer to write or coauthor a book on shell programming? You're in it for
the honor, not for profit.

== Contest Operator's Entries ==

The contest operator reserves the right to participate in the contest.
Because it would be an unfair advantage if the contest operator looked
at competing submissions and stole the best ideas, the contest operator
must finish his submissions before the contest is announced and make
public the MD5 sums of his submission(s) in the contest announcement.
Should any of the contest operator's entries win, the public is able to
verify that it was not due to modifications after contest announcement.

Contest operator's entry MD5 sums:
MD5 (entry1) = 02e80db5a7885a4f28dcd1f559057cbf
MD5 (entry2) = 4f84167130276a7f290493f52cf517dc
MD5 (entry3) = ac7a0d00c25358fd278eb51d4352809f

Happy hacking everybody!

Jens Schweikhardt -- Contest Operator

== evaluate.sh ==

#!/bin/sh
#!/usr/local/bin/bash
#!/usr/local/bin/zsh
# ksh93 and pdksh -- Bogus performance due to non-standard times(1) output.
#
# evaluate.sh - test and evaluate a contest entry
#
# $Id: evaluate.sh,v 1.7 2011/08/19 17:00:28 schweikh Exp schweikh $

unit_test () {
while read FROM TO RESULT; do
relativepath "$FROM" "$TO"
XS=$?
if test "$XS $result" != "$RESULT"; then
$PRINTF '%s\n' "OOPS! Expected '$RESULT' for '$FROM' to '$TO' but got '$XS $result'"
return 1
fi
done <<EOF
/ foo 1 <foo> not absolute or canonical
/ ./foo 1 <./foo> not absolute or canonical
/ ../foo 1 <../foo> not absolute or canonical
bar / 1 <bar> not absolute or canonical
./bar / 1 <./bar> not absolute or canonical
../bar / 1 <../bar> not absolute or canonical
// /foo 1 <//> not absolute or canonical
/foo//bar /foo 1 </foo//bar> not absolute or canonical
/. /foo 1 </.> not absolute or canonical
./ /foo 1 <./> not absolute or canonical
/foo/../bar /foo 1 </foo/../bar> not absolute or canonical
/foo/./bar /foo 1 </foo/./bar> not absolute or canonical
/foo/bar/ /foo 1 </foo/bar/> not absolute or canonical
/foo/bar/. /foo 1 </foo/bar/.> not absolute or canonical
/ / 0 .
/- /- 0 .
/-x /-x 0 .
/ /-x 0 -x
/-n /-n 0 .
/ /-n 0 -n
/-z /-z 0 .
/ /-z 0 -z
/-- /-- 0 .
/-- /--/-- 0 --
/--/-- /-- 0 ..
/? /? 0 .
/ /? 0 ?
/?? /?? 0 .
/ /?? 0 ??
/??? /??? 0 .
/ /??? 0 ???
/?* /?* 0 .
/ /?* 0 ?*
/* /* 0 .
/ /* 0 *
/\ . /\ . 0 .
/* /** 0 ../**
/* /*** 0 ../***
/*.* /*.** 0 ../*.**
/*.??? /*.?? 0 ../*.??
/[] /[] 0 .
/][ /][ 0 .
/[a-z]* /[0-9]* 0 ../[0-9]*
/' /" 0 ../"
/""'' /''"" 0 ../''""
/"'"' /'"'" 0 ../'"'"
/foo /foo 0 .
/foo / 0 ..
/ /foo 0 foo
/ /foo/bar 0 foo/bar
/ /foo/bar/baz 0 foo/bar/baz
/foo /foo/bar/baz 0 bar/baz
/foo/bar / 0 ../..
/foo/bar /foo/bar 0 .
/foo/bar /foo 0 ..
/foo/bar /foo/baz 0 ../baz
/foo/bar /bar/foo 0 ../../bar/foo
/foo/bar/baz /gnarf/blurfl/blubb 0 ../../../gnarf/blurfl/blubb
/foo/bar/baz /gnarf 0 ../../../gnarf
/foo/bar/baz /foo/baz 0 ../../baz
/foo. /bar. 0 ../bar.
/.foo /.bar 0 ../.bar
/foo.. /bar.. 0 ../bar..
/..foo /..bar 0 ../..bar
/..foo.. /..bar.. 0 ../..bar..
/baz/..foo.. /baz/..bar.. 0 ../..bar..
/.baz./..foo.. /.baz./..bar.. 0 ../..bar..
/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p /1/2/3/4/5/6/7/8/9/A/B/C/D/E/F 0 ../../../../../../../../../../../../../../../../1/2/3/4/5/6/7/8/9/A/B/C/D/E/F
/f$$-$PPID /t$PPID-$$ 0 ../t$PPID-$$
EOF
FUNKYNAME=$($PRINTF '/ -n -z C:\\\a\b\f\n\r\t\v\\ \b%% $ & ( ) = ? $(touch x) * ~ # _ ; : . , ! { } [ ] ^ | < >')
relativepath "/ A name with/spaces in it!" "$FUNKYNAME"
XS=$?
if test "$XS $result" != "0 ../..$FUNKYNAME"; then
$PRINTF '%s\n' "OOPS! Expected '0 ../..$FUNKYNAME' but got '$XS $result'"
return 1
fi
}

statement_count () {
FILE=$1
COUNT=0
while read LINE; do
case $LINE in
'' | 'relativepath'* | '"' | 'done' | 'else' | 'fi' | 'esac' | '}' | '#'*)
;;
'if '* | 'elif '* | 'while '*)
COUNT=$(($COUNT + 2))
;;
*)
COUNT=$(($COUNT + 1))
;;
esac
done < "$FILE"
$PRINTF 'Statements: %3u' $COUNT
}

character_count () {
FILE=$1
COUNT=0
while read -r LINE; do
case $LINE in
'#'*)
continue
;;
esac
set -- $LINE
for WORD in $*; do
case $WORD in
*';'*)
while test -n "$WORD"; do
if test "${WORD%${WORD#?}}" != ";"; then
COUNT=$(($COUNT + 1))
fi
WORD=${WORD#?}
done
;;
'#'*)
break
;;
*)
COUNT=$(($COUNT + ${#WORD}))
;;
esac
done
done < "$FILE"
$PRINTF ' Characters: %4u' $COUNT
}

performance_test () {
L=$LOOPS
while test $L -gt 0; do
unit_test
L=$(($L - 1))
done
times
}

strip_comments () {
FILE=$1
while read -r LINE; do
case $LINE in
'#'*)
continue
;;
*)
LINE=${LINE%% # [A-Z]*}
$PRINTF '%s\n' "${LINE%${LINE##*[^ ]}}"
;;
esac
done < "$FILE"
}

: ${RUNS:=10} ${LOOPS:=100}
: ${PRINTF:=printf} ${BC:=/usr/bin/bc} ${SORT:=/usr/bin/sort}

test -n "$ZSH_VERSION" && setopt shwordsplit
strip_comments "$1" > "$1.nc"
. "$1.nc"
unit_test || exit 1
statement_count "$1.nc"
character_count "$1.nc"
while test $RUNS -gt 0; do
performance_test | {
IFS="ms" read M1 S1 M2 S2 REST
IFS="ms" read M3 S3 M4 S4 REST
$PRINTF 'scale=0;60000*(%d+%d+%d+%d)+1000*(%.3f+%.3f+%.3f+%.3f)/1\n' \
$M1 $M2 $M3 $M4 $S1 $S2 $S3 $S4 |
$BC
}
RUNS=$(($RUNS - 1))
done |
$SORT -n | {
read MILLISECONDS
read SUBMITTER < "$1"
$PRINTF ' Performance: %6u %s\n' $MILLISECONDS "$SUBMITTER"
}
exit 0

# vi: tabstop=4 shiftwidth=4 expandtab fileformat=unix

Stephane CHAZELAS

unread,
Aug 19, 2011, 2:13:43 PM8/19/11
to
2011-08-19, 17:23(+00), Jens Schweikhardt:
[...]

> The goal is to show the
> absurdity of forking and nonportable constructs when a nice and small
> pure shell solution solves the problem just as well.

What do you mean? The shell is is a command interpreter, it's
whole raison d'etre is to run commands.

Awk and Perl are built in my POSIX shell and not "[". Which am I
allowed to use?

> Write a POSIX shell function named 'relativepath' which takes as
> arguments two absolute canonicalized pathnames, and stores the relative
> path from the first to the second in the variable "result" or "." when
> the arguments are the same. Example: relativepath /foo/bar /foo/baz sets
> result to "../baz".

You mean /foo/bar/../baz is meant to be /foo/baz? On my system,
it's /usr/local/baz as /foo/bar is a symlink to /usr/local/bin.


> A canonicalized pathname is one that contains no "."
> or ".." directories and no occurrences of "//"; only the root directory
> may end with "/".

You probably want to add as requirement that none of the
components of either path should be a symbolic link.

--
Stephane

Rainer Weikusat

unread,
Aug 19, 2011, 2:58:55 PM8/19/11
to
Jens Schweikhardt <schw...@schweikhardt.net> writes:

> The First Pure Shell Contest (PUSH)
>
> == Motivation ==
>
> The shell is not known for powerful string processing. Seasoned
> programmers use various combinations of sed, awk, cut and others to
> perform the more sophisticated transformations, or sacrifice portability
> by using their shell's extensions, like arrays. But calls to external
> utilities and use of shell extensions are often not needed. Knowing what
> the POSIX shell can do may turn a slow non-portable script into a
> blazingly fast and portable hacker's delight. The goal is to show the
> absurdity of forking and nonportable constructs when a nice and small
> pure shell solution solves the problem just as well.

I spent some thoughts on this because I was interested in at least
having a general idea how a solution could look like. OTOH, the reason
why I work in miserably paid programming jobs is essentially because I
enjoy solving software problems a lot more than dicksizing contests.

Jens Schweikhardt

unread,
Aug 19, 2011, 3:12:58 PM8/19/11
to
In comp.unix.programmer Stephane CHAZELAS <stephane...@yahoo.fr> wrote:
# 2011-08-19, 17:23(+00), Jens Schweikhardt:
# [...]
#> The goal is to show the
#> absurdity of forking and nonportable constructs when a nice and small
#> pure shell solution solves the problem just as well.
#
# What do you mean? The shell is is a command interpreter, it's
# whole raison d'etre is to run commands.

But commands can be built-in or external. External commands are forked
and potentially much slower than an equivalent built-in. Was the
"Motivation" paragraph that hard to understand?

# Awk and Perl are built in my POSIX shell and not "[". Which am I
# allowed to use?

Then our understanding of the term "built-in" is very different and
I'm afraid it is yours that is the non-conventional one.

#> Write a POSIX shell function named 'relativepath' which takes as
#> arguments two absolute canonicalized pathnames, and stores the relative
#> path from the first to the second in the variable "result" or "." when
#> the arguments are the same. Example: relativepath /foo/bar /foo/baz sets
#> result to "../baz".
#
# You mean /foo/bar/../baz is meant to be /foo/baz? On my system,
# it's /usr/local/baz as /foo/bar is a symlink to /usr/local/bin.

No. I mean what my definition of "canonical" said.

#> A canonicalized pathname is one that contains no "."
#> or ".." directories and no occurrences of "//"; only the root directory
#> may end with "/".
#
# You probably want to add as requirement that none of the
# components of either path should be a symbolic link.

The relative path is purely textual. The names don't even have to exist
in the file system. Maybe I should have added that to the task
description.

Regards,

Jens
--
Jens Schweikhardt http://www.schweikhardt.net/
SIGSIG -- signature too long (core dumped)

pk

unread,
Aug 19, 2011, 3:14:32 PM8/19/11
to
On 19 Aug 2011 19:12:58 GMT, Jens Schweikhardt <use...@schweikhardt.net>
wrote:

> # Awk and Perl are built in my POSIX shell and not "[". Which am I
> # allowed to use?
>
> Then our understanding of the term "built-in" is very different and
> I'm afraid it is yours that is the non-conventional one.

POSIX does not forbid a shell to have whatever you may think of built-in.

Jens Schweikhardt

unread,
Aug 19, 2011, 3:23:06 PM8/19/11
to
In comp.unix.programmer Rainer Weikusat <rwei...@mssgmbh.com> wrote:
# Jens Schweikhardt <schw...@schweikhardt.net> writes:
#
#> The First Pure Shell Contest (PUSH)
#>
#> == Motivation ==
#>
#> The shell is not known for powerful string processing. Seasoned
#> programmers use various combinations of sed, awk, cut and others to
#> perform the more sophisticated transformations, or sacrifice portability
#> by using their shell's extensions, like arrays. But calls to external
#> utilities and use of shell extensions are often not needed. Knowing what
#> the POSIX shell can do may turn a slow non-portable script into a
#> blazingly fast and portable hacker's delight. The goal is to show the
#> absurdity of forking and nonportable constructs when a nice and small
#> pure shell solution solves the problem just as well.
#
# I spent some thoughts on this because I was interested in at least
# having a general idea how a solution could look like. OTOH, the reason
# why I work in miserably paid programming jobs is essentially because I
# enjoy solving software problems a lot more than dicksizing contests.

Well, this is a badly paid contest :-) No prize money. People are in it,
I hope, just for the fun of it. I certainly am.

In days of yore I've been participating in the IOCCC and it was fun to
be creative, cram every last bit of functionality in a limited number of
bytes.

Of course everybody's free to just sit and wait what the result are
and then comment like Waldorf and Statler.

Jens Schweikhardt

unread,
Aug 19, 2011, 3:29:31 PM8/19/11
to
In comp.unix.shell pk <p...@pk.invalid> wrote:
# On 19 Aug 2011 19:12:58 GMT, Jens Schweikhardt <use...@schweikhardt.net>
# wrote:
#
#> # Awk and Perl are built in my POSIX shell and not "[". Which am I
#> # allowed to use?
#>
#> Then our understanding of the term "built-in" is very different and
#> I'm afraid it is yours that is the non-conventional one.
#
# POSIX does not forbid a shell to have whatever you may think of built-in.

And because I am aware of that I made explicit references to test and [
and printf. I'll never understand why people try so hard to
misunderstand on purpose... :-)

pk

unread,
Aug 19, 2011, 3:37:36 PM8/19/11
to
On 19 Aug 2011 19:29:31 GMT, Jens Schweikhardt <use...@schweikhardt.net>
wrote:

> In comp.unix.shell pk <p...@pk.invalid> wrote:


> # On 19 Aug 2011 19:12:58 GMT, Jens Schweikhardt <use...@schweikhardt.net>
> # wrote:
> #
> #> # Awk and Perl are built in my POSIX shell and not "[". Which am I
> #> # allowed to use?
> #>
> #> Then our understanding of the term "built-in" is very different and
> #> I'm afraid it is yours that is the non-conventional one.
> #
> # POSIX does not forbid a shell to have whatever you may think of
> built-in.
>
> And because I am aware of that I made explicit references to test and [
> and printf. I'll never understand why people try so hard to
> misunderstand on purpose... :-)

Still, I don't see where your rules forbid using Perl or Awk if one's POSIX
shell has them built-in (which was the OP's question).

Jens Schweikhardt

unread,
Aug 19, 2011, 4:04:48 PM8/19/11
to
In comp.unix.programmer pk <p...@pk.invalid> wrote:
# On 19 Aug 2011 19:29:31 GMT, Jens Schweikhardt <use...@schweikhardt.net>
# wrote:
#
#> In comp.unix.shell pk <p...@pk.invalid> wrote:
#> # On 19 Aug 2011 19:12:58 GMT, Jens Schweikhardt <use...@schweikhardt.net>

#> # wrote:
#> #
#> #> # Awk and Perl are built in my POSIX shell and not "[". Which am I
#> #> # allowed to use?

#> #>
#> #> Then our understanding of the term "built-in" is very different and
#> #> I'm afraid it is yours that is the non-conventional one.

#> #
#> # POSIX does not forbid a shell to have whatever you may think of
#> built-in.
#>
#> And because I am aware of that I made explicit references to test and [
#> and printf. I'll never understand why people try so hard to
#> misunderstand on purpose... :-)
#
# Still, I don't see where your rules forbid using Perl or Awk if one's POSIX
# shell has them built-in (which was the OP's question).

It forbidden by the well known standardese rule of "if it's not defined,
then it's undefined." aka "undefined by omission".

3 You can assume that 'test' and '[' are built-ins. If this is not true
for your shell you may set PATH with env -i PATH=... to the
directory with the 'test' and '[' executables for the sole purpose of
using these external commands. It must run without PATH on a shell where
these are built-ins (i.e. on the contest evaluation machine).

perl isn't mentioned. So you can not assume it is a built-in.

I believe that the name of the contest gives an oh so slight hint that a
solution involving perl is not quite what is meant. The motivation
paragraph too (even mentions awk). If that isn't enough, the contestant
would surely wonder if perl was also a built-in on the unknown contest
evaluation machine.

Reading a text, understanding it and inferring a writer's intention are
the initial barriers to participating in the contest. I think this goes
for most contests. I admit there is room for improvement regarding its
formal "rules" (heck, the very word does not even appear once!), but I
did not want to overload the text with kilobytes of blah blah just to
fend off the nit-pickers, who would find a hole in any formal set of
rules. One thing I really should add for the next PUSH is "Rule
interpretation is MINE ALONE. PERIOD!" :-)

pk

unread,
Aug 19, 2011, 4:25:07 PM8/19/11
to
On 19 Aug 2011 20:04:48 GMT, Jens Schweikhardt <use...@schweikhardt.net>
wrote:

> Still, I don't see where your rules forbid using Perl or Awk if one's

> POSIX shell has them built-in (which was the OP's question).


>
> It forbidden by the well known standardese rule of "if it's not defined,
> then it's undefined." aka "undefined by omission".

Is it well known?



> 3 You can assume that 'test' and '[' are built-ins. If this is not
> true for your shell you may set PATH with env -i PATH=... to the
> directory with the 'test' and '[' executables for the sole
> purpose of using these external commands. It must run without PATH on a
> shell where these are built-ins (i.e. on the contest evaluation machine).
>
> perl isn't mentioned. So you can not assume it is a built-in.

Uh...this is basic logic. If I say "you can use the bicycle", does it follow
that you are not allowed to use the car?

If you said "you can assume that only 'test' and '[' are built-in", then
things would have been different.

> I believe that the name of the contest gives an oh so slight hint that a
> solution involving perl is not quite what is meant. The motivation
> paragraph too (even mentions awk). If that isn't enough, the contestant
> would surely wonder if perl was also a built-in on the unknown contest
> evaluation machine.

Admittedly the Perl and awk example was a stretch, but surely there are a
number of commands that are on the border, and a contestant may assume in
good faith that he's legitimate to use some of them, only to have his
submission fail on your machine. So I think you should explicitly list what
built-ins are available in the evaluation machine's shell, so people know
what they can rely on.

> Reading a text, understanding it and inferring a writer's intention are
> the initial barriers to participating in the contest. I think this goes
> for most contests.

The same goes for the OP's first comment (and if you've been reading
these groups for a while, I'm sure you know that he knows what he's talking
about). The intention, to me, was to signal that some parts of the rules
were not clear, and instead you interpreted it as a nitpick. See? It works
both ways.

> I admit there is room for improvement regarding its formal "rules" (heck,
> the very word does not even appear once!), but I did not want to overload
> the text with kilobytes of blah blah just to fend off the nit-pickers,
> who would find a hole in any formal set of rules. One thing I really
> should add for the next PUSH is "Rule interpretation is MINE ALONE.
> PERIOD!" :-)

Even then, you should make that interpretation explicit (that is, write it),
and not assume that people can somehow magically infer it.

Jens Schweikhardt

unread,
Aug 19, 2011, 5:08:25 PM8/19/11
to
In comp.unix.shell pk <p...@pk.invalid> wrote:
# On 19 Aug 2011 20:04:48 GMT, Jens Schweikhardt <use...@schweikhardt.net>
# wrote:
#
#> Still, I don't see where your rules forbid using Perl or Awk if one's
#> POSIX shell has them built-in (which was the OP's question).
#>
#> It forbidden by the well known standardese rule of "if it's not defined,
#> then it's undefined." aka "undefined by omission".
#
# Is it well known?
#
#> 3 You can assume that 'test' and '[' are built-ins. If this is not
#> true for your shell you may set PATH with env -i PATH=... to the
#> directory with the 'test' and '[' executables for the sole
#> purpose of using these external commands. It must run without PATH on a
#> shell where these are built-ins (i.e. on the contest evaluation machine).
#>
#> perl isn't mentioned. So you can not assume it is a built-in.
#
# Uh...this is basic logic. If I say "you can use the bicycle", does it follow
# that you are not allowed to use the car?

If that sentence appeared in an ISO document, I'd say yes. If I told you
"You can use my bicycle" and you took my car, I'd be surprised.
Standardese logic is somewhat different. When you tell your sweatheart
"I love you" that usually *does* exlude loving the other heartbreakers
in town, without you appending "and only you". Even everyday logic has
this property and it's not too unusual to apply this kind of reasoning.

# If you said "you can assume that only 'test' and '[' are built-in", then
# things would have been different.

Even better would have been "As the only additional builtins to those
explicitly enumerated in the POSIX shell you can assume test, [ and
printf." I'll keep that in mind for the next PUSH.

#> I believe that the name of the contest gives an oh so slight hint that a
#> solution involving perl is not quite what is meant. The motivation
#> paragraph too (even mentions awk). If that isn't enough, the contestant
#> would surely wonder if perl was also a built-in on the unknown contest
#> evaluation machine.
#
# Admittedly the Perl and awk example was a stretch, but surely there are a
# number of commands that are on the border, and a contestant may assume in
# good faith that he's legitimate to use some of them, only to have his
# submission fail on your machine. So I think you should explicitly list what
# built-ins are available in the evaluation machine's shell, so people know
# what they can rely on.

I thought this was clear from providing the link in the sample entry.
But the "Not exhaustive!" note could be read as opening a wide hole.
MISREAD in my mind.

#> Reading a text, understanding it and inferring a writer's intention are
#> the initial barriers to participating in the contest. I think this goes
#> for most contests.
#
# The same goes for the OP's first comment (and if you've been reading
# these groups for a while, I'm sure you know that he knows what he's talking
# about). The intention, to me, was to signal that some parts of the rules
# were not clear, and instead you interpreted it as a nitpick. See? It works
# both ways.

Yes, I know Stéphane is a shell wizard but I am still puzzled why he
replied the way he did. I assume it is because he wants to point out
where the test leaves too much room for misinterpretation, just like he
would point out conditions under which a shell script would not do what
it is supposed to do--which is a good thing, which I appreciate.
I'm no stubborn bonehead and neither are you or Stéphane :-)

#> I admit there is room for improvement regarding its formal "rules" (heck,
#> the very word does not even appear once!), but I did not want to overload
#> the text with kilobytes of blah blah just to fend off the nit-pickers,
#> who would find a hole in any formal set of rules. One thing I really
#> should add for the next PUSH is "Rule interpretation is MINE ALONE.
#> PERIOD!" :-)
#
# Even then, you should make that interpretation explicit (that is, write it),
# and not assume that people can somehow magically infer it.

Fair enough. Now that all is clear as mud, back to hacking relativepath :-)

Rainer Weikusat

unread,
Aug 19, 2011, 5:18:15 PM8/19/11
to
Jens Schweikhardt <use...@schweikhardt.net> writes:
> In comp.unix.programmer Rainer Weikusat <rwei...@mssgmbh.com> wrote:
> # Jens Schweikhardt <schw...@schweikhardt.net> writes:
> #
> #> The First Pure Shell Contest (PUSH)
> #>
> #> == Motivation ==

[...]

> # I spent some thoughts on this because I was interested in at least
> # having a general idea how a solution could look like. OTOH, the reason
> # why I work in miserably paid programming jobs is essentially because I
> # enjoy solving software problems a lot more than dicksizing contests.
>
> Well, this is a badly paid contest :-) No prize money. People are in it,
> I hope, just for the fun of it. I certainly am.

I've just spent about two hours of my time (not counting the several
days I need to even get to this point) with determining how to get an
xauth username out of the internal racoon data structures, provided an
xauth username was supposed to be provided and was actually
provided. The result of that looks like this (not compiled so far):

static char const *locate_xauth_user(struct ph1handle const *iph1)
{
#ifndef ENABLE_HYBRID
return NULL;
#else
char const *user, *xauth_user;
vchar_t const *login;
unsigned len;

xauth_user = NULL;

if (iph1->side == INITIATOR) {
if (iph1->rmconf && iph1->rmconf->xauth) {
login = xauth->login;

if (login) {
xauth_user = login->v;
len = login->l;
}
}
} else {
if (iph1->mode_cfg) {
xauth_user = iph1->mode_cfg->xauth.authdata.generic.usr;
if (xauth_user) len = strlen(xauth_user);
}
}

if (!(xauth_user && len)) return NULL;

/*
The reason for this is that the rmconf xauth user
is not 0-terminated for some idiotic reason.
*/
user = malloc(len + 1);
if (!user) return NULL;

memcpy(user, xauth_user, len);
user[len] = 0;

return user;
#endif
}

This was sort-of fun (to the degree that work can be fun) because it
was an intellectual challenge. But 'trying to run as fast as you can,
in the hope that none of the persons living on this planet who can run
faster is in the same contest' isn't. Somebody (probably, a lot of
somebodies) will be able to write a faster shell function than me, and
very probably not the least because I usually resent code-uglification
except if necessary.

NB: This is just 'my 0.02'.

Stephane CHAZELAS

unread,
Aug 20, 2011, 4:07:56 PM8/20/11
to
2011-08-19, 17:23(+00), Jens Schweikhardt:
[...]

> Write a POSIX shell function named 'relativepath' which takes as
> arguments two absolute canonicalized pathnames, and stores the relative
> path from the first to the second in the variable "result" or "." when
> the arguments are the same. Example: relativepath /foo/bar /foo/baz sets
> result to "../baz".
[...]

In the not-answering-the-question category, what about:

printf '%s\n' "$1//$2/"|sed -e:1 -e'$!{N;b1' -e} -e'
s|/////*|///|;s|\(.*/\)\(.*/\)\1|\2|;:2
s|^[^/]*/\(\(.*/\)*/\)|\1../|;t2
s|^/||;s|/$||;s/^$/./'

--
Stephane

Jens Schweikhardt

unread,
Aug 21, 2011, 6:43:17 AM8/21/11
to
In comp.programming.contests Stephane CHAZELAS <Stephane...@free.fr> wrote:
# 2011-08-19, 17:23(+00), Jens Schweikhardt:
# [...]

#> Write a POSIX shell function named 'relativepath' which takes as
#> arguments two absolute canonicalized pathnames, and stores the relative
#> path from the first to the second in the variable "result" or "." when
#> the arguments are the same. Example: relativepath /foo/bar /foo/baz sets
#> result to "../baz".
# [...]
#
# In the not-answering-the-question category, what about:
#
# printf '%s\n' "$1//$2/"|sed -e:1 -e'$!{N;b1' -e} -e'
# s|/////*|///|;s|\(.*/\)\(.*/\)\1|\2|;:2
# s|^[^/]*/\(\(.*/\)*/\)|\1../|;t2
# s|^/||;s|/$||;s/^$/./'

Cool! Since semicolons are free, what about using them as pattern delimiters:

printf '%s\n' "$1//$2/"|sed -e:1 -e'$!{N;b1' -e} -e'
s;/////*;///;;s;\(.*/\)\(.*/\)\1;\2;;:2
s;^[^/]*/\(\(.*/\)*/\);\1../;;t2
s;^/;;;s;/$;;;s/^$/./'

Note to contestants: Not a valid entry due to sed, which cannot assumed
to be a builtin of the contest evaluation machine's shell :-)

Rainer Weikusat

unread,
Aug 21, 2011, 10:14:56 AM8/21/11
to

It's a nice demonstration that some people really excel in the art of
creating complicated solutions to simple problems.

Janis Papanagnou

unread,
Aug 21, 2011, 11:28:00 AM8/21/11
to
Am 21.08.2011 17:14, schrieb Rainer Weikusat:
> Stephane CHAZELAS<Stephane...@free.fr> writes:
>> 2011-08-19, 17:23(+00), Jens Schweikhardt:
>> [...]
>>> [...]
>>
>> In the not-answering-the-question category, what about:
>>
>> printf '%s\n' "$1//$2/"|sed -e:1 -e'$!{N;b1' -e} -e'
>> s|/////*|///|;s|\(.*/\)\(.*/\)\1|\2|;:2
>> s|^[^/]*/\(\(.*/\)*/\)|\1../|;t2
>> s|^/||;s|/$||;s/^$/./'
>
> It's a nice demonstration that some people really excel in the art of
> creating complicated solutions to simple problems.
>

Oh, I took it as a poem, a nice programming poem, a dadaistic one.
Well done, Stephane!

Janis

Stephane CHAZELAS

unread,
Aug 21, 2011, 12:19:43 PM8/21/11
to
2011-08-21, 15:14(+01), Rainer Weikusat:

It might look complicated as sed is not very verbose, but it's
quite simple. The core is in one substitution and one loop with
only one substitution per iteration.

sed is the only POSIX tool I know that has a built-in way to
find the longest leading common string in two strings (well,
with expr(1)(*) and ex/vi) which makes it probably the best tool
for the task which is what "pure shell programming" should be
IMO.

(*)
$ expr 'foobarbaz foobarfoo' : '\(.*\).* \1'
foobar

--
Stephane

Stephane CHAZELAS

unread,
Aug 21, 2011, 4:12:29 PM8/21/11
to
2011-08-21, 16:19(+00), Stephane CHAZELAS:

> 2011-08-21, 15:14(+01), Rainer Weikusat:
>> Stephane CHAZELAS <Stephane...@free.fr> writes:
>>> 2011-08-19, 17:23(+00), Jens Schweikhardt:
>>> [...]
>>>> Write a POSIX shell function named 'relativepath' which takes as
>>>> arguments two absolute canonicalized pathnames, and stores the relative
>>>> path from the first to the second in the variable "result" or "." when
>>>> the arguments are the same. Example: relativepath /foo/bar /foo/baz sets
>>>> result to "../baz".
>>> [...]
>>>
>>> In the not-answering-the-question category, what about:
>>>
>>> printf '%s\n' "$1//$2/"|sed -e:1 -e'$!{N;b1' -e} -e'
>>> s|/////*|///|;s|\(.*/\)\(.*/\)\1|\2|;:2
>>> s|^[^/]*/\(\(.*/\)*/\)|\1../|;t2
>>> s|^/||;s|/$||;s/^$/./'
>>
>> It's a nice demonstration that some people really excel in the art of
>> creating complicated solutions to simple problems.
>
> It might look complicated as sed is not very verbose, but it's
> quite simple.
[...]

Actually, it can be simplified a bit:

$ cat rp
sed ':1
$!{N;b1
}
s|////*|///|;s|\(.*\)\(/.*/\)\1/|\2|;:2
s|/[^/]*\(.*//\)|\1../|;t2
s|/*||;s|/$||;s|^$|.|'<<.
$1//$2/
$ wc rp
7 8 119 rp
$ ./rp /foo/bar /foo/baz
../baz

--
Stephane

Rainer Weikusat

unread,
Aug 22, 2011, 4:15:03 PM8/22/11
to
Jens Schweikhardt <schw...@schweikhardt.net> writes:
> The First Pure Shell Contest (PUSH)
>
> == Motivation ==
>
> The shell is not known for powerful string processing.

For sake of completeness.

NB: I do not claim that this would be good for anything except that
it survives the unit_test() function. Also, it is not supposed to be
'competitive'.

-----------------------------
# Name

split()
{
eval "case \"\$$1\" in
?*/*)
$3=\${$1#/}
$2=\${$3%%/*}
$3=\${$3#*/}

return 0
;;

/*)
$3=\${$1#/}
;;

*)
$3=\"\$$1\"
esac

$2=''"
}

error()
{
echo "<$1> not absolute or canonical"
}

valid()
{
rc=1
test "${1#/}" != "$1" && {
rc=''
case "$1" in
*//*)
rc=1
;;

*?/)
rc=1
;;

*/./*)
rc=1
;;

*/../*)
rc=1
;;

*/..)
rc=1
;;

*/.)
rc=1
;;
esac
}

test -z "$rc" && return 0

result=`error "$1"`
return 1
}

one_up()
{
result="../$result"
}

one_down()
{
eval "result=\"\$result\$$1/\""
}

walk()
{
w="$1"
split w fw rw
while test -n "$fw";
do
$2 fw

w="$rw"
split w fw rw
done

$2 rw
return 0
}

relativepath()
{
d0="$1"
valid "$d0" || return 1

d1="$2"
valid "$d1" || return 1

split d0 fa ra
split d1 fb rb

while test -n "$fa" -a "$fa" = "$fb";
do
a="$ra"
split a fa ra

b="$rb"
split b fb rb
done

result=''
if test -n "$fa";
then
a="$ra"

if test -n "$fb" -o "$fa" != "$rb"
then
a="/$fa/$a"

test -n "$rb" && {
b="$rb"
test -n "$fb" && b="/$fb/$b"
walk "$b" one_down
}
fi

walk "$a" one_up
elif test -n "$fb";
then
b="$rb"
if test "$ra" != "$fb";
then
b="/$fb/$b"
test -n "$ra" && one_up
fi

walk "$b" one_down
else
test "$ra" = "$rb" && {
result=.
return 0
}

result="$rb"
test "$d0" != / && result="../$result"
fi

result="${result%/}"
return 0
}

Stephane CHAZELAS

unread,
Aug 23, 2011, 2:23:50 PM8/23/11
to
2011-08-22, 21:15(+01), Rainer Weikusat:

> Jens Schweikhardt <schw...@schweikhardt.net> writes:
>> The First Pure Shell Contest (PUSH)
>>
>> == Motivation ==
>>
>> The shell is not known for powerful string processing.
>
[...]

> echo "<$1> not absolute or canonical"
[...]

> while test -n "$fa" -a "$fa" = "$fb";
[...]

> if test -n "$fb" -o "$fa" != "$rb"
[...]

We may want to add those test cases:

/=/x / 0 ../..
/ \\\\\\\\ 1 <\\\\\\\\> not absolute or canonical

What about:

relativepath(){ for i do case $i/ in /|[!/]*|*/./*|*/../*|*?//*|//?*)
result="<$i> not absolute or canonical";return 1;esac;done
p=${1%/} r= b=${2%/}/;while :;do case $b in "$p"/*)r=$r${b#"$p"/}
r=${r%/};result=${r:-.};break;esac;p=${p%/*} r=../$r;done;}

--
Stephane

Rainer Weikusat

unread,
Aug 23, 2011, 2:52:25 PM8/23/11
to
Stephane CHAZELAS <stephane...@yahoo.fr> writes:
> 2011-08-22, 21:15(+01), Rainer Weikusat:
>> Jens Schweikhardt <schw...@schweikhardt.net> writes:
>>> The First Pure Shell Contest (PUSH)
>>>
>>> == Motivation ==
>>>
>>> The shell is not known for powerful string processing.
>>
> [...]
>> echo "<$1> not absolute or canonical"
> [...]
>> while test -n "$fa" -a "$fa" = "$fb";
> [...]
>> if test -n "$fb" -o "$fa" != "$rb"
> [...]
>
> We may want to add those test cases:
>
> /=/x / 0 ../..
> / \\\\\\\\ 1 <\\\\\\\\> not absolute or canonical

No. You 'might' (actually, you should) want to point out the problem
with the two statements above, namely,

In addition, the string comparison binary primaries '=' and
"!=" shall have a higher precedence than any unary primary.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html

and maybe even write that the obvious fix is
to rewrite this as

test -n "$fa" && test "$fa" = "$fb"

(and similar for the other case).

That you, you'd still had had the chance to boast with your superior
knowledge about pecularities of the UNIX(*) standard but in a way
which had a chance to be useful to readers.



> What about:
>
> relativepath(){ for i do case $i/ in /|[!/]*|*/./*|*/../*|*?//*|//?*)
> result="<$i> not absolute or canonical";return 1;esac;done
> p=${1%/} r= b=${2%/}/;while :;do case $b in "$p"/*)r=$r${b#"$p"/}
> r=${r%/};result=${r:-.};break;esac;p=${p%/*} r=../$r;done;}

What about

dd if=/dev/urandom bs=1 count=1024 | base64 | fold -b -w 26

?

Granted, it doesn't solve the problem but its as unintelligible as
the gibberish above.

Stephane CHAZELAS

unread,
Aug 23, 2011, 5:56:52 PM8/23/11
to
2011-08-23, 19:52(+01), Rainer Weikusat:
[...]

>>>> The shell is not known for powerful string processing.
>>>
>> [...]
>>> echo "<$1> not absolute or canonical"
>> [...]
>>> while test -n "$fa" -a "$fa" = "$fb";
>> [...]
>>> if test -n "$fb" -o "$fa" != "$rb"
>> [...]
>>
>> We may want to add those test cases:
>>
>> /=/x / 0 ../..
>> / \\\\\\\\ 1 <\\\\\\\\> not absolute or canonical
>
> No. You 'might' (actually, you should) want to point out the problem
> with the two statements above, namely,
>
> In addition, the string comparison binary primaries '=' and
> "!=" shall have a higher precedence than any unary primary.
>
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
>
> and maybe even write that the obvious fix is
> to rewrite this as
>
> test -n "$fa" && test "$fa" = "$fb"
>
> (and similar for the other case).
[...]

Yes, you're right. But it's repeated so often on cus that
- you can't use echo for arbitrary data portably or reliably
-> use printf
- the -a and -o binary test operators shouldn't be used, or more
generally, past 4 arguments to "test"/"[", the behavior is
unreliable (-> use &&/|| as you pointed out)

that it goes almost without saying.

>> What about:
>>
>> relativepath(){ for i do case $i/ in /|[!/]*|*/./*|*/../*|*?//*|//?*)
>> result="<$i> not absolute or canonical";return 1;esac;done
>> p=${1%/} r= b=${2%/}/;while :;do case $b in "$p"/*)r=$r${b#"$p"/}
>> r=${r%/};result=${r:-.};break;esac;p=${p%/*} r=../$r;done;}
>
> What about
>
> dd if=/dev/urandom bs=1 count=1024 | base64 | fold -b -w 26
>
> ?
>
> Granted, it doesn't solve the problem but its as unintelligible as
> the gibberish above.

You're just jealous because yours is bigger ;-)

--
Stephane

0 new messages