Critique of Rexx, and a procedure library
I am an accountant and occasional programmer, and will soon
be moving to OS/2 from a Unix and DOS background. I have
decided to learn Rexx because of the universal macro
capability it provides to OS/2 applications. My vehicle for
learning Rexx has been a port of Jon Bentley's procedure
library, which is appended to this message. I am writing
this message to clarify in my own mind my experiences with
Rexx and help determine how Rexx will fit my accounting,
spreadsheet and database work on OS/2.
There are numerous rules of syntax and semantics in Rexx
which are a nuisance because they so frequently require
extra code to work around. The most obvious problem is the
prohibition of expressions in the tail of a compound
variable: "x.i+1" parses as "(x.i)+1" and "x.(i+1)" is
illegal, and must be written as "iplus1=i+1; x.iplus1"
instead. There are no records or structs, so a programmer
must resort to the old fortran trick of parallel arrays.
All elements in a conditional expression are evaluated,
preventing the "short-circuit" evaluation of conditions as
in C; the C condition "if i < maxi && x.i < x/i+1 ..." must
be rewritten as two separate if statements in Rexx:
if i < maxi then do
iplus1 = i + 1
if x.i < x.iplus1 then do
...
end
end
where the previous "i=iplus1" problem makes matters even
worse. And there are doubtless other examples of nuisance
syntax and semantics. These problems don't affect the final
capability of code, but they reduce programmer efficiency by
making code larger, harder to read, and harder to debug
("hmm -- did iplus1 get updated every time i was updated"?).
The lack of "pass by reference" for compound variables is a
serious bar to writing general-purpose procedures. In the
attached procedure library, in most other languages it
would be possible to make the name of the compound variable
to be sorted a parameter to the sort procedure. But because
Rexx passes all variables by value, there is no way to pass
or return a sorted compound variable. I solved the problem
in my procedure library by always operating on the global
variable x., but this is hardly a general solution.
In general, Rexx requires too many variables to be global.
There are no static variables, so any variable which is
local to a procedure but must retain its value between calls
of the procedure must be global. There is no fortran-like
common, so any variable which must be shared between
non-hierarchically-related procedures must be global. The
problem of "pass by reference" discussed previously may
result in variables being made global. Any variable which
is to be initialized must be global, because otherwise its
scope is only the procedure in which it is initialized. Of
course, the sins of global variables are well known to most
programmers, causing increased program complexity and
lengthier debugging.
One problem which sounds simple but has the capacity to do
great harm is the lack of any syntactic limits on a
procedure. Execution continues in a Rexx procedure until a
RETURN is seen; given SIGNALs, that can be anywhere in a
program. Although that's not normally a problem, since most
programs raise few SIGNALs, consider this scenario: during
program development or maintenance, a procedure which is a
large "if ... else if ... else" chain, with each alternative
in the chain processing its own return, is modified so that
none of the alternatives is selected, so that control flow
drops through to the next procedure, where it is unlikely to
do anything useful. In general, considering the problem
from the standpoint of program verification, the lack of
syntactic limits on procedures means that an entire program
must be inspected at once, instead of breaking it into
smaller pieces.
Although a program can create an arbitrary number of tails
for a single compound variable, Rexx makes no provision to
iterate over all of the tails, as in the "for (i in x)" loop
of awk. This problem is apparent in the checksubs procedure
of the attached library, which attempts to test that the
final tails of a compound variable are a permutation of the
original tails, as asserted by the problem definition. The
obvious solution is to keep a parallel array of tails
("xtails.1 to xtails.n, with xtails.count = n"), though this
may be hard to do generally. I ignored the problem in my
procedure library.
Some implementations of Rexx offer solutions to some of
these problems. The Rexx I have been using while waiting to
switch to OS/2 is IMC/rexx for unix (available via anonymous
ftp from rexx.uwaterloo.ca). IMC/rexx allows parenthesized
expressions in the tails of compound variables, and a
PROCEDURE HIDE declaration eases the problem of global
variables, without eliminating it. Of course, such
non-portable solutions leave a lot to be desired.
My overall conclusion is that Rexx is ill-suited for
"programming in the large." During the writing of my
procedure library, I stumbled over most of the problems
mentioned above. With practice, I will probably do better,
but some of the problems -- no "short-circuit" evaluation,
no "pass by reference" for compound variables, too many
globals, no syntactic limits on procedures -- are
fundamental to the language and have profound effects on
debugging and program verification.
Of course, Rexx was initially intended as a shell scripting
and general-purpose macro language, and for these purposes
it works well enough. I'll use Rexx with my OS/2
applications work to move data between applications, perhaps
laundering it along the way, or to write scripts that invoke
multiple applications. But I'll use other languages for
applications development, as they are suitable to the task
at hand.
I appreciate your thoughtful comments. In particular, I
solicit comments from programmers who have experience with
Rexx for writing large, complex programs.
Phil
pbe...@netcom.com
NOTE: My port of Jon Bentley's procedure library is shown
below:
/* bentley.rexx -- jon bentley's subroutine library, ported from awk
* searches: sequential search, binary search
* sorts: insertion sort, quick sort, heap sort
* heaps: siftup, siftdown
* priority queues: initialize, insert, extract max
* order statistics: select
* all set algorithms operate on the array x.1 to x.n; there is
* no calling interface
*
* Originally published in Communications of the ACM, July 1985. Updated
* and republished by Addison-Wesley in "More Programming Pearls" in 1988.
*/
/* MAIN PROGRAM */
globals = 'bign smalln errcnt n x. searchname sortname pqmax'
bign = 12; smalln = 5; errcnt = 0
say "testing assert -- should fail"
call assert 1 = 0
say "testing select"
do n = 0 to bign
say " n=" n
call clearsubs
do i = 1 to n
call geninorder
call select i
call checkselect i
end
do i = 1 to n
call scramble
call select i
call checkselect i
end
call genequal
do i = 1 to n
call select i
call checkselect i
end
call checksubs
end
say "testing quick sort"
call testsort "qsort"
say "testing insertion sort"
call testsort "isort"
say "testing heap sort"
call testsort "hsort"
say "testing priority queues"
do m = 0 to bign
say " m=" m
call clearsubs
call pqinit m
do i = 1 to m
call pqinsert i
end
do i = m to 1 by -1
call assert pqextractmax() = i
end
call assert n = 0
call pqinit m
do i = m to 1 by -1
call pqinsert i
end
do i = m to 1 by -1
call assert pqextractmax() = i
end
call assert n = 0
call pqinit m
do i = 1 to m
call pqinsert 1
end
do i = m to 1 by -1
call assert pqextractmax() = 1
end
call assert n = 0
n = m
call checksubs
end
say "testing sequential search"
call testsearch "ssearch"
say "testing binary search"
call testsearch "bsearch"
say "total errors (1 expected):" errcnt
if errcnt > 1 then
say ">>>> TEST FAILED <<<<"
exit
/* TESTING ROUTINES */
assert: procedure expose (globals)
parse arg cond
if \cond then do
errcnt = errcnt + 1
say " >> assert failed <<"
end
return
checkselect: procedure expose (globals)
parse arg k
do i = 1 to k - 1
call assert x.i <= x.k
end
do i = k + 1 to n
call assert x.i >= x.k
end
return
checksort: procedure expose (globals)
do i = 1 to n - 1
iplus1 = i + 1
call assert x.i <= x.iplus1
end
return
checksubs: procedure expose (globals)
do i = 1 to n
drop x.i
end
/* assert x. is empty; not possible in rexx */
return
clearsubs: procedure expose (globals)
drop x.
return
genequal: procedure expose (globals)
do i = 1 to n
x.i = 1
end
return
geninorder: procedure expose (globals)
do i = 1 to n
x.i = i
end
return
scramble: procedure expose (globals)
do i = 1 to n
call swap i, random(i, n)
end
return
search: procedure expose (globals)
parse arg t
select
when searchname = "bsearch" then return bsearch(t)
when searchname = "ssearch" then return ssearch(t)
otherwise do
say "invalid search name"
call assert 1 = 0
return
end
end
sort: procedure expose (globals)
select
when sortname = "qsort" then call qsort 1, n
when sortname = "hsort" then call hsort
when sortname = "isort" then call isort
otherwise do
say "invalid sort name"
call assert 1 = 0
end
end
return
testsearch: procedure expose (globals)
parse arg searchname
do n = 0 to bign
say " n=" n
call clearsubs
call geninorder
do i = 1 to n
call assert search(i ) = i
call assert search(i - 0.5) = 0
call assert search(i + 0.5) = 0
end
call genequal
call assert search(0.5) = 0
if n > 0 then
call assert search(1) >= 1
call assert search(1) <= n
call assert search(1.5) = 0
call checksubs
end
return
testsort: procedure expose (globals)
parse arg sortname
say " pathological tests"
do n = 0 to bign
say " n=" n
call clearsubs
call geninorder
call sort
call checksort
do i = 1 to n/2
call swap i, n+1-i
end
call sort
call checksort
call genequal
call sort
call checksort
call checksubs
end
say " random tests"
nfac = 1
do n = 1 to smalln
say " n=" n
nfac = nfac * n
call clearsubs
call geninorder
do i = 1 to nfac
call scramble
call sort
call checksort
end
call checksubs
end
return
/* SET ALGORITHMS */
/* swap -- exchange x.i and x.j */
swap: procedure expose (globals)
parse arg i, j
t = x.i; x.i = x.j; x.j = t
return
/* select -- partition x. so k'th largest element is in x.k */
/* average time complexity of n */
/* worst-case time complexity of n * n if x.1 = ... = x.n */
select: procedure expose (globals)
parse arg k
l = 1
u = n
do while l < u
call swap l, random(l, u)
t = x.l
m = l
do i = l + 1 to u
if x.i < t then do
m = m + 1
call swap m, i
end
end
call swap l, m
if m <= k then l = m + 1
if m >= k then u = m - 1
end
return
/* qsort -- quick sort of x.l to x.u */
/* recursive; call initially by qsort(1,n) */
/* average time complexity of n log n */
/* worst-case time complexity of n * n if x.1 = ... = x.n */
qsort: procedure expose (globals)
parse arg l, u
if l < u then do
call swap l, random(l, u)
t = x.l
m = l
do i = l + 1 to u
if x.i < t then do
m = m + 1
call swap m, i
end
end
call swap l, m
call qsort l, m - 1
call qsort m + 1, u
end
return
/* isort -- insertion sort of x.1 to x.n */
/* average time complexity of n * n */
/* best-case time complexity of n when x.1 <= ... <= x.n */
isort: procedure expose (globals)
do i = 2 to n
do j = i to 2 by -1
jminus1 = j - 1
if x.jminus1 > x.j then
call swap jminus1, j
else
leave
end
end
return
/* siftup -- re-establish heap after adding element to end */
/* time complexity of log n */
siftup: procedure expose (globals)
parse arg l, u
i = u
do forever
if i <= l then leave
p = trunc(i / 2)
if x.p >= x.i then leave
call swap p, i
i = p
end
return
/* siftdown -- re-establish heap after extracting maximum */
/* time complexity of log n */
siftdown: procedure expose (globals)
parse arg l, u
i = l
do forever
c = 2 * i
if c > u then leave
cplus1 = c + 1
if cplus1 <= u then
if x.cplus1 > x.c then
c = cplus1
if x.i >= x.c then leave
call swap c, i
i = c
end
return
/* hsort -- heap sort of x.1 to x.n */
/* time complexity of n log n */
hsort: procedure expose (globals)
do i = trunc(n / 2) to 1 by -1
call siftdown i, n
end
do i = n to 2 by -1
call swap 1, i
call siftdown 1, i - 1
end
return
/* pqinit -- initialize priority queue */
/* time complexity of 1 */
pqinit: procedure expose (globals)
parse arg pqmax
n = 0
return
/* pqinsert -- insert t into priority queue */
/* time complexity of log n */
pqinsert: procedure expose (globals)
parse arg t
call assert n < pqmax
n = n + 1
x.n = t
call siftup 1, n
return
/* pqextractmax -- delete and return value of */
/* maximum element in priority queue */
/* assumes priority queue is non-empty */
/* time complexity of log n */
pqextractmax: procedure expose (globals)
call assert n >= 1
t = x.1
x.1 = x.n
n = n - 1
call siftdown 1, n
return t
/* ssearch -- sequential search for t in x.1 to x.n */
/* returns position of t in x, or 0 if t not in x */
/* returns first of multiple identical items */
/* time complexity of n */
ssearch: procedure expose (globals)
parse arg t
do i = 1 to n
if x.i = t then return i
end
return 0
/* bsearch -- binary search for t in sorted x.1 to x.n */
/* assumes x.1 <= x.2 <= x.i <= x.n for all 1 <= i <= n */
/* returns position of t in x, or 0 if t not in x */
/* returns any of multiple identical items */
/* time complexity of log n */
bsearch: procedure expose (globals)
parse arg t
l = 1; u = n
do while l <= u
m = trunc((l + u) / 2)
select
when x.m < t then l = m + 1
when x.m > t then u = m - 1
otherwise /* x.m = t */ return m
end
end
return 0
--
Phil Bewig ... pbe...@netcom.COM