
Sample code


Andrew Koenig

Jun 11, 1986, 11:01:23 AM
As absurd as it may sound, I know a guy who claims that the
real-world performance of an APL system can be accurately
predicted by finding out how fast an empty user-defined
function can be called in a simple loop. Thus:


N <- 1000
LOOP: FUN
-> (0 < N <- N - 1) / LOOP


where FUN has no arguments and no body.
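For readers who want to try the idea outside APL, here is a minimal sketch of the same micro-benchmark in Python (a stand-in for the APL loop above; the names `fun` and `call_loop` are illustrative, not from the thread):

```python
import timeit

def fun():
    """Empty user-defined function, standing in for FUN."""
    pass

def call_loop(n=1000):
    """Call the empty function n times, as in the APL countdown loop."""
    for _ in range(n):
        fun()

# Average time for one 1000-call loop; by this metric, smaller means a
# "faster" interpreter, regardless of how well it handles big arrays.
seconds = timeit.timeit(call_loop, number=100) / 100
```

Whether that single number predicts "real-world" performance is, of course, exactly the claim under dispute in this thread.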

APL Quote Quad

Jun 12, 1986, 8:29:46 AM

It would be interesting to have reports of people's experience here.
For instance, the Harris Benchmark today makes allowance for an "ADJ"
term which measures more or less the time taken for the above loop. It
would be interesting to see a system that is blindingly fast on that
expression but stupidly slow at some "real" problem. But I doubt
that such a system exists.

At the other extreme, however, a good example does exist, notably "The
APL Machine" by Analogic. It is not especially fast on the empty loop,
but as a parallel processor (is APL the first parallel-processing
language?) it is like greased lightning. Ask it to invert
a 200 by 200 matrix or multiply two long real vectors with, say, 250K
entries each. Instantaneous!

I think it gives incredible performance for modest price. I wish I had
one!

--
wat...@watmath.UUCP
watapl%wat...@waterloo.CSNET
... {ihnp4, allegra, decvax} !watmath!watapl

Lee Dickey

Jun 12, 1986, 9:27:49 AM
> At the other extreme however, a good example does exist, notably "The
> APL Machine" by Analogic. ...
> ...

> I wish I had one!

I wish I had three.

Jan Prins

Jun 16, 1986, 2:12:47 AM
>> As absurd as it may sound, I know a guy who claims that the
>> real-world performance of an APL system can be accurately
>> predicted by finding out how fast an empty user-defined
>> function can be called in a simple loop. Thus:
>>
>>
>> N <- 1000
>> LOOP: FUN
>> -> (0 < N <- N - 1) / LOOP
>>
>>
>> where FUN has no arguments and no body.

> It would be interesting to see a system that is blindingly fast on that
> expression, but stupidly slow at some "real" problem. But, I doubt
> that such a system exists.

IBM apparently considered (constructed?) several microprogrammed assists
for the VSAPL interpreter on various bottom-end /370 machines (the /148
and /135). Each was to optimize a different aspect of APL execution.

I am not sure whether more than one ever made it into the field, but the
released /148 microcode implemented the mechanics of interpretation:
parsing, function invocation and variable localization, storage allocation,
etc. The assist actually "popped out" to branch tables and native /370
code to perform most of the APL primitives.

This was surprising, since "fast arithmetic and vector loop"
implementations are far easier to build and were generally thought to be
the best bet for speeding up APL execution.

IBM claimed, and this was supported by interpreter statistics, that despite
the large array opportunities in the language, most data values were very
short or even scalar, and that interpretation dominated the cost of execution.
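That claim can be illustrated with a small sketch in Python (names and sizes are illustrative, not from IBM's statistics): timing one elementwise "primitive" on tiny versus large vectors shows per-call interpretation overhead dominating the former and the arithmetic itself dominating the latter.

```python
import timeit

def elementwise(a, b):
    """One interpreted 'primitive': elementwise product of two vectors."""
    return [x * y for x, y in zip(a, b)]

small = list(range(4))        # scalar-ish data: overhead per call dominates
large = list(range(250_000))  # big-array data: the arithmetic dominates

# Per-call time on the tiny vectors is almost entirely dispatch and loop
# machinery; on the large vectors that fixed cost is amortized away.
t_small = timeit.timeit(lambda: elementwise(small, small), number=1000)
t_large = timeit.timeit(lambda: elementwise(large, large), number=10)
```

On such a workload mix, speeding up the interpretive machinery (as the /148 assist did) pays off more than speeding up the bulk arithmetic.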

This was borne out by the performance of the /148 assist. STSC performed
some benchmarks that found the /148 VSAPL to be equal, in average performance
(commercial time-sharing), to a /168. The performance loss with the assist
disabled was a factor of 10, while for worst-case large numerical computations
the factor was 40.

The sort of loop described above is exactly where this system excelled.
None of the operations requires an exit from the microcode, so this represents
the top speed of the implementation. Depending on how you read "blindingly fast"
and "stupidly slow", this system might appear to violate the performance
metric. But the bottom line was that it was an impressive APL engine, on
the other end of the spectrum from ANALOGIC.

In fact, this "surprise" was the cause of an enormous computing fiasco right
here at Cornell some years ago. Heads rolled! A clever administrator
realized that the main campus' /168, which was 10% idle, had spare capacity
equal to the Med. School's /148, which could then be eliminated. "All they
do is run APL all day" he said as they carted the machine off....

Personally, I think that the small sizes of arrays that dominated the
statistics 10 years ago were symptomatic of poor utilization of the language,
something that might have improved with more generalized definitions (APL2)
and programmer experience. Perhaps the time is better now for parallel
implementations that stress performance on large aggregate values?

jan prins {vax135,decvax,ihnp4}!cornell!prins
pr...@svax.cs.cornell.edu
pr...@cornell.csnet
PR...@CRNLCS.BITNET
