> LR 12,15 Origin of CSECT to GR12
> LA 10,2048 Number 2048 to GR10
> LA 11,2048(10,15) GR11 = GR15 + 2048 + 2048
> LA 10,2048(10,11) GR10 = GR11 + 2048 + 2048
> ...
> and even better, that takes only 14 bytes.
>
> Or just do like PL/S and its descendants did, set up the base regs
> 4095 bytes apart rather than 4096 apart. That cuts the number of bytes
> taken to set up three base registers down to 10 - at the expense of
> losing two bytes of addressability. Of course, the USING/s would have
> to reflect the fact of the base regs being only 4095 apart too.
>
> Andy Wood
>
woo...@trap.ozemail.com.au
Dear Mr Andy Wood,
My thanks for your extended reply to my latest post.
I never explicitly intended to set up a coding contest; I just replied to the original post to
celebrate a memory from a previous professional life of mine.
In 1968 I became an employee of Delft University of Technology on the occasion of the purchase of an
IBM S/360 Model 65 with OS/MFT and at a later date OS/MVT as its operating system. I got completely
engaged with and engulfed by IBM stuff, the Systems Journal and the Journal of Research and
Development included. I remember rather vividly the issue of the Systems Journal on NASA's Apollo
missions, and I came to appreciate the necessity of writing tight yet provably correct code for the
on-board software.
However, during my IBM time at Delft University I never had to write tight mission-critical code.
So now is the time to take revenge, so to speak: I imagine that I now have to write the tightest
S/360 machine code for a moderately large module (i.e. moderately large for the 1970s).
So I took up the challenge of writing code to set the three base registers in fewer than 14 bytes
- I did not succeed! - and then of giving a proof, as watertight as possible, that one indeed
cannot do better than 14 bytes.
PROOF THAT ONE CANNOT DO BETTER THAN 14 BYTES IN SETTING THREE BASE REGISTERS
-----------------------------------------------------------------------------
(A)
In any case one needs "LR 12,15", then one must add 4096 to GR12 and leave the result in GR11, and
add 4096 to GR11 and leave the result in GR10. This makes three copy/store operations and two
additions.
(B)
The number 4096 cannot be generated directly using only GRs, because the displacement field of "LA"
holds at most 4095. The shortest code to obtain 4096 is an "LA ..,2048(0,0)" and one addition, or
alternatively, "LH ..,=H'4096'". Generating 8192 requires one more addition.
(C)
(C1) Altogether one needs an "LA", three copy/store operations and four additions when using GRs
only, or
(C2) with a literal H'4096': three copy/store operations and three additions.
(C1)
The Load Address instruction is the shortest code that combines two additions and a store
operation. Therefore the shortest possible code covering three copy/store operations and four
additions necessarily consists of two "LA" instructions and an "LR" instruction, together occupying
10 bytes. The indispensable "LA ..,2048" takes another 4 bytes, giving 14 bytes as a minimum.
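As a quick check of the arithmetic in case C1, here is a minimal sketch in Python (not S/360 code) that models the quoted four-instruction sequence. The helper names `lr`, `la` and `setup_three_bases` are made up for this illustration, and the model assumes 24-bit address arithmetic for "LA", with register 0 contributing nothing as a base or index.

```python
MASK24 = 0xFFFFFF  # assumption: "LA" produces a 24-bit address on S/360


def lr(regs, r1, r2):
    """Model of "LR r1,r2": copy r2 into r1. Returns the instruction length in bytes."""
    regs[r1] = regs[r2]
    return 2


def la(regs, r1, d2, x2=0, b2=0):
    """Model of "LA r1,d2(x2,b2)". Returns the instruction length in bytes."""
    if not 0 <= d2 <= 4095:
        raise ValueError("displacement must fit in 12 bits (0..4095)")
    index = regs[x2] if x2 else 0  # register 0 means "no index"
    base = regs[b2] if b2 else 0   # register 0 means "no base"
    regs[r1] = (d2 + index + base) & MASK24
    return 4


def setup_three_bases(entry):
    """Run the quoted sequence with GR15 = entry; return (registers, code size)."""
    regs = [0] * 16
    regs[15] = entry
    size = 0
    size += lr(regs, 12, 15)            # LR 12,15
    size += la(regs, 10, 2048)          # LA 10,2048
    size += la(regs, 11, 2048, 10, 15)  # LA 11,2048(10,15)
    size += la(regs, 10, 2048, 10, 11)  # LA 10,2048(10,11)
    return regs, size


if __name__ == "__main__":
    regs, size = setup_three_bases(0x5000)
    print(hex(regs[12]), hex(regs[11]), hex(regs[10]), size)
```

The sketch only confirms that the sequence yields three bases 4096 apart in 2 + 4 + 4 + 4 = 14 bytes; it says nothing about whether a shorter sequence exists.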
(C2)
One could try and do the job with a single "LA" and two "AR" instructions, or with three . The "LR
12,15" and the literal H'4096' occupy four bytes, the "LH" instruction to retrieve the literal also
occupies four bytes, so to beat the 14-byte limit of case C1 one has 6 bytes of coding room (or 7
bytes, if a one-byte constant ever entered the game). Still two copy/stores and three additions
to go... this will never fit into six bytes, not even into eight bytes. I tried this, but have not
yet made the effort to give a conclusive proof that one needs at least 16 bytes when using H'4096'
in main storage.
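To make the byte tally of case C2 concrete, here is a small Python sketch (again a model, not S/360 code) of two candidate sequences that use the halfword literal. The scratch register GR9 and the function names are assumptions made for this illustration, and the literal is counted as two bytes of storage. Both candidates come out at 16 bytes; this illustrates the count and is not a proof of minimality.

```python
MASK24 = 0xFFFFFF  # assumption: "LA" produces a 24-bit address on S/360
LITERAL_BYTES = 2  # the halfword literal H'4096' occupies two bytes


def c2_with_la(entry):
    """Candidate: LR, LH, then two "LA" instructions using GR9 as index."""
    regs = [0] * 16
    regs[15] = entry
    size = LITERAL_BYTES
    regs[12] = regs[15]; size += 2                       # LR 12,15
    regs[9] = 4096; size += 4                            # LH 9,=H'4096'
    regs[11] = (regs[9] + regs[12]) & MASK24; size += 4  # LA 11,0(9,12)
    regs[10] = (regs[9] + regs[11]) & MASK24; size += 4  # LA 10,0(9,11)
    return regs, size


def c2_with_ar(entry):
    """Candidate: LR, LH, then two LR/AR pairs."""
    regs = [0] * 16
    regs[15] = entry
    size = LITERAL_BYTES
    regs[12] = regs[15]; size += 2  # LR 12,15
    regs[9] = 4096; size += 4       # LH 9,=H'4096'
    regs[11] = regs[12]; size += 2  # LR 11,12
    regs[11] += regs[9]; size += 2  # AR 11,9
    regs[10] = regs[11]; size += 2  # LR 10,11
    regs[10] += regs[9]; size += 2  # AR 10,9
    return regs, size


if __name__ == "__main__":
    for candidate in (c2_with_la, c2_with_ar):
        regs, size = candidate(0x5000)
        print(candidate.__name__, hex(regs[12]), hex(regs[11]), hex(regs[10]), size)
```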
By the way, I was never involved in PL/S programming; my main areas at Delft University Computing
Centre were systems and interface programming in OS/360 and OS/370 Assembler Language and
application programming in Fortran, a little Algol-60, PL/1 and COBOL.
My pet programs still are
(A) a disk file inspection program based on reading full tracks in two revolutions, one revolution
for the Count fields and the next revolution for the full Count-Key-Data blocks (as long as no
End-Of-Data blocks are present; for each EOD block one needs a next revolution to read the next few
data blocks); this program shortened the operators' night shift by about one hour and a half,
fortunately without a reduction of their salaries.
(B) a newly written stand-alone post mortem dump program on punched cards for the IBM S/360 Model
44; the usual post mortem dump program covers the full S/360 instruction set and will not run on the
Model 44 with its reduced instruction set; the existing program did not conserve all of the control
words in the lowest 512 bytes across an IPL.
Best regards: Johan E. Mebius