A second thing is the use of self modifying code. Is there still any use
for it (if there ever was)?
Apart from curiosity there is a more serious aspect to my asking.
Here in Europe there are some EC funded projects attempting to bring formal
(verification/development) methods to the construction of system kernels,
(hard) real time and/or safety critical systems. Most assume the use of some
high(er) level language and retain e.g. strict data/code separation.
To what extent are such assumptions justified?
The question is not so much whether using such assumptions can be justified
at some stage during system development but rather whether or not such
assumptions can be maintained at every stage.
Ok, now for the FUN part!
Let's chronicle the deeds of the past masters(*)---or indeed, present masters---
of machine coding. Who has any stories?
What is the hairiest, most beautiful or most awful piece of self modifying code
that you've come across?
Any takers?
(*) No: not Mel the programmer; unless somebody comes up with the actual code.
--
# Rob Gerth #
# #
# uucp: ro...@win.tue.nl | Eindhoven University of Technology #
# | 5600 MB Eindhoven, The Netherlands #
Assembly language is best used when you've got to squeeze code onto a
machine that barely has enough memory or speed to solve the problem
at hand.
In the aerospace business, it's all too common to be saddled with
hardware that would be considered obsolete by those of us who think
in terms of earthbound systems. For example, there's a satellite
I was involved with that is to be launched in 1992 where, two years
ago, we were designing the software assuming we'd get a 3.5 MHz 80C86
processor, but it turned out we couldn't even get that processor
in radiation-hardened, certified-for-space-launch, ultra-low-power form,
so they had to abandon some work and back off to an even crummier
processor.
I spent a summer hand-tuning the assembly code for a data compression
algorithm to run on the 80C86, and got it to the point where it was
barely fast enough and small enough to meet our needs when that
processor was abandoned.
In theory, I suppose I could have tuned my code by diddling with C
source and watching what each change did to the compiler output, but
that's a shotgun approach (except, perhaps, for someone who knows
exactly how the compiler works). For some changes, particularly those
involving register assignment on the horribly irregular 8086
architecture, I doubt I could have done them at the source level.
The software support that would have been most useful in the awkward
job of writing fine-tuned 8086 code would have been a cycle counter
that delivered, for each basic block, the number of machine cycles
expected to be consumed by a pass through that block.
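A rough sketch of such a cycle-counting helper, in Python; the per-opcode costs below are invented placeholders, not real 8086 timings, which a real tool would take from the processor manual (including effective-address overhead).

```python
# Minimal sketch of the tool wished for above: total the estimated
# machine cycles for one basic block.  The cycle costs are invented
# placeholders, NOT real 8086 timings.
CYCLES = {"mov": 2, "add": 3, "cmp": 3, "jnz": 16}  # invented numbers

def block_cycles(block):
    """Sum estimated cycles for a basic block (a list of instruction
    strings); the opcode is the first token of each instruction."""
    return sum(CYCLES[ins.split()[0]] for ins in block)

print(block_cycles(["mov ax,bx", "add ax,1", "jnz top"]))  # 2+3+16
```

A real tool would also have to model the prefetch queue and memory wait states, which is exactly why hand-counting 8086 cycles was so awkward.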
Doug Jones
jo...@herky.cs.uiowa.edu
In IBM/370 assembler, there is a special instruction that does self-modifying
code for you! (Well, almost.) It's called EX (Execute). How it works is
rather weird. Given a register and a memory location, it executes the
instruction at that location after ORing the low-order byte of the register
into it (exactly which part of the instruction gets ORed depends on the
instruction type).
Why would you want to use it? I think an example will explain things a bit
more clearly. An MVC instruction (which moves bytes from one memory
location to another) will be modified by EX so that its length field comes
from the register used by the EX.
Something like:
MVC LABEL1(8),LABEL2
Moves 8 bytes from LABEL2 to LABEL1. But if you wanted to move a
variable number of bytes, you could do this:
EX 5,MVCINST
.
.
.
MVCINST MVC LABEL1(0),LABEL2
This ORs the low-order byte of R5 into the length field, allowing a
variable number of bytes to be moved.
Actually, looking back, I realize that you should subtract 1 from the
register used by EX, because of the format of the MVC instruction.
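The EX trick can be sketched as a small simulation in Python (memory as a bytearray, with the register and length-field encoding modeled loosely, not actual 370 semantics):

```python
# Rough simulation of the EX/MVC trick described above (Python, not
# 370 assembler).  The MVC "template" carries a length field of 0,
# and EX ORs the register's low-order byte into it before executing.
def mvc(memory, dst, src, length_field):
    # the 370 encodes length-1, so length_field+1 bytes are moved
    for i in range(length_field + 1):
        memory[dst + i] = memory[src + i]

def ex(memory, reg, template_length_field, dst, src):
    # EX: OR the low byte of the register into the target
    # instruction's length field, then execute the modified copy
    mvc(memory, dst, src, template_length_field | (reg & 0xFF))

mem = bytearray(b"........HELLO!")
ex(mem, 5, 0, 0, 8)   # register holds 5 -> moves 5+1 = 6 bytes
print(mem[:6])
```

Note the length-1 encoding showing up here, the same subtract-one gotcha mentioned in the post above.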
---
By the time a man reads women like a book he's too old to collect a library.
Ken Hornstein kxh...@psuvm.psu.edu Phone: 814/862-7007
I know of two examples where this sort of thing was done. On the
DEC-20 we had in school, the TECO editor would compile stuff into
machine code, then execute the machine code. It turned out to be
faster to compile the code and then execute it than it was to
interpret the "average" macro.
The second example is more bizarre. There was a program called
visiclone that was a spreadsheet that ran on our BSD vax. It used to
compile the instructions it got from the user into a series of JSB's
(or CALLS, I never can keep those straight :-), then would jump to that
array. Made for a very fast interpreter on the 750, but didn't do
much to make visiclone portable.
The only honest to god self modifying code that I have had to write
was on the PDP 11 for saving a subset of the flags, then setting them
later. I had to hack the instructions in core to get the instructions
that I wanted.
The DEC Rainbow is a curious machine. It has a Z80 in it to do disk
I/O (and run CP/M programs), and a 8088 in it to run DOS (originally
to run CP/M-86, but who's counting?). The 8088 couldn't access the
disk drive at all, except via some rather strange contorted "calls" to
the Z80. The Z80 code would tend to stuff values into "magic"
addresses in the read/write routine to make it into a read or a write
routine (basically changing an IN instruction to an OUT instruction).
Warner
--
Warner Losh i...@Solbourne.COM
We sing about Beauty and we sing about Truth at $10,000 a show.
It's still being done on the very smallest of embedded systems. Many
devices today are still written in assembler, and (despite a recent
editorial in Embedded Systems to the contrary) I think they still will be.
Even if megabit memories and multi-mips microcontrollers are becoming
more and more affordable, I think there will always be a subset of
applications that must run in the *tiniest* space, on the *lowest* power,
that will have to be hand-coded. (Does writing pushrod code for nanogadgets
count as assembler?)
--
-- Laird P. Broadfield | Year after year, site after
UUCP: {akgua, sdcsvax, nosc}!crash!lairdb | site, and I still can't think
INET: lai...@crash.cts.com | of a funny enough .sig.
>>I'd like to start a new thread; or maybe two threads:
>>I'm curious about current *real world* use of assembly/machine coding.
>>Where is it still done? And on what systems? And why?
Sometimes a compiler can't produce code that meets certain unusual
requirements. An example might be the restart/powerup entry of certain
machines, such as Microvaxes. The code must start at a specific address and
save the state of the world in a machine-specific way, or precise control of
manipulating I/O devices or something is needed. Also, some code interfaces to
the operating system environment are quite arcane, such as VMS device drivers.
I suppose it might be possible to write one in a high level language, but it
would be very difficult.
>>Ok, now for the FUN part!
>>Let's chronicle the deeds of the past masters(*)---or indeed, present masters---
>>of machine coding. Who has any stories?
>>What is the hairiest, most beautiful or most awful piece of self modifying code
>>that you've come across?
>>
>>Any takers?
>>
>I've read and written plenty of self-modifying code (long ago) but I don't
>really know of an example worth preserving for posterity.
>The CDC 6000 diagnostic programs used to distinguish between 6400-type and
>6600-type CPUs (which needed different diagnostics) by
>
> SA1 JUMP6400 LOAD JUMP TO 6400 CODE
> BX6 X1
> SA6 *+1 STORE AS NEXT INSTRUCTION
>+ EQ P6600 JUMP - IT'S A 6600
>
>JUMP6400 EQ P6400 JUMP - IT'S A 6400
>
>This is not the original code; I made it up as I went along; it's five years
>or so since I last programmed in CP Compass. It clobbers the next instruction
>word to be executed to change a jump to 6600 into a jump to 6400. On a 6600,
>which had an instruction stack, this would fail, and the code would jump to
>P6600 anyway.
The CDC machines were worse than this - a machine instruction deliberately
generated self modifying code! A procedure call instruction (RJ) wrote a
branch instruction to the following instruction at the destination address, and
then started executing the instruction after this word. There was no return
instruction; to return from the procedure, you jumped to its starting address
and executed the deposited instruction, which was a branch to the instruction
following the procedure call. Because of this scheme, you couldn't use RJ for
recursion. The machine also had no stack as we know it.
Also the procedure that made system requests under CDC's NOS operating system
always checked the machine type to see if it had a certain instruction for
system calls. If it did, it wrote the instruction in the code path it just
executed, so next time it did not check, it just executed the instruction
and "returned". If not, it wrote a different branch instruction.
-Mike Moroney
I recently had a look at a book about the 6809 microprocessor (written just
before the 16-bit machines took over, it predicted a glorious future for
the 6809. Sigh. It really is a nice processor; a pity it wasn't introduced a
few years earlier.).
Now, this book seemed to be written by a real hardware jock, used to
programming on the bare metal - he consistently used a format like this
for his assembly code (my knowledge of 6809 assembler is minimal so the
mnemonics may be bogus, but I hope you get the idea):
LDA
$12
$34
BPL
$FB
instead of the more conventional
LOOP: LDA $1234
BPL LOOP
- the format was really adapted for hand-assembly!
Anyway, the 6809 has a nice orthogonal instruction set with instructions
like BRN (BRanch Never), which of course acts as a sort of two-byte NOP.
When commenting on why the designers had chosen to incorporate such a
strange instruction, he of course didn't mention the esthetic pleasure of
an orthogonal instruction set [:-)] but instead used the motivation that
this allowed you to save memory space by writing code with multiple entry
points. If the operand of the BRN is the opcode of some useful
instruction, you get a routine that does different things depending on
whether you jump to the BRN or to the byte after it!
This attitude is about as far as you can get from 'GOTO considered
harmful', isn't it?
Is this kind of coding really used in real life? Or is it just something
real programmers do for the heck of it? And is there a name for this kind
of coding? I suppose you could call it 'overlaying', but that name's
already taken for something quite different... What about 'interleaved
code'?
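The BRN trick can be made concrete with a toy decoder in Python: the same byte string yields two different instruction sequences depending on the entry offset, because one instruction's operand byte is another's opcode. The opcode table is invented for illustration, not real 6809 encodings.

```python
# Toy illustration of "interleaved code": one byte string, two entry
# points, two routines.  Opcodes below are invented, not real 6809.
OPS = {0x86: ("lda#", 1),   # load immediate: one operand byte
       0x0D: ("sec",  0),   # set carry: no operands
       0x39: ("rts",  0)}   # return

def trace(code, pc):
    """Decode from a given entry offset until an rts is reached."""
    seen = []
    while True:
        name, nops = OPS[code[pc]]
        seen.append(name)
        if name == "rts":
            return seen
        pc += 1 + nops       # skip the opcode plus its operand bytes

code = bytes([0x86, 0x0D, 0x39])   # three bytes holding two routines
print(trace(code, 0))   # entry 0: lda# swallows 0x0D as its operand
print(trace(code, 1))   # entry 1: 0x0D is now an opcode: sec
```

Entering at offset 0 loads a value and returns; entering at offset 1 sets the carry and returns, exactly the space-saving overlap the book was advocating.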
Magnus Olsson | \e+ /_
Dept. of Theoretical Physics | \ Z / q
University of Lund, Sweden | >----<
Internet: mag...@thep.lu.se | / \===== g
Bitnet: THEPMO@SELDC52 | /e- \q
I've heard of one use for it among 80x86 machines. To determine if your
program is running on an 8088 or 80286, for example, you can execute some
seemingly nonsensical code like this:
MOV AL,0
MOV [ADDR+1],AL ; change the operand of the next instruction
; You'll possibly need to insert some NOP's here
ADDR: MOV AH,1
The point is that the 8088 and the 80286 have differently sized prefetch
queues. On an 8088, the code will operate as it should, since the MOV AH,1
hasn't been fetched from memory when its operand is changed. On the 80286,
however, this instruction is fetched *before* its operand is changed, so
the code will produce the "wrong" result (AH=1).
[Note: I'm not *quite* sure of the facts here. I saw this on Comp.arch a
few months ago and am writing it down from memory. Please don't flame me
too hard if I've misunderstood something - this a folklore group, remember?
:-)]
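The effect can be simulated in Python with an explicit prefetch queue; the queue depths and instruction encodings below are invented, and only the mechanism (a store to program memory missing instructions already fetched) follows the description above.

```python
from collections import deque

# Toy simulation of the prefetch-queue detection trick.  The "CPU"
# copies instructions into its queue before executing them, so a
# store into program memory only affects instructions not yet
# fetched.  Depths and encodings are invented, not real 8088/80286.
def detect(queue_depth):
    prog = [
        ["mov_al", 0],        # MOV AL,0
        ["patch_operand"],    # MOV [ADDR+1],AL
        ["mov_ah", 1],        # ADDR: MOV AH,1
    ]
    regs = {"AL": 0, "AH": None}
    q, pc = deque(), 0
    while pc < len(prog) or q:
        while len(q) < queue_depth and pc < len(prog):
            q.append(list(prog[pc]))   # fetch a *copy* into the queue
            pc += 1
        ins = q.popleft()
        if ins[0] == "mov_al":
            regs["AL"] = ins[1]
        elif ins[0] == "patch_operand":
            prog[2][1] = regs["AL"]    # modify the operand in memory
        elif ins[0] == "mov_ah":
            regs["AH"] = ins[1]
    return regs["AH"]

print(detect(1))   # short queue: the patch lands before the fetch
print(detect(3))   # long queue: MOV AH,1 was already fetched
```

With the short queue the patched operand (0) is what executes; with the long queue the stale prefetched operand (1) wins, which is the whole basis of the detection.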
>A second thing is the use of self modifying code. Is there still any use
>for it (if there ever was)?
A few years back I wrote some sprite routines for my BBC micro (6502
based micro computer). I wanted to optimize for speed primarily and
space secondarily. There was a choice of placing, ORing, ANDing and
XORing the sprite on the screen. This was solved by modifying an
instruction in the central loop to be either LDA, ORA, AND or EOR. The
actual parameter to the sprite routine that determined the operation
was the op-code for the desired instruction!
The alternative to this approach was to make a copy of the code for
each of the alternatives, and then do an indexed jump to one of these.
But that required more space and it would actually be a bit slower.
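A loose Python analogue of the opcode-patching idea: pick the merge operation once, up front, so the per-byte loop pays no dispatch cost. The function names and screen layout are invented for illustration.

```python
import operator

# Analogue of the 6502 trick above: instead of patching the opcode
# byte in the inner loop, we bind the merge operation once so the
# per-byte loop does no dispatching (layout invented for the sketch).
def make_blit(op):
    def blit(screen, sprite, offset):
        for i, b in enumerate(sprite):
            screen[offset + i] = op(screen[offset + i], b) & 0xFF
    return blit

blit_store = make_blit(lambda old, new: new)   # plain place (LDA/STA)
blit_or    = make_blit(operator.or_)           # ORA
blit_and   = make_blit(operator.and_)          # AND
blit_xor   = make_blit(operator.xor)           # EOR

screen = bytearray([0xF0] * 4)
blit_xor(screen, [0xFF, 0xFF], 0)
print(screen.hex())   # first two bytes flipped
```

The 6502 version is stronger still: the caller passes the opcode byte itself as the parameter, and the routine stores it straight into the instruction in the loop.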
I have been told that on some workstations BitBLT operations build a
piece of code that is specialized not only for the logical operation,
but also for the size of the bitmaps.
This is actually partial evaluation (specialization) of some general
code, followed by immediate execution of the residual (specialized)
code.
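That kind of runtime specialization can be sketched in Python with exec: build source text for a blit specialized to one operator and one width, compile it, and immediately run the residual code.

```python
# Sketch of BitBLT-style runtime specialization: generate source for
# a blit fixed to one merge operator and one width (loop fully
# unrolled), compile it with exec, and return the residual function.
def specialize_blit(op_expr, width):
    lines = ["def blit(screen, sprite, offset):"]
    for i in range(width):   # unroll: no loop overhead remains
        lines.append(
            f"    screen[offset+{i}] = "
            f"(screen[offset+{i}] {op_expr} sprite[{i}]) & 0xFF")
    ns = {}
    exec("\n".join(lines), ns)   # the "immediate execution" step
    return ns["blit"]

blit_xor4 = specialize_blit("^", 4)
screen = bytearray([0xF0] * 8)
blit_xor4(screen, [0xFF] * 4, 2)
print(screen.hex())
```

This is the partial-evaluation view from the post made literal: the general blit is the source generator, and the specialized code is its residual program.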
Self modifying code is actually used extensively among Prolog
programmers, by use of assert and retract. Most of the uses correspond
to modifying global variables, but some uses actually modify the
central code.
Torben Mogensen (tor...@diku.dk)
>A second thing is the use of self modifying code. Is there still any use
>for it (if there ever was)?
Back in the days of ferrite-core memories, one application for which
self-modifying code was absolutely essential was that of clearing
the whole of memory (including the program itself). On some machines
this was an impossible task (eg PDP-1). On others it could be done,
but was often tricky, to say the least. In the late 1960's several of
these routines for the PDP-8 were put forward in the DECUS newsletter,
and many of them were quite astonishingly ingenious. I believe that
a version that a colleague of mine and I produced about July 1965 was
probably the very first for the PDP-8, but I would be interested to
hear of any earlier claims!
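The flavor of the problem can be shown with a toy machine in Python (not actual PDP-8 code): word 0 halts the machine, and the clear loop eventually wraps around and zeroes itself, which is precisely what stops it.

```python
# Toy-machine sketch of clearing ALL of memory, program included.
# Convention (invented): fetching word 0 halts; the one-word "clear
# loop" at address 0 zeroes the next cell each step.  The pointer
# wraps around and the program clears itself, and that self-erasure
# is exactly what halts the machine, leaving every word zero.
def clear_all(memsize=8):
    mem = [0x7777] * memsize   # garbage everywhere
    mem[0] = 0x3001            # the clear "instruction" at address 0
    ptr = 1                    # start clearing just past the program
    while mem[0] != 0:         # fetching word 0 would halt
        mem[ptr % memsize] = 0
        ptr += 1
    return mem

print(clear_all())
```

The real PDP-8 versions were far trickier, of course: they had to arrange matters so the last surviving instruction did useful work on its way out.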
--
+-------------------+-----------------------+
| Alec McKenzie | a...@stl.stc.co.uk |
+-------------------+-----------------------+
The TRS-80 model 1 used this trick. It's a Z-80 machine with Microsoft
BASIC in ROM. The entry point to the BASIC verbs SET and RESET (used to
turn pixels on or off on the screen) use "overlayed" code. Jump to one
location, the routine turns a point on, jump to the middle of the first
instruction, and the routine turns the point off.
TRS-80 disk operating systems also typically used self-modifying code,
especially for the disk drivers. The OS I use only uses a couple of K
of memory, but is actually about 40K in length. It works by loading
overlays to perform different operations, so it pays to make the resident
part as space-efficient as possible. The disk read routine is changed into
a disk write routine by changing a couple of code bytes on the fly... all
the code that computes directory/file locations for different disks (single/
double density, single/double sided, 8", 5.25", or 3.5", 40 or 80 tracks,
etc.) uses self modifying code. The single routine just modifies itself
for whatever disk geometry/density is currently being accessed.
I'm sure it was fun to write. It's hell to disassemble and comment. The
designers of the OS felt it was sufficient to provide you with a short
list of entrypoints with terse documentation, and a disassembler!
--
| b...@epmooch.UUCP (Ben Mesander) | "Cash is more important than |
| ben%serval...@uokmax.ecn.uoknor.edu | your mother." - Al Shugart, |
| !chinet!uokmax!servalan!epmooch!ben | CEO, Seagate Technologies |
SUBRTN ST 14,SUBRTNX+2
.
(body of subroutine)
.
SUBRTNX B *
If you forgot to save your return address, you'd go into a loop.
I spent many happy hours hacking that beast.
Charli...@mindlink.UUCP
I'm trying to develop a photographic memory.
>A second thing is the use of self modifying code. Is there still any use
>for it (if there ever was)?
I am surprised that nobody has pointed out that self-modifying code is standard
industry practice and that, used properly, it still obeys the principles of
modular programming. In fact, it would seem that when done properly, self-modifying
code is so natural and simple to use and understand that few people have
realized that every operating system that is not resident in ROM uses
self-modifying code: self-modifying code is what the operating system does to
load programs to run. It moves code from disk into memory and then calls it.
-- edp (Eric Postpischil)
"Always mount a scratch monkey."
e...@jareth.enet.dec.com
> What is the hairiest, most beautiful or most awful piece of self modifying
> code that you've come across?
> Any takers?
In November of 1982, when I was in high school, I used self-modifying code
as part of an Applesoft BASIC equation graphing program I wrote which
graphed 3rd and 4th order polynomials in rectangular or polar coordinates
for "Functions," a pre-pre-calculus class.
In the equation-entry window (I used a human interface similar to
the one in Applewriter 2.0) I prompted the user with "F(X) = " or "R(THETA)
= " and the user would then enter a mathematical expression in BASIC. The
user could also enter range, origin, and various scaling factors for the
graphing. The user data, which was entered as text strings, was then
concatenated into larger strings, such as:
350 S1$ = "10020 FN F(X) =" + EQ$
These lines were written into a text file, which was then EXEC-ed, and the
lines became new program lines, which were then run. Some might not
consider this truly self-modifying code, since although the Applesoft
environment was interpretive, I really just executed a shell script
and re-interpreted on the fly. Also, if the user made a syntax error such
as "2X^2" instead of "2 * X^2" the program would bomb immediately, but that
was a risk worth taking for me.
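A rough Python analogue of the EXEC trick: splice the user's typed expression into program text and execute that text, so it becomes part of the running program (the expression below stands in for user input). As in the Applesoft version, a malformed expression bombs at the exec step.

```python
# Analogue of the EXEC trick: user-entered text is spliced into
# source code, which is then executed so it becomes callable program.
user_expr = "3*X**2 + 1"   # stands in for what the user would type
source = f"def f(X):\n    return {user_expr}\n"
ns = {}
exec(source, ns)           # the EXEC step: text becomes program
points = [ns["f"](x) for x in range(4)]
print(points)
```

As with the original, there is no validation: a syntax error in the user's expression raises immediately at exec time rather than at plot time.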
- Rob
--
Robert A. Levene Internet: lev...@aplcomm.jhuapl.edu Bitnet: RXL1@APLVM
Disclaimer: I speak neither for my race, my culture, my country, my religion,
my political party, nor my employer, but for me alone.
Yes, this is indeed so central to computers and so transparent to the user
that nobody ever thinks of it. I think the original poster meant what
"everybody" normally means by "self-modifying code" - a clearly defined
user program that modifies its own code while it's running (this is not a
very clear definition but of course there's no clear dividing line). In your
example, you could say that the loader isn't really modifying *itself*,
but that's of course a matter of definition.
I remember being a bit surprised when I first saw the definition of a
(von Neumann) computer in an introductory CS course: The definition specified
that the computer be able to modify its program. Having been taught that
self-modifying ("unclean") code is a Bad Thing, I was a bit puzzled about the
emphasis placed on this, until I realized that loading a new program can be
viewed as "modifying the program"...
I have seen the same strategy in the Microsoft Basic in ROM on the TRS-80.
Saves a couple of bytes here and there.
--
--------- You can never have too many ferrets. -----------
gor...@rpi.edu USERF023@RPITSMTS USER...@mts.rpi.edu
>In November of 1982, when I was in high school, I used self-modifying code
>as part of an Applesoft BASIC equation graphing program I wrote
...
>These lines were written into a text file, which was then EXEC-ed, and the
>lines became new program lines, which were then run. Some might not
>consider this truly self-modifying code, since although the Applesoft
>environment was interpretive, I really just executed a shell script
>and re-interpreted on the fly.
I did the same thing when I was in high school, only that my program
was truly self-modifying. This was one of the first non-trivial programs I
ever wrote. It "plotted" a function as a bar-graph, using character "graphics"
on a ZX 80.
In general, how do you write a program that plots a user-supplied function?
Well, in the trivial case of a polynomial, you can simply let the user enter
the polynomial's coefficients, but suppose we want a more general program.
If you have the time, skill and CPU power to do it (and, preferably, a language
more suited for parser writing than BASIC), you build a parser/expression
evaluator into your program. If you haven't (e.g. if you're a 16-year old
novice hacking BASIC on a ZX 80), you force the user to enter the function
into the source code, manually rewriting the program each time the function is
changed. This works reasonably well if you use an interpreted language, but
it's dangerous and decidedly inelegant.
However, by reading the manual I found out how to find the starting address of
the program. So, I put a statement like
50 Y=X
followed by a *lot* of spaces somewhere in the beginning of the program,
searched through the program 'til I found the "Y=" and poked the user-supplied
string on top of the spaces (or any previous function), thus enabling the user
to supply any function the BASIC interpreter could handle, with no overhead
for function evaluation once it was poked into place. This worked well until
you entered a function longer than the space allocated, in which case some
very strange things could happen to the program.
My own first machine, an Acorn Atom (the ZX 80 was borrowed), was also nice for
writing self-modifying BASIC programs. I remember doing so a few times, for
example there was one program where you typed in a function and the program
solved it for you (numerically). Alas, most BASICs don't let you do this (they
keep the program in tokenized form, and they don't tell you where it's kept
anyway), and of course it's impossible to do in a compiled language, unless
you embed a compiler (that compiles the expression to machine code which is
then executed) in your program...
Oh, yes. I'd forgotten that this neat and elementary way of implementing
subroutine calls, then the norm, is now looked upon with horror. But it was
not restricted to the CDC 6000, and probably not even invented by CDC.
Though I cannot remember any other computer that wrote in the opcode as well
as the return address.
This feature could be abused in intellectually satisfying ways, of course.
Consider
TEST EQ B2,B3,SKIPIT SKIP LEADING BLANKS
RJ * REMOVE TEST
Huh? Well, if you don't want to do the test anymore once it fails, this
will do the trick. The RJ will clobber the word with an EQ *+1 which is
essentially a no-op. For instance, the parser skips leading blanks but
not embedded blanks. So you stop skipping blanks after your first non-blank.
It only works for a single line obviously (like, a control card). If you
want to process more lines, you have to restore the instruction ... by
reloading the overlay, of course.
>
>-Mike Moroney
Daan Sandee san...@sun16.scri.fsu.edu
Supercomputer Computations Research Institute
Florida State University, Tallahassee, FL 32306-4052 (904) 644-7045
I did roughly the same thing in high school (ca 1980) with a twist
that makes it more self-modifying. Buried in my program was a loop of
the form
500 FOR X = MINX TO MAXX
510 REM QQQQQQQQQQQQQQQQQQQQQQQ
520 <plot from last to X,Y>
530 NEXT
with the REM padded with chars to make the line maximally long (256
chars (I think) on an Apple 2). The first thing the program did was
scan the memory (by PEEKing) for the comment and store the address of
the REM token in a variable. I then prompted the user for a basic
expression, did a simple parse and POKEd onto line 510 the BASIC
tokens corresponding to "Y = <expression> : REM" (the second REM being
needed to avoid falling into the remaining Q's). The padding was
needed to make sure that I had enough room to put the expression.
Evan Kirshenbaum
HP Laboratories
3500 Deer Creek Road, Building 26U
Palo Alto, CA 94304
I do lots of assembly language programming on VAXen because I write device
drivers. Device drivers under VMS require absolute control of the machine
at the register level, because VMS' kernel environment is a bit strange.
Wonderful, but strange.
I also do assembly language on the 8051 because I'm not convinced that a
HLL can do a good job. Flame me if you like.
>>A second thing is the use of self modifying code. Is there still any use
>>for it (if there ever was)?
> There most certainly was a need for it in a situation where memory was
> critical. Like on the minicomputers of twenty years ago. But I can't see the
> need now.
> If you include overlays, of course (of code, not of data), it's still being
> done on machines without virtual memory. Like any Cray.
Actually, the only time I've used self-modifying code in a deliverable
program was for performance on a PDP-11, not memory space. This was in a
situation where the PDP-11 was generating test data for a piece of hardware
that it controlled for a VAX, which talked to it at 8 MB/s through a
DR780 (wonderful device! Lots of fun to program). The PDP-11 had to be able
to generate data reasonably fast enough to give the hardware a good test.
The goal of the optimization effort was to minimize the number of memory
references in the PDP-11 code, because this is what costs the most time. So
loop control was done by INCrementing the immediate operand:
INC #0
whose operand was plugged in by other code before the loop started (when time
wasn't critical). There were other interesting optimizations like that.
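A toy model of that optimization in Python (not PDP-11 code): the loop counter is the immediate operand word living in the instruction stream, plugged in before the loop and INCremented toward zero.

```python
# Toy model of the PDP-11 trick above: the loop counter *is* the
# immediate operand word inside the instruction stream, so loop
# control costs no separate data-memory reference.
def run_patched_loop(iterations):
    inc_operand = [0]             # the word inside "INC #0"
    inc_operand[0] = -iterations  # plugged in before the loop starts
    steps = 0
    while True:
        steps += 1                # the loop body: generate test data
        inc_operand[0] += 1       # INC #0 bumps its own operand word
        if inc_operand[0] == 0:   # fall through when it reaches zero
            break
    return steps

print(run_patched_loop(5))
```

Counting up from a negative value toward zero is what lets the branch test the same word the INC just modified, with no compare instruction.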
>>
>>Apart from curiosity there is a more serious aspect to my asking.
>>Here in Europe there are some EC funded projects attempting to bring formal
>>(verification/development) methods to the construction of system kernels,
>>(hard) real time and/or safety critical systems. Most assume the use of some
>>high(er) level language and retain e.g. strict data/code separation.
>>To what extent are such assumptions justified?
>
> Some (not all) machines have hardware which makes total separation of code
> and data possible. ...
> For the purpose cited (validating safety-critical systems) it is better to
> insist on hardware protection, i.e. tell the vendor to make sure his O/S
> damn well *uses* the hardware separation of code and data.
These days most processors that I read about have separate instruction and
data caches which makes self-modifying code difficult or impossible. The
modification winds up in the wrong cache, you see, so you have to be
really careful. Of course, the processors that I actually *use* tend to
have only a single cache or no cache at all.
>>What is the hairiest, most beautiful or most awful piece of self modifying code
>>that you've come across?
>>
>>Any takers?
>>
My personal favorite is the RK05 bootstrap for my PDP-8/e. Two words:
0030= 6743 (an I/O instruction that escapes me)
0031= 5031 JMP 31 (jump to itself)
Eventually the disk interface DMAs over the JMP and the code enters the
boot block.
--
===============================================================================
Roger Ivie
35 S 300 W
Logan, Ut. 84321
(801) 752-8633
===============================================================================
>I'm curious about current *real world* use of assembly/machine coding.
>Where is it still done? And on what systems? And why?
The only place I see assembly language being used is to fill in the cracks
between the compiler and the bare metal. It's just not cost effective (yet)
to produce compilers that interface seamlessly with the underlying
hardware. Any EE with a hot soldering iron seems to be able to cobble
together a single board computer with your-favorite processor. But
these guys don't realize that the existence of "a" compiler for their
processor chip doesn't guarantee any compatibility with their home
brewed (or company brewed) board. Such boards have nonstandard I/O
structures, peculiar ways of initializing and of setting up processor
state.
Sometimes you also use it to fill in cracks between the compiler and
your off-the-shelf real time OS (pSOS is a prime example). You have
to write a batch of assembly language routines that take apart the
argument list passed by the high level language and put things in the
right places for calling the OS trap procedure. You get a variant
of this when your processor has special I/O instructions as opposed
to using device registers in the processor address space, or when
the processor supports esoterica you need like mutual exclusion
operations.
There's only one reason I know of for writing a "pure" assembly
language program, and that's when you can't find a good enough compiler
to do the job. This happens most often in older, smaller hardware
configurations. A previous employer investigated the use of Ada
in their embedded weapons systems. They didn't pursue the idea very
far since there wasn't an Ada compiler that would generate code for
the 8 bit micros they were used to using. And it was deemed not to
be cost effective for the company to switch to a more capable
processor, given weapon certification requirements, maintenance
requirements, and such. They use some "C" but lots of the older
stuff is entirely in assembler.
>A second thing is the use of self modifying code. Is there still any use
>for it (if there ever was)?
Personally, I only used it for "glue". I'd write a 'general trap'
subroutine that took a trap number and some other stuff as
arguments, build a 'trap' instruction with the right number embedded
in it, and then jump to it. That's not really a necessity, just
something pushed on me by a weird host environment.
>Here in Europe there are some EC funded projects attempting to bring formal
>(verification/development) methods to the construction of system kernels,
>(hard) real time and/or safety critical systems. Most assume the use of some
>high(er) level language and retain e.g. strict data/code separation.
>To what extent are such assumptions justified?
The main place where things break down is at initialization. That's
where you "write" your execute-only stuff from your disk onto your
memory. This is an incredibly tough area of system assurance and most
verification people finesse it away. Of course, this is exactly the
area that failed a few years back during the first Shuttle launch.
Rick.
sm...@sctc.com Arden Hills, Minnesota
>The whole rationale behind giving Lisp programs and data the same format
>is to make it easy to write self-modifying code. ... An explicit
>call to eval tells Lisp to execute a data object as source code.
I hesitate to call this self-modifying code. This *is* a case of
a program generating a program, which I believe is different. Now,
you *can* do "self modifying code" in LISP using (setf ) of bits
of list structure that you're interpreting.
I don't consider a compile/link/run cycle on Unix to be an example
of "self modifying code" even though the Unix system is producing
an instruction stream dynamically and then jumping to it.
I would say that 'real' self modifying code is only performed
by executable procedures that are not re-entrant. In other words,
procedures that mash the instruction stream depending on the
environment. Two processes can not reliably execute such a
thing simultaneously unless they have completely separate copies.
Typical cases involving LISP's (eval ) don't fall into this
category, since you're usually cons-ing up a fresh list or
referencing one that isn't changing. However, it's easy enough
to dream up an example in LISP that does. Ugly, ugly.
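Such an example is easy to sketch in Python: a tiny interpreter whose program is a mutable list, plus a patch instruction that rewrites a cell of that same list while it is being interpreted. Genuinely self-modifying, unlike consing up a fresh form for eval.

```python
# A tiny interpreter whose program is a mutable list.  The "patch"
# instruction rewrites a cell of the very list being interpreted,
# which is self-modification in the sense drawn above (and it makes
# the program non-reentrant: two runs sharing one list would race).
def run(program):
    out, pc = [], 0
    while pc < len(program):
        ins = program[pc]
        if ins[0] == "emit":
            out.append(ins[1])
        elif ins[0] == "patch":        # rewrite program[i] in place
            program[ins[1]] = ins[2]
        pc += 1
    return out

prog = [["emit", "A"],
        ["patch", 3, ["emit", "C"]],   # clobber the instruction at 3
        ["emit", "B"],
        ["emit", "X"]]                 # never executes as written
print(run(prog))                       # the patched version runs
```

Ugly, as promised: the meaning of the fourth instruction depends on whether the second has run yet.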
BTW, another example of "practical" self modifying code is in
the area of CPU diagnostics and instruction set validation.
A common thing to do is write a program that constructs
instructions, executes them, and saves the results, one after
another.
If the OS development group and the compiler development group have good
enough communication, a good way to solve all these problems is to provide
machine-specific extensions in the compiler. For instance, the processors
that run Multics have special test-and-set instructions (needed for
symmetric multiprocessors), so the Multics PL/I compiler includes a
nonstandard built-in function that invokes it. We never got to the point
where we could get rid of all the assembly code, but during a compiler
redesign for a project to implement a Multics-like OS for a new processor
we added most of what would be necessary (unfortunately, the project was
later cancelled).
Symbolics Lisp Machines are programmed entirely in Lisp using this
technique.
[Regarding self-modifying code:]
>The CDC machines were worse than this - a machine instruction deliberately
>generated self modifying code! A procedure call instruction (RJ) wrote a
>branch instruction to the following instruction at the destination address, and
>then started executing the instruction after this word. There was no return
>instruction, to return from the procedure, you jumped to its starting address,
>then executed the modified instruction, which was a branch to the instruction
>following the procedure call. Because of this method, you couldn't use this for
>recursion. The machine also had no stack as we know it.
The PDP-8 was similar. The procedure call instruction (JMS) wrote the
address of the following instruction into the destination address, and then
jumped to the instruction after it. To return, you executed an indirect
jump through the starting address.
In both cases, you can implement recursion by pushing the contents of the
starting address onto a stack as soon as you enter the procedure, and
popping it off just before the return. Even though the machines don't have
real stacks you can implement them in software.
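The JMS-style linkage and the software-stack workaround can be sketched in Python. This is a hypothetical model (make_machine, jms, and friends are invented names), not PDP-8 code: the call instruction plants the return address in the callee's first word, and a re-entrant routine must save that word on a software stack before any recursive call can clobber it.

```python
# Sketch of PDP-8 style JMS linkage (hypothetical model, not real PDP-8 code).
# JMS stores the return address in the first word of the callee, then jumps
# past it. Recursion works only if the callee saves that word on a software
# stack on entry and restores it just before returning.

def make_machine():
    return {"mem": {}, "stack": []}

def jms(m, callee_entry, return_addr):
    # JMS: plant the return address in the callee's first word ...
    m["mem"][callee_entry] = return_addr
    # ... then transfer control to callee_entry + 1 (modelled as returning it).
    return callee_entry + 1

def enter_recursive(m, callee_entry):
    # First thing a re-entrant routine must do: push the stored return
    # address onto a software stack, freeing the link word for reuse.
    m["stack"].append(m["mem"][callee_entry])

def leave_recursive(m, callee_entry):
    # Restore the saved return address and "jump indirect" through it.
    m["mem"][callee_entry] = m["stack"].pop()
    return m["mem"][callee_entry]
```

A nested call then works: each entry pushes the link word before the inner JMS overwrites it, and each exit pops it back.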
--
Barry Margolin, Thinking Machines Corp.
bar...@think.com
{uunet,harvard}!think!barmar
The GEODOS interface that (I believe) is bundled with the PS/1 is
a noteworthy example of the value of .ASM in this age of 386
machines. (Ever used Win3 much on a 286 or below?)
There is also a now-legendary, unreleased MS-DOS database interpreter by
a Boring land company, with legendary speed, that was written in .ASM
by two people, belying somewhat the trend to higher and higher
languages with teams of more and more people.
BTW, in MS-DOS, anyone considering doing serious .ASM work _must_
get the book "The Zen of Assembly Language" by Abrash. He points
out many of the real tricks of getting maximum performance on
the somewhat quirky, quasi-documented architecture of the IBM PC
and its brood.
Tom Rombouts Torrance 'Tater to...@ashtate.A-T.com V:(213)538-7108
Something like this was a standard optimization in the Ritchie PDP-11 C
compiler, and a note in the source said it came from Bliss-11. What it
did was to change this:
op $n,n(r)
into this:
op (pc),n(r)
so that the same copy of the constant N served for both operands. There's no reason an
assembler couldn't automatically do the same thing with the first word of
the next instruction, though in practice the right values wouldn't occur
very often.
Back in the 1950s, before the advent of indirect addressing and index
registers, the only way to process arrays was to patch the load and store
instructions. The ability to do this was seen as one of the most important
advantages of Von Neumann computers, since it allowed for a much simpler
design than a Harvard architecture, which at the time meant a design that
was more likely to work.
--
John R. Levine, IECC, POB 349, Cambridge MA 02238, +1 617 864 9650
jo...@iecc.cambridge.ma.us, {ima|spdcc|world}!iecc!johnl
" #(ps,#(rs))' " - L. P. Deutsch and C. N. Mooers
1) On an 8K DDP-124 for a fighter flight trainer, I resorted to storing
flag bits in the address field of instructions that didn't use it,
e.g. CRA - clear accumulator.
2) On the Redifon R2000, the best machine exerciser was a Worm program that
replicated itself and cleared out the prior version. It was an odd
number of instructions so eventually all the operations would be
exercised in all locations. We also had a program called Early-Bird (!)
that would analyse a dead machine on which Worm had crashed.
3) The classic one instruction PDP-11 program that ran backwards!
Put
MOV -(PC),-(PC)
in the highest memory location and watch that puppy run. We tried this
on an 11/05. I believe it had trouble with I/D space -11s like the
11/45.
Rob Spray 3000 Waterview Parkway 214/497-4110 (office)
Software Quality Manager PO Box 833851 MS QAE 214/497-4441 (fax)
CONVEX Computer Corp. Richardson TX 75083-3851 USA ...uunet!convex!spray
"Is this real, Ben? Or just some strange and twisted dream?" sp...@convex.com
The 6809 system I have at home is the Tandy Color Computer, with a
BASIC interpreter by Microsoft. It appears that whoever wrote it
used some neat tricks to save space and execution time. For
instance:
SPEAKON LDA #$34 SOME VALUE TO TURN ON SPEAKER
FCB $HH OPCODE FOR CMPA
SPEAKOF LDA #$43 SOME VALUE TO TURN OFF SPEAKER
STA $FF23 TOGGLE SPEAKER BIT
RTS
[Note: This is just sample code - but it should get the point across]
Calling SPEAKON, the CPU would execute
LDA #$34 SOME VALUE TO TURN ON SPEAKER
CMPA $8643 COMPARE REG. A TO MEMORY LOCATION $8643
WHICH IS ACTUALLY THE BYTES OF LDA #$43
and then continue with the rest of the code. It saves space (okay,
one byte over a small branch), but I think it takes the same
amount of time.
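The skip trick above can be sketched with a toy interpreter. This is a hypothetical Python model (PROGRAM, SKIP2, and run are invented names), not 6809 code: a one-byte "opcode" planted between two entry points swallows the following instruction as its operand, so both entries share the same STA/RTS tail.

```python
# Hypothetical model of the 6809 "FCB a CMPA opcode" trick: entering one
# byte earlier turns the next whole instruction into a harmless operand,
# so two entry points share a single store-and-return tail.

# A tiny instruction list of (mnemonic, operand) pairs. "SKIP2" models an
# opcode whose only effect here is to consume the next instruction.
PROGRAM = [
    ("LDA", 0x34),    # 0: SPEAKON entry - value that turns the speaker on
    ("SKIP2", None),  # 1: the planted compare opcode; swallows instr 2
    ("LDA", 0x43),    # 2: SPEAKOF entry - value that turns the speaker off
    ("STA", "FF23"),  # 3: shared tail - write A to the speaker port
]

def run(entry):
    a, store = None, {}
    pc = entry
    while pc < len(PROGRAM):
        op, arg = PROGRAM[pc]
        if op == "LDA":
            a = arg
            pc += 1
        elif op == "SKIP2":
            pc += 2       # the compare consumes the next instruction whole
        elif op == "STA":
            store[arg] = a
            pc += 1
    return store["FF23"]
```

Entering at 0 stores $34; entering at 2 stores $43; either way the same tail executes.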
I agree that the 6809 was indeed a very nice 8/16 bit CPU (two
accumulators, each 8 bit, which could be used as one 16 bit
accumulator).
-spc (sigh ... the good ol' days)
>My personal favorite is the RK05 bootstrap for my PDP-8/e. Two words:
>
> 0030= 6743 (an I/O instruction that escapes me)
> 0031= 5031 JMP 31 (jump to itself)
>
>Eventually the disk interface DMAs over the JMP and the code enters the
>boot block.
My favourite disk hack was a CCW chain which I cribbed from a
boot block for a Univac 9300, whose 8414 disks were clones of the
IBM 2314, right down to the channel programming. It would read the
volume label, which was always at cylinder 0, head 0, record 3, into
memory, then seek to the VTOC using the address in the volume label.
However, the seek command needed a couple of bytes of zeros preceding
the actual cylinder/head address, and this area contained other data
in the volume label. Rather than go back to the CPU, the chain just
issued a couple of one-byte reads from an area which was guaranteed
to contain a zero, into the bytes to be cleared. The next CCW could
then do a seek to the VTOC, and the next one would do a multi-track
search for the file ID, followed by a read of the desired file's
format 1 label. It was FAST!
My second favourite hack, which I could key in from the front
panel, was to issue a CCW chain that would seek to cylinder zero,
seek to cylinder 202 (the highest on the drive), then transfer in
channel (TIC, an unconditional branch) back to the first seek.
The drive would start to walk out of the room, and hitting the
stop button on the processor would have no effect because the
loop was in the disk controller, not the CPU. Nothing short of
a system reset could stop it.
Charli...@mindlink.UUCP
For every vision there is an equal and opposite revision.
--
_
Kevin D. Quitt demott!kdq k...@demott.com
DeMott Electronics Co. 14707 Keswick St. Van Nuys, CA 91405-1266
VOICE (818) 988-4975 FAX (818) 997-1190 MODEM (818) 997-4496 PEP last
I have some assembly code to calculate Pi to about 10,000 digits on a 6809
system (written in assembly, of course). The author used self-modifying code
to increase the speed of the program (putting several values into the
immediate field of some instructions, and actually changing the instruction
in another area).
-spc (Me? Use self-modifying code? Only to piss off some professors)
In a Lisp system, where the "shell" is a Lisp interpreter, the distinction
between shell script and program goes away completely.
The whole rationale behind giving Lisp programs and data the same format
is to make it easy to write self-modifying code. A Lisp macro is a piece
of code which is run at "compile time" to generate the code to be compiled.
(In interpreted code this can happen on the fly at run time.) An explicit
call to eval tells Lisp to execute a data object as source code.
Perhaps the real difference between the "macro" and "micro" levels is the
existence of a hierarchy. In the shell script example the script compiles C code
but it doesn't change itself. It is, in fact, qualitatively the same
situation if the user types the commands separately, rather than executing
them together via the shell script. Likewise, an AI program in Lisp may
rewrite one of its subroutines over and over as the requirements of the
problem change, but the master program never changes itself.
Some assembly-level self-modification is relatively innocuous. Modifying
a single instruction for indexing purposes doesn't really feel like
SELF-modification, since the pattern of execution is not directly affected.
There just isn't much difference between changing a value in the body of
the code and changing a value in an index register, so long as the effect
is the same. The real shock in Mel's code was when the opcode itself was
changed, thus affecting the logic of the program.
It also doesn't seem to really count if a program overwrites its
initialization section with data in order to save space. After all, the
initializer was never going to be run again anyway.
Does anyone have any examples of code that actually modifies its own
logic, as opposed to modifying the logic of a subordinate routine? Any
assembly-level modification more significant than just faking things
like indexing and subroutine calls?
A couple of closing notes: In Core War self-modifying programs of various
kinds are quite common. Since the Life automaton can perform computations,
Life patterns can be considered a form of massively parallel self-modifying
code. The mind is self-modifying through learning. The biosphere is
self-modifying through natural selection.
--
Louis Howell
"But when we got into the street I viddied that thinking is for the gloopy
ones and that the oomny ones use like inspiration and what Bog sends."
We are coding a good deal of the graphics and system software
for Pixel-Planes, a very-high-end i860 + custom SIMD processor graphics engine,
in 860 assembler. The reason is that the code quality produced by currently
available C compilers for the 860 is extremely poor, and we cannot achieve
the machine's performance goals writing in C.
--
Jon Leech (le...@cs.unc.edu) __@/
``Those what cannot remedy the past can pretend to repeal it."
- Attributed to Santa Ana by Howland Owl
On some really bad processors like ILLIAC I, you needed self-modifying
code to fake things like procedure calls and indexed addressing. I'd call
that "micro-self-modification" since it is being used at the instruction
level to perform operations that are missing from the machine's instruction
set. This should be obsolete!
On the other hand, "macro-self-modification" is surprisingly common. Here,
the units modified are not single instructions but rather entire procedures
or subprograms.
Consider the following fragment of a UNIX shell script:
phase1 <infile >tempfile.c
cc -o tempfile tempfile.c
tempfile <infile >outfile
Here we have a program that generates C code, which we then compile and
run to process some data. The alternative would probably be to do the
processing interpretively, a far slower process. If you insist that a
shell script is somehow different from a program, then you can pretend
that this isn't self-modifying code. If you admit that shell scripts are
just a higher level programming language, then the above program must be
considered to be a self-modifying program where executing the program
creates one of its subprograms before calling on it.
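A Python analogue of that shell script (all names here are invented for illustration) makes the point compactly: a "phase1" step emits source code specialised to the input, which is then compiled and run, so the program creates one of its own subprograms at run time.

```python
# Hypothetical sketch of macro-self-modification: generate source code
# tailored to the input, compile it, then call it - rather than
# interpreting a field description on every record.

def phase1(fields):
    # Emit a record-unpacking function specialised to this field layout.
    lines = ["def unpack(record):", "    return {"]
    for name, start, end in fields:
        lines.append(f"        {name!r}: record[{start}:{end}],")
    lines.append("    }")
    return "\n".join(lines)

def compile_and_load(source):
    # The analogue of "cc -o tempfile tempfile.c; tempfile":
    # compile the generated source and hand back the new subprogram.
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["unpack"]

unpack = compile_and_load(phase1([("name", 0, 8), ("dept", 8, 12)]))
```

The generated unpack runs at full speed on every record; the slow, general field interpreter ran exactly once, as a code generator.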
Another example along the same lines is the old QED text editor for the
GCOS system, written by Ritchie, Thompson, and Kernighan, of C and UNIX
fame. This had a regular expression syntax more elaborate than that of
its descendant, ed (and ex and vi) on UNIX. QED regular expressions
could be constructed to match any context free grammar (I think it had
problems with left-recursive rules, though). To make pattern searches
fast, rumor had it that QED compiled regular expressions to machine code
before starting a search. Thus, the QED text editor was rumored to be
self modifying, creating new code on the fly to meet the needs of the
interactive user.
Doug Jones
jo...@cs.uiowa.edu
| IMP_MOVE_%A4_TO_%C4 # word count for move
| CMIS_MOVE_SLICES_%P1_TO_%P2_STRIDE_%C4
| IMP_SUBTRACT_%I3_FROM_%P1 # reset to base address
| IMP_SUBTRACT_%I3_FROM_%P2 # reset to base address
| IMP_MOVE_%A3_TO_%C4 # Toggle backward/forward flag
| IMP_DEC_%C4_JUMP_Z
| offset xend
| IMP_LOAD_%C4
| 1
|xend: IMP_MOVE_%C4_TO_%A3
| IMP_DEC_%C2_JUMP_NZ
| offset yloop
Looks like real assembly code, right? This is on the floating-point chips:
| #load dest
| IMP_MOVE_%A3_TO_%P1
| CMIS_FPU_STATIC
| wtl3164_static_inst
| CMIS_FPU_MWB_%P1_STRIDE_FRB_DYN
| 24
| wtl3164_dynamic_inst_up count=24,xcnt=LOAD_MSH_REG,efaddr=1,baddr_inc=0
| #load source2 and do the mult-add
| IMP_MOVE_%A2_TO_%P1
| CMIS_FPU_STATIC
| wtl3164_static_inst func=FMUL_FADD_CHAINED,aain=0,abin=0,main=X,mbin=BADD
| CMIS_FPU_MWB_%P1_STRIDE_FRB_DYN
| 26
| wtl3164_dynamic_inst xcnt=LOAD_MSH_X,baddr=31
| wtl3164_dynamic_inst xcnt=LOAD_MSH_X,baddr=31
| wtl3164_dynamic_inst xcnt=LOAD_MSH_X,baddr=31,aaddr=1,cdaddr=1
| wtl3164_dynamic_inst xcnt=LOAD_MSH_X,baddr=31,aaddr=2,cdaddr=2
| (22 similar lines follow)
With code like this, a major application improved from 0.9 to 6.3 gigaflops.
Which is well worth a couple of months' programming effort ...
>A second thing is the use of self modifying code. Is there still any use
>for it (if there ever was)?
There most certainly was a need for it in a situation where memory was
critical. Like on the minicomputers of twenty years ago. But I can't see the
need now.
If you include overlays, of course (of code, not of data), it's still being
done on machines without virtual memory. Like any Cray.
>
>Apart from curiosity there is a more serious aspect to my asking.
>Here in Europe there are some EC funded projects attempting to bring formal
>(verification/development) methods to the construction of system kernels,
>(hard) real time and/or safety critical systems. Most assume the use of some
>high(er) level language and retain e.g. strict data/code separaration.
>To what extend are such assumptions justified?
Some (not all) machines have hardware which makes total separation of code
and data possible. But even if it *is* possible, vendor software often
doesn't use it. In general, a user program has no protection against overwriting
itself, although some high-level languages have strict runtime checks (at
great cost to performance) which make it very difficult to clobber your own
code.
For the purpose cited (validating safety-critical systems) it is better to
insist on hardware protection, i.e. tell the vendor to make sure his O/S
damn well *uses* the hardware separation of code and data.
You'd be better off discussing this on comp.arch - if you can manage to
convince people there to end the religious war (whether MS-DOS is better
than Unix, or something).
>The question is not so much whether using such assumptions can be justified
>at some stage during system development but rather whether or not such
>assumptions can be maintained at every stage.
>
>Ok, now for the FUN part!
>Let's chronicle the deeds of the past masters(*)---or indeed, present masters---
>of machine coding. Who has any stories?
>What is the hairiest, most beautiful or most awful piece of self modifying code
>that you've come accross?
>
>Any takers?
>
I've read and written plenty of self-modifying code (long ago) but I don't
really know of an example worth preserving for posterity.
The CDC 6000 diagnostic programs used to distinguish between 6400-type and
6600-type CPUs (which needed different diagnostics) by
SA1 JUMP6400 LOAD JUMP TO 6400 CODE
BX6 X1
SA6 *+1 STORE AS NEXT INSTRUCTION
+ EQ P6600 JUMP - IT'S A 6600
JUMP6400 EQ P6400 JUMP - IT'S A 6400
This is not the original code; I'm making it up as I go along; it's five years
or so since I last programmed in CP Compass. It clobbers the next instruction
word to be executed to change a jump to 6600 into a jump to 6400. On a 6600,
which had an instruction stack, this would fail, and the code would jump to
P6600 anyway.
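The reason this worked as a CPU test can be sketched with a toy interpreter. This is a hypothetical Python model (detect, STORE_NEXT, and the labels are invented): the program overwrites its own next instruction word, so a CPU that fetches from memory sees the new jump, while one holding a stale prefetched copy executes the old one.

```python
# Hypothetical model of the CDC 6400/6600 test: store a new jump over the
# next instruction word, then execute it. Without an instruction stack the
# modified word runs; with one, the stale prefetched copy runs instead.

def detect(prefetch):
    program = [
        ("STORE_NEXT", ("JMP", "P6400")),  # write a jump-to-6400 into slot 1
        ("JMP", "P6600"),                  # the word about to be clobbered
    ]
    # A CPU with an instruction stack holds stale copies of the words:
    fetched = list(program) if prefetch else program
    pc = 0
    while True:
        op, arg = fetched[pc]
        if op == "STORE_NEXT":
            program[pc + 1] = arg   # memory changes; a stale copy doesn't
            pc += 1
        elif op == "JMP":
            return arg
```

The same divergence is exactly why the Fortran secondary-entry-point story later in this thread went wrong when lookahead hardware was added.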
>--
># Rob Gerth #
># uucp:ro...@win.tue.nl | Eindhoven University of Technology #
># | 5600 MB Eindhoven, The Netherlands #
Daan Sandee san...@sun16.scri.fsu.edu
}I'm curious about current *real world* use of assembly/machine coding.
}Where is it still done? And on what systems? And why?
Some of the Space Shuttle's code (all of it that I ever worked with) is
written in AP-101 assembler.
I use an assembly routine on my Osborne-1 to modify the disk drives' seek
speed on startup (makes them virtually silent).
}Ok, now for the FUN part!
}Let's chronicle the deeds of the past masters(*)---or indeed, present masters---
}of machine coding. Who has any stories?
Well ... I don't claim to be a master, but I once got a Data General
Eclipse C-330 to do things Data General said it couldn't do.
I had to implement what amounted to an "on control-C goto" capability for
a real-time multitasking system running on the Eclipse under RDOS. (In the
end it worked more like "setjump/longjump" with the longjump triggered by
a ^C interrupt). I asked the local DG guru and was told: "It can't be
done." I asked DG and was told: "It can't be done. We don't want you to
try. If you try anyway, we won't help and we don't want to hear about
it." I kid you not.
After much reading through obscure manuals, I wrote an assembler routine
that, when called with the setjump equivalent, would stash the current
state of the machine in a data structure for future use and enable ^C
trapping to itself. When triggered by a ^C, it would disable interrupts,
do clever things to the task list to shut everything down gracefully,
re-enable interrupts and restore the state of the machine from the data
structure, including the PC.
That and the rest of the software (written in DG FORTRAN 5) eventually
made it into NASA's COSMIC library -- one of my contributions to the
Shuttle program.
Data General still doesn't know about it. (-:{
My other major assembler project is the one that got me my job here. That
was our old CAT-1 automatic teller machine whose final demise I chronicled
in this group last year. They were programmed entirely in 8085 assembler
with a 64K address space, bank switched for a total 128K of memory, and no
operating system at all. On that humble platform we implemented a three
language customer interface, including Chinese, and foreground/background
real time multi-tasking. In retrospect, the mind boggles. (I can't take
credit for most of it. It was nearly 10 years old when I signed on to
help maintain the software. Hats off to my former boss, Cynthia
Aronson-Turner, who implemented most of it and taught me all about it).
--
The Polymath (aka: Jerry Hollombe, M.A., CDP, aka: holl...@ttidca.tti.com)
Head Robot Wrangler at Citicorp(+)TTI Illegitimis non
3100 Ocean Park Blvd. (213) 450-9111, x2483 Carborundum
Santa Monica, CA 90405 {csun | philabs | psivax}!ttidca!hollombe
I've already given an example how you can abuse this feature to change your
code. Here is a hairy story about the code generated by the Fortran compiler
for secondary entry points.
(This thread was started by someone interested in verifiable code. I can
hear them groaning, Fortran?? Secondary entry points ???)
SUBROUTINE MAINEPT
I=0
GO TO 100
ENTRY SECEPT
I=1
100 CONTINUE
RETURN
END
resulted in,
MAINEPT EQ *+1S17
SX6 B0
SA6 I
EQ L100
SECEPT EQ *+1S17
SA1 SECEPT
BX6 X1
SA6 MAINEPT
SX6 B1
SA6 I
L100 BSS
EQ MAINEPT
That is, the secondary entry point took the return address which had been
provided by the hardware, and plugged it into the main entry point. The
routine then always returned through the main entry point.
This worked OK on CDC 6000/Cyber 170 systems, with or without an instruction
stack, because the jump back to the main entry point (the last instruction
in the code above) voided the instruction stack, as all jumps did.
Then some genius added hardware to do instruction lookahead on conditional
branches. That is, in the code above, when entering at SECEPT, the instruction
lookahead detected the EQ coming up and read the instruction word located at
the branch address, MAINEPT, *before* this address had been written into by
the self-modifying code. Result : the routine jumped back to whatever code
had last called the *main* entry point, with disastrous results.
This hardware was actually installed in the field together with a fix in the
compiler not to mess around with entry point code, but that did mean you had
to recompile all your Fortran code, at least if it used secondary entry points
... A minor problem? Try explaining that to the customer, who is running a
data services center, and was promised that the new system would be 100%
compatible (but much faster) than the old system. And she had a lot of
paying customers that were running applications supplied by third parties
as binaries ...
There are better ways to determine which kind of CPU is in use. The
8086 (and 8088) differ in the way the 'push SP' instruction functions; on the
8086 and 8088, the value of SP is pushed after it is decremented, whereas
the 286 (and 386 and 486) push the value before decrementing it. With
this in mind, one can construct a test for 8086/88 vs 286/386/486.
Once one determines that a 286/386/486 is in use, one can look at
the MSW; a 286 initializes MSW to 0xFFF0; the 386/486 initializes this to
0x0000.
Once one has determined that a 386/486 is in use, the CPU type and
revision level (found in the DX register after reset) can be
used to find which CPU is in use.
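The PUSH SP difference can be modelled in a few lines. This is a hedged Python sketch (push_sp and looks_like_8086 are invented names), not x86 code: the classic sequence is PUSH SP / POP AX / compare AX with SP, and the two values match only on a 286 or later.

```python
# Hypothetical model of the PUSH SP CPU test. On the 8086/8088 the value
# pushed is SP *after* the decrement; on the 286 and later it is the value
# from *before* the decrement.

def push_sp(sp, is_8086):
    sp_after = sp - 2                       # a word push decrements SP by 2
    pushed = sp_after if is_8086 else sp    # which value hits the stack?
    return sp_after, pushed

def looks_like_8086(is_8086):
    # PUSH SP / POP AX / CMP AX,SP - equal only on 286/386/486.
    sp = 0x1000
    sp_after, pushed = push_sp(sp, is_8086)
    ax = pushed                 # POP AX restores SP as it reads the word
    sp_restored = sp_after + 2
    return ax != sp_restored
```

On the 8086 the popped value is two less than the restored SP; on later CPUs they are equal.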
--
* Dana H. Myers KK6JQ | Views expressed here are *
* (213) 337-5136 | mine and do not necessarily *
* da...@locus.com | reflect those of my employer *
This technique has been considerably refined. The Turbo C and Microsoft C
floating-point emulation works by overwriting(+) the first two bytes of each
coprocessor instruction with an INT 3xh instruction. At least the Turbo C
emulation library patches the INT instruction back to the original coprocessor
instruction if a coprocessor is present and allowed to be used. Thus there
is zero overhead for supporting emulation when the section of code is
re-executed.
(+) actually, this is accomplished by outputting linker directives such that
the instruction is unchanged when linked with the coprocessor-only library
and turned into the appropriate INT instruction when linked with the
emulation library.
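The patch-back scheme can be sketched with a tiny instruction list. This is a hypothetical Python model (TRAP_FMUL, make_program, and run are invented names), not the actual Turbo C mechanism: a coprocessor opcode is assembled as a trap, and the first trap either emulates the operation or patches the real opcode back over itself, so re-execution costs nothing.

```python
# Hypothetical model of one-shot patch-back emulation: the trap handler
# either performs the operation in software, or rewrites the trap into
# the real instruction and re-executes it.

def make_program():
    # One "coprocessor" instruction, assembled as a trap: multiply 3.0*4.0.
    return [("TRAP_FMUL", (3.0, 4.0))]

def run(program, have_coprocessor):
    results = []
    pc = 0
    while pc < len(program):
        op, args = program[pc]
        if op == "TRAP_FMUL":
            if have_coprocessor:
                # Patch the real instruction back over the trap, then
                # re-execute it; later passes hit no trap at all.
                program[pc] = ("FMUL", args)
                continue
            results.append(args[0] * args[1])   # software emulation
            pc += 1
        elif op == "FMUL":
            results.append(args[0] * args[1])   # "hardware" multiply
            pc += 1
    return results, program
```

Either path computes the same result; the difference is that with a coprocessor the program text ends up permanently patched.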
--
{backbone}!cs.cmu.edu!ralf ARPA: RA...@CS.CMU.EDU FIDO: Ralf Brown 1:129/3.1
BITnet: RALF%CS.CMU.EDU@CMUCCVMA AT&Tnet: (412)268-3053 (school) FAX: ask
DISCLAIMER? Did | It isn't what we don't know that gives us trouble, it's
I claim something?| what we know that ain't so. --Will Rogers
MOV 0 1
(Sorry, I don't remember the syntax of PDP-11 assembler.) The effect
was to move a copy of whatever was pointed to by the program counter
(i.e. *this* instruction!) to the next address... which was, of
course, then executed, and which made another copy, etc. It only
stopped growing when it hit the top of its address space and
*wrapped*.
--
Scott Deerwester | Internet: sc...@tira.uchicago.edu | ~{P;N,5B~}
Center for Information and | Phone: 312-702-6948 |
Language Studies | 1100 E. 57th, CILS |
University of Chicago | Chicago, IL 60637 |
>Does anyone have any examples of code that actually modifies its own
>logic, as opposed to modifying the logic of a subordinate routine? Any
>assembly-level modification more significant than just faking things
>like indexing and subroutine calls?
Obviously there is nothing that I can post here, for security reasons,
but having written a couple of protection systems myself I can assure you that
it does happen. The prime examples are disk loaders for the Commodore 1541 disk
drive, which have been known to contain decryption routines that encode the
original decryption routine so that it then becomes an integral part of the
loader; functions that, when called, push values onto the stack in order to
return to somewhere else, to a function that adjusts the stack register;
code that's actually stored in the file allocation table.
Don't believe me?? Try hacking some disk games written for the C64
and you'll see exactly what I mean!
--
..!mcsun!slxsys!cdh || c...@specialix.co.uk
Disclaimer:- I didn't write this: It does not represent my own opinions,
those of my employer, anyone I know or anyone that knows me!
#include <[.signature of your choice]>
> What is the hairiest, most beautiful or most awful piece of self modifying
> code that you've come accross?
OK, you've opened the floodgate. Here is my favorite hack of all time:
a freelist that you searched by executing it. It was a system I built
while working on my Ph.D. in the early 1960s. In those days, storage
allocation and garbage collection were primitive and there was no
literature. Everyone had to reinvent everything for themselves. I was
using an IBM 7040. 36 bit words divided into 3 bit op code, 15 bit
"decrement" field, 3 bit "tag", and 15 bit address.* My blocks of
free storage were minimum two words long and the top two words were
arranged thus:
First word: Decrement: size of this block
Address: address of next free block
Second word: Address: address to go to when a block of proper size was found
However, the first word was not just data, it was a form of index test
and branch instruction. Executed, it would test its decrement field
against the index register identified by the tag and, if <=, fall
through to the next instruction, otherwise jump to the Address. The
second word was a subroutine transfer which had the effect of putting
its own address into a specified register and transferring to the given
address.
So, to search the free list, you put the size of block you needed into a
selected index register and then jumped to the top of the first block on
the free list. Each block would test its own size against the size
needed and either skip to the succeeding block if too small or jump to a
"got it" address (sticking its own address into a second register) if big
enough. If you fell out of the bottom of the list, the final address
was the start of the garbage collection routine.
Simple, elegant, and appallingly obscure.
________________________
*Lisp historians will recall that with the leftmost 15bit address
called the decrement and the rightmost 15bits the address, one had
"contents of address register" (CAR) and "contents of decrement
register" (CDR).
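The executable freelist can be sketched in Python. This is a hypothetical model (make_freelist and search are invented names), not 7040 code: each free block's first word behaves as a compare-and-branch (too small: fall through to the next block; big enough: stop, leaving the block's own address), and falling off the end of the chain lands in the garbage collector.

```python
# Hypothetical model of a freelist you search by "executing" it. Each
# block's first word is a test-and-branch: its decrement field holds the
# block's size and its address field points at the next free block.

GC = "garbage_collect"   # final branch target if no block fits

def make_freelist(blocks):
    # blocks: {address: size}; chain them in address order, ending at GC.
    mem = {}
    addrs = sorted(blocks)
    for i, addr in enumerate(addrs):
        nxt = addrs[i + 1] if i + 1 < len(addrs) else GC
        mem[addr] = (blocks[addr], nxt)   # first word: (size, next block)
    return mem, (addrs[0] if addrs else GC)

def search(mem, entry, wanted):
    # "Jump to the top of the first block" and let the blocks do the work.
    pc = entry
    while pc != GC:
        size, nxt = mem[pc]      # execute the block's first word
        if size >= wanted:
            return pc            # second word: transfer out, own address in hand
        pc = nxt                 # too small: branch to the next block
    return GC                    # fell off the end: collect garbage
```

The allocator itself has no loop; the "loop" is the chain of branch instructions living in the free storage.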
--
--Dick Wexelblat (r...@ida.org) 703 845 6601
April is the cruelest month, breeding lilacs out of the dead earth,
mixing memory and desire...
The first concerns a C compiler that output only absolute code, when I needed
to run the programs in an arbitrary address space. I wrote a post-processor
that put a wrapper around the program (using link map information) that
caused the program to modify ALL absolute memory references within itself
to properly run wherever it needed to. The DDT debugger in CP/M used this
technique to move itself out of the TPA to make room for the test program.
The second example involved an operating system call that required an in-line
operand. I wanted to make this function callable from C, and with a variable
parameter. The only way to do it was to build the system call on the stack,
along with the parameter and a return instruction, then call it.
Finally, I had a system of trap handlers that needed to chain together --
sort of like stealing interrupts, then passing them along. When I installed
a new handler, it needed to keep the chain intact. I could always save the
address of the old handler in a memory location, then jump indirect through
that location, but for some reason I found it easier to build a jump to the
old address and jump to that instruction. I think it was because my
instruction set did not allow me to jump indirect through memory and I didn't
have any registers free at the time of the jump.
Several obvious problems with the technique. First, it is very difficult to
follow someone else's code that does this kind of stuff. It took me hours to
understand the DDT trick -- and the effort undoubtedly warped me for life.
Since then, my code has sent seasoned programmers screaming with horror.
Second, modern processors with pipelines and caches absolutely HATE this kind
of trick. Code which works flawlessly on a 68000, gets shaky on a 68010, has
some potential problems on a 68020, breaks regularly on a 68030, and absolutely
refuses to work on a 68040 as the instruction cache gets larger and larger.
Tricks to squeeze cycles of performance out of a one-lung CPU actually cost
performance on a larger machine due to the need to flush the cache regularly.
Ah yes. Self-modifying BASIC programs.
The attendance records at my high school were kept on Commodore PET computers.
A single unit with keyboard, screen, and cassette drive. The teacher who
wrote the program refused to learn how to manipulate data files on the
cassette, so he kept the information in DATA statements in the program, and
saving the data involved resaving the program. Of course, that meant that
the data had to be entered into the program so it could be saved. This is the
fun part.
The Commodore PET's BASIC was screen-oriented. To edit a program, you would
list the lines of interest, move up into them, edit a line, and hit return.
The machine PULLED THE NEW LINE FROM SCREEN MEMORY! (early PETs had a problem
in that a single line of BASIC could take up two screen lines and if you
filled a line to the very edge of the screen, it would tack the next source
line onto the end of the current one, but that's a different story).
The secretaries didn't know about DATA statements and didn't have to. The
program worked like this:
- It prompted the secretaries for the information
- Turned the video off
- Built data statements in screen memory followed by the command RUN
- Poked a number of RETURNS into the keyboard typeahead buffer
- Stopped running!
- The machine read the RETURNs from the buffer and used them to read the
DATA lines into the program and start the program going again
- The program turned the video back on and did it all again.
Back in high school I worked for a company which had a contract to convert
educational software to run on the Apple II. The original programs ran on
electromechanical microfilm based machines -- Each question had five possible
answers. The student would press a button. This would advance the film to
a different position, and the machine would either tell you you were right
or give you a hint and let you try again.
This required writing endless streams of cursor positioning commands, prints,
and simple minded if-then-goto logic. This got old really fast.
Eventually one of the more hackish people came up with this thing called
codegen, which was an applesoft program. When you ran codegen, you
were presented with a blank screen. Various escape sequences allowed you
to design the screens interactively -- You could define boxes, insert text,
create decision points, etc. As you did this, codegen would construct
tokenized applesoft on the fly and append it to the end of itself. When you
were finished, usually when you ran out of memory, you would delete the code
generator and save what was left, which was the finished program. Amazing.
>A second thing is the use of self modifying code. Is there still any use
>for it (if there ever was)?
[...]
>Ok, now for the FUN part!
>Let's chronicle the deeds of the past masters(*)---or indeed, present masters---
>of machine coding. Who has any stories?
>What is the hairiest, most beautiful or most awful piece of self modifying code
>that you've come accross?
>Any takers?
Back in the good old (well, not _that_ old) days of the ZX Spectrum I remember
writing a fast line routine (on a bit-mapped display). The plotting loop did
not plot each point on the basis of x,y coords, as this would be much too
slow - instead it kept a 'current screen address' and 'current bit mask',
and manipulated these - i.e. to go right, rotate the mask; if on a byte
boundary, increase the screen address. So, in order to be able to re-use the
code for the loop, the routine would check if the line went left or right, and
place an RLCA or RRCA at the proper place in the loop. Remember: the
alternative was several copies of the loop, and memory was in short supply in
those days. A test inside the loop for direction would have slowed the routine
down unacceptably.
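A minimal Python sketch of the idea (hypothetical addresses, and assuming the Spectrum convention that bit 7 is the leftmost pixel of a byte): the direction is decided once, outside the loop, just as the patched RLCA/RRCA was.

```python
def make_stepper(direction):
    """Build the inner-loop step for one direction, standing in for
    patching RLCA or RRCA into the loop: the direction is decided
    once, outside the loop, never per pixel."""
    if direction == "right":
        def step(addr, mask):
            mask = ((mask >> 1) | (mask << 7)) & 0xFF   # RRCA
            if mask == 0x80:         # wrapped: crossed a byte boundary
                addr += 1
            return addr, mask
    else:
        def step(addr, mask):
            mask = ((mask << 1) | (mask >> 7)) & 0xFF   # RLCA
            if mask == 0x01:
                addr -= 1
            return addr, mask
    return step

step = make_stepper("right")
assert step(100, 0x01) == (101, 0x80)   # rightmost pixel -> next byte
assert step(100, 0x80) == (100, 0x40)   # still inside the same byte
```

The point of the patch is exactly that the `if direction` happens once per line, not once per pixel.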
Another example from the Spectrum was to shift a register a number of
bits. For example the following
ADD A,A
LD (LOC+1),A ;Enter branch offset.
LOC JR LOC ;skip some of the rotates.
RRC B
RRC B
.
. 7 times
.
RRC B
(Something like this; it's a looong time since I last did Z80 assembler..)
The above will rotate the B register (7-A) bits to the right. A loop would
have been much slower.
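In Python, the effect of the computed jump can be sketched like this (a simulation of the result, not of the patched branch itself): the stored offset 2*A skips A of the seven two-byte RRC B instructions.

```python
def rrc(b):
    """One Z80 RRC: rotate an 8-bit value right one bit."""
    return ((b >> 1) | (b << 7)) & 0xFF

def rotate_right_7_minus_a(b, a):
    """What the patched JR achieves: jumping A two-byte instructions
    into the run of seven RRC B's leaves (7 - A) of them to execute."""
    for _ in range(7 - a):
        b = rrc(b)
    return b

assert rotate_right_7_minus_a(0x01, 6) == 0x80   # one rotate executed
assert rotate_right_7_minus_a(0x01, 7) == 0x01   # all seven skipped
```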
While on the subject of Z80 assembler tricks, not related to self-modifying -
A fast way on the Z80 to move a chunk of memory was the LDIR instruction; a
faster one was to use 32 (say) LDI instructions in a loop (reducing loop
overhead), but the fastest (?) was very clever. What you did was to
temporarily save the stack pointer. Then to move a series of bytes you did
something like this in a loop:
LD SP,(SOURCE)
POP AF
POP BC
POP DE
POP HL
POP IX
POP IY
EXX
EX AF,AF' ;EXX doesn't swap AF
POP AF
POP BC
POP DE
POP HL
EXX
LD (SOURCE),SP
LD SP,(DESTINATION)
PUSH HL
PUSH DE
.
.
PUSH AF
Finally you corrected the DESTINATION and looped for another chunk. Since
the LDI took 2 instruction bytes to copy one data byte, whereas the PUSH/POP
copied 2 data bytes with the same amount of code, the stack approach was the
fastest, effectively simulating 16-bit moves like the Motorola 68008. Bet
that wasn't what Zilog created PUSH/POP for....
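Back-of-the-envelope arithmetic in Python, using the standard Z80 timings (LDIR 21 T-states per byte moved, LDI 16, POP 10, PUSH 11), shows why the stack trick wins:

```python
# Standard Z80 timings, in T-states (a rough cost model):
LDIR_PER_BYTE = 21            # LDIR: 21 T-states per byte while looping
LDI_PER_BYTE  = 16            # unrolled LDI: 16 T-states per byte
POP_T, PUSH_T = 10, 11        # each POP/PUSH moves a 2-byte word

# One POP plus one PUSH moves two data bytes:
stack_per_byte = (POP_T + PUSH_T) / 2

# Code size tells the same story: LDI is 2 opcode bytes per data
# byte; a POP/PUSH pair is 2 opcode bytes per 2 data bytes.
assert stack_per_byte < LDI_PER_BYTE < LDIR_PER_BYTE
```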
As to whether there is still any use for self-modifying code - in
my opinion rarely, and only if there's a need for maximum speed. Most often
one can get away with multiple fixed copies of a routine instead of one that
is changing. However, consider drawing a vertical line on a 16-bitplane
screen on a Motorola 680xx processor in some color. Self-modifying code for
this could read
loop
bset d0,(a0)
bclr d0,1*planediff(a0)
bclr d0,2*planediff(a0)
.
. various bclr's and bset's
.
bset d0,15*planediff(a0)
lea linediff(a0),a0
dbra d1,loop
where the bclr's and bset's are modified. This would be faster than looping
for each bitplane (I think).
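A Python sketch of the idea (the screen sizes and plane layout here are made up): the set-vs-clear decision per plane is computed once, standing in for the patched run of bset/bclr instructions, instead of being tested on every row.

```python
PLANES, ROWS, ROWBYTES = 16, 4, 8       # toy screen; real sizes differ
PLANEDIFF = ROWS * ROWBYTES             # assumed distance between planes

def draw_vline(screen, x, color):
    """For pixel column x, set its bit in every plane where `color`
    has a 1 and clear it where it has a 0. The `ops` list is built
    once, like the patched instructions, not re-decided per row."""
    byte, bit = x // 8, 7 - (x % 8)
    ops = [(p, (color >> p) & 1) for p in range(PLANES)]
    for row in range(ROWS):
        base = row * ROWBYTES + byte
        for p, on in ops:
            a = base + p * PLANEDIFF
            if on:
                screen[a] |= 1 << bit           # bset d0,(a0)
            else:
                screen[a] &= 0xFF ^ (1 << bit)  # bclr d0,p*planediff(a0)
    return screen

screen = draw_vline(bytearray(PLANES * PLANEDIFF), x=3, color=0b101)
```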
Hope somebody enjoys this (I know I do!)
Kristian
==========================================================================
Kristian Nielsen | /// Only the AMIGA
Student at DIKU, University of Copenhagen | ///
(Department of Computer Science) | \\\/// makes it possible!
Denmark | \XX/
==========================================================================
> This feature could be abused in intellectually satisfying ways, of course.
> Consider
> TEST EQ B2,B3,SKIPIT SKIP LEADING BLANKS
> RJ * REMOVE TEST
> Huh? Well, if you don't want to do the test anymore once it fails, this
> will do the trick. The RJ will clobber the word with an EQ *+1 which is
> essentially a no-op.
Ah, but these tricks were needed if you wanted to store the contents of all
registers. The CDC Cyber had no way to store a register without destroying
another (e.g. to store register X6 you had to put the address of the location
where X6 must be stored in A6). The trick is to first safeguard the contents
of a B register with instructions like those above:
+ GE B7,B0,*+1 IF B7>=0 SKIP NEXT INSTRUCTION
RJ * OTHERWISE A RETURN JUMP TO THIS WORD
SB7 B7+B7 DOUBLE B7
this would be repeated 18 times and the words containing the jumps would
indicate what bits of B7 were 1 initially. After that you could save A6
or A7 to B7 and do a normal save routine. You would extract the value
of B7 by examining the words containing jumps and restoring those to their
original value. Restoring all registers also could only be done using
self-modifying code. This was also fairly tricky. When storing a value
in one of the registers A1 to A5 the value at that address would be loaded
into the associated X register. So the main problem was restoring the
value of the last of these A registers. The trick was to take one X
register (X1 say) and unpack it (UXi Xj,Bk) to an exponent in a B register
and a mantissa in the X register; shift by 12 and repeat four times (to four
other B registers). Next you could restore the A1 register (loading an
irrelevant value into X1) and you would recreate X1 from the B registers
using the pack instruction (PXi Xj,Bk) and shifts in reverse order. Next
you would restore the B registers with modified instructions.
All this was still more tricky because the machine used 1's complement
integer arithmetic but all operations were +0 preferent (i.e. delivered
+0 except in some cases where they delivered -0). You had to be careful
to properly restore those values.
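The bit-capture part can be sketched in Python, treating B7 as a plain unsigned 18-bit value and ignoring the 1's-complement subtleties the post mentions:

```python
WIDTH = 18   # CDC B registers are 18 bits wide

def capture_bits(b7):
    """Record whether B7 is 'negative' (top bit set), then double it,
    18 times over -- mirroring the GE/RJ/SB7 word repeated 18 times.
    Each recorded bit says whether the RJ fired on that pass."""
    bits = []
    for _ in range(WIDTH):
        bits.append((b7 >> (WIDTH - 1)) & 1)
        b7 = (b7 << 1) & ((1 << WIDTH) - 1)   # SB7 B7+B7, mod 2^18
    return bits

def rebuild(bits):
    """Reassemble the value from the recorded jump/no-jump pattern."""
    v = 0
    for bit in bits:
        v = (v << 1) | bit
    return v

assert rebuild(capture_bits(0o123456)) == 0o123456
```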
Note: some of the finer details may be incorrect, as this is from 10 year
old memory. But somewhere I still have the code for this on tape.
For the original question: Do you write assembly code? You may have guessed.
Yes. Currently I am doing assembler for the Cray, RS 6000 next week I think,
and I did some SPARC, MIPS, i860, and many more last year.
>Is this kind of coding really used in real life? Or is it just something
>real programmers do for the heck of it? And is there a name for this kind
>of coding? I suppose you could call it 'overlaying', but that name's
>already taken for something quite different... What about 'interleaved
>code'?
If you look at the first dozen bytes of any Apple ][ peripheral card ROM
you will find a similar trick used with SEC or CLC just before a branch of
the opposite kind - the branch's argument being another instruction.
The worst bit of commercial self-modifying code I came across on my machines
was in my OSI Challenger 1P. The character output routine only supported
printable characters, CR and LF. The cursor was always on the bottom line
of the screen. Every time a linefeed arrived (or the line reached the rh
margin) a small routine was copied down from ROM into page 3 where it executed
a self-modifying loop to copy the screen up one line.
When I rewrote the output routine to include direct cursor addressing, clear,
home and left/right/up/down cursor movement I replaced that code with some
that only modified data - and doubled the throughput.
--
David Wilson Dept Comp Sci, Uni of Wollongong da...@cs.uow.edu.au
Noooooo problem; just have your program detect whether or not there's a
cache, and if there is, patch out all the self-modifying parts. :-)
--
Human: Gordon Davisson
Internet: gor...@phast.phys.washington.edu
UUCP: {decvax,tektronix}!uw-beaver!uw-june!phastvax!gordon
Bitnet: gordon@uwaphast
"Then I realized that I had spelled '-' wrong."
-- John Whitmore learns APL
I saw some months ago a Game package (F-19 Flight Simulator, I don't
remember the Company) for DOS. This program is protected by a KEY stored
in a marked-bad cluster. I tried to copy the program, but it didn't work
from the copy disk. Then I used the Turbo Debugger to follow the
code and unprotect it.
When I was following that code I could see they used some self-modifying
code to decode the part of the program that reads the KEY.
The program is stored in encoded form, and they use some XOR
operations to decode it while the program is running.
I think it's a good idea for copy protection..
--
***** Greetings from Mexico! *****
Juan Gabriel Ruiz Pinto Internet:
Ing. Sistemas Electronicos jgab...@mtecv2.mty.itesm.mx
I.T.E.S.M. Campus Monterrey
I had a ZX81, (the 'somewhat bigger brother' of the ZX80), and I have
made function plotting routines, too. Even 3-D plots in hi-res (256*200,
which was amazing for such a simple beastie!). One thing I liked about
the '81 was that you could use an expression anywhere a number was
expected. For example, you could say:
10 GOTO 25*SIN(Y)+23
I admit, not a very useful example, but the best thing was that it
worked for 'val' too! So,
10 INPUT A$
20 FOR X=0 TO 63
30 PLOT X,VAL(A$)
40 NEXT X
gave you a (very simple) function plotter. I don't think this counts as
self-modifying code, but I do think this was the ultimate interpreter!
You could simply enter the function of X in A$, and let 'val' evaluate
it for you! I don't know if the ZX80 could do this, but there was an
upgrade-ROM to ZX81.
---
Henk de Leeuw
opti...@utwente.nl
opti...@utrcu1.UUCP
In article <39...@ns-mx.uiowa.edu> jo...@pyrite.cs.uiowa.edu
(Douglas W. Jones,201H MLH,3193350740,3193382879) writes:
>rumor had it that QED compiled regular expressions to machine code
The DEC Rainbow is a curious machine. It has a Z80 in it to do disk
I/O (and run CP/M programs), and a 8088 in it to run DOS (originally
to run CP/M-86, but who's counting?). The 8088 couldn't access the
disk drive at all, except via some rather strange contorted "calls" to
the Z80. The Z80 code would tend to stuff values into "magic"
addresses in the read/write routine to make it into a read or a write
routine (basically changing an IN instruction to an OUT instruction).
At least NEWDOS 2.something and higher(?) and another TRSDOS clone for
the TRS-80 Model One did something similar (the processor was a Z80, too) to
do exponentiation. The Z80 has bit set/reset instructions, with
the bit number being a 3-bit field in the opcode. To do 2^n, the
opcode was regularly modified... Indexing with a variable offset from
the index address registers was another job for self-modification.
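For the curious: SET b,r on the Z80 is a CB prefix followed by 0xC0 with the bit number in bits 3-5 and the register in bits 0-2, so the patching reduces to rewriting one byte. A hedged Python sketch (the XOR A / SET n,A sequence is an assumed usage, not quoted from either DOS):

```python
REG_A = 7   # Z80 register-field code for the accumulator

def patched_set_opcode(n):
    """Second byte of SET n,A after patching the 3-bit bit-number
    field: CB prefix, then 0xC0 | (n << 3) | reg."""
    assert 0 <= n <= 7
    return 0xC0 | (n << 3) | REG_A

def run_patched(n):
    """What executing XOR A ; SET n,A would leave in A: 2^n."""
    a = 0          # XOR A clears the accumulator
    a |= 1 << n    # the patched SET n,A
    return a
```

So 2^n for n in 0..7 costs one byte of patching; larger exponents would need something like the skip-into-a-run-of-rotates trick described earlier in the thread.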
--
Paper mail: Ignatios Souvatzis, Radioastronomisches Institut der
Universitaet Bonn, Auf dem Huegel 71, D-5300 Bonn 1, FRG
Internet: u50...@mpirbn.mpifr-bonn.mpg.de
> I had a ZX81, (the 'somewhat bigger brother' of the ZX80), and I have
> made function plotting routines, too. Even 3-D plots in hi-res (256*200,
> which was amazing for such a simple beastie!). One thing I liked about
> the '81 was that you could use an expression anywhere a number was
> expected. For example, you could say:
> 10 GOTO 25*SIN(Y)+23
> I admit, not a very useful example, but the best thing was that it
> worked for 'val' too! So,
> 10 INPUT A$
> 20 FOR X=0 TO 63
> 30 PLOT X,VAL(A$)
> 40 NEXT X
> gave you a (very simple) function plotter. I don't think this counts as
> self-modifying code, but i do think this was the ultimate interpreter!
> You could simply enter the function of X in A$, and let 'val' evaluate
> it for you! I don't know if the ZX80 could do this, but there was an
> upgrade-ROM to ZX81.
The BBC Microcomputer (and derivatives) had/have the EVAL statement,
which does the same thing. VAL operates like other BASICs.
On the ZX81 I once implemented a "universal plotter", where you would
enter *both* sides of the expression. The "=" sign would be changed
to a ">", and it would proceed as follows:
100 FOR X = 0 TO 63
110 FOR Y = 0 TO 44
120 IF VAL(E$) THEN PLOT X,Y
130 NEXT Y
140 NEXT X
What you would get is a part black, part white display, the interface of
dark and light describing the graph you wanted. You put in some
scale factors as well (the code above leaves out a lot of detail).
The other trick it pulled was that at each transition from white to
black, the program would note the position, and then (if required) would
come back and work out the exact location of the point to within a few
decimal places, and display the values so that they could then be used
to draw the function on graph paper.
Don Stokes, ZL2TNM / / d...@zl2tnm.gp.co.nz (home)
Systems Programmer /GP/ GP PRINT LIMITED Wellington, d...@gp.co.nz (work)
__________________/ / ---------------- New_Zealand__________________________
> When I was following that code I could see they use some self
> modifying code to encode the part of the program to read the KEY.
> The program is stored in the coded format, and they use some XOR
> operations to encode it while the program is running.
>
> I think is a good idea in copy protection..
One of my favourites was one that pushed an entire backwards program on
to the stack, reversing it, and then jumped to the stack to execute it.
cjs
cu...@cynic.wimsey.bc.ca | "What do the letters in EBCDIC stand for?"
cu...@cynic.uucp | "ASCII. 'ASCII' becomes 'EBCDIC' in the
{uunet|ubc-cs}!van-bc!cynic!curt | EBCDIC character set." --Steve Connelly
Running a PDP 11/70 kind of requires you to be acutely aware of what's
running ............. She poked around and discovered that the program
consisted of a single machine language instruction:
MOV 0 1
(Sorry, I don't remember the syntax of PDP-11 assembler.) The effect
was to move a copy of whatever was pointed to by the program counter
(i.e. *this* instruction!) to the next address... which was, of
course, then executed, and which made another copy, etc. It only
stopped growing when it hit the top of its address space and
*wrapped*.
A few years later (that is, some time ago) the same code was
advertised by [whowasit?] in "Scientific American"s "Computer
Recreations" for the Core War virtual machine... same syntax, same
semantics, same purpose (wipe out in limited time all competing
programs...)
I looked at doing this once and was disappointed to find that it was easier
and smaller to use a lookup table than to generate the instruction. The odd
thing is that almost every time I've cooked up a nifty self-modifying way
to get the Z80 to do something, it took more T-states or bytes of code than
the brute force method. Sigh.
This isn't true of my TRS-80 disk I/O subroutines. It's much more space
efficient to just load the opcodes to change the routine to do a read or
write and jump to this general-purpose routine than to write two (rather
large) routines with all the inline code in there. It's a useful hack.
I've found self-modifying code on the Z-80 to be space-efficient, but not
T-state efficient. Tables are large.
--
| b...@epmooch.UUCP (Ben Mesander) | "Cash is more important than |
| ben%serval...@uokmax.ecn.uoknor.edu | your mother." - Al Shugart, |
| !chinet!uokmax!servalan!epmooch!ben | CEO, Seagate Technologies |
I don't know how you feel about BLISS, but there were some guys at
DECUS a couple of years ago that gave a talk on how to do this. There
were all kinds of contortions that they went through to keep from
writing any of the driver in Macro-32. They concluded that, while
possible, it was not something that DEC supported and was generally a
lot of trouble.
Warner
--
Warner Losh i...@Solbourne.COM
We sing about Beauty and we sing about Truth at $10,000 a show.
Probably:
MOV -(PC), -(PC)
ie CPU reads instruction at X, increments PC by 2 to X+2
CPU autodecrements PC by 2 and fetches the instruction at X
CPU autodecrements PC by 2 and stores the instruction at X-2
CPU reads instruction at X-2, increments PC by 2 to X
etc etc
This usually crashes when the CPU gets into some of the low core vectors
or when it wraps around into the IO page
Paul
--
Paul Campbell UUCP: ..!mtxinu!taniwha!paul AppleLink: CAMPBELL.P
Where do you find a "kinder gentler nation" when you need one these days?
The original Blit (from Bell Labs) did this, apparently achieving
quite worthwhile gains. Rob Pike (and perhaps other coauthors) wrote a
very interesting paper in Software Practice & Experience about it;
unfortunately, I don't have the issue number on hand. A lot of very
good ideas were put into the Blit, and it's a great pity so few people
have looked at them since.
--
V9: the kernel where you can do
fgrep <something> */*.[ch]
and not get "Arguments too long".
c...@hawkwind.utcs.toronto.edu ...!{utgpu,utzoo,watmath}!utgpu!cks
> MOV -(PC), -(PC)
>
>ie CPU reads instruction at X, increments PC by 2 to X+2
> CPU autodecrements PC by 2 and fetches the instruction at X
> CPU autodecrements PC by 2 and stores the instruction at X-2
>
> CPU reads instruction at X-2, increments PC by 2 to X
> etc etc
>
>This usually crashes when the CPU gets into some of the low core vectors
>or when it wraps around into the IO page
>
I think this same thing is possible on the 68000 as follows:
LEA TMP(PC),A0
TMP MOVE.L (A0),-(A0)
JMP (A0)
This should leave you with a uniform set of interrupt vectors and a red light
on the front panel! Note that it won't work on a 680x0 with cache memory.
"whowasit" was almost certainly A.K. Dewdney. A book of his collected
"Computer Recreation" columns is available from either Computer Science
Press, Rockville, MD, or else (in the U.K.) from W.H. Freeman publishers
at 20 Beaumont Street, Oxford, OX1 2NQ. (Sorry, I don't actually
have the exact title or ISBN.) I _do_ have Dewdney's "The Turing Omnibus"
also a collection of c. 60 fascinating columns by him. That ISBN is
0-7167-8154-9. Great book, especially if just starting out in C.S.
Tom Rombouts Torrance 'Tater to...@ashtate.A-T.com V:(213)538-7108
The book is called "The Armchair Universe". This excellent book is
broken down into (I believe) 7 sections, or 'universes' of computer
recreations, such as Infinite Graphics (fractals), Analog Computers,
Core Wars, Automata, and others I can't recall at the moment.
Sorry, I don't have the ISBN with me; if someone asks I can dig it up.
-mike begley
sp...@iastate.edu
>>I'd like to start a new thread; or maybe two threads:
>>I'm curious about current *real world* use of assembly/machine coding.
>>Where is it still done? And on what systems? And why?
>>Ok, now for the FUN part!
>>Let's chronicle the deeds of the past masters(*)---or indeed, present masters---
>>of machine coding. Who has any stories?
>>What is the hairiest, most beautiful or most awful piece of self modifying code
>>that you've come across?
>>
>>Any takers?
>>
The most elegant piece of self-modifying code I've ever seen was the single
PDP-11 instruction (started at the top of memory)
MOV -(PC), -(PC)
The parsing of this instruction is 'op-code src,dst'. Since this *is* machine
coding we're talking about, the machine code for the above is (in octal):
014747
[01 = MOV, 4 = pre-decrement mode, 7 = register 7, the PC]. [Just to give you
an idea of how easy it is to step through the octal contents of memory and read
the instructions.] During the fetch cycle, the instruction would be read into
the instruction register, then the PC would be incremented by 2. The source
fetch part of the instruction (the first -(PC)) would then decrement the PC
by 2, so the PC then pointed back to the instruction's address. When the
destination address was computed, the PC would be decremented by 2 again,
and wind up pointing to the word immediately preceding the instruction.
What happens is the instruction "MOV -(PC), -(PC)" gets copied to the word
below the current location, and the PC points to this previous word when the
instruction is finished, so the whole thing begins again at this previous
location. The machine finally halts when it wraps below zero into high-order
memory (which is I/O address space on the PDP-11).
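A small Python simulation of the ripple (a toy word-addressed memory of hypothetical size) shows the instruction copying itself downward until it falls off the bottom:

```python
def run_rippler(start, words):
    """Simulate MOV -(PC), -(PC) in a toy memory of `words` 16-bit
    words; `start` is a byte address. On a real PDP-11 the run ends
    by wrapping into the I/O page; here we just stop below zero."""
    mem = [0] * words
    MOV_PC = 0o014747
    mem[start // 2] = MOV_PC
    pc = start
    while True:
        pc += 2              # fetch (always MOV_PC here) bumps the PC
        pc -= 2              # source -(PC): back to the instruction
        src = mem[pc // 2]
        pc -= 2              # destination -(PC): the word below
        if pc < 0:
            break            # fell off the bottom of the toy memory
        mem[pc // 2] = src   # the instruction copies itself down
    return mem

mem = run_rippler(start=16, words=10)
# every word from the start address down to zero now holds 014747
```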
Now, this instruction is so neat that its existence needs no further
justification. However, it happens to serve a useful purpose: it's a very simple,
crude but not foolproof, test of memory. I used this instruction as of about
three months ago (successfully, I might add) to detect a bad memory chip in
my PDP-11/34A. If the instruction fails before getting to location zero, that
memory area is probably bad (unless there are power supply problems or
something). Of course, the M9312 boot module contains a more thorough memory
diagnostic which can be directly executed; and since I've disassembled the
boot code, I happen to know the magic address that it starts at.
So, not only is this self-modifying code, but I've even used it recently!
Also recently I've written some assembler floating-point routines for the
MIPS processors that run more than twice as fast as what MIPS' C compiler
produces. My assembly code is faster simply because I understood the
algorithms I was using better than my C compiler; I was able to perform
register allocation and pipelining more efficiently.
How do other PDP and DECSYSTEM fans feel about the MIPS processors? I think
they have one of the cleanest architectures around today, and quite different
from the -8, -11, and -10/20. At first I missed having addressing modes a la
the PDP-11. As I became used to the MIPS "idiom" though, it grew on me.
Of course, the assembler maintains separate program and data space, so it's
more difficult to write self-modifying code! I actually tried once, and
wasn't able to. I wanted to store a bunch of values within a matrix, but
I didn't know the exact offsets from the start of the array that I'd write to
until the procedure was called. The offsets are stored as a 16-bit signed
integer in the lower half of the instruction (in one of the addressing modes).
I thought this would be perfect, but was not able to write into the text
(program) area.
I know that it's widely "known" that writing self-modifying code like this
is a real no-no, sort of like using goto's in a high-level language, but I
wanted to try!
--Paul
--
This is my address: p...@ama.caltech.edu
This is UUCP: ...!{ihnp4,uunet}!
This is my address on UUCP: ...!{ihnp4,uunet}!caltech.edu!ama!ph
Any questions?
"Does Emacs have the Buddha nature?" --Paul Hardy "Yow!" --Zippy
Back in high school I worked for a company which had a contract to convert
educational software to run on the Apple II. The original programs ran on
electromechanical microfilm based machines -- Each question had five possible
answers. The student would press a button. This would advance the film to
a different position, and the machine would either tell you you were right
or give you a hint and let you try again.
This might be a machine that I once heard Ted Nelson (Xanadu / Hypertext
fanatic) mention. He was hired to find a useful application for the machine,
and thought it would be a great training tool. He decided to put together
a course for new Santa Clauses to train on. The one he used (this was an
original; perhaps it was later expanded) allowed four answers, having four
buttons. His favorite part of the course was dealing with strange questions
from kids. The best one he came up with was:
"Santa, are you more important than God?"
a) Yes
b) No
c) Maybe
d) Ho! Ho! Ho!
How about a new thread on the extreme and arcane things people have done
to compress code for limited memory applications?
In our old ATM's (described in other posts) much of our limited memory
was devoted to display data -- stuff like "PLEASE PUT YOUR CARD IN THE
SLOT" and "PLEASE PUT YOUR ENVELOPE IN THE DEPOSITORY."
To save space, we had macros that let us re-use parts of messages for
other messages. I forget the exact syntax, but the above messages
might look something like:
startstr plsptyr
PLEASE PUT YOUR
endstr
CARD
startstr inthe
IN THE
endstr
SLOT
usestr plsptyr
ENVELOPE
usestr inthe
DEPOSITORY
How's that for the Ugly Code Contest? (-:
Notice that both PLEASE PUT YOUR and IN THE are re-used. That's actually
a pretty tame example. We'd split words in half in some cases, trying to
get the maximum use out of every byte. It was also possible to nest and
overlap strings.
There was some overhead associated with the macros, so there was a lower
limit to the number of characters it was worth stringing together
(otherwise we'd only have needed the alphabet (-: ).
The actual data compression was done by hand -- a tedious job at best.
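The exact macro syntax is lost, but the sharing idea can be sketched in Python: fragments are stored once and messages merely reference them (`startstr` and `message` here are stand-ins for the original macros).

```python
# Hypothetical reconstruction of the startstr/usestr scheme: each
# shared fragment is stored exactly once; a message is literal words
# interleaved with fragment references.
fragments = {}

def startstr(name, text):
    fragments[name] = text          # define a reusable fragment

def message(*parts):
    """Expand fragment names; anything unknown is literal text."""
    return " ".join(fragments.get(p, p) for p in parts)

startstr("plsptyr", "PLEASE PUT YOUR")
startstr("inthe", "IN THE")

card    = message("plsptyr", "CARD", "inthe", "SLOT")
deposit = message("plsptyr", "ENVELOPE", "inthe", "DEPOSITORY")
assert card == "PLEASE PUT YOUR CARD IN THE SLOT"
```

Each extra message pays only for its unique words plus the references, which is the whole point of the hand compression described above.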
--
The Polymath (aka: Jerry Hollombe, M.A., CDP, aka: holl...@ttidca.tti.com)
Head Robot Wrangler at Citicorp(+)TTI Illegitimis non
3100 Ocean Park Blvd. (213) 450-9111, x2483 Carborundum
Santa Monica, CA 90405 {rutgers|pyramid|philabs|psivax}!ttidca!hollombe
The PP code, when it needs to make a request of the central processor, reads
a word from central memory right over the top of itself:
LDC XXXP central memory address of interlock word
CRM *,ON read one word to THIS location
PSN no-ops to fill five bytes
PSN
PSN
In normal circumstances, the word at XXXP has PP instructions to perform the
CPU interrupt:
CRM *,ON read one word to THIS location
XJN interrupt the CPU
PSN
PSN
When it was necessary to halt such interrupts, the CPU monitor would replace
the word at XXXP with a sequence which looped:
CRM *,ON read one word to THIS location
PSN waste time
PSN waste time
UJN *-4 jump back to the read
Then, when ANY PP goes to make a monitor request, it reads this sequence over
the top of the interrupt code and enters a tight loop, overwriting the code
it is executing during EACH iteration, until the CPU monitor replaces the
normal instructions in central memory.
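A toy Python model of the interlock (names and data shapes invented here): the word at XXXP is shared state, and each pass of the PP re-reads it over its own code, so the monitor steers every PP by rewriting one word of central memory.

```python
LOOP   = ("CRM", "PSN", "PSN", "UJN *-4")   # spin: read again, jump back
NORMAL = ("CRM", "XJN", "PSN", "PSN")       # interrupt the CPU

central = {"XXXP": LOOP}                    # monitor has stalled requests

def pp_request(release_after):
    """Spin until the monitor restores NORMAL; count the passes."""
    passes = 0
    while True:
        code = central["XXXP"]              # CRM *,ON: overwrite ourselves
        passes += 1
        if passes == release_after:
            central["XXXP"] = NORMAL        # CPU monitor releases the PPs
        if code[1] == "XJN":                # restored word reached us
            return passes

assert pp_request(release_after=3) == 4
```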
--
ti...@gssc.gss.com Tim N Roberts, CCP Graphic Software Systems
Beaverton, OR
Zen Master to hot dog vendor: "Make me one with everything."
>Does anyone have any examples of code that actually modifies its own
>logic, as opposed to modifying the logic of a subordinate routine? Any
>assembly-level modification more significant than just faking things
>like indexing and subroutine calls?
Here's a neat hack which, while not modifying the *logic* of the code,
nonetheless did self-modification in a fun way.
In a previous life, I was the chief microcoder for a company making a
bit-slice mini. The microcode word had a two bit field which
specified how many clock ticks the instruction would take (from 5 to 8
ticks, each of 25ns).
Until we had a smart program to determine how long an instruction
*really* needed, we would run everything at the slowest rate.
Now, this machine held its microcode in static RAM. An embedded Z80
(running CP/M, but that's another story) could stop the CPU clock,
rummage around in the control store and set things going again.
You guessed it. We could alter instruction timings on the fly. After
doing it on-line a few times, typing in the raw hex, I hacked up a
little script which would do it for me. It was essentially doing
read() and write() on /dev/mmem. Now *that's* self-modifying code!
For some reason I never fathomed, some instructions couldn't be fudged
in this manner. I'm still amazed it worked at all.
Sorry I can't post any example code. I doubt that many people would
understand HLH Orion microcode anyway.
It was fun writing in that language. I may tell the tale of the
benchmark and the refresh controller one day ...
Paul Leyland
enter: copy "goto enter" at label enter
do critical work
copy "copy \"goto enter\" at label enter" at label enter
Now the first process to arrive at enter changes the instruction to:
enter: goto enter
that is: a (busy-waiting) loop. Subsequent processes will wait, and only one
will be continued when the first process leaves the critical section, voila :-)
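A sequential Python sketch of the same lock (the real trick additionally needs the patch to happen atomically with entry, which this glosses over): the entry point is a single mutable instruction slot that either falls through or loops.

```python
# "enter" is one mutable instruction slot; executing it either falls
# through (the original instruction) or spins (the patched "goto").
enter_slot = ["fall-through"]

def try_enter(pid):
    if enter_slot[0] == "goto enter":
        return False                 # busy wait: caller must retry
    enter_slot[0] = "goto enter"     # copy "goto enter" over the entry
    return True

def leave(pid):
    enter_slot[0] = "fall-through"   # restore the original instruction

assert try_enter(1)                  # first process gets in
assert not try_enter(2)              # second finds "goto enter" and spins
leave(1)
assert try_enter(2)                  # restored entry lets the next one in
```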
--
_ _
/ U | Dolf Grunbauer Tel: +31 55 433233 Internet do...@idca.tds.philips.nl
/__'< Philips Information Systems UUCP ...!mcsun!philapd!dolf
88 |_\ If you are granted one wish do you know what to wish for right now ?
This was a typesetter machine from COMTEC.
Harald Tveit Alvestrand
Harald.A...@elab-runit.sintef.no
C=no;PRMD=uninett;O=sintef;OU=elab-runit;S=alvestrand;G=harald
+47 7 59 70 94
Pretty damned good. Here, please accept this ugly award.
>
> Notice that both PLEASE PUT YOUR and IN THE are re-used. That's actually
> a pretty tame example. We'd split words in half in some cases, trying to
> get the maximum use out of every byte.
Hmmm. And who was the lucky person who got the job of translating the
messages the first time that a bank wanted a machine with a Spanish
language interface? And did that person subsequently have to take early
retirement on grounds of failing mental health? Seriously, my wife
encountered this technique in a large COBOL suite which she had been
hired to translate into French. Must have seemed like a neat idea to
the people who coded it -- people who, unlike The Polymath, cannot have
needed to save bytes that badly. My wife is now working with another
company...
--
Dominic Dunlop
ok...
any SR-56 fans out there?
The SR-56 is a Texas Instruments programmable calculator. It's not a very
fancy one. It didn't have a magnetic reader or anything.. you couldn't store
programs. You had to rekey them each time. But that was fine, because there
was a limit of 100 keystrokes in the program memory.
It's a little strange what the emphasis on keystrokes means. Most programmable
calculators (that I've seen, anyway) have key-based programming. So the
sequence "1+1=" would be a program which calculates the value 2. Then there
would be a few other ops, like to flash the current number on the display, etc;
also programming operations like some kind of test, and of course a branch.
So the limit of 100 keystrokes was pretty small. I'd have a fairly small
program I wanted to write, and I'd write it out by hand. Often it would be
around 250 keystrokes, and I'd eventually compress it down to 100 with no loss
of functionality and not necessarily any algorithm improvements.
SR-56 programmers would have to get pretty intimate with representations and
things, since the only readout was of the internal code for the keystroke. But
this didn't help too much for code compression.. remember that a number like
"156" would be represented by the THREE keystrokes "1", "5", and "6". However,
a small amount of program/data reuse was possible in that some statements, like
a memory store or a branch (called "goto") took as argument a specified
following number of keystrokes. So if you had something like this:
34: goto
35: 6
36: 7
which transfers control to location 67, and if you wanted a 7 for something,
you could branch to location 36 and begin your next program segment there.
Since constants took so much room, these were one of the big optimization
targets. The absolutely standard one was for getting the number "100", which
if typed in the direct way would take 3 keystrokes. If you typed the two
keystrokes "2" and "common-anti-log", then you'd get 100 with only 2
keystrokes. Similarly, instead of typing "0.3", you'd certainly simply type
".3", two keystrokes instead of three, and instead of "0.25", you'd probably
type "4 reciprocal". The standard coding form also had blanks for the initial
values of the ten data memory locations. So another common thing was to ask,
via this form, the person keying in the program to set up some handy constants
in some of these. These took only two keystrokes to access: "recall", then the
memory number (0 to 9). Some programs thus became non-rerunnable, because they
would subsequently use these locations for other things.
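The savings are easy to tabulate in Python, counting each digit, decimal point, or function key as one keystroke ("10^x" stands in for the common-antilog key, "1/x" for reciprocal):

```python
# value -> (direct keying, clever keying); cost is just the key count.
encodings = {
    100.0: (["1", "0", "0"], ["2", "10^x"]),       # 10 ** 2 == 100
    0.3:   (["0", ".", "3"], [".", "3"]),
    0.25:  (["0", ".", "2", "5"], ["4", "1/x"]),   # 1 / 4 == 0.25
}
for value, (direct, clever) in encodings.items():
    assert len(clever) < len(direct)               # every trick saves keys
assert 10 ** 2 == 100 and 1 / 4 == 0.25            # and the math agrees
```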
The goto instruction took three statements, as you can see above. But there
was a special instruction "reset" which took only one keystroke, which was a
jump to location 00. So you tried to have at least one goto target be at
location 00. Problem was, this was also where the program began execution when
you just pressed the "run" key. So some programs contained instructions saying
that they should be begun by entering "goto" some location (which you could
enter outside of the program, which started it running). Actually, another
possible strategy for getting the user to type some of the keystrokes was to
ask them to begin the program a little, like: To run this program, type "1+",
then press "run". There were also possible bizarre uses of the "subroutine" /
"return" pair, since "return" was just a single keystroke.
Memory locations took two keystrokes to access or to overwrite. However, there
was also a "register" called "t" for "test". The purpose of this register was
to evaluate "if" conditions, like so:
3
x<->t ; exchange the display and "t"; the only instruction there was
; for accessing t
recall
2 ; load memory #2
x=t ; i.e. if x == t
2
1 ; i.e. if memory(2) == 3, goto 21. I might be mis-remembering
; the syntax of the if statement.
However, it ended up having a weird optimization possibility too. Suppose you
had something like this:
recall
0
+
1
=
store
0
If you were able to keep your value in t instead of memory 0, this could be
rewritten as:
x<->t
+
1
=
x<->t
But the really neat part was the fact that once you pressed, for example, "+",
the first operand would already have been picked up, but if the "+" didn't
complete any previous operation, the first operand would still be on the screen
(in "x"). So suppose that instead of "t:=t+1", we wanted "x:=t+1". This is
still possible, as follows:
x<->t
+
x<->t
1
=
which leaves the value of `t' as it was, while calculating it plus one into x.
Similarly, `t', if not storing anything valuable, could be used to hold
intermediate results in expressions for later use. This question of what
remains on the display was quite pervasive in its optimization potential; for
another example,
recall
0
*
recall
0
log
=
, which calculates memory(0)*log(memory(0)), can be shortened to
recall
0
*
log
=
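A rough Python model of why the second "recall 0" can be dropped: "*" latches the first operand while the display still holds it, so a unary function like log simply transforms the display into the second operand. The class and names are my own sketch, assuming log is base-10 as on TI machines:

```python
import math
import operator

class Calc:
    """Minimal model of the 'operand stays on the display' behaviour."""
    def __init__(self, mem0):
        self.x, self.mem0, self.pending = 0.0, mem0, None
    def recall(self):                 # recall 0
        self.x = self.mem0
    def op(self, f):                  # binary operator key
        self.pending = (f, self.x)    # x keeps showing the first operand
    def unary(self, f):               # the log key acts on x immediately
        self.x = f(self.x)
    def equals(self):
        f, a = self.pending
        self.x = f(a, self.x)

c = Calc(100.0)
c.recall()              # recall 0
c.op(operator.mul)      # *   -- first operand already latched,
c.unary(math.log10)     # log    so a second recall is unnecessary
c.equals()              # =
assert c.x == 100.0 * math.log10(100.0)   # memory(0) * log(memory(0))
```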
There were also supplied optimizations of various kinds... for example, there
was an "add into memory" operation (called "sum"), which allowed this:
recall
0
+
1
=
store
0
to be compressed into:
1
sum
0
(except for the fact that the final contents of x would be different, although
even if you wanted it, you could load it at the end and still be saving
keystrokes). The trade-offs between using various of these operations would
often be interesting. Let me just show you one possible interaction. On the
SR-56, and I believe many other TI calculators, you can't type "4+=" to mean
the same as "4+4=". So suppose we wanted to store the contents of the display
into memory #8, and then double it on the screen. We could do this:
store
8
*
2
=
But we could also do this:
+
store
8
=
since, unless my memory is worse than I thought, the "store" causes it to
believe that `x' potentially has a valid second operand again.
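Here's a small Python sketch of that last trick, under the assumption (as described) that "store" leaves x on the display but revalidates it as a second operand:

```python
import operator

class Calc:
    """Sketch of the 'store revalidates the display as an operand' trick."""
    def __init__(self, x):
        self.x, self.mem, self.pending = x, {}, None
    def op(self, f):
        self.pending = (f, self.x)
    def store(self, n):
        self.mem[n] = self.x        # store n: x is untouched, but the
                                    # display again counts as an operand
    def equals(self):
        f, a = self.pending
        self.x = f(a, self.x)

c = Calc(21.0)
c.op(operator.add)   # +
c.store(8)           # store 8
c.equals()           # =  -> x doubled, and memory 8 holds the old value
assert c.x == 42.0 and c.mem[8] == 21.0
```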
There's more, but this is what comes to mind right now. Neat eh?
ajr
--
Oppose the war in the Persian gulf.
Path: mpirbn!unido!mcsun!uunet!cs.utexas.edu!utgpu!news-server.csri.toronto.edu!dgp.toronto.edu!flaps
From: fl...@dgp.toronto.edu (Alan J Rosenthal)
Newsgroups: alt.folklore.computers
Summary: keystroke-based code compression for programmable calculators
Date: 1 Feb 91 16:46:07 GMT
References: <3...@rc6.urc.tue.nl> <1991Jan15.2...@demott.com> <DD3^}~#@rpi.edu> <22...@ttidca.TTI.COM>
Lines: 179
holl...@ttidca.TTI.COM (The Polymath) writes:
>How about a new thread on the extreme and arcane things people have done
>to compress code for limited memory applications?
ok...
any SR-56 fans out there?
The SR-56 is a Texas Instruments programmable calculator. It's not
a very fancy one. It didn't have a magnetic reader or anything...
you couldn't store programs. You had to rekey them each time. But
that was fine, because there was a limit of 100 keystrokes in the
program memory.
<long description of SR56 pessimization possibilities deleted, read
original article for them>
All your memory should be right. I had (still have) the TI-58 (<-note
the missing 'C' at this point), which had 480 'program steps',
reconfigurable in units of 80 program steps for 10 decimal floating
point registers. Other features in addition to the SR56 ones were:
* 3-digit program addresses were stored in only 2 bytes
* 2-digit register addresses were stored in only 1 byte
* Some additional functions, and labels. Labels were searched for
sequentially from 000, so you had to place heavily used labels near the
beginning or use absolute (numerical) addressing. goto <somelabel> was coded
in 2 bytes (<goto-code> <label>), where the label was any program key
code except 0,...,9. goto <numerical> was coded in 3 bytes: <goto> <02>
<58> for goto 258.
I once - just for fun - rewrote a transport problem solver someone wrote
for an HP-41 (with program chaining --- i.e. overlays for problem read-in,
solving and read-out) for the 58. The TI-58 had no magnetic cards, so I
had to type the data into the registers by hand, and read them out
later myself. The middle part needed 241 (or 321) bytes --- I don't
remember; 1 byte more than 3 or 4 memory partitions would hold. So I (thought
experiment) placed the last stop key beyond the end of the memory,
partitioned it away, and used 80 bytes for 10 registers instead of for 1 stop
key and 79 unused ones. The program would stop at the end of the program
area with a blinking display... another optimization possibility.
The boot ROM of the PDP-11/73/83/84 has all its messages compressed in the same
way. Like the above, most examples were much less 'tame'.
>Hmmm. And who was the lucky person who got the job of translating the
>messages the first time that a bank wanted a machine with a Spanish
>language interface? And did that person subsequently have to take early
>retirement on grounds of failing mental health?
For the PDP-11/73/83/84, I was that person :-). (I didn't actually translate
the messages; the translators were given the full text of what the messages
were supposed to say, and I wrote a program that loaded the text into EEPROMs
on the machines.) The translated text wasn't compressed, as I never did finish
a program that created an optimal compressed text from the original.
-Mike Moroney