Has anyone ever used self-modifying microcode? Would it even be useful?

Chris Barts

Jan 29, 2007, 8:37:07 AM

A somewhat cursory Google doesn't bring up anything useful on the subject
of self-modifying microcode. (In fact, the exact phrase search
"self-modifying microcode" (with quotes) brings up zero results.) There's
been a lot of self-modifying machine code in the world -- in fact, the
PDP-8 subroutine calling convention depended on a relatively minor form of
this dark art -- but I can't dredge up any reference to it being used one
level lower.

Further, would it be useful? It seems like a way to squeeze the most out
of code that must always be as fast as possible and must usually fit in
very cramped store. Maybe the store it usually resides in has large speed
penalties for writing, like Flash NVRAM does now. Maybe it's so difficult
to get right the first time the idea of debugging self-modifying microcode
is a quick way to get a laugh or a slow way to end up in the nuthouse.

From my nuthouse by the amber waves of weeds, etc. etc.

--
My address happens to be com (dot) gmail (at) usenet (plus) chbarts,
wardsback and translated.
It's in my header if you need a spoiler.

Pascal Bourguignon

Jan 29, 2007, 10:03:21 AM

Chris Barts <puonegf...@tznvy.pbz> writes:

> A somewhat cursory Google doesn't bring up anything useful on the subject
> of self-modifying microcode. (In fact, the exact phrase search
> "self-modifying microcode" (with quotes) brings up zero results.) There's
> been a lot of self-modifying machine code in the world -- in fact, the
> PDP-8 subroutine calling convention depended on a relatively minor form of
> this dark art -- but I can't dredge up any reference to it being used one
> level lower.
>
> Further, would it be useful? It seems like a way to squeeze the most out
> of code that must always be as fast as possible and must usually fit in
> very cramped store. Maybe the store it usually resides in has large speed
> penalties for writing, like Flash NVRAM does now. Maybe it's so difficult
> to get right the first time the idea of debugging self-modifying microcode
> is a quick way to get a laugh or a slow way to end up in the nuthouse.
>
> From my nuthouse by the amber waves of weeds, etc. etc.

Well, first, the difference between micro-code and code is not too
important. I had a teacher who considered that the routines behind
A-traps on 680x0 were "micro-code", and indeed, they implemented the
semantics for the op-codes 0xA???, like any other micro-code (only
they were written in "code" instead of "micro-code" of course).

Now, in the old times, when memory was scarce and the stack was not yet a
common concept, people often used self-modifying code. Microcode
wasn't often changed, and probably wasn't SELF-modifying. On the
other hand, as soon as this notion of micro-code appeared, the notion
of modifying it occurred at the same time, but it was modified from the
code, not from itself while running.

Basically most instructions were hard wired, and some code could be
micro-coded, to add new instructions or do whatever the OS (or
application) needed. Each program could download its own micro-code.
Experiments were done for example to combine several instructions into
one micro-code to optimize languages like lisp (where some primitives
like CAR and CDR are often used in combination with others, so there
is some gain in not fetching several instructions to perform several
operations that often occur together).

Another consideration is that the microcode manipulated the processor
organs (the registers, the buses, the ALU, etc). The couple of
microcodes I've looked at hadn't any provision to manipulate the
microcode engine itself, so I don't think it would have been even
possible to write self-modifying microcode. This would have to be
done from the code level anyways.

Perhaps you could search the specifications of the various micro-code
engines and see first whether they could modify themselves. And then
try to find some micro-code programs on these engines where it was at
least possible, and see if they were self-modifying.

--
__Pascal Bourguignon__ http://www.informatimago.com/

"This machine is a piece of GAGH! I need dual Opteron 850
processors if I am to do battle with this code!"

Anne & Lynn Wheeler

Jan 29, 2007, 11:42:15 AM

Chris Barts <puonegf...@tznvy.pbz> writes:
> A somewhat cursory Google doesn't bring up anything useful on the subject
> of self-modifying microcode. (In fact, the exact phrase search
> "self-modifying microcode" (with quotes) brings up zero results.) There's
> been a lot of self-modifying machine code in the world -- in fact, the
> PDP-8 subroutine calling convention depended on a relatively minor form of
> this dark art -- but I can't dredge up any reference to it being used one
> level lower.
>
> Further, would it be useful? It seems like a way to squeeze the most out
> of code that must always be as fast as possible and must usually fit in
> very cramped store. Maybe the store it usually resides in has large speed
> penalties for writing, like Flash NVRAM does now. Maybe it's so difficult
> to get right the first time the idea of debugging self-modifying microcode
> is a quick way to get a laugh or a slow way to end up in the nuthouse.

how 'bout pageable microcode?

floppy disk was originally developed for loading microcode into the
3830 disk controller ... and was also used for loading microcode into
many of the 370 mainframe machines. this typically happened
automatically at power-up ... however there has been recent subthread
here on the "IPL" button on 360/370 front consoles ... "initial
program load" ... which was software (boot) function. However 370s
also had "IMPL" button ... initial microcode program load ... if there
was some service update which included replacing the microprogram
floppy disk ... then the microcode could be reloaded (w/o a power
cycle).

3081 had service processor and a 3310/piccolo, FBA (fixed block
architecture) "hard disk" containing microcode for the 3081 processor
... and some processor functions could involve "paging" microcode from
the 3310.

this is different than an instruction, dynamically modifying some
(frequently immediately) following instruction, in the instruction
stream. a lot of 360 (software) code made use of this feature to
achieve real-storage compactness (compared to paging which also is
oriented towards real-storage compactness). However, it was something
of a performance penalty as processors started attempting to squeeze
instruction latency ... doing instruction decode and setup overlapped
with execution ... there had to be constant checking if some previous
instruction had modified a following instruction that had already been
fetched and decoded.

a couple past posts mentioning pageable microcode:
http://www.garlic.com/~lynn/2000d.html#82 "all-out" vs less aggressive designs (was: Re: 36 to 32 bit transition)
http://www.garlic.com/~lynn/2004j.html#45 A quote from Crypto-Gram

Tim Shoppa

Jan 29, 2007, 1:25:34 PM

On Jan 29, 8:37 am, Chris Barts <puonegf+hfr...@tznvy.pbz> wrote:
> A somewhat cursory Google doesn't bring up anything useful on the subject
> of self-modifying microcode.

You'll have to define exactly what "self" you mean.

Machines with loadable microcode must have some way of loading it.
Sometimes pure hardware, sometimes a "console processor", sometimes
instructions in the processor itself. In the last case, if a processor
has microcode then microcode is generally used for at least some phase
of every instruction and often some form of banking is used to say
"well, in executing this instruction I know I'm not going to depend on
something in the writable control store".

Some machines support the concept that different processes can set a
bit in their process control word that defines the microcode set used
for each process. If someone could remind me the machine I'm thinking
of I'd appreciate it! The 11/780 comes close, with a bit in the
process header that specifies VAX or PDP-11 compatibility mode, but
those aren't really separate or run-time-modifiable microcodes in that
machine.

The CalData comes close but still not there and I never used it in
anything except PDP-11 mode. Maybe I'm thinking of one of the
timesharing-marketed emulation-through-microcode machines from the
same era?

Tim.

Stan Barr

Jan 29, 2007, 2:39:40 PM

On Mon, 29 Jan 2007 06:37:07 -0700, Chris Barts <puonegf...@tznvy.pbz>
wrote:

>A somewhat cursory Google doesn't bring up anything useful on the subject
>of self-modifying microcode. (In fact, the exact phrase search
>"self-modifying microcode" (with quotes) brings up zero results.) There's
>been a lot of self-modifying machine code in the world -- in fact, the
>PDP-8 subroutine calling convention depended on a relatively minor form of
>this dark art -- but I can't dredge up any reference to it being used one
>level lower.
>
>Further, would it be useful? It seems like a way to squeeze the most out
>of code that must always be as fast as possible and must usually fit in
>very cramped store. Maybe the store it usually resides in has large speed
>penalties for writing, like Flash NVRAM does now. Maybe it's so difficult
>to get right the first time the idea of debugging self-modifying microcode
>is a quick way to get a laugh or a slow way to end up in the nuthouse.
>
>From my nuthouse by the amber waves of weeds, etc. etc.

The Rekursiv featured customizable microcode that was exposed to the
program writer. I presume it could be "self-modifying" if the programmer
required it to be. But whether that's the sort of thing you're looking
for is another matter.

To quote a paper[1] on Rekursiv:
"Since so few programmers microcode their own instruction sets, most do not
realise that - because so many microcycles are spent either packaging up
data at the end of one instruction only to have it opened up again at the
start of the next - a microcoded algorithm can run orders of magnitude
faster than the equivalent machine code."

Didn't one of the contributors to this group work on Rekursiv?
I remember a story about the works van trashing someone's Porsche.

[1] The Rekursiv: An Architecture for Artificial Intelligence
David M Harland, Hamish I E Gunn, Ian A Pringle and Bruno Beloff
September 1986
--
Cheers,
Stan Barr stanb .at. dial .dot. pipex .dot. com
(Remove any digits from the addresses when mailing me.)

The future was never like this!

Roger Ivie

Jan 29, 2007, 2:56:05 PM

On 2007-01-29, Chris Barts <puonegf...@tznvy.pbz> wrote:
> A somewhat cursory Google doesn't bring up anything useful on the subject
> of self-modifying microcode. (In fact, the exact phrase search
> "self-modifying microcode" (with quotes) brings up zero results.) There's
> been a lot of self-modifying machine code in the world -- in fact, the
> PDP-8 subroutine calling convention depended on a relatively minor form of
> this dark art -- but I can't dredge up any reference to it being used one
> level lower.

You might look to see if you can find anything on the Burroughs B1700. I
don't know much about it, but rumor has it that it could run multiple
jobs each with different microcode.
--
roger ivie
ri...@ridgenet.net

CBFalconer

Jan 29, 2007, 3:59:20 PM

Anne & Lynn Wheeler wrote:
>
... snip ...

>
> this is different than an instruction, dynamically modifying some
> (frequently immediately) following instruction, in the instruction
> stream. a lot of 360 (software) code made use of this feature to
> achieve real-storage compactness (compared to paging which also is
> oriented towards real-storage compactness). However, it was something
> of a performance penalty as processors started attempting to squeeze
> instruction latency ... doing instruction decode and setup overlapped
> with execution ... there had to be constant checking if some previous
> instruction had modified a following instruction that had already
> been fetched and decoded.

This is the old way of telling an 8088 from an 8086. The length of
the instruction queue was different, so modifying an instruction 5
bytes north didn't take on an 8086.
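
A toy model of that test, as a sketch rather than real 8086/8088 code: the one-cell-per-instruction machine, the opcode names and the "queue is always refilled before each instruction" rule are all simplifying assumptions made for illustration. Only the queue lengths (6 and 4) stand in for the two chips.

```python
from collections import deque

def run(program, queue_len):
    """Run a toy program on a CPU that prefetches `queue_len` cells.

    Idealised assumptions: every instruction occupies one cell, the queue is
    refilled to capacity before each instruction executes, and a store to
    memory never updates a copy that is already sitting in the queue.
    """
    mem = list(program)
    queue = deque()      # copies of cells that have already been fetched
    fetch_addr = 0       # next address the prefetcher will read
    acc = 0
    while True:
        while len(queue) < queue_len and fetch_addr < len(mem):
            queue.append(mem[fetch_addr])
            fetch_addr += 1
        op, arg = queue.popleft()
        if op == "store":            # overwrite a later instruction in memory
            addr, new_cell = arg
            mem[addr] = new_cell
        elif op == "load":
            acc = arg
        elif op == "halt":
            return acc
        # "nop" does nothing

PROGRAM = [
    ("store", (5, ("load", 2))),     # patch the instruction 5 cells ahead
    ("nop", None), ("nop", None), ("nop", None), ("nop", None),
    ("load", 1),                     # the original instruction at cell 5
    ("halt", None),
]

long_queue_result = run(PROGRAM, queue_len=6)    # 1: the stale copy ran
short_queue_result = run(PROGRAM, queue_len=4)   # 2: the patched cell ran
```

With the longer queue the patched cell has already been fetched when the store happens, so the old instruction runs; with the shorter queue the store lands first.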

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews

CBFalconer

Jan 29, 2007, 4:01:42 PM

Tim Shoppa wrote:
>
... snip ...

>
> Some machines support the concept that different processes can set
> a bit in their process control word that defines the microcode set
> used for each process. If someone could remind me the machine I'm
> thinking of I'd appreciate it! The 11/780 comes close, with a bit
> in the process header that specifies VAX or PDP-11 compatibility
> mode, but those aren't really separate or run-time-modifiable
> microcodes in that machine.

I think that was Burroughs.

Brian W Spoor

Jan 29, 2007, 4:11:18 PM

Chris Barts wrote:
> A somewhat cursory Google doesn't bring up anything useful on the subject
> of self-modifying microcode. (In fact, the exact phrase search
> "self-modifying microcode" (with quotes) brings up zero results.) There's
> been a lot of self-modifying machine code in the world -- in fact, the
> PDP-8 subroutine calling convention depended on a relatively minor form of
> this dark art -- but I can't dredge up any reference to it being used one
> level lower.
>
> Further, would it be useful? It seems like a way to squeeze the most out
> of code that must always be as fast as possible and must usually fit in
> very cramped store. Maybe the store it usually resides in has large speed
> penalties for writing, like Flash NVRAM does now. Maybe it's so difficult
> to get right the first time the idea of debugging self-modifying microcode
> is a quick way to get a laugh or a slow way to end up in the nuthouse.
>
> From my nuthouse by the amber waves of weeds, etc. etc.
>

I know that the ICL (3 Rivers) Perq had a user writeable microcode
store, so that you could add in additional microcode instructions from a
program. I recall that somebody wrote some microcode assist for
functions such as area fill when I worked at the ICL Graphics Unit in
Reading (many years ago).

Quadibloc

Jan 29, 2007, 4:49:57 PM

It is true that older machines did use self-modifying code, either
because they didn't have index registers, or for return from
subroutines.

Self-modifying microcode, however, was never a traditional technique,
even in the wild and wooly early days of computing. In general, a
computer's microcode wasn't even in the computer's *address space*,
and so self-modifying microcode would not have been possible.

However, the Packard Bell 440 computer *did* put its microcode in the
same address space as main memory... so there is an architecture that
*could* make use of this (extremely dangerous) technique.

John Savard

Al _Kossow

Jan 29, 2007, 1:46:33 PM

Tim Shoppa wrote:

> The CalData comes close but still not there and I never used it in
> anything except PDP-11 mode. Maybe I'm thinking of one of the
> timesharing-marketed emulation-through-microcode machines from the
> same era?

Burroughs B1700->B1900 had pagable uCode based on language

It wouldn't surprise me if there are games on the Xerox Alto that had
self-modifying uCode, though I don't know of any for sure.

Tim Shoppa

Jan 29, 2007, 6:21:05 PM

Al _Kossow wrote:
> Tim Shoppa wrote:
>
> > The CalData comes close but still not there and I never used it in
> > anything except PDP-11 mode. Maybe I'm thinking of one of the
> > timesharing-marketed emulation-through-microcode machines from the
> > same era?
>
> Burroughs B1700->B1900 had pagable uCode based on language
>
> It wouldn't surprise me if there are games on the Xerox Alto that had
> self-modifying uCode [...]

While Googling this afternoon I found the B1700 and Alto mentioned, so
they definitely fall into the "usual suspects", but those aren't the
one I was thinking of.

Stretching my memory more, this was a company in upstate NY or maybe
even Canada that was selling a mini with reconfigurable microcode for
emulation, but to timesharing rather than real-time customers. ISTR it
being able to handle multiple different timesharing emulations (maybe
including HP minis) simultaneously. Would've been early maybe mid
70's, and what I remember is the marketing so maybe it was never a
real computer.

BTW the CalData memories are very slowly coming back to me too. The
one I tinkered with was used in rat neurology experiments at Caltech,
and I always thought of it as a funky 11/34. I never knew that they
were aiming this at a broader emulation market till you posted the
stuff at Bitsavers, cool!

Tim.

Al _Kossow

Jan 29, 2007, 6:30:54 PM

Tim Shoppa wrote:

> Stretching my memory more, this was a company in upstate NY or maybe
> even Canada that was selling a mini with reconfigurable microcode

Nanodata QM-1

Michael N. LeVine

Jan 29, 2007, 6:45:48 PM

In article <45be7709$0$1944$8826...@free.teranews.com>,
Al _Kossow <a...@spies.com> wrote:

> Tim Shoppa wrote:
>
> > Stretching my memory more, this was a company in upstate NY or maybe
> > even Canada that was selling a mini with reconfigurable microcode
>
> Nanodata QM-1

The VAX 11/750 and PDP-11/60 were microprogrammable but I do not know if
they were self-modifying.

ISTR the DEC ALPHA had different microcode depending on which
of 3 o/s's it booted to (VMS,UNIX,WINDOWS). But do not
recall if it was microprogrammable...
--
Michael LeVine - mle...@redshift.com
"Thirty days hath September, April, June and November.
All the rest have thirty one except for Gypsy Rose Lee
and every one knew what she had" - Mel Blanc

CBFalconer

Jan 29, 2007, 9:55:50 PM

Tim Shoppa wrote:
> Al _Kossow wrote:
>> Tim Shoppa wrote:
>>
>>> The CalData comes close but still not there and I never used it in
>>> anything except PDP-11 mode. Maybe I'm thinking of one of the
>>> timesharing-marketed emulation-through-microcode machines from the
>>> same era?
>>
>> Burroughs B1700->B1900 had pagable uCode based on language
>>
>> It wouldn't surprise me if there are games on the Xerox Alto that had
>> self-modifying uCode [...]
>
> While Googling this afternoon I found the B1700 and Alto mentioned, so
> they definitely fall into the "usual suspects", but those aren't the
> one I was thinking of.
>
> Stretching my memory more, this was a company in upstate NY or maybe
> even Canada that was selling a mini with reconfigurable microcode for
> emulation, but to timesharing rather than real-time customers. ISTR it
> being able to handle multiple different timesharing emulations (maybe
> including HP minis) simultaneously. Would've been early maybe mid
> 70's, and what I remember is the marketing so maybe it was never a
> real computer.

In the late '60s/early '70s Microdata sold a microprogrammed
machine, which used diodes to form the microrom. Later they came
out with an (expensive) writable ROM board for development
purposes. I used both methods, and the writable boards saved much
time and effort. I think they were a California bunch.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Roger Ivie

Jan 30, 2007, 12:26:38 AM

On 2007-01-29, Michael N. LeVine <mlevine...@redshift.com> wrote:
> ISTR the DEC ALPHA had different microcode depending on which
> of 3 o/s's it booted to (VMS,UNIX,WINDOWS). But do not
> recall if it was microprogrammable...

Alpha PALcode has nothing to do with microcode. It's just a special
flavor of system trap. When running PALcode, Alpha instructions are
executed, the PALcode handler just has more visibility into the
processor.
--
roger ivie
ri...@ridgenet.net

Peter Flass

Jan 30, 2007, 7:45:23 AM

Sounds a lot like what IBM calls "millicode" on z/Series.

Anne & Lynn Wheeler

Jan 30, 2007, 9:41:06 AM

Peter Flass <Peter...@Yahoo.com> writes:
> Sounds a lot like what IBM calls "millicode" on z/Series.

and before that, Amdahl's "macrocode" from early 80s ... it was used
for implementing hypervisor ... i.e. subset of virtual machines
... built into the machine w/o needing vm370 software kernel.

sort of 370 subset ... and one of the differences ... "macrocode"
mode eliminated provisions for supporting self-modifying code ...
and the associated performance penalty ...
http://www.garlic.com/~lynn/2007d.html#1 Has anyone ever used self-modifying microcode? Would it even be useful?

ben yates

Jan 30, 2007, 2:18:09 PM

Let's not forget the TMS99xxx series, one or two of which supported
macrostore, so that you could extend the instruction set.

And the 990 series (i.e. TMS9900) itself has an eXecute instruction. Back
then I thought it not unusual, but people newly introduced to it nearly puke
a lung when they hear that it allows you to build an instruction to be
executed.

Peter Flass

Jan 30, 2007, 2:49:45 PM

Al _Kossow wrote:
> Tim Shoppa wrote:
>
>> Stretching my memory more, this was a company in upstate NY or maybe
>> even Canada that was selling a mini with reconfigurable microcode
>
>
> Nanodata QM-1
>

ISTR the Microdata "Reality" system featured user-modifiable microcode,
but I wouldn't call this "self-modifying."

Morten Reistad

Jan 30, 2007, 3:28:17 PM

In article <45bfa15f$0$24511$4c36...@roadrunner.com>,
Peter Flass <Peter...@Yahoo.com> wrote:
>Al _Kossow wrote:
>> Tim Shoppa wrote:
>>
>>> Stretching my memory more, this was a company in upstate NY or maybe
>>> even Canada that was selling a mini with reconfigurable microcode

Are you thinking of Prime Computers, Inc from Natick, MA?

The 50-series had customer loadable microcode, and there actually
were customers that did modify the microcode. The support for PICK
databases had a largish speed boost from some strategic new microcode.
AFAIR this was done before Prime started pushing Pick as Prime
Information.

There were also several sets of alternate microcode for the 68k
series. IBM had one, that had taken on a bluer life and imagined it
was a 360 (or was that 370).

>> Nanodata QM-1
>>
>
>ISTR the Microdata "Reality" system featured user-modifiable microcode,
>but I wouldn't call this "self-modifying."

ISTR the KL10 (PDP10 implementation by DEC) also had a microcode
development kit, but I remember no implementations. [I may be having
a senior moment here.]

-- mrr

Tim Shoppa

Jan 30, 2007, 3:38:57 PM

On Jan 30, 3:28 pm, Morten Reistad <f...@last.name> wrote:
> In article <45bfa15f$0$24511$4c368...@roadrunner.com>,
> Peter Flass <Peter_Fl...@Yahoo.com> wrote:
>
> >Al _Kossow wrote:
> >> Tim Shoppa wrote:
>
> >>> Stretching my memory more, this was a company in upstate NY or maybe
> >>> even Canada that was selling a mini with reconfigurable microcode
>
> Are you thinking of Prime Computers, Inc from Natick, MA?

Al got it right with Nanodata. Interesting that the surviving docs are
so hardware-specific, what I read was mostly marketroid drivel.

> ISTR the KL10 (PDP10 implementation by DEC) also had a microcode
> development kit, but I remember no implementations. [I may be having
> a senior moment here.]

Those tools are now available should anyone want to give it a try :-).

Tim.

Anne & Lynn Wheeler

Jan 30, 2007, 4:39:47 PM

Morten Reistad <fi...@last.name> writes:
> There were also several sets of alternate microcode for the 68k
> series. IBM had one, that had taken on a bluer life and imagined it
> was a 360 (or was that 370).

started out as xt/370 ... i.e. basically add-on to pc/xt ... code name
washington. later it had add-on to pc/at ... as at/370. had severe
memory constraints (by vm370 and cms standards) ... and was quite disk
intensive ... which with everything being done thru co-processor to
8088 in the xt and then mapped to the xt harddisk ... could be quite
painful (single block transfer at a time with 100ms access per).

recent post about rewriting cms applications for pc environment
(as more attractive alternative considering the memory and disk
constraints of the period). recent posts
http://www.garlic.com/~lynn/2006y.html#29 The Elements of Programming Style
http://www.garlic.com/~lynn/2007.html#1 The Elements of Programming Style

note above reference has a little x-over with more recent thread:
http://www.garlic.com/~lynn/2007d.html#4 Jim Gray Is Missing
http://www.garlic.com/~lynn/2007d.html#6 Jim Gray Is Missing

i did some simple benchmarks on early prototype and also noticed that
a lot of stuff page-thrashed ... in the 384k bytes available for 370
operation. the result was that I then took the blame for several month
slip in customer ship while they put together an upgrade to 512k
bytes.

washington was the only product where I was able to ship my CMS paged
mapped filesystem support. At the high-end ... I could benchmark three
times thruput increase with 3380s for filesystem intensive workloads.
The degradation with the 100ms XT harddisks was quite striking ...
and CMS paged mapped filesystem support offered a little improvement.
misc. past posts about CMS page mapped filesystem support
http://www.garlic.com/~lynn/subtopic.html#mmap

Christopher C. Stacy

Jan 30, 2007, 4:43:06 PM

Lots of machines in the 1970s were user microprogrammable.
The ones that I've programmed on a lot include:

The HP 2100 series (1970) had a development environment for
its writable control store extension boards. (I never did any
microcoding though: just used the HP 2000 Time Shared BASIC!)

As most afc readers know, the PDP-10 family (1963, 1968) had
macrocode trap handlers for user-defined ("unimplemented") instructions,
but that was not microcode. Later versions (KL, 1975) of the PDP-10
happened to be microcoded machines, and MIT developed their own
microcode (e.g. different paging, new general-purpose instructions; they
had done the same in custom hardware mods on the old machines).

The closest thing to what the OP was looking for might be the Lisp Machine.
Some versions of the Lisp Machine (1974) supported on-line microcode hacking.
The normal ISA implemented by microcode corresponded closely to Lisp,
so nobody (other than the compiler) would write programs below Lisp.
But there was also a Lisp->ucode compiler, so that the user could
dynamically load and call his new micro routines. I never used this
facility, so I don't know if you could reliably unload or replace
microcoded routines (you were supposed to be able to), nor if you
could even touch the regular micro-instruction space while running.
User-microprogramming was never heavily exploited, and the Symbolics
machines did not have any features for user munging of the microstore.

Some of the tangentially related design features on the Lisp Machine
were: trap (to-macrocode) handlers; hardware for supporting garbage
collection; and all-tagged (type/object aware) memory/CPU hardware.

Anne & Lynn Wheeler

Jan 30, 2007, 8:23:54 PM

cst...@news.dtpq.com (Christopher C. Stacy) writes:
> Some of the tangentially related design features on the Lisp Machine
> were: trap (to-macrocode) handlers; hardware for supporting garbage
> collection; and all-tagged (type/object aware) memory/CPU hardware.

a few past posts with old email from '79 mentioning attempts to get an
early 801 processor for lisp machines:
http://www.garlic.com/~lynn/2003e.html#65 801 (was Re: Reviving Multics
http://www.garlic.com/~lynn/2006c.html#3 Architectural support for programming languages
http://www.garlic.com/~lynn/2006o.html#45 "25th Anniversary of the Personal Computer"
http://www.garlic.com/~lynn/2006t.html#9 32 or even 64 registers for x86-64?

in 1980 time-frame there were attempts to replace the large number of
different corporate microprocessors with 801s.

however, 801 as a "microcode" processor engine made "self-modifying"
microcode nearly impossible (in the sense of 360/370 instructions
modifying subsequent instructions in the instruction stream).

with separate I&D caches and no provisions for cache consistency
... the instruction and data "data spaces" were somewhat
disjoint. Program loaders needed special operation which would
flush/force any modifications from the data cache back to memory
... and then invalidate any corresponding locations that might happen
to be in the i-cache ... so that instruction fetch would result in an
i-cache miss, forcing a (i-cache) fetch (of the possibly modified
data) from memory (and that doesn't even take into account possible
superscalar instruction pre-fetch, decode, and execution).
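
the loader sequence described above can be sketched as a toy model in Python. this is not 801 code: the class, its method names and the dictionary-based caches are invented for illustration, and the model assumes a write-back data cache with no hardware coherence between the two caches.

```python
class SplitCacheMachine:
    """Toy model: one memory, separate I and D caches, no hardware coherence."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.icache = {}
        self.dcache = {}    # write-back: dirty values stay here until flushed

    def store(self, addr, value):
        self.dcache[addr] = value            # main memory is not updated yet

    def fetch_instruction(self, addr):
        if addr not in self.icache:          # i-cache miss: fill from memory
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def flush_dcache(self, addr):
        if addr in self.dcache:              # push the dirty value to memory
            self.memory[addr] = self.dcache.pop(addr)

    def invalidate_icache(self, addr):
        self.icache.pop(addr, None)          # next fetch will miss


m = SplitCacheMachine({0x100: "old instruction"})
before = m.fetch_instruction(0x100)          # warms the i-cache
m.store(0x100, "new instruction")            # the "loader" writes new code
stale = m.fetch_instruction(0x100)           # still the old instruction
m.flush_dcache(0x100)
m.invalidate_icache(0x100)
fresh = m.fetch_instruction(0x100)           # now the new instruction
```

without the explicit flush and invalidate, the modified instruction is never seen by instruction fetch in this model, which is the sense in which ordinary self-modification stops working.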

posted old email mentioning 801, fort knox, romp, rios, pc/rt,
rs/6000, power/pc, etc.
http://www.garlic.com/~lynn/lhwemail.html#801

misc. collected posts mentioning 801, fort knox, romp, rios, pc/rt,
rs/6000, power/pc, etc
http://www.garlic.com/~lynn/subtopic.html#801

James Dow Allen

Feb 5, 2007, 2:51:37 AM

On Jan 29, 8:37 pm, Chris Barts <puonegf+hfr...@tznvy.pbz> wrote:
> self-modifying microcode.

The 370/145 stored the sizes of main memory and
control storage at control storage addresses FF08 and FF0C.

When AMS/Intersil attached memory, it not only had
to change FF0C but, because of a reconfiguration panel,
had to change it dynamically based on the output of that
panel. This was done by gating the computed result
whenever FF0C was fetched.

There were two address modes: ordinary and
"K-addressable" (used for the 256 bytes FFxx).
The special hardware was invoked only for K-addressable,
so checksums and diagnostics were unaffected.

The hardware that gated the special data was on the same
small circuit board that implemented "Storage-3"
cycles. You won't see "Storage-3" mentioned in IBM logics:
it was a mechanism that allowed AMS/Intersil to attach
a 36-bit storage-out bus to IBM's 72-bit bus!

Before our special hardware was invented, the end-user
was supposed to smuggle us a copy of IBM's floppy
which we "modified." All the cognizant people had
fled for greener pastures, so one day my manager
asked me, the ignorant hippy newcomer, to "modify"
such a floppy. I did it and brought the floppy back to
him. "Where's the copy?" he asked, and was horrified
to realize I didn't know "modify the floppy" meant to
create a copy, leaving IBM's property unaltered!

James Dow Allen

Jeff Jonas

Feb 17, 2007, 3:07:10 AM

> It is true that older machines did use self-modifying code,
> either because they didn't have index registers,
> or for return from subroutines.

Thank you for a concise explanation!

I worked on a General Instruments DSP long ago
and was warned not to write self-modifying code.
Of course, I /had to try it/ just to get it out of my system.
As you said, there were not enough index registers (only one)
so I self-modified a load instruction
to increment the literal address field
to allow accessing 2-3 arrays simultaneously.
I never needed it in production (to everyone's relief)
because it would've been undebuggable:
the front panel only accessed the bus so I could not trace,
stop or even SEE the resulting instruction that was
loaded directly into the execution register.
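
A rough sketch of the address-patching trick, written as a toy interpreter in Python rather than as DSP code. The opcode names (LOAD, ADDT, INCA, LOOP, HALT) and the memory layout are made up for illustration; the only point is that the program walks an array by incrementing the address field of its own load instruction.

```python
def sum_array(data):
    """Sum `data` on a toy machine with no index register."""
    if not data:
        return 0
    code = [
        ["LOAD", 0],     # acc = mem[addr]; the addr field is patched each pass
        ["ADDT", None],  # total += acc
        ["INCA", 0],     # self-modification: bump the address field of cell 0
        ["LOOP", 0],     # count -= 1; jump back to cell 0 while count > 0
        ["HALT", None],
    ]
    base = len(code)             # the array is stored right after the code
    mem = code + list(data)
    mem[0][1] = base             # point the LOAD at the first element
    acc = total = 0
    count = len(data)
    pc = 0
    while True:
        op, arg = mem[pc]
        pc += 1
        if op == "LOAD":
            acc = mem[arg]
        elif op == "ADDT":
            total += acc
        elif op == "INCA":
            mem[arg][1] += 1     # rewrite the instruction stored in memory
        elif op == "LOOP":
            count -= 1
            if count > 0:
                pc = arg
        elif op == "HALT":
            return total

total = sum_array([3, 4, 5])     # 12
```

Walking several arrays at once just means patching several load instructions the same way.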

Similarly, the LGP21 allowed
direct loading of the instruction register.

The beloved IBM 1130 had no stack or register
for the return address, so the jump-to-subroutine
wrote the return address to the first word of the subroutine
and started execution at the word after that.
Yes, each subroutine started with a spare word,
so recursion and re-entry were out of the question.
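
A small Python model of that linkage, with invented addresses (100 for
the subroutine's spare word, 10 and 105 for the two call sites), shows
why a nested call loses the outer return address:

```python
def demo_return_word():
    """Model a call that stores the return address in the subroutine's first word."""
    mem = {}
    SUB = 100                      # hypothetical address of the subroutine's spare word

    def call(return_addr):
        mem[SUB] = return_addr     # the call instruction writes into the code area
        # the body would then start executing at SUB + 1

    call(return_addr=10)           # outer call
    outer = mem[SUB]
    call(return_addr=105)          # recursive call from inside the body
    inner = mem[SUB]
    # When the inner call returns, the spare word still holds 105,
    # so the outer invocation can no longer find its way back to 10.
    return outer, inner, mem[SUB]
```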

I never saw the Basic-4 machine running but the manual described
how the machine instructions were practically microcode
(perhaps there's some natural overlap between microcode and
very-wide-word instruction machines, since they can manipulate
things on a very low level).
For example, there was a half-read instruction
and a half-write instruction to save time on core access.
Reading core memory is a destructive process,
so read-without-write-back is faster
(and naturally implements the test-and-set instructions).
If the core location was half-read, then writing back didn't require
an erase first, so a half-write saved time too.

Self modifying code isn't just a /bad idea/, it's illegal,
because many systems won't allow it (not ROM-able)
or it trips up the pipelining and caching.
--

-- mejeep deMeep ferret!

Walter Bushell

Feb 17, 2007, 12:59:19 PM

In article <er6d3e$a1h$1...@panix5.panix.com>,
je...@panix.com (Jeff Jonas) wrote:

Another example: in the 70s at NASA we were using machines with no
variable shift instruction. One of my cow-orkers made self modifying
code to simulate one by putting the number of bits to shift into
the shift instruction.
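
In Python terms the trick might be sketched like this. The instruction
encoding (opcode 0x10 in the high bits, shift count in the low 4 bits)
is an assumption made up for the example, not the actual machine's:

```python
def variable_shift_left(value, count, word_bits=16):
    """Shift `value` left by `count` by patching the count into a fixed-shift instruction."""
    SHL_OPCODE = 0x10                        # hypothetical opcode, count in low 4 bits
    instruction = SHL_OPCODE | 0             # as assembled: "shift left by 0"
    # The self-modifying store: overwrite the count field of the instruction.
    instruction = (instruction & ~0xF) | (count & 0xF)
    # Now "execute" the patched instruction.
    assert (instruction & ~0xF) == SHL_OPCODE
    mask = (1 << word_bits) - 1
    return (value << (instruction & 0xF)) & mask
```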

Oh yes, our machine did the same as the IBM 1130 for subroutine calls.
It became necessary to have absolute addresses and the only way to get
one was to use the subroutine calling instruction, e.g.:

JRST *+1
NOP ;ABSOLUTE ADDRESS OF LOCATION WILL BE PLACED HERE
LDA *-1 ; PICK UP ADDRESS

We used self modifying code all the time so it is not completely
undebuggable, just a couple of orders of magnitude harder.

I bet it's still in use for embedded applications.

--
"The power of the Executive to cast a man into prison without formulating any
charge known to the law, and particularly to deny him the judgement of his
peers, is in the highest degree odious and is the foundation of all totali-
tarian government whether Nazi or Communist." -- W. Churchill, Nov 21, 1943

krw

Feb 17, 2007, 4:52:45 PM

In article <proto-6F77D1....@reader2.panix.com>,
pr...@panix.com says...

I'd be very surprised if there was any use of self-modifying code
in recent applications, embedded or not. Processors that prefetch
code or have I-caches make a mess of self-modifying code. ...not
to mention that embedded types don't appreciate your orders-of-
magnitude increase in debug complexity.

--
Keith

Eric Sosman

Feb 18, 2007, 9:45:14 AM

Does the "Just In Time" compiler found in most Java
implementations count as a use of self-modifying code? Surely
the machine winds up executing instructions that did not come
from mass storage but were created on the fly. True, the JIT
does not modify its own instructions (as far as I know), but
the program as a whole modifies itself as it runs.

There's also the technique that was used (and may still be)
to accommodate 80386 systems that had no floating-point unit.
Compilers would emit calls to F-P emulation routines, but on a
system with hardware F-P the "emulator" would overwrite the CALL
that invoked it with the intended F-P instruction. I think I
read somewhere that some emulator CALLs were padded with NOPs
to make enough room for the potential overwrite.

--
Eric Sosman
eso...@acm-dot-org.invalid

krw

Feb 18, 2007, 10:00:08 AM

In article <LK2dnYRaQvvj-0XY...@comcast.com>,
eso...@acm-dot-org.invalid says...

IMO, no more than a BASIC interpreter.

> Surely
> the machine winds up executing instructions that did not come
> from mass storage but were created on the fly. True, the JIT
> does not modify its own instructions (as far as I know), but
> the program as a whole modifies itself as it runs.

Are you saying that a compiler generates self-modifying code? A
linker?

> There's also the technique that was used (and may still be)
> to accommodate 80386 systems that had no floating-point unit.
> Compilers would emit calls to F-P emulation routines, but on a
> system with hardware F-P the "emulator" would overwrite the CALL
> that invoked it with the intended F-P instruction. I think I
> read somewhere that some emulator CALLs were padded with NOPs
> to make enough room for the potential overwrite.

Still not self-modifying. These locations would be patched at load
time, not run time. Self-modifying code with modern processors is
*very* ugly. The I-caches aren't multi-ported, thus cannot be
written. Any modifications have to be written to memory (and D-
caches) then refetched into the I-Cache; ugly.

--
Keith

Anne & Lynn Wheeler

Feb 18, 2007, 10:19:03 AM

krw <k...@att.bizzzz> writes:
> Still not self-modifying. These locations would be patched at load
> time, not run time. Self-modifying code with modern processors is
> *very* ugly. The I-caches aren't multi-ported, thus cannot be
> written. Any modifications have to be written to memory (and D-
> caches) then refetched into the I-Cache; ugly.

past posts

http://www.garlic.com/~lynn/2007d.html#1 Has anyone ever used self-modifying microcode? Would it even be useful?

http://www.garlic.com/~lynn/2007d.html#3 Has anyone ever used self-modifying microcode? Would it even be useful?
http://www.garlic.com/~lynn/2007d.html#7 Has anyone ever used self-modifying microcode? Would it even be useful?
http://www.garlic.com/~lynn/2007d.html#9 Has anyone ever used self-modifying microcode? Would it even be useful?

and for store-into d-caches (as opposed to store-thru) ... you need
explicit operations to flush any data modifications from d-cache back
to main memory, then explicitly invalidate any corresponding locations
in the i-cache (or maybe just global cache operations, flush all of
d-cache to memory and then invalidate all of the i-cache) ... so that
i-fetch will result in pulling the modified locations from memory
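
a toy Python model of that sequence, assuming a machine with no
hardware coherence between the two caches (the class and method names
are invented for the sketch and do not correspond to any real
instruction set):

```python
class SplitCacheCPU:
    """Toy CPU with a store-into (write-back) d-cache and a separate i-cache."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.dcache = {}   # addr -> value, dirty until flushed to memory
        self.icache = {}   # addr -> value, filled on instruction fetch

    def store(self, addr, value):
        self.dcache[addr] = value            # stays in d-cache until flushed

    def fetch(self, addr):
        if addr not in self.icache:          # i-cache miss: fill from main memory
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def flush_dcache(self, addr):
        if addr in self.dcache:
            self.memory[addr] = self.dcache.pop(addr)

    def invalidate_icache(self, addr):
        self.icache.pop(addr, None)


def demo():
    cpu = SplitCacheCPU({0x100: "OLD"})
    first = cpu.fetch(0x100)         # instruction is now held in the i-cache
    cpu.store(0x100, "NEW")          # the modification only reaches the d-cache
    stale = cpu.fetch(0x100)         # i-fetch still sees the old instruction
    cpu.flush_dcache(0x100)          # step 1: push the change out to memory
    cpu.invalidate_icache(0x100)     # step 2: drop the stale i-cache copy
    fresh = cpu.fetch(0x100)         # the refetch pulls the modified location
    return first, stale, fresh
```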

there was a similar but different problem with the introduction of
168-3 for some installations. the 168-3 doubled the size of system
cache (vis-a-vis the 168-1) ... and used the "2k" address bit for indexing
the additional cache lines.

however, this meant that when running in 370 2k virtual page mode (as
opposed to 4k virtual page mode) ... the machine only ran with half
the cache (i.e. like a 168-1).

there were some number of installations that were running dos/vs
and/or vs1 (under vm370) on 370/168 ... and not only didn't see any
performance improvement with upgrade to 168-3 ... but actually saw a
performance decrease. the issue was that normally vm370 ran with
configuration set to 4k virtual page mode ... except when dispatching
a virtual machine with 2k "shadow tables". This could result in
constantly switching hardware configuration bit back and forth between
2k page mode and 4k page mode. Because the cache indexing used
different mapping in the two modes ... the hardware had to also
completely flush the cache every time the 2k/4k page mode
configuration bit was changed (resulting in customer upgrade to 168-3
with double the cache size, seeing worse thruput).

Eric Sosman

Feb 18, 2007, 10:47:38 AM

krw wrote:
> In article <LK2dnYRaQvvj-0XY...@comcast.com>,
> eso...@acm-dot-org.invalid says...
>> krw wrote:
>>>
>>> I'd be very surprised if there was any use of self-modifying code
>>> in recent applications, embedded or not. Processors that prefetch
>>> code or have I-caches make a mess of self-modifying code. ...not
>>> to mention that embedded types don't appreciate you orders-of-
>>> magnitude increase in debug complexity.
>

>> There's also the technique that was used (and may still be)
>> to accommodate 80386 systems that had no floating-point unit.
>> Compilers would emit calls to F-P emulation routines, but on a
>> system with hardware F-P the "emulator" would overwrite the CALL
>> that invoked it with the intended F-P instruction. I think I
>> read somewhere that some emulator CALLs were padded with NOPs
>> to make enough room for the potential overwrite.
>
> Still not self-modifying. These locations would be patched at load

> time, not run time. [...]

Perhaps I didn't explain myself clearly. The patching
occurred at run time, not at load or link or any other
pre-execution phase. The program as loaded contained CALLs
to F-P handlers. The first time each CALL was executed, the
handler would determine whether the machine supported hardware
F-P. If so, the handler overwrote the CALL, replacing it with
FADD or FMUL or whatever. If not, the handler emulated the
desired instruction in software.
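
A Python sketch of the scheme as recollected above. The opcode names
CALL_FADD and FADD and the list-based instruction format are invented
for illustration; this is not any particular compiler's emulator:

```python
def run(program, has_fpu):
    """Run [op, a, b] instructions; a CALL_FADD site patches itself if an FPU exists."""
    handler_calls = 0
    results = []
    for instr in program:
        if instr[0] == "CALL_FADD":
            handler_calls += 1               # we entered the F-P handler
            if has_fpu:
                instr[0] = "FADD"            # overwrite the call site in place
            else:
                results.append(instr[1] + instr[2])   # emulate in software
                continue
        if instr[0] == "FADD":               # native (or freshly patched) instruction
            results.append(instr[1] + instr[2])
    return results, handler_calls
```

Running the same program twice with `has_fpu=True` enters the handler
only on the first pass; with `has_fpu=False` the call sites are left
alone and the handler is entered every time.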

That, at least, is my recollection, possibly self-modified ;-)

--
Eric Sosman
eso...@acm-dot-org.invalid

David Powell

Feb 18, 2007, 1:01:49 PM

In article <er6d3e$a1h$1...@panix5.panix.com>,

je...@panix.com (Jeff Jonas) in alt.folklore.computers wrote:

>The beloved IBM 1130 had no stack or register
>for the return address, so the jump-to-subroutine
>wrote the return address to the first word of the subroutine
>and started execution at the word after that.

Likewise for the PDP8. Further, if the subroutine was called from any
arbitrary memory field, then it would be called with the data field
set to that of the call. The subroutine code reads the DF, adds in a
literal CIF 0 instruction to build an instruction to change the
instruction field to that of the call, and writes the instruction into
the (hopefully spare) location immediately before the return jump
instruction.
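
The arithmetic can be sketched in Python. This assumes the usual PDP-8
memory-extension encoding as I recall it (CIF n is 62n2 octal, and RDF
leaves the field number in AC bits 6-8, i.e. shifted left by three
bits); the function name is made up:

```python
CIF_0 = 0o6202   # "change instruction field" to field 0 (CIF n is 62n2 octal)


def build_cif(data_field):
    """Build the CIF instruction for `data_field` (0-7), as RDF then TAD would."""
    if not 0 <= data_field <= 7:
        raise ValueError("PDP-8 fields are 0-7")
    rdf_result = data_field << 3      # RDF puts the field number in AC bits 6-8
    return CIF_0 + rdf_result         # adding the literal CIF 0 completes the opcode
```

For a call from field 3 this yields 6232 octal, which the subroutine
then stores just ahead of its return jump.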

>Yes, each subroutine started with a spare word,
>so recursion and re-entry was out of the question.
>

No, it became a challenge!

<snip>

>
>
>Self modifying code isn't just a /bad idea/, it's illegal,
>because many systems won't allow it (not ROM-able)
>or it trips up the pipelining and caching.

Again, it became a challenge! MR8-Fb memory is UV PROM for the
omnibus PDP8. It has one extra bit per word: set the bit and the word
becomes a pointer to a word of RAM and behaves as if in RAM.
There is no limit to the misuse of man's ingenuity.

Regards,

David P.
.

krw

Feb 18, 2007, 2:17:34 PM

In article <7M6dnZcO7des6EXY...@comcast.com>,

I'm not buying your recollection. ;-)

As I've said, modifying code that may have been fetched into the I-
cache (or pipeline) is really ugly. I doubt anyone does this on
purpose anymore. It's far easier to test for the presence of FP
and either patch at load time or point to the appropriate runtime
library. Of course the obvious alternative is to simply trap on
FP operations.

--
Keith

Peter Flass

Feb 18, 2007, 3:21:26 PM

Eric Sosman wrote:
> Does the "Just In Time" compiler found in most Java
> implementations count as a use of self-modifying code? Surely
> the machine winds up executing instructions that did not come
> from mass storage but were created on the fly. True, the JIT
> does not modify its own instructions (as far as I know), but
> the program as a whole modifies itself as it runs.

It's not microcode, if that's still the subject, and it's also not new.
"Just-in-time" is what we used to call an incremental compiler, where
a statement is compiled just before it's executed. This is a subset of
"load-and-go" compilers, which compiled a program to memory and then
executed it immediately.

>
> There's also the technique that was used (and may still be)
> to accommodate 80386 systems that had no floating-point unit.
> Compilers would emit calls to F-P emulation routines, but on a
> system with hardware F-P the "emulator" would overwrite the CALL
> that invoked it with the intended F-P instruction. I think I
> read somewhere that some emulator CALLs were padded with NOPs
> to make enough room for the potential overwrite.
>

I've seen the reverse more often. Unimplemented instructions trap to an
emulation routine. I don't know if anyone ever bothered to overwrite
them after the first trap.

Christopher C. Stacy

Feb 18, 2007, 3:56:17 PM

Peter Flass <Peter...@Yahoo.com> writes:

The MACLISP compiler (PDP-10) emitted code that would be dynamically
modified at run-time. The first time you called a compiled function,
it would XCT an address to the routine that would locate the function.
That target of the XCT instruction would then be overwritten with the
cached value of the intended function. I don't know if you want to
count this as self-modifying "code" or not; it's similar to updating
a dispatch table.
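
A rough Python analogue of that first-call linking, using a one-slot
cell in place of the XCT target (the names `make_lazy_call`, `cell` and
`registry` are invented; this is not MACLISP's actual mechanism):

```python
def make_lazy_call(name, registry):
    """Return a caller whose link cell is overwritten with the target on first use."""
    cell = {}
    lookups = []                        # records each slow-path lookup, for illustration

    def stub(*args):
        lookups.append(name)
        target = registry[name]         # locate the function (the slow path)
        cell["target"] = target         # overwrite the slot: later calls go direct
        return target(*args)

    cell["target"] = stub               # initially the cell points at the locator

    def call(*args):
        return cell["target"](*args)    # always dispatch through the cell

    return call, lookups
```

The second and later calls never reach `stub`, so `lookups` stays at
one entry no matter how often the function is called.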

Eric Sosman

Feb 18, 2007, 4:57:54 PM

At the risk of tedium ...

A disadvantage of the "trap on unimplemented instruction"
method is that there's usually quite a lot of overhead. The
trap mechanism itself may be relatively expensive: there may
be a switch into and out of privileged context, possibly a
change to a different execution stack, that sort of thing.
Once the trap is taken, the handler usually needs to decode
the trapped instruction before it can figure out what emulation
to perform, where the operands are to be found, and where the
results are to be delivered. If the handler runs in a more
privileged context than does the trapped code, all the operands
and destinations need to be access-checked. Using a trap is
conceptually simple, but can be painfully slow. (Pre-emptive
rejection: "He didn't buy an FPU, so he deserves slowness"
seems to me a blame-the-victim argument.)

A plain vanilla CALL to a subroutine, on the other hand,
avoids much of this effort. There's the subroutine linkage
itself; not free, but likely cheaper than trap overhead. And
there's a hidden cost in marshalling arguments: the compiler
might have been able to make better register allocations if it
weren't forced to cope with a CALL in the middle of things. On
the other hand, execution goes immediately to the proper emulator
without the need for a decoding step, the operands and destination
are in the places specified by the subroutine linkage API and
don't need to be sifted out from all the multitude of places a
trapped instruction might have designated, and there's no worry
about privilege breaches because the emulating subroutine runs
in the same context as the caller.

Self-modifying code presents some difficulties, yes: You've
mentioned cache coherency, and there are others. But going from
"X is difficult" to "X is never done" seems too long a step.

--
Eric Sosman
eso...@acm-dot-org.invalid

Peter Flass

Feb 19, 2007, 7:52:43 AM

I believe this is how DLLs and DSOs are handled now on various systems.
This is what Multics did for external references, all of which were
resolved at run time.

krw

Feb 19, 2007, 9:46:33 AM

In article <oIKdncAymZp6VkXY...@comcast.com>,

There is also a large overhead writing to storage that may be
prefetched or I-cached.

> The
> trap mechanism itself may be relatively expensive:

As is any write into the instruction pipeline (including I-cache).

> there may
> be a switch into and out of privileged context, possibly a
> change to a different execution stack, that sort of thing.
> Once the trap is taken, the handler usually needs to decode
> the trapped instruction before it can figure out what emulation
> to perform, where the operands are to be found, and where the
> results are to be delivered. If the handler runs in a more
> privileged context than does the trapped code, all the operands
> and destinations need to be access-checked. Using a trap is
> conceptually simple, but can be painfully slow. (Pre-emptive
> rejection: "He didn't buy an FPU, so he deserves slowness"
> seems to me a blame-the-victim argument.)

Well, the victim didn't buy the hardware needed for the
application. Problem?

> A plain vanilla CALL to a subroutine, on the other hand,
> avoids much of this effort. There's the subroutine linkage
> itself; not free, but likely cheaper than trap overhead. And
> there's a hidden cost in marshalling arguments: the compiler
> might have been able to make better register allocations if it
> weren't forced to cope with a CALL in the middle of things. On
> the other hand, execution goes immediately to the proper emulator
> without the need for a decoding step, the operands and destination
> are in the places specified by the subroutine linkage API and
> don't need to be sifted out from all the multitude of places a
> trapped instruction might have designated, and there's no worry
> about privilege breaches because the emulating subroutine runs
> in the same context as the caller.

So use separate DLLs for FP/^FP systems.

>
> Self-modifying code presents some difficulties, yes: You've
> mentioned cache coherency, and there are others. But going from
> "X is difficult" to "X is never done" seems too long a step.

Ok, show me a modern use of self-modifying code used for such
things. Hell, show me an antique!

--
Keith

Scott McPhillips [MVP]

Feb 19, 2007, 1:49:58 PM

krw wrote:
> Ok, show me a modern use of self-modifying code used for such
> things. Hell, show me an antique!
>

Extreme but legitimate case, actually in modern use: Obfuscation of
anti-piracy measures within a commercial application's binary image. Of
course, I can't show it to you because that might defeat the purpose!

krw

Feb 19, 2007, 7:23:36 PM

In article <nKOdnXfMlro0bETY...@comcast.com>, "Scott
McPhillips [MVP]" <org-dot-mvps-at-scottmcp> says...

Foul!! obfuscation <> "such things". ;-) (the uglier the better)

Self-modifying code was used to determine processors in the early PC
days. One would write a location (IIRC) at PC+5 then see what
happened. If the modification was taken the processor was an 8088
(4-byte prefetch), if not it was an 8086 (6-byte prefetch). This
also shows that instruction writes aren't protected/coherent.
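
A simplified Python model of the test as described here. It assumes the
prefetch queue is completely full at the moment of the store and takes
the PC+5 offset from the recollection above; real detection code had to
arrange the timing carefully:

```python
def modification_seen(queue_bytes, offset=5):
    """Return True if a store `offset` bytes ahead of the PC is executed as modified."""
    memory = ["OLD"] * 16
    pc = 0
    # Assume a full queue: it holds a snapshot of the next `queue_bytes` bytes.
    queue = {addr: memory[addr] for addr in range(pc, pc + queue_bytes)}
    target = pc + offset
    memory[target] = "NEW"                         # the self-modifying store
    # A byte that was already prefetched is executed from the stale queue copy.
    executed = queue.get(target, memory[target])
    return executed == "NEW"


def guess_cpu(queue_bytes):
    """Classify by whether the modification took effect, as in the post."""
    return "8088" if modification_seen(queue_bytes) else "8086"
```

With a 4-byte queue the byte at PC+5 has not been fetched yet, so the
change is seen; with a 6-byte queue the stale copy runs instead.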

--
Keith

Paul Repacholi

Feb 21, 2007, 9:54:48 AM

"Quadibloc" <jsa...@ecn.ab.ca> writes:

> Self-modifying microcode, however, was never a traditional technique,
> even in the wild and wooly early days of computing. In general, a
> computer's microcode wasn't even in the computer's *address space*,
> and so self-modifying microcode would not have been possible.

> However, the Packard Bell 440 computer *did* put its microcode in the
> same address space as main memory... so there is an architecture that
> *could* make use of this (extremely dangerous) technique.

Another was the lowest* of the 360 line, it stored its microcode in core.
Someone did a ??? to microcode compiler and stuff to load apps in ucode.
It was reputed to about match a 360/4x class machine.

*360/20? 25?

Walter Bushell

Feb 21, 2007, 1:49:49 PM

In article <MPG.2043823c8...@news.individual.net>,
krw <k...@att.bizzzz> wrote:

Creating a variable shift or rotate instruction, actually used when I
worked at NASA.

Storing return address for subroutines.

Getting the absolute address of the current sector.

Last two are not strictly speaking modifying _code_, but it is in the
same area.

Jeff Jonas

Feb 23, 2007, 12:54:08 PM

In article <nKOdnXfMlro0bETY...@comcast.com>,

Scott McPhillips [MVP] <org-dot-mvps-at-scottmcp> wrote:
>krw wrote:
>> Ok, show me a modern use of self-modifying code used for such
>> things. Hell, show me an antique!

>Extreme but legitimate case, actually in modern use: Obfuscation of
>anti-piracy measures within a commercial application's binary image.

I see 2 cases: binary-only code for general purpose systems,
and embedded systems.

For general systems (PCs, workstations), it's reasonable for programs
to obfuscate security keys and such in the files to resist grepping,
but reconstruct the key in memory only while running.
Malware is using that to thwart anti-virus software,
mostly by delivering payload that's compressed with unusual methods
and/or encrypted with keys that are acquired when they "phone home".

Embedded systems that use chips such as the AVR or PIC
have the ROM on the same die as the CPU, meaning
- there are no external pins to access the memory directly
as with ROM/EPROM/EEPROM/FLASH memory chips
we see on motherboards for the BIOS
- they offer security settings for portions of the ROM
to be execute only (cannot read), or erase protected.
Still, the RAM/registers must eventually contain the required keys,
thus the recent cracking of the DVD/Blu-ray systems
when the valid secret key resided in RAM.