On the contrary, readers of comp.arch should be vitally interested about
FIBSTACK, since it's all about architecture.
The Burroughs B6x00/B7x00 systems and their successors to the present
have always had instructions to directly read (RPRR) and write (SPRR)
the D (display) registers, but these were rarely used outside of
diagnostic software, and they were not necessary to implement the type
of addressing that Louis describes. The registers are set by the normal
display register update that is done during a procedure call.
There are three ideas behind this concept:
1. A "stack" on the Burroughs systems is three things: (a) a push-pop
workspace for expression evaluation, (b) a storage area for
procedure-call history, and (c) and addressing environment for the
nested scopes needed by a block-structured programming language. That
last is the most important to this discussion.
2. A stack frame that implements the storage for locals of a procedure
is just a contiguous chunk of words in memory. If the stack frame is
part of the current execution scope, then there will be a D register
pointing to the first word of that frame. Therefore, the words of that
frame can be addressed as locals using normal address couples,
(LL,offset), where LL (lexicographic level) is the D register index and
offset is the displacement from the base of the stack frame to the word
desired. This is the most efficient form of addressing available to
Normal State programs.
3. In a block-structured language such as Algol or Pascal, there are D
registers pointing not only to the base of the current stack frame, but
to the bases of all of the lower-level stack frames that are currently
in scope. All of their words are also accessible using that same
efficient address-couple mechanism. Therefore, all you have to do is
arrange for the data structure you want to access to look like a
more-global stack frame of a procedure you call.
Also note that procedures can be called by indirect references (e.g., a
procedure identifier passed by name) and that a more-global stack frame
does not need to be part of the same program's stack. The Burroughs
architecture implemented cross-stack addressing (sometimes called
"cactus stack" addressing). See Hauck and Dent's 1968 AFIPS paper for
Also, the B6700 Reference Manual is available at
(in particular, see
The original idea behind this type of addressing probably was to
implement hierarchical process families, but it turned out to have a
number of other useful applications as well. It just took a while to
In the general case, the setup and activation of this type of addressing
can get really complex, and even the simple cases are not all that
simple. There are three control words that do most of the heavy lifting,
The Stuffed Indirect Reference Word (SIRW), the Mark Stack Control Word
(MSCW) and the Program Control Word (PCW).
I'll describe how these worked on the B6x00/B7x00 systems. The details
have changed quite a bit over the years, but the concept has remained
the same. In the following, the notation [S:N] defines a bit field
within a word, with S denoting the starting bit number (high order=47,
low-order=0) and L denoting the number of bits to the right (towards bit
0) from the starting bit.
An SIRW (tag=1) is used to point to a word in a stack. Its main use is
to point to words that are (or may be) outside the current scope chain.
[There used to be an un-stuffed IRW that pointed only to words in the
current scope; it is no longer present in modern systems]. It consists
of three main fields.
[45:10] the stack number of the stack in which the word exists.
Each stack is defined by a data descriptor having its base address and
length. All of these stack descriptors are arranged in a one-dimensional
array, the Stack Vector, indexed by their stack number. A descriptor for
the Stack Vector is stored at (0,2) in the MCP stack. This is an address
couple known to the hardware.
[35:16] the displacement from the base of the stack identified
above to the start of the stack frame containing the word being
addressed. The word at the base of that stack frame must be an MSCW.
[12:13] the offset from the start of that stack frame to the word
being addressed. Thus, if you add this offset and the displacement
above, that yields the offset to the word from the base of stack. The
two values are stored separately because there are cases (particularly
during procedure call) where you need to determine the scope within
which the referenced word exists.
An MSCW (tag=3) provides stack frame and scope linkage. Its major fields
[45:10] stack number of the stack that contains the stack frame for
the next lower-level (more-global) scope.
[35:16] displacement from the base of that stack to the start of
the stack frame for the next more-global scope. The word at the start of
each stack frame must be an MSCW. Thus, this field and the stack number
link to the stack frames for the more-global scopes, and allow the
processor to restore the D registers for the scope chain.
[18:5] lexicographic level associated with this stack frame.
[12:13] offset to the prior stack frame in this stack. This links
to the stack frame of the immediate caller, and allows the stack to be
cut back when a procedure exits.
A PCW (tag=7) defines the entry point to a procedure or subroutine. Its
major fields are:
[45:10] stack number of the stack containing this PCW.
[35:3] the byte offset within the first word of code where
execution is to begin.
[32:13] the word offset within the code segment for the procedure
where execution is to begin.
[18:5] the lexicographic level at which the procedure will run.
[12:13] the displacement within the program's Segment Dictionary to
the descriptor for the procedure's code segment. The Segment Dictionary
is itself a stack, with its own stack number, but is used only as a
read-only addressing environment. It consists of a single stack frame,
which for Normal State programs is always addressed at LL=1 (the global
data environment for most use programs is LL=2, and the MCP globals are
at LL=0). The Segment Dictionary contains descriptors for code segments,
descriptors for read-only data segments (e.g., Algol Value Arrays), and
single-word constants. Since the Segment Dictionary is at a more global
level, contains only read-only elements, and is implemented as a
separate stack, it is sharable among multiple data stacks. Hence,
universal and automatic code reentrancy.
Thus, the PCW indicates the segment, word, syllable (byte), and lex
level of the entry point to the procedure.
The more subtle thing about a PCW is that its _location in the stack_
determines the addressing scope for the procedure. The stack frame in
which the PCW resides becomes the next more-global scope for the called
procedure. The MSCW for that stack frame then links to the scopes that
are more global than that, if any.
This is the key to getting the D registers to address data areas that
would not normally be considered to be part of a stack. If we can put an
MSCW at the beginning of the area that contains appropriate linkage
values, then place PCWs within that area, then construct SIRWs that
point to those PCWs, and use those SIRWs in a procedure-call sequence,
the called procedure will have the data area as its next more-global
addressing environment, and can access the words of that area using
efficient address-couple addressing. What's more, the MSCW at the
beginning of that data area can be linked to yet more-global
environments, so you could implement a nicely hierarchical addressing
scheme for multiple data areas.
To me, the really cool thing is that all of the register setup and
restoration is completely automatic. Procedure call automatically loads
the D registers with the necessary values. Procedure exit automatically
restores the original state. Don't need no stinkin' instructions to set
A typical procedure call sequence would look like this:
MKST mark stack: construct and push a MSCW
NAMC (LL,n) name call: push an IRW to the PCW or its SIRW
(code to push any parameters to the procedure)
ENTR enter the procedure
That's it -- four bytes of code, plus whatever is necessary to evaluate
The FIBSTACK mechanism that Louis mentioned was the first (that I know
of) to exploit this addressing technique, and it's a good example to
explore. It was implemented as part of the complete rewrite of Logical
I/O (LIO) that was released in the Mark II.0 MCP. LIO is the user-level
API used to access files. The original implementation was based on that
of the B5500, but was an ambitious generalization of many of the B5500
I/O concepts, including the idea of File Attributes. Alas, the original
implementation was complex, buggy, and really, really slow.
Part of the problem was that with all of that generalization, there were
a lot of features and options to choose from, and if you tried to
implement selection of the necessary features in a straightforward way,
it took a lot of tests to figure out what a particular I/O call had to
do. The solution was to break up that one big, mother READ routine into
a bunch of small, tightly focused routines that each addressed a
specific situation, and to build a mechanism that could efficiently
select and execute the appropriate one in any given case.
A FIB (File Information Block) is a system-managed data structure that
represents a file declared in a program. Note that we generally use
"file" to represent two entirely different, but related things. There is
"file" meaning a physical collection of data -- a reel of tape, a set of
areas or extents on disk, etc. Then there is "file" meaning the API we
use to access those physical entities. This discussion is oriented
towards that latter meaning.
Physically, a FIB is just an array of words in memory. A file
declaration in a program is represented at run time by a data descriptor
for that array. Like most such things in the B5500 and successor
systems, it is allocated on first reference. Another data structure
stored in the code file and addressed through the Segment Dictionary
contains a list of attribute/value pairs used to initialize the FIB and
set values established by the file declaration. As part of that
initialization, any attributes specified by file equations in the
initiating environment (e.g., FILE statements in WFL) will be merged
with the compiled-in attributes. User programs do not manipulate the
contents of the FIB directly; they do that through file attribute
assignments and I/O statements.
The LIO routines in the MCP do access the FIB array directly, however,
and in the original implementation that was done by indexing the FIB
descriptor, a more expensive form of addressing than that of address
couples. Thus, a second goal in the redesign was to give the LIO
routines much more efficient access to the words in the FIB.
The original way this was done on the B6500/6700 was to create an
additional set of descriptors in the Stack Vector that effectively
mapped all of physical memory. A stack for that version of the
architecture could be up to 64Ki words in size, and maximum memory was
1Mi words, so only 16 additional descriptors were needed.
The MCP initialized the FIB array so that FIB contained an MSCW. This
was a dummy MSCW, in that there was no prior procedure call, and no
more-global scope. It was there to satisfy the addressing requirements
of the hardware so that the D registers could be loaded and restored
FIB contained an SIRW termed the "Selector". This SIRW had a stack
number corresponding to the 64KiW chunk of memory in which the FIB was
allocated, and a displacement in [35:16] from the beginning of that
chunk to the MSCW in FIB. The offset field in [12:13] pointed into
the FIB, but the low-order three bits were zero. This SIRW pointed to a
PCW for a sequential WRITE routine. The next word was a PCW for a
sequential READ routine; the word after that was for a random WRITE
routine, etc. There were eight such procedures; modern systems support
up to 16.
To perform an I/O call, the code in the user program generated by the
compiler would first mark the stack. Then it would index and load the
word at FIB onto the top of stack. It would then OR into the
low-order bits of the SIRW the offset for the particular type of I/O
call being done (0=sequential write, 1=sequential read, 2=random write,
3=random read, etc.) It could not do an ADD to apply this offset, since
the SIRW has tag=1 and is not considered to be a data operand, but the
LOR operator would tolerate the non-data tag.
Now we have an SIRW on top of stack, above the MSCW pushed by the
mark-stack operator, which points to a PCW in the FIB. The
compiler-generated code in the user program completes the LIO call by
pushing parameters into the stack (a value used to specify random record
address/carriage control, a length, an indexed descriptor to the
program's record area, etc.) An ENTR operator completes the call by
pushing a return address onto the stack, using the linkage in the SIRW
to traverse the new scope chain and load the D registers as necessary,
load the new code address from the PCW selected in the FIB, and start
executing the code for the LIO procedure.
Note that the FIB itself _is not passed as a parameter_ in the call.
Instead, the fact that the PCW was located in the pseudo-stack frame
contained within the FIB makes the area of memory starting with the
FIB's MSCW the immediately global scope for that procedure, and
therefore the LIO procedure can access words of the FIB using address
couples having offsets relative to the MSCW. In Algol terms, the FIB
looks to the procedure like that procedure's containing block.
Another aspect of this implementation, and one that probably had more
impact on performance than the clever addressing scheme, was that the
set of PCWs indexed in the FIB were mutable. There were many dozens of
LIO procedures, possibly hundreds. Each procedure that was optimized for
a specific I/O situation, and contained as little internal testing and
branching as possible. There were, for example, separate routines for
blocked and unblocked I/O, translated vs. untranslated I/O, tape vs.
disk vs. printer I/O. If the conditions under which I/O was being
performed changed (say, switching between sequential and random reads),
even while the file was open, LIO would replace certain PCWs with others
that were more suitable for the new conditions. This behind-the-scenes
adaptation was transparent to user programs.
Even though the refactoring of LIO into lots of smaller, streamlined
procedures was responsible for most of the performance gain, it was the
architectural support in the system that allowed that approach to be
implemented so efficiently.
As an aside, the master set of LIO PCWs was stored in an array within
the MCP. I became intimately familiar with that array during one
episode, probably in late 1971. We had just installed the Mark II.0 MCP,
which not only included the first release of FIBSTACK Logical I/O, but
an equally re-engineered physical I/O implementation. Our B6500 started
throwing fatal dumps at random intervals. None of us knew anything about
FIBSTACK at that point, just that we had a new (and MUCH faster) I/O
implementation. Analysis of the dumps (at that time still done on an
8.5x11x11 stack of paper with a red pen) showed the processor trying to
do a procedure call against a word that had a tag of 5 (data
descriptor), not 7 (PCW), so the question was, why are we trying to do a
procedure call against a data descriptor? The dumps also showed that it
was supposed to be a PCW in a FIB. After a lot of time pouring over the
MCP listing and tracing things back in the dumps, we found the master
array of LIO PCWs, and lo, *ALL* of them had tags of 5. So the next
question was, what could cause all of the words in an array to suddenly
change from tag 7 to tag 5? The answer, some days later, turned out to
be overlay (virtual memory) I/O. That master array of PCWs was
overlayable, and memory allocation pressure was occasionally forcing it
out to disk. We eventually discovered that the problem only occurred
when the overlay I/O was done by the fourth of the four Multiplexor I/O
channels on the system. That channel would be selected only if the other
three were already busy, which up to that point they probably never had
been. The new LIO and physical I/O implementations were so much more
efficient, though, that the system started doing many more I/Os in
parallel, and channel #4 started to be used occasionally. Once we
determined the problem was specific to that channel, we turned the
problem over to the FEs, who eventually found a wire missing from the
backplane. And yes, that wire carried a signal for the second bit of the
tag. One quick wirewrap job later, the dumps disappeared.
40+ years later, the FIBSTACK mechanism is still present in the modern
Unisys ClearPath MCP, although a lot of low-level hardware details have
changed, largely to accommodate larger stack numbers and memory address
widths. It has held up well against the significant changes in I/O
technology that have occurred over the years. As best I can tell from a
fading memory and samples of current compiler code generation, the
calling sequence in user programs has not changed at all. For example,
here is modern code for an Algol sequential WRITE statement:
WRITE (F, 100, A);
MKST AE mark the stack
ZERO B0 push zero index for the FIB
NAMC (02,0004) 5004 push SIRW to the FIB descriptor
STFF AF stuff the SIRW (these days it's a no-op)
INDX A6 index the FIB descriptor by the zero
LOAD BD push the word at FIB onto the stack
ZERO B0 push record/CC (not used with sequential)
LT8 100 B264 push the number of units to write
ZERO B0 push zero index for the record area
NAMC (02,0003) 5003 push SIRW to the user's record area
INDX A6 index to element zero of record area
LT48 BE 080000000080 (push some control data)
ENTR AB enter the LIO procedure
Note that since this is a sequential write at offset zero within the
FIB's PCWs, no offset had to be OR-ed into the Selector SIRW at FIB.
For other operations, that code would be between the first INDX operator
and LOAD. For example, if this were a random read (which uses offset 3),
these additional instructions would appear:
LT8 3 B203 push a literal 3
Algol Structure Blocks are indeed similar to the FIBSTACK mechanism, and
are essentially a generalization of it that is safe to use in user-level
code. They are something of an object-oriented mechanism. You define a
Structure Block as a "type" (somewhat like an OO class), then use that
type to declare either instances or arrays of instances for that type.
Allocation of memory for an instance occurs on first reference. There is
no "new" operator or its equivalent.
Structure Blocks have a body, which can contain data and procedure
declarations, similar to the declarative part of a standard Algol block.
Each declaration can be marked as public (visible outside the block) or
private. Most Algol declarations are allowed within a Structure Block,
but there are some restrictions, e.g., files are okay, but DMSII
database invocations are not. PROLOG and EPILOG procedures can be
defined within the Structure Block to initialize the block's members
upon instantiation (at first reference) and wrap up prior to deallocation.
When you call a public procedure member of a Structure Block, the
addressing environment is similar to that for FIBSTACK: the members of
the block's body appear to have been declared in the procedure's
containing block, which in fact they were. Thus, they can be accessed by
the procedure using address couples.
The Structure Block type declaration generates some "stack-building"
code to provide basic initialization during instantiation. Each
Structure Block instance is represented by a data descriptor. These
instance descriptors are initialized by the compiler as a request token,
similar to the request tokens used for arrays. When one of these request
tokens is first touched, it causes a system interrupt. In response to
that interrupt, the MCP allocates space for the instance, stores in that
memory area an MSCW and a "Selector" SIRW similar to those used by
FIBSTACK, and then executes the stack-building code in the context of
that memory area as if it were a piece of a normal stack. Zeroes are
pushed by ZERO operators to initialize scalar variables, request tokens
are constructed and pushed for arrays and other complex entities, and
PCWs are pushed by MPCW operators for the public and private procedures.
Any PROLOG procedure would be executed at the end of the stack-building
Public members of an instance are accessed the same way that the LIO
procedures are accessed in a FIB -- the compiler generates code to index
the instance descriptor and load word 1 (the Selector SIRW) onto the
stack. It then generates code to OR an offset for the desired member
into that copy of the SIRW, at which point you can use that SIRW to
access one of the data members or call one of the procedure members.
To illustrate, here is a very simple program using Structure Blocks:
TYPE STRUCTURE BLOCK DEMO;
PRIVATE INTEGER COUNT;
PRIVATE INTEGER VAL;
PRIVATE REAL AVG;
PUBLIC INTEGER PROCEDURE INC(I);
VALUE I; INTEGER I;
AVG:= (AVG*(COUNT-1) + VAL)/COUNT;
PUBLIC REAL PROCEDURE GETAVG;
PROLOG PROCEDURE INIT;
AVG:= VAL:= COUNT:= 0;
DEMO ARRAY A[0:9];
The code to call B.INC looks like this:
NAMC (02,0004) 5004 push SIRW to B onto the stack
ONE B1 push literal 1
INDX A6 index the instance descriptor
LOAD BD load the Selector SIRW at B
LT8 6 B206 push the offset to INC
IMKS CF Insert a MSCW below the Selector
LOR 91 OR the offset into the Selector
LT8 4 B204 push the parameter (4) to INC
ENTR AB enter the INC procedure
DLET B5 pop the return value, not used
Since this is the first time that B was actively referenced in the
program, an interrupt would have occurred on (I think) the INDX operator
to allocate and initialize the instance.
The code to access A.INC is similar to that for B, just prefixed by
an extra step to index A to fetch the instance descriptor.