I am just confused about the following, and I hope someone can clear
it up for me :)
I have read about paging, and I understand that on the Pentium the CR3
register holds the starting address of the page table in memory.
This register is generally called the Page Table Base Register (PTBR).
Pure segmentation-based mapping also needs a table; however, it is
often fully cached in the processor, as it tends to be small.
Now it seems the modern processor instruction set, implicitly or
explicitly, deals with different segments in the application, i.e. the
code is segmented. We don't have only one linear address space but
many. However, Windows NT and Linux both ignore segmentation memory
mapping, since according to the documentation all the application
segment descriptors are equal: they are all mapped into the same
linear space with base = 0 and limit 4GB (32-bit processors).
Having said that, how can the page table of each segment be retrieved
when there is only one PTBR in the system? There should not be only one.
The Intel scheme doesn't use a page table per segment -- it uses the
segmentation hardware to map to a linear virtual address space, and
then uses the paging system to map this to physical memory.
No. There is only one linear address space. Different segments can start
at different offsets within that linear space (and can overlap), but the
result is always a simple linear address.
>However, Windows NT and Linux both ignore segmentation memory
>mapping, since according to the documentation all the application
>segment descriptors are equal: they are all mapped into the same
>linear space with base = 0 and limit 4GB (32-bit processors).
>Having said that, how can the page table of each segment be retrieved
>when there is only one PTBR in the system? There should not be only one.
There is only one set of page tables. You start with a segment number and
offset (ie, CS:123456). You add the starting address for that segment to
the offset, and that produces a linear address. You look up the linear
address in the page tables, and that produces a physical address. The
physical address goes out on the bus.
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
Many thanks for your reply. I said above, and you agreed, that Linux
and Windows ignore segmentation by setting the base in the segment
descriptor to zero. Therefore, the linear addresses of CS:123456 and
DS:123456 are the same and equal to base + 123456 = 123456. As such, any
address of the code segment, data segment, and any other segment will
point to the same entry in the page table. How can this happen? Sorry,
but I am still confused.
Thanks Joe.... And since Windows and Linux ignore segmentation by
setting the base value of all descriptors to zero, how can we protect
the different segments from each other? As I said above, CS:offset and
DS:offset will both be translated to the value offset and then passed
to the page table... where they will index the same entry!!
Yes, they all point to the same place in the same pagetable, because
they access the same byte of memory. The only difference is what you are
allowed to do to the byte. If it is accessed through the CS, then you
can execute it. If it is accessed through the DS, then you can
read/write it. What you can do is then controlled further by the
settings of the writable/executable/kernel/user flags in the descriptor
used. Thus, if you access the byte through the CS, but the CS is set to
be a user segment, then if the pagetable lists the page as either data,
or kernel data, or kernel code, then you will get a fault.
In other words, the pagetable defines what type of memory something
is, whereas the segment descriptor defines what you are trying to do to
the memory. Thus, if your user code is given only a user data and a user
code descriptor, (remember that it cannot create its own descriptors)
then it can only access pages marked as user data or user code. If the
descriptor and the pagetable entry do not match according to a specific
set of rules, then a fault occurs.
Hope that helps.
But you can also use the pagetable to limit a page to execute-only, with
no read. (That probably screws up any compilers that store constants in
the code segments, but it can be done.)
Thanks Matt for your answer, but I am still confused despite trying to
look over more online materials. May I give the following example:
assume a user application with 2 segments, code and data, where each
segment has 4 pages. Therefore the operating system will create a page
table of 8 valid entries. However, the page numbers in the code segment
range from 0 to 3, as do the page numbers in the data segment. They
fully overlap in half of the page table, and as such the application's
physical address space can't be fully addressed. I really don't
know what I am missing...
No, the OS creates a page table of four entries. In the usual 32 bit
flat model, the code and data segments map to the same place in the
linear address space. If a program makes a code reference to CS:100,
the segment setup will map that to location 100 in the linear address
space, which is in page 0, so it's 100 bytes into wherever the OS maps
page 0. If a program makes a data reference to DS:100, it is mapped
to the exact same place, location 100 in the linear address space,
which being the same location is also 100 bytes into the mapped page 0.
For most purposes, on the 32 bit x86 it's easiest to pretend that the
segments don't exist at all. The OS generally sets up all the segments
to do a 1:1 mapping of each segment into the linear address space.
:) Thanks John, but after reading your post I got confused further.
Your explanation means that DS:100 and CS:100 will point to the same
byte/word in physical memory (as they transit through the same
page table entry and both have the same relative offset), although that
physical location should hold two separate items... Please forgive my
ignorance... it seems I am missing an elementary concept.
There are two totally different mechanisms for protecting information.
1) Putting it in NON-overlapping segments. Thus CS:100 and DS:100 are
different addresses, in different pages. You separate your 4Gb address
space into chunks which are then given to processes. Each chunk
(segment) has a specific type, such as user data, kernel code etc. Each
is seen by the code as starting at 0, and ending at some arbitrary number.
2) Putting it in overlapping segments, but separating the
code/data/kernel/user stuff by actually giving them different addresses.
The pagetable then controls the protection, more than the
segmentation, and each segment starts and ends at some arbitrary value.
This is not a problem, as you usually compile/link your code to use
separate addresses for code and data.
Most compilers are written for the second mechanism, and assume that all
code and data segments cover the entire address range. Separating
processes is then a case of changing the pagetable pointer, so that all
code thinks it is running at the same addresses, and accessing data at
the same addresses, regardless of where it is loaded in real memory.
Right. It's a single flat address space.
> although that physical location should hold two separate items...
No, it shouldn't. This is a Von Neumann architecture, not a Harvard
architecture.
Although it is possible to set up 386 segments so that the code and
data map to different places, nobody does so. On the 286 you had to
do separate code and data segments because few programs could fit all
of the code and data into a single 64K segment. On the 386 a single
segment can map the entire 32-bit linear address space, so that's what
everybody does.
As someone else noted, the way we keep code and data separate is to
put them at different addresses, and we can use page protection to
(mostly) prohibit broken programs from writing into their code.
> Your explanation means that DS:100 and CS:100 will point to the same
> byte/word in the physical memory (as they transit through the same
> page table entry and have both the same relative offset) although that
> physical location should hold two separate items...
Indeed, at first glance it does look like something crazy's going on. The
secret is that if page 0 is a code page then your memory manager will
never return a block of memory containing the linear address 100. And so
your compiler will (should) never generate a reference to DS:100.
When your memory manager allocates memory and returns a linear address,
that address encodes the PDE, PTE and offset. The only way this address
could be 100 is if PDE=0, PTE=0, offset=100. That implies the first page
in the first page table. If the first page was previously allocated for
code, then the memory manager won't return any address in that range when
it allocates heap memory - it might map page 1 and return the address
PDE=0, PTE=1, offset=100.
So the two addresses you're thinking of (the 100th byte in code and the
100th byte in data) will look something like this (illustration only, I
haven't done the arithmetic):
CS:100 -> PDE=0, PTE=0, offset=100 -> linear 0x00000064
DS:4196 -> PDE=0, PTE=1, offset=100 -> linear 0x00001064
and the newer "no-execute" ... countermeasure for (buffer overflow)
attacks that pollute data areas with executable instructions and then
attempt to get execution transferred there.
Researcher: CPU No-Execute Bit Is No Big Security Deal
'No Execute' Flag Waves Off Buffer Attacks
What's the new /NoExecute switch that's added to the boot.ini file
CPU-Based Security: The NX Bit
A detailed description of the Data Execution Prevention (DEP) feature in
Windows XP Service Pack 2, Windows XP Tablet PC Edition 2005, and
Windows Server 2003
misc. past posts mentioning buffer overflow
OK, I think I understand now. Many thanks to you all. Considering
Windows and Linux memory mapping on IA32, most compilers opt for the
flat memory model. As such, the offset "X" in an instruction
with operand DS:X is actually relative to the beginning of the entire
flat logical address space of the application, and not the data
segment. This does not match the segmentation section in the
Silberschatz OS textbook! Actually he talks more about the theoretical side.
Please, is there any reference or textbook that I can consult
addressing compilers' practical code-generation techniques,
e.g. flat model, segmented model, etc.?
That's not surprising--segmented architectures have long been more
popular in academia than in real life. The idea of tidily
constraining each address to its proper segment seems appealing, and
works OK for the toy programs you write in programming classes, but
doesn't work very well in interestingly large practical programs.
Segments worked adequately on Multics, back when both memories and
addresses were small, but the Intel chips killed them dead. On the
286, the segments were too small for a lot of data structures, and
poorly chosen bit layout in segment selectors made it needlessly hard
to use multiple segments for a single array or data structure. Also,
loading segment registers on a 286 was very slow, so there was a
performance advantage if you designed your program to use as few
segments as possible. The 386 segments were just as slow, and Intel's
decision to do paging in a single linear address space rather than per
segment made the segments all but useless since unlike the 286, using
segments didn't let you address more memory than the "tiny model" that
uses one code segment and one data segment both mapped to the same
memory. IBM 390 mainframes had an addressing mode similar to what the
386 would have been if there were a page table per segment, but the
newer Z series has flat 64 bit addressing.
>Please, is there any reference or textbook that I can consult
>addressing compilers' practical code-generation techniques,
>e.g. flat model, segmented model, etc.?
Compiler books generally assume flat model addressing unless they
specifically say otherwise, and in any event I can't think of many
compiler techniques useful for segmented addressing. My book "Linkers
and Loaders" might be useful to help understand how programs and data
are laid out in memory.
Exactly. We just don't worry about CS, DS, etc and deal with
protection strictly at the paging level.
Near as I can tell, you're assuming the segmentation is relevant --
the whole point of the way Windows and Linux do their VM is that it's
not. So CS:100 and DS:100 do indeed point to the same place: the
same place in the linear address space since CS and DS are both
zeroed, and then the same place in the physical memory since the
paging system only has one set of page tables per process.
It's just treating the whole thing as a single, linear, per-process
address space. So you're perfectly free to try to read an instruction
from address 100, or read or write data from address 100. But if you
try to do both in a single program, you're either doing something very
weird or you've got a bug (and many people would argue that you're
trying to do something weird enough that it constitutes a bug...).
Actually, segmentation works quite well when it has the right hardware
and programmer mindset support. Unisys 2200 XPA-series mainframes,
which use segmentation (i.e. "banking" in Unisys-speak) on top of a
very large (2**57 word) paged intermediate address space, are still
around and have been doing useful work since their introduction in the
1990s.
And the last time I looked, the OS for these Unisys 2200 series was
no toy and consisted of several million lines of code.
Some regard Linux as a toy by comparison.
> Segments worked adequately on Multics, back when both memories and
> addresses were small, but the Intel chips killed them dead. ...
IIRC Multics was pretty much dead
before the 8008 even hit the fan.
> ... On the
> 286, the segments were too small for a lot of data structures, and
> poorly chosen bit layout in segment selectors made it needlessly hard
> to use multiple segments for a single array or data structure. Also,
> loading segment registers on a 286 was very slow, so there was a
> performance advantage if you designed your program to use as few
> segments as possible. The 386 segments were just as slow, ...
> ...and Intel's
> decision to do paging in a single linear address space rather than per
> segment made the segments all but useless since unlike the 286, using
> segments didn't let you address more memory than the "tiny model" that
> uses one code segment and one data segment both mapped to the same
> memory. ...
IMHO, the use of a single linear address space wasn't so much the
problem as the fact that it was 2**32 bytes long, the same size as the
physical address space, rather than at least 2**35 bytes long, large
enough to hold every possible based segment. Unfortunately, I don't
think there were enough unused bits in the segment descriptors to pull
this off while maintaining compatibility.
> ... IBM 390 mainframes had an addressing mode similar to what the
A great textbook, and the one I picked for teaching senior-level OS
this semester. But I made a point of skipping the segmentation
section (because it's just not used at this point).
> Please, is there any reference or textbook that I can consult
> addressing compilers' practical code-generation techniques,
> e.g. flat model, segmented model, etc.?
Actually, Silberschatz does it as well as anybody I know. The thing
is, an architecture supporting a model doesn't mean the OS has to use it.
Any comments from the B5500/6700/A-Series builders or customers?
Blaming one party is easy. Finding a solution is hard.
How's that working for you? Virus-wise?
> IIRC Multics was pretty much dead
> before the 8008 even hit the fan.
This would have been in 1972, which makes this statement wrong to the
degree of being completely ridiculous, as easily verified on the web.
I must admit I'd forgotten about the Burroughs machines. My
impression is that they're the healthiest segmented machines around
today, but they also suffer from performance and address space issues.
Fine thanks. On my FreeBSD box, the protection between processes and a
design that doesn't run everything as the superuser is much more important
than putting code and data in separate address spaces.
For the latter, recent x86 models have per-page no-execute protection,
but I hear it doesn't help much.
there is MVS and various descendants ... where the same ("segmented")
image of the kernel appears in every virtual address space ... along
with the "common segment" ... which was an early MVS gimmick allowing
the pointer-passing paradigm to continue to work between different
applications and various subsystem functions when they were moved into
different virtual address spaces (i.e. application could squirrel
something away in the "common segment" and make a subsystem call,
passing a pointer to the "common segment" data). of course,
"dual-address" space ... and follow-on "access registers" ... were
attempt to obsolete the need for the common segment ... aka allowing
called routines (in different virtual address spaces) to "reach" back
into the virtual address space of the calling routine.
misc. recent posts mentioning common segment
http://www.garlic.com/~lynn/2007g.html#59 IBM to the PCM market(the sky is falling!!!the sky is falling!!)
http://www.garlic.com/~lynn/2007k.html#27 user level TCP implementation
http://www.garlic.com/~lynn/2007o.html#10 IBM 8000 series
http://www.garlic.com/~lynn/2007q.html#26 Does software life begin at 40? IBM updates IMS database
http://www.garlic.com/~lynn/2007q.html#68 Direction of Stack Growth
http://www.garlic.com/~lynn/2007r.html#56 CSA 'above the bar'
http://www.garlic.com/~lynn/2007r.html#69 CSA 'above the bar'
the ingrained (MVS) "common segment" even resulted in custom hardware
support in later machine generations. in lots of implementations,
translation-lookaside-buffer (TLB) hardware implementation is virtual address space
"associative" (each TLB entry is associated with a specific virtual
address space). Segment sharing can result in the same virtual address
(information) in the same (shared) segment appearing multiple times in
the TLB (associated with use by specific virtual address spaces). The
"common segment" use was so prevalent in MVS ... that it justified
special TLB handling ... where the dominant TLB association was virtual
address space ... but there was a special case for common segment
entries ... to eliminate all the (unnecessary) duplicate entries.
If everything is in the same address space - then any non-superuser can read
anything, and the very notion of super/non-super users is null and void, like
it is in Win9x/Me.
> For the latter, recent x86 models have per-page no-execute protection,
> but I hear it doesn't help much.
NX bit is in PAE (64bit PTEs) mode only. Helps a lot.
In the case of high-performance, numeric-intensive computing, the
Burroughs B5000/6000/7000/A Series machines (now known as Unisys
ClearPath MCP systems) admittedly do not stack up all that well. It's my
impression that for many this problem space is the only one in which to
measure "performance", but this facet of computing is only one of many.
In general server-oriented roles, especially OLTP and transactional data
base applications, the Unisys MCP architecture has superior qualities
and stacks up much better against other systems. The variable-length
segmentation is not entirely responsible for that, of course, but it
certainly is a contributing factor.
John's comment about address space is one that I really cannot
understand, though. All of the MCP machines going back to the B5500 have
had huge virtual address spaces, just not ones that are allocated all in
one piece. I'm not sure how you could compute the total virtual address
space limit on the current systems, but it's easily many trillions of
bytes PER TASK. You would hit some practical limitations, such as the
maximum size of the system's segment descriptor tables, long before
running out of virtual address space provided by the architecture. In
almost 40 years of working with these systems, I've never seen an
application on them that came even close to pressuring the virtual
address space, let alone exceeding it. There were serious PHYSICAL
address space problems with the B6700/7700 systems in the late 1970s and
early '80s, but these were resolved by the mid-80s in the A Series line,
and done so largely without impact on existing applications.
So are segment register loads on any x86 CPU.
x86-style segmentation requires the _developer_ (not even the compiler) to be
aware of it.
First of all, the SS != DS issue, which is especially "fine" in DLLs with
their own data segment. MakeProcInstance for callbacks is a second issue.
For tiny segments (286, including 16-bit Windows), huge pointers are a
major pain. This, combined with the high cost of segment register loads
and portability, caused most OS designers to abandon segment support
from their OSes in the timeframe of the late 1980s (design) and
early-to-mid '90s (market).
First, the 'everything' that is supposed to be in the same address
space is supposed to refer to the address space of a single process,
which is flat, meaning a pointer to anything in this address space is
just a 0-based offset. Second, MMUs usually support per-page access
permission with different privilege levels. Otherwise, they would be
quite useless for implementing memory protection.
John said it was his "impression," which is something short of a claim
to knowing the one and only truth. I'm sure he's willing to stand
corrected, just as the rest of us are open to education.
> In the case of high-performance, numeric-intensive computing, the
> Burroughs B5000/6000/7000/A Series machines (now known as Unisys
> ClearPath MCP systems) admittedly do not stack up all that well. It's my
> impression that for many this problem space is the only one in which to
> measure "performance", but this facet of computing is only one of many.
> In general server-oriented roles, especially OLTP and transactional data
> base applications, the Unisys MCP architecture has superior qualities
> and stacks up much better against other systems. The variable-length
> segmentation is not entirely responsible for that, of course, but it
> certainly is a contributing factor.
> John's comment about address space is one that I really cannot
> understand, though. All of the MCP machines going back to the B5500 have
> had huge virtual address spaces, just not ones that are allocated all in
> one piece. I'm not sure how you could compute the total virtual address
> space limit on the current systems, but it's easily many trillions of
> bytes PER TASK. You would hit some practical limitations, such as the
> maximum size of the system's segment descriptor tables, long before
> running out of virtual address space provided by the architecture. In
> almost 40 years of working with these systems, I've never seen an
> application on them that came even close to pressuring the virtual
> address space, let alone exceeding it. There were serious PHYSICAL
> address space problems with the B6700/7700 systems in the late 1970s and
> early '80s, but these were resolved by the mid-80s in the A Series line,
> and done so largely without impact on existing applications.
Can you give an overview of how A-Series segmentation works? "RTFM" is
OK in my case, since I have the manual...
(I've trimmed follow-ups, since as far as I know, Linux hasn't been
ported to the A-Series. Of course, I'm willing to stand corrected on that.)
No MMU supports ACLs based on users/groups for memory accesses. The only
support is kernel/user, read-only/writeable and sometimes - not on all
CPUs - no-execute.
It was sort of an editorial "we". But if Intel's paging
implementation had included an execute permission (before the recently
added NX bit), buffer overflow exploits wouldn't be a serious problem.
Control Data's last great mainframe series, the Cyber 180, was strongly
influenced by Multics. It used a segmented architecture, with 16 rings, of
which only 9 were actually used. It went into production in the mid-1980s,
although I doubt any of them are still in use today.
>> just a 0-based offset. Second, MMUs usually support per-page access
>> permission with different privilege levels.
> No MMU supports ACLs based on users/groups for memory accesses. The only
> support is kernel/user, read-only/writeable and sometimes - not on all
> CPUs - no-execute.
Why would it need to support ACLs? Since the page table is
per-process, and all the MMUs I'm familiar with support kernel/user,
just set the user-mode permissions based on the process owner's (or
whatever) access. No reason to make the MMU complicated to support
something that's easily done in software.
I do recognize not having execute/no-execute permission is a problem,
but that's independent of ACLs.
> :) thanks John. but after reading your post i got confused further.
> Your explanation means that DS:100 and CS:100 will point to the same
> byte/word in the physical memory (as they transit through the same
> page table entry and have both the same relative offset) although that
> physical location should hold two separate items...please forgive my
> ignorance...it seems i am missing an elementary concept
No, the compiler will not generate code like that. The compiler is aware
of the fact that it's generating code for a machine with "non-separate I
and D space", i.e. there is *one* linear address space per process, and
code and data share this address space. If you intend to write assembler
code, you'll have to be aware of that fact, too.
So, if you have 8k program code (i.e. 2 pages of 4k each) and 8k data
(i.e. 2 pages), then they will be mapped to different areas in this one
address space, e.g. the code might be mapped from 0..8k and the data
might be mapped to 128k..136k.
The other way round, if you have a NOP instruction at address CS:100,
then reading a byte from DS:100 will return 0x90, the opcode for a NOP
instruction.
These are my personal views and not those of Fujitsu Siemens Computers!
Josef Möllers (Pinguinpfleger bei FSC)
If failure had no penalty success would not be a prize (T. Pratchett)
Company Details: http://www.fujitsu-siemens.com/imprint.html
What an MMU supports need not necessarily be describable in terms of NTFS
file system access permissions, despite the fact that you may know these well.
> The only
> support is kernel/user,
aka 'different privilege levels'
> read-only/writeable and sometimes - not on all CPUs -
aka 'access permissions'
Actual MMUs are not necessarily that simplistic. For instance, an
ARMv5 MMU supports per-page access permissions which can be 'no
access', 'ro-access in privileged mode, no access in user mode',
'ro-access in both privileged mode and user mode', 'r/w access in
privileged mode, no access in user mode', 'r/w access in privileged
mode, ro-access in user mode', or 'r/w access in privileged mode and in
user mode'. Additionally, each descriptor belongs to one of sixteen
'domains', and there exists a 'domain access control register' which can,
individually for each process and every domain, grant either 'client'
or 'manager' access to memory belonging to that particular domain,
where 'client' accesses (in all modes) are checked against the access
permission bits, while 'manager' accesses are not.
BTW, the assumption that what one does not understand must certainly
be wrong is a common, but usually wrong, one.
> IBM 390 mainframes had an addressing mode similar to what the
> 386 would have been if there were a page table per segment, but the
> newer Z series has flat 64 bit addressing.
Minor correction: The only difference between Z and S/390 (in this
regard) is the address space size. Z still supports Access-Register
mode (which is the one comparable to a machine with segmentation).
Note that what IBM calls "segments" are really just the second level
in the hierarchical page table, and Z adds three new "region" levels
to cover the larger address range.
Perhaps you meant to say that Z operating systems use a single flat
view where their S/390 predecessors exploited AR mode to get more
effective virtual address range (multiple 2G spaces).
http://www.garlic.com/~lynn/2007t.html#9 How the pages tables of each segment is located
hot off the press:
Buffer Overflows Are Top Threat, Report Says
Research data says buffer overflow bugs outnumber Web app
vulnerabilities, and some severe Microsoft bugs are on the decline
... snip ...
as before ... lots of past posts mentioning buffer overflow
Right. Now that there's a 64 bit address space, I wouldn't expect anyone
to use AR mode other than for backward compatibility.
Ah, no. AR mode is used extensively for communication between address
spaces. For example, let's say you issue an SQL request to DB2. The
DB2 subsystem can write the results of the Select directly into your
address space. That is its primary function.
A secondary use for AR mode is for dataspaces (and related entities,
like Hiperspaces), which *are* used for additional data storage that
won't fit in a single 2GB address space. It is reasonable to expect
that the use of dataspaces will decrease as more and mode code becomes
64-bit aware, and that AR mode will end up mainly doing IPC.
I'd think that in OLTP, fast context switching would be important,
which you get from the stack architecture. How does the segmentation
help? Burroughs style segments are certainly helpful for reliability,
since they make it nearly impossible to clobber program code, but the
extra memory traffic to load all those segment descriptors has to be
paid for somehow.
>> John's comment about address space is one that I really cannot
>> understand, though. All of the MCP machines going back to the B5500
>> have had huge virtual address spaces, just not ones that are
>> allocated all in one piece.
The limit I'm wondering about is per-segment, not overall. On the 286,
there were plenty of segments (8K per process plus 8K global) but the
per-segment size was the problem.
Most of what I know about the Burroughs and descendants' architecture
is from Blaauw and Brooks. Their description is somewhat confusing
(not really their fault, since the hardware architecture is
phenomenally complicated), but as far as I can tell, each segment is
limited to 32K words. I realize that multidimensional arrays are
arrays of pointers so each row of an array is a separate segment, but
do you never have structures or text blobs that don't fit in 15 bits
of intra-segment address?
>(I've trimmed follow-ups, since as far as I know, Linux hasn't been
>ported to the A-Series.
32-bit x86 processors have a single "linear" address space that segments are
mapped into. Because the linear address space is 32-bit, instead of
something more sensible, common practice is to set all of the segments to
base=0, limit=4GB. That means the offset maps to the linear address space
and segmentation is effectively disabled, giving you a "flat" memory model.
This is _not_ what your textbooks call a "segmented" system, despite the
misleading presence of segments.
64-bit x86 processors in Long Mode basically mandate the above behavior
since the base and limit of CS, DS, SS, and ES are all ignored. Of
course, AMD decided to make that mandatory because every 32-bit OS already
worked that way, so there was no point in wasting silicon to support
anything else.
(FS and GS still work as expected in both modes but are usually dedicated by
the OS to specific purposes, like per-CPU and per-thread structures, so
they're generally ignored by user code and compilers.)
Paging is what translates linear addresses to physical addresses, and each
process will have its own set of page tables. When the kernel switches
processes, it simply resets CR3 to the PTB for the new process, restores the
registers, and jumps back into the appropriate place in the process's code.
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
IIRC there should be a TLB flush in there somewhere?
The interesting thing about this is that that jump back is just a standard
return to userspace, in the same manner as any kernel API return. This is
because the stack has actually been replaced as part of the process switch.
IIRC, the processor will automatically (i.e. not explicitly software
directed) do a TLB flush any time CR3 is changed if it caches based on
virtual addresses. I'm not aware of any existing x86 processors that cache
based on physical addresses, but AIUI they're theoretically possible and
wouldn't require a TLB flush.
No. "Updates to the CR3 register cause the entire TLB to be
invalidated except for global pages."
>> IIRC there should be a TLB flush in there somewhere?
> IIRC, the processor will automatically (i.e. not explicitly software
> directed) do a TLB flush any time CR3 is changed if it caches based on
> virtual addresses.
I stand corrected.
No B6700 or its successors has ever been taken down by a buffer overflow.
Not even on a B6700 heavily loaded with mixed production and development
work in 1972. Never. How much is that worth?
Typically today these are called "MCP systems" by everyone but Unisys
marketing, since the marketdroids have changed the names so often that
nobody else can keep track of them. And all the rest of the multiple OSs
named MCP have faded away, so the term "MCP system" has become fairly
unambiguous.
On Tue, 27 Nov 2007 06:18:56 +0000 (UTC), John L wrote:
> I'd think that in OLTP, fast context switching would be important,
> which you get from the stack architecture. How does the segmentation
Mostly it does not help with process switching. It helped more when memory
was scarce, because you did not need to make much in particular present in
order to swap out a segment. (And code segments are read-only and thus
could be replaced without swapping out.) But as with any machine, swapping
and overlaying was always a huge kludge; it didn't get around the fact that
disk is slower than RAM.
MCP systems have never been that fast on process initiation, termination,
and switching, even with the stack architecture to help. They are great in
that they carry a lot of information with the process, and that you can
switch from one very complex environment to another with no more overhead
than a trivial switch, but all in all really lightweight processes are not
part of the design.
> Burroughs style segments are certainly helpful for reliability,
> since they make it nearly impossible to clobber program code, but the
> extra memory traffic to load all those segment descriptors has to be
> paid for somehow.
Nah, they are just about always in the fastest CPU cache. Was probably more
of an issue in the B6700 days. However, in those days the fact that the
code is extremely compact made a lot of difference -- fewer memory fetches
for code, more likely to find the code word in cache. And the compactness
of the code was in part due to the work done in the CPU for the
segmentation and the stack.
>>> John's comment about address space is one that I really cannot
>>> understand, though. All of the MCP machines going back to the B5500
>>> have had huge virtual address spaces, just not ones that are
>>> allocated all in one piece.
This was Paul's comment. There is one caveat. It's true that the virtual
address space of a task has always been essentially unlimited. However, at
one time the physical address space was limited to 1 megaword (6MB). You
can't use a huge amount of virtual memory if your physical is severely
limited. This was becoming a major problem by the late 1970s. There were
several generations of systems with kludges which allowed more physical
memory on a system, but it was difficult for a single task to use more than
1MW. Finally -- I guess it was in the mid 1980s -- they completed the
architecture changes which allowed even a single task to use amounts of
physical memory which are still not common. (I think it runs into some
limitations at half a terabyte, but I'm not sure, and even that could be
very easily expanded within the current structure.) And again, that is a
limitation on physical memory, not virtual.
But this may have been where the idea about a limited address space came from.
> The limit I'm wondering about is per-segment, not overall. On the 286,
> there were plenty of segments (8K per process plus 8K global) but the
> per-segment size was the problem.
On most current MCP systems, I can declare a single row of an array to be
up to 2^28 words. (The MCP will actually segment the physical memory
allocated to such a large array, but this is totally invisible to the
program. At one time this invisible segmentation carried a performance
penalty, but larger control stores have pretty much eliminated the
penalty.) But there's hardly ever a need to use a single array row that way
-- software which does so is usually doing structuring within a large flat
area, and on MCP systems you would use at least some of the MCP's memory
structuring instead of doing it yourself.
There's probably still a limitation on the size of code segments. But the
compilers take care of that, and it's totally transparent. It's been years
since I was even aware of the size of code segments except in bizarre cases
where it would be a clue to some strange bug.
If you want a lot of big segments, just declare
ARRAY A [0:2**20-1, 0:2**20-1, 0:2**20-1];
(Note that exponentiation in Algol is **.) That gives you a billion array
rows of a million words each. A googol of them would only take a couple
more lines. You would not be able to use them all due to physical memory
limitations and the number of lifetimes it would take to go through them,
but you could compile the program and access some random words from
anywhere just to prove the point.
Ironically, the most noticeable limit today is one you aren't likely to
hear about. IOs are still limited to 2^16 words (6 * 2^16 bytes).
> Most of what I know about the Burroughs and descendants' architecture
> is from Blaauw and Brooks.
I'm not familiar with their book. I recommend Elliot Organick's "Computer
System Organization: The B-5700-B-6700 Series". Used copies are not
numerous but are not hard to find either. It won't give all the details,
but it's well written. Three decades out of date, but the basics haven't
changed.
> Their description is somewhat confusing
> (not really their fault, since the hardware architecture is
> phenomenally complicated), but as far as I can tell, each segment is
> limited to 32K words.
Totally incorrect. I can't even guess where this misconception came from.
As mentioned above, the modern limit is about 2^28 words. In programming
terms it was never less than 2^20 words. There were various limits that
were smaller, but they did not affect programming.
Remember that most MCP segmentation was always transparent to the programmer.
Code segments? Maybe code segments are limited to 2^15 words, though
actually 2^13 comes to mind. But it's totally transparent. When the
compiler fills up a code segment, it generates a branch to the next code
segment. End of problem.
> I realize that multidimensional arrays are
> arrays of pointers so each row of an array is a separate segment, but
> do you never have structures or text blobs that don't fit in 15 bits
> of intra segment address.
Huh? I do not even have a context to hang this in. Perhaps you are thinking
that structures have to be implemented within a segment? But no, if you
have a structure within a structure, the embedded structure is simply
represented by a descriptor, which has all the capabilities of a top level
descriptor. That's why the virtual address space is essentially unlimited.
This method does cause problems in copying structures, which is not a strong
point under the MCP.
If that's not it, please explain a bit more what type of structure you are
thinking of.
>>(I've trimmed follow-ups, since as far as I know, Linux hasn't been
>>ported to the A-Series.
> Too bad.
However, C was ported long ago, and POSIX compliant programs run pretty
well. I realize that's not the same thing by a long shot, but a lot of
useful programs have become available under the MCP as a result.
Art Works by Melynda Reid: http://paleost.org
> I'm not familiar with their book. I recommend Elliot Organick's "Computer
> System Organization: The B-5700-B-6700 Series". Used copies are not
> numerous but are not hard to find either. It won't give all the details,
> but it's well written. Three decades out of date, but the basics haven't
Mike Hore mike_h...@OVE.invalid.aapt.net.au
I would guess that it came from the B-5500 where there were only 15
bits for address. I was told once that this number came more or less
from the IBM 7090 family, which also had only 15 bits of address and
which were competitors to the B5500.
>>>(I've trimmed follow-ups, since as far as I know, Linux hasn't been
>>>ported to the A-Series.
>> Too bad.
>However, C was ported long ago, and POSIX compliant programs run pretty
>well. I realize that's not the same thing by a long shot, but a lot of
I haven't looked at any of this, but will comment that it's hard to do
something like Linux in a Burroughs-style machine because C in a more
conventional machine lets you get away with things that the Burroughs
architecture would prohibit. And there are differences in coding
style. In the B5500 MCP there were lots of arrays, taking advantage
of the array descriptor facilities, but no structures. In Unix/Linux
there tend to be arrays or lists of structures. Things that are grouped
together in a structure in Unix tend to be scattered into a whole bunch
of arrays in MCP. In B5500 MCP a central datum was the mix index, a
small number representing a job currently in the mix and used as an index
into all the arrays of information about that job. In Unix the process ID
number serves somewhat the same purpose, but PIDs run into thousands, with
most of the available numbers being unused at any one time.
The B5500 was an outstanding batch job machine and a lousy time sharing
machine. The reason it was lousy was that absolute addresses get into the
stack, so when a job is swapped completely out it has to be swapped back
in to the same memory addresses it previously occupied. This was corrected
by an architectural change in the B6500. Also the disk I/O was slow
enough that swapping a large job in and out was seriously time consuming.
While there was a time sharing version of B5500 MCP, many sites instead
ran the batch MCP and an application called R/C or Remote/Card which was
similar to Wylbur as used on OS/360. R/C allows a terminal user to
call up and edit a program file, submit it as a batch job for execution,
and then examine the output when the job completes.
First, Louis' point concerning my comment on John L's "impression" is
well taken. As I read John's sentence concerning the Burroughs machines,
the first clause was an impression, but the second was a conclusion. If
that was not what John intended, then I apologize.
Louis asked if I could give an overview of how the Burroughs
segmentation works, and John Ahlstrom has privately encouraged me to do
so. I think I know this reasonably well from a software perspective, but
as John L points out, the architecture is phenomenally complicated. The
elements of the architecture are also extremely synergistic, and it's
impossible to talk about one aspect (such as segmentation) in the
absence of others. As a result, this is a very long post -- about 9900
words.
In order to understand the context in which memory addressing and
segmentation operate, I also have to talk to some degree about stacks,
compilers, codefiles, and the very tight integration between the
hardware architecture and the MCP (operating system) architecture. This
isn't easy, so please bear with me while I lay down some groundwork and
then try to describe (without pictures) how segmentation and memory
addressing for this very complex, but fascinating architecture work.
I will start with a short history lesson. The origin of the Burroughs
stack/descriptor architecture was the B5000 of 1962. Bob Barton is
generally credited with the conceptual design of the architecture. This
machine was a major departure from anything Burroughs had attempted
before. It was clearly influenced by the Rice University (where Barton
had spent some time) computer, particularly by the Rice system's use of
"codewords", from which the concept of descriptors arose. I believe the
Ferranti Atlas was also an influence on the B5000 design. A somewhat
updated version of the architecture was released as the B5500 in 1964
and stayed in production until 1970. At the very end there was briefly a
version called the B5700, but it was really a B5500. (The B5900
mentioned below is a contemporary of the other Bx900 models and not a
variant of the B5500.)
Burroughs began work on a successor machine, the B6500, in 1965, but it
was not released until 1969. It had a very difficult introduction to
customer use, and did not work properly until it was updated and
re-released as the B6700 in 1971. All B6500 CPUs in the field were then
replaced with B6700s.
The B6500/B6700 was a substantially different design from the B5500 --
the two instruction sets were quite different, as were the format of
descriptors and other control words. Limits on the size of both physical
and virtual address spaces were increased dramatically. A dramatically
different way of handling character-oriented data was also introduced.
About the only things that got carried over without change were the word
size (48 bits), the integer and floating point word formats, and the
six-bit character code, known as BCL. The MCP operating system was also
completely redesigned. Nonetheless, the two designs are conceptually
similar, and both the B6500/6700 hardware and system software should be
seen as refinements of those for the B5500.
The Burroughs B6700 is the basis for the Unisys ClearPath MCP systems
that are still being sold today. There have been some tweaks to the
instruction set, and a major change was made in the mid-1980s to the way
memory is addressed to accommodate still larger physical address spaces,
but the processor architecture (at least from the perspective of
user-level software) has remained essentially the same since 1970. The
architecture is now referred to internally as "E-mode" (I think the "E"
stands for "emulation"), and its formal specification has gone through a
number of levels, known as Beta, Gamma, Delta, and (the latest) Epsilon.
The machines have had a variety of marketing names over the years. Since
the various names can be confusing to those not familiar with the
product line, here is a summary:
Burroughs B5000/5500/5700 -- the original early 1960s architecture.
Burroughs/Unisys A Series (MicroA, A1-A7, A9-A13, A14-A19)
Unisys ClearPath 4600/4800/5600/5800/6800
Unisys ClearPath NX4200 (the first of the models where the hardware is a
standard Intel box and the MCP architecture is emulated on the Intel
processor. This approach is currently used on all small-to-medium
systems.)
Unisys ClearPath LX5000/6000/7000 (later emulated systems)
Unisys ClearPath Libra (these are also referred to as the "CS" series;
the Libra 300, 400, and 520 models are emulated systems)
Having so many models, we need a generic name, so I'll follow the
current Unisys convention and refer to them as MCP systems
(acknowledging that other, now obsolete, Burroughs/Unisys architectures
also used operating systems named "MCP").
With that background, I will try to describe how memory addressing and
segmentation works on the B6700, since discussing that particular model
gives a good conceptual base. The mechanism was similar, but more
primitive, on the B5500. This mechanism changed with the A Series to
expand the physical address space and implement improvements in some
other areas. I'll discuss those differences later. Even though the B6700
has been obsolete for 30 years, I'll talk about it mostly in the present
tense.
Words and Tags.
The best place to start is with word formats. The B6700 (like the B5500)
has a 48-bit data word. One of the significant changes from the B5500,
however, was the addition of some extra bits to the word, called the
"tag". The B6700 and later systems through E-mode Beta had a three-bit
tag. E-mode Gamma and subsequent systems use a four-bit tag.
The tag indicates what type of information the word contains. Generally,
tags 0, 2, 4, and 6 are data types that user-level code can freely
manipulate; the rest are control types that are used by a combination of
the hardware and operating system kernel. The tag values for the B6700 are:
0 = single-precision data word (numeric or character data)
1 = IRW (indirect reference word) or SIRW (stuffed indirect
reference word). These are effectively indirect addresses to
words in a program stack. A brief discussion of what "stuffed"
means is covered in the section on Addressing Data, below.
2 = double-precision data word (typically numeric)
3 = general purpose protected control word: stack linkage, code
segment descriptor. All words in a code segment (one containing
executable instructions) also must have tags of 3.
4 = SIW (step index word): control word for the STBR (step branch)
looping instruction -- a nice idea, but the performance was
poor, and STBR is no longer supported on current systems. This
tag value is now used by the MCP as a special marker word in
stacks (e.g., for fault trap indicators).
5 = data descriptor. These words are the basis for data
segmentation and are discussed in detail below.
6 = uninitialized operand. Words with this tag value can be
overwritten, but reading them generates a fault interrupt. Used
as the initial and NULL value for pointers.
7 = PCW (program control word). The entry point to a procedure
(subroutine). Also used as one form of dynamic branch address.
The additional tag values introduced with E-mode Gamma are primarily
used for optimizations of descriptors after a segment has been allocated
and are not significant for this conceptual discussion.
There are user-mode instructions that can read and set the tag on the
word at top of stack. You could theoretically do quite a bit of mischief
with this on the B6700, but the compilers are a trusted component of the
system, so in practice this was not a problem. In more recent systems,
the microcode carefully restricts which tag values can be set on data
words in the stack. The Set Tag instruction (STAG) running in user-level
code cannot now, for example, set tag 5 on a data word with bit 47=1
(i.e., create a present data descriptor -- see below for what this means).
This brings up the subject of "codefiles" (files containing executable
code) and compilers. A codefile is a controlled entity under the MCP.
Codefiles can be generated only by compilers, which are programs that
must be "blessed" using a privileged MCP command. There is no assembler
for these systems, nor anything that allows a user to generate arbitrary
sequences of object code. Recompiling the source code for, say, the
COBOL-85 compiler does not generate a compiler, just an ordinary program
that inputs source code and outputs a file of instructions and other
run-time data. That file is not a codefile, however. It cannot be
executed and cannot be turned into a codefile. There is no facility for
a program (even one running with the highest privileges) to read that
file and present any of its data to the hardware as a code segment.
Words of instructions in a code segment must have tags of 3. There is no
way for user-level code to set those tags, let alone create a code
segment and branch to it. The I/O hardware has a variant of the disk
read operation that will apply tags of 3 to data as it is being read and
transferred to memory, but only the MCP has access to the physical I/O
hardware, so there is no way for user programs to initiate such an
operation.
[When I say "no way" here, I mean that the hardware/MCP architecture is
specifically designed to prohibit such a thing, and I'm not aware of
even a conceptual attack that could get around the design. There have
been some holes discovered in the past, though, and of course it's
entirely possible that there are more that have not yet been brought to
light. I won't speculate on the likelihood of their existence. To me,
the weakest part of this design is the trust that must be placed in the
compilers. If you can generate arbitrary code, or modify a valid
codefile externally (e.g., on a backup tape) and reload it, you can get
around at least some of the system's protections, but it's still not easy.]
This discussion of codefiles and compilers highlights a basic
characteristic of MCP memory segmentation: code and data are entirely
separate entities and are allocated and managed by the system
separately. There are code segments and data segments, and while they
are allocated from the same system-global heap and may be adjacent in
physical memory, logically they are separate and addressed entirely
differently. Both types of segments can be created only by the MCP. The
contents of code segments are loaded solely from codefiles. Code
segments are read-only, and as we will see, are automatically reentrant.
Data Segment Descriptors.
Descriptors are called such because they "describe" an area of memory.
MCP systems are a form of capability architecture, and the descriptors
are the capability -- you have to have access to the descriptor to
access the data it describes. Descriptors are the basis of memory
addressing and memory segmentation.
A tag-5 word in the B6700 architecture represents a data descriptor. The
word has a number of fields which I will identify using what is called
partial-word notation, [s:n], where "s" is the starting bit number and
"n" is the length of the field in bits. Bit 47 is high order and bit 0
is low order; all MCP systems use big-endian addressing.
[47:1] Presence bit (commonly known as the "P-bit")
[46:1] Copy bit
[45:1] Indexed bit
[44:1] Paged bit
[43:1] Read-only bit
[42:3] Element Size field
0 = single precision words
1 = double precision words
2 = four-bit characters (packed decimal)
3 = six-bit characters (BCL, now obsolete)
4 = eight-bit characters (EBCDIC)
[39:20] Length or index field (determined by [45:1])
[19:20] Segment starting address
Accessing the data in a segment is done by means of "indexing" the
descriptor with a zero-relative offset into the segment. There are a
series of instructions that do this, as will be detailed below.
The Presence bit indicates whether the memory area represented by the
descriptor is physically present in memory. It is the basis for the
virtual memory mechanism. "Virtual memory" is IBM's clever marketing
term that everyone has adopted, but it does not accurately describe what
goes on in MCP systems. "Automated memory segment overlay" is a more
accurate term, and is what Burroughs called it before IBM's term caught on.
When the Presence bit is 1, the segment is physically present in memory
starting at the real address in [19:20]. (Note that in A Series and
later systems, the real memory address is no longer contained in
descriptors, as will be discussed later).
When the Presence bit is zero, the segment is not present in memory (or
has never been allocated) and the descriptor is said to be "absent"
(although it's really the data area that is absent). Attempting to
access a segment through an absent descriptor generates a "Presence bit
interrupt" -- what other systems would call a page fault. When handling
this interrupt, the MCP interprets the value in the address field as follows:
* If zero, this indicates that the segment has never been allocated.
The MCP simply allocates an area of length specified by [39:20],
relocating or overlaying other physically-present segments as
necessary. The MCP clears the allocated area to binary zeroes
(primarily to wipe out any nasty tags that may be there from prior
use of that space), fixes up the address field in the descriptor and
sets its Presence bit. Exiting from the interrupt handler causes the
hardware to restart the instruction which generated the interrupt.
The compilers generate code in the initialization section of
procedures to build these "untouched" descriptors directly in the
program's stack. The physical memory area is not allocated until
(and unless) the program actually references it.
* Some types of objects require additional information from the
compiler (e.g., for multi-dimensional array-of-arrays structures,
the length field in the descriptor only specifies the length of the
first dimension, so certain bit patterns in the address field for
untouched descriptors point to data structures the compiler has
built within the codefile that carry the necessary information for
the other dimensions).
* Otherwise, the segment was once present in memory but has been
rolled out, usually due to pressure from other memory allocation
activity. The value in the address field of the descriptor indicates
the location within the task's "overlay file" (a temporary, unnamed
file the MCP allocates for each task) where the data associated with
this descriptor has been rolled out. The MCP allocates an
appropriate area of memory, reads the segment from the overlay file,
fixes up the descriptor with the new real memory address, and exits
the interrupt handler to restart the instruction that was interrupted.
The Copy bit indicates that a tag-5 word is not the original descriptor
for an area, but rather a copy of it. A non-copy descriptor is said to
be the "mom" descriptor for an area, and there can only be one such mom.
Copy descriptors are generated automatically by a number of
instructions, principally those involved with indexing and loading words
on the top of stack (e.g., to pass an array as a parameter to a
procedure). Present copies point directly to the segment's memory area;
absent copies point to the mom.
In the B6700, there is no central page table, per se, to keep track of
memory areas. The mom descriptor effectively serves the purpose of a
page table entry. Every area of physical memory (allocated or available)
is surrounded by a set of "memory link" words that are used by the MCP
memory allocation routines to keep track of allocated areas and locate
available areas of appropriate size. For allocated areas, one of these
link words points back to the mom descriptor. The handling of moms and
copies is another thing that has changed with the A Series and later models.
The Indexed bit indicates whether the descriptor points to a whole
segment or to one element within a segment. When the bit is 0, the
length/index field contains the length of the segment. Some indexing
instructions generate an "indexed" copy descriptor with both the Indexed
and Copy bits set to 1 (all indexed descriptors are by definition
copies). Indexed descriptors are typically used as a pointer, e.g., as a
destination address for store operators or as a call-by-reference
parameter to a procedure. They are also used as starting addresses for
string manipulation instructions. Algol has a "pointer" type; it
represents an indexed data descriptor.
In an indexed descriptor, the length field is replaced by a
zero-relative offset into the segment. The physical address of that
element (assuming the segment is present in memory) is the sum of the
base address in [19:20] and the offset in the length field.
Depending on the value of the Element Size field [42:3], a descriptor
could be word oriented or character oriented. Physically, memory is
accessed as whole words. When a character-oriented descriptor is
indexed, its length field is replaced by a specially-encoded offset.
Bits [35:16] are a word offset within the segment; bits [39:4] are a
character offset within that word. This obviously limits the offset for
character pointers to 393215 for eight-bit characters and twice that for
four-bit packed decimal digits (although this limit can be effectively
eased by the use of paged areas, discussed next). The string
instructions understand these character offsets and transparently handle
character operations that start in the middle of words.
The Paged bit indicates whether the memory area for the descriptor is
monolithic (0) or paged (1). If paged (the original term for this is
"segmented", but talking about segmented segments can be confusing), the
address field does not point to the address of the data, but instead to
a vector of other descriptors. Each of these descriptors in turn points
to a data segment. In memory, this looks identical to a two-dimensional
array-of-arrays structure. The difference is in how the software
accesses the data. With a standard two-dimensional array, the software
must explicitly compute and apply an index for each dimension. With a
paged segment, the software is unaware of the second dimension. When the
indexing instructions encounter a descriptor with the Paged bit set,
they partition the index value into a page number and page offset and
automatically re-index the second dimension. If an indexed copy is
generated, it is a copy of the descriptor for the page, not the original
(paged) descriptor, and the index offset will be that within the page.
This helps alleviate the smaller limit for word offset available with
indexed character descriptors.
For the B6700, the page size was 256 words. Data areas were typically
paged when they exceeded 1024 words, but the programmer had some control
over this. The page size on current systems is 8192 words. The paging
threshold is adjustable by the system administrator, and by default is
also 8192 words. As with all memory areas in the system, the individual
pages are not allocated until first referenced. The array of page
descriptors (called a "dope vector") is not even allocated until one of
the pages is initially accessed. All pages are individually relocatable.
String operations that run off the end of a physical memory area
generate a "segmented [paged] array" interrupt. The MCP responds to this
interrupt by locating the next page in sequence (if there is one) and
restarting the operation. If there is no next page (which would also be
the case if the string operation ran off the end of an unpaged segment),
the result is a "Segmented Array Error", with which everyone who has
programmed for MCP systems is no doubt intimately familiar. Unless
trapped, this error (a form of boundary violation) is fatal to the program.
The Read-only bit is just that -- if 1, it marks the segment as
non-writeable. This is primarily used for segments that represent
constant pools generated by the compiler and which are loaded from the
codefile.
The Element Size field, as discussed above, determines whether the
descriptor represents single precision word, double precision word, or
character-oriented data. It is quite common, especially with COBOL
programs, to have multiple descriptors with differing size fields
pointing to the same segment. Only one of these could be the mom, of
course; the rest would have to be copies. This allows the software to
address the same memory area as a mixture of word and character fields.
The B5500 strictly used six-bit characters; the B6700 was basically an
eight-bit EBCDIC machine, but could also handle six- and four-bit
characters. Support for the six-bit codes was dropped from the
architecture in the Bx900 models, ca. 1980. I understand that the latest
Libra models have added support for 16-bit characters.
The role of the Length/index and Address fields has been covered in the
discussion above, so nothing additional needs to be said about them here.
Code Segment Descriptors.
Thus far, the discussion has been about data descriptors. There are also
descriptors for code segments. They are similar to data descriptors, but
simpler. Code segment descriptors have a tag of 3 and the Presence bit,
Copy bit, Length, and Address fields of data descriptors. Bits [45:6]
are not used. Code segments cannot be indexed or paged. They are by
definition read-only. The only element size they support is single
precision words.
Code segment descriptors live in a special type of data segment called
the Segment Dictionary. An image of this segment is built by the
compiler (all descriptors being in their absent, untouched form, of
course) and stored in the codefile. The Segment Dictionary is loaded by
the MCP as part of task initiation. In addition to code segment
descriptors, the Segment Dictionary may contain (read-only) data
descriptors for constant pools and scalar constant values. The Segment
Dictionary in memory is actually a type of stack, although not for
push/pop type of activity. As we will see, stacks in MCP systems are a
central element in the addressing environment, and it is for this
purpose that a Segment Dictionary is loaded as a stack. Segment
Dictionaries are also sharable -- if multiple tasks are initiated from
the same codefile, the Segment Dictionary is loaded only once and the
separate tasks are linked to this common copy. Thus, all of the object
code and read-only constant pools for a program are automatically reentrant.
Object code is addressed using a three-part index. The first part is the
"segment number", which is the code segment descriptor's offset within
the Segment Dictionary. The second part is the word offset within the
segment. The third part is the instruction syllable (byte) offset within
the word. These numbers are zero-relative and generally written in hex,
so an address of 03C:0041:3 indicates segment #60, word offset 65, byte offset 3.
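To make the format concrete, here is a trivial Python sketch (the function name is mine) that splits such an address into its three hex fields:

```python
def parse_code_address(addr):
    """Split a seg:word:syllable code address such as '03C:0041:3'
    into its three numeric parts; all three fields are written in hex."""
    seg, word, syl = addr.split(":")
    return int(seg, 16), int(word, 16), int(syl, 16)

# 03C:0041:3 -> segment 60, word offset 65, syllable offset 3
```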
The processor uses variable-length instructions and can branch to any
syllable offset within a segment. By using one of several dynamic branch
instructions with a Program Control Word (PCW, tag 7) as an argument, a
program can branch across segments -- in fact PCWs allow programs to
branch and call procedures across stacks. When a program branches to a
different code segment (either directly or by means of a procedure
call), the segment is made present if it is not already so, and the
segment length and base memory address are loaded into registers within
the processor designated for that purpose. Intra-segment branches use
only the word and syllable offsets, and therefore do not need to
continually reference the code segment descriptor.
The Presence bit in a code segment descriptor indicates physical
presence or absence of the segment in memory, just as for data segments.
For an absent descriptor, the address field indicates the offset within
the codefile where the segment starts. Code segments and constant pools
are not loaded from the codefile until first reference. When a program
is initiated, the Segment Dictionary is the only thing that is loaded
from the codefile. Everything else is loaded as a result of Presence bit interrupts.
Since codefiles and code segments are guaranteed to be read only, code
segments (and constant pools loaded from the codefile) are never rolled
out to an overlay file. The physical memory area is simply deallocated
and the codefile offset restored in the absent descriptor (that offset
is stored in one of the memory link words while the segment is present
and the descriptor's address field points to the area in memory). The
code or data is simply reloaded from the codefile the next time a task
trips over the descriptor's Presence bit.
Descriptors obviously support bounds checking, along with the dynamic
relocation and overlay of real memory areas, but they have another
significant advantage -- the dynamic resizing of data segments. Since
the length of a segment is part of the descriptor, the basis for bounds
checking is centralized. The physical memory area can be resized by a
user program at run time, the length in the descriptor will be updated,
and future indexing operations will check against the new length.
The B6700 and all later systems support the programmatic resizing of
data segments. Code segments are immutable, so resizing them is not a
meaningful operation. Support for resizing varies by language, but in
Algol, it is performed using the RESIZE intrinsic procedure.
RESIZE(A,N) resizes the array A to a new length of N. The unit of N
is determined by the Element Size field in the descriptor. The old
segment is deallocated and its contents are lost. The segment with
the new length will not be allocated until it is next referenced.
RESIZE(A,N,RETAIN) allocates a new segment of length N and copies up
to N units from the old segment to the new one. If the new length is
shorter, any remaining units from the old segment are lost; if the
new length is longer, the remainder of the segment is filled with
binary zeros. Once the data is copied to the new segment, the old
segment is deallocated.
RESIZE(A,N,PAGED) is similar to the RETAIN option, but creates the
new data segment with a paged data descriptor.
RESIZE works on individual segments. In the case of multi-dimensional
array-of-arrays structures, the row for each final dimension can be
resized independently and to differing lengths.
Actually, RESIZE can be applied to any of the dimensions of a
multi-dimensional array. Given an Algol declaration of
ARRAY M [0:99, 0:49, 0:63, 0:4095];
RESIZE(M[4,7,*,*], 75, RETAIN) would resize the M[4,7] dope vector in
the third dimension from its original length of 64 to a new length of
75. This would create 11 new, untouched descriptors of length 4096 at
the end of that dope vector.
These resizable array-of-arrays structures are often used, especially in
Algol, to implement flexible, safe, dynamic storage allocation schemes
for user programs.
Addressing Data -- Indexing and the Stack.
Note that in the discussion thus far, nothing has been said about
user-level programs having access to memory addresses -- real or
virtual. Data descriptors for the B6700 can hold real memory addresses,
but the content of descriptors is controlled only by hardware
instructions and the MCP, not user-level code. User-level code (and all
but small parts of the MCP kernel, for that matter) does not manipulate
addresses, it manipulates offsets into memory areas. Separate memory
areas generally relate to separate objects for a programming language
(e.g., arrays for Algol, Pascal, and Fortran, or 01 levels for COBOL).
Since there are no addresses, there can be no invalid addresses. You can
try to use an invalid offset, of course, but that offset must be applied
against a descriptor, where it is ALWAYS checked against the length
field. If the offset is less than zero or greater than or equal to the
length, your task becomes the happy recipient of the "Invalid Index"
bounds violation interrupt. Depending on the language you are using, you
can trap this interrupt (actually, what you trap is the MCP's response
to the interrupt), but you can't restart the operation which caused it
-- your only options are to branch to a recovery routine or enter the
catch portion of a try/catch block. If you do not trap the interrupt,
your task is terminated by the MCP.
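A toy Python model of this discipline (class and exception names are mine) makes the point: user code supplies offsets, never addresses, and every offset is checked against the descriptor's length field:

```python
class InvalidIndex(Exception):
    """Stand-in for the MCP's 'Invalid Index' bounds-violation interrupt."""

class Descriptor:
    """Toy data descriptor: all access goes through an offset that is
    always checked against the length before the area is touched."""
    def __init__(self, length):
        self.length = length
        self.area = [0] * length

    def index(self, offset):
        if offset < 0 or offset >= self.length:
            raise InvalidIndex(f"offset {offset} outside [0, {self.length})")
        return offset          # think: an indexed copy descriptor

    def load(self, offset):
        return self.area[self.index(offset)]

    def store(self, offset, value):
        self.area[self.index(offset)] = value
```

There is no way, in this scheme, to form a reference that escapes the checked area.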
Another thing that has not been mentioned yet is registers. There are
certainly registers in the processor -- the B6700 had dozens of them --
but the instruction set accesses them implicitly. None of them are
accessed directly by user-level code. (Actually, on the B6700 that
wasn't quite true -- compilers would generate code to access some of the
registers directly, but this was not common. Access to these registers
has been tightened up considerably on more recent systems). Instead of
loading addresses directly into registers and using those to read and
store real or virtual memory locations, user-level programs on MCP
systems access data in two ways: (a) directly in a stack or (b) outside
of a stack by going through a descriptor that is in a stack.
A stack in the MCP architecture is simultaneously three things:
(a) a memory area for push/pop expression evaluation,
(b) the stack frames (activation records) and return history for
procedure calls, and
(c) the basis for addressing global and local variables in a program.
The B5000 and all of its descendants were designed to support Algol. To
understand how stack addressing works, you need to understand how Algol
programs can be structured. If you don't know Algol, think Pascal --
they're very similar.
An Algol program is constructed as a series of nested blocks. Each block
can contain local variables (including procedure declarations), but also
has addressability to the variables in all of its more global (i.e.,
containing) blocks. The body of a procedure can be considered a block,
whether it has local declarations or not. The scope rules for
identifiers in this scheme are the same as for a two-level language such
as C, it's just that there can be more levels. In MCP systems, this
nesting is referred to as the "lexicographical level" (lex level or LL).
The global code ("outer block") of most user programs runs at LL=2.
First-level procedures (such as you would have in a C program) run at
LL=3. Any procedures declared within those first-level procedures
(something you can't do in C) would run at LL=4, and so forth.
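Python happens to allow the nesting that C does not, so the lex-level idea can be sketched with nested functions (the mapping to LL numbers is just illustrative):

```python
def outer_block():            # think LL=2: the program's outer block
    g = 10                    # a global of the program
    def first_level():        # LL=3: like a top-level C function
        def second_level():   # LL=4: a nested procedure; it can see g
            return g + 1      # ...directly, via the enclosing scopes
        return second_level()
    return first_level()
```

Each nested function has addressability to everything declared in its containing blocks, which is exactly the scope structure the D registers support in hardware.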
The Segment Dictionary is loaded as a stack with an environment of LL=1.
Therefore, multiple tasks initiated from the same codefile see their
code segments and constant pools as globals at a higher level -- hence
the Segment Dictionary and all of the segments based on it are
reentrant. The MCP stack (another that is purely an addressing
environment, not a data space for a task) is at LL=0. Note that stack
addressing can cross stacks. Also note that on the B6700, all MCP
globals were in the scope of the user programs. This allowed them to
call MCP service procedures directly, as if they were global to the user
program (which, in fact, they were). There is no "service call" or
"branch communicate" or other major environment change to access O/S
services -- user programs simply call what to them are global procedures.
This accessibility to the MCP stack was recognized as a serious security
issue fairly early on, and later systems blocked direct access to LL=0.
User programs now still call MCP service routines as normal procedures,
but access to the procedure entry points and other global objects is
provided through indirect addresses in the Segment Dictionary. These
indirect addresses are initially set up to cause a fault interrupt when
first accessed, at which point the MCP verifies the access and replaces
the intentionally bad word with a valid SIRW (tag=1, Stuffed Indirect
Reference Word) to the MCP global. Subsequent references to the service
routine merely require a one-level indirection to reach the entry point.
The "stuffed" for an SIRW refers to its ability to address a word in a
stack outside the current scope chain, and in particular, across stacks.
What gets stuffed in the word is sufficient environment information to
allow out-of-scope, cross-stack addressing. A normal IRW (also tag=1)
only addresses within the current scope chain. There is an instruction
(STFF) that converts a normal IRW to an SIRW. Perhaps inevitably, it is
commonly called the "get stuffed" operator. (Current systems no longer
generate normal IRWs -- all reference words are now generated as SIRWs
unconditionally, so STFF has become a no-op).
The local variables for a block are allocated in the stack. In the case
of procedures, this provides efficient recursion. Simple blocks (i.e.,
nested BEGIN/ENDs containing declarations) are implemented as
parameterless procedure calls. Therefore, more-global variables are
lower in the stack, or possibly in a separate stack (note that unlike
some systems, MCP stacks grow from low addresses to high addresses -- a
push increments the top-of-stack address, not decrements it).
Addressing within the current scope chain is a very common thing to do,
so MCP systems provide a series of base registers, called the "D" (for
"display") registers. There is one D register for each lex level, and it
contains the absolute (real) memory address to the base of a block's
stack frame. The B6700 had 32 such D registers, but this proved (for
once) to be more than necessary. Later systems cut back to 16 D
registers (allowing user procedures nested 13 levels deep -- I doubt
that I've ever coded anything that goes more than four levels deep). The
B5900, which was the first microcoded processor and was based on
bit-slice chips, tried to get by with four D registers (0, 1, 2, and
current LL), but that didn't work too well, and that approach was
abandoned in later designs.
The simplest addressing mode for MCP systems is based on these D
registers and uses a construct known as an "address couple". The address
couple has two fields, LL (which selects a D register) and an offset
from the address in that D register. This is written "(LL,offset)" --
thus (2,17) refers to the 17-th (zero relative) word in the global
(LL=2) address space of a program. For an Algol program, this address
space would be the outer block; for COBOL, it would be WORKING-STORAGE;
for FORTRAN, it would be COMMON; for a C program it would be the
environment for static declarations.
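A minimal sketch of address-couple resolution in Python (the D-register values below are made up purely for illustration):

```python
def resolve(d_regs, couple):
    """Resolve an (LL, offset) address couple: the D register for that
    lex level holds the base of the frame; add the offset to it."""
    ll, offset = couple
    return d_regs[ll] + offset

stack = [0] * 64
d_regs = {2: 8, 3: 30}                 # illustrative frame bases only
stack[resolve(d_regs, (2, 17))] = 42   # store into the word at (2,17)
```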
With that introduction to stack addressing, here is the key concept:
scalar variables are allocated in the stack; non-scalar variables
(arrays, structures, records, whatever) are allocated outside the stack.
A descriptor is allocated in the stack to access that non-scalar area.
To illustrate this, here is a simple Algol program:
1: BEGIN
2: INTEGER I;           % I is allocated at (2,2)
3: ARRAY A[0:99];       % descriptor for A at (2,3)
4: I:= 1;
5: DO
6:   A[I]:= A[I-1]+I
7: UNTIL I:= I+1 > 100;
8: END.
The stack offset for I starts at 2 because there are two linkage words
at the base of a stack frame for (a) the procedure return address and
(b) a control word called the MSCW (Mark Stack Control Word) that allows
the processor to reconstruct the D registers to the values for the
calling environment upon procedure (or in this case, block) exit.
Here is the code that Algol would generate for this snippet (note that
this is slightly idealized and out of order from what the current Algol
compiler would generate, but both the concept and effect are accurate).
Line 1: (nothing)
Line 2: ZERO push a zero value onto the stack at (2,2)
Line 3: LT48 000006400000 push a skeletal descriptor with length=100
onto the stack at (2,3)
LT8 5 push a literal 5 onto the stack
STAG set a tag=5 onto the skeletal descriptor; this
leaves an untouched descriptor at (2,3)
PUSH flush the top-of-stack (TOS) words into memory
LT16 2048 push literal 2048 (this is a marker word for the
MCP BLOCKEXIT procedure called at the end)
BSET 47 set bit 47 in TOS word
LT8 6 push a literal 6
STAG set tag=6 on the BLOCKEXIT marker word
Line 4: ONE push a literal 1 onto the stack
NAMC (2,2) Name Call: push an IRW for I onto the stack
STOD Store Destructive: store the 1 to the (2,2)
address; pop both the IRW and literal 1
Line 5: (nothing)
Line 6: VALC (2,2) Value Call: push a copy of the value of I
NAMC (2,3) push an IRW to the descriptor for A
INDX Index: pop both parameters; push an indexed copy
descriptor for A[I] (think: address of A[I])
VALC (2,2) push another copy of I
ONE push a literal 1
SUBT Subtract: pop both parameters and push the
value of I-1
NAMC (2,3) push IRW to the descriptor for A
NXLV index and load value: index A by I-1; pop both
parameters and push the value of A[I-1]
VALC (2,2) push a copy of the value of I
ADD pop top two values; push the value of A[I-1]+I
STOD store A[I-1]+I in A[I]; pop both addr & value
Line 7: VALC (2,2) push a copy of the value of I
ONE push a literal 1
ADD pop top two values and push the value of I+1
NTGR integerize TOS word with rounding
NAMC (2,2) push an IRW for I onto the stack
STON Store Nondestructive: store I+1 back into I;
pop the address but leave value I+1 on TOS
LT8 100 push a literal 100 onto the stack
GRTR Compare Greater: [(I+1) > 100]; pop both values;
push result of comparison: 1 if true, 0 if false
BRFL 4:4 branch false: if low-order bit of TOS word is 0,
branch to Line 6 (word 4, syllable 4 in the
current code segment); pop the TOS word
Line 8: MKST construct and push a MSCW (Mark Stack Control
Word) in preparation for a procedure call
NAMC (1,4) push an IRW to the PCW for the MCP's BLOCKEXIT
procedure (actually, for an MCP intrinsic, it
would be an IRW to an SIRW in the Segment
Dictionary to the PCW in the MCP stack for
BLOCKEXIT). This procedure is not passed any
parameters, but if it were, they would be pushed
into the stack at this point.
ENTR Enter: call the BLOCKEXIT procedure
EXIT Exit Procedure: cut back the stack (thus
destroying this activation record), and exit the
outer block of the program (this exits back into
an MCP procedure which terminates the task and
disposes of the stack's memory and related resources).
When this program is initiated, the MCP reads some information from the
codefile that tells it how to set up the data stack, including a
recommended initial size. If the Segment Dictionary is not already
present (due to another task executing the same codefile), a "code
stack" is allocated for the Segment Dictionary and its image is loaded
from the codefile. There is a base area of the data stack that the MCP
uses for task management, which it also sets up. No program globals are
loaded, however -- this will be done by stack-building code generated by
the compiler for the outer block's data segment (as is shown for I and A
in the example above). Instead, the MCP creates a dummy stack frame that
makes it appear as if this task has called a procedure, but the return
address from that call is set up as the entry point to the outer block's code.
The MCP also constructs a TOSCW (Top of Stack Control Word) at the base
of the stack, which tells the hardware how to find the top of stack and
the base of the top stack frame. From that, the processor can
reconstruct all of the stack linkage, D registers, return instruction
address, and so forth. After building the initial stack image, the MCP
simply links the task into the READYQ, the prioritized list of tasks
waiting for a processor. Once the task rises to the top of this queue, a
processor is assigned to it, at which point the processor "exits" into
the entry point.
The B6700 has an instruction, MVST (Move to Stack), that switches the
processor from its current stack to another one. This instruction
constructs a TOSCW for the current stack and uses the TOSCW for the new
stack to reconstruct stack linkage and register settings. Later systems
did context switching in different ways, but it appears that on the
current Libra systems, MVST is once again how it's done.
Note that once the MCP sets up the initial stack image and releases the
new task to the READYQ, all further saving and restoring of registers
and other state information is handled automatically by the hardware.
Since all registers have specific purposes (i.e., there are no
general-purpose registers being used who knows when and for what), the
hardware knows when the value of a register needs to be pushed into
memory or recalled. This applies not only to context switches between
tasks, but also to all procedure calls. Hardware interrupts are
implemented as a forced procedure call on the stack that currently has
the interrupted processor, so the same state-saving mechanism is used
for interrupts as well.
Back to the Algol example above, the very first thing that happens when
the processor exits to the entry point is a Presence bit interrupt as it
detects the Presence bit is zero in the descriptor for the outer block's
code segment. Execution continues once this code segment is made present.
The stack-building code at the beginning of the outer block creates the
local variables for the stack frame and pushes them onto the stack. In
the case of integer I, this is simply a literal zero; in the case of
array A, the code constructs an untouched data descriptor of length 100.
The 100 words of memory for the array will not be allocated until the
descriptor is first "touched" and its zero Presence bit detected. This
will happen the first time the NXLV (Index and Load Value) instruction
is executed in Line 6. Note that the INDX (Index) instruction executed
earlier does not cause a Presence bit interrupt, since it only generates
an indexed copy descriptor and does not attempt to access an array
element. The INDX instruction effectively acts as a "load address"
instruction. Bounds checking takes place on both INDX and NXLV, however.
The program then proceeds to initialize the value of I (some compilers
would fold this assignment into the stack-building code, but Algol does
not), and execute the DO loop that iterates through the elements of
array A. To someone used to register-based architectures, this code
probably looks like it generates a lot of memory accesses -- all those
VALC and index operators, not to mention the stack pushes and pops. On
the B6700 that certainly was true, as there was essentially no caching,
except for two TOS registers. More recent implementations use caching
extensively, however, and most of the apparent memory references would
stay inside the processor.
Another thing that may be apparent is that Algol does very little
optimization. It is a one-pass compiler and, for better or worse, emits
instructions in pretty much a what-you-code-is-what-you-get manner.
This program contains an intentional bug, which becomes apparent on the
last iteration of the DO loop. The value of I is 100, which is greater
than the upper bound of array A. When I compiled this and ran it on an
MCP system, I got the following message from the MCP:
2228 MSRHI3:INVALID INDEX @ (00100600)
2228 HISTORY: 003:0001:3 (00100600).
F-DS 2228 (PAUL)OBJECT/SIMPLE/ALGOL ON OPS.
This indicates a bounds violation on line 100600 (a sequence number that
is part of the source file line) at code segment address 003:0001:3,
which is the INDX instruction for Line 6 in the example above. This
bounds checking is not due to any debug or diagnostic mode I enabled for
the compiler or the object code -- it's implicit in the segmented
addressing mechanism for the architecture and cannot be turned off.
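The off-by-one is easy to reproduce in a sketch: a Python model of the loop with the same bounds checks faults exactly where the MCP reported it, on the attempted store to A[100]:

```python
def run():
    """Model of the DO..UNTIL loop over a bounds-checked 100-word array.
    The explicit checks stand in for the hardware's length test that is
    applied on every index operation."""
    a = [0] * 100
    i = 1
    while True:
        if not (0 <= i - 1 < 100) or not (0 <= i < 100):
            return ("INVALID INDEX", i)    # the bounds violation
        a[i] = a[i - 1] + i                # A[I]:= A[I-1]+I
        i += 1
        if i > 100:                        # UNTIL I:= I+1 > 100
            return ("done", i)
```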
The value 2228 is the MCP-assigned task number. F-DS indicates the
program was terminated (discontinued, or "DS-ed" in MCP parlance) due to
a fault interrupt. Although it is not apparent from this example, the
HISTORY line is a trace of return addresses -- it shows the history of
procedure calls that got to the point where the fault occurred.
Assuming this bug did not exist (i.e., the comparison on line 7 was ">
99" instead of ">100"), the loop would have terminated when the value of
I reached 100 and control would have fallen into the END statement for
the block. The NTGR instruction for line 7 is due to the numeric format
used with all MCP systems since the B5000 -- integers are implemented as
a subset of floating-point values, and integer overflow generates a
floating-point result. NTGR normalizes the TOS value as an integer and
generates a fault interrupt if it exceeds the limits of integer
representation (+/- 2**39-1).
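That NTGR behavior can be sketched in Python (the rounding mode here is Python's, which may differ from the hardware's in edge cases):

```python
LIMIT = 2 ** 39 - 1     # single-precision integer range, +/- 2**39-1

def ntgr(value):
    """Model of NTGR: round the top-of-stack value to an integer and
    fault if it falls outside the representable integer range."""
    n = round(value)
    if abs(n) > LIMIT:
        raise OverflowError("integer overflow fault")
    return n
```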
The call to BLOCKEXIT at the end of the code for the block is generated
by the compiler to dispose of any complex objects (arrays, files, etc.)
that were declared in this stack frame. The compiler generates a tag-6
marker word at the end of stack-building code that serves as a parameter
to BLOCKEXIT. This marker word contains a bitmask indicating which types
of resources BLOCKEXIT should look for. Failure to call BLOCKEXIT when
required would result in memory leaks, and the presence of this call is
another example of the trust the system places in its compilers. More
recent E-mode levels include a "blockexit bit" in one of the stack
linkage words that can be used by the MCP to enforce proper disposal of
stack frame resources before the frame can be exited.
A Series Enhancements to the Descriptor Mechanism.
The Address field of data and code segment descriptors is 20 bits wide,
which allows for a total of 1048576 words (6 MB). The B6700 has the same
maximum physical memory size, so the field width was adequate. In the
late 1960s and early '70s 1 MW seemed near infinite, but as systems
became larger through the '70s (and especially as the use of on-line
applications and data bases grew during this period), this upper limit
on physical memory of 1 MW became grossly inadequate. The
B6800/7800/6900/7900 implemented a somewhat crude paging technique (the
infamous "Global(tm)Memory") that helped somewhat on multi-processor
systems, but the physical address space for a given processor at any one
time remained at 1 MW.
The A Series models introduced starting in the early 1980s addressed
this problem by implementing a concept known as ASD (Actual Segment
Descriptor). Heretofore, the "mom" descriptor in a program's data stack
was the owner of a memory area and pointed directly to it when the
segment was present in memory. There was no room to expand the address
field in the 48 bit descriptor word, so the role of owner was moved from
the mom to a central ASD table in memory. Instead of a real memory
address, the Address field in descriptors now holds an index into this
ASD table. On the latest processors, each entry in this table contains
eight 48-bit words, of which only the first three are used by the
hardware. The actual location, length, and status of each allocated
memory area is now stored in these table entries, hence the ASD name. It
functions similarly to the page table in other virtual memory
architectures, except that the "pages" are variable-length segments.
Most processors use caching to reduce the incidence of real memory
accesses to this table, effectively implementing a form of TLB.
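The indirection can be sketched in a few lines of Python (the field names and table layout here are invented; real ASD entries hold considerably more state):

```python
# Sketch: descriptors hold an ASD index rather than a real address, so
# relocating a segment updates one table entry and every descriptor
# pointing at it, copies included, stays valid with no fix-ups.
asd_table = {}    # ASD index -> actual location/length of the segment

def alloc(index, address, length):
    """MCP-side: record a segment's actual address and length in the
    table and hand back a descriptor carrying only the ASD index."""
    asd_table[index] = {"address": address, "length": length}
    return {"asd_index": index}

def real_address(desc, offset):
    """Indexing: resolve through the ASD entry, with bounds checking."""
    entry = asd_table[desc["asd_index"]]
    if not 0 <= offset < entry["length"]:
        raise IndexError("bounds violation")
    return entry["address"] + offset

mom = alloc(7, address=1000, length=100)
copy = dict(mom)                 # a copy descriptor: same ASD index
asd_table[7]["address"] = 5000   # relocate; no descriptor fix-ups needed
```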
The concept of "mom" and "copy" descriptors no longer exists in the
architecture, at least not in anything like the way it was in the B6700.
All descriptors in data segments (except untouched descriptors) point to
an ASD table entry. In fact, since E-mode level Gamma introduced the use
of four-bit tags (initially for the A11 and A16 models in the early
1990s), tag-5 words are used only to represent untouched descriptors.
Once the area is allocated, descriptors accessible to user-level code
(now called "virtual segment descriptors" or VSDs) use tag values of C,
D, E, and F to identify various combinations of indexed/unindexed and other descriptor attributes.
The ASD index field in these VSD words is currently 23 bits in length,
allowing for a maximum of just over 8 million segments in the system.
The maximum length of a segment is still limited to 2**20-1 words,
although it appears this could be extended. The physical address field
in the ASD is currently 36 bits, allowing a maximum physical memory
space of 64 GW or 384 GB. The large Libra systems I have encountered
recently have 4-8 GW of physical memory, which appears to be more than adequate.
With so much physical memory available, most systems today run with
large amounts of the memory space unallocated, and almost no overlay due
to memory allocation pressure takes place. As a result, the Presence bit
mechanism now largely serves as an allocate-on-first-reference
capability. Should you fill up the physical memory, however, the
automated overlay ("virtual memory") mechanism will still do its thing.
The size of the ASD table is established at system initialization time,
based on the total size of physical memory and a factor that is settable
by the system administrator. Running out of ASD table entries is a
no-no, and causes the system to halt.
A serious issue with the B6700 design was management of copy
descriptors. Every time the presence, location, or size of a segment
changed, not only did the mom descriptor need to be fixed up, but all of
the copies, too. The only way to find those copies was to search for
them, so copy descriptors were only permitted in stacks. There were
(still are) special instructions to search for these copies, and a
considerable software investment was made to minimize the number of
stacks that needed to be searched, but stack-search overhead could be
fierce, especially on a system that was near or beyond the thrashing point.
The ASD implementation considerably helped this situation. Copies no
longer contain real addresses to the data area or to the mom -- instead
they point to the ASD table entry for the segment. This table index does
not change through the life of the segment, so copies no longer need to
be found and fixed up when the location, size, or status of a segment changes.
One of the nicest aspects of the ASD implementation was that it had
essentially no impact on user applications. Since descriptors are
managed by the MCP, the details of how the indexing instructions compute
real addresses are opaque to user-level code. It was necessary to
recompile some programs, although not specifically to support the ASD
addressing changes -- the B6700-era compilers emitted code that would
access fields of descriptors (e.g., to determine the length of an area
in Algol, you could obtain a tagless copy of the descriptor and isolate
bits [39:20]). Starting with E-mode level Gamma, the length field was
not even in the VSDs accessible by user-level code anymore, so new
instructions (e.g., GLEN to determine the length of a segment) were
implemented to perform these functions, and the old methods (along with
the Algol syntax that supported them) were deprecated. Some other
model-specific instruction sequences (such as directly accessing
processor registers) were eliminated at the same time, all of which
improved the security of the system and reduced somewhat the reliance on
trusted compilers. That reliance was not eliminated entirely, however.
There was a lengthy transition period that allowed users to recompile
their programs so that their codefiles would be compliant with the newer E-mode levels.
Issues with Descriptors and the MCP Architecture in General.
The first thing almost everyone comments on when first being exposed to
the stack- and descriptor-based architecture of MCP systems is the
memory access overhead of push/pop on the stack and of having to index
through descriptors to reach data. The second thing that gets comments
is the lack of user-accessible registers.
There is no question that these characteristics of the architecture add
overhead and (at least in the myopic view in which most seem to consider
performance issues) degrade performance. This overhead is at least
partially offset, however, by:
* the lack of unnecessary state saving,
* the efficiencies resulting from variable-length memory segments,
* the efficiencies resulting from avoiding unnecessary memory
allocation by delaying it until first reference,
* the efficiencies resulting from code and data segments being
closely related to language objects,
* the efficiencies resulting from being able to safely access data
and code across addressing environments and task boundaries
(marshalling data across process boundaries is a foreign concept
in the MCP),
* the efficiencies in context switching,
* the efficiencies in interrupt handling, and
* the efficiencies resulting from hardware and operating system
environments that were designed specifically for each other.
In an I/O-intensive, transaction-server environment (which is what the
MCP systems are designed for), this performance trade-off balances out
better than you might think. Where the architecture loses at the micro
level of instruction performance, it gains at the macro level of system
performance. For a server, that's what you want. Need to do
high-performance numerical computation? MCP systems are probably not the
ones you should consider using. Need to do high-performance transaction
processing and safely balance the needs of hundreds or thousands of
tasks competing for processors, memory, and I/O paths? MCP systems do
quite well in that solution space.
There is another aspect to performance that I do not think is considered
often enough -- reliability performance. The lack of low-level bounds
checking in other systems terrifies me, and it should terrify you. The
idea that bound violations can be prevented simply by programmers "being
careful" is both silly and irresponsible. Giving addresses to
programmers is like giving whiskey and car keys to teenagers -- sooner
or later something stupid is going to happen, and it's probably going to
be sooner. I say this being a programmer. The current problems we have
with malware are largely due to unchecked memory accesses and allowing
data to be treated as code. These are problems that MCP systems simply
do not have.
As I have said before in this space, there is a cost to using
descriptors and hardware-enforced bounds protection. There is also a
cost to not using descriptors and hardware-enforced bounds protection.
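The descriptor idea can be sketched in a few lines. This is a toy Python model, not the real hardware (real MCP descriptors are tagged memory words, and the check is done by the processor, not by software): every access goes through a descriptor carrying a base and a length, and an out-of-range index traps instead of silently touching a neighbour's data.

```python
class BoundsViolation(Exception):
    pass

class Descriptor:
    """Toy model of a data descriptor: a base address plus a length field."""
    def __init__(self, memory, base, length):
        self.memory = memory
        self.base = base
        self.length = length

    def read(self, index):
        # The hardware checks every index against the length field; an
        # out-of-range index causes an interrupt rather than a silent
        # read of adjacent storage.
        if not 0 <= index < self.length:
            raise BoundsViolation("index %d outside [0, %d)" % (index, self.length))
        return self.memory[self.base + index]

    def write(self, index, value):
        if not 0 <= index < self.length:
            raise BoundsViolation("index %d outside [0, %d)" % (index, self.length))
        self.memory[self.base + index] = value

memory = [0] * 64
buf = Descriptor(memory, base=8, length=4)
buf.write(3, 42)               # in bounds: fine
try:
    buf.write(4, 99)           # classic off-by-one overflow: trapped
    overran = True
except BoundsViolation:
    overran = False
```

The cost is the per-access check; the benefit is that the overflow above is caught at the moment it happens, which is exactly the class of bug behind much of the malware mentioned above.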
In my opinion, the MCP architecture has two major problems -- and
neither of them relates to performance. The first is the reliance on
trusted compilers. As discussed briefly above, tweaks to the
architecture over the past 30 years have improved this situation, but
the security of the system is still too dependent on the quality of code
that the compilers generate. Barring a social engineering attack, it is
quite difficult to get untrusted code into the system in a form that can be
executed. Once an untrustworthy compiler is authorized, though, havoc is
possible. I am not aware that penetration of untrusted code has ever
been a problem since the early B6700 days when some major holes were
exposed in the enforcement of codefile integrity, but this is too ripe
an area for potential abuse, and one that requires enforcement in too
many parts of the system, to be considered an adequate aspect of the
system's security.
The second problem is that the segmentation, addressing, and memory
management mechanisms of the system are built for hierarchical,
block-structured languages such as Algol and Pascal. Memory objects
effectively "belong" to the block that declared them, and are
automatically deallocated when that block exits. This approach also
works fine for COBOL, and is adequate for FORTRAN (at least through
FORTRAN-77). It works poorly, though, for languages which rely on
heap-based memory management, where an object can have a life after its
originating environment no longer exists.
The MCP compilers and operating system go to great lengths to prevent
"up-level pointers" -- the existence of references to a
locally-allocated segment that can be stored in a more global scope. The
system does not have an efficient way to locate and invalidate such
up-level references when the locally-allocated segment is deallocated,
so their use is prohibited.
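The up-level pointer hazard can be sketched as follows. Note that the MCP forbids the offending assignment statically; this Python toy model instead invalidates the descriptor at block exit, to show what a system would otherwise have to catch at run time on every use of the dead reference.

```python
class Segment:
    """Toy model of a block-local data segment reached via a descriptor."""
    def __init__(self, data):
        self.data = list(data)
        self.valid = True

    def read(self, i):
        if not self.valid:
            # If up-level references were allowed, every use of a dead
            # segment would have to be trapped like this.
            raise RuntimeError("reference to deallocated segment")
        return self.data[i]

saved = None            # a more-global scope

def inner_block():
    global saved
    local = Segment([1, 2, 3])
    saved = local       # the forbidden "up-level pointer"
    # block exit: the block-local segment is deallocated
    local.valid = False

inner_block()
try:
    saved.read(0)
    dangling_read_succeeded = True
except RuntimeError:
    dangling_read_succeeded = False
```

Finding and invalidating all such references at deallocation time is what the system has no efficient way to do, which is why the compilers prohibit creating them in the first place.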
Current MCP systems support a C compiler, but the performance of its
code is not all that good, partly because C is basically a high-level
assembler for register-based, flat-address architectures (to which the
MCP architecture is a nearly complete antithesis), and partly because
the C heap is currently implemented as a series of array rows, with C
pointers implemented using integer offsets into the runtime
environment's heap space. It works, but the result is not very efficient.
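The offset-as-pointer scheme can be sketched like this. Only the idea of C pointers as integer offsets into a heap area is taken from the description above; the allocator and the names here are illustrative assumptions, not the actual MCP C runtime.

```python
class Heap:
    """Toy model of a C heap built on one array, with pointers as offsets."""
    def __init__(self, size):
        self.cells = [0] * size
        self.next_free = 1      # offset 0 plays the role of NULL

    def malloc(self, nwords):
        # A "pointer" is just an integer offset into the heap array.
        p = self.next_free
        self.next_free += nwords
        return p

    def load(self, p, i=0):
        # Every dereference becomes an extra array-indexing step, which
        # is one reason the generated code is not very efficient.
        return self.cells[p + i]

    def store(self, p, i, value):
        self.cells[p + i] = value

heap = Heap(1024)
p = heap.malloc(4)
heap.store(p, 0, 7)
q = p + 1                      # pointer arithmetic is integer arithmetic
heap.store(q, 0, 8)
```

The scheme preserves the descriptor architecture's safety (all accesses stay inside the heap array's descriptor) at the price of an extra level of indirection on every pointer operation.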
The problem is even worse for object-oriented languages such as Java. A
descriptor-based capability architecture should be a good fit with an
object-oriented language (a descriptor, after all, is a primitive form
of object), but the current MCP architecture is too closely tied with
the Algol memory model to work well natively with Java. I consider this
to be both a real shame, and a real threat to the future viability of
the MCP architecture.
Those issues aside, I think the MCP architecture is still the most
interesting, if least appreciated, one on the market today. There are
other interesting aspects of the architecture that I have either passed
over quickly (such as stack linkage and cross-stack addressing) or
ignored altogether (such as the process model and the concepts of task
families and synchronous and asynchronous dependent tasks). Then there
are the MCP stack, the stack vector, procedure calls, accidental entries
("thunks"), parameter passing, string and data movement instructions,
server libraries, and connection libraries. These are all also well
integrated with the segmentation, stack, and addressing issues discussed
above. In fact, one of the endlessly fascinating things about this
architecture to me is that, for all of its complexity, everything fits
into a nicely integrated whole. It's quite elegant, really.
For those willing to RTFM, the Unisys support web site allows free
access to essentially all of the documentation for current systems. You
can access this through the front door by going to
http://support.unisys.com/ and clicking the "Access documentation" link
at the very bottom of the page. Click through the agreement and you will be
presented with a page that can search the documentation. You can also
access documents directly if you know their URL. Here are some useful ones:
ClearPath Libra 680/690 (latest version of E-mode level Epsilon spec):
ClearPath NX6820/6830 (E-mode level Delta spec):
Current Algol language reference:
Bitsavers has a good collection of documents for the older MCP systems.
In particular, you might want to look at the following documents under
Narrative Description of the B5500 MCP:
B5500 Reference Manual (architecture reference):
B5500 Extended Algol:
Elliot Organick's 1973 book on the MCP architecture:
B6700 Reference Manual (architecture reference):
A good paper by Hauck and Dent on the B6500/6700 stack mechanism:
If on first reading you don't understand this architecture, you're
running about average. I will happily try to reply to questions and
comments.
I will certainly agree with Edward that process initiation and
termination carry a fair amount of overhead, but disagree about
switching among processes. Because register state is saved and restored
automatically by the hardware, there is very little unsaved state that
needs to be handled during a process switch. On the B6700 and on the
latest high-end Libra machines there is a single instruction that does
the switch. The overhead Edward may be considering is operating system
overhead for process accounting -- e.g., accumulated CPU time -- but
that is done out of choice, not architectural necessity.
The current hardware limit on the size of a single data segment is
2^20-1 words. I thought that was also the upper limit on the size of a
user-level array row, and seeing 2^28 quoted (and assuming that meant
2^28-1), I had to check this out. I therefore wrote the following
Algol test program (excerpt):
DEFINE MAXBIT = 28 #;
A[2**(MAXBIT DIV 2)-1]:= 2**(MAXBIT DIV 2)-1;
This did not compile (using the MCP 11.1 DMALGOL compiler on a Libra
300), generating an error on the declaration for array A, "This
dimension is declared with too many elements".
Then I tried MAXBIT=27. That compiled, but running the program generated
a run-time error on the assignment to A (which is where the array
would have been allocated, being the first reference to one of its
elements), "DIMENSION SIZE ERROR 1=134217728".
Next, I tried MAXBIT=26, and that both compiled and executed
successfully. The memory dump for the task generated by the next-to-last
statement confirmed that the array was segmented, and allocated as 8193
pages of 8192 words each (the last page is zero length, which is a
hardware requirement when the length of the segmented array is a
multiple of 8192). The dump also confirmed that only pages 0 and 8191
were actually allocated in memory, as they were the only ones that were
touched.
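The page arithmetic behind those dump observations can be checked directly, assuming the 8192-word page size and the stated rule that a segmented array whose length is an exact multiple of 8192 gets a trailing zero-length page:

```python
WORDS_PER_PAGE = 8192            # 2**13-word pages

def pages_for(length_in_words):
    full, rem = divmod(length_in_words, WORDS_PER_PAGE)
    # An exact multiple gets a trailing zero-length page (the stated
    # hardware requirement); otherwise the remainder is a partial page.
    # Either way the count is one more than the number of full pages.
    return full + 1

def page_of(index):
    return index // WORDS_PER_PAGE

# The MAXBIT=26 experiment above:
assert pages_for(2**26) == 8193
assert page_of(2**13 - 1) == 0   # A[2**(MAXBIT DIV 2)-1] lands in page 0
```

Only the pages actually touched by the program need to be materialized in memory, which is why just two of the 8193 pages were allocated.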
The Libra 300 is an E-mode level Delta machine (it's also an emulated
machine -- the MCP architecture is implemented in software using a
standard Intel Pentium box running Windows Server 2003). It is possible
that the latest, E-mode level Epsilon machines may allow higher limits,
but I do not have ready access to one to test this.
> There's probably still a limitation on the size of code segments. But the
> compilers take care of that, and it's totally transparent. It's been years
> since I was even aware of the size of code segments except in bizarre cases
> where it would be a clue to some strange bug.
> If you want a lot of big segments, just declare
> ARRAY A [0:2**20-1, 0:2**20-1, 0:2**20-1];
> (Note that exponentiation in Algol is **.) That gives you a billion array
> rows of a million words each. A googol of them would only take a couple
> more lines. You would not be able to use them all due to physical memory
> limitations and the number of lifetimes it would take to go through them,
> but you could compile the program and access some random words from
> anywhere just to prove the point.
> Ironically, the most noticeable limit today is one you aren't likely to
> hear about. IOs are still limited to 2^16 words (6 * 2^16 bytes).
In other words, the largest unpaged memory segment that can be rolled
out to disk is 64K words in length, because the MCP requires virtual
memory paging to be done in one I/O. Permanently resident segments can
theoretically be 2^20-1 words in length, though.
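For concreteness, the I/O limit quoted above works out as follows (48-bit MCP words are 6 bytes each):

```python
WORDS_PER_IO = 2**16        # the stated per-I/O limit, in words
BYTES_PER_WORD = 6          # 48-bit words

max_io_bytes = WORDS_PER_IO * BYTES_PER_WORD
assert max_io_bytes == 6 * 2**16    # 393,216 bytes, i.e. 384 KB per I/O
```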
>> Most of what I know about the Burroughs and descendants' architecture
>> is from Blaauw and Brooks.
> I'm not familiar with their book. I recommend Elliot Organick's "Computer
> System Organization: The B-5700-B-6700 Series". Used copies are not
> numerous but are not hard to find either. It won't give all the details,
> but it's well written. Three decades out of date, but the basics haven't
> changed.
>> Their description is somewhat confusing
>> (not really their fault, since the hardware architecture is
>> phenomenally complicated), but as far as I can tell, each segment is
>> limited to 32K words.
> Totally incorrect. I can't even guess where this misconception came from.
> As mentioned above, the modern limit is about 2^28 words. In programming
> terms it was never less than 2^20 words. There were various limits that
> were smaller, but they did not affect programming.
> Remember that most MCP segmentation was always transparent to the
> programmer.
> Code segments? Maybe code segments are limited to 2^15 words, though
> actually 2^13 comes to mind. But it's totally transparent. When the
> compiler fills up a code segment, it generates a branch to the next code
> segment. End of problem.
The word offset in a code segment is limited to 2^13-1 in a PCW (Program
Control Word, tag=7) and in the intra-segment branch address format of
the branch instructions.
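The "fill a code segment, branch to the next one" scheme quoted above can be sketched like this. It is a toy model: the 2^13-word figure comes from the offset limit above, but the exact branch mechanism and the one-word reservation are illustrative assumptions.

```python
MAX_SEG_WORDS = 2**13        # offsets 0 .. 2**13-1 fit in a PCW address field

def split_code(code_words, seg_size=MAX_SEG_WORDS):
    """Pack a stream of code words into segments of at most seg_size
    words, reserving one word at the end of each full segment for the
    branch to the next segment (an assumption for this sketch)."""
    segments = []
    i = 0
    while i < len(code_words):
        chunk = list(code_words[i:i + seg_size - 1])
        i += len(chunk)
        if i < len(code_words):
            # segment is full: emit a branch to the next segment
            chunk.append(("BRANCH_TO_SEGMENT", len(segments) + 1))
        segments.append(chunk)
    return segments

segs = split_code(list(range(20000)))
```

Because the compiler inserts the inter-segment branches itself, the programmer never sees the limit, which is exactly the transparency Edward describes.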
The important point here, I think, is that both Edward and I have been
using MCP systems for a long, long time, and both of us are quite
familiar with them, but not only do we not worry about segment size
limitations in our daily work, gosh -- we're not even sure what they are
anymore. I had to dive into the architecture reference manual to address
the issues in this thread. There are limits, but they are not
constraints on practical use. That was not the case on the B5500, which
had much lower limits all around, but for the B6700 and later systems,
virtual addressability for either code or data has never been much of a
problem.
[snip very detailed descriptions]
> Those issues aside, I think the MCP architecture is still the most
> interesting, if least appreciated, one on the market today.
Thank you very much Paul for that wonderful description. I have been
looking for something like that for a long time. I also appreciate that
you talked about the disadvantages/flaws in the system as well as its
advantages as it is good to get a dispassionate post like that from an
obviously interested and dedicated "Burroughsian". :-)
One other issue which some might consider a flaw or disadvantage is that
it seems hard to see how you could do a JIT compiler for this system. Is
that possible?
- Stephen Fuld
(e-mail address disguised to prevent spam)