
PR1ME C compiler sources


Derek M. Jones

Sep 25, 2019, 10:46:07 AM
All,

The source code of the PR1ME C compiler has been found and
is now available:

https://github.com/arnoldrobbins/gt-swt

The source is in Ratfor.

--
Derek M. Jones
blog:shape-of-code.coding-guidelines.com

Aharon Robbins

Sep 25, 2019, 2:51:33 PM
Or rather, the source code of _a_ Prime C compiler. There were others. :-)

Arnold
(Yeah, that's my repo.)

In article <19-0...@comp.compilers>,
Derek M. Jones <derek@_NOSPAM_knosof.co.uk> wrote:
>All,
>
>The source code of the PR1ME C compiler has been found and
>is now available:
>
>https://github.com/arnoldrobbins/gt-swt
>
>The source is in Ratfor.
[As I recall the Prime machines addressed 16 bit words. What did you do
for character pointers? -John]

Dennis Boone

Sep 25, 2019, 9:19:06 PM
> Or rather, the source code of _a_ Prime C compiler. There were others. :-)

The only real candidate for "THE" Prime C compiler would be the
Garth Conboy / Pacer Software compiler. I don't know which others
might exist, but if there are two, I'd be surprised if there weren't
three. What did UNH use? Did the UK universities that built some
compilers do C? Etc.

De
[I'd still be interested in hearing what any of them did for character
pointers. -John]

Dennis Boone

Sep 25, 2019, 9:19:27 PM
> [As I recall the Prime machines addressed 16 bit words. What did you do
> for character pointers? -John]

There was an extended form of pointer that addressed characters.
48 bits, iirc, and may have been intended to do more than just
whole characters, though whether that was ever fully implemented
would be a good question.

De

Dennis Boone

Sep 25, 2019, 9:39:18 PM
> > [As I recall the Prime machines addressed 16 bit words. What did you do
> > for character pointers? -John]

See pdf pages 42 and 43 here:

https://sysovl.info/pages/blobs/prime/programmercompanion/Pocket%20Guide%20Assembly%20Language%20Rev%2018%20FDR3340%201980.pdf

for details on the pointer formats.

De
[The Prime machines suffered from having too many versions of
everything. Page 41 of that quick reference card has a 48 bit pointer
format with a bit number in the low 16 bits. It seems unlikely that a
C compiler would use that as a general pointer format since it doesn't
fit in any sort of normal int, and it's not what you'd want to point
at an int or a function or anything bigger than a char. -John]

Derek M. Jones

Sep 27, 2019, 1:06:53 AM
John,

> [The Prime machines suffered from having too many versions of
> everything. Page 41 of that quick reference card has a 48 bit pointer
> format with a bit number in the low 16 bits. It seems unlikely that a
> C compiler would use that as a general pointer format since it doesn't
> fit in any sort of normal int, and it's not what you'd want to point
> at an int or a function or anything bigger than a char. -John]

Some Cray machines and DSP chips have a similar problem with using
word addressing.

Several solve the problem by defining the word to be the smallest
addressable unit, making chars 48-bits in the case of some DSPs.

There seems to be a general dearth of PR1ME C compiler reference
manuals (which would contain the details).

--
Derek M. Jones
blog:shape-of-code.coding-guidelines.com
[C had issues on any word addressed machine. Some of the hacks people
tried on the PDP-10 were pretty ugly, too. -John]

Aharon Robbins

Sep 27, 2019, 1:07:17 AM
>[I'd still be interested in hearing what any of them did for character
>pointers. -John]

ISTR that the GT C compiler had char as a 16-bit quantity, I guess
for this reason. That was V-Mode code. Supposedly the newer
I-Mode instruction set supported addressing 8 bit units, but all
the stuff we did was in V-Mode.
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com

Dennis Boone

Sep 27, 2019, 12:09:51 PM
> There seems to be a general dearth of PR1ME C compiler reference
> manuals (which would contain the details).

For the Prime/Conboy/Pacer compiler:

https://sysovl.info/pages/blobs/prime/devel/C%20Users%20Guide%20T3.0-23.0%20DOC7534-4LA%201990.pdf

De

Derek M. Jones

Sep 28, 2019, 8:39:33 AM
Dennis,
Thanks. The characteristics of the C compiler make it one of the
most unusual I have seen.

A discussion that crops up every now and again within the C committee
is whether there are any processors that will trap if some
pattern of bits (usually a pointer value) is loaded into a register (but
not used to access storage).

I see that the PR1ME pointer value included fault and ring bits.
But the register set does not appear to contain special address
registers.

Do you know if casting an integer to a pointer could create
a value that trapped when the pointer was loaded into a register?
I assume it could trap if an attempt was made to treat the value as
an address.

George Neuner

Sep 28, 2019, 8:40:55 AM
On Thu, 26 Sep 2019 11:53:20 +0100, "Derek M. Jones"
<derek@_NOSPAM_knosof.co.uk> wrote:

>John,
>
>> [The Prime machines suffered from having too many versions of
>> everything. Page 41 of that quick reference card has a 48 bit pointer
>> format with a bit number in the low 16 bits. It seems unlikely that a
>> C compiler would use that as a general pointer format since it doesn't
>> fit in any sort of normal int, and it's not what you'd want to point
>> at an int or a function or anything bigger than a char. -John]
>
>Some Cray machines and DSP chips have a similar problem with using
>word addressing.
>
>Several solve the problem by defining the word to be the smallest
>addressable unit, making chars 48-bits in the case of some DSPs.

Just curious - what DSPs have 48-bit characters?

I have worked with Analog Devices chips that had 16/32/48 bit words in
internal memory and 32/48 bit words in external memory. Instructions
- and extended floats - were 48 bit, but all other data was either 16
or 32 bit. Due to addressing, an individual character could be a 16 or
32 bit value in internal memory, but had to be a 32-bit value in
external memory. Strings - if you used them - were packed to occupy
as few words as possible, and library string functions (mostly)
expected packed sequences rather than arrays of characters.
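
A minimal sketch of the packed-string idea described above, assuming 8-bit characters packed four to a 32-bit word with the first character in the high-order bits. The function names and the byte order within a word are illustrative assumptions, not the actual Analog Devices library layout.

```python
def pack_chars(text, chars_per_word=4):
    """Pack 8-bit characters into words, first character in the high bits.

    The last word is padded with zero bytes (assumed convention).
    """
    words = []
    for i in range(0, len(text), chars_per_word):
        chunk = text[i:i + chars_per_word].ljust(chars_per_word, b"\0")
        word = 0
        for ch in chunk:
            word = (word << 8) | ch
        words.append(word)
    return words


def unpack_chars(words, count, chars_per_word=4):
    """Recover `count` characters from a packed sequence of words."""
    out = bytearray()
    for word in words:
        for k in range(chars_per_word):
            shift = 8 * (chars_per_word - 1 - k)
            out.append((word >> shift) & 0xFF)
    return bytes(out[:count])
```

Packing five characters this way occupies two words instead of five, which is the space saving the packed library functions were after; the cost is the shift-and-mask work on every character access.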

George

Dennis Boone

Sep 28, 2019, 9:17:40 PM
> I see that the PR1ME pointer value included fault and ring bits.
> But the register set does not appear to contain special address
> registers.

There are base registers, and address registers for e.g. field operands
used in packed decimal arithmetic or character string edit instructions.
But for the most part, effective addresses would be computed from base
registers or pointers in memory.

> Do you know if casting an integer to a pointer could create
> a value that trapped when the pointer was loaded into a register?
> I assume it could trap if an attempt was made to treat the value as
> an address.

I don't think a trap would occur when the register was loaded. It would
happen when you tried to use the register as an address. This would be
unusual at best, though: see the above comment about base registers and
memory pointers. Off the top, I think you might have to abuse the
mapping of some of the user registers to locations 0-7 to get this to
happen. I-mode is more general-register-y, so you might be able to use
a general register as a pointer there.

One version of the architecture manual is here if you're interested.

https://sysovl.info/pages/blobs/prime/archhw/Sys%20Arch%20Ref%20Guide%20Rev%2019.2%20DOC3060-192P%201983.pdf

De

Derek M. Jones

Sep 28, 2019, 9:18:25 PM
George,

> Just curious - what DSPs have 48-bit characters?

Motorola DSP56000 Family Optimizing C Compiler uses 24 bits
TMS320C3x/C4x Optimizing C Compiler uses 32 bits

I remember reading a compiler manual and thinking, wow, that's
unusual.

It's not listed here: www.knosof.co.uk/cbook
so the manual reading happened after 2008.

Running pdfgrep over my collection of manuals, using the obvious
search patterns does not return anything.

An hour of google searching does not find anything.

To be continued...

David Brown

Sep 30, 2019, 6:06:36 PM
On 28/09/2019 20:19, Derek M. Jones wrote:
> George,
>
>> Just curious - what DSPs have 48-bit characters?
>
> Motorola DSP56000 Family Optimizing C Compiler uses 24 bits
> TMS320C3x/C4x Optimizing C Compiler uses 32 bits
>
> I remember reading a compiler manual and thinking, wow, that's
> unusual.

24-bit DSP's have been popular for audio applications. (There is also
the TPU, a specialised RISC processor used for timer applications in
engine control microcontrollers, that is 24-bit.)

Some processors have larger access sizes to simplify the hardware. The
first DEC Alpha, and some ARM designs, had no instructions for reading
or writing 8-bit or 16-bit data. In effect, these had 32-bit (maybe on
the Alpha it was 64-bit) "byte" sizes. But smaller access sizes could
be easily simulated in software.

I can't think of any application where 48-bit would be such a natural fit
that you'd have it as your basic access unit. Some video DSP's have
used 48-bit units, but that is for a vector of 3 16-bit colour units.

Christopher F Clark

Sep 30, 2019, 6:11:24 PM
I don't know where you saw the description of the register set. I
suspect it was only describing the "general purpose registers"
associated with IX-mode (which I knew as I*-mode). The 48 bit pointer
registers are not part of that set. And, what I was describing
previously was the way the C compiler worked in V-mode. Reading the
documentation on the C compiler for IX-mode, it is clear that they
added a whole new way of dealing with 32 bit pointers using the
general purpose registers.

From what I read, my guess is that in IX-mode they tried to create a
linear address space more compatible with C programs than the previous
segmented address space was. Of course, to cooperate with the OS and
its security model they had to respect and interface with the
segmented address space some.

So, what follows is what I remember of the V-mode segmented address
space (with some guesses as to how they probably tweaked it for
IX-mode to make it appear more linear). There were 4 pointer
registers in V-mode. PB -- a pointer to the instruction space. LB --
a pointer to "static" memory. SB -- a pointer to the "stack frame".
XB -- a pointer for general use. If I recall correctly, only the XB
was actually modifiable by normal code; done with the EAXB
instruction, calculate effective address (including doing
indirections) and store it in the XB register. The PB, LB, and SB
registers were only changed by the PCL (procedure call) instruction
(and its corresponding return). Each of these registers had the two
bits I mentioned previously (although, I forgot the ring bits which
separated them), a ring number 0, 1, 2, or 3 (the OS ran in ring 0 and
user code ran in ring 3, the DBMS used ring 1 or 2 if I recall
correctly, but the other ring was unused), a segment number, a
half-word (16 bit offset), and a bit offset (that was only used by the
hardware at the character (8 bit) level).

Segments were 64k half-words in length (i.e. the half-word offset was
a 16 bit number that wrapped around inside segments). The segment
number was a 12 bit number, i.e. there were 4k segments. So, at the
byte level, there was approximately 29 bits of byte addressable
storage per user process. The system was a (demand paged) virtual
memory system, so pages from other processes could be distinguished
and would not be accessible.
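
The "approximately 29 bits" figure can be checked with a few lines of arithmetic. This is only a sketch using the widths given above (12-bit segment number, 16-bit half-word offset, two 8-bit characters per half-word); the constant names are made up for illustration.

```python
SEGMENT_BITS = 12        # segment number width, as stated above
OFFSET_BITS = 16         # half-word offset within a segment, as stated above
BYTES_PER_HALFWORD = 2   # a 16-bit half-word holds two 8-bit characters

segments = 1 << SEGMENT_BITS                 # number of segments
halfwords_per_segment = 1 << OFFSET_BITS     # half-words in one segment
total_bytes = segments * halfwords_per_segment * BYTES_PER_HALFWORD

# Number of address bits needed to name every byte individually.
byte_address_bits = total_bytes.bit_length() - 1
```

That is 12 + 16 + 1 = 29 bits, or 512 MB of byte-addressable virtual space per process under these assumptions.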

Calls to the OS or DBMS were done through the standard PCL mechanism
which would change which ring you were running in (increasing your
priority), but every segment (as well as every
pointer) also had a ring number associated with it, and the values were
ORed, so that you got the lowest priority access. Thus, if you fudged
a pointer and you called into the OS, the OS would see your pointer
was in a lower priority space and use only the access rights that
space had to that address. Code could also lower the priority of a
pointer itself, by setting the ring bits, and I believe if you stored
a pointer, the hardware stored the ring bits in the saved pointer to
be the weak access it was using. So, even if your pointer got copied
into a ring 0 memory location, it would remain a ring 3 pointer if it
originally came from user space.

The hardware supported at least 3 faults related to pointers: access
violation, where the pointer was accessing a segment in a way it didn't
have rights to (with roughly the same 3 mode bits, read, write, and
execute, for each ring); pointer fault, where the fault bit in the
pointer was set; and page fault, where the pointer pointed to a page
that wasn't currently mapped in. I believe there was also a segment
fault for segments that did not exist.

To make this be more like a flat address space, one could treat the
segment number and offset within a segment as a contiguous 28 bit
integer that pointed to a 16 bit half-word. The byte offset within
that half-word was in the wrong spot (essentially the sign bit of a 29
bit number), but a rotate of the 29 bits would fix that. The only
issue was the fault bit and ring bits were kind of in the way from a
pure 32 bit rotate. Thus, it was not pretty code, but it was doable.
I'm sure that is what the comment about V-mode C code being slower
than IX-mode refers to, as V-mode probably had that done as part of
the generated code.
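
A rough sketch of that rotate, under assumptions taken only from the description above and not from Prime documentation: a 32-bit pointer with the fault bit at the top, two ring bits below it, the byte-select bit next (the "sign bit" of the 29-bit field), and a 28-bit segment-plus-offset half-word index in the low bits. The names `to_linear` and `from_linear` are hypothetical.

```python
# Assumed bit positions (illustrative only).
FAULT_SHIFT = 31                    # fault bit
RING_SHIFT = 29                     # two ring bits
BYTE_SHIFT = 28                     # byte-select bit within the half-word
HALFWORD_MASK = (1 << 28) - 1       # 12-bit segment + 16-bit offset


def to_linear(ptr):
    """Rotate the 29-bit field left by one, giving a flat byte address.

    The fault and ring bits are dropped, which is the awkward masking
    step mentioned above.
    """
    halfword_index = ptr & HALFWORD_MASK
    byte_select = (ptr >> BYTE_SHIFT) & 1
    return (halfword_index << 1) | byte_select


def from_linear(addr, ring=3):
    """Inverse rotate: rebuild a pointer from a flat byte address."""
    halfword_index = (addr >> 1) & HALFWORD_MASK
    byte_select = addr & 1
    return (ring << RING_SHIFT) | (byte_select << BYTE_SHIFT) | halfword_index
```

With this layout, incrementing the flat address by one alternates the byte-select bit and carries into the half-word offset, and from there into the segment number, which is what makes the space look linear to C pointer arithmetic.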

The other thing to note is that all instruction addresses (at least in
V-mode) were relative to one of the 4 base registers. Thus, you
needed to put one of the base registers (XB presumably) as the start
of the memory you want to be linearly addressable (and make sure that
none of the segments above that base address are in use by higher
privilege code and inaccessible). Other than that, the paging
mechanism should add pages for you as needed to match your address
space and even gaps in the middle could be accommodated so that you
could have a stack and heap that grew toward each other.

So, my guess is that IX mode did roughly that, putting the XB at the
start of the linear address space for C programs and making the
instructions which used the GPR registers as pointers, do the
appropriate bit twiddling in hardware but basing the resulting address
off the XB. Alternately, the instructions using the GPR registers as
pointers could have used "absolute addressing" with no base register,
letting the pointers deal with the segments (and their ring
restrictions) as required. The rings and segments would have still
been there but the code would have had the 29 bits to play with and
probably treated all accesses as if it were from ring 3.

--
******************************************************************************
Chris Clark email: christoph...@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------

Kaz Kylheku

Sep 30, 2019, 8:29:11 PM
On 2019-09-29, David Brown <david...@hesbynett.no> wrote:
> On 28/09/2019 20:19, Derek M. Jones wrote:
>> George,
>>
>>> Just curious - what DSPs have 48-bit characters?
>>
>> Motorola DSP56000 Family Optimizing C Compiler uses 24 bits
>> TMS320C3x/C4x Optimizing C Compiler uses 32 bits
>>
>> I remember reading a compiler manual and thinking, wow, that's
>> unusual.
>
> 24-bit DSP's have been popular for audio applications. (There is also
> the TPU, a specialised RISC processor used for timer applications in
> engine control microcontrollers, that is 24-bit.)

Zilog produced a somewhat odd upgrade to the Z80 called eZ80.

(Not related to the incompatible Z8000 from 1979).

No idea in what exact year this was introduced, but it's actually seeing
some commercial success, unlike other Z80 upgrade attempts. For example,
it's used in TI-84 graphing calculators.

https://en.wikipedia.org/wiki/Zilog_eZ80

This operates in a Z80 compatible mode or a mode called "ADL" in which
registers are 24 bits wide, and there is a 16M flat address space.

Someone made a board for this that runs a Lisp dialect called Maker Lisp
(see <https://makerlisp.com/>) and, more recently, CP/M.

Dennis Boone

Oct 1, 2019, 10:38:45 AM
> I don't know where you saw the description of the register set. I
> suspect it was only describing the "general purpose registers"
> associated with IX-mode (which I knew as I*-mode). The 48 bit pointer
> registers are not part of that set. And, what I was describing
> previously was the way the C compiler worked in V-mode. Reading the
> documentation on the C compiler for IX-mode, it is clear that they
> added a whole new way of dealing with 32 bit pointers using the
> general purpose registers.

Ignoring floating point stuff, the registers are all 16- or 32-bit. The
48 bit pointers are strictly memory-based.

I mode is a general register mode. It doesn't do much of anything to
hide segmentation. It does include register-relative addressing, that
is, putting pointers into general registers.

IX mode is a small extension to I mode, which adds some additional
manipulation of pointers in registers, and some support for C character
manipulation. Again, doesn't do much of anything to hide segmentation.

> So, what follows is what I remember of the V-mode segmented address
> space (with some guesses as to how they probably tweaked it for
> IX-mode to make it appear more linear). There were 4 pointer
> registers in V-mode. PB -- a pointer to the instruction space. LB --
> a pointer to "static" memory. SB -- a pointer to the "stack frame".
> XB -- a pointer for general use. If I recall correctly, only the XB
> was actually modifiable by normal code; done with the EAXB
> instruction, calculate effective address (including doing
> indirections) and store it in the XB register. The PB, LB, and SB
> registers were only changed by the PCL (procedure call) instruction
> (and its corresponding return). Each of these registers had the two
> bits I mentioned previously (although, I forgot the ring bits which
> separated them), a ring number 0, 1, 2, or 3 (the OS ran in ring 0 and
> user code ran in ring 3, the DBMS used ring 1 or 2 if I recall
> correctly, but the other ring was unused), a segment number, a
> half-word (16 bit offset), and a bit offset (that was only used by the
> hardware at the character (8 bit) level).

I suppose the base registers are "pointer registers" in the strict
sense, but 3/4 of them have fixed purposes. You can directly alter the
contents of LB and XB via the EALB and EAXB instructions. The obvious
way to alter PB is to use a PCL instruction. The only one left is SB,
which you can modify by using the RSAV and RRST instructions to save
registers and restore them.

> Calls to the OS or DBMS were done through the standard PCL mechanism
> which would change which ring you were running in (increasing your
> priority), but every segment (as well as every
> pointer) also had a ring number associated with it, and the values were
> ORed, so that you got the lowest priority access. Thus, if you fudged
> a pointer and you called into the OS, the OS would see your pointer
> was in a lower priority space and use only the access rights that
> space had to that address. Code could also lower the priority of a
> pointer itself, by setting the ring bits, and I believe if you stored
> a pointer, the hardware stored the ring bits in the saved pointer to
> be the weak access it was using. So, even if your pointer got copied
> into a ring 0 memory location, it would remain a ring 3 pointer if it
> originally came from user space.

Entrance to the OS is through the PCL instruction and the gate
mechanism. The microcode and/or the OS perform ring selection and
weakening as needed to ensure security. Storing a pointer does not
cause any change in it.

> The hardware supported at least 3 faults related to pointers: access
> violation, where the pointer was accessing a segment in a way it didn't
> have rights to (with roughly the same 3 mode bits, read, write, and
> execute, for each ring); pointer fault, where the fault bit in the
> pointer was set; and page fault, where the pointer pointed to a page
> that wasn't currently mapped in. I believe there was also a segment
> fault for segments that did not exist.

A pointer fault can occur for several reasons: the fault bit being set,
pointing into an invalid location, etc. Page faults are part of the
virtual memory mechanism, and are not reflected to the user via a
condition the way a segmentation or pointer fault (or others) is.

> So, my guess is that IX mode did roughly that, putting the XB at the
> start of the linear address space for C programs and making the
> instructions which used the GPR registers as pointers, do the
> appropriate bit twiddling in hardware but basing the resulting address
> off the XB. Alternately, the instructions using the GPR registers as
> pointers could have used "absolute addressing" with no base register,
> letting the pointers deal with the segments (and their ring
> restrictions) as required. The rings and segments would have still
> been there but the code would have had the 29 bits to play with and
> probably treated all accesses as if it were from ring 3.

I haven't spent as much time with I mode as with V, but the usual
idiom is to move XB around as needed.

De

George Neuner

Oct 4, 2019, 11:27:36 AM
On Sun, 29 Sep 2019 10:53:35 +0200, David Brown
<david...@hesbynett.no> wrote:

>I can't think of any application where 48-bit would be such a natural fit
>that you'd have it as your basic access unit. Some video DSP's have
>used 48-bit units, but that is for a vector of 3 16-bit colour units.

Analog Devices SHARC series floating point DSPs had 48-bit
instructions and 40-bit extended precision floats aligned at 48-bit
addresses (probably to use the same address generator as for code).

Nominally, though, it was a 16/32 bit device: integer data, including
chars could be 16 or 32 bits, and ordinary (single precision) floats
were 32 bits.

Admittedly, I never encountered any use for the extended floats, but I
assumed they were there for a reason.

YMMV,
George

ga...@u.washington.edu

Feb 27, 2020, 5:37:20 PM
On Monday, September 30, 2019 at 3:06:36 PM UTC-7, David Brown wrote:

(snip)

> Some processors have larger access sizes to simplify the hardware. The
> first DEC Alpha, and some ARM designs, had no instructions for reading
> or writing 8-bit or 16-bit data. In effect, these had 32-bit (maybe on
> the Alpha it was 64-bit) "byte" sizes. But smaller access sizes could
> be easily simulated in software.

Alpha isn't quite that bad.

The load/store instructions work on 32 or 64 bit units, but they ignore
the low bits when doing it.

So, you take a byte address, and use a load instruction to load its
word into a register. (I forget now the names of the memory units.)

Then there are instructions for operating on bytes
in a register which ignore the high bits. So, you can load a byte
from memory into a register with two instructions. To store a byte,
I believe you load the word, replace the byte, and write it back,
so three instructions.
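
Roughly, the sequence looks like the sketch below: a toy memory that can only move whole 64-bit words, with byte access built on top. The class and function names are invented for illustration, the little-endian byte numbering is an assumption, and the step counts follow the description above rather than an actual Alpha instruction listing.

```python
WORD_BYTES = 8                       # 64-bit memory unit
WORD_MASK = (1 << 64) - 1


class WordMemory:
    """Memory that only loads and stores whole words."""

    def __init__(self, n_words):
        self.words = [0] * n_words

    def load_word(self, byte_addr):
        # The low address bits are ignored, as described above.
        return self.words[byte_addr // WORD_BYTES]

    def store_word(self, byte_addr, value):
        self.words[byte_addr // WORD_BYTES] = value & WORD_MASK


def load_byte(mem, byte_addr):
    word = mem.load_word(byte_addr)                 # step 1: load the word
    shift = 8 * (byte_addr % WORD_BYTES)
    return (word >> shift) & 0xFF                   # step 2: extract the byte


def store_byte(mem, byte_addr, value):
    word = mem.load_word(byte_addr)                 # step 1: load the word
    shift = 8 * (byte_addr % WORD_BYTES)
    word = (word & ~(0xFF << shift)) | ((value & 0xFF) << shift)  # step 2: replace
    mem.store_word(byte_addr, word)                 # step 3: write it back
```

The read-modify-write in `store_byte` is also why byte stores on such a machine are not atomic with respect to neighbouring bytes in the same word.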

Note that those operations are what CISC processors do without
you thinking about them on many machines, as memory is often
much wider than 8 bits.

Machines not so well designed require masking off the appropriate
bits before operating with them, though many machines ignore high
bits on shift operations. (The 8086 allows shifts up to 255 bits.)

rob...@dodo.com.au

Feb 27, 2020, 10:03:27 PM
On 2020-02-28 09:23, ga...@u.washington.edu wrote:

> Machines not so well designed require masking off the appropriate
> bits before operating with them, though many machines ignore high
> bits on shift operations. (The 8086 allows shifts up to 255 bits.)

Who can say that the CDC machines (7600; 70 series, etc) were not
well designed?

They were intended to be fast, and to carry out operations on
words (of 60 bits).

To be sure, it was necessary to mask bits (usually characters),
but there was a simple instruction(s) to generate a mask of
n bits (better than loading a word containing bits to be used
as a mask).
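
As an illustration of what such a mask instruction buys, here is a small sketch that builds an n-bit mask and uses it to pull one character out of a 60-bit word. It assumes 6-bit characters packed ten to a word and a mask that is left-justified in the word; both are assumptions for the example, and `make_mask` and `extract_char` are made-up names.

```python
WORD_BITS = 60
CHAR_BITS = 6                         # assumed character width
WORD_MASK = (1 << WORD_BITS) - 1


def make_mask(n):
    """Return a word with n one bits, left-justified (assumed convention)."""
    if not 0 <= n <= WORD_BITS:
        raise ValueError("mask width out of range")
    return ((1 << n) - 1) << (WORD_BITS - n)


def extract_char(word, index):
    """Return character `index` of the word, counting from the left at 0."""
    # Shift the wanted character to the top of the word, then mask it off.
    shifted = (word << (CHAR_BITS * index)) & WORD_MASK
    return (shifted & make_mask(CHAR_BITS)) >> (WORD_BITS - CHAR_BITS)
```

Generating the mask from a count avoids a memory reference for a constant, which is the advantage mentioned above.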

On the other hand, the IBM S/360 was designed from the beginning to
handle bytes of 8 bits, half-words of 16 bits, and words of 32 bits.

Instructions could load and store a byte into/from the low-order
bits of a register, without affecting the other bits. Later models
could load/store one or more bytes into/from a register without
affecting the other bits.

For more general work, masking operations were available in the
32-bit registers.

ga...@u.washington.edu

Feb 28, 2020, 12:27:46 PM
On Thursday, February 27, 2020 at 7:03:27 PM UTC-8, rob...@dodo.com.au wrote:
> On 2020-02-28 09:23, ga...@u.washington.edu wrote:
>
> > Machines not so well designed require masking off the appropriate
> > bits before operating with them.

(snip)

> Who can say that the CDC machines (7600; 70 series, etc) were not
> well designed?

> They were intended to be fast, and to carry out operations on
> words (of 60 bits).

CDC machines are designed for fast floating point number crunching.

They are not necessarily designed for fast character manipulation,
as that is supposed to be a relatively small part of the work.

The hardware/software tradeoffs were different so many years ago.

My favorite one has always been how the IBM 704 (and I believe
later 36 bit machines) read in cards. They read row-wise, each row
into two 36 bit words, leaving off 8 columns. This is also the reason
why Fortran (fixed form) uses columns 1-72.

Anyway, after the compiler reads in a card row-wise, it has to
convert to columnwise (six characters per word), including converting
to the appropriate character code. But it presumably saves a lot of
logic in the card reader, where it would be expensive and could be
done in software. The 7094 was the high-end number cruncher at
the time, including its use for S/360 emulation during its development.
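
The row-to-column step amounts to a bit-matrix transpose. A minimal sketch, assuming each of the 12 card rows arrives as a pair of 36-bit words with column 1 in the top bit of the left word; the bit ordering, the row numbering, and the function name are illustrative assumptions, and the final translation to a character code is left out.

```python
ROWS = 12          # punch rows on a card
COLS = 72          # columns actually read: two 36-bit words per row
WORD_BITS = 36


def rows_to_columns(row_words):
    """Transpose a row-wise card image into per-column punch patterns.

    row_words: 12 (left, right) pairs of 36-bit integers, one per row.
    Returns 72 integers; bit r of entry c is set if row r is punched
    in column c.
    """
    columns = [0] * COLS
    for r, (left, right) in enumerate(row_words):
        row_bits = (left << WORD_BITS) | right      # 72 bits for this row
        for c in range(COLS):
            if (row_bits >> (COLS - 1 - c)) & 1:
                columns[c] |= 1 << r
    return columns
```

Each resulting 12-bit column pattern would then be looked up to get a 6-bit character, six of which fit in one 36-bit word.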

But actually, as well as I know, the more usual way to run such
machines was to copy cards to tape, presumably in a cheaper machine,
so that the fast machine didn't waste so much time.

I don't know about the 60 bit machines, but there are stories
about C compilers for Cray machines using 64 bit char.

As with the CDC machines, Cray machines are designed for fast floating
point, and not so fast for fixed point.
[This is getting rather far from compilers but would be totally on-topic
in alt.folklore.computers. -John]