Inconsistent bit addressing in the 68020: big- AND little-endian

Ken Turkowski

Aug 22, 1984, 6:59:21 PM
A brief look at the 68020 instruction documentation shows that it has
two different types of bit addressing for different instructions.

The bit test, set, clear, etc. instructions address bits within a word
in a little-endian manner, i.e. bit 0 is in the least significant
position, bit 31 in the most significant.

The bit field instructions, on the other hand, address bits (across and
within words) in a big-endian manner, i.e. bit 0 is in the most
significant position, bit 31 in the least.
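
A minimal sketch of the two numbering conventions, in Python, assuming a 32-bit operand as in the description above. The function names are hypothetical and are not 68020 mnemonics:

```python
WORD_BITS = 32  # assumed operand width, matching "bit 31" above


def bit_lsb0(word, n):
    """Read bit n when bit 0 is the least significant bit (bit test/set/clear style)."""
    return (word >> n) & 1


def bit_msb0(word, n):
    """Read bit n when bit 0 is the most significant bit (bit field style)."""
    return (word >> (WORD_BITS - 1 - n)) & 1


word = 0x80000000  # only the most significant bit is set
# The same physical bit is number 31 in one convention and number 0 in the other.
print(bit_lsb0(word, 31), bit_msb0(word, 0))
```

In general, bit n in one convention is bit (31 - n) in the other, which is the whole of the inconsistency being described.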

This proves that it is impossible to make a consistent big-endian
machine! :-)
--
Ken Turkowski @ CADLINC, Palo Alto, CA
UUCP: {amd,decwrl,dual,flairvax,nsc}!turtlevax!ken
ARPA: turtlevax!k...@DECWRL.ARPA

John Gilmore

Aug 23, 1984, 5:12:10 AM
The 68000 addressed bits the wrong way (low order bit == bit 0, but
high order byte == byte 0). This was reportedly fought over internally
at Motorola and the people who said "Let's make BTST #0 test the data
from pin D0" won out -- they didn't want to confuse hardware types.

Well, all us software types beat on them mercilessly, so they fixed it
for the 68020. However, they had to keep the old instructions for
backward compatibility. So they defined eight new instructions, which
all operate in a consistent way on bit FIELDS (not just bits). You
provide a byte address, a bit number, and a bit width. The bit number
is a signed 32-bit number; the width is 1 thru 32. They can be
immediate fields or in data registers. The instructions will figure
out which byte or bytes the relevant bits are in, get them, and process
them. The 8 things you can do with them are:

    Clear them    \
    Set them       |  Same as old 68000 instructions
    Invert them    |
    Test them     /
    Zero-extend and put them in a register
    Sign-extend and put them in a register
    Take low-order bits from a register and put them in the bitfield
    Find the leftmost 1-bit in the field, put its bit# in a register

This is especially neat for bitmapped graphics, since you can address
the screen with bit numbers on random bit boundaries and the CPU will take
care of finding and lining up the data.
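
The addressing scheme described above (byte address, signed bit number counted from the most significant bit, width of 1 to 32) can be sketched in Python as follows. This is an illustration under stated assumptions, not the processor's definition: memory is modelled as a `bytes` object, all function names are hypothetical, and the "no 1-bit found" result of offset plus width is an assumption about the find operation.

```python
def _field_bits(memory, base, offset, width):
    """Return the field's bits, leftmost first, as a list of 0/1 values."""
    if not 1 <= width <= 32:
        raise ValueError("width must be 1..32")
    # Absolute bit position, counting from the most significant bit of memory[0].
    start = base * 8 + offset
    if start < 0 or start + width > len(memory) * 8:
        raise IndexError("bit field falls outside the memory image")
    return [(memory[pos // 8] >> (7 - pos % 8)) & 1
            for pos in range(start, start + width)]


def extract_unsigned(memory, base, offset, width):
    """Zero-extend the field into an integer."""
    value = 0
    for bit in _field_bits(memory, base, offset, width):
        value = (value << 1) | bit
    return value


def extract_signed(memory, base, offset, width):
    """Sign-extend the field: the leftmost bit of the field is the sign."""
    value = extract_unsigned(memory, base, offset, width)
    return value - (1 << width) if value >> (width - 1) else value


def find_first_one(memory, base, offset, width):
    """Bit offset of the leftmost 1 in the field (offset + width if none, by assumption)."""
    for i, bit in enumerate(_field_bits(memory, base, offset, width)):
        if bit:
            return offset + i
    return offset + width


memory = bytes([0b00000001, 0b10100000])
# A 4-bit field starting one bit before byte 1 spans both bytes: bits 1, 1, 0, 1.
print(extract_unsigned(memory, 1, -1, 4), extract_signed(memory, 1, -1, 4))
```

Because the caller supplies only a base byte and a bit offset, the same call works whether the field sits inside one byte or straddles several, which is what makes this convenient for bitmapped displays.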

Ian Kaplan

Aug 24, 1984, 1:55:08 PM

In Ken Turkowski's article on the big and little endian addressing used
on the 68020 he commented that Motorola's approach "proved" that a
machine with consistent bit addressing (e.g., all little or big endian)
was impossible. I assume that he was joking. If not, perhaps a follow-up
article could be submitted clarifying this. It is becoming widely
recognized that consistent (or orthogonal) instruction sets are a
desirable architectural feature. (The NS32016 instruction set is
one example of an orthogonal instruction set.) If consistent
instruction sets are desirable, then it seems obvious that consistent
bit addressing is also desirable.


Ian Kaplan
Loral Data Flow Group
Loral Instrumentation
ucbvax!sdccsu3!loral!ian

je...@gatech.uucp

Aug 27, 1984, 5:07:29 AM
The next thing you know they will number their bits from 1 to 32 like Pr1me.
(8-{) (my that's complicated)

Jeff Lee
CSNet: Jeff @ GATech ARPA: Jeff.GATech @ CSNet-Relay
uucp: ...!{akgua,allegra,rlgvax,sb1,unmvax,ulysses,ut-sally}!gatech!jeff

r...@rti-sel.uucp

Aug 27, 1984, 9:22:26 AM

If you are looking for a consistent architecture, you
should take a look at the VAX architecture. It has the closest
to 100% orthogonal instruction set that I have ever seen. As far
as bit addressing is concerned, it is entirely little endian.
The least significant bit in any size value is ALWAYS bit number
0. And this is with all the bit field instructions of the machine
included. The best reference for this is the VAX Architecture Handbook.

Randy Buckland
Research Triangle Institute
...mcnc!rti-sel!rcb

Steve Glaser

Aug 28, 1984, 11:27:51 AM
Actually, the 68020 does number the bits both ways.

Quoting from the 68020 User's Manual page 2-4:

A bit datum is specified by a base address that selects one
byte in memory and a bit number that selects the one bit in
this byte. The most significant bit of the byte is number
seven.

A bit field datum is specified by a base address that selects
one byte in memory, a bit field offset that indicates the
leftmost (base) bit of the bit field in relation to the most
significant bit of the base byte and a bit field width that
determines how many bits to the right of the base bit are in
the bit field. The most significant bit of the base byte is
bit offset 0, the least significant bit of the base byte is
offset 7, and the least significant bit of the previous byte
in memory is offset -1. Bit field offsets may have values in
the range of -2^31 to 2^31-1 and bit field widths may range
between 1 and 32.

Given the high-endian nature of the 68k, it makes sense to do bit
fields this way. It also makes sense to number the bits in a word
using the natural mathematical rules (bit n is the 2^n bit in an
unsigned int). The 68000 doesn't have the bit field data type
so it doesn't have this inconsistency. Oh well, it could be worse.

Steve Glaser
tektronix!steveg

Ken Turkowski

Aug 28, 1984, 1:34:33 PM
=== REFERENCED ARTICLE ===================================
From: i...@loral.UUCP (Ian Kaplan)
Subject: Re: Inconsistent bit addressing in the 68020: big- AND little-endian

In Ken Turkowski's article on the big and little endian
addressing used on the 68020 he commented that Motorola's
approach "proved" that a machine with consistent bit addressing
(e.g., all little or big endian) was impossible. I assume that
he was joking. If not, perhaps a follow-up article could be
submitted clarifying this. It is becoming widely recognized
that consistent (or orthogonal) instruction sets are a
desirable architectural feature. (The NS32016 instruction set
is one example of an orthogonal instruction set.) If
consistent instruction sets are desirable, then it seems
obvious that consistent bit addressing is also desirable.

=== ARTICLE REFERENCED by i...@loral.UUCP =================
From: k...@turtlevax.UUCP (Ken Turkowski)
Subject: Inconsistent bit addressing in the 68020: big- AND little-endian

...


This proves that it is impossible to make a consistent
big-endian machine! :-)

=================================================================

Note the tongue-in-cheek symbol, Ian. Obviously it is possible to
make a consistent big- or little-endian machine. In fact, if you
eliminate the old bit set/clear/test instructions, it becomes a
consistent big-endian machine (See sun!gnu John Gilmore's article
16...@sun.uucp). You just have to get used to the convention that bit 0
is the most significant bit.

Hugh Redelmeier

Aug 31, 1984, 10:52:06 AM
Randy Buckland says the VAX is entirely little-endian. If you look at
the floating point format, it seems to be middle-endian or something:
the exponent is in the middle of the fraction, splitting the fraction
into two parts. Furthermore, the fraction components contained in succeeding
16-bit words decrease in significance (step-function-endian??).

Perhaps the reason is that in changing the precision of integers, the low order
bits are important (in narrowing, you pray the high order bits are the same;
in widening you make them so), whereas in floating point, the high order
fraction bits are the important ones. Thus, for various tricks, you want
to address an integer by the location of its low bits, and a float by
the location of its exponent & high order bits.

At first glance, the VAX f.p. format looks like a botch; at second glance,
it makes some sense. Any other thoughts?
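
To make the layout concrete, here is a hedged Python sketch that decodes a 32-bit VAX F_floating value from its bytes in memory order. It assumes the usual description of the format (sign in bit 15 of the first word, an excess-128 exponent in bits 14:7, a 23-bit fraction with a hidden leading bit split across the two words); the function name is hypothetical.

```python
import struct


def vax_f_to_float(raw):
    """Decode 4 bytes, in VAX memory order, as an F_floating value."""
    # Two 16-bit words, each stored little-endian; the first holds the "wide end".
    w0, w1 = struct.unpack("<HH", raw)
    sign = w0 >> 15
    exponent = (w0 >> 7) & 0xFF          # excess-128
    if exponent == 0:
        if sign:
            raise ValueError("reserved operand")
        return 0.0
    fraction = ((w0 & 0x7F) << 16) | w1  # high 7 bits in word 0, low 16 in word 1
    mantissa = (1 << 23) | fraction      # restore the hidden leading bit
    value = mantissa / (1 << 24) * 2.0 ** (exponent - 128)
    return -value if sign else value


one = bytes([0x80, 0x40, 0x00, 0x00])
print(vax_f_to_float(one))
# Read as a single little-endian longword, the exponent lands in the middle bits.
print(hex(struct.unpack("<I", one)[0]))
```

Viewed as a longword, the low 16 fraction bits occupy bits 31:16 while the sign and exponent occupy bits 15:7, which is the "exponent in the middle of the fraction" effect described above.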

Dave Martindale

Sep 1, 1984, 2:34:46 PM
As someone else pointed out, the VAX floating point format is big-endian,
but it makes some sense. Look at it this way: The bits adjacent
to the decimal point in an integer are at the addressed [byte] location,
and higher-order bits in the word/longword/quadword are at successively
higher addresses. The bits adjacent to the decimal point in all floating
point formats (plus the exponent, of course) are also at the addressed
[word] location, with lower-order bits at higher memory addresses.
Thus you can use the "feature" of passing the address of a double where
the address of its float portion is needed without having to add an offset.

The bit field instructions are entirely little-endian, contrary to someone
else's comment. Bit fields cross byte, word, and longword boundaries in
a consistent manner - the higher-order bits are always at the higher address.

Bill Shannon

Sep 3, 1984, 5:59:36 PM
dmmart...@watcgl.UUCP says:
Thus you can use the "feature" of passing the address of a double where
the address of its float portion is needed without having to add an offset.

It should be made clear that this is EXTREMELY non-portable. It depends
on something which is not true on e.g. the Sun using IEEE floating point
format:

The bit representation of a float is the same as the bit
representation of a double except that it has fewer bits
in the mantissa, and that the address of a double will
point to a float with the "same" value.

On many machines, such as the Sun, this is NOT TRUE. We have been
bitten quite a few times by code that passes the address of a float
to a routine expecting a pointer to a double.
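
A small Python sketch of why the trick fails with IEEE formats, assuming big-endian storage as on a 68000-family machine. The leading four bytes of a double do not form a float with the same value, because the exponent fields have different widths:

```python
import struct

double_bytes = struct.pack(">d", 1.0)  # IEEE double, big-endian
float_bytes = struct.pack(">f", 1.0)   # IEEE single, big-endian

print(double_bytes.hex())  # 11-bit exponent field
print(float_bytes.hex())   # 8-bit exponent field

# Reinterpret the first 4 bytes of the double as a single-precision float.
prefix_as_float = struct.unpack(">f", double_bytes[:4])[0]
print(prefix_as_float)     # not 1.0
```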

Bill Shannon
Sun Microsystems, Inc.

John Bruner

Sep 4, 1984, 11:23:48 AM
The weird byte ordering for VAX floating point is due to compatibility
with the floating-point format on the PDP-11. The little-endian order
for bytes within a 16-bit word and big-endian order of 16-bit words
is also visible on the PDP-11 in its representation of 32-bit long
integers (both by the floating-point processor and the EIS multiply
and divide instructions).
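
The mixed order described above can be sketched in Python. The function name is hypothetical; the layout is simply "high-order 16-bit word first, each word stored low byte first":

```python
import struct


def pdp11_long_bytes(value):
    """Bytes of a 32-bit integer with big-endian word order and little-endian bytes."""
    high = (value >> 16) & 0xFFFF
    low = value & 0xFFFF
    return struct.pack("<HH", high, low)


print(pdp11_long_bytes(0x0A0B0C0D).hex())
print(struct.pack("<I", 0x0A0B0C0D).hex())  # fully little-endian, for comparison
print(struct.pack(">I", 0x0A0B0C0D).hex())  # fully big-endian, for comparison
```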
--
John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
MILNET: j...@mordor.ARPA [jdb@s1-c] (415) 422-0758
UUCP: ...!ucbvax!dual!mordor!jdb ...!decvax!decwrl!mordor!jdb

Henry A. Strickland

Sep 4, 1984, 8:03:58 PM
I think it's great that there's so little wrong with 68000 & 68020 that
we're reduced to discussing big & little indians. strick! (-:
--
the clouds project henry strickland
school of ics / ga tech
atlanta ga 30332 { akgua allegra hplabs ihnp4 }!gatech!strick

John Gilmore

Sep 6, 1984, 4:57:56 PM
I'm really surprised that none of the Arpanauts in these newsgroups have
posted a copy of IEN 137 by Danny Cohen of USC-ISI. It's an Internet
Experiment Note entitled "On Holy Wars and a Plea for Peace" and concerns
bit ordering issues throughout computerdom. It's the best explanation
of the problem (including the evolution of PDP-11 and Vax byte ordering
botches) I've seen anywhere.

Does someone have a machine readable copy somewhere?
If so, MAIL IT TO ME and I will post exactly one copy (!).

John Gilmore, sun!g...@Berkeley.arpa

John Gilmore

Sep 8, 1984, 7:17:36 PM
IEN 137 Danny Cohen
U S C/I S I
1 April 1980


ON HOLY WARS AND A PLEA FOR PEACE

INTRODUCTION


This is an attempt to stop a war. I hope it is not too late and that
somehow, magically perhaps, peace will prevail again.

The latecomers into the arena believe that the issue is: "What is the
proper byte order in messages?".

The root of the conflict lies much deeper than that. It is the question
of which bit should travel first, the bit from the little end of the
word, or the bit from the big end of the word? The followers of the
former approach are called the Little-Endians, and the followers of the
latter are called the Big-Endians. The details of the holy war between
the Little-Endians and the Big-Endians are documented in [6] and
described, in brief, in the Appendix. I recommend that you read it at
this point.

The above question arises from the serialization process which is
performed on messages in order to send them through communication media.
If the communication unit is a message - these problems have no meaning.
If the units are computer "words" then one may ask in which order these
words are sent, what is their size, but not in which order the elements
of these words are sent, since they are sent virtually "at-once". If
the unit of transmission is an 8-bit byte, similar questions about bytes
are meaningful, but not the order of the elementary particles which
constitute these bytes.

If the units of communication are bits, the "atoms" ("quarks"?) of
computation, then the only meaningful question is the order in which
bits are sent.

Obviously, this is actually the case for serial transmission. Most
modern communication is based on a single stream of information
("bit-stream"). Hence, bits, rather than bytes or words, are the units
of information which are actually transmitted over the communication
channels such as wires and satellite connections.

Even though a great deal of effort, in both hardware and software, is
dedicated to giving the appearance of byte or word communication, the
basic fact remains: bits are communicated.

Computer memory may be viewed as a linear sequence of bits, divided into
bytes, words, pages and so on. Each unit is a subunit of the next
level. This is, obviously, a hierarchical organization.

If the order is consistent, then such a sequence may be communicated
successfully while both parties maintain their freedom to treat the bits
as a set of groups of any arbitrary size. One party may treat a message
as a "page", another as so many "words", or so many "bytes" or so many
bits. If a consistent bit order is used, the "chunk-size" is of no
consequence.

If an inconsistent bit order is used, the chunk size must be understood
and agreed upon by all parties. We will demonstrate some popular but
inconsistent orders later.

In a consistent order, the bit-order, the byte-order, the word-order,
the page-order, and all the other higher level orders are all the same.
Hence, when considering a serial bit-stream, along a communication line
for example, the "chunk" size which the originator of that stream has in
mind is not important.
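
The chunk-size independence claimed above can be checked with a short Python sketch. The helper names are hypothetical; the sketch serializes one 32-bit word to a bit stream, first treating it as a single word and then as four bytes, under each of the two consistent orders:

```python
def bits_msb_first(value, width):
    """Bits of value, most significant first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def bits_lsb_first(value, width):
    """Bits of value, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


word = 0x12345678

# Consistent Big-Endian order: wide end first at every level.
be_by_word = bits_msb_first(word, 32)
be_by_byte = [b for byte in word.to_bytes(4, "big") for b in bits_msb_first(byte, 8)]

# Consistent Little-Endian order: narrow end first at every level.
le_by_word = bits_lsb_first(word, 32)
le_by_byte = [b for byte in word.to_bytes(4, "little") for b in bits_lsb_first(byte, 8)]

print(be_by_word == be_by_byte, le_by_word == le_by_byte)
```

In both cases the stream is identical whether the sender thinks in words or in bytes, so the receiver does not need to know the sender's chunk size.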

There are two possible consistent orders. One is starting with the
narrow end of each word (aka "LSB") as the Little-Endians do, or
starting with the wide end (aka "MSB") as their rivals, the Big-Endians,
do.

In this note we usually use the following sample numbers: a "word" is a
32-bit quantity and is designated by a "W", and a "byte" is an 8-bit
quantity which is designated by a "C" (for "Character", not to be
confused with "B" for "Bit").


MEMORY ORDER

The first word in memory is designated as W0, by both regimes.
Unfortunately, the harmony goes no further.

The Little-Endians assign B0 to the LSB of the words and B31 is the MSB.
The Big-Endians do just the opposite, B0 is the MSB and B31 is the LSB.

By the way, if mathematicians had their way, every sequence would be
numbered from ZERO up, not from ONE, as is traditionally done. If so,
the first item would be called the "zeroth"....

Since most computers are not built by mathematicians, it is no wonder
that some computers designate bits from B1 to B32, in either the
Little-Endians' or the Big-Endians' order. These people probably would
like to number their words from W1 up, just to be consistent.

Back to the main theme. We would like to illustrate the hierarchically
consistent order graphically, but first we have to decide about the
order in which computer words are written on paper. Do they go from
left to right, or from right to left?

The English language, like most modern languages, suggests that we lay
these computer words on paper from left to right, like this:

|---word0---|---word1---|---word2---|....

In order to be consistent, B0 should be to the left of B31. If the
bytes in a word are designated as C0 through C3 then C0 is also to the
left of C3. Hence we get:

|---word0---|---word1---|---word2---|....
|C0,C1,C2,C3|C0,C1,C2,C3|C0,C1,C2,C3|.....
|B0......B31|B0......B31|B0......B31|......

If we also use the traditional convention, as introduced by our
numbering system, the wide-end is on the left and the narrow-end is on
the right.

Hence, the above is a perfectly consistent view of the world as depicted
by the Big-Endians. Significance consistently decreases as the item
number (address) increases.

Many computers share with the Big-Endians this view about order. In
many of their diagrams the registers are connected such that when the
word W(n) is shifted right, its LSB moves into the MSB of word W(n+1).

English text strings are stored in the same order, with the first
character in C0 of W0, the next in C1 of W0, and so on.

This order is very consistent with itself and with the English language.

On the other hand, the Little-Endians have their view, which is
different but also self-consistent.

They believe that one should start with the narrow end of every word,
and that low addresses are of lower order than high addresses.
Therefore they put their words on paper as if they were written in
Hebrew, like this:

...|---word2---|---word1---|---word0---|

When they add the bit order and the byte order they get:

  ...|---word2---|---word1---|---word0---|
 ....|C3,C2,C1,C0|C3,C2,C1,C0|C3,C2,C1,C0|
.....|B31......B0|B31......B0|B31......B0|

In this regime, when word W(n) is shifted right, its LSB moves into the
MSB of word W(n-1).

English text strings are stored in the same order, with the first
character in C0 of W0, the next in C1 of W0, and so on.

This order is very consistent with itself, with the Hebrew language, and
(more importantly) with mathematics, because significance increases with
increasing item numbers (address).
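
The point about mathematics can be made concrete with a small Python sketch (function names are hypothetical). In the Little-Endians' order the weight of the byte at address a is simply 256**a; in the Big-Endians' order the weight also depends on the length of the number:

```python
def value_little_endian(data):
    """Significance increases with address: weight of byte a is 256**a."""
    return sum(byte << (8 * addr) for addr, byte in enumerate(data))


def value_big_endian(data):
    """Significance decreases with address: weight depends on the total length."""
    n = len(data)
    return sum(byte << (8 * (n - 1 - addr)) for addr, byte in enumerate(data))


data = bytes([0x12, 0x34, 0x56, 0x78])  # bytes at addresses 0, 1, 2, 3
print(hex(value_little_endian(data)), hex(value_big_endian(data)))
```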

It has the disadvantage that English character streams appear to be
written backwards; this is only an aesthetic problem but, admittedly, it
looks funny, especially to speakers of English.

In order to avoid receiving strange comments about this order, the
Little-Endians pretend that they are Chinese, and write the bytes, not
right-to-left but top-to-bottom, like:

C0: "J"
C1: "O"
C2: "H"
C3: "N"
..etc..

Note that there is absolutely no specific significance whatsoever to the
notion of "left" and "right" in bit order in a computer memory. One
could think about it as "up" and "down" for example, or mirror it by
systematically interchanging all the "left"s and "right"s. However,
this notion stems from the concept that computer words represent
numbers, and from the old mathematical tradition that the wide-end of a
number (aka the MSB) is called "left" and the narrow-end of a number is
called "right".

This mathematical convention is the point of reference for the notion of
"left" and "right".

It is easy to determine whether any given computer system was designed
by Little-Endians or by Big-Endians. This is done by watching the way
the registers are connected for the "COMBINED-SHIFT" operation and for
multiple-precision arithmetic like integer products; also by watching
how these quantities are stored in memory; and obviously also by the
order in which bytes are stored within words. Don't let the B0-to-B31
direction fool you!! Most computers were designed by Big-Endians, who
under the threat of criminal prosecution pretended to be Little-Endians,
rather than seeking exile in Blefuscu. They did it by using the
B0-to-B31 convention of the Little-Endians, while keeping the
Big-Endians' conventions for bytes and words.

The PDP10 and the 360, for example, were designed by Big-Endians: their
bit order, byte-order, word-order and page-order are the same. The same
order also applies to long (multi-word) character strings and to
multiple precision numbers.

Next, let's consider the new M68000 microprocessor. Its way of storing
a 32-bit number, xy, a 16-bit number, z, and the string "JOHN" in its
16-bit words is shown below (S = sign bit, M = MSB, L = LSB):

 SMxxxxxxx yyyyyyyyL SMzzzzzzL "J"  "O"  "H"  "N"
|--word0--|--word1--|--word2--|--word3--|--word4--|....
|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|.....
|B15....B0|B15....B0|B15....B0|B15....B0|B15....B0|......

The M68000 always has on the left (i.e., LOWER byte- or word-address)
the wide-end of numbers in any of the various sizes which it may use: 4
(BCD), 8, 16 or 32 bits.

Hence, the M68000 is a consistent Big-Endian, except for its bit
designation, which is used to camouflage its true identity. Remember:
the Big-Endians were the outlaws.

Let's look next at the PDP11 order, since this is the first computer to
claim to be a Little-Endian. Let's again look at the way data is stored
in memory:

        "N"  "H"  "O"  "J" SMzzzzzzL SMyyyyyyL SMxxxxxxL
  ....|--word4--|--word3--|--word2--|--word1--|--word0--|
 .....|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|
......|B15....B0|B15....B0|B15....B0|B15....B0|B15....B0|

The PDP11 does not have an instruction to move 32-bit numbers. Its
multiplication products are 32-bit quantities created only in the
registers, and may be stored in memory in any way. Therefore, the
32-bit quantity, xy, was not shown in the above diagram.

Hence, the above order is a Little-Endians' consistent order. The PDP11
always stores on the left (i.e., HIGHER bit- or byte-address) the
wide-end of numbers of any of the sizes which it may use: 8 or 16 bits.

However, due to some infiltration from the other camp, the registers of
this Little-Endian's marvel are treated in the Big-Endians' way: a
double length operand (32-bit) is placed with its MSB in the lower
address register and the LSB in the higher address register. Hence,
when depicted on paper, the registers have to be put from left to right,
with the wide end of numbers in the LOWER-address register. This
affects the integer multiplication and division, the combined-shifts and
more. Admittedly, Blefuscu scores on this one.

Later, floating-point hardware was introduced for the PDP11/45.

Floating-point numbers are represented by either 32- or 64-bit
quantities, which are 2 or 4 PDP11 words. The wide end is the one with
the sign bit(s), the exponent and the MSB of the fraction. The narrow
end is the one with the LSB of the fraction. On paper these formats are
clearly shown with the wide end on the left and the narrow on the right,
according to the centuries old mathematical conventions. On page 12-3
of the PDP11/45 processor handbook, [3], there is a cute graphical
demonstration of this order, with the word "FRACTION" split over all the
2 or the 4 words which are used to store it.

However, due to some oversights in the security screening process, the
Blefuscuians took over, again. They assigned, as they always do, the
wide end to the LOWer addresses in memory, and the narrow to the HIGHer
addresses.

Let "xy" and "abcd" be 32- and 64-bit floating-point numbers,
respectively. Let's look how these numbers are stored in memory:

       ddddddddL ccccccccc bbbbbbbbb SMaaaaaaa yyyyyyyyL SMxxxxxxx
  ....|--word5--|--word4--|--word3--|--word2--|--word1--|--word0--|
 .....|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|
......|B15....B0|B15....B0|B15....B0|B15....B0|B15....B0|B15....B0|

Well, Blefuscu scores many points for this. The above reference in [3]
does not even try to camouflage it by any Chinese notation.

Encouraged by this success, as minor as it is, the Blefuscuians tried to
pull another fast one. This time it was on the VAX, the sacred machine
which all the Little-Endians worship.

Let's look at the VAX order. Again, we look at the way the above data
(with xy being a 32-bit integer) is stored in memory:

        "N"  "H"  "O"  "J" SMzzzzzzL SMxxxxxxx yyyyyyyyL
   ...ng2-------|-------long1-------|-------long0-------|
  ....|--word4--|--word3--|--word2--|--word1--|--word0--|
 .....|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|
......|B15....B0|B15....B0|B15....B0|B15....B0|B15....B0|

What a beautifully consistent Little-Endians' order this is !!!
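
The integer layouts in the M68000 and VAX diagrams can be reproduced with Python's `struct` module. The sample values for xy and z below are arbitrary assumptions chosen so that each byte is recognizable; the `>` and `<` prefixes select big- and little-endian packing with no padding:

```python
import struct

xy = 0x11223344  # 32-bit integer (sample value)
z = 0x5566       # 16-bit integer (sample value)
text = b"JOHN"

# Bytes in increasing address order.
m68000_image = struct.pack(">IH4s", xy, z, text)  # wide end at the lower address
vax_image = struct.pack("<IH4s", xy, z, text)     # narrow end at the lower address

print(m68000_image.hex(" "))
print(vax_image.hex(" "))
```

The character string occupies the same bytes in both images; only the multi-byte integers are mirrored, which is why text looks "backwards" only when a Little-Endian memory is drawn right-to-left.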

So, what about the infiltrators? Did they completely fail in carrying
out their mission? Since the integer arithmetic was closely guarded
they attacked the floating point and the double-floating which were
already known to be easy prey.

Let's look, again, at the way the above data is stored, except that now
the 32-bit quantity xy is a floating point number: now this data is
organized in memory in the following Blefuscuian way:

        "N"  "H"  "O"  "J" SMzzzzzzL yyyyyyyyL SMxxxxxxx
   ...ng2-------|-------long1-------|-------long0-------|
  ....|--word4--|--word3--|--word2--|--word1--|--word0--|
 .....|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|-C1-|-C0-|
......|B15....B0|B15....B0|B15....B0|B15....B0|B15....B0|

Blefuscu scores again. The VAX is found guilty, however with the
explanation that it tries to be compatible with the PDP11.

Having found themselves there, the VAXians found a way around this
unaesthetic appearance: the VAX literature (e.g., p. 10 of [4])
describes this order by using the Chinese top-to-bottom notation, rather
than an embarrassing left-to-right or right-to-left one. This page is a
marvel. One has to admire the skillful way in which some quantities are
shown in columns 8-bit wide, some in 16 and others in 32, all in order to
avoid the egg-on-the-face problem.....

By the way, some engineering-type people complain about the "Chinese"
(vertical) notation because usually the top (aka "up") of the diagrams
corresponds to "low"-memory (low addresses). However, anyone who was
brought up by computer scientists, rather than by botanists, knows that
trees grow downward, having their roots at the top of the page and their
leaves down below. Computer scientists seldom remember which way "up"
really is (see 2.3 of [5], pp. 305-309).

Having scored so easily in the floating point department, the
Blefuscuians moved to new territories: Packed-Decimal. The VAX is also
capable of using 4-bit-chunk decimal arithmetic, which is similar to the
well known BCD format.

The Big-Endians struck again, and without any resistance got their way.
The decimal number 12345678 is stored in the VAX memory in this order:

        7 8  5 6  3 4  1 2
   ...|-------long0-------|
  ....|--word1--|--word0--|
 .....|-C1-|-C0-|-C1-|-C0-|
......|B15....B0|B15....B0|

This ugliness cannot be hidden even by the standard Chinese trick.
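
A hedged Python sketch of the digit order shown in the diagram. It packs two decimal digits per byte, most significant digit pair at the lowest address, and then prints the same bytes as a little-endian longword. The function name is hypothetical, and the sketch ignores the sign nibble that a real packed-decimal operand would carry, since the diagram does not show one:

```python
def pack_decimal_digits(digits):
    """Pack an even-length digit string two digits per byte, leading digits first."""
    if len(digits) % 2 or not digits.isdigit():
        raise ValueError("expected an even number of decimal digits")
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))


image = pack_decimal_digits("12345678")
print(image.hex())                               # bytes in increasing address order
print(format(int.from_bytes(image, "little"), "08x"))  # the same bytes drawn as long0
```

Drawn right-to-left as a longword, the digit pairs come out in reverse while the digits inside each pair do not, which is the ordering the diagram shows.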

SUMMARY (of the Memory-Order section)


To the best of my knowledge only the Big-Endians of Blefuscu have built
systems with a consistent order which works across chunk-boundaries,
registers, instructions and memories. I failed to find a
Little-Endians' system which is totally consistent.


TRANSMISSION ORDER


In either of the consistent orders the first bit (B0) of the first byte
(C0) of the first word (W0) is sent first, then the rest of the bits of
this byte, then (in the same order) the rest of the bytes of this word,
and so on.

Such a sequence of 8 32-bit words, for example, may be viewed as either
4 long-words, 8 words, 32 bytes or 256 bits.

For example, some people treat the ARPA-internet-datagrams as a sequence
of 16-bit words whereas others treat them as either 8-bit byte streams
or sequences of 32-bit words. This has never been a source of
confusion, because the Big-Endians' consistent order has been assumed.

There are many ways to devise inconsistent orders. The two most popular
ones are the following and its mirror image. Under this order the first
bit to be sent is the LEAST significant bit (B0) of the MOST significant
byte (C0) of the first word, followed by the rest of the bits of this
byte, then the same right-to-left bit order inside the left-to-right
byte order.
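
The inconsistent order just described (least significant bit first inside each byte, most significant byte first inside each word) can be sketched in Python. The helper names are hypothetical. A receiver that reassembles the stream byte by byte recovers the data, while one that treats the same bits as a single 16-bit chunk does not:

```python
def bits_lsb_first(value, width):
    """Bits of value, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def from_lsb_first(bits):
    """Rebuild an integer from bits received least significant first."""
    return sum(bit << i for i, bit in enumerate(bits))


sent = 0x1234
# Sender: bytes wide-end first, bits within each byte narrow-end first.
wire = [b for byte in sent.to_bytes(2, "big") for b in bits_lsb_first(byte, 8)]

# Receiver A agrees on 8-bit chunks and on the byte order.
as_bytes = bytes(from_lsb_first(wire[i:i + 8]) for i in range(0, 16, 8))
# Receiver B treats the stream as one 16-bit chunk, narrow end first.
as_word = from_lsb_first(wire)

print(as_bytes.hex(), hex(as_word))
```

Receiver B sees the two bytes swapped, which is the byte-order problem that only arises once the order is inconsistent across chunk sizes.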

Figure 1 shows the transmission order for the 4 orders which were
discussed above, the 2 consistent and the 2 inconsistent ones.

Those who use such an inconsistent order (or any other), and only those,
have to be concerned with the famous byte-order problem. If they can
pretend that their communication medium is really a byte-oriented link
then this inconsistency can be safely hidden under the rug.

A few years ago 8-bit microprocessors appeared and changed drastically
the way we do business. A few years later a wide variety of 8-bit
communication hardware (e.g., Z80-SIO and 2652) followed, all of which
operate in the Little-Endians' order.

Now a wave of 16-bit microprocessors has arrived. It is not
inconceivable that 16-bit communication hardware will become a reality
relatively soon.

Since the 16-bit communication gear will be provided by the same folks
who brought us the 8-bit communication gear, it is safe to expect these
two modes to be compatible with each other.

The only way to achieve this is by using the consistent Little-Endians
order, since all the existing gear is already in Little-Endians order.

We have already observed that the Little-Endians do not have consistent
memory orders for intra-computer organization.

IF the 16-bit communication link could be made to operate in any order,
consistent or not, which would give it the appearance of being a byte-
oriented link, THEN the Big-Endians could push (ask? hope? pray?) for an
order which transmits the bytes in left-to-right (i.e., wide-end first)
and use that as a basis for transmitting all quantities (except BCD) in
the more convenient Big-Endians format, with the most significant
portions leading the least significant, maintaining compatibility
between 16- and 32-bit communication, and more.

However, this is a big "IF".

Wouldn't it be nice if we could encapsulate the byte-communication and
forget all about the idiosyncrasies of the past, introduced by RS232 and
TELEX, of sending the narrow-end first?

I believe that it would be nice, but nice things do not necessarily
occur, especially if there is so much silicon against them.

Hence, our choice now is between (1) Big-Endians' computer-convenience
and (2) future compatibility between communication gear of different
chunk size.

I believe that this is the question, and we should address it as such.

Short term convenience considerations are in favor of the former, and
the long term ones are in favor of the latter.

Since the war between the Little-Endians and the Big-Endians is
imminent, let's count who is in whose camp.

The founders of the Little-Endians party are RS232 and TELEX, who stated
that the narrow-end is sent first. So do the HDLC and the SDLC
protocols, the Z80-SIO, Signetics-2652, Intel-8251, Motorola-6850 and
all the rest of the existing communication devices. In addition to
these protocols and chips the PDP11s and the VAXes have already pledged
their allegiance to this camp, and deserve to be on this roster.

The HDLC protocol is a full fledged member of this camp because it sends
all of its fields with the narrow end first, as is specifically defined
in Table 1/X.25 (Frame formats) in section 2.2.1 of Recommendation X.25
(see [2]). A close examination of this table reveals that the bit order
of transmission is always 1-to-8. Always, except the FCS (checksum)
field, which is the only 16-bit quantity in the byte-oriented protocol.

The FCS is sent in the 16-to-1 order. How did the Blefuscuians manage
to pull off such a fiasco?! The answer is beyond me. Anyway, anyone
who designates bits as 1-to-8 (instead of 0-to-7) must be gullible to
such tricks.

The Big-Endians have the PDP10's, 370's, ALTO's and Dorado's...

An interesting creature is the ARPANet-IMP. The documentation of its
standard host interface (aka "LH/DH") states that "The high order bit of
each word is transmitted first" (p. 4-4 of [1]), hence, it is a
Big-Endian. This is very convenient, and causes no confusion between
diagrams which are either 32- (e.g., on p. 3-25) and 16-bit wide (e.g.,
on p. 5-14).

However, the IMP's Very Distant Host (VDH) interface is a Little-Endian.

The same document ([1], again, p. F-18), states that the data "must
consist of an even number of 8-bit bytes. Further, considering each pair
of bytes as a 16-bit word, the less significant (right) byte is sent
first".

In order to make this even more clear, p. F-23 states "All bytes (data
bytes too) are transmitted least significant (rightmost) bit first".

Hence, both camps may claim to have this schizophrenic double-agent in
their camp.
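
The two transmission orders quoted above can be made concrete with a small sketch. It is only an illustration: the function names are made up here, and the 16-bit sample word is arbitrary. It lists the bits of a word in the order a "high order bit first" link would send them, and in the order the VDH rule (less significant byte first, each byte least significant bit first) would send them.

```python
def bits_msb_first(value, width):
    """Bits of `value` in Big-Endian transmission order (high order bit first)."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def bits_lsb_first(value, width):
    """Bits of `value` in Little-Endian transmission order (low order bit first)."""
    return [(value >> i) & 1 for i in range(width)]


def vdh_style_bits(word16):
    """A 16-bit word sent as two 8-bit bytes: the less significant byte
    first, and every byte least significant bit first."""
    low = word16 & 0xFF
    high = (word16 >> 8) & 0xFF
    return bits_lsb_first(low, 8) + bits_lsb_first(high, 8)


word = 0x1234  # arbitrary sample value
print("MSB first:", bits_msb_first(word, 16))
print("VDH order:", vdh_style_bits(word))
```

Under these assumptions the VDH order is simply the whole word sent low order bit first, i.e. the exact reverse of the LH/DH order.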

Note that the Lilliputians' camp includes all the who's-who of the
communication world, unlike the Blefuscuians' camp which is very much
oriented toward the computing world.

Both camps have already adopted the slogan "We'd rather fight than
switch!".

I believe they mean it.

SUMMARY (of the Transmission-Order section)


There are two camps each with its own language. These languages are as
compatible with each other as any Semitic and Latin languages are.

All Big-Endians can talk to each other with relative ease.

So can all the Little-Endians, even though there are some differences
among the dialects used by different tribes.

There is no middle ground. Only one end can go first.


CONCLUSION


Each camp tries to convert the other. As in all the religious wars of
the past, logic is not the decisive tool. Power is. This holy war is
not the first one, and probably will not be the last one either.

The "Be reasonable, do it my way" approach does not work. Neither does
the Esperanto approach of "let's all switch to yet a new language".

Our communication world may split according to the language used. A
certain book (which is NOT mentioned in the references list) has an
interesting story about a similar phenomenon, the Tower of Babel.

Little-Endians are Little-Endians and Big-Endians are Big-Endians and
never the twain shall meet.

We would like to see some Gulliver standing up between the two islands,
forcing a unified communication regime on all of us. I do hope that my
way will be chosen, but I believe that, after all, which way is chosen
does not make too much difference. It is more important to agree upon
an order than which order is agreed upon.

How about tossing a coin ???


[Diagram: four panels, labeled order (1) through order (4). Each panel
has a vertical "time" axis and a horizontal bit axis running from MSB
to LSB, with a diagonal line showing when each bit is sent.]


Figure 1: Possible orders, consistent: (1)+(2), inconsistent: (3)+(4).


A P P E N D I X

Some notes on Swift's Gulliver's Travels:


Gulliver finds out that there is a law, proclaimed by the grandfather of
the present ruler, requiring all citizens of Lilliput to break their
eggs only at the little ends. Of course, all those citizens who broke
their eggs at the big ends were angered by the proclamation. Civil war
broke out between the Little-Endians and the Big-Endians, resulting in
the Big-Endians taking refuge on a nearby island, the kingdom of
Blefuscu.

Using Gulliver's unquestioning point of view, Swift satirizes religious
wars. For 11,000 Lilliputian rebels to die over a controversy as
trivial as at which end eggs have to be broken seems not only cruel but
also absurd, since Gulliver is sufficiently gullible to believe in the
significance of the egg question. The controversy is important
ethically and politically for the Lilliputians. The reader may think
the issue is silly, but he should consider that Swift is making fun of
the actual causes of religious or holy wars.

In political terms, Lilliput represents England and Blefuscu France.
The religious controversy over egg-breaking parallels the struggle
between the Protestant Church of England and the Catholic Church of
France, possibly referring to some differences about what the Sacraments
really mean. More specifically, the quarrel about egg-breaking may
allude to the different ways that the Anglican and Catholic Churches
distribute communion, bread and wine for the Anglican, but bread alone
for the Catholic. The French and English struggled over more mundane
questions as well, but in this part of Gulliver's Travels, Swift points
up the symbolic difference between the churches to ridicule any
religious war.


For ease of reference please note that Lilliput and Little-Endians
both start with an "L", and that both Blefuscu and Big-Endians start
with a "B". This is handy while reading this note.

R E F E R E N C E S

[1] Bolt Beranek & Newman.
Report No. 1822: Interface Message Processor.
Technical Report, BB&N, May, 1978.

[2] CCITT.
Orange Book. Volume VIII.2: Public Data Networks.
International Telecommunication Union, Geneva, 1977.

[3] DEC.
PDP11 04/05/10/35/40/45 processor handbook.
Digital Equipment Corp., 1975.

[4] DEC.
VAX11 - Architecture Handbook.
Digital Equipment Corp., 1979.

[5] Knuth, D. E.
The Art of Computer Programming. Volume I: Fundamental
Algorithms.
Addison-Wesley, 1968.

[6] Swift, Jonathan.
Gulliver's Travels.
Unknown publisher, 1726.

OTHER SLIGHTLY RELATED TOPICS (IF AT ALL)


not necessarily for inclusion in this note

Who's on first? Zero or One ??

People start counting from the number ONE. The very word FIRST is
abbreviated into the symbol "1st" which indicates ONE, but this is a
very modern notation. The older notions do not necessarily support this
relationship.

In English and French - the word "first" is not derived from the word
"one" but from an old word for "prince" (which means "foremost").
Similarly, the English word "second" is not derived from the number
"two" but from an old word which means "to follow". Obviously there is
a close relation between "third" and "three", "fourth" and "four" and
so on.

Similarly, in Hebrew, for example, the word "first" is derived from the
word "head", meaning "the foremost", but not specifically No. 1. The
Hebrew word for "second" is specifically derived from the word "two".
The same for three, four and all the other numbers.

However, people have, for a very long time, counted from the number One,
not from Zero. As a matter of fact, the inclusion of Zero as a
full-fledged member of the set of all numbers is a relatively modern
concept.

Zero is one of the most important numbers mathematically. It has many
important properties, such as being a multiple of any integer.

A nice mathematical theorem states that for any base, b, the first b^N
(b to the Nth power) integers are each represented by exactly N digits
(leading zeros included). This is true if and only if the count starts
with Zero (hence, 0 through b^N-1), not with One (for 1 through b^N).

This theorem is the basis of computer memory addressing. Typically, 2^N
cells are addressed by an N-bit addressing scheme. Starting the count
from One, rather than Zero, would cause either the loss of one memory
cell, or an additional address line. Since either price is too
expensive, computer engineers agree to use the mathematical notation of
starting with Zero. Good for them!
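
The counting argument can be checked with a few lines of code. This is only a sketch; the helper name and the choice of base 2 with N = 4 are assumptions made for the illustration.

```python
def to_digits(n, base, width):
    """Return exactly `width` digits of the non-negative integer n in the
    given base (most significant digit first), or None if n needs more."""
    digits = []
    for _ in range(width):
        digits.append(n % base)
        n //= base
    return digits[::-1] if n == 0 else None


base, width = 2, 4          # illustrative choice: 4 address lines
cells = base ** width       # 16 memory cells

# Counting from Zero: every one of 0 .. 15 fits in 4 digits.
print(all(to_digits(n, base, width) is not None for n in range(cells)))

# Counting from One: the last cell, number 16, needs a fifth digit.
print(to_digits(cells, base, width))
```

Counting from One therefore costs either the cell numbered b^N or one extra address line, which is the trade-off described above.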

The designers of the 1401 were probably ashamed to have address-0 and
hid it from the users, pretending that the memory started at address-1.

This is probably the reason that all memories start at address-0, even
those of systems which count bits from B1 up.

Communication engineers, like most "normal" people, start counting from
the number One. They never suffer by having to lose a memory cell, for
example. Therefore, they are happily counting 1-to-8, and not 0-to-7 as
computer people learn to do.

ORDER OF NUMBERS.

In English, we write numbers in Big-Endians' left-to-right order. I
believe that this is because we SAY numbers in the Big-Endians' order,
and because we WRITE English in Left-to-right order.

Mathematically there is a lot to be said for the Little-Endians' order.

Serial comparators and dividers prefer the Big-Endians' order. Serial
adders and multipliers prefer the Little-Endians' order.
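
A minimal sketch of why serial hardware has these preferences, assuming unsigned numbers delivered one bit per clock (the function names are invented for this illustration). The adder needs the low order bit first because the carry travels toward the high order end; the comparator needs the high order bit first because the first differing bit decides the result.

```python
def serial_add(a_lsb_first, b_lsb_first):
    """Add two equal-length bit streams that arrive LSB first.
    Only one bit of state (the carry) is kept between clocks."""
    carry = 0
    out = []
    for a, b in zip(a_lsb_first, b_lsb_first):
        total = a + b + carry
        out.append(total & 1)   # sum bit leaves immediately
        carry = total >> 1      # carry waits for the next, more significant, bit
    out.append(carry)           # final carry is the top bit of the sum
    return out


def serial_compare(a_msb_first, b_msb_first):
    """Compare two equal-length unsigned bit streams that arrive MSB first.
    Returns 1, 0 or -1; the answer is known at the first differing bit."""
    for a, b in zip(a_msb_first, b_msb_first):
        if a != b:
            return 1 if a > b else -1
    return 0


# 6 + 3 = 9, bits listed LSB first
print(serial_add([0, 1, 1], [1, 1, 0]))
# 6 versus 3, bits listed MSB first
print(serial_compare([1, 1, 0], [0, 1, 1]))
```

Fed in the opposite order, either circuit would have to buffer the whole number before producing its first output bit.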

When was the common Big-Endians order adopted by most modern languages?

In the Bible, numbers are described in words (like "seven") not by
digits (like "7") which were "invented" nearly a thousand years after
the Bible was written. In the old Hebrew Bible many numbers are
expressed in the Little-Endians order (like "Seven and Twenty and
Hundred") but many are in the Big-Endians order as well.

Whenever the Bible is translated into English the contemporary English
order is used. For example, the above number appears in that order in
the Hebrew source of The Book of Esther (1:1). In the King James
Version it is (in English) "Hundred and Seven and Twenty". In the
modern Revised American Standard Version of the Bible this number is
simply "One Hundred and Twenty-Seven".

INTEGERS vs. FRACTIONS

Computer designers treat fixed-point multiplication in one of two ways, as
an integer-multiplication or as a fractional-multiplication.

The reason is that when two 16-bit numbers, for example, are multiplied,
the result is a 31-bit number in a 32-bit field. Integers are right
justified; fractions are left justified. The entire difference is only
a single 1-bit shift. As small as it is, this is an important
difference.

Hence, computers are wired differently for these kinds of
multiplications. The addition/subtraction operation is the same for
either integer/fraction operation.

If the LSB is B0 then the value of a number is SIGMA<B(i)*[(2)^i]>,
for i=0,15, in the above example. This is, obviously, an integer.

If the MSB is B0 then the value of a number is SIGMA<B(i)*[(1/2)^i]>,
for i=0,15. This is, obviously, a fraction.

Hence, after multiplication the Integerites would typically keep B0-B15,
the LSH (Least Significant Half), and discard the MSH, after verifying
that there is no overflow into it. The Fractionites would also keep
B0-B15, which is the MSH, and discard the LSH.
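
The two kinds of multiplication can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular machine: operands are 16-bit two's-complement values, the fractions are scaled by 2^15 (so a stored value x stands for x / 2^15), and both function names are invented here.

```python
def integer_multiply(a, b):
    """Integerite rule: keep the Least Significant Half of the product,
    after verifying that nothing overflowed into the other half."""
    product = a * b
    if not -(1 << 15) <= product < (1 << 15):
        raise OverflowError("product does not fit in 16 bits")
    return product


def fraction_multiply(a, b):
    """Fractionite rule: a and b are fractions scaled by 2**15.
    The product is scaled by 2**30, so it is left-justified with a
    single 1-bit shift and the Most Significant Half is kept."""
    product = a * b               # 31 significant bits in a 32-bit field
    return (product << 1) >> 16   # discard the Least Significant Half


half = 1 << 14                    # represents 0.5 in this scaling
print(integer_multiply(100, 200))
print(fraction_multiply(half, half))   # 0.25 in the same scaling
```

The single 1-bit shift in `fraction_multiply` is the whole difference between the two conventions. One corner case is left unhandled in this sketch: multiplying -1.0 by -1.0 (both stored as -32768) gives a result that does not fit in 16 bits.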

One could expect Integerites to be Little-Endians, and Fractionites to
be Big-Endians. I do not believe that the world is that consistent.

SWIFT's POINT

It may be interesting to notice that the point which Jonathan Swift
tried to convey in Gulliver's Travels is exactly the opposite of the
point of this note.

Swift's point is that the difference between breaking the egg at the
little-end and breaking it at the big-end is trivial. Therefore, he
suggests that everyone do it in his own preferred way.

We agree that the difference between sending eggs with the little- or
the big-end first is trivial, but we insist that everyone must do it in
the same way, to avoid anarchy. Since the difference is trivial we may
choose either way, but a decision must be made.

bra...@trwspp.uucp

Sep 10, 1984, 12:11:14 PM
[}{]

>> John Gilmore, sun!g...@Berkeley.arpa

But it was posted. Sometime in Jan. I believe. Anyway, I'll send you a
copy of it separately. Only post it, however, if there is enough demand
since it is over 35k characters long and since it was already posted.

-- Brad Brahms
usenet: {decvax,ucbvax}!trwrb!trwspp!brahms
arpa: Brahms@USC-ECLC

Jonathon Luers 97320

Sep 10, 1984, 12:45:58 PM
Re: Big- vs. little-endians

In the article reprinted by sun!gnu, "On Holy Wars and a Plea for
Peace," Danny Cohen claims that only the Big-endians have produced
a fully consistent machine (memory-order wise); however, I would
submit that even such a Blefuscuian machine as the 68000 cannot
deny an inherent Lilliputian trait: its registers are little-endian!

It is agreed that the 68000 stores the most significant bit of any
length integer at the lower address in main memory:

| C0 | C1 | C0 | C1 |
|   W0    |   W1    |
|      long 0       |
|SMxxxxxxx|yyyyyyyyL|
 "J"  "O"  "H"  "N"

But when a long word is put into a register and you try to access
the first half of it (W0), you get instead the Lilliputian first
half, W1:

 "J"  "O"  "H"  "N"
|SMxxxxxxx|yyyyyyyyL|   (D0.L)
          |SMyyyyyyL|   (D0.W)
               |SMyL|   (D0.B)
                "N"

In other words:

A 68000 register (e.g. D0) can be viewed as a little memory, of
which only the zeroth element can be accessed: L0, W0, or C0 (referred
to as D0.L, D0.W, D0.B respectively). This is then a little-endian
memory, since C0 is the same as the little (least significant) end
of W0, and W0 is the little end of L0. This is of course very practical
compared to the Big-endian alternative, since a 16-bit integer can
be converted to a 32-bit one with the same value by simply extending
the sign bit for 16 bits to the left, rather than by shifting all 16 bits
to the right. Nevertheless, the little-endians seem to have successfully
planted a mole in the Big-endian camp!
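
The contrast between the memory view and the register view can be modelled in a few lines. This is only a sketch that imitates the behaviour described above with ordinary integers and Python's `struct` module; it does not run 68000 code, and the variable and function names are assumptions made for the illustration.

```python
import struct

value = 0x4A4F484E                    # the four bytes "J", "O", "H", "N"

# Memory view: a long is stored most significant byte at the lowest address.
memory = struct.pack(">I", value)     # ">" selects big-endian byte order
print(memory)                         # the bytes in address order

word_at_long_address = struct.unpack(">H", memory[0:2])[0]   # W0 = "JO"
byte_at_long_address = memory[0]                             # C0 = "J"

# Register view: the narrower names select the least significant end.
d0_l = value
d0_w = d0_l & 0xFFFF                  # "HN", what was W1 in memory
d0_b = d0_l & 0xFF                    # "N"


def sign_extend_16(word):
    """Widen a 16-bit two's-complement value to 32 bits by copying the
    sign bit leftward; the low 16 bits stay where they are."""
    return word | 0xFFFF0000 if word & 0x8000 else word


print(hex(word_at_long_address), hex(d0_w))
print(hex(sign_extend_16(0xFFFE)))    # -2 as a word becomes -2 as a long
```

In this model the word found at the long's own address in memory is "JO", while the word found under the long's own register name is "HN", which is the inconsistency the diagrams show.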


On another issue touched on in Cohen's article, how many readers are
"top-endians" or "bottom-endians" when it comes to representing
memory vertically? I personally favor the top-endian approach,
putting location 0 at the top of the page and letting addresses
increase downward, since this is consistent with code listings and
memory dumps. (I guess that makes me a "computer scientist" rather
than an "engineer" - *sigh*) I'm rather surprised, though, at the
number of published documents (e.g. certain DEC manuals) which have
bottom-endian memory maps or even toggle back and forth.

Thanks to sun!gnu for reposting an entertaining article.

Jon Luers
AT&T Teletype Corp.
ihnp4!tty3b!jhl

bpr...@bmcg.uucp

Sep 10, 1984, 3:13:25 PM
Date: Mon, 10-Sep-84 15:13:25 PDT

Holy Wars and a Plea for Peace

Organization: Burroughs Corporation, San Diego

Thank you, gnu, for posting IEN 137; and thank you, Danny Cohen, wherever you
are, for having written it.

There is, however, one point of OTHER SLIGHTLY RELATED TOPICS (IF AT ALL) that
is disputable. To demonstrate that it is disputable, I herewith dispute it:

>The designers of the 1401 were probably ashamed to have address-0 and
>hid it from the users, pretending that the memory started at address-1.

Actually, the designers of the 1401 did something even worse than hiding it:
they used this global thing for a very local purpose. The cell addressed by
the address zero was used as the row counter for the card reader! I actually
used it, once, when I needed a '&' character in front of a copy of the card
image. The '&' character in 000 was left upon completion of the read-card
instruction, and represented the value of the last row past the read brushes.
Using this effect follows the First Law of Hacking: "Any phenomenon is an
interface."

Those same designers did a similar trick with address 100, but for the card
punch. In this case, the value left was '9', the last row to pass the punch
die.

Ah! The joys of reminiscing! But now, back to work.

--Bill Price
--
--Bill Price uucp: {decvax!ucbvax philabs}!sdcsvax!bmcg!bprice
arpa:? sdcsvax!bmcg!bprice@nosc

Peter Bain

Sep 12, 1984, 9:06:15 AM
Here is the reference for Cohen's article

%A D. Cohen
%T On Holy Wars and a Plea for Peace
%J Computer
%N 10
%V 14
%D 1981
%M Oct.
%P 48-54

Another discussion of the subject is

%A H. Kirrmann
%T Data Format and Bus Compatibility in Multiprocessors
%J Micro
%V 3
%N 4
%D 1983
%M Aug.
%P 32-47
-peter

Steven Pemberton

Sep 12, 1984, 11:33:28 PM
[Please post any follow-ups to this only in net.nlang. The other newsgroups
are only there because the original article was. Because I had to get this
comment in first, I've moved the bug-food to another article :-)]

> In English and French - the word "first" is not derived from the word
> "one" but from an old word for "prince" (which means "foremost").
> Similarly, the English word "second" is not derived from the number
> "two" but from an old word which means "to follow".

There seems to be some confusion here: 'first' comes from 'fore' + 'est',
ie 'most fore'. It then came to be used to mean 'prince' in some related
languages (as it still does today, I believe, in Dutch: vorst, and German:
fuerst). I imagine this developed in the same way that Americans talk of the
'first lady'.

In French the word for first is 'premier', again not from a word for prince,
but from Latin primus (first). The French word for prince is 'prince' from
Latin 'Primus' + 'capere' (to take), and this French word is where the English
comes from. However the 'pr' and 'fr' of the English and French words are
cognate.

In Old English 'other' was used for second. The word 'second' comes from
French, but this does come from the Latin word 'secundus', following.
