Ed Prochak <edpr...@gmail.com> wrote:
(snip, someone wrote)
>> AFAICT, all general-purpose microprocessors use an 8-bit byte, and thus so
>> do systems built around them. Some systems whose CPU was built from
>> discrete logic used other sizes (I'm aware of 6, 10, and 12 bits).
>> Dedicated DSPs often have 32-bit bytes (i.e. memory is addressed in words).
(snip)
> it is conventional nowadays, but there were other options in the past,
> with 9bit bytes on 36 bit machines. And bit slice hardware to build
> your own custom processor width.
Well, IBM S/360, byte addressed with 8 bit (EBCDIC) characters, was
the successor for various machines with 6 bit character sets.
The 7090 was the popular IBM scientific machine before S/360,
with 36 bit words and a 6 bit (BCDIC) character set.
The IBM machines for business use before S/360 also used a six
bit character set, and most allowed for variable length operations
with a word mark bit in memory.
The scientific and business machines used different mappings for some
of the characters, which complicated things all around.
From Blaauw and Brooks' "Computer Architecture: Concepts and Evolution":
"Desiderata and constraints. The principal desiderata and constraints
established for EBCDIC design follow:
1) The code was to be an extension of BCD, compatible except as
in desiderata 2.
2) There were to be no duals -- that is, the BCD duals had to be
"unwound," and separate codes assigned for these graphics. The
character set had to include at least 53 characters.
3) The character set had to be suitable for interchange. It had to
fit all then-existing media and devices, including tape, disks,
cards, printers, typewriters, and keypunches. These realization
constraints were:
- The typewriter allowed 88 characters plus blank
- The IBM bar printer allowed 52+b or 64+b.
- The IBM chain printer allowed (240/n)+b. The chain was 240
characters long. One could have an integral number of n
repeats, or indeed fractional numbers at some higher cost.
- The keypunch's interpreting printer had a decoding mechanism
that could move nine steps in one direction, and seven in the
other. It could interpret (9 x 7 = 63) + b.
- Although blank did not need to occupy a spot on printer chains
and bars, since it could be printed by suppressing hammer
firing, it occupied a code point.
4) The primary character set was to be representable in 6 bits,
but 8 bit versions had to include the lowercase alphabet,
distinguished by only one bit from the uppercase alphabet.
5) The character set was to be universal across the Latin-alphabet
natural languages. The alphabets had to include 29 letters, to
accommodate the German, French, and Scandinavian languages.
(A whole separate language-by-language story shows why 29
is a suitable number for many European languages.)
6) The character set was to include punctuation marks needed by
natural language. These were operationally defined by the
character set of a "correspondence" typewriter. This requirement
meant adding the unwound BCD characters : ; ? " ! and ideally
¢.
7) The designers of programming language PL/I wanted the FORTRAN
operators and delimiters, plus the logical operators & | ~,
the relational operators < >, and the brackets [ ]."
The last character in 6) is the cent sign. The book has a tilde (~) where
I believe the PL/I logical not (¬) sign should go.
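Desideratum 4 (lowercase distinguished from uppercase by only one bit) can
be checked against the code as it ended up. A small sketch, assuming that
Python's "cp037" codec (the US/Canada EBCDIC code page) is a fair stand-in
for EBCDIC here; the helper name ebcdic() is just mine for illustration:

```python
import string

# Assumption: code page 037 is used as a representative EBCDIC variant.
def ebcdic(ch):
    """Return the EBCDIC (cp037) code point of a single character."""
    return ch.encode("cp037")[0]

# XOR each uppercase letter with its lowercase partner; the set collects
# every distinct bit pattern by which the two cases differ.
diffs = {
    ebcdic(upper) ^ ebcdic(lower)
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
}

print({hex(d) for d in diffs})
print(hex(ebcdic("a")), hex(ebcdic("A")))
```

If the set holds a single value with one bit set, the two cases differ in
exactly that bit for all 26 basic letters. (The national alphabet
characters are not covered by this check.)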
continuing:
"Size of the printer subset of EBCDIC. Considering these constraints
and desiderata, the size z of the EBCDIC printer's uppercase subset
becomes a mathematical exercise:
1) The upper limit for z, including blank, is 64.
2) The characters of z plus the 29 lower case alphabetics had
to be representable on the 88+b typewriter. Hence z<=89-29,
which reduces the maximum to 60.
3) The lower limit of z derives from the 48 of BCD plus 5 to unwind
the duals; it is 53.
4) A 240 character print chain printer could not accommodate five
iterations of 53 characters each, but if restricted to four
iterations, each could have 60 characters. With the blank, it
would give 61 characters, which is greater than the maximum of
60 for the typewriter's one-case subset. Therefore, z=60
(including blank) was the best size."
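The arithmetic in that passage can be replayed in a few lines. This is only
my reading of the quoted steps, not anything from the book: I assume the
chain holds 240/n graphics for n repeats, that blank needs no chain
position, and that the designers wanted as many repeats as possible. All
the names below are made up for the sketch.

```python
# Numbers taken from the quoted passage.
CODE_LIMIT = 64            # step 1: upper limit for z, including blank
TYPEWRITER = 88 + 1        # 88 characters plus blank
ALPHABET = 29              # letters needed per case
BCD_SIZE = 48              # existing BCD set
UNWOUND_DUALS = 5          # extra codes needed to unwind the duals
CHAIN_LENGTH = 240         # positions on the print chain

typewriter_max = TYPEWRITER - ALPHABET       # step 2: 89 - 29
z_min = BCD_SIZE + UNWOUND_DUALS             # step 3: 48 + 5
z_max = min(CODE_LIMIT, typewriter_max)

def chain_capacity(repeats):
    """Characters available with `repeats` copies on the chain, plus blank.

    Assumption: blank takes no chain position (hammer firing suppressed).
    """
    return CHAIN_LENGTH // repeats + 1

# Step 4: keep only the repeat counts that still leave room for z_min.
feasible = {n: chain_capacity(n) for n in range(1, 7)
            if chain_capacity(n) >= z_min}
most_repeats = max(feasible)

# The chain would allow feasible[most_repeats]; the typewriter caps it.
z = min(z_max, feasible[most_repeats])

print(typewriter_max, z_min, most_repeats, feasible[most_repeats], z)
```

Under those assumptions, five repeats drop out because they leave too few
characters, and the typewriter limit rather than the chain ends up setting z.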
"The steps taken to select a set of 60 graphics were as follows:
1) ?? was discarded completely because it was used rarely, according
to a customer survey.
2) & was made to serve both as the PL/I symbol for AND, and as the
commercial character.
3) $ was dedicated as one of the three uppercase national alphabet
characters, to be replaced by other currency symbols (for
example, the British pound symbol) as required. The
Latin-alphabet languages needing larger alphabets fortunately do
not use unique currency symbols. Notice that the national
alphabet symbols are duals (even multiples) by definition.
4) @ and #, whose usage is mainly in the United States, were
dedicated as uppercase national alphabet characters, along with $.
5) ", ¢, ! were made lowercase national alphabet characters, thereby
getting them onto the typewriter but not onto the 60 character
printer. These three plus the 59 uppercase characters were put
on the 63+b keypunch. This left one remaining code that could be
printed on the keypunch; it was represented on the card by 0-8-2.
This code was specifically forbidden from having a graphic,
because that would violate typewriter and printer
representability.
6) PL/I was forced to give up two graphics. The language designers
chose to give up [ ]. This was a bad mistake -- as delimiters,
brackets are much more powerful than operators in making a
language easy to write, read, and parse. It would have been
better to give up < >. (Even after the character set was frozen,
the language designers could have used < > as brackets rather
than as operators. They chose not to; a bad decision, we
believe.)"
So, that is what goes into an eight bit code.
> If I were to hazard a guess, I'd say the 8bit byte became the overall
> standard when CPU manufacturers were no longer memory manufacturers.
> IOW, when memory became a commodity product.
Well, S/360 was pretty popular, and using 8 bits made it easier to
work with the 360. The HP machines with 16 bit words were also pretty
popular in the years before the 8080.
Intel originally developed the 4004 for a BCD calculator, and then
extended it to the 8 bit 8008. Most of the RAM chips at that time were
one bit wide, but EPROMs got popular at 8 bits, presumably to go along
with the 8008 and 8080.
If the microprocessor originated while IBM was in the 36 bit machine
business, things might have been different.
-- glen