
Decimal Exponent Floating (like in JOSS)


John Savard

Mar 15, 2005, 12:55:46 PM
On my web page, at

http://home.ecn.ab.ca/~jsavard/index.html

I have a link to a document at

http://home.ecn.ab.ca/~jsavard/arch.pdf

which is a 142-page .PDF in two columns. It is derived from what was a
portion of my web page; I eliminated that for the time being for space
reasons, but brought it back in this form as I found I could barely
squeeze it in.

This is the description of an imaginary computer architecture; it is
written to illustrate various computer architectural concepts by
including virtually everything but the kitchen sink.

In any event, I noted that since I went as far as to include a
compressed decimal feature (based on Chen-Ho encoding rather than
Densely Packed Decimal, since I use ten's complement by default...
although on another page,

http://home.ecn.ab.ca/~jsavard/crypto/mi060301.htm

I note that DPD can be modified to handle ten's complement
quantities), I should go even further, particularly as decimal
floating-point is currently being seriously proposed, and include
binary floating-point with a decimal exponent, as the JOSS
interpreter on RAND's JOHNNIAC did (JOSS is historically important
for inspiring FOCAL, though FOCAL used conventional binary
floating-point).

Anyhow, if one wishes to make maximum use of the binary range, one has
a small range of numbers that gets an extra decimal digit of precision;
that range is extended a tiny bit upwards part of the time if the test
for "divide by ten" is binary carry out, and some denormal quantities
are allowed if the test for "multiply by ten" is that the fraction is
less than 3/32nds - thus, one has an extra digit that is sometimes
there, and sometimes not.

Although one therefore has exact results for decimal quantities within
the stated precision, the behavior of such numbers in other respects is
such as to horrify numerical analysts, I'm afraid.

John Savard

Steve Richfie1d

Mar 15, 2005, 11:35:43 AM
John,

> This is the description of an imaginary computer architecture; it is
> written to illustrate various computer architectural concepts by
> including virtually everything but the kitchen sink.

Hey, you're stealing my thunder! Take a look at my kitchen sink proposal
at <http://www.smart-life.net/FP>. Perhaps we should be borrowing from
each other's kitchen sinks?!

Steve Richfield

John Savard

Mar 16, 2005, 1:30:25 AM
Steve Richfie1d <St...@NOSPAM.smart-life.net> wrote in message news:<1110910906.bf168b09db3256f249d01177da071767@teranews>...

> Hey, you're stealing my thunder! Take a look at my kitchen sink proposal
> at <http://www.smart-life.net/FP>. Perhaps we should be borrowing from
> each other's kitchen sinks?!

I see that your site isn't about a kitchen sink that looks like it was
designed by Rube Goldberg.

Instead, you're working on a proposal for a more flexible
floating-point arithmetic standard. I don't think you'll find anything
in my architecture that you haven't already covered there. With one
_possible_ exception.

You cite the literature for some of the gradual significance reduction
options you give. I offer, in the "native" floating-point format of
the machine, and in the "small" floating-point type, something I call
'extremely gradual underflow'. It provides use of all possible bit
patterns in a floating-point number, yet without suppressing the most
significant bit.

This is done through a _really_ slow reduction in significance for
smaller numbers; basically, degree of denormalization becomes the most
significant part of the exponent. Yet, it's not wholly new - instead,
it resembles A-law audio encoding, particularly if one finally ends
the sequence by allowing the exponent field to shrink.

John Savard

John Savard

Mar 16, 2005, 11:23:23 AM
Steve Richfie1d <St...@NOSPAM.smart-life.net> wrote in message news:<1110910906.bf168b09db3256f249d01177da071767@teranews>...

> Hey, you're stealing my thunder! Take a look at my kitchen sink proposal
> at <http://www.smart-life.net/FP>. Perhaps we should be borrowing from
> each other's kitchen sinks?!

Looking at your web page further, I see that what you appear to be
proposing is a standard for the following:

A software floating-point support package for a computer that will
receive a request that it provide floating-point computations which
satisfy certain properties, and which will meet that request using a
floating-point format supported by the underlying hardware if
possible, and will simulate a format in software with those properties
if necessary.

A hardware floating-point coprocessor could also implement this as a
standard.

This is an interesting idea. Why is the committee dealing with IEEE
754 ignoring it?

There is actually a simple reason. Although an interesting idea, it
would require effort to implement. Thus, the fact that it exists as an
alternative is not enough to lead to an effort to standardize it.

The effort to create IEEE 754 resulted from a groundswell of demand
among people using computers to perform numerical calculations, based
on the fact that the floating-point offered on at least some
architectures (i.e., the hexadecimal floating-point of the IBM 360
series) was inadequate, and that gradual underflow was sorely needed.

Some items, such as affine infinity, were dropped from IEEE 754 - they
were implemented on the 8087, but not on its successors, therefore.

If enough people who work with numbers think that something like your
proposal is needed, then, once they take it up, this idea *will* go
somewhere; but there has to be a real demand before it is felt that it
is worth the effort.

John Savard

Steve

Mar 16, 2005, 5:18:51 PM
John,

> This is an interesting idea. Why is the committee dealing with IEEE
> 754 ignoring it?

NIH = Not Invented Here. About a decade ago I was approached to
coordinate the IEEE FP effort. My response: "GREAT, now I can fix all of
the problems it has!" "What problems?" The next half hour was taken up
with my explanation of its many problems. That was the end of THAT
discussion. I probably should have just kept quiet until I had the position.

> There is actually a simple reason. Although an interesting idea, it
> would require effort to implement. Thus, the fact that it exists as an
> alternative is not enough to lead to an effort to standardize it.
>
> The effort to create IEEE 754 resulted from a groundswell of demand
> among people using computers to perform numerical calculations, based
> on the fact that the floating-point offered on at least some
> architectures (i.e., the hexadecimal floating-point of the IBM 360
> series) was inadequate, and that gradual underflow was sorely needed.
>
> Some items, such as affine infinity, were dropped from IEEE 754 - they
> were implemented on the 8087, but not on its successors, therefore.
>
> If enough people who work with numbers think that something like your
> proposal is needed, then, once they take it up, this idea *will* go
> somewhere; but there has to be a real demand before it is felt that it
> is worth the effort.

Here is the problem. Global warming, outsourcing, the balance of trade,
etc., are all being allowed to happen because present simulations
completely and hopelessly fail to predict their effects. Further, there
is good reason to believe that these fixes to FP would make much more
reliable simulations possible. However, the people doing this work don't
represent 1% of PC users. Seeing some of the logical traps that the VERY
competent people on this forum fall into (e.g. using algorithms that
don't preserve significance/dimensionality), and the complaints about
non-existent problems, what chance is there for the community of users
to ever understand these issues?

Indeed, much of the present '754 rests on "proof by authority" of the
person now commonly referred to as the "Father of IEEE FP", when in fact
even HE lacked the skills to do a good job of this, to the point of
leaving no smooth path for extension.

As I see it, the fundamental process is broken, with little prospect for
broad-spectrum repair. Even PhD math people lack the skills needed to
participate usefully in this discussion. The ONLY hope I see is for some
competent person at Sun or Intel to "see the light" and run with this.

Steve Richfie1d

David W. Cantrell

Mar 16, 2005, 5:52:53 PM
jsa...@ecn.ab.ca (John Savard) wrote:
[snip]

> Some items, such as affine infinity, were dropped from IEEE 754 - they
> were implemented on the 8087, but not on its successors, therefore.

FWIW:
I think you meant to say that the projective infinity was dropped. The
affine infinities were retained in IEEE 754.

David

John Savard

Mar 17, 2005, 1:59:51 PM
Steve Richfie1d <St...@NOSPAM.smart-life.net> wrote in message news:<1110910906.bf168b09db3256f249d01177da071767@teranews>...

> Perhaps we should be borrowing from
> each other's kitchen sinks?!

At first, I thought that there wasn't that much in common. I was
providing an architecture which included a number of features that
were on old computers and which could be useful for efficiency on some
problems, but, not claiming any expertise in numerical analysis, the
upper limit to any features in that respect which I provided was IEEE
754 - and even that I provided only for compatibility.

Yes, I did include my own conceit of "extremely gradual underflow",
but that was to extend numeric range.

But I am now making *major revisions* to my architecture, thanks to
one of your posts. I hadn't realized that unnormalized floating-point
arithmetic could be used to allow multi-precision floating-point
arithmetic; it wasn't on a 360, but I can add "multiply extensibly
unnormalized" and "divide extensibly unnormalized" instructions to
make the necessary additional information visible when required; and I
had forgotten that unnormalized floating-point was useful for keeping
track of significance - and is NOT provided with IEEE 754, since that
format suppresses the first bit.
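To sketch why skipping the normalization step preserves that
information (a toy model, not the 360's arithmetic or the proposed
instructions - names invented, nonnegative operands assumed):

#include <stdio.h>
#include <stdint.h>

/* Toy unnormalized addition: value = mantissa * 2^exp. The result is
   deliberately NOT normalized, so its exponent records the scale of
   the last trustworthy bit. */
typedef struct {
    int64_t mantissa;
    int     exp;    /* weight of the least significant mantissa bit */
} unnorm;

static unnorm unnorm_add(unnorm a, unnorm b)
{
    unnorm r;
    if (a.exp < b.exp) { unnorm t = a; a = b; b = t; }
    /* align to the COARSER operand; bits below its scale are dropped,
       which is exactly the significance information we want kept */
    r.exp = a.exp;
    r.mantissa = a.mantissa + (b.mantissa >> (a.exp - b.exp));
    return r;       /* no renormalization step */
}

int main(void)
{
    unnorm a = { 12345, 0 };   /* known to the units place   */
    unnorm b = {    67, 4 };   /* 67 * 16, known only to 16s */
    unnorm s = unnorm_add(a, b);
    printf("%lld * 2^%d\n", (long long)s.mantissa, s.exp);
    return 0;
}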

The problem is that I have been using most of the available opcode
space; but I can add some additional "short page" modes with plenty of
opcode space, and make suitable revisions to opcode translation, and
have room for unnormalized floating-point instructions. They won't
work with all the defined floating-point modes (they are incompatible
with suppressing the first bit of the mantissa, and they are also
incompatible with Extremely Gradual Underflow since denormalization
affects a number's value when that is used), so that will be
explained.

John Savard

Everett M. Greene

Mar 17, 2005, 2:52:47 PM
Steve <St...@NOSPAM.smart-life.net> writes:
>
> > This is an interesting idea. Why is the committee dealing with IEEE
> > 754 ignoring it?
>
> NIH = Not Invented Here. About a decade ago I was approached to
> coordinate the IEEE FP effort. My response: "GREAT, now I can fix all of
> the problems it has!" "What problems?" The next half hour was taken up
> with my explanation of its many problems. That was the end of THAT
> discussion. I probably should have just kept quiet until I had the position.
[snip]

> Indeed, much of the present '754 is via "proof by authority" of the
> person now commonly referred to as the "Father of IEEE FP", when in fact
> even HE lacked the skills to do a good job of this, to the point of
> leaving no smooth path for extension.
>
> As I see it, the fundamental process is broken with little prospect for
> broad-spectrum repair. Even PhD /Math people lack the skills needed to
> participate usefully in this discussion. The ONLY hope I see is for some
> competent person in Sun or Intel to "see the light" and run with this.

How about enlightening the great unwashed masses as to
the nature of the deficiencies?

Of course, this may be difficult since you say that there
is almost nobody (besides you) who can understand the issues...

Steve Richfie1d

Mar 17, 2005, 11:48:44 PM
Everett,

> How about enlightening the great unwashed masses as to
> the nature of the deficiencies?

Pretty much all of the features of <http://www.smart-life.net/FP>

> Of course, this may be difficult since you say that there
> is almost nobody (besides you) who can understand the issues...

I too am scrambling, as is everyone else here. This stuff is MUCH
tougher than anyone on the '754 committee yet realizes. My point was
that there are no saviors here, that people betting on anyone (including
me) because of their credentials or whatever are in for a bad result.

My only "edge" here is that I was VP of a development team that
constructed a full significance implementation at Remote Time Sharing
Corp. that saw two years of commercial time sharing service as the ONLY
arithmetic on their computer WAY back in the early 1970s. We had
considered some of these same issues then, like possibly including
dimensionality, but made different decisions then due to different
design constraints (e.g. a 16-bit ALU, a 1 MHz clock, and a 64 KB RAM).
I was also the primary support person there during the two years of
commercial service after it was implemented. This puts me WAY ahead in
general familiarity with the many subtle issues and practical ways of
dealing with common problems.

However, good math obviously beats bad experience, so if this technology
is to get dusted off here and now and spread across the world, then
there are lots of "little" details that we should get REALLY right. Some
of the participants here individually have more experience with the
subtleties of numerical analysis than Remote Time Sharing Corp did
during its entire existence, and certainly have more such experience
than I do.

This can only work through a consensus process, where we beat on the
issues until we all sing the same tune. To illustrate, as I pointed out
earlier, it wasn't until this week that I FINALLY realized how important
propagating dimensionality was to assuring that significance was being
computed correctly. This came from analyzing some good examples provided
by others here of computations that clearly produce the wrong
significance for their result. I had been assuming all along that
significance and dimensionality were separate issues, when they clearly
are NOT separate issues. Things that screw up significance also screw up
dimensionality. Significance screwups are silent, while dimensionality
screwups are detected immediately. Hence, you compute dimensionality NOT
because you are interested in the result (which would probably be
discarded), but just to generate faults when it gets messed up, e.g.
attempting to add, subtract, or compare quantities with different
dimensionality.
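To illustrate with a toy (mine, not the Remote Time Sharing design):
carry dimension exponents beside each value solely so that a mismatched
add, subtract, or compare faults at once.

#include <stdio.h>
#include <stdlib.h>

/* Each quantity carries exponents of meter, kilogram, second. The
   dimensions are computed only to detect misuse, as described above. */
typedef struct {
    double value;
    int m, kg, s;
} quantity;

static quantity q_add(quantity a, quantity b)
{
    if (a.m != b.m || a.kg != b.kg || a.s != b.s) {
        fprintf(stderr, "dimensionality fault on add\n");
        exit(EXIT_FAILURE);      /* the immediate "screwup" detector */
    }
    a.value += b.value;
    return a;
}

static quantity q_mul(quantity a, quantity b)
{
    quantity r = { a.value * b.value,
                   a.m + b.m, a.kg + b.kg, a.s + b.s };
    return r;                    /* dimensions combine silently */
}

int main(void)
{
    quantity dist  = { 100.0,    1, 0,  0 };  /* 100 m       */
    quantity inv_t = { 1 / 9.58, 0, 0, -1 };  /* 1/9.58 s^-1 */
    quantity speed = q_mul(dist, inv_t);
    printf("speed = %g m/s\n", speed.value);
    q_add(dist, speed);          /* faults: m vs m/s */
    return 0;
}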

Steve Richfie1d

Steve Richfie1d

Mar 17, 2005, 11:55:13 PM
John,

> The problem is that I have been using most of the available opcode
> space ...

My theory is that an add should add, whether the arguments are integers,
'754 numbers, unnormalized numbers, etc. You might even want to throw in
string concatenation! As the old Burroughs mainframes did, the typing
information belongs in the numbers and NOT in the opcodes.
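A bare-bones sketch of the idea (illustrative only - this is not the
Burroughs word format, and the tags and names are invented): one
generic add, with the tags carried in the operands deciding what it
means.

#include <stdio.h>

typedef enum { TAG_INT, TAG_FLT } tag;

typedef struct {
    tag t;                         /* the type lives in the datum */
    union { long i; double f; } v;
} cell;

/* One "add" opcode; the operand tags select integer add, floating
   add, or integer-to-floating promotion. */
static cell add(cell a, cell b)
{
    cell r;
    if (a.t == TAG_INT && b.t == TAG_INT) {
        r.t = TAG_INT;
        r.v.i = a.v.i + b.v.i;
    } else {
        double x = (a.t == TAG_INT) ? (double)a.v.i : a.v.f;
        double y = (b.t == TAG_INT) ? (double)b.v.i : b.v.f;
        r.t = TAG_FLT;
        r.v.f = x + y;
    }
    return r;
}

int main(void)
{
    cell a = { TAG_INT, { .i = 2 } };
    cell b = { TAG_FLT, { .f = 0.5 } };
    printf("%g\n", add(a, b).v.f);   /* 2.5, via tag-driven promotion */
    return 0;
}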

Steve Richfie1d

Terje Mathisen

Mar 18, 2005, 1:59:16 AM
Steve Richfie1d wrote:

Uff da!

Late binding at this level would automatically add at least one extra
cycle for all in-memory operands, besides requiring extra tag bits both
in memory and registers.

These delays would be there even in all the instances where a compiler
could prove that the actual type was known (pretty much everywhere when
you're writing in C, Pascal, Fortran etc.)

OTOH, as long as you're designing something just for fun, it really
doesn't matter.

Terje

--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

Steve Richfie1d

Mar 18, 2005, 1:28:38 AM
Terje,

>> My theory is that an add should add, whether the arguments are
>> integers, '754 numbers, unnormalized numbers, etc. You might even want
>> to throw in string concatenation! As the old Burroughs mainframes did,
>> the typing information belongs in the numbers and NOT in the opcodes.
>
>
> Uff da!
>
> Late binding at this level would automatically add at least one extra
> cycle for all in-memory operands, besides requiring extra tag bits both
> in memory and registers.

It really depends on the details of the respective representations. If
you have a bit somewhere that says whether it is an integer or FP and
the two are handled very differently, then you are right. However, this
doesn't have to be the case.

Suppose, for example, that the exponent is biased such that with a zero
exponent the value in the mantissa is interpreted as the integer it
plainly is; then integers would simply be a type of unnormalized FP.
This is basically what Burroughs did in their mainframes back in the
1960s and 1970s. Now, add a bit to indicate whether the mantissa is
precise (a la '754's stated presumption) or whether it is +/- 1/2 LSB,
and you have an integer representation that will cost NO additional
time over what would normally be required for FP, except that it would
have to be normalized when adding an FP number to it to keep the
significance correct - a problem that Burroughs didn't deal with,
because theirs was not a significance system.
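A sketch of that reading, with a made-up encoding (not the actual
Burroughs word layout; the bias value is hypothetical): when the
exponent field sits at its bias, the mantissa simply IS the integer,
so no conversion step exists.

#include <stdio.h>
#include <stdint.h>
#include <math.h>

#define EXP_BIAS 0x40    /* hypothetical bias for this sketch */

/* value = mantissa * 2^(exp_field - EXP_BIAS); an exponent field of
   EXP_BIAS leaves the mantissa unscaled, so every integer is already
   a valid (unnormalized) floating operand. */
static double decode(unsigned exp_field, int64_t mantissa)
{
    return (double)mantissa * pow(2.0, (int)exp_field - EXP_BIAS);
}

int main(void)
{
    printf("%g\n", decode(EXP_BIAS,     42));   /* the integer 42 */
    printf("%g\n", decode(EXP_BIAS + 3, 42));   /* 42 * 2^3 = 336 */
    return 0;
}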

> These delays would be there even in all the instances where a compiler
> could prove that the actual type was known (pretty much everywhere when
> you're writing in C, Pascal, Fortran etc.)

The concept of "type" was obsolete before C was ever even conceived, as
the Burroughs design pretty clearly demonstrates! Types are useful for
programming small antique microcomputers but have no place in modern
CPUs, except for the fact that there is still a large dose of antiquity
in "modern" CPUs.

> OTOH, as long as you're designing something just for fun, it really
> doesn't matter.

It is only "just for fun" until it develops beyond current systems, then
it becomes a proposal for people to throw money at.

Steve Richfie1d

Herman Rubin

Mar 18, 2005, 10:18:33 AM
In article <1111127346.e312902f1383c672153a3a1d4f04eead@teranews>,

It is not necessary that the entire opcode be decoded
before the instruction is issued. One of the oddest
I have seen was GEORGE, in which there was a 16-bit
opcode: four of the bits were used by the instruction
decoder, another four by the unit to which the code
was issued, and the other 8 almost bit by bit by the
central processor, with only a small connection to
the rest of the opcode.

As someone who has made good use of strings of bits in more
than one way, I must disagree with the idea of the type
being in the "numbers". However, it should be that way in
ASSEMBLER code, with type override.

For example, most computers have a subtract operator,
which I would like to write as x = y - z, with the
assembler putting in the appropriate instruction,
which is usually now something like SUBPQR u v w,
where PQR denotes the type, and u v w are the arguments
and the result in SOME order. I have not seen x in
the middle, but I have seen three of the four other
orders.

Now if one wants to have the machine treat the words
as of another type, it becomes x =(type) y - z; this
does not convert the strings of bits, as C does, but
uses the bit strings as stated.


--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Tom Linden

Mar 18, 2005, 11:08:31 AM
On 18 Mar 2005 10:18:33 -0500, Herman Rubin <hru...@odds.stat.purdue.edu>
wrote:

Compilers are very good at keeping track of type information and
issuing the appropriate instruction. A type-tagged architecture
is both unnecessary and an inefficient use of hardware. After all, I
don't think there is a dynamic need; these are known at compile time.
Of course, some compilers are better at this than others, but if you
think adding a tagged architecture is a good idea for the language you
are using, then you are using the wrong language. This is a very
bad idea. What do you do for promotion?


Andy Freeman

Mar 18, 2005, 11:55:48 AM
Tom Linden wrote:
> Compilers are very good at keeping track of type information and
> issuing the appropriate instruction. A type-tagged architecture
> is both unnecessary and an inefficient use of hardware. After all, I
> don't think there is a dynamic need; these are known at compile time.

Representations (they're not actually types) are OFTEN known at compile
time. It depends on the language.

Moreover, hardware efficiency does not imply problem solving
efficiency. Heck, maximal use doesn't even imply hardware speed, and
redundancy can help. (Cache bits are copies and thus unnecessary and
they're "inefficient" because doubling the number doesn't double
performance, but they're still cost-effective.)

-andy

Herman Rubin

Mar 18, 2005, 12:15:25 PM

I agree with this; it is Steve Richfield who suggested
this. However, I do not think compilers are good at
anything except the type of applications considered by
the compiler writers, and often even this is not so.

>After all, I
>don't think there is a dynamic need; these are known at compile time.
>Of course, some compilers are better at this than others, but if you
>think adding a tagged architecture is a good idea for the language you
>are using, then you are using the wrong language.

There is rarely a right language. A right language
would allow the intelligent user to efficiently make
use of the power of the hardware, and this REQUIRES
the use of machine instructions not as the compiler
sees them.

>This is a very
>bad idea. What do you do for promotion?

At the assembler level, it is a separate operation.
This is another bad feature of many machines; conversions
between integer and floating are slow and clumsy. They
should be one-cycle register-register instructions.

Brian Inglis

Mar 18, 2005, 1:44:48 PM
On 18 Mar 2005 12:15:25 -0500 in alt.folklore.computers,
hru...@odds.stat.purdue.edu (Herman Rubin) wrote:

Obviously the architects don't consider the operation sufficiently
common to warrant adding a data path between the register files;
perhaps storing to memory and reloading, via cache, is almost as fast.

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Brian....@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

Herman Rubin

Mar 18, 2005, 4:08:47 PM
In article <ue7m31hifbui4j6ga...@4ax.com>,

Brian Inglis <Brian....@SystematicSW.ab.ca> wrote:
>On 18 Mar 2005 12:15:25 -0500 in alt.folklore.computers,
>hru...@odds.stat.purdue.edu (Herman Rubin) wrote:

>>In article <opsnucghapzgicya@hyrrokkin>, Tom Linden <t...@kednos.com> wrote:

>>>This is a very
>>>bad idea. What do you do for promotion?

>>At the assembler level, it is a separate operation.
>>This is another bad feature of many machines; conversions
>>between integer and floating are slow and clumsy. They
>>should be one-cycle register-register instructions.

>Obviously the architects don't consider the operation sufficiently
>common to warrant adding a data path between the register files;
>perhaps storing to memory and reloading, via cache, is almost as fast.

It is hard to see how it could possibly be. Converting
an integer in an integer register to floating takes
at least an integer operation (possibly more), an
integer store, a floating load, and a floating operation.

Floating to integer can be worse. The later versions of
POWER have a store integer part of floating to memory.

Steve Richfie1d

Mar 18, 2005, 6:10:23 PM
Tom,

> Compilers are very good at keeping track of type information and
> issuing the appropriate instruction. A type-tagged architecture
> is both unnecessary and inefficient use of hardware. Afterall, I
> don't think there is a dynamic need, these are known at compile-time.
> Of course, some compilers are better at this than others, but if you
> think adding tagged architecture is a good idea for the language you
> are using, then you are using the wrong language. This is a very
> bad idea. What do you do for promotion?

I have LOTS of experience with Basic/VB (Visual Basic) where the
"variant" type is a mainstay. They literally have tens of thousands of
NaN types, not to mention a broad spectrum of object types, numerical
types, etc.

Microsoft's VB considers erroneous results to be their own type, with
associated error codes, descriptive strings assembled by the OS to
explain the problem, etc. In short, this is *WAY* beyond the limited NaN
concept of '754.

They also have some semi-numeric data types, like Date/Time, which is
the number of days as a FP quantity since some time in the Dark Ages,
with fractions indicating fractions of a day. You can perform arithmetic
on these quantities as though they were DP FP, but for I/O they are
dates and times. Money is another semi-numeric data type.

Following your argument leads to a tight place: Do you handle these
semi-numeric types as ordinary numbers and lose their special nature, do
you insist on new operations for every data type leading to a REALLY
cumbersome design when you start looking at the full matrix of potential
conversions, or do you embed the type into the quantity itself?

On most systems, promotion is handled according to a table of potential
promotions that the compiler or run-time system designer has built into
their code. Some promotions end up being the result of several smaller
promotions.

An interesting case: In VB, where the programmer has said to add two
strings, the run-time system examines the strings to see if they can
possibly be interpreted as numbers. If so, the numbers are added and the
result is a number that may then have to be promoted to a string or
whatever the result of the addition is stated to be. If they are NOT
interpretable as numbers, then the two strings are concatenated. All
very cute, but in my years of programming on systems with this feature,
I have yet to find a good use for it ;)
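The rule reduces to something like this toy C version (a paraphrase of
the behavior described above, not Microsoft's actual runtime):

#include <stdio.h>
#include <stdlib.h>

/* VB-style "+": if both strings parse fully as numbers, add them;
   otherwise concatenate. */
static void plus(const char *a, const char *b, char *out, size_t n)
{
    char *enda, *endb;
    double x = strtod(a, &enda);
    double y = strtod(b, &endb);
    if (enda != a && *enda == '\0' && endb != b && *endb == '\0')
        snprintf(out, n, "%g", x + y);    /* both numeric: add */
    else
        snprintf(out, n, "%s%s", a, b);   /* otherwise: concatenate */
}

int main(void)
{
    char r[64];
    plus("1", "5", r, sizeof r); printf("%s\n", r);   /* "6"  */
    plus("1", "a", r, sizeof r); printf("%s\n", r);   /* "1a" */
    return 0;
}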

Steve Richfie1d

Tom Linden

Mar 18, 2005, 7:59:06 PM

Maybe the problem is a sloppy language spec? I can see no really good
reason not to tighten the range of possibilities. For example, PL/I and
Cobol have picture data types which allow you to use character
representations of numbers, so if you try to represent numbers as
character strings there has to be a set of rules to determine
behaviour. In the example below, the compiler warns you that you may
not know what you are doing; in the third addition, the conversion of
the string blows up because it isn't valid for conversion to the type
implied by the expression.

FREJA> pli pr
FREJA> creat pr.pli
tt: proc options(main);
dcl a fixed dec(10);
dcl f float bin(53);
a = 1 + '5';
put skip list(a);
f = 1.0 + '5.0';
put skip list(f);
f = 1.0 + '5.fluff';
end;
*EXIT*
FREJA> pli pr

a = 1 + '5';
........^
%PLIG-W-NOTARITH, Implicit conversion. A nonarithmetic expression,
a constant '5', has been used in a context
requiring an arithmetic value.
at line number 4 in file SYS$SYSDEVICE:[TOM.SCRATCH]PR.PLI;4

f = 1.0 + '5.0';
..........^
%PLIG-W-NOTARITH, Implicit conversion. A nonarithmetic expression,
a constant '5.0', has been used in a context
requiring an arithmetic value.
at line number 6 in file SYS$SYSDEVICE:[TOM.SCRATCH]PR.PLI;4

f = 1.0 + '5.fluff';
..........^
%PLIG-W-NOTARITH, Implicit conversion. A nonarithmetic expression,
a constant '5.fluff', has been used in a context
requiring an arithmetic value.
at line number 8 in file SYS$SYSDEVICE:[TOM.SCRATCH]PR.PLI;4

%PLIG-I-SUMMARY, Completed with 0 errors, 3 warnings, and
0 informational messages.
FREJA> link pr
%LINK-W-WRNERS, compilation warnings
in module TT file SYS$SYSDEVICE:[TOM.SCRATCH]PR.OBJ;4
FREJA> run pr

6
6.000000000000000E+00
%PLI-F-ERROR, PL/I ERROR condition.
-PLI-F-CONVERSION, PL/I CONVERSION condition.
-PLI-I-ONSOURCE, The conversion source is '5.fluff'.
-PLI-I-ONCNVPOS, The erroneous character is at position 3.
%TRACE-F-TRACEBACK, symbolic stack dump follows
image module routine line rel PC abs PC
DPLI$RTLSHR 0 000000000006ADA4
00000000000ACDA4
DPLI$RTLSHR 0 000000000005413C
000000000009613C
DPLI$RTLSHR 0 0000000000054014
0000000000096014
DPLI$RTLSHR DPLI_UTL_MAIN_HANDLER DPLI$UTL_MAIN_HANDLER
2211 0000000000000028
00000000000B1DF8
----- above condition handler called with exception 001E8324:
%PLI-F-CONVERSION, PL/I CONVERSION condition.
-PLI-I-ONSOURCE, The conversion source is '5.fluff'.
-PLI-I-ONCNVPOS, The erroneous character is at position 3.
----- end of exception message
0 FFFFFFFF800A1E5C
FFFFFFFF800A1E5C
DPLI$RTLSHR 0 000000000006A04C
00000000000AC04C
DPLI$RTLSHR 0 0000000000052294
0000000000094294
PR TT TT 8 00000000000002E0
00000000000302E0
0 FFFFFFFF802553D4
FFFFFFFF802553D4

>
> On most systems, promotion is handled according to a table of potential
> promotions that the compiler or run-time system designer has built into
> their code. Some promotions end up being the result of several smaller
> promotions.
>
> An interesting case: In VB, where the programmer has said to add two
> strings, the run-time system examines the strings to see if they can
> possibly be interpreted as numbers. If so, the numbers are added and the
> result is a number that may then have to be promoted to a string or
> whatever the result of the addition is stated to be. If they are NOT
> interpretable as numbers, then the two strings are concatenated. All
> very cute, but in my years of programming on systems with this feature,
> I have yet to find a good use for it ;)

Overloading + with concatenation is silly; it could have been a typo.
>
> Steve Richfie1d

Steve Richfie1d

Mar 18, 2005, 6:47:08 PM
Herman, et al,

>>Compilers are very good at keeping track of type information and
>>issuing the appropriate instruction. A type-tagged architecture
>>is both unnecessary and inefficient use of hardware.
>

> I agree with this; it is Steve Richfie1d who suggested
> this.

Point of clarification. My '754 replacement proposal does NOT include
implicit typing. Maybe in another decade or two ...

I think Burroughs had it right, but the arguments are really complex.
The sum total was that their system was virtually crash-proof - the
things that normally crash computers had no way of even being stated in
their architecture. You can't really argue the issues one-by-one, but
rather they must be taken as a whole. For example, to compute a
subscripted location, you did not (and indeed could not) simply add the
offset to the origin of the array. Instead, you performed an array
access via an array descriptor that was in memory protected from your
potential modification. Full subscript checking was a fully automatic
part of these operations. These sorts of complex operations did NOT slow
these computers down at all, as they had the additional hardware to
perform all of the extra work in parallel.
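In software terms, the contract looks something like this sketch (on
the Burroughs machines the check was done in hardware, in parallel with
the access; the C names here are invented):

#include <stdio.h>
#include <stdlib.h>

/* All indexing goes through a descriptor; the bounds check is
   inseparable from forming the address. */
typedef struct {
    double *base;
    size_t  length;
} descriptor;

static double *index_via(descriptor d, size_t i)
{
    if (i >= d.length) {
        fprintf(stderr, "subscript fault: %zu >= %zu\n", i, d.length);
        exit(EXIT_FAILURE);
    }
    return d.base + i;    /* the only way to reach the array */
}

int main(void)
{
    double data[4] = { 1, 2, 3, 4 };
    descriptor d = { data, 4 };
    printf("%g\n", *index_via(d, 2));   /* in bounds: fine */
    *index_via(d, 9) = 0;               /* faults instead of clobbering */
    return 0;
}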

The mess we are now in came from two sources, early micros where there
was a severely restricted supply of transistors, and early
supercomputers where the transistors themselves were very expensive,
e.g. the hand wiring of a CRAY. Until we shed this legacy, computers
will continue to be unreliable contraptions, with people arguing over
how things "should" be to save a cycle here and there. How many cycles
are lost when your system crashes? How many cycles are lost because you
are unemployed because of uncontrolled outsourcing?

> There is rarely a right language. A right language
> would allow the intelligent user to efficiently make
> use of the power of the hardware, and this REQUIRES
> the use of machine instructions not as the compiler
> sees them.

The perfect optimizer theory: With a perfect optimizer, it doesn't make
any difference exactly how you write your program; the binary comes out
the same.

Compiler complexity goes as the SQUARE of the number of syntax and
semantic features, because the writer must consider all possible
interactions.

Put together, this means that languages should be SIMPLE like FORTRAN
rather than C/C++, but with an optimizer that recognizes cumbersome
code doing simple things and replaces it with the appropriate machine
instruction(s) to do what is wanted.

Steve Richfie1d

Trevor L. Jackson, III

Mar 18, 2005, 8:40:54 PM
Steve Richfie1d wrote:
> Tom,
>
>> Compilers are very good at keeping track of type information and
>> issuing the appropriate instruction. A type-tagged architecture
>> is both unnecessary and inefficient use of hardware. Afterall, I
>> don't think there is a dynamic need, these are known at compile-time.
>> Of course, some compilers are better at this than others, but if you
>> think adding tagged architecture is a good idea for the language you
>> are using, then you are using the wrong language. This is a very
>> bad idea. What do you do for promotion?
>
>
> I have LOTS of experience with Basic/VB (Visual Basic) where the
> "variant" type is a mainstay. They literally have tens of thousands of
> NaN types, not to mention a broad spectrum of object types, numerical
> types, etc.
>
> Microsoft's VB considers erroneous results to be their own type, with
> associated error codes, descriptive strings assembled by the OS to
> explain the problem, etc. In short, this is *WAY* beyond the limited NaN
> concept of '754.

It is probably built on top of the '754 NaNs. There are up to ~1e18 of
them, after all.

>
> They also have some semi-numeric data types, like Date/Time, which is
> the number of days as a FP quantity since some time in the Dark Ages,
> with fractions indicating fractions of a day. You can perform arithmetic
> on these quantities as though they were DP FP, but for I/O they are
> dates and times.

That's a Lotus standard. It has roots in VisiCalc and probably prior to
that.

/tj3

Trevor L. Jackson, III

Mar 18, 2005, 8:44:26 PM
Steve Richfie1d wrote:

That leads to the conclusion that we should all be doing arithmetic in
Lisp. It has a DWIM (do what I mean) primitive which is lacking in
other languages.

Not.

/tj3

Steve Richfie1d

Mar 18, 2005, 8:14:52 PM
Trevor,

> That leads to the conclusion that we should all be doing arithmetic in
> Lisp. It has a DWIM (do what I mean) primitive which is lacking in
> other languages

When I was maintaining the back end to CDC's Supercomputer FORTRAN
compiler, I pointed out that they had probably put more work into the
compiler than it would take to hand-compile all of the programs that it
would ever see! This suggests some sort of macro approach, where people
hand-code how to translate statement sequences into machine code. They
were too far down the road they were on to turn around, but this
approach might be the best one for an entirely new hyper-complex
architecture like the 205.

Steve Richfie1d

Tom Linden

Mar 18, 2005, 9:58:46 PM

No, what it suggests is that it was not well-managed.
>
> Steve Richfie1d

Steve Richfie1d

Mar 18, 2005, 8:43:21 PM
Tom,

>> When I was maintaining the back end to CDC's Supercomputer FORTRAN
>> compiler, I pointed out that they had probably put more work into the
>> compiler than it would take to hand-compile all of the programs that
>> it would ever see! This suggests some sort of macro approach, where
>> people hand-code how to translate statement sequences into machine
>> code. They were too far down the road they were on to turn around,
>> but this approach might be the best one for an entirely new
>> hyper-complex architecture like the 205.
>
>
> No, what it suggests is that it was not well-managed.

Yea, there was some of that too. It was one gigantic module - the
assembly language listings if stacked one on top of the other would make
a pile about 8 feet tall. Imagine trying to chase down what was
clobbering some location in memory in such a pile! A little more than
half was in the optimizer and vectorizer, where bugs typically took
about 2 weeks each to find and fix.

Steve Richfie1d

Tom Linden

Mar 18, 2005, 10:46:18 PM

I am glad it wasn't me:-) And you probably had to do it in C, poor sot:-)
>
> Steve Richfie1d

Steve Richfie1d

Mar 18, 2005, 11:40:53 PM
Tom,

>> Yea, there was some of that too. It was one gigantic module - the
>> assembly language listings if stacked one on top of the other would
>> make a pile about 8 feet tall. Imagine trying to chase down what was
>> clobbering some location in memory in such a pile! A little more than
>> half was in the optimizer and vectorizer, where bugs typically took
>> about 2 weeks each to find and fix.
>
>
> I am glad it wasn't me:-) And you probably had to do it in C, poor sot:-)

No, it was in Cyber 200 series supercomputer assembly language. It had
two kinds of closely coupled processors: one scalar processor, and
between one and four vector processors - it had to run on customers'
computers regardless of their configuration. Maybe 1% of the lines of
code were vector operations - if you got 'em, then use 'em. The essence
of the design of the compiler was to organize things so that you could
use vector operations on the tables.

I came into this late in the game. It was already operational and
working pretty well - with just enough bugs to keep 10 of us busy. The
others preferred working on the front end where most of the bugs were,
because you could actually find and fix a problem in a couple days.
However, the back end of the compiler was something else, and I DO so
enjoy puzzles.

Most of the bugs were situations where one optimization would break an
underlying assumption that another optimization was coded to rely on.
There weren't many simple programming errors.

Steve Richfie1d

Morten Reistad

Mar 19, 2005, 4:00:03 AM

They are half-way to what I used to call a "surface language":

This is a language that really is the bottom half of a large
application, where types, procedures and semantics are strongly
adapted to the matter at hand.

Think about accounting. A t-account can be a list-like structure, and
transactions another language type. All regulatory requirements of
logging and verifiability can be done at this language layer, making
it easy for _accountants_, not professional programmers, to write their
application on top.

I know PL/1 does type conversion with wild abandon.

Personally, I think they overdo it.

-- mrr


jmfb...@aol.com

Mar 19, 2005, 6:51:47 AM
In article <3a1jtnF...@individual.net>,

Steve Richfie1d <St...@NOSPAM.smart-life.net> wrote:
>Tom,
>
>>> When I was maintaining the back end to CDC's Supercomputer FORTRAN
>>> compiler, I pointed out that they had probably put more work into the
>>> compiler than it would take to hand-compile all of the programs that
>>> it would ever see! This suggests some sort of macro approach, where
>>> people hand-code how to translate statement sequences into machine
>>> code. They were too far down the road they were on to turn around,
>>> but this approach might be the best one for an entirely new
>>> hyper-complex architecture like the 205.
>>
>>
>> No, what it suggests is that it was not well-managed.
>
>Yea, there was some of that too. It was one gigantic module - the
>assembly language listings if stacked one on top of the other would make
>a pile about 8 feet tall. Imagine trying to chase down what was
>clobbering some location in memory in such a pile!

Then write an address break routine. It's better if you get
a hardware assist but you can do this with software.

<snip>

/BAH

Subtract a hundred and four for e-mail.

Toon Moene

Mar 19, 2005, 11:54:36 AM
Steve Richfie1d wrote:

> Most of the bugs were situations where one optimization would break an
> underlying assumption that another optimization was coded to rely on.
> There weren't many simple programming errors.

Sounds like GCC to me :-)

BTW, do you happen to know why the 205 software team wanted to use a
language to write the OS (VSOS) in (IMPL) that was so slightly different
from Fortran that we routinely ran into bugs caused by subtle
incompatibilities between the two languages ?

Why not simply use Fortran in the first place ?

--
Toon Moene - e-mail: to...@moene.indiv.nluug.nl - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
A maintainer of GNU Fortran 95: http://gcc.gnu.org/fortran/
News on GNU Fortran 95: http://gfortran.org/

Edward A. Feustel

Mar 19, 2005, 12:01:43 PM

"Steve Richfie1d" <St...@NOSPAM.smart-life.net> wrote in message
news:3a1d3qF...@individual.net...

> Point of clarification. My '754 replacement proposal does NOT include
> implicit typing. Maybe in another decade or two ...
>
> I think Burroughs had it right, but the arguments are really complex. The
> sum total was that their system was virtually crash-proof - the things
> that normally crash computers had no way of even being stated in their
> architecture. You can't really argue the issues one-by-one, but rather
> they must be taken as a whole. For example, to compute a subscripted
> location, you did not (and indeed could not) simply add the offset to the
> origin of the array. Instead, you performed an array access via an array
> descriptor that was in memory protected from your potential modification.
> Full subscript checking was a fully automatic part of these operations.
> These sorts of complex operations did NOT slow these computers down at
> all, as they had the additional hardware to perform all of the extra work
> in parallel.
>
> The mess we are now in came from two sources, early micros where there was
> a severely restricted supply of transistors, and early supercomputers
> where the transistors themselves were very expensive, e.g. the hand wiring
> of a CRAY. Until we shed this legacy, computers will continue to be
> unreliable contraptions, with people arguing over how things "should" be
> to save a cycle here and there. How many cycles are lost when your system
> crashes? How many cycles are lost because you are unemployed because of
> uncontrolled outsourcing?

I agree with you that many of the problems associated with computers and
debugging might be eliminated if data were either described by descriptors
or typed with tags, or both. I wrote a paper in 1973:

E.A. Feustel, "On the Advantages of Tagged Architectures", IEEE
Transactions on Computers, Vol. 22, pp. 644-652 (Jul 73),

in which I advanced a number of arguments for doing this and for providing
"Self-Describing Data". Note that XML seems to have taken up this idea for
data to be transferred between machines.

Unfortunately, life moved toward the C language, which has a basic notion
in conflict with typing: an address is just a number, and you can add an
integer to it and come up with another address which you can use. This
notion makes it quite difficult to implement on machines that know the
difference between addresses and numbers. (See Iliffe's The Basic
Language Machine book.)

If, further, as he describes, addresses represent either locations to
which you can transfer control or regions of storage that can be indexed,
you have gone a long way toward preventing buffer overflow and the
"hostile" takeover by a rogue program. There are also significant
possibilities for the use of parallelism and multiple functional units,
a la the CDC and Cray machines, and even restructurable computers:

S.S. Reddi and E.A. Feustel, "On Restructurable Computer Architectures",
IEEE Transactions on Computers, Vol. 27, pp. 1-20 (1978),

which received a Best Transactions Paper Prize from the Computer Society
for 1978.

It becomes interesting for machines to do "For All Elements" mathematics
using as much parallelism as they can muster. And if they know what to do
with "infinity", NaNs, etc., so much the better.

As has been pointed out, at the time, memory and peripheral storage were
perhaps the most expensive parts of computer systems, so there was
considerable economic justification for not making the change -- witness
the ICL 1900 instead of the practical implementation of Iliffe's design.
Logic was also limited. As I recall (perhaps incorrectly) someone said that
an IBM 360/50 was implemented with about 50K gates. And it was not fun to
build machines from "minute-scale-integration" CMOS or ECL! Think what could
be done with today's VLSI!

Now, I think that the big problem is operating systems. For better or for
worse, Unix/Linux and programs written in C or C-like derivatives with a
C-style (flat data) memory model have taken over and it is unlikely they
will be supplanted any time soon.

For a new architecture to take hold quickly, you need to have an "off the
shelf" OS and a large stock of programs to run on them -- hence the success
of SPARC, MIPS, and Power PC to name a few (let alone the Intel x86). It is
so much "easier" to evolve than to have a revolution to data flow or
something "more interesting" even if there can be significantly better
security, error detection, hardware optimization, etc.

As to languages that could benefit from machine types, take LISP, Icon,
and APL, to name a few; and for data, as mentioned, XML.

Ed Feustel



Anne & Lynn Wheeler

Mar 19, 2005, 12:25:05 PM
"Edward A. Feustel" <efeu...@direcway.com> writes:

> I agree with you that many of the problems associated with computers
> and debugging might be eliminated if data was either described by
> descriptors or typed with tags, or both. I wrote a paper in 1973:
>
> E.A. Feustel, "On the Advantages of Tagged Architectures",
>
> I.E.E.E. Transactions on Computers, Vol. 22, pp. 644-652(Jul 73).
>
> In which I advanced a number of arguments for doing this and for
> providing "Self-Describing Data". Note that XML seems to have taken
> up this idea for data to be transferred between machines.

fs (future system) was an effort that was targeted at all sorts of
complex hardware descriptors ... there was some analysis that the
worst case could have five levels of indirection (in the hardware) for
each argument. one of the nails in the fs coffin was some analysis
that if you took the highest performance technology available at the
time from 195s for building an FS machine, the resulting application
thruput would be about that of 370/145 (possibly 10:1 or 20:1
slow-down). in any case, fs was killed before even being announced
or publicized. specific refs:
http://www.garlic.com/~lynn/2000f.html#16 FS - IBM Future System

misc. other references
http://www.garlic.com/~lynn/subtopic.html#futuresys

to some extent, the start of 801/risc effort in the 70s was born to
try and do the exact opposite of the FS effort ... and all complexity
was moved to the (PL.8) compiler. misc. past 801
http://www.garlic.com/~lynn/subtopic.html#801

the precursor to XML was SGML
http://www.garlic.com/~lynn/subtopic.html#sgml

and before that was GML which was invented at the science center
in 1969:
http://www.garlic.com/~lynn/subtopic.html#545tech

g, m, & l are the last name initials of three of the people at the
science center ... but officially it was "generalized markup language"
... and an implementation was added to the cms "script" command for
document formatting (script had originally started off with dot-command
formatting controls) ... and gml allowed the separation/independence of
the specification of the document components from the specification of
the formatting of those components ... and some applications started
using the specification of the document components for things other
than formatting. however there were other efforts at the science center
along the lines of self-describing data ... one involved years of 7x24
performance monitoring data.

total trivia, the w3c offices are now only a couple blocks from the
old science center location at 545 tech sq.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Herman Rubin

Mar 19, 2005, 1:55:58 PM
In article <r4udnedEbYh...@comcast.com>,

>> Herman, et al,

Worse than that; while the complexity goes up as the square, the time
taken to take these into account goes up exponentially.

One thing I suggested was that the user supply a list of ways
of performing user-considered basics, and the compiler decide
among them according to the machine and even the program. This
is exponential.

>> Put together, this means that languages should be SIMPLE like FORTRAN
>> rather than C/C++, but with an optimizer to recognize cumbersome code to
>> do simple things that is replaced with the appropriate machine
>> instruction(s) to do what is wanted.

The optimizer should also take into account user-added
machine instructions, or procedures the compiler writer
did not take into account.

>That leads to the conclusion that we should all be doing arithmetic in
>Lisp. It has a DWIM (do what I mean) primitive which is lacking in
>other languages

Not with that syntax. For example, see the comment I
made about the syntax for machine instructions; I do not
think that he did a great job on this, but Seymour Cray
seems to be the best. I am unwilling to have a macro
begin with a macro name in use; this requires overloading,
and even user-assigned priority.

By the time those 50 additional characters have been typed,
there may well be several errors made.
>Not.

>/tj3

Nick Spalding

Mar 19, 2005, 3:05:40 PM
Toon Moene wrote, in <423c5752$0$22152$4d4e...@news.nl.uu.net>:

> Steve Richfie1d wrote:
>
> > Most of the bugs were situations where one optimization would break an
> > underlying assumption that another optimization was coded to rely on.
> > There weren't many simple programming errors.
>
> Sounds like GCC to me :-)
>
> BTW, do you happen to know why the 205 software team wanted to use a
> language to write the OS (VSOS) in (IMPL) that was so slightly different
> from Fortran that we routinely ran into bugs caused by subtle
> incompatibilities between the two languages ?
>
> Why not simply use Fortran in the first place ?

NIH!
--
Nick Spalding

iai...@truecircuits.com

Mar 19, 2005, 5:56:57 PM
Herman> conversions between integer and floating are slow
Herman> and clumsy. They should be one-cycle
Herman> register-register instructions.

Hmmm... There might be a kernel of truth in here. Care to tease it
out?

- Do calculations typically switch back-and-forth between
floating-point and integer operations?

I suspect going back and forth is rare in straight math. Logical ops
on FP numbers do not yield numbers, and integer adds and subtracts could
just as well be done as FP adds and subtracts. BUT:

- Are floating-point calculations often used to determine addressing
for load-store operations?

I suspect this happens frequently, or could happen frequently if the
speed path was there. Table look-ups, algorithms that deform and then
resample to a grid... I suspect there may even be many physics
algorithms that look a lot like texture mapping (multidimensional
interpolated table lookups).

So here's a question for you: what sort of machine instruction would be
useful to tell the CPU how to do a multidimensional interpolated table
lookup? Since it's probably got too many inputs, and maybe too many
outputs, and too many variations (dimensions, interpolation datatype,
interpolation scheme, etc), it should probably be factored into a few
ops. What are those ops?
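For the one-dimensional case, the operation being factored looks like
this (an illustration of the decomposition only, not a proposed
instruction): split into index and fraction, two loads, then a lerp.

#include <stdio.h>

/* 1-D interpolated table lookup over t[0..n-1], indexed by a
   floating-point coordinate x. */
static double table_lookup(const double *t, int n, double x)
{
    if (x <= 0.0)     return t[0];
    if (x >= n - 1.0) return t[n - 1];
    int    i = (int)x;    /* FP -> integer: the address computation */
    double f = x - i;     /* interpolation fraction */
    return t[i] + f * (t[i + 1] - t[i]);   /* lerp between samples */
}

int main(void)
{
    double coarse_sin[5] = { 0.0, 0.7071, 1.0, 0.7071, 0.0 };
    printf("%g\n", table_lookup(coarse_sin, 5, 1.5));  /* between samples */
    return 0;
}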

Trevor L. Jackson, III

Mar 19, 2005, 6:18:59 PM
That was a joke. Just like doing arithmetic in Lisp.

/tj3

Steve

Mar 19, 2005, 7:07:02 PM
We've been discussing types and where this information should be kept.
However, MOST of the computations done by computers are in computing
addresses, not values on the way to final results. Most of the time,
when you are dealing with integers, it is on the way to computing an
address.

I think it has become pretty obvious that leaving address computation to
programmers or compilers is a prescription for unreliable systems. Most
crashes are due to clobbering memory due to defective address
computation. Further, good design can have the address computation and
subscript checking done in parallel with other operations, whereas
making address computation just a part of integer operations forecloses
this valuable opportunity for simultaneous speed and run-time checking.

Hence, those who have been stating what "should" happen with mixed
types, e.g. leave them up to the compiler, are REALLY indirectly saying
to leave the present addressing mess as it is, when much better methods
are known. The ONLY way to be safe with an open language like C is to do
the checking at run time. To do this in a compiler requires going back
to the restrictive addressing of FORTRAN, and completely eliminating
pointers from the language.

Hence, rather than discussing what "should" happen, I think we need to
look at the possible choices and consider which is best:
1. Leave systems with the present defect that simple programming errors
like ++ing too far can clobber memory that is NOT what the programmer
thought he was working with (Change nothing).
2. Do the checking at compile time, which means that you cannot compile
C/C++ code on such a machine without risking #1 (Go back to COBOL and
FORTRAN).
3. REQUIRE that compilers include code to check ALL addressing at
run-time. This is monumentally difficult with C/C++ where it can be
pretty hard to determine what a pointer is pointing into. You only need
to check store operations to keep the systems from crashing, but you
also need to check the loads for security (The slowest possible approach).
4. Specially mark integers and addresses in some way so that
reasonableness of operation can be verified at run-time (What I was
suggesting).
5. Require that pointer and array operations go through a special
operator that simultaneously does the address computation and checks the
result (the Burroughs approach).

Have I missed any possibilities here?

The problem with the previous discussions was that there was an implied
false assumption that integers are usually used with themselves or FP,
when in fact they are often/usually used for address computation.

OK, take your choice. Which combination of features/faults looks best?

It seems obvious (to me) that a LOT more hardware smarts are in order to
do addressing correctly and efficiently, like #5 above which provides
both maximum speed and 100% checking. Whether these should be rolled up
into what started out as an FP spec is still open for debate. However,
this discussion DOES affect the operation of integers.

Hence, when saying how integers "should" be represented, please don't
forget to indicate your prejudice regarding the whole addressing issue
that integers are but a small part of.

Again, this whole discussion has NOTHING to do with my FP proposal, as
that ONLY covers FP, not integers and addresses (yet).

Steve Richfie1d

John Savard

Mar 19, 2005, 9:11:29 PM
Steve Richfie1d <St...@NOSPAM.smart-life.net> wrote in message news:<3a1d3qF...@individual.net>...

> The mess we are now in came from two sources, early micros where there
> was a severely restricted supply of transistors, and early
> supercomputers where the transistors themselves were very expensive,
> e.g. the hand wiring of a CRAY.

I'm not philosophically opposed to labelling types. It doesn't make
sense to interpret a floating-point number as a string of characters,
or as an integer.

But transistors are still expensive, and will continue to be until
there is no way to make a chip more powerful except by putting more
cores on it. Then, using many transistors to make the component
processors faster will be worth it, because not everything is
parallelizable.

The Burroughs machines were real computers, and part of my goal is to
illustrate existing computer architectural concepts.

But for now, my orientation is conventional - and flexibility and
performance are the main goals the architecture addresses. Not safety
or correctness. But I do have a hardware just-in-time compilation
assist, if you want P-code of any kind!

And I've finally updated the description - with changes to the
external vector coprocessing as well as including unnormalized
floating-point.

John Savard

John Savard

unread,
Mar 19, 2005, 9:18:37 PM3/19/05
to
Steve Richfie1d <St...@NOSPAM.smart-life.net> wrote in message news:<1111126955.1e89b2985f2074b0d64f86d305e4039e@teranews>...
> This can only work through a consensus process, where we beat on the
> issues until we all sing the same tune. To illustrate, as I pointed out
> earlier, it wasn't until this week that I FINALLY realized how important
> propagating dimensionality was to assuring that significance was being
> computed correctly. This came from analyzing some good examples provided
> by others here of computations that clearly produce the wrong
> significance for their result.

The only good example of that I can think of off the top of my head is
taking the logarithm of a number - it converts significant figures
(relative precision) into a fixed absolute magnitude of significance.

Incidentally, though, E = mc^2 is not dimensionally unsound, as you
appeared to have claimed in another post.

Energy is what is expended by moving a mass through a distance against
a force.

If you move something one meter against a constant pull of one
Newton, you expend one Joule, whatever its mass.

A Newton is the unit of force which accelerates a mass of one kilogram
by one meter per second squared.

It does work out.
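
In units: kg * (m/s)^2 = kg*m^2/s^2 = (kg*m/s^2)*m = N*m = J.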

John Savard

Christian Bau

unread,
Mar 20, 2005, 4:44:29 AM3/20/05
to
In article <3a1aupF...@individual.net>,
Steve Richfie1d <St...@NOSPAM.smart-life.net> wrote:

> An interesting case: In VB, where the programmer has said to add two
> strings, the run-time system examines the strings to see if they can
> possibly be interpreted as numbers. If so, the numbers are added and the
> result is a number that may then have to be promoted to a string or
> whatever the result of the addition is stated to be. If they are NOT
> interpretable as numbers, then the two strings are concatenated. All
> very cute, but in my years of programming on systems with this feature,

In my record collection, there are about half a dozen songs whose title
could be interpreted as a number: 39, 40, 1-1, 1-2, 2-1, 2-2, 99, 911,
1969, 1984. There are a few records with an album title that is a
number: 5, 2006, 90125, 85555, and a few others.

Now what do I have to do to make sure that song title and album title
are concatenated without problems?

> I have yet to find a good use for it ;)

Job security.

Trevor L. Jackson, III

unread,
Mar 20, 2005, 6:35:46 AM3/20/05
to
John Savard wrote:

> I'm not philosophically opposed to labelling types. It doesn't make
> sense to interpret a floating-point number as a string of characters,
> or as an integer.

Sometimes it does. Floating point types are text when NaNs have
embedded labels in them that describe the nature of the problem that
created them.

And IEEE-754 floating-point types are specifically designed to make it
possible to treat them as integers; that's why their binary
representations can be monotonic. Smart compilers on AMD/Intel machines
can perform comparisons on floating point values without using the FPU.
And in the case of stupid compilers programmers can perform the same
optimization.

/tj3

Christian Bau

unread,
Mar 20, 2005, 8:23:34 AM3/20/05
to
In article <KuOdncu56f9...@comcast.com>,

"Trevor L. Jackson, III" <tl...@comcast.net> wrote:

I think you will find that comparing floating point values by looking at
their bits only may be more difficult than you think. You have to worry
about NaNs, which you have to distinguish from Infinities; you have to
worry about +0 and -0, which have to compare equal; and of course IEEE
754 uses signed magnitude whereas most CPUs use 2s complement for
integers, so for negative numbers the comparison is exactly the other
way round.
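
For reference, the usual shape of the trick in C is the sketch below.
It is deliberately minimal and trips over exactly the caveats above:
NaNs end up sorting beyond the infinities instead of comparing
unordered, and -0.0 sorts strictly below +0.0 rather than equal to it.

    #include <stdint.h>
    #include <string.h>

    /* Map a float's bits to a key that orders like the float itself:
       flip all bits of negatives, set the sign bit of non-negatives. */
    static uint32_t float_key(float f)
    {
        uint32_t u;
        memcpy(&u, &f, sizeof u);   /* read the bits without punning */
        return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
    }

    int float_less(float a, float b)   /* a < b, with caveats above */
    {
        return float_key(a) < float_key(b);
    }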

Brian Inglis

unread,
Mar 20, 2005, 8:34:25 AM3/20/05
to
On Sun, 20 Mar 2005 06:35:46 -0500 in alt.folklore.computers, "Trevor

L. Jackson, III" <tl...@comcast.net> wrote:

Well written math library functions contain more integer ops than FP.

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Brian....@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

jmfb...@aol.com

unread,
Mar 20, 2005, 7:43:02 AM3/20/05
to
In article <1111273017....@l41g2000cwc.googlegroups.com>,

iai...@truecircuits.com wrote:
>Herman> conversions between integer and floating are slow
>Herman> and clumsy. They should be one-cycle
>Herman> register-register instructions.
>
>Hmmm... There might be a kernel of truth in here. Care to tease it
>out?
>
>- Do calculations typically switch back-and-forth between
>floating-point and integer operations?
>
>I suspect going back and forth is rare in straight math. Logical ops
>on FP numbers do not yield numbers, integer adds and subtracts could
>just as well be done as FP adds and subtracts. BUT:
>
>- Are floating-point calculations often used to determine addressing
>for load-store operations?
>
>I suspect this happens frequently, or could happen frequently if the
>speed path was there. Table look-ups, algorithms that deform and then
>resample to a grid... I suspect there may even be many physics
>algorithms that look a lot like texture mapping (multidimensional
>interpolated table lookups).
>
>So here's a question for you: what sort of machine instruction would be
>useful to tell the CPU how to do a multidimensional interpolated table
>lookup?

A machine architecture that has an indirect bit for effective
address calculation for all instructions.
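
For concreteness, the simplest (one-dimensional) case of the quoted
question looks something like this sketch in software today; the
hypothetical instruction would fuse the conversions, the address
computation, and the blend into one operation:

    /* 1-D linear interpolated table lookup; x assumed to lie in
       [0, table_length - 1), bounds checking omitted. */
    float interp1(const float *tab, float x)
    {
        int   i    = (int)x;          /* float -> int for addressing */
        float frac = x - (float)i;    /* int -> float for the blend */

        return tab[i] + frac * (tab[i + 1] - tab[i]);
    }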

Bernd Paysan

unread,
Mar 20, 2005, 8:38:26 AM3/20/05
to
iai...@truecircuits.com wrote:
> - Are floating-point calculations often used to determine addressing
> for load-store operations?

If you build your own division or square root by multiplication, the "bit
field of FP number gives address offset" operation can be quite frequent.
If you have a fdivapprox/fsqrtapprox instruction with a table in ROM and
hardcoded bit field extraction, it goes away.
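
In software today that looks roughly like the sketch below (assuming
IEEE-754 single precision; the table size and the seed_*/recip_* names
are invented for illustration):

    #include <math.h>
    #include <stdint.h>
    #include <string.h>

    #define SEED_BITS 6
    #define SEED_SIZE (1 << SEED_BITS)

    static float seed[SEED_SIZE];        /* seeds for 1/(1.f) */

    void seed_init(void)                 /* call once at startup */
    {
        int i;
        for (i = 0; i < SEED_SIZE; i++)
            seed[i] = 1.0f / (1.0f + (i + 0.5f) / SEED_SIZE);
    }

    float recip_approx(float x)          /* x positive, finite, normal */
    {
        uint32_t u;
        int e;
        unsigned i;
        float r;

        memcpy(&u, &x, sizeof u);
        e = (int)((u >> 23) & 0xFF) - 127;              /* exponent */
        i = (u >> (23 - SEED_BITS)) & (SEED_SIZE - 1);  /* mantissa bits */
        r = ldexpf(seed[i], -e);         /* table seed scaled by 2^-e */
        r = r * (2.0f - x * r);          /* Newton step: error squares */
        r = r * (2.0f - x * r);
        return r;
    }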

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

Tom Linden

unread,
Mar 20, 2005, 8:55:08 AM3/20/05
to

Depends on how frequently the construct is employed. Implementing
something like [i1][i2]...[i7](base reg) would probably give the CPU
designers a headache, but [i](br), where the compiler computes i from
the other indices, in my experience often allows for some optimization
in the form of code hoisting and common subexpression elimination, and
it probably is done with shadow registers to the induction variable.
Then again, it may not:-)

Time to revive the VAX and extend it to 64 bit (and big Endian:-) )

Tom Linden

unread,
Mar 20, 2005, 8:59:03 AM3/20/05
to

Trivial. Use VA-PL/I from IBM.
dcl song  char(100) varying,
    title char(100) varying,
    name  char(100) varying;
name = title || song;

Now that wasn't hard.

Peter Corlett

unread,
Mar 20, 2005, 9:08:47 AM3/20/05
to
John Savard <jsa...@ecn.ab.ca> wrote:
[...]

> I'm not philosophically opposed to labelling types. It doesn't make
> sense to interpret a floating-point number as a string of
> characters, or as an integer.

It sounds like you need a copy of "Hacker's Delight". It explains all
sorts of interesting tricks and algorithms to do with numbers on a
computer in a variety of representations: unsigned, twos-complement
and sign-magnitude binary integers, binary integers in bases e, i and
-2, Gray code, floating point, and 2D co-ordinates in a Hilbert curve.

For example, the trick where you treat an IEEE floating point as an
integer for comparisons is covered. The cases where it doesn't work
are shown and explained. Whether you choose to use the trick is thus
an engineering compromise, hopefully based on knowing that the input
values are in range or "erroneous" returns don't matter, rather than
"hey, wow, this is fast, who cares if the results are sometimes
broken"[0].

I don't believe it covers characters though. They're a bit too trivial
for a maths book :)

If you do want an example of mixing integers and characters, I did
once see a very cute nybble-to-ASCII algorithm for the 6502 that used
neither a lookup table nor a compare and branch. From faint memory,
you put your nybble in the accumulator. You'd switch to BCD mode, which
would cause a BCD overflow (adding 6 to the accumulator to normalise
it) and a carry set if the nybble was greater than 9. You'd then
add-with-carry 0x30. Hey presto, '0'..'9', 'A'..'F'.


[0] aka the PHP/MySQL approach.

--
PGP key ID E85DC776 - finger ab...@mooli.org.uk for full key

jmfb...@aol.com

unread,
Mar 20, 2005, 8:21:30 AM3/20/05
to
In article <423d83ef$0$38046$5a6a...@news.aaisp.net.uk>,

I have to ask. Was that a part of the architecture design
or a happy side effect that somebody discovered?

CBFalconer

unread,
Mar 20, 2005, 9:24:10 AM3/20/05
to
Christian Bau wrote:
>
... snip ...

>
> I think you will find that comparing floating point values by
> looking at their bits only may be more difficult than you think.
> You have to worry about NaN's, which you have to distinguish
> from Infinities, you have to worry about +0 and -0 which have
> to compare equal, and of course IEEE 754 uses signed magnitude
> whereas most CPUs use 2s complement for integers, so for
> negative numbers the comparison is exactly the other way round.

IF you use IEEE 754. You can also decide that there are no NANs,
infinities, denormals etc. and define 0.0 to be a zero exponent
field. Exit one problem. Then you can resolve magnitudes by
simply examining the exponent field. Diddle the result on the
basis of the signs, found in the significand field. If equal so
far, mask off the sign bit in the significand and compare the
remaining bits as an integer. Done, and simple.

You can also decide that any overflow or underflow in a FP value is
a serious error, and trap on it. Just as you really should on any
integer overflow. Not quite as user convenient, but we got rid of
an awful lot of code.

--
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare

CBFalconer

unread,
Mar 20, 2005, 9:47:13 AM3/20/05
to
Peter Corlett wrote:
>
... snip ...

>
> If you do want an example of mixing integers and characters, I did
> once see a very cute nybble-to-ASCII algorithm for the 6502 that
> used neither a lookup table, or a compare and branch. From faint
> memory, you put your nybble in the accumulator. You'd switch to BCD
> mode which would cause an BCD overflow (which would add 6 to the
> accumulator to normalise it) and a carry set if n was greater than
> 9. You'd then add-with-carry 0x30. Hey presto, '0'..'9', 'A'..'F'.

SOP for the 8080, z80, and even the 8086/80x86

;
; Convert (a) lsbits to hex char
; a,f
hexdig: ani 0fh    ; isolate the low nybble: 00h-0Fh
        adi 090h   ; 0-9 -> 90h-99h, 10-15 -> 9Ah-9Fh
        daa        ; 90h-99h unchanged; 9Ah-9Fh -> 00h-05h, carry set
        aci 040h   ; 0-9 -> D0h-D9h; 10-15 -> 41h-46h ('A'-'F')
        daa        ; D0h-D9h -> 30h-39h ('0'-'9'); 41h-46h unchanged
        ret

Trevor L. Jackson, III

unread,
Mar 20, 2005, 10:21:16 AM3/20/05
to
Christian Bau wrote:

No. All negative '754 values are less than all positive '754 values
when both are taken as integers. This issue is covered in some of the
documents from the mid-80's, so none of it is news.

The whole point of what I wrote is that it is not necessary to decode
the numeric values of '754 numbers in order to compare them. The only
thing you have to be careful of is NaNs, and that's not a big burden.

/tj3

Tom Linden

unread,
Mar 20, 2005, 10:06:44 AM3/20/05
to
On 20 Mar 2005 14:08:47 GMT, Peter Corlett <ab...@dopiaza.cabal.org.uk>
wrote:

That algorithm goes back to the 50's at Univac, IMMSMC. I used it 25
years or so ago for implementing packed decimal arithmetic.

John Savard

unread,
Mar 20, 2005, 12:09:44 PM3/20/05
to
Anne & Lynn Wheeler <ly...@garlic.com> wrote in message news:<m3mzszi...@lhwlinux.garlic.com>...

> in any case, fs was killed before even being announced

> or publicized,

Of course, IBM was watched so closely in those days that some word did
get out.

The death of that which was supposed to revolutionize the entire world
of computing, of course, must have come as a tremendous personal
embarrassment to one James Martin.

But many of the FS concepts did survive in the AS/400, of course.

John Savard

Rob Warnock

unread,
Mar 20, 2005, 10:23:53 PM3/20/05
to
<jmfb...@aol.com> wrote:
+---------------

| iai...@truecircuits.com wrote:
| >So here's a question for you: what sort of machine instruction would be
| >useful to tell the CPU how to do a multidimensional interpolated table
| >lookup?
|
| A machine architecture that has an indirect bit for effective
| address calculation for all instructions.
+---------------

Heh! Yes, I loved the PDP-10 too, Barb! But you forgot to mention that
the word pointed to by an indirection could *also* have an indirect bit
and an index register (as well as a new direct address), and so on,
which led to the Algol compiler [at least] being able to do multi-
dimensional array references with a single instruction! Well, after
the subscripts were put into registers, and assuming the Illiffe vectors
for the array had been set up (at compile time). E.g.:

foo[a,b,c,d,e] := foo[a,b,c,d,e] + 1;

might be compiled as:

...code to load T0 with a...
...code to load T1 with b...
...code to load T2 with c...
...code to load T3 with d...
...code to load T4 with e...
AOS @FOO(T0) ; increment foo[a,b,c,d,e]

Which [as Barb knows] worked because the "FOO" label pointed not
to the actual array, but to a vector of words containing indirect
bits and index register fields of "T1" and Y (direct address) fields
of the next-level Illiffe vectors, and so on. [Only the last-level
vectors lacked the indirect bits.]

The main problem with this was that the register numbers to be used
for indexing each level (array dimension) of a given array had to be
picked at compile time, which meant that at run time you had to have
*exactly* those registers available for the subscripted references
to that array. Made register allocation/packing "interesting"...


-Rob

-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607

John Savard

unread,
Mar 20, 2005, 11:53:48 PM3/20/05
to
Steve <St...@NOSPAM.smart-life.net> wrote in message news:<1111277353.db4bbc6caab704507f71ea2dea0b600c@teranews>...

> 2. Do the checking at compile time, which means that you cannot compile
> C/C++ code on such a machine without risking #1 (Go back to COBOL and
> FORTRAN).

> Have I missed any possibilities here?

I think your problem may be that you have included one possibility too
many.

If the problem is that people are using C and C++ when they *ought* to
be using FORTRAN, COBOL, or Pascal, or Modula-2, for that matter... it
is true that one way to solve the problem is to design the underlying
architecture *so that it is impossible to write a C compiler for it*
but that does not appear to be what you are intending.

The C language was intended, among other things, to serve as a
replacement for assembler language for systems programming. Hence, it
allows you to do bad, nasty things. A machine with tagged types _may_
have special instructions that let you look at the innards of a
floating-point number (for reasons pointed out by people who replied
to one of my posts) or it may not.

There are basically two possibilities. A C program that has a bug
because it runs on a machine without tagged types will either have the
same bug on a machine with tagged types - because the compiler will
use a special instruction to get around the protection - or it won't,
because it is only possible to implement a subset of C on that
machine.

John Savard

Brian Inglis

unread,
Mar 21, 2005, 5:31:31 AM3/21/05
to
On Sun, 20 Mar 2005 21:23:53 -0600 in alt.folklore.computers,
rp...@rpw3.org (Rob Warnock) wrote:

>the Illiffe vectors

>a vector of words containing indirect

>bits and index register fields and direct address fields


>of the next-level Illiffe vectors, and so on.

Almost a repost of your 20 year old article in net.lang.c!
In those 20 years plus, I've never seen a mention of an Iliffe vector.
Googling reveals it was supported by Atlas Autocode in the 1960s.
Any references to its invention, by J.K.Iliffe I presume?

Nick Maclaren

unread,
Mar 21, 2005, 5:40:02 AM3/21/05
to

In article <i14t31p62ju26m68q...@4ax.com>,

Brian Inglis <Brian....@SystematicSW.Invalid> writes:
|> On Sun, 20 Mar 2005 21:23:53 -0600 in alt.folklore.computers,
|> rp...@rpw3.org (Rob Warnock) wrote:
|>
|> >the Illiffe vectors
|>
|> >a vector of words containing indirect
|> >bits and index register fields and direct address fields
|> >of the next-level Illiffe vectors, and so on.
|>
|> Almost a repost of your 20 year old article in net.lang.c!
|> In those 20 years plus, I've never seen a mention of an Iliffe vector.

You haven't been paying proper attention to all of my posts :-)

It was before my time, but was the term that I first heard used
for the technique.

|> Googling reveals it was supported by Atlas Autocode in the 1960s.
|> Any references to its invention, by J.K.Iliffe I presume?

Probably but, given its era, documentation is likely to be sparse
or non-existent. It must be getting ripe for patenting ....


Regards,
Nick Maclaren.

CBFalconer

unread,
Mar 21, 2005, 7:27:05 AM3/21/05
to
Nick Maclaren wrote:
>
... snip ...

>
> Probably but, given its era, documentation is likely to be sparse
> or non-existent. It must be getting ripe for patenting ....

Shhh. Bill and cohorts are listening.

dcw

unread,
Mar 21, 2005, 7:37:20 AM3/21/05
to
In article <d1m8a2$q3o$1...@gemini.csx.cam.ac.uk>,

I just happen to have my "Atlas Autocode Reference Manual" (1965
edition) with me. It describes the use of "dope vectors" and
"Iliffe vectors".

The ICL 2900 used Iliffe vectors, but it seems that it called them
"dope vectors".

David

Tom Linden

unread,
Mar 21, 2005, 8:31:32 AM3/21/05
to

Are you sure they are synonyms? The term Dope Vector refers to
Descriptors, which appears to be a bit more general than Iliffe vector.
>
> David

dcw

unread,
Mar 21, 2005, 9:01:28 AM3/21/05
to

Not really synonyms, no. What the 2900 had seem more like what Atlas
Autocode called "Iliffe vectors", in that you had a hierarchy of them
pointing to the rows (and rows of rows, etc.) of multi-dimensional arrays,
but I agree that they also contained size information that would be in the
dope vectors of Atlas Autocode arrays. As I understand it, the 2900 "dope
vectors" were actually more like Iliffe's "codewords" than the Atlas
Autocode "Iliffe vectors" were.

David

jmfb...@aol.com

unread,
Mar 21, 2005, 8:39:14 AM3/21/05
to
In article <46-dnTstgrL...@speakeasy.net>,

rp...@rpw3.org (Rob Warnock) wrote:
><jmfb...@aol.com> wrote:
>+---------------
>| iai...@truecircuits.com wrote:
>| >So here's a question for you: what sort of machine instruction would be
>| >useful to tell the CPU how to do a multidimensional interpolated table
>| >lookup?
>|
>| A machine architecture that has an indirect bit for effective
>| address calculation for all instructions.
>+---------------
>
>Heh! Yes, I loved the PDP-10 too, Barb! But you forgot to mention that
>the word pointed to by an indirection could *also* have an indirect bit
>and an index register (as well as a new direct address), and so on,

Exactly! That's why you can do multi-dimensional. Oh...sorry.
I forgot the necessity of the index.

>which led to the Algol compiler [at least] being able to do multi-
>dimensional array references with a single instruction! Well, after
>the subscripts were put into registers, and assuming the Illiffe vectors
>for the array had been set up (at compile time). E.g.:
>
> foo[a,b,c,d,e] := foo[a,b,c,d,e] + 1;
>
>might be compiled as:
>
> ...code to load T0 with a...
> ...code to load T1 with b...
> ...code to load T2 with c...
> ...code to load T3 with d...
> ...code to load T4 with e...
> AOS @FOO(T0) ; increment foo[a,b,c,d,e]
>
>Which [as Barb knows] worked because the "FOO" label pointed not
>to the actual array, but to a vector of words containing indirect
>bits and index register fields of "T1" and Y (direct address) fields
>of the next-level Illiffe vectors, and so on. [Only the last-level
>vectors lacked the indirect bits.]
>
>The main problem with this was that the register numbers to be used
>for indexing each level (array dimension) of a given array had to be
>picked at compile time, which meant that at run time you had to have
>*exactly* those registers available for the subscripted references
>to that array. Made register allocation/packing "interesting"...

<GRIN> Or, as TW would say, "A small matter of programming."

jmfb...@aol.com

unread,
Mar 21, 2005, 8:48:08 AM3/21/05
to
<snip>

I just remembered. Whenever I executed one of those, I would
verbally think zzzziiiiippppp which is what we would say when
the children's game "The Thread Follows the Needle" ended.

Herman Rubin

unread,
Mar 21, 2005, 10:09:37 AM3/21/05
to
In article <1111277353.db4bbc6caab704507f71ea2dea0b600c@teranews>,
Steve <data...@NOSPAM.DirecWay.com> wrote:
>We've been discussing types and where this information should be kept.
>However, MOST of the computations done by computers are in computing
>addresses, not values on the way to final results. Most of the time,
>when you are dealing with integers, it is on the way to computing an
>address.

It depends; good programs would do little of this. Most
of the ways to "avoid gotos" require much additional
computation of addresses, slowing everything down.
Or if they have to, it would be going through loops.

>I think it has become pretty obvious that leaving address computation to
>programmers or compilers is a prescription for unreliable systems.

I see no reason why.

>Most crashes are due to clobbering memory due to defective address
>computation.

Crashes, maybe, but exceptions which could be foreseen are a
more likely cause. However, testing for all of this can
easily multiply the running time of a program by 10 or more.


--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Tom Linden

unread,
Mar 21, 2005, 4:21:13 PM3/21/05
to
On Mon, 21 Mar 2005 20:50:28 GMT, Brian Inglis
<Brian....@SystematicSW.Invalid> wrote:

> On Mon, 21 Mar 05 14:01:28 GMT in alt.folklore.computers,

> AFAIR dope vectors refer to structures containing at least: base,
> dimensions, bound(s), ... whereas Iliffe vectors seem to contain only
> pointers to (possibly indirected) pointers or data.

Here is what we use in the PL/I compiler

declare 1 dope_vector based(node_ptr),
          2 type char(1),
          2 dmy union,          /* * ultrix doesn't support */
            3 dims char(1),
            3 dummy,
              4 bit_offset bit(4),
              4 dimens bit(4),
          2 size fixed bin(15),
          2 bound(max_dimensions),
            3 lower fixed bin(31),
            3 upper fixed bin(31),
            3 span fixed bin(31);


>
> Maybe I'll see if ILL can access a copy of "Basic Machine Principles",
> as that seems to be the most likely source of the term, unless anyone
> knows of a better (likely) source document?
>

Nick Maclaren

unread,
Mar 21, 2005, 4:21:15 PM3/21/05
to
In article <8hcu315t8effqalfk...@4ax.com>,

Brian Inglis <Brian....@SystematicSW.ab.ca> wrote:
>
> >|> In those 20 years plus, I've never seen a mention of an Iliffe vector.
> >
> >You haven't been paying proper attention to all of my posts :-)
>
> Your (archived) posts postdated Rob's and gave no background.

Oh, yes, but I was merely niggling about your statement that you hadn't
seen a mention :-)

I certainly have been using the term for 34 years, because I learnt
it in 1970/1 at Nottingham, but that is not a reference.

>AFAIR dope vectors refer to structures containing at least: base,
>dimensions, bound(s), ... whereas Iliffe vectors seem to contain only
>pointers to (possibly indirected) pointers or data.

No, not at all. The term "dope vectors" was used for a period as a
synonym for Iliffe vectors - the concept (and, I believe, both terms)
predate me. The other concept was normally called "descriptors" or
"array descriptors" round about 1970, was heavily used in compilers
like those on the KDF9 (e.g. Egtran), WATFIV and Algol 68. The last
introduced the use of triples (base, stride, limit or various other
forms) for each dimension.

I never saw dope vectors used as a synonym for descriptors before
about 1990.

>Maybe I'll see if ILL can access a copy of "Basic Machine Principles",
>as that seems to be the most likely source of the term, unless anyone
>knows of a better (likely) source document?

Sorry, no.


Regards,
Nick Maclaren.

John R. Levine

unread,
Mar 21, 2005, 2:15:31 PM3/21/05
to
>| A machine architecture that has an indirect bit for effective
>| address calculation for all instructions.
>
>Heh! Yes, I loved the PDP-10 too, Barb!

Sounds more like the GE 635 to me. Like the PDP-10 it had indefinitely
deep indirect addressing, but it also offered at each level either pre
or post-indexing (only one pending post-index remembered) and a
"tally" scheme that was useful for implementing stacks and character
processing. I gather that the indirect addressing let the Fortran
compiler put the argument addresses in line after the call, using
indirection if the argument was itself an argument to the calling
routine.

The 635 was a more interesting machine than you might have thought,
particularly when used with Dartmouth's software. DTSS supported
multiple languages (not just Basic), let users run arbitrary binary
programs, and provided snappy response to 100 users on a machine about
the speed of a KA-10.


Brian Inglis

unread,
Mar 21, 2005, 3:50:28 PM3/21/05
to
On Mon, 21 Mar 05 14:01:28 GMT in alt.folklore.computers,
D.C....@ukc.ac.uk (dcw) wrote:

AFAIR dope vectors refer to structures containing at least: base,


dimensions, bound(s), ... whereas Iliffe vectors seem to contain only
pointers to (possibly indirected) pointers or data.

Maybe I'll see if ILL can access a copy of "Basic Machine Principles",


as that seems to be the most likely source of the term, unless anyone
knows of a better (likely) source document?

--

Stephen Fuld

unread,
Mar 21, 2005, 2:32:11 PM3/21/05
to

"Steve Richfie1d" <St...@NOSPAM.smart-life.net> wrote in message
news:3a1d3qF...@individual.net...

snip

> I think Burroughs had it right, but the arguments are really complex. The
> sum total was that their system was virtually crash-proof - the things
> that normally crash computers had no way of even being stated in their
> architecture. You can't really argue the issues one-by-one, but rather
> they must be taken as a whole.

If you mean that you must consider every aspect of the Burroughs large-scale
system architecture together and not try to analyze them separately, then I
disagree. For example, one could imagine a computer that used the Burroughs
memory addressing scheme, but not its stack-oriented instruction set. Or
one that had those two, but added a "privileged" execution mode for some of
the code in order to eliminate the requirement of giving special privileges
to any program that generates code (to ensure that it doesn't crash the
system).

> For example, to compute a subscripted location, you did not (and indeed
> could not) simply add the offset to the origin of the array. Instead, you
> performed an array access via an array descriptor that was in memory
> protected from your potential modification. Full subscript checking was a
> fully automatic part of these operations. These sorts of complex
> operations did NOT slow these computers down at all, as they had the
> additional hardware to perform all of the extra work in parallel.

Sort of. The hardware still has to access the array descriptor in memory
prior to computing the real memory address. Then there is a second memory
access to get the actual data. These cannot be overlapped as you need the
output from the first to begin the second. The result is an extra memory
access (albeit not programmer visible, but still taking real time) for the
array access. This mattered less (in terms of performance) when the CPU was
relatively slower compared to the memory (i.e. when said systems were
designed), than it does now. In terms of array subscript calculations,
today a multiply is faster than a memory access. But you do lose the
protection aspects. The trick would be to design a mechanism that gained
the protection without the performance cost.

--
- Stephen Fuld
e-mail address disguised to prevent spam


Andrew Swallow

unread,
Mar 21, 2005, 4:27:28 PM3/21/05
to
Stephen Fuld wrote:
> The trick would be to design a mechanism that gained
> the protection without the performance cost.
>

On a machine supporting hardware capability registers we used to give
static 1 dimensional arrays their own capabilities. This caught all out
of range accesses on 1D arrays and many on 2D arrays. An enhanced
capability feature that performs range checks whilst calculating the
parameter multiplication would be a nice enhancement. All we have
to do is get the compiler writers to move the loading of the array
pointer register(s) outside of loops.

Andrew Swallow

Brian Inglis

unread,
Mar 21, 2005, 3:41:47 PM3/21/05
to
On 21 Mar 2005 10:40:02 GMT in alt.folklore.computers,
nm...@cus.cam.ac.uk (Nick Maclaren) wrote:

>
>In article <i14t31p62ju26m68q...@4ax.com>,
>Brian Inglis <Brian....@SystematicSW.Invalid> writes:
>|> On Sun, 20 Mar 2005 21:23:53 -0600 in alt.folklore.computers,
>|> rp...@rpw3.org (Rob Warnock) wrote:
>|>
>|> >the Illiffe vectors
>|>
>|> >a vector of words containing indirect
>|> >bits and index register fields and direct address fields
>|> >of the next-level Illiffe vectors, and so on.
>|>
>|> Almost a repost of your 20 year old article in net.lang.c!
>|> In those 20 years plus, I've never seen a mention of an Iliffe vector.
>
>You haven't been paying proper attention to all of my posts :-)

Your (archived) posts postdated Rob's and gave no background.

>It was before my time, but was the term that I first heard used


>for the technique.
>
>|> Googling reveals it was supported by Atlas Autocode in the 1960s.
>|> Any references to its invention, by J.K.Iliffe I presume?
>
>Probably but, given its era, documentation is likely to be sparse
>or non-existent. It must be getting ripe for patenting ....

Given its use in C for 35 years for multidimensional array access,
and the references to Atlas documents in the libraries at cam.ac.uk,
even MS couldn't get away with that one for long.

Tom Linden

unread,
Mar 21, 2005, 5:29:27 PM3/21/05
to

We have been doing that for the last 25 or more years.
>
> Andrew Swallow

Andrew Swallow

unread,
Mar 21, 2005, 5:48:00 PM3/21/05
to
Tom Linden wrote:

Which machines have such registers?

Andrew Swallow

Tom Linden

unread,
Mar 21, 2005, 5:47:03 PM3/21/05
to
On Mon, 21 Mar 2005 22:48:00 +0000 (UTC), Andrew Swallow
<am.sw...@btopenworld.com> wrote:

I may have misunderstood, I thought you were referring to the hoisting of
the loading of the array base register out of the loop
>
> Andrew Swallow

Andrew Swallow

unread,
Mar 21, 2005, 6:22:26 PM3/21/05
to
Tom Linden wrote:

That is a useful thing to do as well. When loading a register takes 4
RAM reads, it is worth optimising out of small loops.

Andrew Swallow

Steve Richfie1d

unread,
Mar 21, 2005, 6:30:07 PM3/21/05
to
Herman,

>>We've been discussing types and where this information should be kept.
>>However, MOST of the computations done by computers are in computing
>>addresses, not values on the way to final results. Most of the time,
>>when you are dealing with integers, it is on the way to computing an
>>address.

> It depends; good programs would do little of this.

Every array reference and every modified pointer uses some sort of
address computation. Most computers have some approach to do simple
cases without slowing computation, with different ideas of what
constitutes "simple". Most computers do NOT have any way of checking
their computations without incurring some loss in speed.

> Most
> of the ways to "avoid gotos" require much additional
> computation of addresses, slowing everything down.
> Or if they have too, it would be going through loops.

I suspect this depends a LOT on what you are doing.

>>I think it has become pretty obvious that leaving address computation to
>>programmers or compilers is a prescription for unreliable systems.

Programmers rarely check even half of all of the things that could
conceivably go wrong.

Compilers are better, but any weakness in any compiler that compiles any
of the code for your computer makes your computer unreliable. I'd bet
that more than a dozen compilers were used to compile the code in your
OS, applications you use, etc. Besides, unless you stick to FORTRAN and
COBOL, there are things that compilers just can't check without slowing
your code WAY down, like possibly ++ing too far.

> I see no reason why.
>
>>Most crashes are due to clobbering memory due to defective address
>>computation.
>
>
> Crashes, maybe, but exceptions which could be foreseen are a
> more likely cause.

Do you REALLY want to bet the reliability of your computer that NONE of
the 20,000 or so programmers who worked on your OS and applications
missed ANY of the possibilities for defective addresses?! Sounds like a
bad bet for me. I'd say that what you are proposing is "theoretically"
possible, but with SO many programmers involved, I can't even say that.

> However, testing for all of this can
> easily multiply the running time of a program by 10 or more.

*THIS* is the issue. With reasonable hardware design (as found in the
old Burroughs mainframes and possibly nowhere else) this can be done
with NO loss in speed. As with my FP proposal, what is needed is some
way to make this work at all on existing hardware, and a design to make
it work well on future hardware. One possibility:

Suppose that pointers came with two numbers: the memory location and a
pointer to a memory block descriptor that contains the beginning and
end of the block, along with the size and type of the values stored
therein. Implementations could simply ignore the descriptor and run with
present problems, or check the descriptor and be sure of not clobbering
anything. Future hardware would always check the descriptor while
allowing execution to continue, and mark the cache entry that held a
stored value as unchecked until it is checked, so it doesn't get flushed
to RAM until checking has been performed.
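
In C terms, that amounts to something like this sketch (the names and
layout are invented; a software stand-in for what the hardware would
check in parallel with the access):

    #include <stddef.h>
    #include <string.h>

    struct block_desc {            /* protected, per-block record */
        char  *base, *limit;       /* first and one-past-last valid byte */
        size_t elem_size;          /* size of the values stored therein */
        int    elem_type;          /* type tag for those values */
    };

    struct fat_ptr {               /* a pointer that comes with 2 numbers */
        char                    *addr;
        const struct block_desc *desc;
    };

    /* A checking store: fault instead of clobbering anything outside
       the block. Hardware would overlap this test with the access. */
    int checked_store(struct fat_ptr p, const void *val)
    {
        if (p.addr < p.desc->base ||
            p.addr + p.desc->elem_size > p.desc->limit)
            return -1;
        memcpy(p.addr, val, p.desc->elem_size);
        return 0;
    }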

Of course, once you decide to provide better hardware, you can do MUCH
better, e.g. multidimensional address and interpolation computation.

Steve Richfie1d

Steve Richfie1d

unread,
Mar 21, 2005, 7:07:43 PM3/21/05
to
John,

YES, another GE/Honeybucket jock!

Don't forget the Repeat Double instruction that allowed you to construct
your own instructions that ran at the same speed as hardwired
instructions! There were two instruction execution units, one for even
locations and one for odd locations. The Repeat Double instruction had
to be located on an odd location, and locked the next two instructions
into their respective execution units and executed them over and over
until a particular stated condition occurred. These were used for most
of the moves, searches, type conversions, etc. needed to support COBOL.
With the addition of character-level indirect addressing, it was an
INCREDIBLY powerful string processing machine, easily outrunning its
competition with similar clock speeds by between 2:1 and 3:1.

This same technology could probably be applied today to gain a LOT more
speed for small loops, as it eliminates the loop overhead while still
being able to perform each iteration differently, e.g. searching for
some pattern, etc. This provides nearly array-processor speed but
without the usual limitations of vectorizability.

These were originally engineered to replace 16 IBM-7094s used to perform
range control that were on lease, which they did very well. They even
included a Gray-To-Binary instruction to support their initial application.

Then came the 360s and the faster honeybuckets faded into history.

Steve Richfie1d
===============

Steve Richfie1d

unread,
Mar 21, 2005, 7:28:47 PM3/21/05
to
Stephen,

>>I think Burroughs had it right, but the arguments are really complex. The
>>sum total was that their system was virtually crash-proof - the things
>>that normally crash computers had no way of even being stated in their
>>architecture. You can't really argue the issues one-by-one, but rather
>>they must be taken as a whole.

> If you mean that you must consider every aspect of the Burroughs large scale
> system architecture together and not try to analyze them separately, then I
> disagree. For example, one could imagine a computer that used the Burroughs
> memory addressing scheme, but not its stack oriented instruction set.

There WAS some (relatively minor) interplay, as the stack architecture
determined which instructions were likely to occur one after the other,
which in turn affected how the systems that overlapped operation needed
to be designed to minimize contention.

Yes, you are right that architectures can be mixed, but at the end of
the day it must all play together. My point was that there was a gestalt
to Burroughs' memory management and protection that was being lost in
the discussions here, as people discussed how integers should (or should
not) be marked as such, when they were a part of this grand plan.

> Or
> one that had those two, but added a "privledged" execution mode for some of
> the code in order to eliminate the requirement of giving special privledges
> to any program that generates code (to insure that it doesn't crash the
> system).

GREAT caution would be needed here not to sacrifice software reliability
in doing something like this.

>>For example, to compute a subscripted location, you did not (and indeed
>>could not) simply add the offset to the origin of the array. Instead, you
>>performed an array access via an array descriptor that was in memory
>>protected from your potential modification. Full subscript checking was a
>>fully automatic part of these operations. These sorts of complex
>>operations did NOT slow these computers down at all, as they had the
>>additional hardware to perform all of the extra work in parallel.
>
>
> Sort of. The hardware still has to access the array descriptor in memory
> prior to computing the real memory address. Then there is a second memory
> access to get the actual data. These cannot be overlapped as you need the
> output from the first to begin teh second. The result is an extra memory
> access (albeit not programmer visible, but still taking real time) for the
> array access.

Of course, these days this would just be a cache hit.

> This mattered less (in terms of performance) when the CPU was
> relativly slower compared to the memory (i.e. when said systems were
> designed), than it does now.

Speeds are still comparable when you compare present CPU speeds to
present cache speeds. Indeed, present caches are typically larger than
RAMs were back then, so that present RAM is now really a form of
secondary storage.

> In terms of array subscript calculations,
> today a multiply is faster than a memory access.

But not faster than an L1 cache hit.

> But you do lose the
> protection aspects. The trick would be to design a mechanism that gained
> the protection without the performance cost.

YES, you have stated the issue. There are a number of reasonable
possibilities given that the checking can be done AFTER the operation in
the case of loads, and after the operation for stores if you mark the
cache location as unchecked until it has passed checking. For even more
speed, you can have enough array-descriptor registers to hold enough
array descriptors for most loops, and run all array accesses through
this mechanism.

I am NOT strongly opinionated as to which approach is best, but I DO
believe that SOME efficient and required mechanism is needed to escape
the present situation of flakey computers.

Does anyone else out there have ideas regarding how to make memory
accessing safer? If we hand SOMETHING up for everyone to see, then there
will be no more excuse for building unreliable CPUs.

Steve Richfie1d

Dik T. Winter

unread,
Mar 21, 2005, 10:18:53 PM3/21/05
to
In article <3a9cm2F...@individual.net> Steve Richfie1d <St...@NOSPAM.smart-life.net> writes:
> Does anyone else out there have ideas regarding how to make memory
> accessing safer? If we hand SOMETHING up for everyone to see, then there
> will be no more excuse for building unreliable CPUs.

Not unless you are willing to sacrifice speed. While in the days you
speak of it was easy to put a cable between two parts of the central
processor, this is no longer the case now. You have to consider your
whole processor lay-out when you want to put in a new data-path. And
it might even be impossible, unless you allow for another layer on
your wafer.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Steve Richfie1d

unread,
Mar 21, 2005, 9:17:31 PM3/21/05
to
Dik,

> Not unless you are willing to sacrifice speed. While in the days you
> speak of it was easy to put a cable between two parts of the central
> processor, this is no longer the case now. You have to consider your
> whole processor lay-out when you want to put in a new data-path. And
> it might even be impossible, unless you allow for another layer on
> your wafer.

NO! What you say might be true *IF* you had to be able to trap without
proceeding beyond the fault, but this is not the case with careful
design. With loads you can take as long as you like to fault, you just
make life a bit more difficult for the programmer to debug it. With
stores, you can just delay flushing the cache cell to RAM until the
address has proven to be valid, or alternatively, start verification as
soon as an address has been formed, before you even get to the store
instruction that uses it.

If you actually have the new hardware do the subscripting then it will
obviously cost some time, but it ALREADY costs time to add the offset to
the origin of an array. Of course, in either case this would usually be
overlapped with other processing.

I agree with you that as an add-on it would be a MAJOR hassle. However,
I don't see any problem if it is designed into a new processor with a
new instruction set.

Steve Richfie1d

Rob Warnock

unread,
Mar 22, 2005, 12:15:08 AM3/22/05
to
Brian Inglis <Brian....@SystematicSW.ab.ca> wrote:
+---------------

| rp...@rpw3.org (Rob Warnock) wrote:
| >the Illiffe vectors
| >a vector of words containing indirect
| >bits and index register fields and direct address fields
| >of the next-level Illiffe vectors, and so on.
|
| Almost a repost of your 20 year old article in net.lang.c!
+---------------

(*blush*) I see I really need to make some progress on that little
project of collecting all my old postings, so I can just give a URL
like Lynn Wheeler does... ;-} ;-}

+---------------


| In those 20 years plus, I've never seen a mention of an Iliffe vector.
| Googling reveals it was supported by Atlas Autocode in the 1960s.
| Any references to its invention, by J.K.Iliffe I presume?

+---------------

Yikes! I misspelled his name again!! (Got to remember, only *one* ell,
only *one* ell...)


-Rob

-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607

Andrew Reilly

unread,
Mar 22, 2005, 2:14:17 AM3/22/05
to
Hi Steve,

You were spared my previous attempt to answer this general question (in
c.a and a.f.c because pan crashed on me :-) Hopefully I'll be luckier
this time...

On Sat, 19 Mar 2005 18:07:02 -0700, Steve wrote:
> I think it has become pretty obvious that leaving address computation to
> programmers or compilers is a prescription for unreliable systems.

Nope, that's not the observed case at all. Even reliable systems leave
the address computation to programmers and compilers.

> Most
> crashes are due to clobbering memory due to defective address
> computation.

Yep, so don't do that. (Programmer's fault, usually by using the wrong
language or compiler switches.) Hardware just does what it's told.

> Further, good design can have the address computation and
> subscript checking done in parallel with other operations, whereas
> making address computation just a part of integer operations forecloses
> this valuable opportunity for simultaneous speed and run-time checking.

No, quite the contrary. Address and subscript checking in special address
arithmetic registers and ALUs forecloses the ability to use those hardware
resources for other sorts of operations on the one hand, and makes it
harder to generate addresses (or indices) with instructions not supported
by the AALUs. Vis 68000 -> PPC or MIPS transitions. Unifying all integer
operations was a net performance boost. You're welcome to have extra
hardware to do the work at the same time. That's just more FUs in your
multi-issue (VLIW/EPIC/super-scalar) core.

> Hence, those who have been stating what "should" happen with mixed
> types, e.g. leave them up to the compiler, are REALLY indirectly saying
> to leave the present addressing mess as it is, when much better methods
> are known. The ONLY way to be safe with an open language like C is to do
> the checking at run time. To do this in a compiler requires going back
> to the restrictive addressing of FORTRAN, and completely eliminating
> pointers from the language.

There are compiler options that use fat pointers and other tricks to make
C "safe".

Or "don't do that": use a language that is safe to start with, if that's
what you're interested in. Almost all of them, (by number) in this day
and age are such. I heard the other day that it's actually easier to hire
Java programmers than C or C++ ones; they're churning them out of Unis so
quickly. I.e., you're seeing the trailing edge of a historical artifact
caused by the excessive zeal for "speed" over "correctness" or "security"
of an obviously immature industry. For most purposes computers are "fast
enough", now, and spammers ensure that we care about security. So new
code is written in Java or C# or Ada or Python or Perl or whatever, and
the problem that's bugging you goes away.

If you really, really need to run a "legacy" program that's written in C,
that you don't trust, then you can do so in a sandbox with strict controls
on what effect it can have on the rest of the system, and a strategy for
restarting it if it breaks.

> Hence, rather than discussing what "should" happen, I think we need to
> look at the possible choices and consider which is best: 1. Leave
> systems with the present defect that simple programming errors like
> ++ing too far can clobber memory that is NOT what the programmer thought
> he was working with (Change nothing).

Most of the industry has decided that that's not a great idea. Very
little code is still written where that can be an issue.

> 2. Do the checking at compile
> time, which means that you cannot compile C/C++ code on such a machine
> without risking #1 (Go back to COBOL and FORTRAN).

Or forward to Java, C#, Python, Perl, etc. Or use a "safe C" compiler
mode, or run in a sandbox. Not a problem.

> 3. REQUIRE that
> compilers include code to check ALL addressing at run-time. This is
> monumentally difficult with C/C++ where it can be pretty hard to
> determine what a pointer is pointing into. You only need to check store
> operations to keep the systems from crashing, but you also need to check
> the loads for security (The slowest possible approach).

It can be done though, if you really need to.

> 4. Specially
> mark integers and addresses in some way so that reasonableness of
> operation can be verified at run-time (What I was suggesting).

Vis: lisp, Java (sort of). I.e., it's available now, if you want to do
that. No need to change the hardware.

> 5.
> Require that pointer and array operations go through a special operator
> that simultaneously does the address computation and checks the result
> (the Burroughs approach).

Since you can't run C in such an environment anyway[1], you're limited to
one of the "safe" languages, which is a solved problem by one of the
previous steps.

[1] I believe that there are some C compilers and the like that do run in
such environments. These typically only actually work with that subset of
C *programs* that would be happy to run under something like a JVM or the
MS-IL VM, or which behave well with things like Bohem-et-al garbage
collectors. In this case, the problem is still solved by one of the
previous steps. In the other case, the capability hardware solution
doesn't help, because it thwarts the operation that makes the program work
at all.

> Have I missed any possibilities here?

Yes, there's a possibility between universal run-time checking and
exclusive compile-time checking, which is what most of the hot VM systems
do: put the checks in, conceptually, but allow the last compile step to
perform common subexpression elimination and other optimization proofs on
it. This can give you the safety of the language with *no* additional
run-time operations over the equivalent operation in C or Fortran, for
some types of loops.
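
A small sketch of the effect (C for concreteness; the function is
invented): the source semantics include a check per access, but once
the ranges are proven, one guard up front replaces them all:

    #include <stddef.h>

    /* Conceptually, every a[i] below is bounds-checked. Once the
       optimizer proves lo <= i < hi <= len for the whole loop, the
       single guard up front subsumes the per-iteration checks and
       the body runs at plain C/Fortran speed. */
    long sum_range(const int *a, size_t len, size_t lo, size_t hi)
    {
        long   s = 0;
        size_t i;

        if (lo > hi || hi > len)
            return -1;                /* the one hoisted check */
        for (i = lo; i < hi; i++)
            s += a[i];                /* no residual checking here */
        return s;
    }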

> The problem with the previous discussions was that there was an implied
> false assumption that integers are usually used with themselves or FP,
> when in fact they are often/usually used for address computation.

Well, as others have mentioned: sometimes FP is used for address
computation. You don't want to make that any harder than it already is.
(Particularly since that's becoming a very performance-critical operation
in graphics pipelines and the like.)

> It seems obvious (to me) that a LOT more hardware smarts are in order to
> do addressing correctly and efficiently, like #5 above which provides
> both maximum speed and 100% checking. Whether these should be rolled up
> into what started out as an FP spec is still open for debate. However,
> this discussion DOES affect the operation of integers.

It seems to me that you're still trying to waste hardware and power on
what is obviously a coding practice/language issue.

Cheers,

--
Andrew

Brian Inglis

unread,
Mar 22, 2005, 3:11:13 AM3/22/05
to
On Mon, 21 Mar 2005 23:15:08 -0600 in alt.folklore.computers,
rp...@rpw3.org (Rob Warnock) wrote:

>Brian Inglis <Brian....@SystematicSW.ab.ca> wrote:
>+---------------
>| rp...@rpw3.org (Rob Warnock) wrote:
>| >the Illiffe vectors
>| >a vector of words containing indirect
>| >bits and index register fields and direct address fields
>| >of the next-level Illiffe vectors, and so on.
>|
>| Almost a repost of your 20 year old article in net.lang.c!
>+---------------
>
>(*blush*) I see I really need to make some progress on that little
>project of collecting all my old postings, so I can just give a URL
>like Lynn Wheeler does... ;-} ;-}
>
>+---------------
>| In those 20 years plus, I've never seen a mention of an Iliffe vector.
>| Googling reveals it was supported by Atlas Autocode in the 1960s.
>| Any references to its invention, by J.K.Iliffe I presume?
>+---------------
>
>Yikes! I misspelled his name again!! (Got to remember, only *one* ell,
>only *one* ell...)

It's understandable given the visual confusion between I (eye) and l
(ell) in most sans serif fonts. That search got me to change my
browser default sans serif proportional font to one of the few
(Verdana) in which confusable letters and digits are all now visually
distinct and which renders nicely across normal sizes.

Brian Inglis

unread,
Mar 22, 2005, 1:03:37 AM3/22/05
to
On Mon, 21 Mar 2005 19:17:31 -0700 in alt.folklore.computers, Steve
Richfie1d <St...@NOSPAM.smart-life.net> wrote:

>Dik,
>
>> Not unless you are willing to sacrifice speed. While in the days you
>> speak of it was easy to put a cable between two parts of the central
>> processor, this is no longer the case now. You have to consider your
>> whole processor lay-out when you want to put in a new data-path. And
>> it might even be impossible, unless you allow for another layer on
>> your wafer.
>
>NO! What you say might be true *IF* you had to be able to trap without
>proceeding beyond the fault, but this is not the case with careful
>design. With loads you can take as long as you like to fault, you just
>make life a bit more difficult for the programmer to debug it. With
>stores, you can just delay flushing the cache cell to RAM until the

renamed register to cache ^^^^^^^^^^^^^^^^^ nowadays

RAM has consistently been overused by compiler writers for static data
and temporaries, rather than being kept in its proper place for
register spill areas, I/O buffers, and large constants that don't fit
in instruction immediate operands!

Brian Inglis

unread,
Mar 22, 2005, 2:58:29 AM3/22/05
to
On Tue, 22 Mar 2005 07:59:04 +0100 in alt.folklore.computers, Terje
Mathisen <terje.m...@hda.hydro.com> wrote:

>Rob Warnock wrote:
>
>> Brian Inglis <Brian....@SystematicSW.ab.ca> wrote:
>> +---------------

>> | nm...@cus.cam.ac.uk (Nick Maclaren) wrote:
>> | >|> Googling reveals it was supported by Atlas Autocode in the 1960s.
>> | >|> Any references to its invention, by J.K.Iliffe I presume?
>> | >
>> | >Probably but, given its era, documentation is likely to be sparse
>> | >or non-existent. It must be getting ripe for patenting ....
>> |
| Given its use in C for 35 years for multidimensional array access,
>> | and the references to Atlas documents in the libraries at cam.ac.uk,
>> | even MS couldn't get away with that one for long.

>> +---------------
>>
>> Although they *do* seem to be using them, at least in the "Microsoft
>> Vision SDK", whatever that is:
>>
>> <http://robotics.dem.uc.pt/norberto/nicola/visSdk.pdf>
>>
>> <http://msdn.microsoft.com/library/en-us/dnvissdk/html/vissdk.asp>
>> "For better efficiency, the Vision SDK transparently uses Iliffe
>> vectors. An Iliffe vector is an array of pointers to column zero
>> for each row."

>BTW, pretty much every single C program out there does use Iliffe vectors:

That was my reading of the MS Vision SDK articles: it's written in C.
Big whoop! Just waiting now for the term to appear in the Excel
marketing blurbs.

Terje Mathisen

unread,
Mar 22, 2005, 1:59:04 AM3/22/05
to
Rob Warnock wrote:

> Brian Inglis <Brian....@SystematicSW.ab.ca> wrote:
> +---------------
> | nm...@cus.cam.ac.uk (Nick Maclaren) wrote:

> | >|> Googling reveals it was supported by Atlas Autocode in the 1960s.
> | >|> Any references to its invention, by J.K.Iliffe I presume?
> | >
> | >Probably but, given its era, documentation is likely to be sparse
> | >or non-existent. It must be getting ripe for patenting ....
> |
> | Given its use in C for 35 years for multidimensional array access,
> | and the references to Atlas documents in the libraries at cam.ac.uk,
> | even MS couldn't get away with that one for long.

> +---------------
>
> Although they *do* seem to be using them, at least in the "Microsoft
> Vision SDK", whatever that is:
>
> <http://robotics.dem.uc.pt/norberto/nicola/visSdk.pdf>
>
> <http://msdn.microsoft.com/library/en-us/dnvissdk/html/vissdk.asp>
> "For better efficiency, the Vision SDK transparently uses Iliffe
> vectors. An Iliffe vector is an array of pointers to column zero
> for each row."
>

> [Though they can certainly be used for more than just "column zero"...]

The interesting point here is that even with relatively slow integer
mul, it is quite often faster to use an N-dimensional array directly,
instead of having N (or N-1) levels of memory indirection, because said
memory accesses are becoming slower & slower.

It is only when the individual elements of an array become
variable-sized that it is obviously better to use indirect access.

BTW, pretty much every single C program out there does use Iliffe vectors:

int main(int argc, char *argv[])

will pass the two-dimensional character array argv as an array of
pointers to strings (i.e. character vectors). :-)
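
For completeness, building one by hand looks like this sketch (error
handling kept minimal):

    #include <stdlib.h>

    /* Allocate a rows x cols matrix as an Iliffe vector: one block of
       row pointers into a single contiguous data block, so m[i][j]
       costs two loads and no general multiply at the use site.
       Free with free(m[0]); free(m); */
    double **make_matrix(size_t rows, size_t cols)
    {
        double **m    = malloc(rows * sizeof *m);
        double  *data = calloc(rows * cols, sizeof *data);
        size_t   i;

        if (m == NULL || data == NULL) {
            free(m);
            free(data);
            return NULL;
        }
        for (i = 0; i < rows; i++)
            m[i] = data + i * cols;
        return m;
    }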

Terje

--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

Rob Warnock

unread,
Mar 22, 2005, 12:25:08 AM3/22/05
to
Brian Inglis <Brian....@SystematicSW.ab.ca> wrote:
+---------------
| nm...@cus.cam.ac.uk (Nick Maclaren) wrote:
| >|> Googling reveals it was supported by Atlas Autocode in the 1960s.
| >|> Any references to its invention, by J.K.Iliffe I presume?
| >
| >Probably but, given its era, documentation is likely to be sparse
| >or non-existent. It must be getting ripe for patenting ....
|
| Given its use in C for 35 years for multidimensional array access,
| and the references to Atlas documents in the libraries at cam.ac.uk,
| even MS couldn't get away with that one for long.
+---------------

Although they *do* seem to be using them, at least in the "Microsoft
Vision SDK", whetever that is:

<http://robotics.dem.uc.pt/norberto/nicola/visSdk.pdf>

<http://msdn.microsoft.com/library/en-us/dnvissdk/html/vissdk.asp>
"For better efficiency, the Vision SDK transparently uses Iliffe
vectors. An Iliffe vector is an array of pointers to column zero
for each row."

[Though they can certainly be used for more than just "column zero"...]

CBFalconer

unread,
Mar 22, 2005, 3:43:32 AM3/22/05
to
Steve Richfie1d wrote:
>
... snip ...

>
> Does anyone else out there have ideas regarding how to make memory
> accessing safer? If we hand SOMETHING up for everyone to see, then
> there will be no more excuse for building unreliable CPUs.

Very simple. Insist on ECC memory on all your machinery. Proven
technology. Nothing new.

--
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare

Nick Maclaren

unread,
Mar 22, 2005, 1:12:33 PM3/22/05
to
In article <opsn1v191mzgicya@hyrrokkin>, Tom Linden <t...@kednos.com> wrote:
>
>I already posted the dope vector we employ in PL/I, which is largely
>unchanged since 1974, when RAF wrote the first PL/I for Translation
>Systems. Whilst I haven't seen the Multics sources I will give you
>odds that it is substantially the same. Note that it is also used for
>strings as well as arrays. The discussion of dope vectors in the
>referenced paper should perhaps have made clear that dope is avoided
>when the address arithmetic can be done at compile-time.

That wouldn't surprise me for a second, but is not my point. The
question is what did you call them in 1974, and where did you get
the term from?


Regards,
Nick Maclaren.

Tom Van Vleck

unread,
Mar 22, 2005, 1:47:24 PM3/22/05
to
"Tom Linden" <t...@kednos.com> wrote:
> I think those involved in the Multics implementation
> would need to chime in, if that is also what they called them.
> Bob's article suggests so, but I agree it is not
> conclusive.

What I remember is that the 1966 EPL (Early PL/I) used dope
for strings, arrays, and structures. Its code was not
efficient and Multics suffered. This compiler had both
"specifiers" and "dope" as described in
http://www.multicians.org/mspm-bb-2.html

As I remember, with the introduction of the 1969 Multics
PL/I compiler (version 1), "argument descriptors" were used
but "dope" was not generated.. dope might have been needed
to support some language constructs, such as cross sections,
that either didn't work at all or were forbidden for system
programming. This is the compiler described in the
Freiburghouse FJCC paper.

The Version 2 Multics PL/I compiler, circa 1972, produced
far more efficient code and supported more language
features. ISTR that there were some cases where dope-like
information was needed at call boundaries, when one used
cross sections.

Seongbae Park

unread,
Mar 22, 2005, 12:13:53 PM3/22/05
to
Terje Mathisen <terje.m...@hda.hydro.com> wrote:
> Rob Warnock wrote:
...

>> Although they *do* seem to be using them, at least in the "Microsoft
>> Vision SDK", whetever that is:
>>
>> <http://robotics.dem.uc.pt/norberto/nicola/visSdk.pdf>
>>
>> <http://msdn.microsoft.com/library/en-us/dnvissdk/html/vissdk.asp>
>> "For better efficiency, the Vision SDK transparently uses Iliffe
>> vectors. An Iliffe vector is an array of pointers to column zero
>> for each row."
>>
>> [Though they can certainly be used for more than just "column zero"...]
>
> The interesting point here is that even with relatively slow integer
> mul, it is quite often faster to use an N-dimensional array directly,
> instead of having N (or N-1) levels of memory indirection, because said
> memory accesses are becoming slower & slower.

Another side effect of the indirection in Iliffe vectors is
that the compiler cannot prefetch the accesses efficiently.
With an N-dimensional array it can keep prefetching,
since it knows the address of the next row.
With Iliffe vectors it doesn't know that address
until it has fetched the pointer to the next row,
and can only prefetch based on that.
If the rows are huge, this issue won't matter much.
But if they aren't, there can be a performance hit...
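
A rough sketch of the difference, using GCC's __builtin_prefetch
(the function names here are invented for the example):

#include <stddef.h>

/* Flat 2-D array: the next row's address is pure arithmetic,
   so it can be prefetched ahead of time. */
double sum_flat(const double *a, size_t rows, size_t cols)
{
    double sum = 0.0;
    for (size_t i = 0; i < rows; i++) {
        if (i + 1 < rows)
            __builtin_prefetch(&a[(i + 1) * cols]);
        for (size_t j = 0; j < cols; j++)
            sum += a[i * cols + j];
    }
    return sum;
}

/* Iliffe vector: the next row's address is itself data, so the
   row pointer must be loaded before its target can be prefetched. */
double sum_iliffe(const double *const *rowp, size_t rows, size_t cols)
{
    double sum = 0.0;
    for (size_t i = 0; i < rows; i++) {
        if (i + 1 < rows)
            __builtin_prefetch(rowp[i + 1]);
        for (size_t j = 0; j < cols; j++)
            sum += rowp[i][j];
    }
    return sum;
}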

BTW, a few SPECcpu2000 programs also use Iliffe vectors.
--
#pragma ident "Seongbae Park, compiler, http://blogs.sun.com/seongbae/"

Ken Hagan

unread,
Mar 22, 2005, 4:44:24 AM3/22/05
to
>> In terms of array subscript calculations, today a multiply is faster
>> than a memory access.

Steve Richfie1d wrote:
>
> But not faster than an L1 cache hit.

Not "faster", perhaps, but for most recent Pentiums the latency is about
3 clocks for either.

In practice, the "correct" choice of instruction probably depends on the
surrounding code and on the mixture of execution units in the processor
that you happen to be running on, since you want to avoid overburdening
the latter with too many instructions of the same type.

wclo...@lanl.gov

unread,
Mar 22, 2005, 8:51:49 PM3/22/05
to

Nick Maclaren wrote:
> <snip>
> I never saw dope vectors used as a synonym for descriptors before
> about 1990.
> <snip>

Some further comments, that may be out of date as google is not showing
recent articles. The PL/I article makes it clear that the dope vector
terminology was used in the earlier user manual, so that terminology
was in use by early 1969. An IBM/UK description of PL/I using its dope
vector terminology appeared in 1970
http://portal.acm.org/ft_gateway.cfm?id=356564&type=pdf

An IBM report of 1980 appears to use the Iliffe vector meaning
http://www.research.ibm.com/journal/rd/246/ibmrd2406L.pdf

A 1981 report on image processing by Srihari uses the array descriptor
meaning
http://portal.acm.org/ft_gateway.cfm?id=356862&type=pdf

The 1969 indirect description of a "dope vector" (in terms of
procedures accessing it) appears to be using it for array descriptors
for an Algol system
http://www.chilton-computing.org.uk/acl/applications/algol/p005.htm

The BLISS description may indeed be of Iliffe vectors

A 1982 article on stack machines appears to be using the terminology
for Iliffe vectors
http://portal.acm.org/ft_gateway.cfm?id=809752&type=pdf

A early planned implementation of Ada (1980) uses the same terminology
for array descriptors
http://portal.acm.org/ft_gateway.cfm?id=948646&type=pdf

A 1978 description of the MU5 (British) appears to use dope vector for
array descriptors
http://portal.acm.org/ft_gateway.cfm?id=359333&type=pdf

FWIW the earliest on-line reference I could find to dope vector is that
of the PL/I manual, which implies that the term was used for descriptor
before early 1969; the earliest description compatible with Iliffe
vector semantics is from 1971. These dates probably say more about the
limitations of google than about the true origins of the terminology.

Dik T. Winter

unread,
Mar 22, 2005, 7:34:26 PM3/22/05
to
In article <d1pj1j$ped$1...@gemini.csx.cam.ac.uk> nm...@cus.cam.ac.uk (Nick Maclaren) writes:
...
> |> http://portal.acm.org/ft_gateway.cfm?id=359624&type=pdf
>
> That is clearly using them for descriptors. 1977.

The Allocation of Storage for Arrays in Algol-60
K. Sattly and P.Z. Ingerman
University of Pennsylvania
Office of Computer Research and Education
November, 1960

page 3:

"Stored along with the elements of an array will be a "dope vector"
containing the information necessary to reference the array (the
parameters of the storage-mapping function)."

And a "dope-vector" for an n-dimensional array contains 2n+1 words,
where the first is the number of subscripts, the remainder is
pairwise the size and the lower bound of each subscript.

Note that this was the boyhood of dynamic array allocation.
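
In modern C terms that layout might be sketched like this (the struct
and field names are invented here, not taken from the paper):

struct dope_pair {            /* one pair per subscript */
    long size;                /* extent of this subscript */
    long lower_bound;         /* its lower bound */
};

struct dope_vector {          /* 2n+1 words in all */
    long nsubs;               /* n, the number of subscripts */
    struct dope_pair sub[];   /* n (size, lower bound) pairs */
};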

Scott Lurndal

unread,
Mar 22, 2005, 4:09:22 PM3/22/05
to
"Stephen Fuld" <s.f...@PleaseRemove.att.net> writes:
>
>"Steve Richfie1d" <St...@NOSPAM.smart-life.net> wrote in message
>news:3a9cm2F...@individual.net...
>
>snip

>
>>>>For example, to compute a subscripted location, you did not (and indeed
>>>>could not) simply add the offset to the origin of the array. Instead, you
>>>>performed an array access via an array descriptor that was in memory
>>>>protected from your potential modification. Full subscript checking was a
>>>>fully automatic part of these operations. These sorts of complex
>>>>operations did NOT slow these computers down at all, as they had the
>>>>additional hardware to perform all of the extra work in parallel.
>>>
>>>
>>> Sort of. The hardware still has to access the array descriptor in memory
>>> prior to computing the real memory address. Then there is a second
>>> memory access to get the actual data. These cannot be overlapped as you
>>> need the output from the first to begin teh second. The result is an
>>> extra memory access (albeit not programmer visible, but still taking real
>>> time) for the array access.
>>
>> Of course, these days this would just be a cache hit.
>
>Perhaps. I am not familiar with the details of the Burroughs
>implementation. If one had say a three dimensional array of 100 X 100 X 100
>elements, how many words of memory are needed for the descriptor? If one

Since the postulated array occupies a contiguous region of memory, why
would the descriptor need anything more than a base and a limit?

While my large systems days were limited, and were 20 years ago,
my recollection is that the descriptor on the stack is simply a base/limit.
IIRC there were tag bits associated with each memory word that indicated
what the word held (a scalar, a descriptor, etc.).

>wants to protect each subscript instead of just the final address, then I
>presume you need three descriptors. And for all but the last one, the

In the Large Systems architecture the subscripts will be read from tagged memory
words (tagged as scalars) and be pushed onto the stack prior to being
operated on. I don't believe that any descriptor is required
for a scalar value.

Pseudo code (for a 100 X 100 X 100 array, so the element offset
is K*10000 + I*100 + J):

PUSH K            ; first subscript
PUSH 10000        ; elements per plane (100 * 100)
MULTIPLY
PUSH I            ; second subscript
PUSH 100          ; elements per row
MULTIPLY
PUSH J            ; third subscript
ADD
ADD               ; TOS = K*10000 + I*100 + J
PUSH ARRAY[TOS]   ; indexed load through the array descriptor

K: SCALAR 0
I: SCALAR 0
J: SCALAR 0

The only instruction that requires a descriptor access is the last one. Range
checking each individual subscript would require the compiler to generate the
appropriate code.

Since the descriptor is already on the stack, and probably in the
most recent activation frame, it is likely that descriptor access will
hit any implementation-defined cache of the Top Of Stack words and be
quite efficient.
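
In C terms, the base/limit check the hardware makes on that final
access might be sketched like this (the struct layout and trap name
are invented for illustration):

#include <stddef.h>

struct descriptor {          /* base/limit pair, as held on the stack */
    long   *base;
    size_t  limit;           /* number of addressable words */
};

extern void bounds_trap(void);   /* hypothetical: hardware interrupt */

long load_checked(const struct descriptor *d, size_t index)
{
    if (index >= d->limit)
        bounds_trap();       /* out-of-range index never reaches memory */
    return d->base[index];
}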

This is all (mis)remembered from a class I took in 1984 describing the
A-Series (Large Systems at the time) architecture and may be incorrect
in small ways.

scott

wclo...@lanl.gov

unread,
Mar 22, 2005, 11:31:33 AM3/22/05
to

Nick Maclaren wrote:
> In article <8hcu315t8effqalfk...@4ax.com>,

> Brian Inglis <Brian....@SystematicSW.ab.ca> wrote:
> >
> > >|> In those 20 years plus, I've never seen a mention of an Iliffe vector.
> > >
> > >You haven't been paying proper attention to all of my posts :-)
> >
> > Your (archived) posts postdated Rob's and gave no background.
>
> Oh, yes, but I was merely niggling about your statement that you hadn't
> seen a mention :-)
>
> I certainly have been using the term for 34 years, because I learnt
> it in 1970/1 at Nottingham, but that is not a reference.
>
> >AFAIR dope vectors refer to structures containing at least: base,
> >dimensions, bound(s), ... whereas Iliffe vectors seem to contain only
> >pointers to (possibly indirected) pointers or data.
>
> No, not at all. The term "dope vectors" was used for a period as a
> synonym for Iliffe vectors - the concept (and, I believe, both terms)
> predate me. The other concept was normally called "descriptors" or
> "array descriptors" round about 1970, was heavily used in compilers
> like those on the KDF9 (e.g. Egtran), WATFIV and Algol 68. The last
> introduced the use of triples (base, stride, limit or various other
> forms) for each dimension.

>
> I never saw dope vectors used as a synonym for descriptors before
> about 1990.
> <snip>

Nick:

FWIW in the USA the term dope vector has had a usage as a synonym for
array descriptor implemented as a one dimensional array of integers
since at least 1969, when it was used to describe the
implementation of the Multics PL/I compiler.

http://www.multicians.org/pl1-raf.html

I suspect the usage originated with APL, but three or four attempts did
not find an appropriate on-line reference. Other pre-1990 usages
include descriptions of the Burroughs B5000

http://www.ajwm.net/amayer/papers/B5000.html

The discussion of high-level language data types

http://portal.acm.org/ft_gateway.cfm?id=359624&type=pdf

The implementation of BLISS

http://vms.process.com/ftp/vms-freeware/FILESERV/BLISS-ARTICLE.PS

Peter Flass

unread,
Mar 22, 2005, 6:18:54 PM3/22/05
to
Nick Maclaren wrote:

PL/I(F) called them "dope vectors" as of version 1 in, what, 1967?

John Savard

unread,
Mar 22, 2005, 9:26:59 AM3/22/05
to
Steve Richfie1d <St...@NOSPAM.smart-life.net> wrote in message news:<3a997uF...@individual.net>...

> Do you REALLY want to bet the reliability of your computer that NONE of
> the 20,000 or so programmers who worked on your OS and applications
> missed ANY of the possibilities for defective addresses?!

As one of the other people who replied to you also noted, a
Burroughs-type architecture means you have to stop programming in C
anyways.

However, while unions and pointer arithmetic are hallowed parts of the
C language, I think it _is_ reasonable to say that a bounds overflow
in an *array subscript* is something that can be disallowed, even in
C, without changing the language.

As it happens, my imagined architecture, taking the NX bit of 64-bit
x86 processors as an inspiration, included both a feature analogous to
that, and other features for enforcing bounds checking. One option was
a mode where every instruction that is indexed would have a field
giving a maximum value for the index register.

But while that can help to prevent many problems, it misses the
biggest possibility for overflows. If you pass an array as a parameter
to a subroutine, the subroutine could go past the array's bounds. This
is what a Burroughs-like architecture could prevent.
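
The problem in C terms (a deliberately broken example):

#include <stddef.h>

/* The callee receives only a pointer; the array's real extent
   travels separately, on trust. */
void fill(int *a, size_t claimed_len)
{
    for (size_t i = 0; i < claimed_len; i++)
        a[i] = 0;            /* nothing stops claimed_len from lying */
}

void caller(void)
{
    int buf[10];
    fill(buf, 20);           /* overruns buf; no descriptor, no trap */
}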

Modifying the C input/output library, though, would very largely take
care of that without changing the hardware.
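
One example of the kind of library change meant here: replacing the
unbounded gets() with the bounded fgets(), which is told the buffer
size and so cannot overrun it:

#include <stdio.h>

int main(void)
{
    char line[80];

    /* gets(line) has no way to learn sizeof line; fgets() does. */
    if (fgets(line, sizeof line, stdin) != NULL)
        printf("%s", line);
    return 0;
}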

John Savard
