Portability

ken...@cix.compulink.co.uk

Apr 6, 2013, 5:12:23 AM
It just occurred to me that porting a classical threaded Forth to a new
machine would be considerably easier than porting one that uses any form
of optimisation. That is at least if we are talking about changing
processors. It is probably not the case with a different underlying OS on
the same processor. Is that correct?

Ken Young

Albert van der Horst

Apr 6, 2013, 7:31:48 AM
In article <Q_ednQPtEZJqe8LM...@giganews.com>,
Not optimising is easier than optimising, sure.

I've some experience with porting a classical Forth.

If you compare going from Linux on x86 to an Alpha with going to
MS-Windows on x86, you must realize that not having to change code words
counts for something too.

OTOH, don't think that the Linux system call mechanism is equal or even
similar on different microprocessors. That takes some sorting out.

I'd call it a wash.

>
> Ken Young

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Elizabeth D. Rather

Apr 6, 2013, 4:34:45 PM
On 4/6/13 1:31 AM, Albert van der Horst wrote:
> In article <Q_ednQPtEZJqe8LM...@giganews.com>,
> <ken...@cix.compulink.co.uk> wrote:
>> It just occurred to me that porting a classical threaded Forth to a new
>> machine would be considerably easier than one that uses any form of
>> optimisation. That is at least if we are talking about changing
>> processors. Probably not the case with different underlying OS on the
>> same processor. Is that correct?
>
> Not optimising is easier than optimising, sure.
>
> I've some experience with porting a classical Forth.
>
> If you talk about going from a linux x86 to an Alpha, versus
> MS-Windows x86, you must realize that not having to change code words
> counts for something too.
>
> OTOH, don't think that the linux system call mechanism is equal or even
> similar on different micro processors. That takes some sorting out.
>
> I'd call it a wash.

It's our experience that different operating systems are at least as
much of a challenge as different processors. And, yes, of course an
optimizing compiler is harder than an ITC implementation. However, we
did take pains to optimize our ITC Forths to each processor, making
appropriate register allocations for stack pointers, etc., so as to get
the best performance on that platform. Back in the day, FORTH, Inc. ITC
Forths were as much as 10x faster than FigForths on the same processor.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Ed

Apr 8, 2013, 7:56:31 PM
Elizabeth D. Rather wrote:
> ...
> However, we
> did take pains to optimize our ITC Forths to each processor, making
> appropriate register allocations for stack pointers, etc., so as to get
> the best performance on that platform. Back in the day, FORTH, Inc. ITC
> Forths were as much as 10x faster than FigForths on the same processor.

More definitions might have been coded for speed but given the
goal of FigForth - to put Forth into the hands of hobbyists cheaply
when no-one else would - it has stood up remarkably well. If there
was an outright design flaw in FigForth, I'd nominate:

: < - 0< ;

-32000 30000 < . 0 ok

Even computer math needs to work as expected ...
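
A minimal Python sketch of why that definition gives the result shown, assuming 16-bit two's-complement cells. The helper names (to_s16, circular_less, signed_less) are made up for illustration and do not come from any Forth listing.

```python
def to_s16(x):
    """Wrap an integer to a signed 16-bit cell, as a 16-bit Forth would."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def circular_less(n1, n2):
    """Model of  : < - 0< ;  (subtract, then test the sign of the wrapped result)."""
    return to_s16(n1 - n2) < 0

def signed_less(n1, n2):
    """Ordinary signed comparison of two 16-bit cells."""
    return to_s16(n1) < to_s16(n2)

print(circular_less(-32000, 30000))  # False: the difference -62000 wraps to +3536
print(signed_less(-32000, 30000))    # True
```

The two definitions agree whenever the true difference n1 - n2 fits in a signed 16-bit cell; they part ways only when the subtraction overflows.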





Albert van der Horst

Apr 8, 2013, 9:30:44 PM
I cared to look that up. In the 6502, the 6809, the Z80 and the
8086 listings I inspected, < (``LESS'') was implemented as a small
code sequence, that looked ok.
[Admittedly some of these listings were as recent as 1982.]

So, as they say, this one is "busted".

By the way, since time immemorial this bug is caught by the infamous
Hayes test.

Anton Ertl

Apr 9, 2013, 3:21:29 AM
alb...@spenarnc.xs4all.nl (Albert van der Horst) writes:
>In article <kjvlfb$bfv$1...@speranza.aioe.org>, Ed <inv...@nospam.com> wrote:
>>Elizabeth D. Rather wrote:
>>> ...
>>> However, we
>>> did take pains to optimize our ITC Forths to each processor, making
>>> appropriate register allocations for stack pointers, etc., so as to get
>>> the best performance on that platform. Back in the day, FORTH, Inc. ITC
>>> Forths were as much as 10x faster than FigForths on the same processor.

The second coming of Jeff Fox? Please name the processor and the
benchmark. I remember you claiming that fig-Forth's NEXT contained
subroutine calls, which is false.

>>More definitions might have been coded for speed but given the
>>goal of FigForth - to put Forth into the hands of hobbyists cheaply
>>when no-else would - it has stood up remarkably well. If there
>>was an outright design flaw in FigForth, I'd nominate:
>>
>> : < - 0< ;
>>
>> -32000 30000 < . 0 ok
>>
>>Even computer math needs to work as expected ...
>
>I cared to look that up. In the 6502, the 6809, the Z80 and the
>8086 listings I inspected, < (``LESS'') was implemented as a small
>code sequence, that looked ok.

Yes, in figForth '<' is a signed comparison. AFAIK some of Chuck
Moore's Forths use the so-called circular comparison shown above.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2013: http://www.euroforth.org/ef13/

Elizabeth D. Rather

Apr 9, 2013, 4:18:24 AM
On 4/8/13 9:21 PM, Anton Ertl wrote:
> alb...@spenarnc.xs4all.nl (Albert van der Horst) writes:
>> In article <kjvlfb$bfv$1...@speranza.aioe.org>, Ed <inv...@nospam.com> wrote:
>>> Elizabeth D. Rather wrote:
>>>> ...
>>>> However, we
>>>> did take pains to optimize our ITC Forths to each processor, making
>>>> appropriate register allocations for stack pointers, etc., so as to get
>>>> the best performance on that platform. Back in the day, FORTH, Inc. ITC
>>>> Forths were as much as 10x faster than FigForths on the same processor.
>
> The second coming of Jeff Fox? Please name the processor and the
> benchmark. I remember you claiming that fig-Forth's NEXT contained
> subroutine calls, which is false.

It was an 80's PC. Someone wrote an article in Byte claiming a "faster
NEXT" for the PC. It cut the "standard" FigForth NEXT from 12
instructions to about 8. FORTH, Inc's was 2. So we ran some more
benchmarks at that time.

Andrew Haley

Apr 9, 2013, 5:11:34 AM
It's deliberate. That's just a circular relational: < is defined as
being "before" on the number circle. Given that arithmetic is reduced
modulo N, it's not entirely unreasonable, and it's exactly what you
want for some robotics applications. It is unconventional, I agree.
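
One hypothetical case where a circular relational is the right tool is comparing two readings of a free-running 16-bit counter or encoder that wraps around. The sketch below is Python rather than Forth, and the names and sample values are invented for illustration.

```python
def to_s16(x):
    """Wrap an integer to a signed 16-bit cell."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def is_before(a, b):
    """True if a comes before b on the 16-bit number circle (the  - 0<  test)."""
    return to_s16(a - b) < 0

# A free-running 16-bit counter that wrapped between two readings.
earlier, later = 0xFFF0, 0x0010
print(is_before(earlier, later))  # True: 0xFFF0 is 32 ticks before 0x0010
print(earlier < later)            # False: a plain unsigned compare gets it wrong
```

This only gives a sensible answer while the two readings are less than half the circle (32768 ticks) apart.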

I believe that Dean Sanderson wrote a paper about this.

Andrew.

Anton Ertl

Apr 9, 2013, 8:41:19 AM
"Elizabeth D. Rather" <era...@forth.com> writes:
>>>>> Back in the day, FORTH, Inc. ITC
>>>>> Forths were as much as 10x faster than FigForths on the same processor.
>>
>> The second coming of Jeff Fox? Please name the processor and the
>> benchmark. I remember you claiming that fig-Forth's NEXT contained
>> subroutine calls, which is false.
>
>It was an 80's PC. Someone wrote an article in Byte claiming a "faster
>NEXT" for the PC. It cut the "standard" FigForth NEXT from 12
>instructions to about 8. FORTH, Inc's was 2. So we ran some more
>benchmarks at that time.

Here are the number of instructions in fig-Forth's NEXT for a number of
CPUs:

12 6502
 8 6800
11 8085
11 Z80
 2 6809
 5 8086
 2 PDP-11

So presumably you had a 2-instruction ITC NEXT on the 6502. I don't
think that's possible, but if it is, I would be highly interested to see
that.

Coos Haak

Apr 9, 2013, 11:21:55 AM
Op Tue, 09 Apr 2013 12:41:19 GMT schreef Anton Ertl:

> "Elizabeth D. Rather" <era...@forth.com> writes:
>>>>>> Back in the day, FORTH, Inc. ITC
>>>>>> Forths were as much as 10x faster than FigForths on the same processor.
>>>
>>> The second coming of Jeff Fox? Please name the processor and the
>>> benchmark. I remember you claiming that fig-Forth's NEXT contained
>>> subroutine calls, which is false.
>>
>>It was an 80's PC. Someone wrote an article in Byte claiming a "faster
>>NEXT" for the PC. It cut the "standard" FigForth NEXT from 12
>>instructions to about 8. FORTH, Inc's was 2. So we ran some more
>>benchmarks at that time.
>
> Here are the number of instructions in fig-Forth's NEXT for a number of
> CPUs:
>
> 12 6502
> 8 6800
> 11 8085
> 11 Z80
> 2 6809
> 5 8086
> 2 PDP-11
>
> So presumably you had a 2-instruction ITC NEXT on the 6502. I don't
> think that's possible, but if it is, I would be highly interested to see
> that.
>
> - anton

If only instructions count (and not clock ticks ;-) 1802 Figforth ITC NEXT
with 11 instructions seems to fall in the same range.

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html

Elizabeth D. Rather

Apr 9, 2013, 2:06:27 PM
On 4/9/13 2:41 AM, Anton Ertl wrote:
> "Elizabeth D. Rather" <era...@forth.com> writes:
>>>>>> Back in the day, FORTH, Inc. ITC
>>>>>> Forths were as much as 10x faster than FigForths on the same processor.
>>>
>>> The second coming of Jeff Fox? Please name the processor and the
>>> benchmark. I remember you claiming that fig-Forth's NEXT contained
>>> subroutine calls, which is false.
>>
>> It was an 80's PC. Someone wrote an article in Byte claiming a "faster
>> NEXT" for the PC. It cut the "standard" FigForth NEXT from 12
>> instructions to about 8. FORTH, Inc's was 2. So we ran some more
>> benchmarks at that time.
>
> Here are the number of instructions in fig-Forth's NEXT for a number of
> CPUs:
>
> 12 6502
> 8 6800
> 11 8085
> 11 Z80
> 2 6809
> 5 8086
> 2 PDP-11
>
> So presumably you had a 2-instruction ITC NEXT on the 6502. I don't
> think that's possible, but if it is, I would be highly interested to see
> that.

When did I say anything about 6502's? FORTH, Inc. never supported the
6502. My statement above was related to the IBM PC c. mid-80's. And I
don't have access to whatever I would need to give the number of
instructions for the others, except that I remember that the longest was
the 1802 at around 4 or 5.

But our systems had a lot more performance enhancements than an
efficient NEXT. We carefully laid out optimal register assignments on
each processor, based on trial codings of primitives, and had quite a
lot more code primitives than FigForth.

Bernd Paysan

Apr 9, 2013, 4:34:31 PM
Elizabeth D. Rather wrote:
> But our systems had a lot more performance enhancements than an
> efficient NEXT. We carefully laid out optimal register assignments on
> each processor, based on trial codings of primitives, and had quite a
> lot more code primitives than FigForth.

Gforth contains an 8086 port (with primitives recycled from Klaus Kohl's
Forth from the 80s), and the NEXT there is

: next,
    lods,
    ax w xchg,
    w ) jmp, ;

This is of course optimized for size (4 bytes) and the 8086, not for modern
processors. You somehow have to load a word, increment a pointer by 2, and
jump to what it contains for ITC. If you do DTC, you can do this with two
instructions (lods + ax jmp), or one (ret) if you use the stack pointer as
Forth's instruction pointer - a design decision that wasn't very likely in
the 80s, as you then couldn't use push+pop to access the data stack.
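
To make the ITC/DTC difference concrete, here is a rough Python model of the two inner interpreters. It is only a sketch: Python function calls stand in for machine jumps, cells are list elements rather than 2-byte words, and the memory layout, primitive set and all names are invented for illustration.

```python
def do_lit(stack, memory, ip):
    stack.append(memory[ip])          # the literal follows in the thread
    return ip + 1

def do_add(stack, memory, ip):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return ip

def do_halt(stack, memory, ip):
    return None                       # stop the interpreter loop

PRIMITIVES = [do_lit, do_add, do_halt]

def run_itc(memory, primitives, ip):
    """Indirect threading: a thread cell points at a code field."""
    stack = []
    while ip is not None:
        w = memory[ip]                # lods: W <- cell at IP
        ip += 1                       # one cell here; 2 bytes on the 8086
        code = primitives[memory[w]]  # w ) jmp: jump through the code field
        ip = code(stack, memory, ip)
    return stack

def run_dtc(thread, primitives):
    """Direct threading: a thread cell identifies the code itself."""
    stack, ip = [], 0
    while ip is not None:
        code = primitives[thread[ip]]  # lods + ax jmp: no second fetch
        ip = code(stack, thread, ip + 1)
    return stack

# Cells 0..2 are the code fields of LIT, + and HALT; the thread for
# "2 3 +" starts at cell 3 and refers to those code fields by address.
MEMORY = [0, 1, 2,   0, 2, 0, 3, 1, 2]
THREAD = [0, 2, 0, 3, 1, 2]            # same program, direct threaded

print(run_itc(MEMORY, PRIMITIVES, 3))  # [5]
print(run_dtc(THREAD, PRIMITIVES))     # [5]
```

The extra `memory[w]` lookup in run_itc is the indirection that costs ITC its third instruction on the 8086.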

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

Elizabeth D. Rather

Apr 9, 2013, 6:05:42 PM
On 4/9/13 10:34 AM, Bernd Paysan wrote:
> Elizabeth D. Rather wrote:
>> But our systems had a lot more performance enhancements than an
>> efficient NEXT. We carefully laid out optimal register assignments on
>> each processor, based on trial codings of primitives, and had quite a
>> lot more code primitives than FigForth.
>
> Gforth contains an 8086 port (with primitives recycled from Klaus Kohl's
> Forth from the 80s), and the NEXT there is
>
> : next,
> lods,
> ax w xchg,
> w ) jmp, ;
>
> This is of course optimized for size (4 bytes) and the 8086, not for modern
> processors. You somehow have to load a word, increment a pointer by 2, and
> jump to what it contains for ITC. If you do DTC, you can do this with two
> instructions (lots + ax jmp), or one (ret) if you use the stack pointer as
> Forth's instruction pointer - a design decision that wasn't very likely in
> the 80s, as you then couldn't use push+pop to access the data stack.
>

Sorry, I no longer have the source for the 8086 version, just the 306.
Its NEXT was:

: NEXT { LODS W 0 XCHG W ) LIP} 27FF97AD , ;

(and I don't remember what the notation was, either).

Hugh Aguilar

Apr 9, 2013, 8:00:46 PM
On Apr 9, 12:21 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> alb...@spenarnc.xs4all.nl (Albert van der Horst) writes:
>
> >In article <kjvlfb$bf...@speranza.aioe.org>, Ed <inva...@nospam.com> wrote:
> >>Elizabeth D. Rather wrote:
> >>> ...
> >>> However, we
> >>> did take pains to optimize our ITC Forths to each processor, making
> >>> appropriate register allocations for stack pointers, etc., so as to get
> >>> the best performance on that platform. Back in the day, FORTH, Inc. ITC
> >>> Forths were as much as 10x faster than FigForths on the same processor.
>
> The second coming of Jeff Fox?  Please name the processor and the
> benchmark.  I remember you claiming that fig-Forth's NEXT contained
> subroutine calls, which is false.

You're comparing Elizabeth Rather to Jeff Fox??? I don't recall Jeff
Fox spouting grandiose claims such as this.

All of those ITC Forths from the 1980s were slow as molasses. The only
purpose of ITC was to save memory, which was an issue when computers
had only 8K or 16K total. By the time the 64K computers came out
though, it made more sense just to go with STC.

Nowadays, ITC makes sense again though --- because the code-cache is
32K, which is a lot like programming on a 1980s computer --- at least,
that's my reason for going with ITC in CAMForth (that, plus it is less
work for me). I'm doing a lot of optimization, in the sense that
common sequences of words ("combos") get compiled into a single
primitive --- I've never heard of any Forth Inc. compiler doing that
at all (except for a tiny handful of exceptional cases that SwiftForth
does) --- SwiftForth pretty much just dully compiles every word's xt,
although the ICODE words get expanded out as macros to save the cost
of the CALL and RET (although this bloats the code considerably, so
the resultant cache-thrashing slows down the processor more than
getting rid of the CALL and RET sped it up).
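
CAMForth's actual combo table is not shown in this thread, so the following is only a rough Python sketch of the general idea: scan the compiled word list and replace known adjacent pairs with a single fused primitive. The table entries and the function name are hypothetical.

```python
# Hypothetical combo table: each pair is replaced by one equivalent word.
COMBOS = {
    ("DUP", "+"): "2*",
    ("SWAP", "DROP"): "NIP",
    ("OVER", "OVER"): "2DUP",
}

def fuse_combos(words, combos=COMBOS):
    """Replace each known adjacent pair of words with its fused primitive."""
    out = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in combos:
            out.append(combos[pair])  # one xt, one NEXT instead of two
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

print(fuse_combos(["DUP", "+", "SWAP", "DROP", "."]))  # ['2*', 'NIP', '.']
```

Each fused pair saves one trip through NEXT and one cell of threaded code, at the price of one more primitive in the kernel.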

Having NEXT contain a subroutine call would only make sense in DTC,
not ITC (that worked especially well on the PDP-11, because the top
value of the return-stack was held in a register) --- and there was no
DTC Fig-Forth.

Elizabeth D. Rather

Apr 9, 2013, 8:19:08 PM
Correction: that should read 386, not 306.

Hugh Aguilar

Apr 9, 2013, 9:28:28 PM
On Apr 9, 5:41 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> "Elizabeth D. Rather" <erat...@forth.com> writes:
>
> >>>>> Back in the day, FORTH, Inc. ITC
> >>>>> Forths were as much as 10x faster than FigForths on the same processor.
>
> >> The second coming of Jeff Fox?  Please name the processor and the
> >> benchmark.  I remember you claiming that fig-Forth's NEXT contained
> >> subroutine calls, which is false.
>
> >It was an 80's PC. Someone wrote an article in Byte claiming a "faster
> >NEXT" for the PC. It cut the "standard" FigForth NEXT from 12
> >instructions to about 8. FORTH, Inc's was 2. So we ran some more
> >benchmarks at that time.
>
> Here are the number of instructions in fig-Forth's NEXT for a number of
> CPUs:
>
> 12 6502
>  8 6800
> 11 8085
> 11 Z80
>  2 6809
>  5 8086
>  2 PDP-11
>
> So presumably you had a 2-instruction ITC NEXT on the 6502.  I don't
> think that's possible, but if it is, I would be highly interested to see
> that.

A 2-instruction NEXT on the 6502? That is just more of Elizabeth
Rather's baloney. Considering that the 6502 had 8-bit registers, and
the xt is 16-bit, the 6502 was really the worst possible candidate for
a threaded Forth system (that is why it has the most-expensive NEXT in
your list above).

I also used a language called Promal, that was similar to Pascal, and
it had a pretty short NEXT --- it accomplished this by having only 128
primitives and an 8-bit xt. I don't remember how Promal worked
(although I did buy the complete source-code at the time). If I were
to write something like that now, I would do this:

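macro next
{
    lda (ip),y      ; fetch the 8-bit xt at IP+Y
    iny             ; step to the next xt within the page
    tax             ; the xt indexes the jump table
    jmp (table,x)   ; indexed indirect jump (65c02 only)
}
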
This requires every threaded word to fit inside of a page, so the
upper byte of IP never rolls over (the users won't be allowed to write
lengthy words, and if they write a lot of short words they will waste
memory, but if their words are all slightly less than 256 bytes long
then they are doing well). I would just encourage everybody to write
short words, and then I would automatically macro-expand sub-words as
necessary to fill out words so they are just short of the 256 byte
boundary. Also, I would have a few primitives that do a second jump
through another table --- this way I can have many hundreds of words
(secondaries) written in assembly language, although less than 128 of
them (the primitives) would be fast.

I did write a 65c02 cross-compiler, and it was STC --- so NEXT was a
JSR --- one instruction (3 bytes, compared to 2 bytes for a threaded
system). That was a lot less complicated, and a lot faster, and it
required only a little bit more memory.

Mine was derived from ISYS Forth for the Apple-II. The clever part was
the use of a split parameter stack. The low bytes were in one stack,
and the high bytes were in another stack. The two stacks were $40
bytes apart. I could address the low byte of the TOS with $0,x and the
high byte of the TOS with $40,x. DROP compiled into a single INX
instruction. I could make room on the stack for a new datum with a
single DEX instruction. Also, I had a peephole-optimizer that removed
INX DEX pairs, and also converted a JSR followed by a RTS into a
single JMP. There may have been some other minor optimizations as well
--- it was pretty simple, but it generated code that was better than
any commercial Forth or C system.
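
A rough Python model of the two ideas in this paragraph, under stated assumptions: the stack grows downward in X, the two byte stacks sit $40 apart in one small array standing in for zero page, and the optimizer sees instructions as plain strings. The class, method and function names are invented; none of this is the original cross-compiler's code.

```python
class SplitStack:
    """16-bit parameter stack stored as separate low-byte and high-byte stacks."""
    DEPTH = 0x40

    def __init__(self):
        self.zp = bytearray(2 * self.DEPTH)  # low bytes at $00..$3F, high at $40..$7F
        self.x = self.DEPTH                  # empty: X just past the last slot

    def push(self, value):
        self.x -= 1                                          # DEX makes room
        self.zp[self.x] = value & 0xFF                       # sta $0,x
        self.zp[self.DEPTH + self.x] = (value >> 8) & 0xFF   # sta $40,x

    def tos(self):
        return self.zp[self.x] | (self.zp[self.DEPTH + self.x] << 8)

    def drop(self):
        self.x += 1                                          # INX

def peephole(ops):
    """Remove INX/DEX pairs and turn a JSR followed by RTS into a JMP."""
    out = []
    for op in ops:
        if op == "DEX" and out and out[-1] == "INX":
            out.pop()                          # drop then make room: net no-op
        elif op == "RTS" and out and out[-1].startswith("JSR "):
            out[-1] = "JMP " + out[-1][4:]     # tail call
        else:
            out.append(op)
    return out

s = SplitStack()
s.push(0x1234)
s.push(0xBEEF)
s.drop()
print(hex(s.tos()))                                  # 0x1234
print(peephole(["INX", "DEX", "JSR foo", "RTS"]))    # ['JMP foo']
```

The sketch ignores branch targets: a real optimizer would have to leave a pair alone if a label sat between the two instructions.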

The guy who wrote ISYS Forth was pretty smart! There were a lot of
smart people doing innovative work with Forth in the 1980s. Elizabeth
Rather wasn't one of them. There was nothing innovative coming out of
Forth Inc. --- Forth Inc. largely ruined Forth's reputation with their
grossly inefficient systems.

Nowadays there are a lot of people with their own Forth, which they
claim to have written themselves. Almost all of these are just warmed
over cadavers of 1970s-vintage Forths. Win32Forth and CIForth are two
examples. SwiftForth is also, as Forth Inc. has just been keeping
Chuck Moore's old work running all of these years, porting it to new
processors, but not adding anything significant --- the Forth world is
like that movie, "Frankenweenie," in which the kid had to periodically
jolt his dead dog with electricity to get it to go again, and also sew
body parts back on that had fallen off due to wear and tear. There are
quite a lot of Forth "experts" (mad-scientist types) who have managed
to get an old Forth system to work on modern hardware and have
exclaimed: "It's alive! It's alive!"

I'm writing CAMForth from scratch though --- it is not derived from
anything --- doing it this way is a lot more time-consuming, but it
will be something that I can call my own. My programming in Forth is
motivated almost entirely by the desire to have something that I can
be proud of.

Ed

Apr 9, 2013, 11:00:37 PM
Albert van der Horst wrote:
> ...
> I cared to look that up. In the 6502, the 6809, the Z80 and the
> 8086 listings I inspected, < (``LESS'') was implemented as a small
> code sequence, that looked ok.
> [Admittedly some of these listings were as recent as 1982.]
>
> So, as they say, this one is "busted"
>
> By the way, since time immemorial this bug is caught by the infamous
> Hayes test.

Don't know if it was officially a bug. Certainly the 8080 v1.1 listing
I had corrected it. The technique must have been in wide use because
Forth-79 (and later 83) specifically excluded it:

< n1 n2 -- flag 139 "less-than"

True if n1 is less than n2.

-32768 32767 < must return true.
-32768 0 < must be distinguished.

I found it used in a commercial system - Stackworks SL5/Forth,
which claims to follow Forth-77.

BTW if anyone still has the Forth-77 document, perhaps they could
scan it in and upload to the forth archive at Taygeta.



Chris

Apr 9, 2013, 11:45:30 PM
to era...@forth.com

> Sorry, I no longer have the source for the 8086 version, just the 386.
>
> Its NEXT was:
>
>
>
> : NEXT { LODS W 0 XCHG W ) LIP} 27FF97AD , ;
>
>
>
> (and I don't remember what the notation was, either).
>
>
for polyFORTH ISD-4 Segmented 8086 model that runs on MS-DOS

: NEXT CS: LODS W 0 XCHG W ) CS: LIP ;

The standalone 8086 model is probably better.

Ed

Apr 10, 2013, 12:02:17 AM
Chris wrote:
> ...
> for polyFORTH ISD-4 Segmented 8086 model that runs on MS-DOS
>
> : NEXT CS: LODS W 0 XCHG W ) CS: LIP ;
>
> The standalone 8086 model is probably better.

To my knowledge polyForth single-segment 8086 NEXT is coded
in-line as:

    LODSW
    XCHG AX,DI
    JMP WORD PTR [DI]

In FD there was an article:

"A Faster NEXT Loop" (FD V9N6)

This was MVP-FORTH (8086, Forth-79) whose original NEXT was 7
instructions (compared to FIG's 5). It explained how NEXT could be
reduced to 3 instructions and the mods necessary. It was essentially
as above but used a different register and NEXT was not in-line.
The article claimed a 25% speed improvement over the original.



Hugh Aguilar

Apr 10, 2013, 2:00:41 AM
On Apr 9, 9:02 pm, "Ed" <inva...@nospam.com> wrote:
> To my knowledge polyForth single-segment 8086 NEXT is coded
> in-line as:
>
>      LODSW
>      XCHG AX,DI
>      JMP WORD PTR [DI]

Wow! Somebody is admitting that PolyForth put everything (compiler,
dictionary, application code and application data) in a single 64K
segment!

When I brought up this issue previously, Elizabeth Rather and her
sycophants all denied that this was ever true --- typically trying to
divert everybody's attention to the 80386 version of PolyForth that
did address more than 64K --- and never mind that by the time that the
80386 came out, Forth was considered to be a bad joke by essentially
everybody, and C had become the standard language for professional
programming. The 16-bit x86 was the crucible that every language had
to prove itself in, and PolyForth failed badly, pulling the entire
Forth community down with it.

This severe memory limitation of PolyForth is why I chose UR/Forth ---
it had machine code in one 64K segment (CS), threaded code and data in
another (DS), and the dictionary headers in another (ES), roughly
comparable to the Small memory-model. Also, UR/Forth was DTC, so NEXT
was:

    lodsw
    jmp ax

I wrote that 65c02 cross-compiler in 16-bit UR/Forth --- PolyForth
would not have been capable of supporting such a large program. Also,
I tested 16-bit UR/Forth and found it to be essentially the same speed
as Small memory-model Borland Turbo C on every benchmark. All in all,
UR/Forth was capable of supporting professional programming ---
PolyForth's 64K limitation put it in the same category as QBasic
(actually QBasic was better, because only the application had to fit
in 64K, not the application, and the compiler, and the symbol table,
and the IDE).

How could Forth Inc. fail to support multiple segments on the 16-bit
x86??? The segment registers aren't that hard to figure out. Everybody
else in the world grasped the idea that the 8086 supported more than
64K; that was the whole point of upgrading to the 8086 from the 8080/
Z80. Most likely, 8086 PolyForth was just a warmed over 8080 program
that had been ported to the 8086 line-by-line by programmers who knew
almost nothing about the 8086, and didn't want to learn.

Albert van der Horst

Apr 10, 2013, 7:46:51 AM
In article <K4ednVfAlIoqDfnM...@supernews.com>,
Elizabeth D. Rather <era...@forth.com> wrote:
>On 4/9/13 10:34 AM, Bernd Paysan wrote:
>> Elizabeth D. Rather wrote:
>>> But our systems had a lot more performance enhancements than an
>>> efficient NEXT. We carefully laid out optimal register assignments on
>>> each processor, based on trial codings of primitives, and had quite a
>>> lot more code primitives than FigForth.
>>
>> Gforth contains an 8086 port (with primitives recycled from Klaus Kohl's
>> Forth from the 80s), and the NEXT there is
>>
>> : next,
>> lods,
>> ax w xchg,
>> w ) jmp, ;
>>
>> This is of course optimized for size (4 bytes) and the 8086, not for modern
>> processors. You somehow have to load a word, increment a pointer by 2, and
>> jump to what it contains for ITC. If you do DTC, you can do this with two
>> instructions (lots + ax jmp), or one (ret) if you use the stack pointer as
>> Forth's instruction pointer - a design decision that wasn't very likely in
>> the 80s, as you then couldn't use push+pop to access the data stack.
>>
>
>Sorry, I no longer have the source for the 8086 version, just the 306.
>Its NEXT was:
>
>: NEXT { LODS W 0 XCHG W ) LIP} 27FF97AD , ;
>
>(and I don't remember what the notation was, either).

From the 386 on, those XCHG aren't needed.
    lods
    jmp [ax]  \ Jump to an address fetched from where ax points.
              \ AX can serve as the work register to arrive at >BODY DOES>
              \ constant content etc.

AD FF 20
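
For readers who want to check those bytes, here is a small Python sketch that decodes the ModRM byte following opcode FF. It assumes a 32-bit code segment with 32-bit addressing and handles only the mod = 00 forms; the table and function names are made up for illustration. Read as little-endian bytes, the literal 27FF97AD quoted above appears to be the same kind of sequence with the XCHG left in.

```python
# mod = 00 operands under 32-bit addressing; rm 4 and 5 are special cases.
RM32 = ["[eax]", "[ecx]", "[edx]", "[ebx]", "[sib]", "[disp32]", "[esi]", "[edi]"]
# Opcode FF is a group: the reg field of ModRM selects the operation.
GROUP5 = {0: "inc", 1: "dec", 2: "call", 3: "callf", 4: "jmp", 5: "jmpf", 6: "push"}

def decode_ff(modrm):
    """Decode the ModRM byte that follows opcode FF (mod = 00 forms only)."""
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0:
        raise ValueError("only the mod = 00 forms are modelled here")
    return f"{GROUP5[reg]} {RM32[rm]}"

print(decode_ff(0x20))                          # jmp [eax]
print((0x27FF97AD).to_bytes(4, "little").hex()) # ad97ff27
print(decode_ff(0x27))                          # jmp [edi]
```

AD is LODS and 97 is XCHG with the accumulator and (E)DI, so the two sequences differ only in whether the fetched cell is moved to a separate W register before the indirect jump.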


>
>Cheers,
>Elizabeth

Albert van der Horst

Apr 10, 2013, 7:55:59 AM
In article <kk2kju$hu4$1...@speranza.aioe.org>, Ed <inv...@nospam.com> wrote:
>Albert van der Horst wrote:
>> ...
>> I cared to look that up. In the 6502, the 6809, the Z80 and the
>> 8086 listings I inspected, < (``LESS'') was implemented as a small
>> code sequence, that looked ok.
>> [Admittedly some of these listings were as recent as 1982.]
>>
>> So, as they say, this one is "busted"
>>
>> By the way, since time immemorial this bug is caught by the infamous
>> Hayes test.
>
>Don't know if it was officially a bug. Certainly the 8080 v1.1 listing
>I had corrected it. The technique must have been in wide use because
<SNIP>

Although figForth may not have been an official standard, its glossary was
clear and unambiguous, at least on this point. (You can find it on my
website and other places.) It clearly states that < is supposed to work on
signed numbers. So the behaviour you describe would be a defect (or "bug"),
as in "deviation from the specification".

Anton Ertl

Apr 10, 2013, 11:03:24 AM
Well, the 6502 is the only CPU I know of where fig-Forth has a
12-instruction NEXT.

> My statement above was related to the IBM PC c. mid-80's.

Ok, IBM PC means 8086 fig-Forth, and as you can see above, its NEXT
has 5 instructions, not 12. If your performance comparisons are as
accurate as your instruction counts, they are worthless.

Andrew Haley

Apr 10, 2013, 11:10:28 AM
Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
> In article <kk2kju$hu4$1...@speranza.aioe.org>, Ed <inv...@nospam.com> wrote:
>>Albert van der Horst wrote:
>>> ...
>>> I cared to look that up. In the 6502, the 6809, the Z80 and the
>>> 8086 listings I inspected, < (``LESS'') was implemented as a small
>>> code sequence, that looked ok.
>>> [Admittedly some of these listings were as recent as 1982.]
>>>
>>> So, as they say, this one is "busted"
>>>
>>> By the way, since time immemorial this bug is caught by the infamous
>>> Hayes test.
>>
>>Don't know if it was officially a bug. Certainly the 8080 v1.1 listing
>>I had corrected it. The technique must have been in wide use because
> <SNIP>
>
> figForth may not have been an official standard, its glossary was
> clear and unambiguous, at least at this point. (You can find it on
> my website and other places.). It clearly states that < is supposed
> to work on signed numbers. So the behaviour you describe would be a
> defect ("or bug") as in "deviation from the specification".

Not really, no, any more than this is a deviation from the
specification:

2147483647 dup + . -2 ok

If anyone has this paper, it'd be nice to get it scanned:

Leo Brodie and Dean Sanderson, "Division, Relationals, and Loops",
_1981 Rochester FORTH Standards Conference_, p. 117-121.

Andrew.

Anton Ertl

Apr 10, 2013, 11:19:29 AM
"Ed" <inv...@nospam.com> writes:
>To my knowledge polyForth single-segment 8086 NEXT is coded
>in-line as:
>
> LODSW
> XCHG AX,DI
> JMP WORD PTR [DI]

Just for completeness, here's the fig-Forth NEXT:

next:
        lods ax
        mov bx,ax
next1:                  ;entry point used by EXECUTE
        mov dx,bx
        inc dx
        jmp word ptr[bx]

The third and fourth instruction are actually only useful for routines
like docol dovar etc., not for primitives, so if you optimize for
speed, you move these instructions to docol etc, leaving essentially
the three instructions that polyForth has. OTOH, if you optimize for
space, it's better to have them only in one place.

Anton Ertl

Apr 10, 2013, 11:56:19 AM
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>> It clearly states that < is supposed
>> to work on signed numbers. So the behaviour you describe would be a
>> defect ("or bug") as in "deviation from the specification".
>
>Not really, no, any more than this is a deviation from the
>specification:
>
>2147483647 dup + . -2 ok

Do you really claim that

-30000 30000 <

producing false is correct when the specification says "signed
comparison", not "circular comparison"? And what is your example
supposed to illustrate? There is no comparison there.

Albert van der Horst

Apr 10, 2013, 2:50:26 PM
In article <Z66dnedz25l5HfjM...@supernews.com>,
Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
>Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>> In article <kk2kju$hu4$1...@speranza.aioe.org>, Ed <inv...@nospam.com> wrote:
>>>Albert van der Horst wrote:
>>>> ...
>>>> I cared to look that up. In the 6502, the 6809, the Z80 and the
>>>> 8086 listings I inspected, < (``LESS'') was implemented as a small
>>>> code sequence, that looked ok.
>>>> [Admittedly some of these listings were as recent as 1982.]
>>>>
>>>> So, as they say, this one is "busted"
>>>>
>>>> By the way, since time immemorial this bug is caught by the infamous
>>>> Hayes test.
>>>
>>>Don't know if it was officially a bug. Certainly the 8080 v1.1 listing
>>>I had corrected it. The technique must have been in wide use because
>> <SNIP>
>>
>> figForth may not have been an official standard, its glossary was
>> clear and unambiguous, at least at this point. (You can find it on
>> my website and other places.). It clearly states that < is supposed
>> to work on signed numbers. So the behaviour you describe would be a
>> defect ("or bug") as in "deviation from the specification".
>
>Not really, no, any more than this is a deviation from the
>specification:
>
>2147483647 dup + . -2 ok

(You would have to state a 16 bit equivalent.)
That is not a deviation from the specification. I cite from the
first page of the fig glossary:
"
All arithmetic is implicitly 16 bit signed integer math, with error and
under-flow indication unspecified.
"
That means that as soon as the result of + cannot be
represented as a 16-bit signed integer, anything goes.
There is nothing in the glossary allowing sloppiness for a perfectly
defined and representable result of <.
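
A minimal sketch of the 16-bit equivalent of "2147483647 dup + .", written in Python so that the wraparound is explicit. The helper name wrap16 is hypothetical and not a fig-Forth word; the sketch assumes two's-complement cells.

```python
def wrap16(x):
    """Reduce an integer to a 16-bit two's-complement value (-32768..32767).

    wrap16 is a hypothetical helper for illustration only.
    """
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


# 16-bit counterpart of "2147483647 dup + ." : the true sum 65534 is not
# representable, so a wrapping implementation prints -2.
print(wrap16(32767 + 32767))  # -2
```

Under the glossary wording quoted above this result is permitted, because the true sum does not fit in a cell.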

>
>If anyone has this paper, it'd be nice to get it scanned:
>
> Leo Brodie and Dean Sanderson, "Division, Relationals, and Loops",
> _1981 Rochester FORTH Standards Conference_, p. 117-121.

We have a lot of those Rochester things. Waiting for everybody to be dead
for 75 years to get it scanned. ;-(.

>
>Andrew.

Bernd Paysan

unread,
Apr 10, 2013, 4:14:48 PM4/10/13
to
Albert van der Horst wrote:
> We have a lot of those Rochester things. Waiting for everybody to be dead
> for 75 years to get it scanned. ;-(.

Nope, those 75 years are a moving target. Anything that was published after
"Steamboat Willie" is going to be copyrighted forever.

I hope that a fairy would grant Walt Disney company the one wish they have:
Copyright should last forever minus one day. And then the heirs of the
Grimm Brothers and Hans Christian Andersen would sue the company into
bankruptcy, for not only violating their copyright, but making the utter
kitsch out of their work... And then we can get rid of this rubbish.

Andrew Haley

unread,
Apr 10, 2013, 5:18:58 PM4/10/13
to
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>>> It clearly states that < is supposed
>>> to work on signed numbers. So the behaviour you describe would be a
>>> defect ("or bug") as in "deviation from the specification".
>>
>>Not really, no, any more than this is a deviation from the
>>specification:
>>
>>2147483647 dup + . -2 ok
>
> Do you really claim that
>
> -30000 30000 <
>
> producing false is correct when the specification says "signed
> comparison", not "circular comparison"?

It's as [in]correct as

+ n1 n2 - sum

Return the sum of n1 + n2

failing to do just that.

> And what is your example supposed to illustrate? There is no
> comparison there.

They are both a direct consequence of the fact that computer
arithmetic is (usually) circular.

Andrew.

Anton Ertl

unread,
Apr 11, 2013, 9:24:41 AM4/11/13
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>>>> It clearly states that < is supposed
>>>> to work on signed numbers. So the behaviour you describe would be a
>>>> defect ("or bug") as in "deviation from the specification".
>>>
>>>Not really, no, any more than this is a deviation from the
>>>specification:
>>>
>>>2147483647 dup + . -2 ok
>>
>> Do you really claim that
>>
>> -30000 30000 <
>>
>> producing false is correct when the specification says "signed
>> comparison", not "circular comparison"?
>
>It's as [in]correct as
>
>+ n1 n2 - sum
>
>Return the sum of n1 + n2
>
>failing to do just that.

A + that does not implement the specification is incorrect, sure.
However, in ANS Forth, the specification of + is:

|3.2.2.2 Other integer operations
|
|In all integer arithmetic operations, both overflow and underflow
|shall be ignored. The value returned when either overflow or underflow
|occurs is implementation defined.
|
|6.1.0120 + plus CORE ( n1|u1 n2|u2 -- n3|u3 )
|
|Add n2|u2 to n1|u1, giving the sum n3|u3.

>> And what is your example supposed to illustrate? There is no
>> comparison there.
>
>They are both a direct consequence of the fact that computer
>arithmetic is (usually) circular.

Signed comparison is not at all circular. Nor is unsigned comparison.
And there cannot be an overflow when you compare two numbers of the
same type. The two possible results are definitely representable in a
cell.

Hugh Aguilar

unread,
Apr 11, 2013, 11:50:32 PM4/11/13
to
On Apr 11, 6:24 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Andrew Haley <andre...@littlepinkcloud.invalid> writes:
> >They are both a direct consequence of the fact that computer
> >arithmetic is (usually) circular.
>
> Signed comparison is not at all circular.  Nor is unsigned comparison.
> And there cannot be an overflow when you compare two numbers of the
> same type. The two possible results are definitely representable in a
> cell.

Circular???

There was a lot of weird stuff in Forth. I've already brought up the
way that the 16-bit x86 version of PolyForth only addressed a single
64K segment. Now we are pondering two's-complement arithmetic as if it
was only invented yesterday. I think people just lost patience with
Forth --- there was too much talk about its potential, too many
dunderhead blunders, and not enough accomplishment --- the world moved
on.

Andrew Haley

unread,
Apr 12, 2013, 5:17:30 AM4/12/13
to
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>>Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>>>>> It clearly states that < is supposed
>>>>> to work on signed numbers. So the behaviour you describe would be a
>>>>> defect ("or bug") as in "deviation from the specification".
>>>>
>>>>Not really, no, any more than this is a deviation from the
>>>>specification:
>>>>
>>>>2147483647 dup + . -2 ok
>>>
>>> Do you really claim that
>>>
>>> -30000 30000 <
>>>
>>> producing false is correct when the specification says "signed
>>> comparison", not "circular comparison"?
>>
>>It's as [in]correct as
>>
>>+ n1 n2 - sum
>>
>>Return the sum of n1 + n2
>>
>>failing to do just that.
>
> A + that does not implement the specification is incorrect, sure.
> However, in ANS Forth, the specification of + is:

If we were talking about ANS Forth that would be relevant.

>>They are both a direct consequence of the fact that computer
>>arithmetic is (usually) circular.
>
> Signed comparison is not at all circular.

Indeed not, but this isn't one. That's why we're talking about it.

Andrew.

Andrew Haley

unread,
Apr 12, 2013, 5:18:18 AM4/12/13
to
Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>>If anyone has this paper, it'd be nice to get it scanned:
>>
>> Leo Brodie and Dean Sanderson, "Division, Relationals, and Loops",
>> _1981 Rochester FORTH Standards Conference_, p. 117-121.
>
> We have a lot of those Rochester things. Waiting for everybody to be dead
> for 75 years to get it scanned. ;-(.

I'm sure Forth Inc wouldn't mind.

Andrew.

Anton Ertl

unread,
Apr 12, 2013, 9:43:39 AM4/12/13
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>>>Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>>>>>> It clearly states that < is supposed
>>>>>> to work on signed numbers. So the behaviour you describe would be a
>>>>>> defect ("or bug") as in "deviation from the specification".
>>>>>
>>>>>Not really, no, any more than this is a deviation from the
>>>>>specification:
>>>>>
>>>>>2147483647 dup + . -2 ok
>>>>
>>>> Do you really claim that
>>>>
>>>> -30000 30000 <
>>>>
>>>> producing false is correct when the specification says "signed
>>>> comparison", not "circular comparison"?
>>>
>>>It's as [in]correct as
>>>
>>>+ n1 n2 - sum
>>>
>>>Return the sum of n1 + n2
>>>
>>>failing to do just that.
>>
>> A + that does not implement the specification is incorrect, sure.
>> However, in ANS Forth, the specification of + is:
>
>If we were talking about ANS Forth that would be relevant.

Not ANS Forth? "2147483647 dup + ." is certainly beyond fig-Forth,
Forth-79, and Forth-83. So what Forth specification are you
discussing?

In any case, independently of the particular Forth:

A + that does not implement the specification is incorrect, sure.

And a < that does not implement the specification is incorrect as
well.

": < - 0< ;" would be an incorrect implementation of < wrt fig-Forth's
specification, while fig-Forth's actual < is a correct implementation.
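
A minimal sketch of the difference, modelling 16-bit cells in Python. The names wrap16 and circular_less are hypothetical; circular_less is meant to behave like ": < - 0< ;" on a two's-complement machine with wrapping subtraction.

```python
def wrap16(x):
    """Reduce an integer to a 16-bit two's-complement value (hypothetical helper)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def circular_less(a, b):
    """Model of ': < - 0< ;' : subtract with wraparound, then test the sign."""
    return wrap16(a - b) < 0


# Operands close together: both comparisons agree.
print(circular_less(-3, 5), -3 < 5)              # True True

# Operands more than 32767 apart: the difference wraps and the sign flips.
print(circular_less(-30000, 30000), -30000 < 30000)  # False True
```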

>>>They are both a direct consequence of the fact that computer
>>>arithmetic is (usually) circular.
>>
>> Signed comparison is not at all circular.
>
>Indeed not, but this isn't one. That's why we're talking about it.

"This" isn't one what? And what is "it"?

WJ

unread,
Apr 12, 2013, 10:41:25 AM4/12/13
to
Hugh Aguilar wrote:

> This severe memory limitation of PolyForth is why I chose UR/Forth ---
> it had machine code in one 64K segment (CS), threaded code and data in
> another (DS), and the dictionary headers in another (ES), roughly
> comparable to the Small memory-model. Also, UR/Forth was DTC, so NEXT
> was:
> lodsw
> jmp ax
> I wrote that 65c02 cross-compiler in 16-bit UR/Forth --- PolyForth
> would not have been capable of supporting such a large program. Also,
> I tested 16-bit UR/Forth and found it to be essentially the same speed
> as Small memory-model Borland Turbo C on every benchmark. All in all,
> UR/Forth was capable of supporting professional programming ---

And UR/Forth was killed by the ANS standard?

Andrew Haley

unread,
Apr 12, 2013, 12:08:13 PM4/12/13
to
fig-FORTH, and explicitly so. I absolutely agree that this would be a
bug in any ANS Forth.

> ": < - 0< ;" would be an incorrect implementation of < wrt
> fig-Forth's specification, while fig-Forth's actual < is a correct
> implementation.

fig-FORTH's actual < is

: < - 0< ;

I'm looking at the original source code.

Around that time, when standardization was a work in progress, there
was a lot of discussion about such things. Sanderson and Brodie's
paper explained why Forth Inc preferred the circular relational. The
fig-FORTH model was based on microFORTH, which was created by Dean
Sanderson. It is therefore not at all surprising that < is defined in
the same way. It is not a mistake.

Andrew.

Elizabeth D. Rather

unread,
Apr 12, 2013, 2:30:08 PM4/12/13
to
Inasmuch as this is a Rochester paper, any control of it properly
belongs to Forsley or whatever remains of that organization.

I personally think it would be great for all the Rochester proceedings
to be published online.

Albert van der Horst

unread,
Apr 12, 2013, 2:50:38 PM4/12/13
to
In article <D4SdnUAv5-fwrPXM...@supernews.com>,
Okay. It is a curious historical side note.
I've looked into 6 or 7 versions (mostly even in print) and they
differ from this.

>
>Around that time, when standardization was a work in progress, there
>was a lot of discussion about such things. Sanderson and Brodie's
>paper explained why Forth Inc preferred the circular relational. The
>fig-FORTH model was based on microFORTH, which was created by Dean
>Sanderson. It is therefore not at all surprising that < is defined in
>the same way. It is not a mistake.

I've been giving you quotes from the figForth glossary and installation
manual, which was widely circulated. No one except you manages to find
traces of fig-Forth's original sin. Very early in the game
fig-Forth apparently repented. So we should pardon it once and
for all.

Albert van der Horst

unread,
Apr 12, 2013, 2:58:59 PM4/12/13
to
In article <hrqdnYrGEpAtz_XM...@supernews.com>,
Elizabeth D. Rather <era...@forth.com> wrote:
>On 4/11/13 11:18 PM, Andrew Haley wrote:
>> Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>>>> If anyone has this paper, it'd be nice to get it scanned:
>>>>
>>>> Leo Brodie and Dean Sanderson, "Division, Relationals, and Loops",
>>>> _1981 Rochester FORTH Standards Conference_, p. 117-121.
>>>
>>> We have a lot of those Rochester things. Waiting for everybody to be dead
>>> for 75 years to get it scanned. ;-(.
>>
>> I'm sure Forth Inc wouldn't mind.
>
>Inasmuch as this is a Rochester paper, any control of it properly
>belongs to Forsley or whatever remains of that organization.

And that is the problem. It is a great pain to find out even
who owns the copyright. I think about pushing for legislation that
makes it hard for lawyers to come out of the woods and
shoot down people who honestly make an effort to preserve cultural
and technical heritage.

>
>I personally think it would be great for all the Rochester proceedings
>to be published online.

Of course. And you can be sure that if it appears online "illegally"
that I have nothing to do with it.

>
>Cheers,
>Elizabeth

Elizabeth D. Rather

unread,
Apr 12, 2013, 5:29:47 PM4/12/13
to
On 4/12/13 8:58 AM, Albert van der Horst wrote:
> In article <hrqdnYrGEpAtz_XM...@supernews.com>,
> Elizabeth D. Rather <era...@forth.com> wrote:
>> On 4/11/13 11:18 PM, Andrew Haley wrote:
>>> Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>>>>> If anyone has this paper, it'd be nice to get it scanned:
>>>>>
>>>>> Leo Brodie and Dean Sanderson, "Division, Relationals, and Loops",
>>>>> _1981 Rochester FORTH Standards Conference_, p. 117-121.
>>>>
>>>> We have a lot of those Rochester things. Waiting for everybody to be dead
>>>> for 75 years to get it scanned. ;-(.
>>>
>>> I'm sure Forth Inc wouldn't mind.
>>
>> Inasmuch as this is a Rochester paper, any control of it properly
>> belongs to Forsley or whatever remains of that organization.
>
> And that is the problem. It is a great pain to find out even
> who owns the copyright. I think about pushing for legislation that
> makes it hard for lawyers to come out of the woods and
> shoot down people who honestly make an effort to preserve cultural
> and technical heritage.

Not at all. Ask Larry Forsley, he's still around (though not on clf). I
have his email, although I don't feel comfortable publishing it here. I
have sent it privately to Albert and will give it privately to anyone
who is serious about pursuing this.

Cheers,
Elizabeth

>> I personally think it would be great for all the Rochester proceedings
>> to be published online.
>
> Of course. And you can be sure that if it appears online "illegally"
> that I have nothing to do with it.
>
>>
>> Cheers,
>> Elizabeth
>
>
> Groetjes Albert
>


--

Ed

unread,
Apr 13, 2013, 12:57:18 AM4/13/13
to
Albert van der Horst wrote:
> In article <kk2kju$hu4$1...@speranza.aioe.org>, Ed <inv...@nospam.com> wrote:
> ...
> >Don't know if it was officially a bug. Certainly the 8080 v1.1 listing
> >I had corrected it. The technique must have been in wide use because <SNIP>
>
> figForth may not have been an official standard, its glossary was clear
> and unambiguous, at least at this point. (You can find it on my website
> and other places.). It clearly states that < is supposed to work on
> signed numbers. So the behaviour you describe would be a defect ("or bug")
> as in "deviation from the specification".

The Fig Model manual I have defines : < - 0< ; so it's fair to say it
was "official" as far as Fig-Forth was concerned. Short of asking Bill
Ragsdale, we'll never know whether it was the behaviour he intended.
We do know that most implementors, being made aware of its
limitations, chose to "fix" it.

Once Forth-79 came out, its fate was sealed. Assuming it was originally
in Forth Inc products, the question would be at what point in time did
they elect to "fix" it ...



Ed

unread,
Apr 13, 2013, 1:34:21 AM4/13/13
to
By that time I suspect Ray had done all he wanted in Forth. AFAIK
he went into medicine. It beats living out one's days on c.l.f. Heck,
even gardening is better :)



Andrew Haley

unread,
Apr 13, 2013, 4:57:23 AM4/13/13
to
Sure, but it's the fig-FORTH model (for 6502, written in Forth) on
which all the others were based.

>>Around that time, when standardization was a work in progress, there
>>was a lot of discussion about such things. Sanderson and Brodie's
>>paper explained why Forth Inc preferred the circular relational.
>>The fig-FORTH model was based on microFORTH, which was created by
>>Dean Sanderson. It is therefore not at all surprising that < is
>>defined in the same way. It is not a mistake.
>
> I've been giving you quotes from the figForth glossary and installation
> manual, which was widely circulated. No one except you manages to
> find traces of fig-Forth's original sin.

That's their problem, not mine. If you've got the Installation Manual
with the source you'll find it on Block 38.

> Very early in the game fig-Forth apparently repented. So we should
> pardon it once and for all.

If you've got the proceedings with Sanderson and Brodie's paper, you
should read it.

Andrew.

Anton Ertl

unread,
Apr 13, 2013, 7:59:20 AM4/13/13
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>> In article <D4SdnUAv5-fwrPXM...@supernews.com>,
>> Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
>>>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>>>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>>>>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>>>>>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>>>>>>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>>>>>>>Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
>>>>>>>>>> It clearly states that < is supposed
>>>>>>>>>> to work on signed numbers. So the behaviour you describe would be a
>>>>>>>>>> defect ("or bug") as in "deviation from the specification".
>>>>>>>>>
>>>>>>>>>Not really, no, any more than this is a deviation from the
>>>>>>>>>specification:
>>>>>>>>>
>>>>>>>>>2147483647 dup + . -2 ok
...
>>>fig-FORTH's actual < is
>>>
>>>: < - 0< ;
>>>
>>>I'm looking at the original source code.
>>
>> Okay. It is a curious historical side note.
>> I've looked into 6 or 7 versions (mostly even in print) and they
>> differ from this.
>
>Sure, but it's the fig-FORTH model (for 6502, written in Forth) on
>which all the others were based

So the first implementation contained a bug, or they would not have
changed it. So they did not share your idea that it is "not really,
no" a bug.

Although, looking at two versions, the code still looks funny:

6502 v1.1:

L1254   .BYTE $81,$BC
        .WORD L1246     ; link to U<
LESS    .WORD *+2
        SEC
        LDA 2,X
        SBC 0,X         ; subtract
        LDA 3,X
        SBC 1,X
        STY 3,X         ; zero high byte
        BVC L1258
        EOR #$80        ; correct overflow
L1258   BPL L1260
        INY             ; invert boolean
L1260   STY 2,X         ; leave boolean
        JMP POP

8086 v1.0:
        DB   81H
        DB   '<'+80H
        DW   EQUAL-4
LESS    DW   $+2
        POP  DX
        POP  AX
        MOV  BX,DX
        XOR  BX,AX      ; TEST FOR EQUAL SIGNS
        JS   LES1       ; SIGNS NOT THE SAME
        SUB  AX,DX
LES1:   OR   AX,AX      ; TEST SIGN BIT
        MOV  AX,0       ; ASSUME FALSE CONDITION
        JNS  LES2       ; NOT LESS THAN
        INC  AX         ; TRUE (1)
LES2:   JMP  APUSH

It seems the authors did not like the cmp instructions of these
architectures. I am not sure that these implementations are correct.
Correct implementations that are shorter are certainly possible.

What's even funnier is:

6502 v1.1:

; U<
; Unsigned less than
;
L1246   .BYTE $82,'U',$BC
        .WORD L1244     ; link to =
ULESS   .WORD DOCOL
        .WORD SUB       ; subtract two values
        .WORD ZLESS     ; test sign
        .WORD SEMIS

Why have U< if you are implementing circular comparison? And vice
versa, why implement U< as circular comparison?

Concerning your integer overflow argument, I found this within two
minutes of reading the "Installation Manual and Glossary"
<http://www.forth.org/fig-forth/contents.html>:

|Unless otherwise noted, all references to numbers are for 16 bit signed
|integers. [...]
|
|All arithmetic is implicitly 16 bit signed integer math, with error and
|under-flow indication unspecified.

So wraparound on overflow is compliant with the specification,
circular comparison for < is not.

Andrew Haley

unread,
Apr 13, 2013, 3:47:32 PM4/13/13
to
That does not follow. Perhaps they thought it was a bug, but were
wrong. Perhaps they thought that even if it wasn't a bug it would
surprise people.

> So they did not share your idea that it is "not really, no" a bug.

Maybe they didn't. They may not have understood why < was defined
that way, taking it for a bug.

> Concerning your integer overflow argument, I found this within two
> minutes of reading the "Installation Manual and Glossary"
> <http://www.forth.org/fig-forth/contents.html>:
>
> |Unless otherwise noted, all references to numbers are for 16 bit signed
> |integers. [...]
> |
> |All arithmetic is implicitly 16 bit signed integer math, with error and
> |under-flow indication unspecified.

Yes, I read that.

> So wraparound on overflow is compliant with the specification,
> circular comparison for < is not.

You're missing the point. At the time there was some discussion about
how signed comparisons should be done, and this was one opinion. As
was once explained by someone at Forth Inc, (Sanderson?) "we can't
promise you that a word does what you want for your application, but
we can promise that we've thought about it." Sometimes the circular
relational is appropriate, sometimes not. But in any case, this is an
argument that was over many years ago.
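
One case where subtract-and-test-sign is commonly the wanted behaviour is ordering readings of a free-running counter that wraps. The sketch below illustrates that case only; it is an assumption about the kind of application meant and not a summary of the paper. The names wrap16 and tick_before are hypothetical.

```python
def wrap16(x):
    """Reduce an integer to a 16-bit two's-complement value (hypothetical helper)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def tick_before(t1, t2):
    """True if counter reading t1 precedes t2.

    Readings are 16-bit values from a wrapping counter; the answer is
    meaningful only while the two readings are fewer than 32768 ticks apart.
    """
    return wrap16(t1 - t2) < 0


# The counter read 65530, then wrapped and read 4 ten ticks later.
print(tick_before(65530, 4))  # True: the circular test sees 65530 as earlier
print(65530 < 4)              # False: a plain unsigned comparison does not
```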

The question I have addressed is whether < , as it was written, was an
error, as has been suggested. I have pointed out that it probably
wasn't, and I have referred you to the paper where the superiority of
the circular relational for the kinds of applications Forth was used
for at the time was described. Arguing about whether it was
"compliant with the specification" is irrelevant in the context of
this historical argument.

Andrew.

rickman

unread,
Apr 12, 2013, 10:44:22 AM4/12/13
to
Can you explain that? Comparison is done by subtracting. Subtraction
can indeed produce overflows for two numbers of the same type. -32000
32000 - does produce an overflow in a 16 bit signed number system.
Likewise 1 2 - produces an overflow in a 16 bit unsigned number system.

Even if you ignore the overflows and just work with the result you get,
cases can't be distinguished. -32768 32767 - produces 1 as does 2 1 -,
but they should produce different results in a comparison test. Do I
have this wrong?

I'm not clear what your point is.

--

Rick

Bernd Paysan

unread,
Apr 15, 2013, 10:20:53 AM4/15/13
to
rickman wrote:
> Can you explain that? Comparison is done by subtracting. Subtraction
> can indeed produce overflows for two numbers of the same type. -32000
> 32000 - does produce an overflow in a 16 bit signed number system.
> Likewise 1 2 - produces an overflow in a 16 bit unsigned number system.

Yes, and the rule for comparison is that a<b is true if (a-b<0) xor
overflow. Many processors have flags indicating signed overflow and carry
(unsigned overflow, sometimes on subtraction, it's borrow, which is the
inverse).
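
A minimal sketch of that rule on 16-bit cells. Python has no overflow flag, so the sketch models it as "the true difference does not fit in a cell"; wrap16 and less_via_overflow are hypothetical names.

```python
def wrap16(x):
    """Reduce an integer to a 16-bit two's-complement value (hypothetical helper)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def less_via_overflow(a, b):
    """a < b computed as (sign of the wrapped difference) xor (signed overflow)."""
    diff = a - b                              # exact difference, unbounded
    negative = wrap16(diff) < 0               # what the sign flag would show
    overflow = not (-32768 <= diff <= 32767)  # what the overflow flag would show
    return negative != overflow


print(less_via_overflow(-30000, 30000))  # True
print(less_via_overflow(30000, -30000))  # False
```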

> Even if you ignore the overflows and just work with the result you get,
> cases can't be distinguished. -32768 32767 - produces 1 as does 2 1 -,
> but they should produce different results in a comparison test. Do I
> have this wrong?
>
> I'm not clear what your point is.

That it can be done right and has been done right.

The things I'm missing are mixed-type comparisons. If you compare
signed<unsigned, it's true if (in Forth words):

: su< ( n u -- flag ) over 0< >r u< r> or ;
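
A sketch of the same logic in Python, assuming 16-bit cells; su_less is a hypothetical name that follows the Forth definition step by step.

```python
def su_less(n, u):
    """Mixed comparison: n is signed 16-bit, u is unsigned 16-bit.

    Mirrors 'over 0< >r u< r> or': a negative n is below every unsigned
    value; otherwise the two cells are compared as unsigned numbers.
    """
    n_is_negative = n < 0          # over 0< >r
    unsigned_less = (n & 0xFFFF) < u   # u<  (n's cell pattern read as unsigned)
    return unsigned_less or n_is_negative  # r> or


print(su_less(-1, 0))      # True
print(su_less(1, 65535))   # True
print(su_less(5, 3))       # False
```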

There are a number of cases where mixed operations are quite useful. I like
mixed s*u multiplication and s/u division, because you usually have one
factor without sign, and then there's no need to handle the negative case,
which costs you a bit on multiplication and some headaches on division.
IIRC, Forth Inc. had s/u division during the early days, and was quite happy
with that.

Anton Ertl

unread,
Apr 15, 2013, 11:46:15 AM4/15/13
to
rickman <gnu...@gmail.com> writes:
>On 4/11/2013 9:24 AM, Anton Ertl wrote:
>> Signed comparison is not at all circular. Nor is unsigned comparison.
>> And there cannot be an overflow when you compare two numbers of the
>> same type. The two possible results are definitely representable in a
>> cell.
>
>Can you explain that? Comparison is done by subtracting.

On all architectures that I have programmed, it is done with a
comparison instruction; e.g., CMP on 6502, CMP on IA-32 and AMD64, CMP
on the 68000, SLT/SLTU (signed/unsigned) on MIPS, cmplt/cmpult on
Alpha etc.

>Even if you ignore the overflows and just work with the result you get,
>cases can't be distinguished. -32768 32767 - produces 1 as does 2 1 -,
>but they should produce different results in a comparison test. Do I
>have this wrong?

So, just doing a subtraction is obviously insufficient to achieve a
proper comparison. If you insist on not using a proper comparison
instruction, but have an overflow flag, you can do what Bernd
described. If you don't even have that, you can do it, e.g., thus
(untested):

: < { a b -- f }
  a b xor 0< if   \ different signs?
    a 0<
  else
    a b - 0<      \ for the same sign, circular should work.
  then ;
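
The same idea modelled in Python on 16-bit values, checked against a handful of boundary cases rather than exhaustively; wrap16 and less_no_flags are hypothetical names.

```python
def wrap16(x):
    """Reduce an integer to a 16-bit two's-complement value (hypothetical helper)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def less_no_flags(a, b):
    """Signed a < b using only xor, wrapping subtraction and a sign test."""
    if (a ^ b) < 0:              # a b xor 0<  : the signs differ
        return a < 0             # the negative operand is the smaller one
    return wrap16(a - b) < 0     # same sign: the difference cannot overflow


print(less_no_flags(-30000, 30000))  # True
print(less_no_flags(-32768, 32767))  # True
print(less_no_flags(2, 1))           # False
```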

Sieur de Bienville

unread,
Apr 15, 2013, 10:19:26 PM4/15/13
to
On Apr 15, 10:46 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> rickman <gnu...@gmail.com> writes:
> >On 4/11/2013 9:24 AM, Anton Ertl wrote:
> >> Signed comparison is not at all circular.  Nor is unsigned comparison.
> >> And there cannot be an overflow when you compare two numbers of the
> >> same type. The two possible results are definitely representable in a
> >> cell.
>
> >Can you explain that?  Comparison is done by subtracting.
>
> On all architectures that I have programmed, it is done with a
> comparison instruction; e.g., CMP on 6502, CMP on IA-32 and AMD64, CMP
> on the 68000, SLT/SLTU (signed/unsigned) on MIPS, cmplt/cmpult on
> Alpha etc.

I don't remember how the 68k CMP instruction works, but the IA-32 and
the AMD64 CMP instruction is a nondestructive version of the SUB
instruction. So on a PC it is in fact done by subtracting.

Virtually,
Michael Morris

Anton Ertl

unread,
Apr 16, 2013, 8:53:00 AM4/16/13
to
Sieur de Bienville <morrim...@gmail.com> writes:
>I don't remember how the 68k CMP instruction works, but the IA-32 and
>the AMD64 CMP instruction is a nondestructive version of the SUB
>instruction. So on a PC it is in fact done by subtracting.

CMP sets the flags, and the flags contain enough information to
complete the comparison. E.g., if you do JL/SETL (L for less), you
get a signed comparison and if you do JB/SETB (B for below), you get
an unsigned comparison. You can even get a circular comparison with
JS/SETS (S for sign).

I guess you could use SUB instead of CMP, and it would still work; the
point is that, indeed, the number result of a modulo subtraction does
not contain enough information for a proper signed or unsigned
comparison; the flags result of a CMP instruction does have that
information.
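
A sketch of how the three conditions fall out of the flags, modelled for 16-bit operands. It assumes the usual x86 definitions (JL: SF differs from OF, JB: CF set, JS: SF set); cmp_flags is a hypothetical Python helper, not an instruction.

```python
def cmp_flags(a, b):
    """Return (CF, SF, OF) as a 16-bit CMP a, b would set them."""
    a &= 0xFFFF
    b &= 0xFFFF
    r = (a - b) & 0xFFFF                    # the discarded subtraction result
    cf = a < b                              # borrow out of the top bit
    sf = bool(r & 0x8000)                   # sign bit of the result
    of = bool((a ^ b) & (a ^ r) & 0x8000)   # operand signs differ and the
                                            # result sign differs from a's
    return cf, sf, of


cf, sf, of = cmp_flags(-30000, 30000)
print("signed   (JL):", sf != of)  # True
print("unsigned (JB):", cf)        # False: 35536 is not below 30000
print("circular (JS):", sf)        # False
```

The numeric result r alone carries only SF; the extra information sits in CF and OF.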

Elizabeth D. Rather

unread,
Apr 16, 2013, 2:12:35 PM4/16/13
to
I don't remember general mixed arithmetic, but I do know that all the
MOD words (MOD, /MOD, M/MOD, */MOD, etc.) featured an unsigned *divisor*
at FORTH, Inc for many years. This made a cleaner implementation, and we
never encountered a "real life" need for a signed mod divisor. We
changed in the 90's to be ANS compliant, having failed to persuade the
TC of our point of view.

Cheers,
Elizabeth

A.K.

unread,
Apr 16, 2013, 4:21:13 PM4/16/13
to
Back in those days division was horribly slow. So one preferred
algorithms that avoided division or modulus if possible.

Andreas


Bernd Paysan

unread,
Apr 16, 2013, 4:33:05 PM4/16/13
to
Elizabeth D. Rather wrote:
> I don't remember general mixed arithmetic, but I do know that all the
> MOD words (MOD, /MOD, M/MOD, */MOD, etc.) featured an unsigned *divisor*
> at FORTH, Inc for many years. This made a cleaner implementation, and we
> never encountered a "real life" need for a signed mod divisor.

Nor did I. I did, however, encounter the need for a full-range unsigned mod
divisor ([1..ffff]; this sort of integer arithmetic is very common on 16
bit platforms), with a signed dividend. Fortunately, you can trivially
build that out of um/mod:

: m/mod ( d u -- umod nquot ) \ Forth Inc. style
>r s>d r@ and + r> um/mod ;
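
A sketch of why the correction works, modelling 16-bit cells and a 32-bit double in Python. The functions um_mod and m_mod are hypothetical stand-ins for the Forth words, and the sketch assumes the quotient fits in a signed cell.

```python
def um_mod(lo, hi, u):
    """Model of UM/MOD: unsigned 32-bit (hi:lo) divided by unsigned 16-bit u."""
    ud = (hi << 16) | lo
    return ud % u, (ud // u) & 0xFFFF      # (remainder, quotient cell)


def m_mod(d, u):
    """Floored division of a signed double d by an unsigned divisor u."""
    lo = d & 0xFFFF
    hi = (d >> 16) & 0xFFFF
    if hi & 0x8000:                        # s>d r@ and + : for negative d,
        hi = (hi + u) & 0xFFFF             # add u to the high cell
    rem, quot = um_mod(lo, hi, u)          # r> um/mod
    if quot & 0x8000:                      # read the quotient cell as signed
        quot -= 0x10000
    return rem, quot


print(m_mod(-7, 3))       # (2, -3)
print(m_mod(7, 3))        # (1, 2)
print(m_mod(-1, 65535))   # (65534, -1)
```

Adding u to the high cell adds u times 65536 to the dividend, which leaves the remainder unchanged and shifts the quotient by exactly one cell's worth, so the wrapped quotient cell is the floored signed quotient.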

> We
> changed in the 90's to be ANS compliant, having failed to persuade the
> TC of our point of view.

This is a bit funny, because there are already two possible signed
divisions; adding a third one wouldn't have done more harm ;-). Due to the
different rounding, you can use the two standard divisions reliably only for
two positive inputs, which means you could as well use um/mod.

The argument back then was that CPUs have exactly these signed division
instructions (especially with symmetric rounding). However, CPUs need some
time for the overhead of correct sign treatment too, so floored division
costs almost the same, and Forth Inc.'s division mode is even faster.

The signed*unsigned multiplication is particularly handy, if you interpret
your numbers as fractions in the range [0..1[, and then do a mixed
signed-unsigned multiplication and a NIP. As above, you can easily use um* to
define it:

: usm* ( u n -- d ) 2dup 0< and >r um* r> - ;
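
A sketch of the correction step on 16-bit cells, in Python; um_star and usm_star are hypothetical stand-ins for the Forth words.

```python
def um_star(u1, u2):
    """Model of UM*: unsigned 16 x 16 -> 32 bit product as (lo, hi) cells."""
    p = (u1 & 0xFFFF) * (u2 & 0xFFFF)
    return p & 0xFFFF, p >> 16


def usm_star(u, n):
    """Unsigned u times signed n, built from the unsigned multiply."""
    correction = u if n < 0 else 0        # 2dup 0< and >r
    lo, hi = um_star(u, n)                # um*  (n's cell read as n + 65536)
    hi = (hi - correction) & 0xFFFF       # r> - : remove the extra u * 65536
    d = (hi << 16) | lo
    return d - (1 << 32) if d & 0x80000000 else d   # read as a signed double


print(usm_star(3, -2))          # -6
print(usm_star(65535, -32768))  # -2147450880
```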

rickman

unread,
Apr 16, 2013, 5:29:38 PM4/16/13
to
On 4/16/2013 8:53 AM, Anton Ertl wrote:
> Sieur de Bienville<morrim...@gmail.com> writes:
>> I don't remember how the 68k CMP instruction works, but the IA-32 and
>> the AMD64 CMP instruction is a nondestructive version of the SUB
>> instruction. So on a PC it is in fact done by subtracting.
>
> CMP sets the flags, and the flags contain enough information to
> complete the comparison. E.g., if you do JL/SETL (L for less), you
> get a signed comparison and if you do JB/SETB (B for below), you get
> an unsigned comparison. You can even get a circular comparison with
> JS/SETS (S for sign).
>
> I guess you could use SUB instead of CMP, and it would still work; the
> point is that, indeed, the number result of a modulo subtraction does
> not contain enough information for a proper signed or unsigned
> comparison; the flags result of a CMP instruction does have that
> information.

So, if I am understanding this correctly, there are no lower-level Forth
primitives that can be used to implement an efficient < word. It
depends on coding this in the instruction set of the CPU used? In fact,
it *requires* the CPU to offer an overflow detection, right?

--

Rick

Elizabeth D. Rather

unread,
Apr 16, 2013, 5:44:33 PM4/16/13
to
I can't really imagine < not being a primitive, just as I can't imagine
+ not being one.

rickman

unread,
Apr 16, 2013, 6:00:47 PM4/16/13
to
I don't follow the "comparison" at all (if I can risk a pun). I believe
several times in this thread the < word has been given as a Forth
definition. Whether it is implemented as a primitive or not is an
implementation issue. Whether it *must* be implemented as a primitive
says something about what the hardware requirements are for Forth.

Interesting that I've never seen anything indicating just what the
hardware requirements for Forth are. In general Forth avoids the use of
flags, rather the only test is to compare the top of stack with zero.
So to require an overflow detection, which is pretty awkward to do
without a flag, is interesting to say the least. I was going to delete
the overflow detection from my CPU design, I guess I'll hold off on that.

--

Rick

Andrew Haley

unread,
Apr 16, 2013, 6:33:09 PM4/16/13
to
rickman <gnu...@gmail.com> wrote:

> I don't follow the "comparison" at all (if I can risk a pun). I
> believe several times in this thread the < word has been given as a
> Forth definition. Whether it is implemented as a primitive or not
> is an implementation issue. Whether it *must* be implemented as a
> primitive says something about what the hardware requirements are
> for Forth.
>
> Interesting that I've never seen anything indicating just what the
> hardware requirements for Forth are. In general Forth avoids the
> use of flags, rather the only test is to compare the top of stack
> with zero. So to require an overflow detection, which is pretty
> awkward to do without a flag, is interesting to say the least. I
> was going to delete the overflow detection from my CPU design, I
> guess I'll hold off on that.

Novix didn't have an overflow flag either, and did < as - 0< .
I don't think it ever was a problem.

Andrew.

Elizabeth D. Rather

Apr 16, 2013, 6:46:37 PM
You're right, it's an implementation issue. Still, from the POV of an
implementer, I can't imagine adding the overhead of a call to what
should be a very few instructions.

Andrew correctly notes that Novix did it in high level, but calls were
essentially free on that chip. It's a special case.

Anton Ertl

Apr 17, 2013, 6:23:43 AM
rickman <gnu...@gmail.com> writes:
>So, if I am understanding this correctly, there is no lower level Forth
>primitives that can be used to implement an efficient < word.

Yes; nor U<. Unless of course you consider the implementation I gave
efficient enough given the frequency of occurrence of < and U<.
Looking at <http://www.complang.tuwien.ac.at/forth/peep/sorted>, 0.04%
of the dynamically executed primitives are < and 0.01% are U<. You
may or may not implement some of the words that are primitives in
Gforth with < or U<, so YMMV (e.g., 0.5% of the primitives are (LOOP),
and I am too lazy to work out if that can be implemented just with 0<).

>It
>depends on coding this in the instruction set of the CPU used? In fact,
>it *requires* the CPU to offer an overflow detection, right?

No. You could also have direct instructions for < and U<. E.g., MIPS
has SLT and SLTU, and Alpha has CMPLT and CMPULT.

rickman

Apr 17, 2013, 1:09:36 PM
I still think we are talking about two different issues. If the
definition of < is different from - 0< and requires the detection of
overflow, there is no way to implement < on a machine without overflow
detection, at least without a lot of extra work. I believe overflow can
be defined in terms of the sign bits, comparing them before and after
the subtraction, so it would be quite messy indeed.
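
To make the overflow problem concrete, here is a small Python model of a 16-bit two's-complement cell. The cell width and the function names are assumptions made for illustration, not taken from any particular Forth. It shows one pair of operands where `- 0<` gives the wrong answer, and one way to detect the overflow from sign bits alone.

```python
BITS = 16                      # assumed cell width
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)

def to_signed(cell):
    """Read a BITS-wide cell as a two's-complement integer."""
    cell &= MASK
    return cell - (1 << BITS) if cell & SIGN else cell

def naive_less(n1, n2):
    """Model of `- 0<`: wrapping subtract, then test the sign bit."""
    return bool(((n1 - n2) & MASK) & SIGN)

def sub_overflows(n1, n2):
    """Signed overflow of n1 - n2, from sign bits only: the operands have
    different signs and the result's sign differs from n1's."""
    diff = (n1 - n2) & MASK
    return bool((n1 ^ n2) & (n1 ^ diff) & SIGN)

n1, n2 = 0x7FFF, 0xFFFF                # 32767 and -1
print(to_signed(n1) < to_signed(n2))   # False: the correct answer
print(naive_less(n1, n2))              # True: `- 0<` gets it wrong
print(sub_overflows(n1, n2))           # True: the subtraction overflowed
```

In this model `- 0<` disagrees with the true signed comparison exactly when the subtraction overflows, which is why a correct < needs either an overflow indication or a different approach.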

How is it that < can be defined as - 0< on the Novix and it not be a
problem? Was this not an ANS compatible Forth?

--

Rick

Andrew Haley

Apr 17, 2013, 1:20:58 PM
rickman <gnu...@gmail.com> wrote:
>
> How is it that < can be defined as - 0< on the Novix and it not be a
> problem? Was this not an ANS compatible Forth?

It was not. Novix Forth was odd in many ways.

Andrew.

Lars Brinkhoff

Apr 17, 2013, 1:26:16 PM
Sieur de Bienville <morrim...@gmail.com> writes:
> I don't remember how the 68k CMP instruction works

Finally an excuse to break out the old M68000PM/AD.


CMP

Operation: Destination - Source => cc

Description: Subtracts the source operand from the destination data
register and sets the condition codes to the result; the data
register is not changed. The size of the operation can be byte,
word, or long.

Condition Codes:

X - Not affected.
N - Set if the result is negative; cleared otherwise.
Z - Set if the result is zero; cleared otherwise.
V - Set if an overflow occurs; cleared otherwise.
C - Set if a borrow occurs; cleared otherwise.


And then e.g. BLT uses this test: N && !V || !N && V.
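
As a rough sketch of how those condition codes combine, the following Python model computes N, Z, V and C for `dest - source` and then applies the BLT test quoted above. The function names and the default 16-bit width are assumptions for illustration; the X flag is left out because CMP does not affect it.

```python
def cmp_flags(dest, source, bits=16):
    """Return (N, Z, V, C) for dest - source on a bits-wide machine."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    dest &= mask
    source &= mask
    result = (dest - source) & mask
    n = bool(result & sign)                               # result negative
    z = result == 0                                       # result zero
    v = bool((dest ^ source) & (dest ^ result) & sign)    # signed overflow
    c = source > dest                                     # unsigned borrow
    return n, z, v, c

def blt_taken(dest, source, bits=16):
    """BLT condition as quoted: N && !V || !N && V, i.e. N xor V."""
    n, _, v, _ = cmp_flags(dest, source, bits)
    return (n and not v) or (not n and v)

print(cmp_flags(0x7FFF, 0xFFFF))   # (True, False, True, True)
print(blt_taken(0x7FFF, 0xFFFF))   # False: 32767 is not less than -1
```

Here N alone would claim "less than", and V corrects it, which is the job the overflow flag does in a signed compare.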

> but the IA-32 and the AMD64 CMP instruction is a nondestructive
> version of the SUB instruction.

So essentially the same.

Brad Eckert

Apr 17, 2013, 1:32:25 PM
On Tuesday, April 16, 2013 2:29:38 PM UTC-7, rickman wrote:
> So, if I am understanding this correctly, there is no lower level Forth
> primitives that can be used to implement an efficient < word. It
> depends on coding this in the instruction set of the CPU used? In fact,
> it *requires* the CPU to offer an overflow detection, right?
>
From eForth:
: < 2DUP XOR 0< IF DROP 0< EXIT THEN - 0< ;

Without an overflow flag, < is wordy. On my Forth CPU, subtraction sets overflow and carry flags. There are words to overwrite the TOS with a flag depending on various combinations of C,V,N,Z states. That's how I support signed and unsigned <, <=, >, >=, etc. It works pretty well, although if you have interrupts you need a way to save and restore the flags.
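
For readers who want to check the eForth definition without a Forth at hand, here is a minimal Python model of it. The function name and the 16-bit default are assumptions; the comments map each branch back to the words in the definition.

```python
def eforth_less(n1, n2, bits=16):
    """Model of  : < 2DUP XOR 0< IF DROP 0< EXIT THEN - 0< ;"""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    n1 &= mask
    n2 &= mask
    if (n1 ^ n2) & sign:                      # 2DUP XOR 0< IF  (signs differ)
        return bool(n1 & sign)                #   DROP 0< EXIT  (n1 negative means n1 < n2)
    return bool(((n1 - n2) & mask) & sign)    # THEN - 0<       (same sign, cannot overflow)

print(eforth_less(0x7FFF, 0xFFFF))   # False: 32767 < -1
print(eforth_less(0xFFFF, 0x7FFF))   # True:  -1 < 32767
```

The subtraction is only reached when both operands have the same sign, and in that case the difference always fits in a cell, so no overflow flag is needed.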

nobody@nowhere

Apr 17, 2013, 1:36:24 PM


In article <e354336c-b9ec-4544...@googlegroups.com>, Brad
Eckert <hwf...@gmail.com> wrote:
>On Tuesday, April 16, 2013 2:29:38 PM UTC-7, rickman wrote:
>> So, if I am understanding this correctly, there is no lower level Forth
>> primitives that can be used to implement an efficient < word. It
>> depends on coding this in the instruction set of the CPU used? In fact,
>> it *requires* the CPU to offer an overflow detection, right?
>>
>From eForth:
>: < 2DUP XOR 0< IF DROP 0< EXIT THEN - 0< ;
>
>Without an overflow flag, < is wordy. On my Forth CPU, subtraction sets
>overflow and carry flags. There are words to overwrite the TOS with a flag
>depending on various combinations of C,V,N,Z states. That's how I support
>signed and unsigned <, <=, >, >=, etc. It works pretty well, although if
>you have interrupts you need a way to save and restore the flags.
>

For reference, these are the implementations of < on the Forth I use:

< implementations

SF80C552 (Subroutine threaded 8051 FORTH)


TRUE: ; push FFFF
MOV A,#0FFh
SJMP PUSH_AA

; ( --- 0 )

FALSE: ; push 0000
CLR A

; ( -- n )

PUSH_AA: ; push ACC,ACC
DEC R0
MOV @R0,A
PUSH_A1:
DEC R0
MOV @R0,A
RET



;*****************************************************************************
; < ( n n -- f )
;*****************************************************************************

DFB 81H,"<"+80h
DWM STOD-7
LESS:
MOV A,R0
ADD A,#2
MOV R1,A
MOV A,@R0
XRL A,@R1
JB 0E7h,LESS_2 ; ACC.7
LJMP ULESS ; same sign
LESS_2:
MOV A,@R0
INC R0
INC R0
INC R0
INC R0
JB 0E7h,LESS_3
LJMP TRUE
LESS_3: LJMP FALSE

PAGE
;*****************************************************************************
; U< ( u u -- f )
;*****************************************************************************

DFB 82h,"U","<"+80h
DWM LESS-4
ULESS:
MOV A,R0
MOV R1,A ; R1 u2MSB
INC R0
INC R0 ; R0 u1MSB
MOV A,@R0 ; get u1MSB in ACC
CLR C
SUBB A,@R1 ; A-u2MSB
INC R1 ; point to LSBytes
INC R0
JC ULESS_TRUE ; if CARRY u1<u2
JNZ ULESS_FALSE ; if not 0 u1>u2
MOV A,@R0 ; do the same
SUBB A,@R1
JC ULESS_TRUE
ULESS_FALSE:
INC R0 ; need to pop u1 and u2
LJMP FALSE ; because
ULESS_TRUE:
INC R0 ; TRUE and FALSE
LJMP TRUE ; will push 16bit



LMI 8086 Metacompiler

CODE < BX POP AX POP AX, BX SUB AX, # -1 MOV 1$ JL
AX INC 1$: APUSH, END-CODE

CODE U< AX POP DX POP DX, AX SUB AX, # -1 MOV 1$ JB
AX INC 1$: APUSH, END-CODE


UR Forth 386


CODE < EAX, EAX XOR EBX POP ECX POP EBX, ECX CMP
AL SETLE EAX DEC EAX PUSH NEXT, END-CODE


CODE U< EAX, EAX XOR EBX POP ECX POP EBX, ECX CMP
AL SETNA EAX DEC EAX PUSH NEXT, END-CODE


VXF Arm

CODE < \ n1 n2 -- flag
\ *G Return TRUE if n1 is less than n2.
ldmfd psp ! { r0 }
cmp r0, tos
mvn .lt tos, # 0
mov .ge tos, # 0
next,
END-CODE


CODE U< \ u1 u2 -- flag
\ *G An UNSIGNED version of <.
ldmfd psp ! { r0 }
cmp r0, tos
mvn .lo tos, # 0
mov .hs tos, # 0
next,
END-CODE


VXF CPU32 (68k)

CODE < \ x1 x2 -- flag
(A6)+ d7 CMP, D7 SGT,
cpu32? [if] d7 l. extb, [else] d7 w. ext, d7 l. ext, [then]

next,
END-CODE


CODE U< \ u1 u2 -- flag
(A6)+ D7 CMP, D7 SHI,
cpu32? [if] d7 l. extb, [else] d7 w. ext, d7 l. ext, [then]

next,
END-CODE


Alberto

Elizabeth D. Rather

Apr 17, 2013, 1:38:03 PM
On 4/17/13 7:09 AM, rickman wrote:
> On 4/16/2013 6:46 PM, Elizabeth D. Rather wrote:
...
>> Andrew correctly notes that Novix did it in high level, but calls were
>> essentially free on that chip. It's a special case.
>
> I still think we are talking about two different issues. If the
> definition of < is different from - 0< and requires the detection of
> overflow, there is no way to implement < on a machine without overflow
> detection, at least without a lot of extra work. I believe overflow can
> be defined in terms of the sign bits, comparing them before and after
> the subtraction, so it would be quite messy indeed.
>
> How is it that < can be defined as - 0< on the Novix and it not be a
> problem? Was this not an ANS compatible Forth?
>

No, Novix was pre-ANS. They participated in the first few years (which
is why the integrated FP stack was included, for example), but most of
their work was done in the 80's, and sadly the Harris division was shut
down by the time ANS Forth was published.

rickman

Apr 17, 2013, 1:46:28 PM
On 4/17/2013 6:23 AM, Anton Ertl wrote:
> rickman <gnu...@gmail.com> writes:
>> So, if I am understanding this correctly, there is no lower level Forth
>> primitives that can be used to implement an efficient < word.
>
> Yes; nor U<. Unless of course you consider the implementation I gave
> efficient enough given the frequency of occurrence of < and U<.
> Looking at <http://www.complang.tuwien.ac.at/forth/peep/sorted>, 0.04%
> of the dynamically executed primitives are < and 0.01% are U<. You
> may or may not implement some of the words that are primitives in
> Gforth with < or U<, so YMMV (e.g., 0.5% of the primitives are (LOOP),
> and I am too lazy to work out if that can be implemented just with 0<).

I am thinking in somewhat different terms I guess. I looked at your
code and it is not so terribly long, actually it isn't too bad. But in
the apps that I generally do, most of the operations have to be very
efficient. Loops are controlling hardware and/or moving lots of data in
short time frames. If the word has to be defined, it won't get used
where it counts.

BTW, your code won't work on 1's complement machines I believe. Isn't
an all 1's word a negative zero? Or is that supposed to be converted to
an all 0's word if generated by an operation?


>> It
>> depends on coding this in the instruction set of the CPU used? In fact,
>> it *requires* the CPU to offer an overflow detection, right?
>
> No. You could also have direct instructions for < and U<. E.g., MIPS
> has SLT and SLTU, and Alpha has CMPLT and CMPULT.

Interesting instructions, but that isn't really the point. Or maybe it
is. I suppose these instructions map correctly to the Forth < word.

--

Rick

rickman

Apr 17, 2013, 2:21:06 PM
I don't recall your CPU. Is it described on a web page? Maybe I'm just
having a senior moment...

--

Rick

Andrew Haley

Apr 17, 2013, 5:43:30 PM
rickman <gnu...@gmail.com> wrote:

> I still think we are talking about two different issues. If the
> definition of < is different from - 0< and requires the detection of
> overflow, there is no way to implement < on a machine without
> overflow detection, at least without a lot of extra work. I believe
> overflow can be defined in terms of the sign bits, comparing them
> before and after the subtraction, so it would be quite messy indeed.

You don't need special overflow detection, or extra comparisons: it's
a lot easier to flip the sign bits of both operands and then do a U< .
You don't have to do anything fiddly.
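
A minimal Python sketch of that trick, assuming a 16-bit two's-complement cell (the function name is made up for illustration). Flipping the sign bit of both operands shifts the signed range onto the unsigned range in the same order, so an unsigned compare then gives the signed answer.

```python
def less_via_unsigned(n1, n2, bits=16):
    """Signed n1 < n2 computed as: flip both sign bits, then unsigned compare."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    u1 = (n1 & mask) ^ sign    # most negative value maps to 0
    u2 = (n2 & mask) ^ sign    # most positive value maps to all ones
    return u1 < u2             # stands in for U<

print(less_via_unsigned(0xFFFF, 0x0001))   # True:  -1 < 1
print(less_via_unsigned(0x7FFF, 0x8000))   # False: 32767 < -32768
```

The cost is two XORs with a constant plus whatever U< costs on the target, so the approach only pays off where U< is cheap.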

Andrew.

Brad Eckert

Apr 17, 2013, 6:03:31 PM
On Wednesday, April 17, 2013 11:21:06 AM UTC-7, rickman wrote:
>
> I don't recall your CPU. Is it described on a web page? Maybe I'm just
> having a senior moment...
>
I presented it at 2010 Forth day, there are slides on the SVFIG web site. I'm not sure more info will be forthcoming just yet, since it's being used in a SOC that might want to keep its privacy for a while. But the slides are in the wild.

rickman

Apr 17, 2013, 6:22:40 PM
Does it have a name? How do I search for it?

--

Rick

rickman

Apr 17, 2013, 6:24:14 PM
On 4/17/2013 6:03 PM, Brad Eckert wrote:
Uh, never mind. I searched on "2010 Forth day" and found it in the
table of contents.

--

Rick

rickman

Apr 17, 2013, 8:03:28 PM
I understand your point but there are a couple of issues. The first is
flipping the bits is not exactly a trivial thing to do. The second is
how exactly is an unsigned compare implemented? I think this requires a
similar thing where you do a subtraction and have to check the carry
flag, no? There has to be some sort of flag because there are not
enough states in the result of a subtraction to distinguish all the
possible results.

My frame of reference is deeply embedded apps where the program is part
of the hardware and has to synchronize with it intimately. This is very
different from running a program under windows or android. So even a
few extra operations to implement a compare can make a big difference.
I think I am seeing why Chuck doesn't use compares to control loops, he
just counts down. That then maps directly to the hardware of the CPU.

Maybe I don't need that overflow flag after all.

--

Rick

rickman

Apr 18, 2013, 12:51:28 AM
On 4/17/2013 6:03 PM, Brad Eckert wrote:
I found a video and downloaded it. It is twenty four minutes of you
talking about a presentation that isn't shown on the video. I went
ahead and watched it. I was able to follow ok, but of course there is a
lot of information on the screen you were pointing to that I couldn't
see. There was even one point where you are holding up a board to show
something and that is off the other side of the screen, lol.

I will say I am very jealous. I would love to be able to get paid to
work on a CPU design. On an upgrade to the FPGA design on a board I
produce we were running out of room and I came close to implementing my
CPU. But in the end the upgrade logic fit and the CPU wasn't needed.

Is there a file with the slides you are talking to in this video?

It appears that your processor is more like an MC68000 than a MISC
processor which many Forth CPUs are these days. Can you say what the
app is? It appears to be designed to support large hunks of software.

--

Rick

rickman

Apr 18, 2013, 5:19:17 PM
On 4/17/2013 6:03 PM, Brad Eckert wrote:
Ok, I took the time last night to view the video in spite of the fact
that I couldn't see the material presented on the screen. I got more
out of it than I expected. From what you said, I picture an instruction
set that is not unlike that of a CISC machine, but stack oriented rather
than register. 32 bit data and address, variable length instructions,
multiple cycle... not your typical MISC CPU. More like a Forth CPU to
move into ARM territory.

Are you thinking of going into business head to head with GreenArrays?
Just kidding...

--

Rick

Andrew Haley

Apr 18, 2013, 5:53:43 PM
rickman <gnu...@gmail.com> wrote:
> On 4/17/2013 5:43 PM, Andrew Haley wrote:
>> rickman <gnu...@gmail.com> wrote:
>>
>>> I still think we are talking about two different issues. If the
>>> definition of < is different from - 0< and requires the detection of
>>> overflow, there is no way to implement < on a machine without
>>> overflow detection, at least without a lot of extra work. I believe
>>> overflow can be defined in terms of the sign bits, comparing them
>>> before and after the subtraction, so it would be quite messy indeed.
>>
>> You don't need special overflow detection, or extra comparisons: it's
>> a lot easier to flip the sign bits of both operands and then do a U< .
>> You don't have to do anything fiddly.
>
> I understand your point but there are a couple of issues. The first
> is flipping the bits is not exactly a trivial thing to do.

It's a lot easier than a full adder.

> The second is how exactly is an unsigned compare implemented? I
> think this requires a similar thing where you do a subtraction and
> have to check the carry flag, no?

Yes. You really do need a carry flag in a processor, I would have
thought. But maybe you have a kind of application that doesn't need
one.

Andrew.

rickman

Apr 18, 2013, 6:02:35 PM
On 4/18/2013 5:53 PM, Andrew Haley wrote:
> rickman <gnu...@gmail.com> wrote:
>> On 4/17/2013 5:43 PM, Andrew Haley wrote:
>>> rickman <gnu...@gmail.com> wrote:
>>>
>>>> I still think we are talking about two different issues. If the
>>>> definition of < is different from - 0< and requires the detection of
>>>> overflow, there is no way to implement < on a machine without
>>>> overflow detection, at least without a lot of extra work. I believe
>>>> overflow can be defined in terms of the sign bits, comparing them
>>>> before and after the subtraction, so it would be quite messy indeed.
>>>
>>> You don't need special overflow detection, or extra comparisons: it's
>>> a lot easier to flip the sign bits of both operands and then do a U< .
>>> You don't have to do anything fiddly.
>>
>> I understand your point but there are a couple of issues. The first
>> is flipping the bits is not exactly a trivial thing to do.
>
> It's a lot easier than a full adder.

What? I don't follow the significance of that statement. Flipping the
bits would be done in code. The full adder is part of the ALU hardware.
I'm lost.


>> The second is how exactly is an unsigned compare implemented? I
>> think this requires a similar thing where you do a subtraction and
>> have to check the carry flag, no?
>
> Yes. You really do need a carry flag in a processor, I would have
> thought. But maybe you have a kind of application that doesn't need
> one.

Yes, I expect so, but I was looking at this from the perspective of what
is required of the hardware in Forth. I certainly haven't done a study
of it, but Forth doesn't use flags or other indicators of the result of
math ops like processors do. A test of zero for TOS is the only test I
know of. Otherwise some of the definitions require certain features of
the processor, but aren't explicitly stated, it is left up to the
implementer to figure out I suppose.

Being an implementer, I have just assumed a processor would have certain
features. I am looking to pare down the feature set to the minimum
required. I'm not sure if I need to keep the overflow flag or not...

I did a little work on the MISC processor last night and I think a
hybrid between register and stack may be a very useful beast. Figuring
out how to translate Forth into the machine code is not so easy though.
Some have complained about my posts on this topic, so I may start a
new thread which clearly ties it to Forth.

--

Rick

Ed

Apr 19, 2013, 1:04:46 AM
Bernd Paysan wrote:
> ...
> IIRC, Forth Inc. had s/u division during the early days, and was quite happy
> with that.

AFAIK all the mod operations were unsigned. No FM/MOD or
similar.



Andrew Haley

Apr 19, 2013, 4:30:46 AM
rickman <gnu...@gmail.com> wrote:
> On 4/18/2013 5:53 PM, Andrew Haley wrote:
>> rickman <gnu...@gmail.com> wrote:
>>> On 4/17/2013 5:43 PM, Andrew Haley wrote:
>>>> rickman <gnu...@gmail.com> wrote:
>>>>
>>>>> I still think we are talking about two different issues. If the
>>>>> definition of < is different from - 0< and requires the detection of
>>>>> overflow, there is no way to implement < on a machine without
>>>>> overflow detection, at least without a lot of extra work. I believe
>>>>> overflow can be defined in terms of the sign bits, comparing them
>>>>> before and after the subtraction, so it would be quite messy indeed.
>>>>
>>>> You don't need special overflow detection, or extra comparisons: it's
>>>> a lot easier to flip the sign bits of both operands and then do a U< .
>>>> You don't have to do anything fiddly.
>>>
>>> I understand your point but there are a couple of issues. The first
>>> is flipping the bits is not exactly a trivial thing to do.
>>
>> It's a lot easier than a full adder.
>
> What? I don't follow the significance of that statement. Flipping the
> bits would be done in code. The full adder is part of the ALU hardware.
> I'm lost.

You're designing the hardware, right? So if you want to make flipping
bits easy you can do that.

>>> The second is how exactly is an unsigned compare implemented? I
>>> think this requires a similar thing where you do a subtraction and
>>> have to check the carry flag, no?
>>
>> Yes. You really do need a carry flag in a processor, I would have
>> thought. But maybe you have a kind of application that doesn't need
>> one.
>
> Yes, I expect so, but I was looking at this from the perspective of what
> is required of the hardware in Forth. I certainly haven't done a study
> of it, but Forth doesn't use flags or other indicators of the result of
> math ops like processors do.

Standard Forth doesn't, but Forth instruction sets do. Novix, for
example, had the +c instruction. It's a good idea.

> A test of zero for TOS is the only test I know of. Otherwise some
> of the definitions require certain features of the processor, but
> aren't explicitly stated, it is left up to the implementer to figure
> out I suppose.
>
> Being an implementer, I have just assumed a processor would have certain
> features. I am looking to pare down the feature set to the minimum
> required. I'm not sure if I need to keep the overflow flag or not...

Eh? We just established that you don't really need an overflow flag.

> I did a little work on the MISC processor last night and I think a
> hybrid between register and stack may be a very useful beast.
> Figuring out how to translate Forth into the machine code is not so
> easy though.

That depends on the design. Surely a Forth chip with stacks and
registers is the best compromise. Even pure Forth chips have often
had a few registers.

> Some have complained about my posts on this topic, so I may start a
> new thread which clearly ties it to Forth.

Ignore them. This is interesting.

Andrew.

rickman

Apr 19, 2013, 3:31:12 PM
On 4/19/2013 4:30 AM, Andrew Haley wrote:
> rickman <gnu...@gmail.com> wrote:
>> On 4/18/2013 5:53 PM, Andrew Haley wrote:
>>> rickman <gnu...@gmail.com> wrote:
>>>> On 4/17/2013 5:43 PM, Andrew Haley wrote:
>>>>> rickman <gnu...@gmail.com> wrote:
>>>>>
>>>>>> I still think we are talking about two different issues. If the
>>>>>> definition of < is different from - 0< and requires the detection of
>>>>>> overflow, there is no way to implement < on a machine without
>>>>>> overflow detection, at least without a lot of extra work. I believe
>>>>>> overflow can be defined in terms of the sign bits, comparing them
>>>>>> before and after the subtraction, so it would be quite messy indeed.
>>>>>
>>>>> You don't need special overflow detection, or extra comparisons: it's
>>>>> a lot easier to flip the sign bits of both operands and then do a U< .
>>>>> You don't have to do anything fiddly.
>>>>
>>>> I understand your point but there are a couple of issues. The first
>>>> is flipping the bits is not exactly a trivial thing to do.
>>>
>>> It's a lot easier than a full adder.
>>
>> What? I don't follow the significance of that statement. Flipping the
>> bits would be done in code. The full adder is part of the ALU hardware.
>> I'm lost.
>
> You're designing the hardware, right? So if you want to make flipping
> bits easy you can do that.

Lol, I wish it were that easy. Why not just have a < instruction? You
can't just add every feature that sounds nice. The M in MISC means
"minimal" and that is the hard part, knowing what to pare away and what
to use.

What is ideal is to provide a set of features that allow Forth and
possibly other languages to be implemented easily and efficiently with
the least amount of hardware. Yes, I can flip bits, but as part of what
instruction?


>>>> The second is how exactly is an unsigned compare implemented? I
>>>> think this requires a similar thing where you do a subtraction and
>>>> have to check the carry flag, no?
>>>
>>> Yes. You really do need a carry flag in a processor, I would have
>>> thought. But maybe you have a kind of application that doesn't need
>>> one.
>>
>> Yes, I expect so, but I was looking at this from the perspective of what
>> is required of the hardware in Forth. I certainly haven't done a study
>> of it, but Forth doesn't use flags or other indicators of the result of
>> math ops like processors do.
>
> Standard Forth doesn't, but Forth instruction sets do. Novix, for
> example, had the +c instruction. It's a good idea.
>
>> A test of zero for TOS is the only test I know of. Otherwise some
>> of the definitions require certain features of the processor, but
>> aren't explicitly stated, it is left up to the implementer to figure
>> out I suppose.
>>
>> Being an implementer, I have just assumed a processor would have certain
>> features. I am looking to pare down the feature set to the minimum
>> required. I'm not sure if I need to keep the overflow flag or not...
>
> Eh? We just established that you don't really need an overflow flag.

I think that is not the point. You can implement a Turing complete
processor with just one instruction. But that is a bit too minimal. I
don't yet know whether the overflow flag is too much baggage for the
utility it provides.


>> I did a little work on the MISC processor last night and I think a
>> hybrid between register and stack may be a very useful beast.
>> Figuring out how to translate Forth into the machine code is not so
>> easy though.
>
> That depends on the design. Surely a Forth chip with stacks and
> registers is the best compromise. Even pure Forth chips have often
> had a few registers.

That is a different thing. I'm not talking about a stack machine with a
few registers, I'm talking about a stack machine with access into the
stack as if it were registers. The code segment I was working with last
night was *much* more efficient when I got away from a straight stack
implementation in favor of a register design. Taking the step from a
register design to the hybrid approach seems like it can be even better
yet, but the comparison is not yet done. I also have been coding in
pseudo-instructions which may or may not map well to an instruction set.


>> Some have complained about my posts on this topic, so I may start a
>> new thread which clearly ties it to Forth.
>
> Ignore them. This is interesting.

I started a new thread for the new hybrid approach. I'll post more as I
work on it. The reason I post here is because, although I am very
strong in the FPGA and I dare say, even the CPU design aspects, I am not
a strong Forth coder. I have not attempted to port a full Forth to any
of my CPU designs. I think that would be a useful thing to do.

I am rather jealous of Brad Eckert. He not only has gotten to implement
his CPU design in an ASIC, I believe he is working with Forth, Inc to
port SwiftX to it! If I got that sort of support, I would do a Sally
Field and exclaim, "You like me, you really like me!" I wonder who the
customer is that is paying for all this.

--

Rick

Brad Eckert

Apr 19, 2013, 3:37:33 PM
On Thursday, April 18, 2013 2:19:17 PM UTC-7, rickman wrote:
>
> Are you thinking of going into business head to head with GreenArrays?
> Just kidding...
>
I wouldn't go into the MCU business. Designing the CPU was more of a sweat equity thing. The sea-of-computers idea sounds great, but I think they underestimated the resources it takes to commercialize it beyond niche markets.

rickman

Apr 19, 2013, 3:53:33 PM
I hate to judge them although I'm sure I do often without realizing. I
think I understand the sea-of-computers concept and agree that its time
has come. Goodness, I think they are in a 150 nm process or something
around that. Imagine what they could have at 22 nm! But you need more
than just CPUs to succeed in pretty much any embedded market. It is
mostly about I/O I think.

But as you say, there is a lot of work involved in bringing out any new
product and such a different CPU is a hard sell not to mention the work.

--

Rick

Anton Ertl

Apr 22, 2013, 5:47:14 AM
rickman <gnu...@gmail.com> writes:
>BTW, your code won't work on 1's complement machines I believe. Isn't
>an all 1's word a negative zero? Or is that supposed to be converted to
>an all 0's word if generated by an operation?

It certainly is intended for 2s-complement representation, and assumes
that 0< tests the sign bit (which it does for 2s-complement
representation). I guess you can write a < for 1s-complement
arithmetic, but who cares?
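
To illustrate the negative-zero concern, here is a small Python sketch of how a 16-bit one's-complement cell would be read. The names and the width are assumptions for illustration only. It shows that an all-ones word has the value zero while a plain sign-bit test still reports it as negative.

```python
def ones_complement_value(cell, bits=16):
    """Read a bits-wide cell as a one's-complement integer."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    cell &= mask
    if cell & sign:
        return -((~cell) & mask)   # negative: invert all bits, then negate
    return cell

def sign_bit_set(cell, bits=16):
    """What a 0< that only looks at the top bit would report."""
    return bool(cell & (1 << (bits - 1)))

print(ones_complement_value(0xFFFF))   # 0: all ones is negative zero
print(sign_bit_set(0xFFFF))            # True: a sign-bit 0< calls it negative
print(ones_complement_value(0xFFFE))   # -1
```

So on such a machine a 0< built on the sign bit alone would misjudge negative zero, unless the hardware normalizes it away.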

[MIPS SLT SLTU etc.]
>Interesting instructions, but that isn't really the point. Or maybe it
>is. I suppose these instructions map correctly to the Forth < word.

They produce 0 or 1, so you then have to negate the result or subtract
1 from it, but apart from that SLT corresponds to < and SLTU
corresponds to U<. Likewise for Alpha's CMPLT and CMPULT.
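
A short Python sketch of that last step, assuming a 16-bit cell and Forth's all-ones true flag. `slt` here is only a model of a set-on-less-than instruction, and the helper names are invented. Reading "subtract 1" as producing the flag for the opposite condition is an interpretation, not something stated in the post.

```python
def slt(a, b, bits=16):
    """Model of a set-on-less-than instruction: 1 if a < b as signed, else 0."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    # Flipping the sign bit lets a plain integer compare order the cells as signed.
    return 1 if ((a & mask) ^ sign) < ((b & mask) ^ sign) else 0

def less_flag(a, b, bits=16):
    """Forth flag for a < b: negate the 0/1 result (1 -> all ones, 0 -> 0)."""
    return (-slt(a, b, bits)) & ((1 << bits) - 1)

def not_less_flag(a, b, bits=16):
    """Subtracting 1 maps 1 -> 0 and 0 -> all ones, so it yields the flag
    for the opposite condition, a >= b."""
    return (slt(a, b, bits) - 1) & ((1 << bits) - 1)

print(hex(less_flag(0xFFFF, 1)))       # 0xffff: -1 < 1 is true
print(hex(not_less_flag(0xFFFF, 1)))   # 0x0
```

The UR Forth 386 listing earlier in the thread appears to use the decrement form: it sets a byte on the opposite condition and then decrements.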

Anton Ertl

Apr 22, 2013, 5:59:03 AM
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>rickman <gnu...@gmail.com> wrote:
>> On 4/17/2013 5:43 PM, Andrew Haley wrote:
>>> You don't need special overflow detection, or extra comparisons: it's
>>> a lot easier to flip the sign bits of both operands and then do a U< .

Which requires the same effort to implement based on 0< as <.

>Yes. You really do need a carry flag in a processor, I would have
>thought.

MIPS and Alpha show that you don't. They provide instructions for <
and U<, however; and they compute sum-with-carry as follows:

: +c ( u1 u2 -- u uc )
\ uc is the carry of u1+u2
over + dup rot u< ;
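
A Python model of the +c definition above, for anyone who wants to trace it. The 16-bit width is an assumption, and the flag is returned as all ones to match what U< leaves on the stack in Forth.

```python
def plus_c(u1, u2, bits=16):
    """Model of  : +c ( u1 u2 -- u uc )  over + dup rot u< ;"""
    mask = (1 << bits) - 1
    u1 &= mask
    u2 &= mask
    u = (u1 + u2) & mask          # over +     : wrapped sum, u1 kept underneath
    uc = mask if u < u1 else 0    # dup rot u< : the sum wrapped iff it is below u1
    return u, uc

print(plus_c(0xFFFF, 1))   # (0, 65535): carry out
print(plus_c(1, 2))        # (3, 0): no carry
```

The test works because a sum that does not wrap is at least u1, while a wrapped sum equals u1 + u2 - 2^bits, which is always below u1.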

Andrew Haley

Apr 22, 2013, 8:28:41 AM
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>rickman <gnu...@gmail.com> wrote:
>>> On 4/17/2013 5:43 PM, Andrew Haley wrote:
>>>> You don't need special overflow detection, or extra comparisons: it's
>>>> a lot easier to flip the sign bits of both operands and then do a U< .
>
> Which requires the same effort to implement based on 0< as <.

I don't understand this remark. What does "based on 0< as <" mean?

>>Yes. You really do need a carry flag in a processor, I would have
>>thought.
>
> MIPS and Alpha show that you don't.

Not really. Well, alright, I suppose it's possible to get away with
having to calculate the carry yourself, in an
everything-is-possible-on-a-Turing-machine way, but it's unpleasant.

> They provide instructions for < and U<, however; and they compute
> sum-with-carry as follows:
>
> : +c ( u1 u2 -- u uc )
> \ uc is the carry of u1+u2
> over + dup rot u< ;

In which case they *do* have a carry flag, it's just that you have to
do a separate U< to get it; for what is U< but the carry out of a
subtraction? In the context of a Forth CPU, Novix's +c is a much
better idea.

Andrew.

Anton Ertl

Apr 22, 2013, 11:27:57 AM
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>rickman <gnu...@gmail.com> wrote:
>>>> On 4/17/2013 5:43 PM, Andrew Haley wrote:
>>>>> You don't need special overflow detection, or extra comparisons: it's
>>>>> a lot easier to flip the sign bits of both operands and then do a U< .
>>
>> Which requires the same effort to implement based on 0< as <.
>
>I don't understand this remark. What does "based on 0< as <" mean?

Implementing U< based on 0< requires the same effort as implementing <
based on 0<.
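
One way this could look, sketched in Python under the assumption of a 16-bit cell. This is a guess at the shape of such a definition, not the implementation referred to in the thread, and the names are invented. It mirrors the eForth < quoted earlier, except that when the top bits differ it is the second operand whose top bit decides.

```python
def zero_less(cell, bits=16):
    """Model of 0<: true if the top bit of the cell is set."""
    return bool(cell & (1 << (bits - 1)))

def u_less_from_zero_less(u1, u2, bits=16):
    """Unsigned u1 < u2 using only XOR, subtract and 0<."""
    mask = (1 << bits) - 1
    u1 &= mask
    u2 &= mask
    if zero_less(u1 ^ u2, bits):               # top bits differ
        return zero_less(u2, bits)             # the one with the top bit set is larger
    return zero_less((u1 - u2) & mask, bits)   # same top bit: sign of difference decides

print(u_less_from_zero_less(0x0001, 0xFFFF))   # True:  1 < 65535
print(u_less_from_zero_less(0x8000, 0x7FFF))   # False: 32768 < 32767
```

In Forth terms this would presumably amount to replacing DROP with NIP in the eForth definition of <, which is the sense in which the two take the same effort.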

Hugh Aguilar

Apr 22, 2013, 9:16:22 PM
On Apr 22, 2:59 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Andrew Haley <andre...@littlepinkcloud.invalid> writes:
> >Yes.  You really do need a carry flag in a processor, I would have
> >thought.
>
> MIPS and Alpha show that you don't.  They provide instructions for <
> and U<, however; and they compute sum-with-carry as follows:
>
> : +c ( u1 u2 -- u uc )
> \ uc is the carry of u1+u2
>  over + dup rot u< ;

The MiniForth didn't have a carry flag --- it didn't have any flags at
all.