
Forth and portability


ken...@cix.compulink.co.uk

Nov 8, 2010, 5:30:26 AM
While I notice that several people are developing their own Forth and
doing a lot at machine level I was wondering how much of that is
actually required.

For Java and UCSD Pascal only the virtual machine or byte code
interpreter had to be coded in native code. Of course that meant UCSD
P-code systems tended to be slow. Anyway, as far as I can see, regardless
of how a Forth is implemented there is only a minimum word set that
actually needs to be implemented in native code; the rest can be done
in Forth. That was certainly the way a lot of native code Pascal
compilers were ported: there was a minimal compiler that could be used
to compile the full thing.

Still to get to the point what is a practical minimum Forth word set
that would allow for an eventual ANSI Forth system?

Ken Young

Andrew Haley

Nov 8, 2010, 6:08:04 AM
ken...@cix.compulink.co.uk wrote:
> While I notice that several people are developing their own Forth and
> doing a lot at machine level I was wondering how much of that is
> actually required.
>
> For Java and UCSD Pascal only the virtual machine or byte code
> interpreter had to be coded in native code. Of course that meant
> UCSD P-code systems tended to be slow.

Indeed so, and Java has moved way beyond that now.

But UCSD Pascal was not the origin of Pascal P-code: the compiler that
generated P-code was the 1973 Zurich P-compiler, part of the P-kit.
The idea was for the user either to write an interpreter or modify the
source of the P-compiler and replace its code-generating routines.
However, according to Wirth, "the reluctance of many to proceed beyond
the interpretive scheme also gave rise to Pascal's classification as a
'slow language,' restricted to use in teaching."

> Anyway as far as I can see regardless of how a Forth is implemented
> there is only a minimum word set that actually needs to be
> implemented in native code, the rest can be done in Forth. That
> was certainly the way a lot of native code Pascal compilers were
> ported. There was a minimal compiler that could be used to compile
> the full thing.
>
> Still to get to the point what is a practical minimum Forth word set
> that would allow for an eventual ANSI Forth system?

That depends on what you mean by "practical". :-)

In most cases it's pretty obvious which words should be coded in
assembly language, and fig-FORTH mostly did the right thing. Its
choice of which words to leave as high level and which to code was
sensible, with a few exceptions: I'm not entirely convinced that

: - MINUS ( aka NEGATE) + ;

was an inspired choice. These days, unless you were doing a severe
minimalist design for aesthetic reasons you'd code a word like - .
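
As a rough sketch of what that high-level definition looks like with the
ANS name NEGATE in place of fig-FORTH's MINUS (MY- is a hypothetical name,
used here only to avoid redefining the system's own - ):

```forth
\ Subtraction as a colon definition: negate the top item, then add.
: MY-  ( n1 n2 -- n1-n2 )  NEGATE + ;

7 3 MY- .   \ prints 4
```

Every call pays for two nested word calls where a single machine
instruction would do, which is the cost being weighed above.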

But these days I'd question the need for an interpreted Forth anyway.

Andrew.

Chris Hinsley

Nov 8, 2010, 8:20:20 AM

On the Forth I'm doing, my intention is to first get things passing the
tests, then move all non-speed-critical functions from assembler to
Forth. Once I get the peephole optimizer running, more of the
current native x86 code will be able to follow. Eventually a small core
that really just boots the code generator into life and provides basic
in/out OS calls will remain. The code generator itself should be in
Forth too.

Chris

Josh Grams

Nov 8, 2010, 9:03:12 AM
ken...@cix.compulink.co.uk wrote:
> While I notice that several people are developing their own Forth and
> doing a lot at machine level I was wondering how much of that is
> actually required.

That depends upon your definition of "required". :)

You can do it with a very small set, but it gets slower and more
masochistic as the size decreases.

You might be interested in eForth (http://www.baymoon.com/~bimu/forth/).
It was designed for easy portability, so it starts with a minimal set
(of 35, I think?). The idea is that you'll CODE more definitions for
speed once you get it up and running. Let's see if I can easily pull
out the list...

~/src/forth/systems/eforth> grep '^CODE' eforth.x86.f
CODE EXIT ( -- ) ( R: a -- ) ( 6.1.1380 )( 0x33 ) \ ITC
CODE EXECUTE ( xt -- ) ( 6.1.1370 )( 0x1D ) \ ITC
CODE _LIT ( -- n ) ( 0x10 )
CODE _ELSE ( -- ) ( 0x13 )
CODE _IF ( f -- ) ( 0x14 )
CODE C! ( c a -- ) ( 6.1.0850 )( 0x75 )
CODE C@ ( a -- c ) ( 6.1.0870 )( 0x71 )
CODE ! ( n a -- ) ( 6.1.0010 )( 0x72 )
CODE @ ( a -- n ) ( 6.1.0650 )( 0x6D )
CODE RP@ ( -- a )
CODE RP! ( a -- )
CODE >R ( n -- ) ( R: -- n ) ( 6.1.0580 )( 0x30 )
CODE R@ ( -- n ) ( R: n -- n ) ( 6.1.2070 )( 0x32 )
CODE R> ( -- n ) ( R: n -- ) ( 6.1.2060 )( 0x31 )
CODE SP@ ( -- a )
CODE SP! ( a -- )
CODE DROP ( n -- ) ( 6.1.1260 )( 0x46 )
CODE SWAP ( n1 n2 -- n2 n1 ) ( 6.1.2260 )( 0x49 )
CODE DUP ( n -- n n ) ( 6.1.1290 )( 0x47 )
CODE OVER ( n1 n2 -- n1 n2 n1 ) ( 6.1.1990 )( 0x48 )
CODE CHAR- ( a -- a )
CODE CHAR+ ( a -- a ) ( 6.1.0897 )( 0x62 )
CODE CHARS ( n -- n ) ( 6.1.0898 )( 0x66 )
CODE CELL- ( a -- a )
CODE CELL+ ( a -- a ) ( 6.1.0880 )( 0x65 )
CODE CELLS ( n -- n ) ( 6.1.0890 )( 0x69 )
CODE 0< ( n -- f ) ( 6.1.0250 )( 0x36 )
CODE AND ( n n -- n ) ( 6.1.0720 )( 0x23 )
CODE OR ( n n -- n ) ( 6.1.1980 )( 0x24 )
CODE XOR ( n n -- n ) ( 6.1.2490 )( 0x25 )
CODE UM+ ( u u -- u cy )
CODE REDIRECT ( asciiz -- f )
CODE !IO ( u -- ) ( initialize I/O device )
CODE ?RX ( -- c -1 | 0 )
CODE TX! ( c -- )
CODE BYE ( -- ) ( 15.6.2.0830 )

That's not *quite* the set I would have chosen, but it's close. If you
have CELLS and CHARS, CELL-/CELL+ and CHAR-/CHAR+ seem unnecessary. And
I think I would have chosen TYPE over EMIT (TX!), and maybe some other
input interface than ?RX (`KEY? DUP IF KEY SWAP THEN`).
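
A minimal sketch of that alternative, assuming a standard system that
already provides KEY?, KEY, TYPE and PAD (MY-?RX and MY-TX! are made-up
names, chosen so as not to clash with eForth's own words):

```forth
\ ?RX expressed with the ANS Facility word KEY? and the Core word KEY.
\ Leaves the character and a true flag if a key is waiting, else just 0.
: MY-?RX  ( -- c -1 | 0 )  KEY? DUP IF KEY SWAP THEN ;

\ Single-character output layered on TYPE, using PAD as a one-byte buffer.
: MY-TX!  ( c -- )  PAD C! PAD 1 TYPE ;

CHAR A MY-TX!   \ prints A
```

The point is only that the I/O primitives can sit on either side of the
line: a port could supply TYPE and KEY natively and derive the
single-character words from them.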

--Josh

jacko

Nov 8, 2010, 9:46:49 AM
>  Still to get to the point what is a practical minimum Forth word set
> that would allow for an eventual ANSI Forth system?

It's about 32 words. Basic stack operations, IO and an ALU interface.
Do you remember the 4 block picture... input, process, output with
memory attached to process?

Simple primary operations from each set.

Rod Pemberton

Nov 10, 2010, 7:19:17 PM
<ken...@cix.compulink.co.uk> wrote in message
news:XpWdneZ9_rPfTErR...@giganews.com...

>
> While I notice that several people are developing their own Forth and
> doing a lot at machine level I was wondering how much of that is
> actually required.
> [...]

> Anyway as far as I can see regardless
> of how a Forth is implemented there is only a minimum word set that
> actually needs to be implemented in native code, the rest can be done
> in Forth.
>

Forths can be, and have been, designed to be built from a small amount of basic
functionality. eForth was done this way intentionally. fig-Forth was done
somewhat this way too, but not by design. The words in the minimal wordset
are low-level words, or "primitives", instead of high-level or compiled
words. They must be implemented in something other than Forth, typically
assembly, sometimes C, Perl, etc.

The issue, at least for homebrew Forths, is how many you need in order not to
have an extremely slow Forth. The smaller the "primitive" wordset is, the
more work the higher level words must do in terms of the "primitives". It
all goes to the idea of factoring - breaking words down into their smallest
part. The "primitives" are the words you're not going to factor further.
E.g., you could define XOR, OR, AND, in terms of NAND and NOT, but most
assembly languages have an instruction for XOR, OR, and AND, so you wouldn't
code XOR, OR, or AND using NAND, unless there was a reason to do so.
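
To make the trade-off concrete, here is a rough sketch of that
factoring. NAND is not a standard word, so it is defined below from
AND and INVERT purely as a stand-in for what would be the one CODE
primitive on a real minimal system; all the MY- names are hypothetical.

```forth
\ Stand-in for the single native primitive.
: NAND    ( x1 x2 -- x3 )  AND INVERT ;

\ Everything else is built from NAND alone.
: MY-NOT  ( x -- ~x )      DUP NAND ;
: MY-AND  ( a b -- a&b )   NAND MY-NOT ;
: MY-OR   ( a b -- a|b )   MY-NOT SWAP MY-NOT NAND ;
\ a xor b = (a nand n) nand (b nand n), where n = a nand b
: MY-XOR  ( a b -- a^b )   2DUP NAND DUP >R NAND SWAP R> NAND NAND ;

12 10 MY-AND .   \ prints 8
12 10 MY-OR  .   \ prints 14
12 10 MY-XOR .   \ prints 6
```

MY-XOR costs four NANDs plus stack shuffling where the CPU has a single
XOR instruction, which is why nobody codes it this way without a reason.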

So, there are certain words all Forths need, and they need to be
implemented effectively:

AND OR XOR

+ -

DUP DROP SWAP OVER ROT

@ !

>R R> R@

etc...
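
Not every word in that list has to be a primitive, though. As a sketch,
here are a few stack words written in terms of the others (the MY-
names are hypothetical, so the system's own words stay untouched):

```forth
\ ROT via the return stack: park c, swap a and b, bring c back, swap again.
: MY-ROT   ( a b c -- b c a )  >R SWAP R> SWAP ;

\ Two more that fall out of OVER, SWAP and DROP.
: MY-2DUP  ( a b -- a b a b )  OVER OVER ;
: MY-NIP   ( a b -- b )        SWAP DROP ;

1 2 3 MY-ROT . . .   \ prints 1 3 2
```

Whether such words end up coded or left high-level is exactly the speed
versus porting-effort choice described above.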

> Still to get to the point what is a practical minimum Forth word set
>

I'll requote myself:

"You can implement Forth with only a few words, say 9 to 16. One of the
smallest is Mikael Patel's 9 word wordset in old comp.lang.forth posts. It's
not likely that it'll be quick."

"A small realistic set is around 23 for Mark Hayes' MRForth, or 36 for C.H.
Ting's eForth, or 46 for 8086 GFORTH. You can probably figure on that range
for a useable Forth. "
http://groups.google.com/group/comp.lang.forth/msg/3caa5b2e62f53d7a

This post of mine, a while ago, provides a reference point for comparison of
the size of wordsets, and also shows some of the minimal wordsets:
http://groups.google.com/group/comp.lang.forth/msg/10872cb68edcb526

> Still to get to the point what is a practical minimum Forth word set
> that would allow for an eventual ANSI Forth system?
>

ANSI? ...

I was hoping someone might post that also.

I'm not sure if anyone has publicly determined a minimal ANS Forth wordset.
ANS Forth has 133 core words and (supposedly) 359 in total. The Forths I've
seen that use a minimal wordset are eForth or fig-Forth or F83 based and then
have a partial ANS wordset loaded on top... I.e., probably not optimal for speed
or size and definitely not commercial quality Forth, etc. It might just be
an issue of converting or updating the minimal wordset to fit ANS
definitions. The underlying architecture of the microprocessor for generic
computation really hasn't changed in 3 decades.

There have been a number of posts on this over the past few years. Google
Groups archives Usenet as well as other forums. The Advanced Search for
Google Groups can help you find them. Search for "primitives", "minimal
wordset", "eForth", "bootstrap", for group: comp.lang.forth
http://groups.google.com/advanced_search?hl=en


Rod Pemberton

Rod Pemberton

Nov 11, 2010, 3:32:17 AM
"Rod Pemberton" <do_no...@notreplytome.cmm> wrote in message
news:ibfcng$utu$1...@speranza.aioe.org...

> <ken...@cix.compulink.co.uk> wrote in message
> news:XpWdneZ9_rPfTErR...@giganews.com...
> >
> > Still to get to the point what is a practical minimum Forth word set
> >
>
> I'll requote myself:
>
> "You can implement Forth with only a few words, say 9 to 16. One of the
> smallest is Mikael Patel's 9 word wordset in old comp.lang.forth posts.
> It's not likely that it'll be quick."
>
> "A small realistic set is around 23 for Mark Hayes' MRForth, or 36 for
> C.H. Ting's eForth, or 46 for 8086 GFORTH. You can probably figure on
> that range for a useable Forth. "
> http://groups.google.com/group/comp.lang.forth/msg/3caa5b2e62f53d7a
>
> This post of mine, a while ago, provides a reference point for comparison
> of the size of wordsets, and also shows some of the minimal wordsets:
> http://groups.google.com/group/comp.lang.forth/msg/10872cb68edcb526
>
> > Still to get to the point what is a practical minimum Forth word set
> > that would allow for an eventual ANSI Forth system?
> >
>
> ANSI? ...
>
> I was hoping someone might post that also.
>


Sorry, I've collected more info since that post, but didn't recall these ANS
Forths...


48 primitives in Chris Jakeman's Minimal ANS Forth (27 ANS core, 7 ANS
non-core, 14 non-ANS)
59 primitives in Wonyong Koh's hForth
120 words in Tom Almy's ForthCMP


So, maybe, Jakeman's MAF would be the place to look.

Jakeman lists 43 of the eventual 48 in his GPL'd code needed to bootstrap an
ANS Forth here:
http://groups.google.com/group/comp.lang.forth/msg/3455b440325a7852

In that post, he states:
"These 43 words are enough to build all of the ANS Core word set (133
words)."

So, 43 is the answer.


Rod Pemberton


Tarkin

Nov 11, 2010, 9:25:05 PM
On Nov 11, 3:32 am, "Rod Pemberton" <do_not_h...@notreplytome.cmm>
wrote:
> "Rod Pemberton" <do_not_h...@notreplytome.cmm> wrote in message

Off-by-one error? I thought the answer was 42...

jacko

Nov 13, 2010, 7:17:03 PM

Yes but as zero is not a number or by extension a valid index, then 43
is correct! :-)
