This ref provides the basic knowledge to understand or write simple Forth
programs (integers only, no floats). The reader is assumed to know Basic,
Pascal, C or the like.
=== WHAT IS FORTH? ===
FORTH was invented by Charles Moore in 1970 as a light, portable programming
environment for the small computers of those days. Its name reflects the
restrictions of those days: Moore named the language Fourth, because its
power exceeded third generation languages, but the system it was
conceived on allowed only five-character file names: FORTH. Forth provides
the development speed of Basic and the compiled program speed of C.
=== WORDS and the STACK ===
FORTH consists of prefab and user-created routines called words. Words
exchange data through the stack. Therefore all calculation is RPN (reverse
polish notation):
2 1 +
2 is put on the stack, 1 is put on the stack (which is then top of the
stack), the word + removes two integers off the stack, and returns the result
3 to the stack.
Defining a new word:
: add_one 1 + ;
After this definition writing "2 add_one" will return 3
(on the stack).
: add_one 1 + dup . ;
Add_one now writes the result to the screen, but keeps a copy of the result
(dup) on the stack!
Still with me? Here are the other, not surprising arithmetic words:
+ - / *
and
2* 2/
for optimized multiplication/division by two.
Other important stack words:
dup
adds copy of top of stack to the top of the stack
drop
removes top of stack
swap
alters order of two integers on top (1 2 -> 2 1)
.
prints top of stack to screen
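The stack words above can be tried out with a short sketch like the following. The word name square is made up for illustration, and lowercase spelling assumes a case-insensitive system, as in the rest of this text.

```forth
\ square is a hypothetical example word, not part of any standard wordset
: square ( n -- n*n )  dup * ;   \ dup copies the top item, * multiplies the pair

3 square .       \ prints 9
1 2 swap - .     \ swap turns 1 2 into 2 1, so 2 - 1 prints 1
7 8 drop .       \ drop discards the 8, so 7 is printed
```

Reading each line left to right while tracking the stack on paper is usually the quickest way to check such a definition.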
=== CONSTANTS, VARIABLES and ARRAYS ===
Not all data has to be on the stack. Constant values used throughout a
program can be stored in constants for easy adjustment later on. After
defining
6 CONSTANT width
writing "width" will put 6 on the stack
VARIABLE nov
"nov" will NOT return a value, but an address. It is a pointer. Instead
nov @
returns the value AT address nov
10 nov !
writes 10 to the address nov
10 nov +!
ADDS 10 to the value at address nov
When defining an ARRAY, remember that integers take 2 bytes. So
variable pot 200 ALLOT
allocates space for 100 integers!
10 100 pot + !
writes 10 to pot(100)
=== CONDITIONALS ===
Even conditionals work RPN:
A @ 10 > IF words THEN
equals Basic's: IF A>10 THEN words ENDIF
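A minimal sketch of the conditional in use, assuming a standard system where IF ... THEN may only appear inside a colon definition. The word names report and clamp10 are invented for this example.

```forth
variable A
15 A !

\ report is a hypothetical word: prints a message only when A holds more than 10
: report ( -- )  A @ 10 > if ." A is greater than 10" cr then ;
report

\ clamp10 is a hypothetical word: limits a value to at most 10
: clamp10 ( n -- n' )  dup 10 > if drop 10 then ;
15 clamp10 .     \ prints 10
5 clamp10 .      \ prints 5
```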
Boolean words:
> < = <>
=== LOOPS ===
BEGIN words AGAIN
creates an infinite loop between BEGIN and AGAIN.
A @ 0 DO words i . LOOP
creates A loops between DO and LOOP, puts the index on the stack, and puts
it to the screen.
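As a sketch of the counted loop, the two hypothetical words below take the loop count from the stack instead of from a variable. DO expects the limit first and the start index second, and like IF it belongs inside a colon definition. Note that a limit equal to the start (for example 0 0 DO) does not skip the loop in standard Forth; ?DO is the usual word for that case.

```forth
\ count-up and sum-below are made-up names for illustration
: count-up ( n -- )  0 do i . loop ;             \ prints 0 .. n-1
5 count-up       \ prints 0 1 2 3 4

: sum-below ( n -- sum )  0 swap 0 do i + loop ; \ running total stays on the stack
5 sum-below .    \ prints 10
```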
=== I / O ===
. prints top of stack to screen
." string " prints string to screen
97 emit prints the character with ASCII code 97 (a)
key puts the ASCII code of the pressed key on the stack
[..]
> Forth provides
> the development speed of Basic and the compiled program speed of C.
Reasonable Forths generate native code that is as fast as unoptimized
C (say MS-C in debug mode), meaning they're two to three times slower
than production-quality C. The optimizing process is completely
transparent to the user, e.g. the Forths still compile in zero-time
and are completely interactive.
Very high speed Forths are available (VFX from MPE, 4THCOMP from Tom
Almy, f2c from Ertl et al.) but I think their user-interface has many
undesirable aspects (extra compiler passes, completely re-arranged
code, hidden macros and inlining that stand in the way of debugging,
incomplete wordsets, only 16-bit segmented etc.) I'm sure the authors
will correct me when I'm wrong :-)
A large part of the Forth community feels that it is enough that
Forth lets them design the speed-critical parts of their applications
with the built-in Forth assembler. Although this is a reasonable
assumption I've never seen any benchmarks quantifying this statement
(Forths written in C seem not to be able to use assembler).
Even when they didn't have reasonable code generators yet (e.g. FORTH
Inc's PolyForth scored amazingly bad in the FD March 1992 benchmark),
respected Forth vendors claimed their products possessed magical
properties (real-time performance, no-fat device-drivers etc.) that
allowed them to rewrite or finish C applications with very little
code and with many times higher performance than the code they
replaced.
[..]
> When defining an ARRAY, remember that integers take 2 bytes. So
1 CELLS
> variable pot 200 ALLOT
create pot 100 cells ALLOT
> allocates space for 100 integers!
> 10 100 pot + !
10 50 cells pot + !
> writes 10 to pot(100)
pot(50)
[..]
> ." string " prints string to screen
In a compiled definition, from the terminal use .( string )
[..]
Nice!
-marcel
Marcel Hendrix wrote:
>Reasonable Forths generate native code that is as fast as unoptimized
>C (say MS-C in debug mode), meaning they're two to three times slower
>than production-quality C. The optimizing process is completely
>transparent to the user, e.g. the Forths still compile in zero-time
>and are completely interactive.
I calculated that my Forth for the Palm Pilot (Quartus Forth) does
about 2 million Forth instructions in 16 million clock ticks. Since
a machine instruction usually takes several clock ticks to execute, I
figured C couldn't possibly be much faster, sorry!
>> When defining an ARRAY, remember that integers take 2 bytes. So
> 1 CELLS
>> variable pot 200 ALLOT
> create pot 100 cells ALLOT
>> allocates space for 100 integers!
>> 10 100 pot + !
> 10 50 cells pot + !
>> writes 10 to pot(100)
> pot(50)
I meant to write "10 100 pot + !" means "pot(50)=10", so you are right.
I guess "cells" is the same as writing "2*". Is variable not a standard
Forth word? Create does indeed do the same thing on the pilot,
thanks...
Mervyn
PS I finally found a more extensive tutorial,
http://astro.pas.rochester.edu/Forth/
in case you are looking for one.
It's not magic, just careful design and ruthless paring.
Sikorsky was once asked the secret of designing a helicopter. He said
"simplicate and add more lightness!" So it is with Forth.
Andrew.
[..]
> I calculated that my Forth for the palmpilot (Quartus Forth) does
> about 2 million Forth instructions in 16 million clock ticks. Since
> a machine code usually takes several clock ticks to execute, I
> figured C couldn't possibly be much faster, sorry!
That depends on what these "Forth instructions" were. Could you please
show us the source?
A good way to test your Forth (assuming you're interested in speed and
efficiency, many Forthers are not) is to run the "Ertl suite".
You can find it by looking for benchres, threading benchmarks etc.
starting at http://www.complang.tuwien.ac.at/anton/home.html.
> I meant to write "10 100 pot + !" means "pot(50)=10", so you are right.
> I guess "cells" is the same as writing "2*". Is variable not a standard
Sort of. For Quartus Forth "1 CELLS" is the same as "2*" but 32-bit
Forths ("2 LSHIFT") have taken over the desktop and 64-bit ("3 LSHIFT")
Forths already exist.
> Forth word? Create does indeed do the same thing on the pilot,
> thanks...
With VARIABLE you wasted 2 bytes, VARIABLE pot 198 ALLOT is the correct
phrase from the golden days :-)
The CREATE way clearly says what you mean and also works for Forths
that dynamically allocate data space, or where the dictionary is not
necessarily continuous.
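A minimal sketch of the CREATE form, assuming a standard system. The indexing helper pot[] is a made-up name; because it uses CELLS, the same source should work whether a cell is 2, 4 or 8 bytes.

```forth
create pot 100 cells allot               \ room for 100 cells, indices 0..99

\ pot[] is a hypothetical helper that turns an index into an address
: pot[] ( index -- addr )  cells pot + ;

10 50 pot[] !                            \ pot(50) = 10
50 pot[] @ .                             \ prints 10
```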
-marcel
[..]
> A large part of the Forth community feels that it is enough that
> Forth lets them design the speed-critical parts of their applications
> with the built-in Forth assembler. Although this is a reasonable
> assumption I've never seen any benchmarks quantifying this statement
So why don't I do it myself? The matrix-mult benchmark is a nice candidate
because here "C" blows (mx)(i)Forth out of the water:
Benchmark   machine   iForth 1.10 (s)   C (s)
---------   -------   ---------------   -----
matrix.fs   P-166     1.78              0.84
matrix.fs   PII-350   0.43              0.11
The results for iForth and mxForth (shown in earlier postings) are nearly
identical.
The only word to change to CODE is innerproduct:
\ original
: innerproduct ( a[row][*] b[*][column] -- int)
    0 row-size 0 do
        >r over @ over @ * r> + >r
        cell+ swap row-byte-size + swap
        r>
    loop
    >r 2drop r>
;

\ new
CODE innerproduct ( a[row][*] b[*][column] -- int )
    ebx ecx mov,
    ebx pop, ( *B )
    eax pop, ( *A )
    ecx push, ecx ecx xor, ( accu )
    edx push, row-size 8 / d# edx mov, ( loopcount/8 )
    esi push, eax esi mov,
    BEGIN,
        \ 1st particle
        0 [esi] eax mov,
        0 [ebx] eax imul,
        row-byte-size d# esi add,
        CELL # ebx add,
        eax ecx add,
        \ copy the above particle 7 more times here.
        ...
        edx dec, 0=,
    UNTIL,
    esi pop,
    edx pop,
    ebx pop,
    ecx push,
    ebx jmp,
END-CODE
And here's the result.
Benchmark    machine   iForth 1.10 (s)   C (s)
----------   -------   ---------------   -----
matrix.fs+   P-166     0.786             0.84
matrix.fs+   PII-350   0.068             0.11
Forth became slightly better than "C" on the P5 and almost twice as
fast on the PII.
One can of course write asm in C too. In MSC 4.2 that is no fun.
(have to buy a suitable assembler first: it is not included.
That money might be better spent on a copy of iForth :-).
-marcel
So there I was, speaking to some management folks at work about how
Forth could help tremendously. The company I work for is (in part) a
storage and network controller products OEM for Sun, so everyone was
already familiar with OpenBoot. It wasn't very hard for me to
convince them that using Forth, engineers could interactively debug
hardware and test systems. They nodded and agreed. Then I brought up
various language features of Forth-- simple model, extensibility, easy
to port to various platforms (important to us). Again not much
problem-- they understood and accepted.
I thought I lost them though when it was brought up that the code we
write has to conform to *external* (usually C-based) API's. But then
I pointed out some of the work others had done in creating
cross-language bindings in Forth. Most understood the technical
challenges, but thought they wouldn't be too bad.
Then came the issue of performance. Our products demand speed-- in
fact, it is speed that is our primary means of differentiating
ourselves from others in the industry. So it came as no surprise that
performance of Forth was a big issue. After pointing out that there
are modern native-code generating Forth systems available, I was
pressed for benchmarks. And here, I had to be honest. First, there
aren't many benchmarks available to compare native-code Forths and C,
but the ones I have seen usually place Forth at the same level as
unoptimized C. The numbers can be better-- especially with poor C
compilers-- but given that a C compiler has a lot more information
available to optimize code, it isn't terribly surprising.
The conversation stopped there. I had to admit that at least for our
domain, Forth wasn't the best choice. So I started to think about it
differently-- I suggested that Forth be used to debug hardware and to
write initial drivers. Then, once the design was validated, a C (or
in some cases assembler) solution could be created. They saw the
value in that, but reminded me about the short development times we
have. Then I suggested that we partition our systems so that speedy C
and assembler routines were used for the underlying driver code-- the
code that had to be fast-- and Forth be used for the user interface
components-- where speed was less important. They're mulling that
option over.
The point of Forth is indeed not necessarily in speed, but in the
environment that Forth provides. But let's not lose sight of the
fact that speed *can* matter for some people, and the need for native
code Forth systems is more than just a means to win benchmark wars
with C.
You give up too easily! I have never seen a performance requirement we
couldn't meet. Earlier in this thread there was a reference to "magical"
results, which Andrew Haley quite correctly observed were less the result of
magic than of good factoring and design, which Forth facilitates. Overall
system design, such as low-overhead multitasking and interrupt handling
(which we've always made a high priority) make a big difference.
Some of the stories the earlier poster mentioned came from me. For example:
Some years ago, American Airlines hired us to replace an unmaintainable
assembly language baggage handling system, using the same hardware and
identical user interface. Their biggest concern was performance: the assy
program failed to meet their requirement of 100 bags/min, handling only 80
max., and they were understandably concerned about a high level language's
performance. But when we installed the system and were ready for
performance tests, we were able to handle 120!
Am I claiming that "Forth is faster than assembly language"? Of course not.
What I am claiming is that what matters to the performance of an application
isn't something you can measure using small-scale code benchmarks, but
overall system design.
In the late 80's and early 90's we did a lot of industrial controls. We had
drivers for most of the standard industrial I/O (Allen Bradley, Modicon,
etc.). Most of those companies provided C and assembly language drivers.
We were consistently able to write drivers delivering 10 times the
throughput over the mfr-supplied drivers. Again, this was largely a system
design issue, having to do with how we handle interrupts, the relationship
between interrupt code and task code, speed of the multitasker, etc. These
are the things that matter!
The above examples involved old-fashioned ITC implementations. When we went
to direct-code compilation a couple of years ago, we picked up a factor of 4
in speed overall. Now that we've introduced code optimizers in our products
(SwiftX released last week, SwiftForth in final beta) we've picked up
another 40%. The new SwiftForth is so fast it can keep up with 38.4 K input
displayed to the screen, which the Windows experts say can't be done (and
that's using Windows at both ends). And, no, they don't affect compile
speed or user interface issues noticeably at all. Yes, the compiler works
harder, but it runs faster, too!
So, to anyone who tries sneering at Forth for being slow, just challenge
them to give you an application with a performance spec, and show them how
you can beat it. If you have trouble, we can help.
Cheers,
Elizabeth
On 1999-07-17 m...@iaehv.iae.nl(MarcelHendrix) said:
:Reasonable Forths generate native code that is as fast as
:unoptimized C (say MS-C in debug mode), meaning they're two to
:three times slower than production-quality C. The optimizing
:process is completely transparent to the user, e.g. the Forths
:still compile in zero-time and are completely interactive.
..
:Even when they didn't have reasonable code generators yet (e.g.
:FORTH Inc's PolyForth scored amazingly bad in the FD March 1992
:benchmark), respected Forth vendors claimed their products
:possessed magical properties...
(I apologise for clipping there, but I don't see magical properties as a
problem. :> )
polyForth was indirectly threaded, and so would be slow compared to a
native Forth, but then it does also offer some advantages over native
Forth too, in much the same way as a Lisp interpreter offers some
advantages over a Lisp compiler. Those advantages may have nothing to do
with speed, but it doesn't mean that they're meaningless.
It seems to me that we've all gone on to native Forth advocacy just to
fend off the C people who come up, look over our shoulders, go "Eww,
it's not compiled, it must be so slow!" and completely miss the point...
--
Communa -- you know soft spoken changes nothing
> It seems to me that we've all gone on to native Forth advocacy just to
> fend off the C people who come up, look over our shoulders, go "Eww,
> it's not compiled, it must be so slow!" and completely miss the
> point...
> Communa -- you know soft spoken changes nothing
Well said! I agree whole-heartedly.
Much of Forth's simplicity, regularity, and reflectivity are lost when
we go to native code compilers. Native code compilers should be an
optional adjunct not a replacement for threaded code interpreters IMHO.
Cheers,
Mark W. Humphries
Forth Chat Room on ICQ #37160535
[..]
> :Even when they didn't have reasonable code generators yet (e.g.
> :FORTH Inc's PolyForth scored amazingly bad in the FD March 1992
> :benchmark), respected Forth vendors claimed their products
> :possessed magical properties...
> (I apologise for clipping there, but I don't see magical properties as a
> problem. :> )
Why do you think that I think magical properties are a problem :-)
a...@cygnus.remove.co.uk (Andrew Haley) said:
> It's not magic, just careful design and ruthless paring.
> Sikorsky was once asked the secret of designing a helicopter. He said
> "simplicate and add more lightness!" So it is with Forth.
Andrew interpreted it correctly (or chose to interpret it positively),
to the benefit of all readers: we now have a nice Sikorsky quote for
after-dinner use and Mrs Rather was lured into telling us moore about
how even a so-so (in my narrow view) compiler *in competent hands*
will do wonders.
"Elizabeth D Rather" <era...@forth.com> wrote:
> Am I claiming that "Forth is faster than assembly language"? Of course not.
> What I am claiming is that what matters to the performance of an application
> isn't something you can measure using small-scale code benchmarks, but
> overall system design.
( and more )
euph...@freenet.co.uk again:
> polyForth was indirectly threaded, and so would be slow compared to a
> native Forth, but then it does also offer some advantages over native
> Forth too, in much the same way as a Lisp interpreter offers some
> advantages over a Lisp compiler. Those advantages may have nothing to do
> with speed, but it doesn't mean that they're meaningless.
Well, give us some specific examples then. For instance, what would've
happened if the PolyForthers of yore had had to use the present SwiftForth
compiler technology? Would they have failed miserably? Do you actually
know how a native code Forth compiler looks, viewed from the user side?
> It seems to me that we've all gone on to native Forth advocacy just to
> fend off the C people who come up, look over our shoulders, go "Eww,
> it's not compiled, it must be so slow!" and completely miss the point...
It is not just "to fend off the C people." If one actually uses Forth to
*do things* it makes no sense to use a compiler that generates 16-bit
segmented code, can't interface to system libraries, takes *minutes* to
read a file that can be parsed in 38 *ms* if done right, and is 100 or
1000 times (he, Jeff!) slower than schoolbook-C. These systems all equate
to the same thing: a bad, antiquated, dusty, buggy, *implementation* of
the fine language that I like so very much.
Now what will indirect-threaded code give me that could compensate
adequately for any of the negative aspects I enumerate above?
-marcel
>It is not just "to fend off the C people." If one actually uses Forth to
>*do things* it makes no sense to use a compiler that generates 16-bit
>segmented code, can't interface to system libraries, takes *minutes* to
>read a file that can be parsed in 38 *ms* if done right, and is 100 or
>1000 times (he, Jeff!) slower than schoolbook-C. These systems all equate
>to the same thing: a bad, antiquated, dusty, buggy, *implementation* of
>the fine language that I like so very much.
>
>Now what will indirect-threaded code give me that could compensate
>adequately for any of the negative aspects I enumerate above?
>
>-marcel
>
How about a simple to understand system in which to learn Forth? A
simple virtual model that a beginner can understand and learn about.
And dare I say, even use "Starting Forth" to learn?
--Fred
All but one ("slower") of the negative aspects you enumerate are orthogonal
to the issue of indirect-threaded versus native code. Even here your figures
look very suspect: Elizabeth claims a 4 x speed increase from going native
so to justify your figures you would have to demonstrate that SwiftForth
generates code that is 25 to 250 times slower than "schoolbook-C".
I think this illustrates well enough the extent to which technical
judgements on this issue can be clouded by emotional and other non-technical
influences.
Philip.
Member of FIG UK: http://www.users.zetnet.co.uk/aborigine/forth.htm
Native-code implementations are not any harder to understand than any
other implementation method. There's simply no indirection; in a
native-code system DUP compiles code that performs the function of DUP,
rather than laying down an address that, when processed by an 'inner
interpreter' at run-time, points to the code that performs the function
of DUP. I'd say the native-code method is actually easier to
understand.
Native-code implementations simply bridge any gaps between the Forth VM
and the CPU itself, making the CPU a Forth machine. In Quartus Forth
for the Palm Pilot, for example, some 30-odd CORE and CORE EXT words are
directly represented by two machine instructions or less.
--
Neal Bridges
<http://www.interlog.com/~nbridges/> Home of the Quartus Forth compiler!
>Native-code implementations are not any harder to understand than any
>other implementation method. There's simply no indirection; in a
>native-code system DUP compiles code that performs the function of DUP,
>rather than laying down an address that, when processed by an 'inner
>interpreter' at run-time, points to the code that performs the function
>of DUP. I'd say the native-code method is actually easier to
>understand.
The question of what's understandable is similar to the question
of what's readable. Who is your audience and what do they find
easy?
For people who learn Forth as a first language, or who've only
seen high level languages, it's easier to explain the Forth virtual
machine with an indirect-coded system. You can DUMP the code and
see the calls. Each one has a header that doesn't completely make
sense followed by the addresses of the routines you call that make
perfect sense. So long as they ignore primitives, and the little
pieces of headers that don't make sense, and the internals of the
interpreter and the compiler, and maybe little scraps of DOES> code,
they can do anything they want without ever thinking about how the
assembler works or how the native code works. That's simple, for
the start.
But if you're the kind of person who actually uses Forth then it
doesn't matter. You might as well argue about case-sensitivity,
it's a minor thing to understand that seldom causes problems.
When it does cause a problem you have to pay attention, but
otherwise you can mostly ignore it.
I wonder how often there are still benefits to compilers that
let you do it either way? You could take the same source
code and compile it on a native-code system, or if you need
to make it small more than fast on an indirect-coded system,
or to make it even smaller on a token-coded system, etc.
Maybe it's reached the point where those options only matter
for cross-compilers.
On 1999-07-18 m...@iaehv.iae.nl(MarcelHendrix) said:
:> :Even when they didn't have reasonable code generators yet (e.g.
:> :FORTH Inc's PolyForth scored amazingly bad in the FD March 1992
:> :benchmark), respected Forth vendors claimed their products
:> :possessed magical properties...
:> (I apologise for clipping there, but I don't see magical
:>properties as a problem. :> )
:Why do you think that I think magical properties are a problem :-)
I don't. I put in the qualifier because I removed your explanation.
However, I think it's fair to say that you favour native code Forths;
would you agree?
--
On 1999-07-17 jp...@rochester.rr.com said:
:Then came the issue of performance. Our products demand speed-- in
:fact, it is speed that is our primary means of differentiating
:ourselves from others in the industry. So it came as no surprise
:that performance of Forth was a big issue. After pointing out that
:there are modern native-code generating Forth systems available, I
:was pressed for benchmarks. And here, I had to be honest. First,
:there aren't many benchmarks available to compare native-code
:Forths and C, but the ones I have seen usually place Forth at the
:same level as unoptimized C. The numbers can be better--
:especially with poor C compilers-- but given that a C compiler has
:a lot more information available to optimize code, it isn't
:terribly surprising.
:The conversation stopped there.
What's wrong with writing it in Forth, profiling it, getting it as fast
as possible in Forth, and then going to assembler for the time-critical
bits - using, of course, the inbuilt Forth assembler to retain
interactivity? That's the standard model - like anything in Forth, it
doesn't pay to think in batch-mode terms.
This is another mixing of issues. It's too bad that there are no books
as good as "Starting Forth" and "Thinking Forth" that are written with
the ANS vocabulary, but the only way that the details of the
code-generating engine can matter to a beginner - or most of the rest of
us - is in being able to decompile. When the source is available, that
is a small loss.
I used Z-80 PolyForth to good advantage professionally. Would I have
liked faster code? Of course! Perhaps I could have avoided writing all
those device drivers in assembler. When I write in SwiftForth, I can use
the same model. The VM hasn't changed significantly.
Jerry
--
Engineering is the art | Let's talk about what
of making what you want | you need; you may see
from things you can get. | how to do without it.
---------------------------------------------------------
SNIP
>This thread and some other stuff that has appeared here has set me
>thinking again. I think there are some real differences in philosophy
>of approach to computing that keep coming up around Forth and don't
>get articulated very well. Here is my try.
>
>One school of thought is that a programmer, to be effective, really
>ought to know what the actual sequence of machine instructions is when
>his program executes, and should shape his design in terms of
>optimizing that sequence for the job at hand. Forth is designed for
>that, because it is so simple you can easily understand enough of it
>to know exactly what that sequence will be. The Forth words then just
>become a shorthand to express the real intent. If the system won't
>produce the sequence you want, you can go to assembler, or better, you
>can change the kernel and make it give you what you want. This will
>work only if you know exactly what all your hardware does, and these
>days that means a lot of learning that has to be done over again every
>time the hardware changes. With hardware so complicated and
>specialized and changing so fast, this approach doesn't work well for
>very many people any more.
>
>The other approach is to treat Forth as a virtual machine, or black
>box. In this view the Forth programmer doesn't care what the sequence
>of machine instructions is as long as the thing behaves correctly and
>has a reasonable performance level. This is the approach of ANS
>Forth. The trouble is, from the viewpoint of somebody dedicated to
>the other approach, this, along with native code compilers and various
>optimization strategies, amounts to a breach of the faith, because you
>now can't predict what the sequence of machine instructions will be,
>and worse, you have to rely on another's (possibly poorly documented
>and buggy) work to achieve what you want to achieve. For me, the
>second approach is usually good enough and keeps the amount of
>knowledge I have to manage to a much more reasonable level, but I am
>afraid the tension between these two viewpoints will always be there.
> -LenZ-
If you have any intention of porting your code to a different hardware
architecture you should definitely take the virtual machine approach.
Merced anyone?
regards
Jerry Gitomer
They would have been delighted. What is it you (as a user) don't like about
native code compilers?
>> It seems to me that we've all gone on to native Forth advocacy just to
>> fend off the C people who come up, look over our shoulders, go "Eww,
>> it's not compiled, it must be so slow!" and completely miss the point...
>
>It is not just "to fend off the C people." If one actually uses Forth to
>*do things* it makes no sense to use a compiler that generates 16-bit
>segmented code, can't interface to system libraries, takes *minutes* to
>read a file that can be parsed in 38 *ms* if done right, and is 100 or
>1000 times (he, Jeff!) slower than schoolbook-C. These systems all equate
>to the same thing: a bad, antiquated, dusty, buggy, *implementation* of
>the fine language that I like so very much.
>
>Now what will indirect-threaded code give me that could compensate
>adequately for any of the negative aspects I enumerate above?
What negatives? What dreadful compiler are you referring to? Our native
code compilers are blazing fast, far faster than polyFORTH's compiler.
You've tried SwiftForth; was its compiler slow?
Cheers,
Elizabeth
Jonah Thomas wrote:
>
> Neal Bridges <nbri...@interlog.com> wrote:
>
> >Native-code implementations are not any harder to understand than any
> >other implementation method. There's simply no indirection; in a
> >native-code system DUP compiles code that performs the function of DUP,
> >rather than laying down an address that, when processed by an 'inner
> >interpreter' at run-time, points to the code that performs the function
> >of DUP. I'd say the native-code method is actually easier to
> >understand.
>
> The question of what's understandable is similar to the question
> of what's readable. Who is your audience and what do they find
> easy?
>
(snip)
The insights above are valid. The mindsets might be characterized as
the "speed-at-all-costs school" and the "grok school":
- I want speed at all cost
- I want to have a system whose workings I can fully grok, depend on,
and alter at will. Those sections which require speed at all cost can
be recoded with the built-in assembler as needed.
IMHO the speed-at-all-costs school is giving up a lot, and not gaining
much in return. After all if they really-really needed speed at all
cost they'd code it in assembly in the first place ;-)
Cheers,
Mark W. Humphries
Forth Chat Room on ICQ #37160535
>Native-code implementations are not any harder to understand than any
>other implementation method. There's simply no indirection; in a
>native-code system DUP compiles code that performs the function of DUP,
>rather than laying down an address that, when processed by an 'inner
>interpreter' at run-time, points to the code that performs the function
>of DUP. I'd say the native-code method is actually easier to
>understand.
I'm not sure. Say that DUP is such a word. The interpreter needs to do
different things if this word is encountered in interpret or in compile
mode. It looks to me that it brings us back to the old problems of
state-smart words, POSTPONE and to macro or not to macro, with all its
inherent flaws. What is the code snippet POSTPONE DUP supposed to do?
How do you make a non-state-smart native compiler?
Bart.
> How do you make a non-state-smart native compiler?
State-smart words are not a problem for the compiler writer
(s/he doesn't need them), they are a problem for the user.
Example: DUP can be in hidden, maybe even sealed, vocabularies
that are searched *last*.
However, without STATE it is not obvious for the compiler writer
how to optimize CREATE DOES> constructs like CONSTANT
and VARIABLE (hmmm, why not use the trick quoted above..)
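For readers who have not met the CREATE DOES> construct mentioned here, a minimal sketch of a CONSTANT-like defining word in standard Forth follows. The name my-constant is made up for illustration; a real system is free to implement CONSTANT quite differently, which is exactly where the optimization question arises.

```forth
\ my-constant is a hypothetical defining word built from CREATE and DOES>
: my-constant ( n "name" -- )
    create ,          \ make a new word and store n in its data field
    does> ( -- n )    \ run-time action of every word defined this way
    @ ;               \ fetch the stored value

6 my-constant width
width .               \ prints 6
```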
-marcel
Let's review what I'm talking about. I'm talking about an engineering
team of 40+ people who have experience with C-style development
methodologies. I'm talking about an organization that has a legacy of
C-based development, and a product line that is differentiated from
our competition because of performance.
The first thing to point out is that any move towards Forth (or other
interactive languages for that matter) isn't going to happen
overnight. Any attempt to adopt Forth isn't going to happen by fiat.
What *must* happen is that Forth is gradually incorporated, both so
that people get time with the language and time with learning a
different development methodology. And certainly management isn't
going to accept such a risk on performance. If Forth will be used
anywhere, it will be first in the interactive portions of code that
run at "user speed." I see that as a very prudent move-- continue
development on the low-level stuff in C and assembler because that's
where the experience is, and use Forth for the parts that aren't
performance bound.
Once people see the value of Forth at that level, then it makes sense
to push Forth down lower. But doing that too early is too risky, both
because the engineering team is not familiar with Forth and its
development methodology, and because unlike the blind acceptance Forth
gets for everything in this newsgroup, both management and engineering
will demand proof. And that proof will have to come from experience
in using Forth in the organization.
One of the themes I constantly see in this newsgroup is the idea that
if you sprinkle Forth on your projects, everything is solved as if by
magic. Maybe that's the case for individuals who can turn on a dime,
but it doesn't work for larger groups of people. Part of what makes
Forth valuable isn't the language itself. Forth's semantics are
simple, powerful, understandable, and extensible-- but the same can be
said about Scheme and a handful of other languages. Forth's value
comes not from the language and its semantics, but in the way the
language helps shape designs.
So what happens when you give Forth to a 40+ member engineering team
who primarily work with C? You get C code transliterated to Forth.
That's another reason to introduce Forth gradually-- it gives people
the time to learn not just the language, but how to best use the
language.
As per page 44 of ANSI X3.215-1994, 'POSTPONE DUP' appends the
compilation semantics of DUP to the current definition. Interpretation
semantics of POSTPONE are undefined. The underlying implementation
method doesn't affect this at all.
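A minimal sketch of that behaviour, assuming any ANS-compatible system (the names MY-DUP and PAIR are made up for the illustration). POSTPONE DUP sits inside an immediate word, so DUP is appended to whatever definition is being compiled when MY-DUP runs:

```forth
\ MY-DUP and PAIR are hypothetical names used only for this example.
: my-dup ( -- ) postpone dup ; immediate

\ While PAIR is being compiled, MY-DUP executes and appends DUP to it.
: pair ( n -- n n ) my-dup ;

5 pair . .   \ prints 5 5
```

Whether the system lays down a call, an address or inlined machine code for DUP is invisible at this level.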
> How do you make a non-state-smart native compiler?
Native-code is subroutine-threading with inlining. Inlining is in fact
code-copying; inlined definitions are not called, but are copied into
the target definition. This is essential for all definitions with a
code-length the same as or smaller than the size of a subroutine call,
and worthwhile for definitions with a code-length slightly larger than a
subroutine call.
To arrange this, definitions carry an 'inline' flag. Once the
interpreter has determined that it is to append the semantics of a
definition to the current definition, the decision whether to compile a
call or to inline is based on that flag.
--
Neal Bridges
<http://www.interlog.com/~nbridges/> Home of the Quartus Forth compiler!
[snip]
> What is it you (as a user) don't like about native code compilers?
[snip]
The feeling like 'a user' part?
I prefer not to 'use' the language to write a program, but to grow the
language into an application; I want to blur the distinction between
application and language. Native code compilers lead to separate-
systems thinking:
language implementor - language - language user - program - program user
The computer takes an action on every word.
It doesn't store the word away and keep it in mind for later.
- Chuck Moore, quoted in Thinking Forth
One advantage of the correspondence of source code and machine
execution is the tremendous simplification of the compiler and
interpreter.
- Leo Brodie, Thinking Forth
Why let a programming language try to outthink you?
- Leo Brodie, Thinking Forth
Cheers,
Mark W. Humphries
Forth Chat Room on ICQ #37160535
In a native-code implementation, greater efficiency is achieved by using
: and ; to define CONSTANTs and VARIABLEs.
For example, in Quartus Forth (a native-code system), CONSTANTs are not
defined using CREATE/DOES> -- they're defined via a sequence equivalent
to this:
: CONSTANT
>R \ preserve the constant value across :
: \ begin a new definition
R> \ retrieve the constant value
POSTPONE LITERAL \ compile the constant value as a literal
POSTPONE ; \ end the definition
; INLINE \ mark the new constant as 'inline'
'INLINE' here is an optimization irrelevant to the technique.
Other optimization of the new definition is automatically handled by the
optimizer built into the compiler.
VARIABLE is defined using a similar technique, equivalent to this
code-sequence:
: VARIABLE
: \ begin a new definition
\ Compile the next dataspace address as a literal:
HERE POSTPONE LITERAL
POSTPONE ; \ end the definition
1 CELLS ALLOT \ allocate one cell of dataspace for the variable
; INLINE \ mark the new variable as 'inline'
Note that this definition of VARIABLE assumes separate code and
dataspace; it would need to be modified slightly for implementations
with mixed code and dataspace.
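As a rough illustration, the same technique can be tried on a system without INLINE (which is not a standard word). The sketch below assumes an ANS-compatible Forth such as Gforth; LIT-CONSTANT and LIT-VARIABLE are hypothetical names chosen so the system's own CONSTANT and VARIABLE stay untouched. The variable's cell is reserved before the new definition is started, which is one possible adjustment for mixed code and dataspace:

```forth
\ LIT-CONSTANT and LIT-VARIABLE are hypothetical names for this sketch.

: lit-constant ( n "name" -- )
  >r                \ keep the value out of the way of what : leaves behind
  :                 \ parse "name" and begin its definition
  r>                \ retrieve the value
  postpone literal  \ compile it into "name" as a literal
  postpone ;        \ end the definition of "name"
;

: lit-variable ( "name" -- )
  align here        \ address of the cell about to be reserved
  1 cells allot     \ reserve it before any code for "name" is laid down
  >r : r>
  postpone literal  \ "name" simply pushes that address
  postpone ;
;

6 lit-constant width
lit-variable nov
10 nov !
width nov @ + .     \ prints 16
```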
--
Neal Bridges
<http://www.interlog.com/~nbridges/> Home of the Quartus Forth compiler!
>So why don't I do it myself? The matrix-mult benchmark is a nice candidate
>because here "C" blows (mx)(i)Forth out of the water:
>
>Benchmark machine iForth 1.10 (s) C (s)
>--------- ------- --------------- -----
>matrix.fs P-166 1.78 0.84
>matrix.fs PII-350 0.43 0.11
>
>The results for iForth and mxForth (shown in earlier postings) are nearly
>identical.
>
>The only word to change to CODE is innerproduct:
>
>\ original
>: innerproduct ( a[row][*] b[*][column] -- int)
> 0 row-size 0 do
> >r over @ over @ * r> + >r
> cell+ swap row-byte-size + swap
> r>
> loop
> >r 2drop r>
>;
Just to give a flavour of what an optimising compiler will do, here
is the code produced by the VFX compiler in the new ProForth/VFX.
There are two versions, the first as supplied by Marcel, and the
second with the return stack activity eliminated because it is
expensive on a Pentium class CPU.
200 constant row-size
400 constant row-byte-size
: innerproduct ( a[row][*] b[*][column] -- int)
0 row-size 0 do
>r over @ over @ * r> + >r
cell+ swap row-byte-size + swap
r>
loop
>r 2drop r>
;
dis innerproduct
INNERPRODUCT
( 0045EEF8 8D6DF4 ) LEA EBP, [EBP+-0C]
( 0045EEFB C74500C8000000 ) MOV DWord Ptr [EBP], 000000C8
( 0045EF02 C7450400000000 ) MOV DWord Ptr [EBP+04], 00000000
( 0045EF09 895D08 ) MOV [EBP+08], EBX
( 0045EF0C BB00000000 ) MOV EBX, 00000000
( 0045.EF11 ) (DO) 0045.EF40
( 0045EF1A 53 ) PUSH EBX
( 0045EF1B 8B5D04 ) MOV EBX, [EBP+04]
( 0045EF1E 8B1B ) MOV EBX, 0 [EBX]
( 0045EF20 8B5500 ) MOV EDX, [EBP]
( 0045EF23 8B12 ) MOV EDX, 0 [EDX]
( 0045EF25 0FAFD3 ) IMUL EDX, EBX
( 0045EF28 5B ) POP EBX
( 0045EF29 03DA ) ADD EBX, EDX
( 0045EF2B 53 ) PUSH EBX
( 0045EF2C 83450004 ) ADD [EBP], 04
( 0045EF30 81450490010000 ) ADD [EBP+04], 00000190
( 0045EF37 5B ) POP EBX
( 0045EF38 FF0424 ) INC [ESP]
( 0045EF3B 71DD ) JNO 0045EF1A
( 0045EF3D 83C40C ) ADD ESP, 0C
( 0045EF40 53 ) PUSH EBX
( 0045EF41 5B ) POP EBX
( 0045EF42 8D6D08 ) LEA EBP, [EBP+08]
( 0045.EF45 ) RET
78 bytes ok
: innerproduct2 ( a[row][*] b[*][column] -- int)
0 -rot
row-size cells bounds do
dup @ i @ * rot + swap
row-byte-size +
cell +loop
drop
;
dis innerproduct2
INNERPRODUCT2
( 0045EE88 8BD3 ) MOV EDX, EBX
( 0045EE8A 81C220030000 ) ADD EDX, 00000320
( 0045EE90 8D6DF8 ) LEA EBP, [EBP+-08]
( 0045EE93 895500 ) MOV [EBP], EDX
( 0045EE96 8B4508 ) MOV EAX, [EBP+08]
( 0045EE99 894504 ) MOV [EBP+04], EAX
( 0045EE9C C7450800000000 ) MOV DWord Ptr [EBP+08], 00000000
( 0045.EEA3 ) (DO) 0045.EECF
( 0045EEAC 8B13 ) MOV EDX, 0 [EBX]
( 0045EEAE 8B0C24 ) MOV ECX, [ESP]
( 0045EEB1 034C2404 ) ADD ECX, [ESP+04]
( 0045EEB5 8B09 ) MOV ECX, 0 [ECX]
( 0045EEB7 0FAFCA ) IMUL ECX, EDX
( 0045EEBA 034D00 ) ADD ECX, [EBP]
( 0045EEBD 81C390010000 ) ADD EBX, 00000190
( 0045EEC3 83042404 ) ADD [ESP], 04
( 0045EEC7 894D00 ) MOV [EBP], ECX
( 0045EECA 71E0 ) JNO 0045EEAC
( 0045EECC 83C40C ) ADD ESP, 0C
( 0045EECF 8B5D00 ) MOV EBX, [EBP]
( 0045EED2 8D6D04 ) LEA EBP, [EBP+04]
( 0045.EED5 ) RET
78 bytes ok
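To check that the two source versions compute the same thing, a small harness along these lines can be used. It is only a sketch: the 3x3 data and the scaled-down constants are assumptions made for the test, `1 cells` replaces the non-standard `cell`, and the `[undefined]` guards assume a Forth-2012 system such as Gforth.

```forth
\ Test data and constants below are assumptions, not taken from the post.
[undefined] bounds [if] : bounds ( addr u -- addr+u addr ) over + swap ; [then]
[undefined] -rot   [if] : -rot ( a b c -- c a b ) rot rot ; [then]

3 constant row-size
row-size cells constant row-byte-size

create ma  1 , 2 , 3 ,  4 , 5 , 6 ,  7 , 8 , 9 ,
create mb  1 , 2 , 3 ,  0 , 0 , 0 ,  0 , 0 , 0 ,

\ First version: running sum kept on the return stack inside the loop.
: innerproduct ( a b -- int )
  0 row-size 0 do
    >r over @ over @ * r> + >r
    cell+ swap row-byte-size + swap
    r>
  loop
  >r 2drop r> ;

\ Second version: loop index walks b, sum stays on the data stack.
: innerproduct2 ( a b -- int )
  0 -rot
  row-size cells bounds do
    dup @ i @ * rot + swap
    row-byte-size +
  1 cells +loop
  drop ;

\ a is stepped by a whole row, b by one cell: 1*1 + 4*2 + 7*3
ma mb innerproduct .    \ prints 30
ma mb innerproduct2 .   \ prints 30
```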
--
Stephen Pelc, s...@mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 1703 631441, fax: +44 1703 339691
web: http://www.mpeltd.demon.co.uk
Gforth is a counterexample. Basically it is possible, and you can
find CODE, END-CODE, ;CODE, and ASSEMBLER in code.fs. You can load an
assembler for your machine and assemble away. Some version in the
future will have Andrew McKewan's assembler and disassembler for the
386, Christian Pirker's assembler and disassembler for the MIPS, and
Bernd Thallner's assembler and disassembler for the Alpha. More
donations welcome.
- anton
--
M. Anton Ertl Some things have to be seen to be believed
an...@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
That's hardly surprising, given that usual Forth compilers don't
employ the techniques that differentiate optimizing from plain C
compilers (e.g., strength reduction, induction variable elimination,
constant propagation, global register allocation, instruction
scheduling).
> The numbers can be better-- especially with poor C
> compilers-- but given that a C compiler has a lot more information
> available to optimize code, it isn't terribly surprising.
So what is this "lot more information"? And where did f2c get it from
(f2c compiles Forth to C; when both its code and manually written C
code are compiled with gcc -O3, the difference is very small)?
The only difference in information I know that may play a role in
optimization is ANSI C type-based alias analysis. But this requires
more ANSI C conformance than many C programs have, and it is not very
precise; if your C compiler does alias analysis (gcc does not), it
probably uses a more precise and less standard-conformance-sensitive
data-flow analysis approach (possibly combined with optional
type-based alias analysis). And neither approach helps much in many
integer codes.
There are also reasons why Forth can be faster than C (if both used
equally good compilers):
1) Forth allows multiple return values on the stack; the typical C
idiom is having a variable whose address is passed, causing memory
references and aliasing problems; or creating a structure that has to
be disassembled again in the calling function (boxing/unboxing), which
internal to the compiler often also causes memory references and alias
problems.
2) Forth usually uses addr u strings, C uses zero-terminated strings.
Since the count is available in advance, it is much easier to exploit
parallelism on the string; moreover, it's cheaper to determine the
length of the string.
3) Run-time code generation. If there is lots of work done with a
specific set of data, it is often faster to compile code that has the
data embedded and is possibly optimized for the data, than to
reinterpret the data again and again. This is possible in Forth, but
not (easily) in C.
This list is not exhaustive.
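The three points can be illustrated with a few lines of Forth. This is a sketch under the assumption of an ANS-compatible system; MIN-MAX, STR-LEN, MAKE-SCALER and TRIPLE are names invented for the example.

```forth
\ All word names here are hypothetical, chosen for the illustration.

\ 1) Multiple return values stay on the stack; no out-parameters.
: min-max ( a b -- min max ) 2dup min rot rot max ;

\ 2) addr u strings carry their length; no scan for a terminator.
: str-len ( c-addr u -- u ) nip ;

\ 3) Run-time code generation: define a word with the data embedded.
: make-scaler ( n "name" -- )
  >r : r> postpone literal postpone * postpone ; ;

7 3 min-max . .        \ prints 7 3
s" hello" str-len .    \ prints 5
3 make-scaler triple   \ TRIPLE is compiled now, with 3 as a literal
14 triple .            \ prints 42
```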
Append the compilation semantics of DUP to the current definition.
> How do you make a non-state-smart native compiler?
There are several methods, described in
http://www.complang.tuwien.ac.at/papers/ertl&pirker96.ps.gz, Section
4.2, and http://www.complang.tuwien.ac.at/papers/ertl&pirker97.ps.gz,
Section 2.
The three correct ones are: dual wordlists, dual xts, and intelligent
COMPILE,. For optimization I favour the intelligent COMPILE,, and if we
just want to deal with DUP, it's easy:
: compile, ( xt -- )
dup ['] dup = if
drop compile&optimize-dup
else
compile, \ the dumb COMPILE,
endif ;
If you want more than just optimize DUP, you'll probably want a
different control structure than a long IF-chain, but that's left as
an exercise to the reader.
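For readers who want to try the idea, here is a runnable sketch, assuming an ANS-compatible system. It uses a separate name, SMART-COMPILE,, instead of redefining COMPILE,, and COMPILE&OPTIMIZE-DUP is only a stand-in that counts its calls and then compiles DUP the ordinary way; a real compiler would emit optimized code there.

```forth
\ SMART-COMPILE, and DUPS-SEEN are hypothetical names for this sketch.
variable dups-seen  0 dups-seen !

: compile&optimize-dup ( -- )
  1 dups-seen +!      \ stand-in for real optimization work
  postpone dup ;      \ here: just compile DUP the ordinary way

: smart-compile, ( xt -- )
  dup ['] dup = if
    drop compile&optimize-dup
  else
    compile,          \ the ordinary COMPILE,
  then ;

\ Drive it by hand from inside two definitions:
: twice ( n -- n n ) [ ' dup smart-compile, ] ;
: inc   ( n -- n+1 ) [ ' 1+  smart-compile, ] ;

5 twice + .      \ prints 10
4 inc .          \ prints 5
dups-seen @ .    \ prints 1
```

No STATE test appears anywhere: the special case is keyed on the xt handed to the compiler, not on the mode the text interpreter is in.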
> One of the themes I constantly see in this newsgroup is the
> idea that if you sprinkle Forth on your projects, everything
> is solved as if by magic. Maybe that's the case for
> individuals who can turn on a dime, but it doesn't work for
> larger groups of people. Part of what makes Forth valuable
> isn't the language itself. Forth's semantics are simple,
> powerful, understandable, and extensible-- but the same can
> be said about Scheme and a handful of other languages.
> Forth's value comes not from the language and its semantics,
> but in the way the language helps shape designs.
> So what happens when you give Forth to a 40+ member
> engineering team who primarily work with C? You get C code
> transliterated to Forth. That's another reason to introduce
> Forth gradually-- it gives people the time to learn not just
> the language, but how to best use the language.
How about trying to get assembly language programmers to
use Forth? Assembly language transliterated to Forth might be an
improvement to both languages, particularly if it is done by
programmers who are good at commenting. Can't you sell Forth as
an interactive structured macro assembler with built in
debugging, small size, low cost and all those improved
semantics? Instead of trying to use a high level language, like
C, for low level work, start with low level Forth assembly and
show how it is easy to extend it upward.
Every few years I run into an assembly language programmer
and am amazed to find old slow 1950's ways of doing things still
in use.
--
Michael Coughlin m-cou...@ne.mediaone.net Cambridge, MA USA
Concerning CREATE...DOES>: compile the address as literal, then inline
the DOES>-routine (unless it's too large). Treating this as a
separate case in an intelligent COMPILE, should be easy.
Concerning CONSTANT: If that is implemented as a CREATE...DOES> word,
you will never get optimum performance, because the data of
CREATE...DOES> words is not guaranteed to be immutable and you will
always have to generate the address of the data and fetch from there.
It's better to define them as INLINEd colon defs, as Neal Bridges has
shown.
Concerning VARIABLE: What's the DOES> for? Simply compile the address
as a literal.
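The point about immutability can be seen in a short sketch (assuming an ANS-compatible system; CD-CONSTANT and FIVE are made-up names). Because the value lives in the body of a CREATEd word, a program may change it later, so the compiler has to fetch it every time:

```forth
\ CD-CONSTANT is a hypothetical CREATE...DOES> version of CONSTANT.
: cd-constant ( n "name" -- ) create , does> ( -- n ) @ ;

5 cd-constant five
five .               \ prints 5
7 ' five >body !     \ the body is ordinary data space and can be patched
five .               \ prints 7
```

A constant compiled as a literal inside a colon definition offers no such handle, which is what lets the compiler fold the value into the code.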
That probably should be
...
INLINE
;
It's always the things you double-check that have the errors -- thanks.
In both cases, INLINE goes inside the semicolon.
>Part of what makes
>Forth valuable isn't the language itself. Forth's semantics are
>simple, powerful, understandable, and extensible-- but the same can be
>said about Scheme and a handful of other languages. Forth's value
>comes not from the language and its semantics, but in the way the
>language helps shape designs.
Very good points! Forth doesn't automatically create productivity,
it *allows* productivity. It provides the opportunity to write
simpler, better-factored code -- and particularly it allows users
to fix early design mistakes. None of that happens without users
who know what they're trying for.
>So what happens when you give Forth to a 40+ member engineering team
>who primarily work with C? You get C code transliterated to Forth.
Yes!
>That's another reason to introduce Forth gradually-- it gives people
>the time to learn not just the language, but how to best use the
>language.
I think there are many more people who've tried Forth than there
are who use it today. They see the elegance, and then they don't
immediately see how to apply it for practical results. There are
methods that don't come automatically, and people give up if they
have a low frustration tolerance, or if they're satisfied with
what they're already doing, or if they just don't have the time
to learn new methods.
Having taught introductory Forth for 25 years, and in the last year having
taught it on a compile-to-code system, I must say that I've found
compile-to-code to have little effect on the learning curve, and what effect
there is is helpful. People are accustomed to compilers generating code.
They're comfortable with that idea. The Forth VM and indirect threading was
just one more architectural and conceptual hurdle to get over, and learning
is much simpler with it gone. The emphasis is, as IMO it should be, on how
to get results (rather than how the internal machinery works).
Cheers,
Elizabeth
> For example, in Quartus Forth (a native-code system), CONSTANTs are
> not defined using CREATE/DOES> -- they're defined via a sequence equivalent to
> this:
Well, you side-stepped the issue of STATE by renaming it to INLINE . That's
a political solution :-)
-marcel
Not to read too much into your joke, but in fact INLINE is not another
compilation state -- INLINE just specifies the compilation method for a
given definition. Moreover, you could remove INLINE from the equation
without detracting from the point I was making; it's not relevant except
from an optimization standpoint.
How would this code be picked up by the linker?
I think that John's incremental approach has the best chance of success.
There is a group dynamic among the staff and his credibility with
management to deal with. If he blows either, the game is over.
>
> --
> Michael Coughlin m-cou...@ne.mediaone.net Cambridge, MA USA
Nice work, John.
It seems to me that you have blurred the distinction between
the compiler and the (outer) interpreter. What am I missing?
> Not to read too much into your joke, but in fact INLINE is not
> another compilation state -- INLINE just specifies the compilation method for
> a given definition. Moreover, you could remove INLINE from the
> equation without detracting from the point I was making; it's not relevant
> except from an optimization standpoint.
Sorry, I realized that too late. Using a "smart" COMPILE, INLINE will let
you optimize without needing STATE and IMMEDIATE .
However, I do not understand your remark on the optimization standpoint.
Without further explanation I don't immediately see why : CONSTANT CREATE ,
INLINE DOES> funny tricks ; can't do what your colon definition does?
-marcel
> Jerry
It's not the same code. A native code optimizing compiler is more akin
to a language translator than a virtual machine interpreter.
With a traditional vm each word has a consistent interpretation action,
a consistent compilation action, and a consistent execution action.
Most words consistently compile to the same fixed-sized vm "opcode"
regardless of context. As there is a higher correspondence between
source and object the result is more reflective. These attributes can
be leveraged as you code, and in your final application.
There is a tradeoff between the regularity, consistency, simplicity,
and reflectivity of a vm interpreter and the speed of inlined optimized
native code.
The attitude that this is all "under the hood" and therefore only the
concern of the language implementor and not the "user" sweeps the
tradeoff under the rug.
I'm trying (ineptly I'm sure) to provide a counterpoint and make the
tradeoffs more apparent, especially for those new to Forth.
Now I know what you mean, but I think I disagree with your terms. For
me, it is the same code at the source level. Development is what
changes. I do disassemble code when I can, and study dumps for what they
can tell me. I feel hobbled by the non-professional version of
SwiftForth without access to the source code, especially never having
become fluent in 'x86 assembler. (I need to fix that soon!) On balance,
I prefer the faster code. I was very fond of MACH 2, which I used with
OS9 on a 68000.
Jerry
--
Engineering is the art | Let's talk about what
of making what you want | you need; you may see
from things you can get. | how to do without it.
---------------------------------------------------------
Whether you have direct threaded, indirect threaded, subroutine
threaded (native code), native code with inlining and/or other
optimizations, token threaded, bit threaded, or whatever, there
should be a very strong correspondence between source and object
code and many systems provide interactive stepped debugging with
source code display. You put down tokens that get interpreted
(again) in software or you interpret the source code and put
down instructions that use the hardware as the interpreter.
Subroutine threading (native code) is not inherently more complex
than indirect threading. The threading mechanism is not a good
measure of the complexity of a system. It is known that native
code without optimizations is slower than direct or indirect
threaded code on many architectures. There are also different
types of compiler optimizations, some that can happen
on idtc or dtc systems. Perhaps the arguments should
be more about the specific optimizations that you do or
don't like for a particular machine or model.
I think it also is interesting that Chuck Moore switched to
native code when he did cmForth for the Novix hardware. He
had no problem with the idea of native code or optimizations
at all. He did feel the optimizations in cmForth were way
too Novix specific to be portable and so he designed the
next generation of hardware for a simple native code with
inlining model to get the smallest and fastest compilers
that he had yet done.
Everything is subroutine calls except a few primitives that are
inlined in Machine Forth. It is very simple, direct, portable
etc.
In article <7n0d9u$bp8$1...@nnrp1.deja.com>,
Mark Humphries <m...@intranetsys.com> wrote:
> It's not the same code. A native code optimizing compiler is more akin
> to a language translator than a virtual machine interpreter.
I might have felt that way in 1985. I have worked with many other
Forth systems since then.
> With a traditional vm each word has a consistent interpretation action,
> a consistent compilation action, and a consistent execution action.
> Most words consistently compile to the same fixed-sized vm "opcode"
> regardless of context. As there is a higher correspondence between
> source and object the result is more reflective. These attributes can
> be leveraged as you code, and in your final application.
I think in practice this is an issue of how complex the optimization
rules are. If the compiled code is hard to relate to the source
then you might say this. I might have been worried about that sort
of thing many years ago but I did not find it to be a problem.
> There is a tradeoff between the regularity, consistency, simplicity,
> and reflectivity of a vm interpreter and the speed of inlined
> optimized native code.
You make it seem like the two are mutually exclusive. I think you
may have very different ideas about what optimizations "should"
or should not be done than I do. We must not be talking about
the same thing. Of course the correspondence between the
inlined optimized native code and the source code is trivial
when the native code is for a Forth machine (other than Novix).
But Machine Forth makes this approach portable for other machines.
I think it is very consistent with traditional Forth theory.
> The attitude that this is all "under the hood" and therefore only the
> concern of the language implementor and not the "user" sweeps the
> tradeoff under the rug.
This sounds like me talking about ANS Forth. ;-)
If you can explain all the optimization rules in a few sentences
most of these points just don't apply. Even if there are a
hundred or more optimization rules in a particular system that
we were discussing I would prefer to hear of actual experiences
than theoretical concerns about abstractions that might or might-not
apply.
> I'm trying (ineptly I'm sure) to provide a counterpoint and make the
> tradeoffs more apparent, especially for those new to Forth.
Working with hardware optimized for tiny five bit inlined opcodes,
and subroutine calls and returns I have a clear bias towards the
inlined native code approach. I also promote the easy to see
relationship between source and object code.
If I work with some hardware that does idtc or btc in hardware then
what do you call it? Is it native code or is it indirect threaded
code or bit threaded code? The Novix had a sort of threading
mechanism in hardware I guess to complicate the discussion
even of cmForth. The msb determined jump or call so for a range
of addresses a list of addresses was also a list of subroutine
calls to those addresses, something like that. A lot of what
would be compiled would be indistinguishable from a list of
addresses in indirect threaded code.
--
Jeff Fox Ultra Technology
www.UltraTechnology.com
The function would be equivalent; defining constants using : and ;
results in smaller, faster code. There's no need for the overhead
(both memory and extra processing time) of CREATE for a CONSTANT -- a
simple literal is sufficient.
--
Neal Bridges
<http://www.interlog.com/~nbridges/> Home of the Quartus Forth compiler!
My proposal for innerproduct in high level is:
: innerproduct ( a[row][*] b[*][column] -- int )
0 swap rot mat-size bounds do
I @ swap @+ >r * + r>
row-byte-size +loop
drop ;
Not only is this shorter, it's also much faster. On a 5x86/133 (from
AMD, really a fast clocked 486, but with a fast integer multiply), this
runs in 2.1 seconds instead of 8.2 seconds (bigFORTH). BOUNDS and @+ are
bigFORTH words:
: BOUNDS ( start n -- end start ) OVER + SWAP ;
: @+ ( addr -- n addr' ) DUP @ SWAP CELL+ ;
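A small harness for trying this version, as a sketch only: the 3x3 data and the constants ROW-SIZE, ROW-BYTE-SIZE and MAT-SIZE are assumptions made for the test (MAT-SIZE is taken to be the size of the whole matrix in bytes), and BOUNDS and @+ are defined as above so the code also loads on systems other than bigFORTH.

```forth
\ Test data and constants are assumptions, not taken from the benchmark.
3 constant row-size
row-size cells constant row-byte-size
row-size row-byte-size * constant mat-size

: bounds ( start n -- end start )  over + swap ;
: @+     ( addr -- n addr' )       dup @ swap cell+ ;

create ma  1 , 2 , 3 ,  4 , 5 , 6 ,  7 , 8 , 9 ,
create mb  1 , 2 , 3 ,  0 , 0 , 0 ,  0 , 0 , 0 ,

: innerproduct ( a b -- int )
  0 swap rot mat-size bounds do
    i @ swap @+ >r * + r>
  row-byte-size +loop
  drop ;

\ a is stepped by a whole row, b by one cell: 1*1 + 4*2 + 7*3
ma mb innerproduct .   \ prints 30
```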
I'm really surprised how well iForth manages to execute the code from
Anton's matrix original, or is it just the slow multiplier on the P5
that makes it look "good" (Am5x86 has a fast multiplier, as has P6)?
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
That's more an interface design issue, since it's Windows that has to do
most of the work (display fonts). When you design your terminal output
like MINOS is doing it, you can only win the prize for the most useless
"WORDS" display. I do the following optimization: gather all lines - no
need to display in between the lines unless the output flow stops for
more than 50 ms. Further, don't scroll when you don't need to (and you
don't need to if there is another scroll immediately following, again
the 50 ms rule). Finally, something like WORDS just writes to the
buffer, and finally (when it's all over), it scrolls to the end and
redraws the window. You can easily keep up with megabytes text per
second.
There is nothing magic with Forth speed, it can be (theoretically) done
in any language; especially in the Windows (or X) interface case, where
the dominating time is not spent in the Forth code, anyway. It's just
that the fast development speed of Forth allows us to tune more.
IMHO it depends a lot on the people you want to teach Forth. It's just
like learning to swim: there are two ways. Either just jump into the
water and learn diving first, or "gradually", with dry swimming first,
and safety belts later. You can't apply the first method to people who
are too old and lost their "keep your mouth shut when you are under
water" reflex*. But people learning it the second way never learn it
properly (or at least it takes them very long, and even then they'll
fear waves etc.).
Learning new things must give you a benefit. If you can't learn all at
once (new language, new development methodology, factoring, etc.), you may
miss the benefit totally. If you just transliterate C into Forth, you
get slower, less readable programs, and it takes even longer to develop
them. This can really result in a lot of people with bad Forth
experience ("tried once, nearly drowned"). So the advice is: use the
language where the benefit comes fast. That's exactly what John is
doing.
*) worse: they can't keep their nose shut.
No, a benchmark loses its value when you start to optimize it in
this way, at least for comparing languages. A faster algorithm is
irrelevant, unless it can't be coded in the other languages (e.g.
run-time code generation?).
>: innerproduct ( a[row][*] b[*][column] -- int )
> 0 swap rot mat-size bounds do
> I @ swap @+ >r * + r>
> row-byte-size +loop
> drop ;
Stephen Pelc did it this way in his 2nd attempt for VFX. Even
faster is to use a temporary variable or local instead of the
R-stack (probably not for VFX).
>Not only is this shorter, it's also much faster. On a 5x86/133 (from
>AMD, really a fast clocked 486, but with a fast integer multiply), this
>runs in 2.1 seconds instead of 8.2 seconds (bigFORTH). BOUNDS and @+ are
>bigFORTH words:
>
>: BOUNDS ( start n -- end start ) OVER + SWAP ;
>: @+ ( addr -- n addr' ) DUP @ SWAP CELL+ ;
>
>I'm really surprised how well iForth manages to execute the code from
>Anton's matrix original, or is it just the slow multiplier on the P5
>that makes it look "good" (Am5x86 has a fast multiplier, as has P6)?
The P5 has a very slow multiplier (13 clocks I think). The PII multiplies
in 2 or 3 cycles. Once that happens instruction scheduling becomes very
important and every instruction (pairing, AGI stalling) counts. I've
thought a little bit about that for iForth before I even got the PII.
What amuses me most is that a tiny bit of assembler makes the whole benchmark
more than twice as fast as the f2c version.
-marcel
I can fully grok, depend on, and alter at will my bigFORTH code
generator. It's a very simple macro inlining and pattern matching
peephole optimization. Actually, the replacement for , in a threaded
code system are three screens of code (and 9 or so screens of pattern
tables). I can easily grok the effect of three screens of code, even if
they do a lot of black magic. Really. I do not need the most simple
thing to be able to understand it. bigFORTH's compiler is a compromise
between easy to implement/easy to port and speed (at least for
register-starved CISC processors). It gives you fully predictable and
mostly decompilable code and reasonable speed. Writing in assembler is
so much harder that it pays off instantly. Especially when your code
doesn't have just a few hot spots - and even the 80/20 rule shows that
if you can speed up the 80% part by a factor of 5, you still get another
overall gain of 2 by speeding up the other 20% by that factor (because
they became dominating).
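The arithmetic behind that remark can be checked at the keyboard. This is a sketch with assumed figures: run time is counted in percent of the original, split 80/20, and RUN-TIME is a name invented for the example.

```forth
\ RUN-TIME is a hypothetical helper; times are percent of the original.
: run-time ( hot-factor cold-factor -- t )
  20 swap /          \ time left in the 20% part
  swap 80 swap /     \ time left in the 80% part
  + ;

1 1 run-time .   \ prints 100  (nothing optimized)
5 1 run-time .   \ prints 36   (80% part sped up 5x)
5 5 run-time .   \ prints 20   (20% part sped up 5x as well)
36 10 * 20 / .   \ prints 18, i.e. a further gain of 1.8x
```

So after the first speed-up the untouched 20% accounts for 20 of the remaining 36 units, and optimizing it too gives a further factor of 1.8, close to the "gain of 2" mentioned above.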
matrix-mult.fs and bubble-sort.fs are from Marty Fraeman's
translations of the Stanford integer benchmarks. sieve.fs comes from
you and probably originates in the Byte sieve; fib.fs was written by
me (but using this algorithm as benchmark has a long tradition).
You can find these benchmarks and their C versions on
http://www.complang.tuwien.ac.at/forth/bench.zip.
The fear factor is gone and they are starting to see large benefits.
The test dept doesn't need programmers since the end user after 4 to 8
hours of training designs their own scripts.
I teach none of the internals of FORTH except for the data stack. I
tell them to treat it like a HP calculator with a larger stack.
They are asking for more.
Simon
==========================================
On Mon, 19 Jul 1999 11:04:12 -0400, Michael Coughlin
<m-cou...@ne.mediaone.net> wrote:
>John Passaniti wrote:
>>
>> <euph...@freenet.co.uk> wrote in message
>> news:7mthak$2se$3...@news1.cableinet.co.uk...
>> > What's wrong with writing it in Forth, profiling it,
>> > getting it as fast as possible in Forth, and then going
>> > to assembler for the time-critical bits - using, of
>> > course, the inbuilt Forth assembler to retain
>> > interactivity?
>> > That's the standard model - like anything in Forth, it
>> > doesn't pay to think in batch-mode terms.
>
> [snip]
>
>> One of the themes I constantly see in this newsgroup is the
>> idea that if you sprinkle Forth on your projects, everything
>> is solved as if by magic. Maybe that's the case for
>> individuals who can turn on a dime, but it doesn't work for
>> larger groups of people. Part of what makes Forth valuable
>> isn't the language itself. Forth's semantics are simple,
>> powerful, understandable, and extensible-- but the same can
>> be said about Scheme and a handful of other languages.
>> Forth's value comes not from the language and its semantics,
>> but in the way the language helps shape designs.
>
>> So what happens when you give Forth to a 40+ member
>> engineering team who primarily work with C? You get C code
>> transliterated to Forth. That's another reason to introduce
>> Forth gradually-- it gives people the time to learn not just
>> the language, but how to best use the language.
>
> How about trying to get assembly language programmers to
>use Forth? Assembly language transliterated to Forth might be an
>improvement to both languages, particularly if it is done by
>programmers who are good at commenting. Can't you sell Forth as
>an interactive structured macro assembler with built in
>debugging, small size, low cost and all those improved
>semantics? Instead of trying to use a high level language, like
>C, for low level work, start with low level Forth assembly and
>show how it is easy to extend it upward.
>
> Every few years I run into an assembly language programmer
>and am amazed to find old slow 1950's ways of doing things still
>in use.
>
>--
>Michael Coughlin m-cou...@ne.mediaone.net Cambridge, MA USA
Simon - http://www.spacetimepro.com
May I stand on the sidelines and cheer you on?
>The first thing to point out is that any move towards Forth (or other
>interactive languages for that matter) isn't going to happen
>overnight. Any attempt to adopt Forth isn't going to happen by fiat.
>What *must* happen is that Forth is gradually incorporated, both so
>that people get time with the language and time with learning a
>different development methodology. And certainly management isn't
>going to accept such a risk on performance.
---------------- snip -------------
John,
Your view is the trend in Forth today. Dr. Ting designed his e-forth so that
the code would be more familiar to non-forthers, and that they could use the
editors and debuggers they use everyday.
When Moore developed Forth, one aim was to unify the tools and language he
needed in programming. At the time there were a variety of minicomputers
each with their own variation of languages, editors, debuggers, etc., and
having to learn each in turn was a great time waster. Having Forth inside
and everywhere saved a great deal of time and energy. As E. Rather wrote,
even the syntax of the Forth assembler was developed as a virtual system to
which each computer system's assembler was translated.
Forth was designed to increase productivity through an integrated,
interactive system that encouraged early prototyping, and allowed rapid
iteration of the code, compile, test, debug cycle to the smallest code
module possible, the subroutine (Word). (And to do so in 64 KBytes or less.)
To consider Forth merely as a language is to miss the point.
No question, the amazing power of even a $1000 computer and the wide
availability of documentation and code libraries do much to promote C/C++,
and in computerdom, the leader of the pack tends to be way ahead. So
introducing Forth where its light can shine brightest is a good idea, and if
it leads to greater use and appreciation of Forth's power, even better.
Walter Rottenkolber
On 1999-07-18 era...@forth.com said:
:Marcel Hendrix wrote in message <7ms7tq$qgn$1...@news.IAEhv.nl>...
:>euph...@freenet.co.uk again:
:>> polyForth was indirectly threaded, and so would be slow compared
:>>to a native Forth, but then it does also offer some advantages
:>>over native Forth too, in much the same way as a Lisp
:>>interpreter offers some advantages over a Lisp compiler. Those
:>>advantages may have nothing to do with speed, but it doesn't
:>>mean that they're meaningless.
:>Well, give us some specific examples then. For instance, what
:>would've happened if the PolyForthers of yore had had to use the
:>present SwiftForth compiler technology? Would they have failed
:>miserably?
No. So? You seem to be imagining that I have said "well, native code is
all shite". That's not what I said. I said that a threaded model has
some advantages, and I was very careful to qualify it. Simplicity of
concept is one. Speed is not.
:>Do you actually know how a native code Forth compiler
:>looks, viewed from the user side?
Why have you taken my preference personally?
:They would have been delighted. What is it you (as a user) don't
:like about native code compilers?
Elizabeth, Marcel is very pro-native. I'm the one who seems to be cast
as the villain here.
What I don't like about native code is that it's a bugger to roll your
own. Stringing addresses of code together is swift, elegant and uniform,
and you are guaranteed a certain layout in memory. It also makes
experimenting with novel control structures, or self-mutating code, a
doddle - as do Lisp functions. It's trivial to decompile and debug. If
you need to drop down to assembler, you can do so, and you have a new
primitive.
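As a rough illustration of the points above (a definition is a string of references to other words, decompiling is a lookup, and a new primitive slots in alongside the rest), here is a toy model. It is a sketch under stated assumptions, not the layout of polyFORTH or any real system: the names Word, primitive, colon, execute and decompile are invented for the example, Python object references stand in for the addresses a real threaded Forth would store, and literals and control flow are left out.

```python
# Toy model of a threaded-code Forth; every name here is hypothetical.
stack = []
dictionary = {}           # word name -> Word

class Word:
    def __init__(self, name, code=None, body=None):
        self.name = name
        self.code = code  # Python callable for a primitive
        self.body = body  # list of Word references for a colon definition

def primitive(name, fn):
    dictionary[name] = Word(name, code=fn)

def colon(name, source):
    # "Compiling" appends exactly one reference per source word.
    dictionary[name] = Word(name, body=[dictionary[w] for w in source.split()])

def execute(word):
    if word.code is not None:
        word.code()
    else:
        for cell in word.body:   # the inner interpreter: walk the thread
            execute(cell)

def decompile(name):
    # One cell per source word, so decompiling is a simple lookup.
    return " ".join(cell.name for cell in dictionary[name].body)

def _swap():
    stack[-1], stack[-2] = stack[-2], stack[-1]

primitive("dup", lambda: stack.append(stack[-1]))
primitive("swap", _swap)
primitive("*", lambda: stack.append(stack.pop() * stack.pop()))
primitive("+", lambda: stack.append(stack.pop() + stack.pop()))
colon("square", "dup *")
colon("sum-of-squares", "square swap square +")
```

With 3 and 4 on the stack, execute(dictionary["sum-of-squares"]) leaves 25, and decompile("sum-of-squares") gives back the source text word for word.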
Lisp enjoys similar advantages. So, for that matter, does Smalltalk;
Squeak is very firmly bytecode-based.
I know that you have protested about the prevalence of the roll-your-own
mentality in Forth, and I don't wish to encourage it or propose it as
the only solution. Clearly it's not. However, for me it is the right
one.
What I do like about native code is the speed. That's lovely. However, I
find that my computers are fast enough to stand a little cycle loss,
whilst my mind is not fast enough to stand a lot of complexity.
:>> It seems to me that we've all gone on to native Forth advocacy
:>>just to fend off the C people who come up, look over our
:>>shoulders, go "Eww, it's not compiled, it must be so slow!" and
:>>completely miss the point...
:>It is not just "to fend off the C people." If one actually uses
:>Forth to *do things* it makes no sense to use a compiler that
:>generates 16-bit segmented code, can't interface to system
:>libraries, takes *minutes* to read a file that can be parsed in 38
:>*ms* if done right, and is 100 or 1000 times (he, Jeff!) slower
:>than schoolbook-C. These systems all equate to the same thing: a
:>bad, antiquated, dusty, buggy, *implementation* of the fine
:>language that I like so very much.
None of which has a thing to do with threading, as Elizabeth's confusion
demonstrates nicely. A poor native code implementation could easily end
up with an equally disastrous set of results, and a well-done threaded
implementation would not suffer from these problems (let alone the
16-bit segmented code bug, which only applies to a single CPU
architecture which has long ceased to be current).
:>Now what will indirect-threaded code give me that could compensate
:>adequately for any of the negative aspects I enumerate above?
I don't know, and frankly doubt very much, whether it could give you
anything. All I know is what it would give me, and that's what matters
to me. Frankly, even if there were a killer point about ITC that made it
far better, on the evidence of this post my chances of getting you to
see it would be remote. There isn't; it's an issue of personal
preference, and I've outlined what informs mine above. I don't have a
problem with yours differing.
:What negatives? What dreadful compiler are you referring to? Our
:native code compilers are blazing fast, far faster than polyFORTH's
:compiler. You've tried SwiftForth; was its compiler slow?
I haven't tried SwiftForth, and I probably won't unless (or until) you
bring out a Linux version. (If I ever get a Linux installed on any of my
machines... but that's another story.)
--
Communa -- you know soft spoken changes nothing
On 1999-07-19 era...@forth.com said:
:Having taught introductory Forth for 25 years, and in the last year
:having taught it on a compile-to-code system, I must say that I've
:found compile-to-code to have little effect on the learning curve,
:and what effect there is is helpful. People are accustomed to
:compilers generating code. They're comfortable with that idea. The
:Forth VM and indirect threading was just one more architectural and
:conceptual hurdle to get over, and learning is much simpler with it
:gone. The emphasis is, as IMO it should be, on how to get results
:(rather than how the internal machinery works).
Surely for an introductory Forth course, you don't need to mention what
the code generation architecture is? At least, not until the end...
On 1999-07-20 I said:
:I'm going to reply to Marcel and Elizabeth at once, because somehow
:I missed these bits of Marcel's post last time around.
I must at this point proffer my apologies. This post was worded rather
more strongly than was appropriate, and after resolving to change it
whilst at work today, and then having to fight with my ISP to actually
get any news down, I clean forgot. Sorry, Marcel.
>In article <7mrjs5$4qn$1...@oak.prod.itd.earthlink.net>,
> "Elizabeth D Rather" <era...@forth.com> wrote:
>> The new SwiftForth is so fast it can keep up with 38.4 K input
>> displayed to the screen, which the Windows experts say can't be
>> done (and that's using Windows at both ends).
>That's more an interface design issue, since it's Windows that has to do
>most of the work (display fonts). When you design your terminal output
>like MINOS is doing it, you can only win the prize for the most useless
>"WORDS" display. I do the following optimization: gather all lines - no
>need to display in between the lines unless the output flow stops for
>more than 50 ms. Further, don't scroll when you don't need to (and you
>don't need to if there is another scroll immediately following, again
>the 50 ms rule). Finally, something like WORDS just writes to the
>buffer, and finally (when it's all over), it scrolls to the end and
>redraws the window. You can easily keep up with megabytes of text per
>second.
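The 50 ms rule described in the quoted text can be sketched roughly as follows. This is not MINOS code: the class name CoalescingTerminal, the repaint callback and the tick method are assumptions made for the example, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
# Hypothetical sketch of output coalescing: writes only fill a buffer, and the
# screen is repainted once the output flow has paused for more than 50 ms.
class CoalescingTerminal:
    def __init__(self, repaint, idle_seconds=0.050):
        self.repaint = repaint            # callback receiving the pending lines
        self.idle_seconds = idle_seconds
        self.pending = []
        self.last_write = None

    def write_line(self, line, now):
        # Writing never touches the screen directly.
        self.pending.append(line)
        self.last_write = now

    def tick(self, now):
        # Called periodically by the host event loop.
        if self.pending and now - self.last_write > self.idle_seconds:
            self.repaint(self.pending)    # one scroll and redraw for the batch
            self.pending = []
```

A burst of output such as a WORDS listing then costs one redraw rather than one per line, which is where the claimed throughput comes from.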
What's the prize for the most useless WORDS implementation? It must be
pretty steep with so damn many entrants.
A decent word display should follow normal UI standards. Whatever the
sort order, lists should be in left aligned columns, sorted down then
across. And put a high priority on display of the most frequently
and/or recently changed information. That means that WORDS should
list down by columns, and pause at the end of a full screen of output.
How hard would it be to write that in, say, GForth?
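For reference, the "sorted down then across" layout itself is small. The sketch below is in Python rather than Forth and is not taken from any Forth system; the function name and the width and gap parameters are assumptions, and it lays out a single block without the per-screen pause.

```python
# Hypothetical sketch: left-aligned columns, filled top to bottom first.
def columns_down_then_across(names, width=80, gap=2):
    if not names:
        return []
    cell = max(len(n) for n in names) + gap   # uniform column width
    ncols = max(1, width // cell)
    nrows = -(-len(names) // ncols)           # ceiling division
    lines = []
    for r in range(nrows):
        row = names[r::nrows]                 # every nrows-th name starts a new column
        lines.append("".join(n.ljust(cell) for n in row).rstrip())
    return lines
```

With seven one-letter names and room for three columns, the first column reads a, b, c from top to bottom, the second d, e, f, and the third holds g.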
I learnt Forth from a Forth Inc. microFORTH manual. It seemed at the time
that all those memory diagrams of boxes linked by arrows representing
address pointers helped enormously in making the whole system easy to
understand and use productively in a very short time. Do you really think,
in retrospect, that I would have done better to have ignored them and you
would have done better to have left them out?
Philip.
Member of FIG-UK http://www.users.zetnet.co.uk/aborigine/forth.htm
The key to M. Simon's success (and the growth of FORTH) is
the idea of working with engineers and telling them to "treat it
like an HP calculator with a larger stack". Simon just removed
the FUD (Fear, Uncertainty, and Doubt) from FORTH because the
user community is now dealing with a familiar concept.
If we want FORTH to grow this is a great way to win new
users. Unfortunately it will only work in occupations where the
HP calculator is a familiar and trusted tool. Just find someone
using an HP and give them a great program (your favorite FORTH
implementation) that will let them use their PC like a bigger,
faster HP calculator.
regards
Jerry Gitomer
M.Simon wrote in message <37949857...@news.alpha.net>...
>
>I'm having a lot of luck with engineers who need to peek, poke, and do
>simple I/O with scripts in a cold iron bring up for a large aerospace
>company. The effort has been well received and they are asking for
>more.
>
>The fear factor is gone and they are starting to see large benefits.
>The test dept doesn't need programmers since the end user after 4 to 8
>hours of training designs their own scripts.
>
>I teach none of the internals of FORTH except for the data stack. I
>tell them to treat it like an HP calculator with a larger stack.
>
>They are asking for more.
>
>Simon
>==========================================
Depends on the diagram! Certain usage issues (e.g., how redefinitions are
handled; concept of words calling previously defined words, etc.) are
clearer with diagrams. But about 10 years ago we quit insisting on people
knowing the actual structure of a dictionary, and learning got better
because the internal structural issues were introducing confusing and
ultimately unhelpful issues into the process.
Cheers,
Elizabeth
> On 1999-07-20 I said:
> :I'm going to reply to Marcel and Elizabeth at once, because somehow
> :I missed these bits of Marcel's post last time around.
> I must at this point proffer my apologies. This post was worded rather
> more strongly than was appropriate,
No apology is needed. I confess that my original post was deliberately
unpolished in some places (aka designed to be offensive to some).
I think you really tried to formulate a helpful response to my query,
keeping an open attitude to my ill-formulated point of view, so I'd like
to carry on the discussion a little bit further.
on Tue Jul 20 23:58:01 CEST 1999 you wrote:
> :Marcel Hendrix wrote in message <7ms7tq$qgn$1...@news.IAEhv.nl>...
> :>euph...@freenet.co.uk again:
> :>> polyForth was indirectly threaded, and so would be slow compared
> :>>to a native Forth, but then it does also offer some advantages
> :>>over native Forth too, in much the same way as a Lisp
> :>>interpreter offers some advantages over a Lisp compiler. Those
> :>>advantages may have nothing to do with speed, but it doesn't
> :>>mean that they're meaningless.
> :>Well, give us some specific examples then. For instance, what
> :>would've happened if the PolyForthers of yore had had to use the
> :>present SwiftForth compiler technology? Would they have failed
> :>miserably?
> No. So? You seem to be imagining that I have said "well, native code is
> all shite". That's not what I said. I said that a threaded model has
> some advantages, and I was very careful to qualify it. Simplicity of
> concept is one. Speed is not.
The "simplicity of concept" advantage is always brought up (I have held
this discussion regularly for the last 15 years, since I wrote my first
Forth native code compiler for the ZX80), but never ever gets explained
beyond that.
So you should've read my line as "yes, yes, I know, give me examples
please," not as, "that's not true." It is of course clear to you
that I do not believe indirect-threaded code is a simpler concept than
sub-routine threading or native code compilation. So my sentence about
PolyForth simply stated my *opinion* that had it been a native code
compiler, it would have been even better. At least one former user
seemed to agree with me.
> :>Do you actually know how a native code Forth compiler
> :>looks, viewed from the user side?
> Why have you taken my preference personally?
You can also interpret this as a straightforward question. The answer
helps me to decide if it is necessary to describe the iForth implementation
of the concept to you, so you can give examples why a different (simpler?)
concept would be better, and why.
> :They would have been delighted. What is it you (as a user) don't
> :like about native code compilers?
> Elizabeth, Marcel is very pro-native. I'm the one who seems to be cast
> as the villain here.
I certainly don't think of you as a "villain." I do not discuss with
villains (in a usenet context: people whose depths I'm unable or unwilling
to probe.)
> What I don't like about native code is that it's a bugger to roll your
> own. Stringing addresses of code together is swift, elegant and uniform,
> and you are guaranteed a certain layout in memory.
"Elegant" goes a bit too far.. Subroutine-threading does the other things
even better. And I fail to see the big advantage of the memory layout?
> It also makes
> experimenting with novel control structures, or self-mutating code, a
> doddle - as do Lisp functions.
I do all of these things in iForth. If you can give examples of what you
can do and think I can't, I'd like to compare notes.
> It's trivial to decompile and debug. If
> you need to drop down to assembler, you can do so, and you have a new
> primitive.
Decompiling is trivial, writing a decompiler is not (Bernd Paysan probably
doesn't agree :-) I admit I don't see any value in decompiling code I have
the source code of (given the compiler is correct, which is above discussion.)
Why would debugging indirect-threaded code be easier? Why can't you use
assembler in a native code compiler (didn't I give a nice enough example?)
> Lisp enjoys similar advantages. So, for that matter, does Smalltalk;
> Squeak is very firmly bytecode-based.
> I know that you have protested about the prevalence of the roll-your-own
> mentality in Forth, and I don't wish to encourage it or propose it as
> the only solution. Clearly it's not. However, for me it is the right
> one.
That is hopefully not how I put it. I rolled-my-own, my whole generation
rolled their own. So long as you keep quiet about it and don't offer it
to anybody else (unless it is very, very good, is extremely well tested
and has lots and lots of documentation), it's no problem. But at a
certain point we have to (can) tell the young-uns to simply download
"that excellent Forth" and "start from there." (if you want to flame
me on this, please make sure you don't misunderstand me).
> What I do like about native code is the speed. That's lovely. However, I
> find that my computers are fast enough to stand a little cycle loss,
> whilst my mind is not fast enough to stand a lot of complexity.
Then we differ: I still see and have to use computers and software that
are at least 10 times too slow or too small for what I want to do *now*.
And if they adapt, I have uses for factors of 100 to 1000 times faster
(maybe not larger). And after that (I'll be 65 then :-) I can use extra
factors to prevent having to think about how to code something at all.
And of course, you're too modest.
[..]
> :>It is not just "to fend off the C people." If one actually uses
> :>Forth to *do things* it makes no sense to use a compiler that
> :>generates 16-bit segmented code, can't interface to system
> :>libraries, takes *minutes* to read a file that can be parsed in 38
> :>*ms* if done right, and is 100 or 1000 times (he, Jeff!) slower
> :>than schoolbook-C. These systems all equate to the same thing: a
> :>bad, antiquated, dusty, buggy, *implementation* of the fine
> :>language that I like so very much.
This is probably badly formulated. What I describe, except for execution
speed, are not the characteristic attributes of an indirect-threaded Forth.
Gforth for Linux would be a counter example. However, none of the
listed (not exhaustively) elements can be present in a Forth that is
acceptable to me.
> None of which has a thing to do with threading, as Elizabeth's confusion
> demonstrates nicely. A poor native code implementation could easily end
> up with an equally disastrous set of results, and a well-done threaded
> implementation would not suffer from these problems (let alone the
> 16-bit segmented code bug, which only applies to a single CPU
> architecture which has long ceased to be current).
See above.
Is it possible to write a high-level indirect-threaded Forth *in itself*
(not using lots of assembly language or "C") with none of the
characteristics mentioned?
And doesn't a 68000 in some environments have these charming 64K
segments? (The second native-code Forth I wrote was for the 68000).
> :>Now what will indirect-threaded code give me that could compensate
> :>adequately for any of the negative aspects I enumerate above?
> I don't know, and frankly doubt very much, whether it could give you
> anything. All I know is what it would give me, and that's what matters
> to me. Frankly, even if there were a killer point about ITC that made it
> far better, on the evidence of this post my chances of getting you to
> see it would be remote. There isn't; it's an issue of personal
Honestly, to "see it" was my only reason for continuing the thread.
Speed is not my ultimate goal, in that case I wouldn't be using Forth
(or would be using it differently).
I've decided the "tolerable" difference between Forth and C for me to
be a factor 2, because it is easily bridged by simple techniques (as
already shown).
> preference, and I've outlined what informs mine above. I don't have a
> problem with yours differing.
I have no problem with that either. But in order to prevent this
conversation from being a fruitless exchange of useless information,
I hope you will fill in some of the details above a little bit more.
If you don't want to, that's OK too.
[..]
-marcel
Perhaps there is an advantage to understanding all of a system that is
greater than the sum of its parts, so when systems reach a size at which a
global understanding does not seem reasonably achievable it becomes less
worthwhile to spend time learning about aspects that do not appear to relate
directly to the job in hand.
> I can fully grok, depend on, and alter at will my bigFORTH code
> generator. It's a very simple macro inlining and pattern matching
> peephole optimization. Actually, the replacement for , in a threaded
> code system are three screens of code (and 9 or so screens of pattern
> tables). I can easily grok the effect of three screens of code, even
> if they do a lot of black magic. Really. I do not need the most simple
> thing to be able to understand it.
[snip]
Wow, 3 screens of black magic and 9 or so screens of tables as a replacement
for , in a threaded code system. I guess our views of simplicity are
worlds apart.
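To make the disagreement concrete, here is a minimal sketch of table-driven peephole optimization. It is not bigFORTH's code generator, which works on machine code; this version rewrites sequences of word names, and the PATTERNS table and the peephole function are assumptions invented for the example.

```python
# Hypothetical pattern table: a window of source words -> its replacement.
PATTERNS = {
    ("1", "+"): ["1+"],
    ("2", "*"): ["2*"],
    ("swap", "swap"): [],     # cancels out entirely
    ("dup", "drop"): [],
}

def peephole(tokens, patterns=PATTERNS, window=2):
    """Single left-to-right pass replacing matching windows of tokens."""
    out = []
    i = 0
    while i < len(tokens):
        key = tuple(tokens[i:i + window])
        if key in patterns:
            out.extend(patterns[key])   # emit the fused form
            i += window
        else:
            out.append(tokens[i])       # no match: copy the token through
            i += 1
    return out
```

The driver loop stays short while the table grows with every case worth catching, and the output no longer has one cell per source word. Both sides of the argument are visible in that trade.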
Cheers,
Mark W. Humphries
Forth Chat Room on ICQ #37160535
Do not spurn The Old Ways, Luke, they have served the Force well. :-)
> I have worked with every type of threading mechanism
> that I have heard of and these discussions just don't seem
> to be very applicable to the issue of threading and Forth style.
I don't think these are issues of style. They pertain to how we view a
Forth's underlying architecture. If you treat your Forth as a black box
it might as well optimize to the hilt (for speed, size, or whatever
external characteristic), you probably don't care much how it looks
under the hood anyway. If you treat your Forth as a glass box then you
might be willing to surrender some external efficiency for an engine
that has certain characteristics (cleanliness, simplicity, ease of
maintenance, ease of modification, robustness, whatever turns you on).
> Whether you have direct threaded, indirect threaded, subroutine
> threaded (native code), native code with inlining and/or other
> optimizations, token threaded, bit threaded, or whatever, there
> should be a very strong correspondence between source and object
> code and many systems provide interactive stepped debugging with
> source code display. You put down tokens that get interpreted
> (again) in software or you interpret the source code and put
> down instructions that use the hardware as the interpreter.
The cfas laid down in a colon definition by a threaded system have a
much more direct correspondence to source than inlined optimized code.
You're right that there should be a very strong correspondence. In a
threaded system this is a given; I doubt this is the case in most
native code Forths.
Perhaps the implementors will respond to this themselves.
There is a large semantic gap to be bridged between Forth and most cpu
instruction sets; the more they optimize, the less correspondence there
will be between their object code and their source.
> Subroutine threading (native code) is not inherently more complex
> than indirect threading.
Subroutine threading is a clean threaded architecture, it only gets
messy when you throw in inlining and optimization.
> The threading mechanism is not a good measure of the complexity of a
> system.
It's not a complete measure of complexity, but there is a positive
correlation.
> It is known that native code without optimizations is slower than
> direct or indirect threaded code on many architectures. There are
> also different types of compiler optimizations, some that can happen
> on idtc or dtc systems. Perhaps the arguments should
> be more about the specific optimizations that you do or
> don't like for a particular machine or model.
Actually I'm not going to start an argument on particular
optimizations, I'd be woefully ill-equipped to do so, and anyhow this
is not my point. I've noted that many of the clf contributors strongly
advocate optimizing native code compilation for Forth. A clf reader
could get the impression that threaded code was just a mistaken tangent
in the early days of Forth, and that the consensus today is that native
code optimizing Forth is unilaterally superior to threaded code, don't
even think about that old dusty funky anachronistic threaded stuff. I
think such an impression would be a disservice to newcomers.
> I think it also is interesting that Chuck Moore switched to
> native code when he did cmForth for the Novix hardware. He
> had no problem with the idea of native code or optimizations
> at all. He did feel the optimizations in cmForth were way
> too Novix specific to be portable and so he designed the
> next generation of hardware for a simple native code with
> inlining model to get the smallest and fastest compilers
> that he had yet done.
Chuck has the luxury of working with Forth hardware where the native
code is Forth.
> Everything is subroutine calls except a few primitives that are
> inlined in Machine Forth. It is very simple, direct, portable
> etc.
I'm sure Machine Forth is much simpler than most of the inlined
optimized native code forthers are using and advocating. Chuck has the
discipline to maintain extreme simplicity and carefully weigh his
compromises. Machine Forth is most likely the exception rather than the
rule.
> In article <7n0d9u$bp8$1...@nnrp1.deja.com>,
> Mark Humphries <m...@intranetsys.com> wrote:
>> It's not the same code. A native code optimizing compiler is more akin
>> to a language translator than a virtual machine interpreter.
> I might have felt that way in 1985. I have worked with many other
> Forth systems since then.
>> With a traditional vm each word has a consistent interpretation
>> action, a consistent compilation action, and a consistent execution
>> action. Most words consistently compile to the same fixed-sized vm
>> "opcode" regardless of context. As there is a higher correspondence
>> between source and object the result is more reflective. These
>> attributes can be leveraged as you code, and in your final
>> application.
> I think in practice this is an issue of how complex the optimization
> rules are. If the compiled code is hard to relate to the source
> then you might say this. I might have been worried about that sort
> of thing many years ago but I did not find it to be a problem.
I think, like Chuck, you assign a very high weighting to simplicity
when making design and architectural tradeoffs. I don't think either of
you is representative of most Forth implementors.
>> There is a tradeoff between the regularity, consistency, simplicity,
>> and reflectivity of a vm interpreter and the speed of inlined
>> optimized native code.
> You make it seem like the two are mutually exclusive.
No, but there is a positive correlation. Optimizations tend to increase
the distance between source and object code. Of course the underlying
cpu instruction set is a factor.
> I think you may have very different ideas about what optimizations
> "should" or should not be done than I do. We must not be talking
> about the same thing. Of course the correspondence between the
> inlined optimized native code and the source code is trivial
> when the native code is for a Forth machine (other than Novix).
A Forth chip defines away the problem since its instruction set has a
high correspondence to Forth source.
> But Machine Forth makes this approach portable for other machines.
> I think it is very consistent with traditional Forth theory.
See my earlier comment on Machine Forth.
>> The attitude that this is all "under the hood" and therefore only the
>> concern of the language implementor and not the "user" sweeps the
>> tradeoff under the rug.
> This sounds like me talking about ANS Forth. ;-)
Not surprising, I think we're on a similar wavelength regarding ANS
Forth ;-)
> If you can explain all the optimization rules in a few sentences
> most of these points just don't apply. Even if there are a
> hundred or more optimization rules in a particular system that
> we were discussing I would prefer to hear of actual experiences
> than theoretical concerns about abstractions that might or might-not
> apply.
Well, how about applying a source to object correspondence litmus test.
The goal would be the most speed increase for the least source to
object distance.
>> I'm trying (ineptly I'm sure) to provide a counterpoint and make the
>> tradeoffs more apparent, especially for those new to Forth.
> Working with hardware optimized for tiny five bit inlined opcodes,
> and subroutine calls and returns I have a clear bias towards the
> inlined native code approach. I also promote the easy to see
> relationship between source and object code.
>
> If I work with some hardware that does idtc or btc in hardware then
> what do you call it?
I call it lucky ;-)
> Is it native code or is it indirect threaded
> code or bit threaded code? The Novix had a sort of threading
> mechanism in hardware I guess to complicate the discussion
> even of cmForth. The msb determined jump or call so for a range
> of addresses a list of addresses was also a list of subroutine
> calls to those addresses, something like that. A lot of what
> would be compiled would be indistinguishable from a list of
> addresses in indirect threaded code.
Once again a Forth chip defines away the problem since its instruction
set has a high correspondence to Forth source.
Jeff, thanks for your thought-provoking response.
Probably a few dozen lines.
If you want a pause at the end of a full screen of output, load more.fs.
However, prompting at the end of a screen of output is not everyone's
idea of a good UI. I prefer programs that don't prompt me, so I
prefer to get the whole stuff at once, and then I scroll back and
forth at will (this, of course, requires an environment that supports
scrolling back; in my case it's xterm).
In such an environment we have no page structure, so sorting down then
across becomes questionable (ls -C with more than one page of output
is quite tiring, because it requires constantly scrolling back and
forth), so IMO the "list across then down" with no columns is a good
idea (it's also more compact).
The most recently added words are listed first. Does this address
your "put a high priority on display of the most frequently and/or
recently changed information" criterion?
Generally, IMO WORDS is not very useful anyway (however it displays
its information). If I want to see if a word is there, I simply type
its name (possibly using Gforth's name completion). The only use I
see is for checking whether the words I defined went into the right
wordlist, and for checking whether REQUIRE included a file or not.
I did the same to Byte's sieve as what I did to Marty's matmul in the
last message: I didn't improve the algorithm, but did use the fastest
control structure available. After all, they are translations, and
translations have to be done properly (well, the sieve isn't, I had
posted a factored sieve here some time ago; that's doing it properly).
The counterside of changing the benchmarks is that we would have to do
them all over again, and that's difficult at least.
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
> Perhaps there is an advantage to understanding all of a system that is
> greater than the sum of its parts, so when systems reach a size at
> which a global understanding does not seem reasonably achievable it
> becomes less worthwhile to spend time learning about aspects that do
> not appear to relate directly to the job in hand.
Good insight.
Maybe the start of a new Psychology of Bloatware? :-)
> Philip.
> Member of FIG-UK http://www.users.zetnet.co.uk/aborigine/forth.htm
Cheers,
Mark W. Humphries
Forth Chat Room on ICQ #37160535
On 1999-07-21 m...@iaehv.iae.nl(MarcelHendrix) said:
:> I must at this point proffer my apologies. This post was worded
:>rather more strongly than was appropriate,
:No apology is needed. I confess that my original post was
:deliberately unpolished in some places (aka designed to be
:offensive to some).
Thank you for your grace.
:I think you really tried to formulate a helpful response to my
:query, keeping an open attitude to my ill-formulated point of view,
:so I'd like to carry on the discussion a little bit further.
OK. Let's see where this leads...
:> :>Well, give us some specific examples then. For instance, what
:> :>would've happened if the PolyForthers of yore had had to use
:> :>the present SwiftForth compiler technology? Would they have
:> :>failed miserably?
:> No. So? You seem to be imagining that I have said "well, native
:> code is all shite". That's not what I said. I said that a
:> threaded model has some advantages, and I was very careful to
:> qualify it. Simplicity of concept is one. Speed is not.
:The "simplicity of concept" advantage is always brought up (I have
:held this discussion regularly for the last 15 years, since I wrote
:my first Forth native code compiler for the ZX80), but never ever
:gets explained beyond that.
The difference between now and 15 years ago is that it would appear that
now your argument has the currency of broad acceptance. This strikes me
as rather odd. I'm not questioning your own devotion to the cause of
NCC - as you say, you wrote your first for the ZX80, an architecture
that needed speed if anything ever did. (Must be more than 15 years ago
now...) What puzzles me is that now computers have speed to spare, NCC
has suddenly caught on; why is what was good enough for the Honeywell
DDP-116 now not good enough on a machine that is potentially thousands
of times faster? Especially in these days of ever increasing software
complexity. Elizabeth quoted a 4x speed increase recently; to my mind,
that isn't really good enough to justify the extra complexity of
implementation.
:So you should've read my line as "yes, yes, I know, give me examples
:please," not as, "that's not true." It is of course clear to you
:that I do not believe indirect-threaded code is a simpler concept
:than sub-routine threading or native code compilation. So my
Firstly, in terms of implementation, it is much simpler. To compile a
definition, you need only grab a word, find it, and either call , or
EXECUTE depending on what you're doing at the time and the nature of the
word itself. You can do that with subroutine threading too, but as soon
as you contemplate inlining, you have to worry about length, whether the
word can be inlined, etc. And that's even before we touch on
optimisation. As Mark Humphries points out, we're talking 20 characters
vs 3 screens.
Secondly, it's more compact. That may be less important on a machine
with several megabytes of memory, and especially on 32-bit or 64-bit
CPUs (although the IA64, with its 128-bit instructions, may somewhat
alter this), but it's still important for the little black boxes
running in embedded situations. (That's not related to simplicity.)
Thirdly, one high-level word maps to exactly one low-level word, except
in a few clear cases where a parameter is always needed (eg. jumps,
constants). And the exceptions can be disposed of, too, leaving an exact
correspondence between source and object code. I like that; I think that
is conceptual simplicity. Also, it opens the way to truly reflective
environments where source code saved in a file is no different from
object code decompiled on the fly. (Decompilation in itself does become
trivial; even disposing of jumps backwards allows this.) Not only this,
but code can alter itself with an innate understanding of what it is
doing; eliminate the one-to-one correspondence between source and object
and you lose this, having to worry about variable-length instruction
blocks and bits of code that have simply disappeared inside the
optimiser.
:sentence about PolyForth simply stated my *opinion* that had it
:been a native code compiler, it would have been even better. At
:least one former user seemed to agree with me.
And yet it was plenty good enough when it was released.
:> :>Do you actually know how a native code Forth compiler
:> :>looks, viewed from the user side?
:> Why have you taken my preference personally?
:You can also interpret this as a straightforward question. The
:answer helps me to decide if it is necessary to describe the iForth
:implementation of the concept to you, so you can give examples why
:a different (simpler?) concept would be better, and why.
OK. I've never used iForth, so go for it. (I have taken a look at
mxForth, but to be honest I didn't think the complexity was worth it for
the optimisations - and it did seem to be somewhat tied to the x86. I
will revisit it, though.)
:> :They would have been delighted. What is it you (as a user)
:> :don't like about native code compilers?
:> Elizabeth, Marcel is very pro-native. I'm the one who seems to be
:> cast as the villain here.
:I certainly don't think of you as a "villain." I do not discuss with
:villains (in a usenet context: people whose depths I'm unable or
:unwilling to probe.)
Well, then, you should have moderated your tone first time round. ;P
:> What I don't like about native code is that it's a bugger to roll
:>your own. Stringing addresses of code together is swift, elegant
:>and uniform, and you are guaranteed a certain layout in memory.
:"Elegant" goes a bit too far.. Subroutine-threading does the other
:things even better. And I fail to see the big advantage of the
:memory layout?
Elegance is in the eye of the beholder - or maybe tailor, for JVN ;> Or
perhaps I should have said "beautiful"? I find indirect threaded code
elegant. Every word looks exactly the same; it starts with the address
of a primitive routine, has an optional body consisting of some data
structure (whether that be an array of addresses, or a list, or...).
Even primitives follow this structure. This gives you object typing for
free, aside from its other benefits.
:> It also makes
:> experimenting with novel control structures, or self-mutating
:> code, a doddle - as do Lisp functions.
:I do all of these things in iForth. If you can give examples of
:what you can do and think I can't, I'd like to compare notes.
OK. Quickly, now; I'll give you the address of a word, and you have to
tell me whether or not it's a colon definition. I can do it with one
indirect comparison.
Maybe you can do it. Maybe it's not even complicated, as it won't be if
you've started all your colon words with the equivalent of a calling
convention. But I can use exactly the same method to determine the type
of any word in my system, and when I write an object system that's
exactly the kind of property I want. Can you?
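A minimal sketch of that one-comparison test, assuming a classic
indirect-threaded system in which an execution token is the address of
the code field and every colon definition's code field holds the same
run-time address. The names sample, docol and colon? are made up for
illustration; on a subroutine-threaded or native-code system the fetch
would return something unrelated.

```forth
\ Assumption: indirect-threaded system, xt = code field address.
: sample ;                            \ any colon definition will do
' sample @ CONSTANT docol             \ what a colon definition's code field holds here
: colon? ( xt -- flag )  @ docol = ;  \ one fetch, one comparison

VARIABLE v
' colon? colon? .                     \ expected to be true on such a system
' v      colon? .                     \ expected to be false on such a system
```

The same pattern, with a different reference word in place of sample,
would classify variables, constants and so on.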
How about this one:
: thing do some code @ then some more ;
Then I go on to define fifty more words using thing, before realising
that I've pulled out a single indirection rather than the double
indirection I need. (Why I haven't noticed before, and why I've keyed in
all that code at the keyboard rather than using a file, are beside the
point here.)
Bugger.
: @@ @ @ ;  ' @@ ' thing 4 CELLS + !
Problem solved. Your turn. (And if I can do this that easily, so can my
machine. It may be worth looking for the paper on Synthesis for why this
may be a desirable property.)
:> It's trivial to decompile and
:> debug. If you need to drop down to assembler, you can do so, and
:> you have a new primitive.
:Decompiling is trivial, writing a decompiler is not (Bernd Paysan
:probably doesn't agree :-)
How can decompilation be made trivial when some of your source words may
not even be there, and when your compiler may generate identical code
for two different basic blocks? Instead of getting what you actually
told your computer to do, you get the decompiler writer's best guess at
it. That's not quite good enough.
:I admit I don't see any value in
:decompiling code I have the source code of (given the compiler is
:correct, which is above discussion.)
No it isn't, it's directly at the heart of the discussion. It's a lot
easier to verify a compiler for ITC than for NCC.
:Why would debugging
:indirect-threaded code be easier? Why can't you use assembler in a
:native code compiler (didn't I give a nice enough example?)
Debugging ITC can be done in a completely sandboxed environment. You can
even treat the word you're debugging as an array and manually step
through it. And of course, it's a lot easier to patch it up once you've
found an error (at which point you can verify that the patch works, and
go and change the source code).
As far as I can tell, without a *lot* of extra complexity, debugging
native code can only be done by letting the native chip execute the
instructions - maybe only one at a time.
:> I know that you have protested about the prevalence of the
:> roll-your-own mentality in Forth, and I don't wish to encourage
:> it or propose it as the only solution. Clearly it's not. However,
:> for me it is the right one.
:That is hopefully not how I put it.
Given that I was replying to Elizabeth in the above paragraph, no, it's
not. :>
:(if you want to flame me on this, please make sure you
:don't misunderstand me).
That comment is more bait than the entire paragraph that preceded it...
I completely agree with you; if someone wants to learn the language, we
*need*, as a community, to be able to point at a free version and go
"Try that. You'll like it, and it's as well supported as you could wish
for." It would also be nice if it were delightfully simple, given the
convolution of F-PC. Hopefully Gforth (despite the ugly name) will be
that standard.
:> What I do like about native code is the speed. That's lovely.
:> However, I find that my computers are fast enough to stand a
:> little cycle loss, whilst my mind is not fast enough to stand a
:> lot of complexity.
:Then we differ: I still see and have to use computers and software
:that are at least 10 times too slow or too small for what I want to
:do *now*.
Do tell.
:And if they adapt, I have uses for factors of 100 to 1000
:times faster (maybe not larger). And after that (I'll be 65 then
::-) I can use extra factors to prevent having to think about how to
:code something at all.
That's just lazy. :>
:And of course, you're too modest.
Actually, I don't think I am; I'm just realistic. I don't have the
attention span for a lot of detail - in fact, I don't have much of an
attention span at all.
:This is probably badly formulated. What I describe, except for
:execution speed, are not the characteristic attributes of an
:indirect-threaded Forth. Gforth for Linux would be a counter
:example. However, none of the listed (not exhaustively) elements
:can be present in a Forth that is acceptable to me.
Nor to me.
:Is it possible to write a high-level indirect-threaded Forth *in
:itself* (not using lots of assembly language or "C") with none of
:the characteristics mentioned?
Is it possible to write a high-level native Forth in itself with similar
characteristics? How would you handle copying when you're not even sure
how addresses are represented? Besides, many ITC Forths bundle up the
kernel into about 1k of tight code; make that relocatable and copy it
into the target image, and you have it. You could even define the kernel
as a word in itself.
:And doesn't a 68000 in some environments have these charming 64K
:segments? (The second native-code Forth I wrote was for the 68000).
And is a 68000 current today? Indeed, has the 64k limitation been
current since the 68020? Admittedly, Quartus Forth is still 16-bit - but
then as Quartus is NCC, perhaps that's not something to shout about. :>
(Sorry, Neal - no slight intended; I'm sure it was the right choice for
the platform.)
:> I don't know, and frankly doubt very much, whether it could give
:> you anything. All I know is what it would give me, and that's
:> what matters to me. Frankly, even if there were a killer point
:> about ITC that made it far better, on the evidence of this post
:> my chances of getting you to see it would be remote. There isn't;
:Honestly, to "see it" was my only reason for continuing the thread.
Then I don't think I can help. All I can do is tell you why I prefer it.
I don't think there is a killer advantage in the same way as speed is
the killer advantage of NCC. Perhaps this is the nature of evolution -
if so, I remain your humble servant P. Terandon.
:Speed is not my ultimate goal, in that case I wouldn't be using
:Forth (or would be using it differently).
:I've decided the "tolerable" difference between Forth and C for me to
:be a factor 2, because it is easily bridged by simple techniques (as
:already shown).
Fair enough. But for me, the difference doesn't even enter the equation,
because of the inbuilt advantages of Forth. A better comparison, for me,
would be between Forth and Visual Basic, or Forth and ELisp. C is just
beyond help (and the pail).
:I have no problem with that either. But in order to prevent this
:conversation from being a fruitless exchange of useless information,
:I hope you will fill in some of the details above a little bit more.
:If you don't want to, that's OK too.
I hope I've filled them in sufficiently for you to be able to understand
my preference a little better.
(Oh, yes, I've reverted to a former identity, and a new account. I think
this one's a stayer.)
--
the desk lisard communa time's taught the killing game herself
On 1999-07-22 m...@intranetsys.com said:
:> I can fully grok, depend on, and alter at will my bigFORTH code
:> generator. It's a very simple macro inlining and pattern matching
:> peephole optimization. Actually, the replacement for , in a
:>threaded code system are three screens of code (and 9 or so
:>screens of pattern tables). I can easily grok the effect of three
:>screens of code, even if they do a lot of black magic. Really. I
:>do not need the most simple thing to be able to understand it.
:Wow, 3 screens of black magic and 12 screens of tables as a
:replacement for , in a threaded code system. I guess our views of
:simplicity are worlds apart.
With a state machine, he could probably get away with just the 3 screens
of code. :> However, compared with : , HERE ! 1 CELLS ALLOT ; it *is* overly
complicated. RI/Forth is about the simplest Forth I've ever come across,
and a good advert for subroutine threading with inlining (the only
optimisation it does is tail-call); but its compilation engine is still
fearsomely complex. (btw, I haven't seen this in any archives, so if
anyone wants it I can mail it, and a *big* thanks to the person who
mailed it to me originally.)
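For comparison, a sketch of the threaded-code compiling word written
with standard words only. It is named my, here (a hypothetical name) so
that it does not replace the system's own , and it assumes data space is
already cell-aligned, which holds straight after CREATE.

```forth
\ Store one cell at HERE and advance the dictionary pointer.
: my, ( x -- )  HERE !  1 CELLS ALLOT ;

CREATE table  10 my,  20 my,  30 my,
table CELL+ @ .    \ prints 20
```

In an indirect-threaded system the compiler loop needs nothing more than
this to lay down each word's address.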
About the simplest way to progress is to assume that every processor has
an accumulator and two index registers, and code little evenly-sized
chunks of code that can be combined together to produce the 24 P21
primitives (I don't like @R+ and !R+; caching R is complicated for the
x86, and they're a mix of concepts. Better to add a second address
latch, even splitting @ and ! between them). But even then there are
some unpleasant surprises awaiting the incautious implementer. That way,
though, you *might* get a small code generator - but it won't be so
simple. And as for unoptimised inlining - non, merci! It leads to
dreadful code, and dreadful bloat. (Bad Forth! Bad!)
--
xian the desk lisard time's taught the killing game herself
Vous,
Would you consider making it available generally? I imagine that one of
the usual repositories would be glad to host it.
>In article <3794f8d9...@seagoon.newcastle.edu.au>,
> ec...@cc.newcastle.edu (Bruce R. McFarling) writes:
>> A decent word display should follow normal UI standards. Whatever the
>> sort order, lists should be in left aligned columns, sorted down then
>> across. And put a high priority on display of the most frequently
>> and/or recently changed information. That means that WORDS should
>> list down by columns, and pause at the end of a full screen of output.
>>
>> How hard would it be to write that in, say, GForth?
>
>Probably a few dozen lines.
>
>If you want a pause at the end of a full screen of output, load more.fs.
>
>However, prompting at the end of a screen of output is not everyone's
>idea of a good UI. I prefer programs that don't prompt me, so I
>prefer to get the whole stuff at once, and then I scroll back and
>forth at will (this, of course, requires an environment that supports
>scrolling back; in my case it's xterm).
Fine, as long as it's the top of the list, if words are listed
in reverse chronological order.
>In such an environment we have no page structure, so sorting down then
>across becomes questionable (ls -C with more than one page of output
>is quite tiring, because it requires constantly scrolling back and
>forth), so IMO the "list across then down" with no columns is a good
>idea (it's also more compact).
In that environment it is perfectly all right to simply
display a single column. ``list across then down with no columns'' is
at best the simplest design to implement if the word is just being
provided to satisfy a nominal requirement. There's no other excuse
for it.
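The index arithmetic behind a "down then across" listing is small. A
sketch in standard Forth that prints the numbers 0 to n-1 in that order,
standing in for word names. The names rows-needed, .down-across and the
three variables are hypothetical, and a real WORDS would walk the
wordlist and left-align names rather than print right-aligned numbers.

```forth
VARIABLE #items   VARIABLE #cols   VARIABLE #rows

\ Rows needed to hold n items in the given number of columns (rounds up).
: rows-needed ( n cols -- rows )  TUCK 1- + SWAP / ;

: .down-across ( n cols -- )
  #cols !  #items !
  #items @ #cols @ rows-needed #rows !
  #rows @ 0 ?DO                    \ J = current row in the inner loop
    #cols @ 0 ?DO                  \ I = current column
      I #rows @ * J +              \ item index for this column and row
      DUP #items @ < IF 8 .R ELSE DROP THEN
    LOOP
    CR
  LOOP ;

10 3 .down-across    \ rows: 0 4 8 / 1 5 9 / 2 6 / 3 7
```

Reading down the first column gives 0 1 2 3, then the second column
continues with 4 5 6 7, which is the ordering asked for above.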
>The most recently added words are listed first. Does this address
>your "put a high priority on display of the most frequently and/or
>recently changed information" criterion?
Only if the top of the list is what is seen when the words are
done flashing on the screen. Normally the highest priority words are
those that are most likely to be out of sight.
>Generally, IMO WORDS is not very useful anyway (however it displays
>its information). If I want to see if a word is there, I simply type
>its name (possibly using Gforth's name completion). The only use I
>see is for checking whether the words I defined went into the right
>wordlist, and for checking whether REQUIRE included a file or not.
Most people with access to such a poorly designed tool would
conclude that the tool was not very useful in any event.
(
----------
Virtually,
Bruce McFarling, Newcastle,
ec...@cc.newcastle.edu.au
)
>About the simplest way to progress is to assume that every processor has
>an accumulator and two index registers, and code little evenly-sized
>chunks of code that can be combined together to produce the 24 P21
>primitives (I don't like @R+ and !R+; caching R is complicated for the
>x86, and they're a mix of concepts. Better to add a second address
>latch, even splitting @ and ! between them).
It struck me that if @R+ and !R+ is the best way to do that,
then the best way to do @A+ and !A+ is just @+ and !+. That is, if a
latch is used to access memory for one stack, why not for the other?
If a latch is superfluous for accessing memory for one stack, why not
for the other? The provision of a latch for each stack is at least
consistent:
A@ A!
@A @A+
!A !A+
PUSH-B PULL-B
@B+ !B+
PUSH PULL
> Most people with access to such a poorly designed tool would
>conclude that the tool was not very useful in any event.
Mind, I'm not trying to be stroppy or anything like that.
I use it regularly, but then I put all the words that are not needed
by an application into a hidden vocabulary. IMHO PUBLIC and PRIVATE
are good candidates for a new standard release. I really like
FETCH-WORDS{ word1 word2 ... wordn }
which moves the bracketed words to the current compilation wordlist.
It is handy to "post-clean" bloated wordlists. Originally I "invented"
it for my metacompiler to shuffle words between ROOT and FORTH
vocabulary before ONLY ALSO et al was defined, but it has more uses
(not for those of us who thrive on chaos, of course ;-)
Andreas
Yes, that would be better.
> In that environment it is perfectly all right to simply
> display a single column. ``list across then down with no columns'' is
> at best the simplest design to implement if the word is just being
> provided to satisfy a nominal requirement.
Actually a single column would be quite a bit simpler (note that
Gforth's WORDS does word-wrap, at least if Gforth is right about the
screen width).
> There's no other excuse
> for it.
A single column would display less per screenful and thus require more
scrolling.
However, in Gforth's case, the rationale for the design is that
earlier systems did it the same way. If Gforth did it differently,
the user would have to work this out when switching between systems,
and IMO WORDS' utility is not large enough to justify this; especially
when one of the few times that WORDS is used is when the user uses the
system for the first time.
> Only if the top of the list is what is seen when the words are
> done flashing on the screen.
Just tried it. On my 168x75 character xterm the output of WORDS fits
on one screen:-) (Normally I use 80x75 characters, though).
> Most people with access to such a poorly designed tool would
> conclude that the tool was not very useful in any event.
So what do you think a well-designed WORDS would be useful for?
In our practice, the major change that's occurred in 15 years is _who_ is
doing the compiling. In early Forths, the major characteristic was that the
compiler, assembler, user/debugging support, source management,
everything -- was running on a tiny, 8- or 16-bit machine along with its
application. In that environment, there was a great need for keeping all
this as simple and clean as possible.
Today, in contrast, our practice at FORTH, Inc. is to use a Windows-based
PC as a host for developing for all targets (including, of course, itself).
This platform is awesomely powerful in comparison with the past. This
changes the tradeoffs significantly: our goal is to make a system that's
simple and easy for users, and produces target code optimized for the needs
of the application. This means the host software can be as complex as
necessary to achieve these goals. More error checking, more debugging aids
(including a source-level single-stepper, see below), and optimizing
compilers, for example.
> :> What I don't like about native code is that it's a bugger to roll
> :>your own. Stringing addresses of code together is swift, elegant
> :>and uniform, and you are guaranteed a certain layout in memory.
If you're an amateur compiler writer "rolling your own" Forth, that's
another good reason for keeping the compiler simple. It's a reasonable
tradeoff to have a slower, possibly larger program in exchange for simpler
development of the environment. This logic doesn't, however, apply to those
of us making professional tools for professional users; if we're going to
ask people to spend money on our software and commit significant projects to
it, we have to do whatever's necessary at our end to give them a product
that will bring a good return on their investment in terms of programming
ease and target performance.
> :Why would debugging
> :indirect-threaded code be easier? Why can't you use assembler in a
> :native code compiler (didn't I give a nice enough example?)
>
>Debugging ITC can be done in a completely sandboxed environment. You can
>even treat the word you're debugging as an array and manually step
>through it. And of course, it's a lot easier to patch it up once you've
>found an error (at which point you can verify that the patch works, and
>go and change the source code).
>
>As far as I can tell, without a *lot* of extra complexity, debugging
>native code can only be done by letting the native chip execute the
>instructions - maybe only one at a time.
It can also be done with a source-level single-step debugger such as we
supply with SwiftForth and our SwiftX cross-compilers. The debugger does
compile things a little differently, but it's intended to let you test your
logic, and the logic doesn't change when it's optimized.
Cheers,
Elizabeth
<concerning how WORDS formats data>
I prefer to factor out paging and word-wrap from routines like WORDS
which produce data for display. I have WORDS just produce a data
stream and the system does the paging and word-wrap (paging is as
simple as turning on a counter in CR). If a fancier display is
needed, then add a fancy filter.
With the formatting factored out, WORDS and other such words stay
simple, not needing to contain logic to handle display modes, windows,
ANSI.SYS present or not, direct screen writes, writes to standard
output, buffered writes, printing and other such possibilities.
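A sketch of the "counter in CR" idea in standard Forth, assuming a
24-line screen. The names line#, page-size and paged-cr are hypothetical;
how the system's own CR is hooked or revectored is system-specific and
not shown.

```forth
VARIABLE line#   0 line# !
24 CONSTANT page-size              \ assumed screen height in lines

: paged-cr ( -- )
  CR  1 line# +!
  line# @ page-size 1- >= IF       \ leave one line for the prompt
    ." -- more --" KEY DROP CR
    0 line# !
  THEN ;
```

A listing word that calls paged-cr wherever it would otherwise call CR
gets paging without containing any paging logic of its own.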
--
Tom Zegub tze...@dhc.net
WasteLand http://www.dhc.net/~tzegub
|_|_|_|_
| | | | http://www.dhc.net/~tzegub/fop.htm
[..]
> What puzzles me is that now computers have speed to spare, NCC
> has suddenly caught on; why is what was good enough for the Honeywell
> DDP-116 now not good enough on a machine that is potentially thousands
> of times faster? Especially in these days of ever increasing software
> complexity. Elizabeth quoted a 4x speed increase recently; to my mind,
> that isn't really good enough to justify the extra complexity of
> implementation.
Maybe it isn't so complex to implement. Maybe the (speed) advantage is
larger than a factor of 4 (but surely less than a factor of 100).
An advantage of indirect-threading that hasn't been mentioned yet
(surprisingly!) is that implementing a VM around the actual CPU and
being able to neglect OS issues brings enormous benefits in a
fragmented market. But once you don't need or want to support the
Honeywell DDP-116, IBM mainframes, a Cray, a PDP11, the Dragon, the
ORIC, the Apple II etc. anymore, and write only for the Intel x86
under WIN32, the "disadvantages" of a VM and OS independence start
showing. Especially when you want to compete with companies that
*have* taken the time and effort to write for the bare metal.
> :So you should've read my line as "yes, yes, I know, give me examples
> :please," not as, "that's not true." It is of course clear to you
> :that I do not believe indirect-threaded code is a simpler concept
> :than sub-routine threading or native code compilation. So my
> Firstly, in terms of implementation, it is much simpler. To compile a
> definition, you need only grab a word, find it, and either call , or
> EXECUTE depending on what you're doing at the time and the nature of the
> word itself.
That is not a completely fair comparison: I think it is very complex to
write an indirect-threaded implementation that can compete with a
native-code approach (or with code a C compiler or assembly language
programmer can produce). We agreed on the list of things we don't want
to see in a Forth, remember? To take the example of PolyForth again:
that was an implementation where a lot of effort (which I equate to
complexity) went into making the helicopter lift off.
> You can do that with subroutine threading too, but as soon
> as you contemplate inlining, you have to worry about length, whether the
> word can be inlined, etc. And that's even before we touch on
> optimisation. As Mark Humphries points out, we're talking 20 characters
> vs 3 screens.
But the result is definitely not the same, so this is not a valid comparison.
Do the Forths that result from these approaches do the same? IMO, to get an
equivalent result an indirect-threaded Forth must do <unnamed things>
differently. And it must be unique approaches that are not possible in a
native code Forth.
> Secondly, it's more compact. That may be less important on a machine
> with several megabytes of memory, and especially on 32-bit or 64-bit
> CPUs (although the IA64, with its 128-bit instructions, may somewhat
> alter this), but it's still important for the little black boxes
> running in embedded situations. (That's not related to simplicity.)
It is a popular misconception that a native code Forth with a desirable
minimum functionality (say: all ANS word sets :-) is larger than a
conventionally threaded one. (It is of course true for a
subroutine-threaded Forth without optimization.) Please check Anton
Ertl's f2c results for some hard facts.
> Thirdly, one high-level word maps to exactly one low-level word, except
> in a few clear cases where a parameter is always needed (eg. jumps,
> constants). And the exceptions can be disposed of, too, leaving an exact
> correspondence between source and object code. I like that; I think that
> is conceptual simplicity. Also, it opens the way to truly reflective
I am a dork, I know, but how does "conceptual simplicity" result in you
doing a better real-world job than I can? We are discussing "under-the-hood"
concepts here, not Forth on the source code level.
> environments where source code saved in a file is no different from
> object code decompiled on the fly. (Decompilation in itself does become
> trivial; even disposing of jumps backwards allows this.) Not only this,
> but code can alter itself with an innate understanding of what it is
> doing; eliminate the one-to-one correspondence between source and object
> and you lose this, having to worry about variable-length instruction
> blocks and bits of code that have simply disappeared inside the
> optimiser.
This is a matter of taste and I will not try to debate it.
My taste: I don't worry about the above. Would you resist applying
Wil Baden's pinhole optimizer to an indirect-threaded Forth?
> :sentence about PolyForth simply stated my *opinion* that had it
> :been a native code compiler, it would have been even better. At
> :least one former user seemed to agree with me.
> And yet it was plenty good enough when it was released.
To me, the user seemed to feel that he had done work the vendor
could have made unnecessary, and in retrospect hadn't benefited
much from that exercise.
[..]
> OK. I've never used iForth, so go for it. (I have taken a look at
> mxForth, but to be honest I didn't think the complexity was worth it for
> the optimisations - and it did seem to be somewhat tied to the x86. I
> will revisit it, though.)
You actually feel mxForth is more complex than, say, Gforth? I'll say
some more below.
The other two points were answered above.
[..]
> :I certainly don't think of you as a "villain." I do not discuss with
> :villains (in a usenet context: people whose depths I'm unable or
> :unwilling to probe.)
> Well, then, you should have moderated your tone first time round. ;P
My style is iterative. I often have to clarify and sometimes even apologize
in follow-up messages. I do admire authors that can write postings
completely consistent with all they've ever said before, with relentless
logic and without ever making even a spelling mistake. But I am not
afraid to make mistakes (although sometimes I don't want to admit them)
because I think it's necessary to gain new knowledge.
> :> What I don't like about native code is that it's a bugger to roll
> :>your own. Stringing addresses of code together is swift, elegant
> :>and uniform, and you are guaranteed a certain layout in memory.
> :"Elegant" goes a bit too far.. Subroutine-threading does the other
> :things even better. And I fail to see the big advantage of the
> :memory layout?
> Elegance is in the eye of the beholder - or maybe tailor, for JVN ;> Or
> perhaps I should have said "beautiful"? I find indirect threaded code
I'm not going to discuss elegance and beauty, nor tailors :-)
> elegant. Every word looks exactly the same; it starts with the address
> of a primitive routine, has an optional body consisting of some data
> structure (whether that be an array of addresses, or a list, or...).
> Even primitives follow this structure. This gives you object typing for
> free, aside from its other benefits.
Objects. Hmmm, flame-bait.
> OK. Quickly, now; I'll give you the address of a word, and you have to
> tell me whether or not it's a colon definition. I can do it with one
> indirect comparison.
If I couldn't do that, how would I code a smart COMPILE, ? It stands
to reason that a native code compiler stores and extracts a lot more
(explicit) information about the source code than any other compiler.
> Maybe you can do it. Maybe it's not even complicated, as it won't be if
> you've started all your colon words with the equivalent of a calling
> convention. But I can use exactly the same method to determine the type
Yes.
> of any word in my system, and when I write an object system that's
> exactly the kind of property I want. Can you?
Yes. But it doesn't prove anything.
A native code compiler like iForth or mxForth, or bigFORTH, doesn't take
your source code to completely re-order it in an unpredictable way,
like a C compiler might. In iForth you can say 1 ' DUP EXECUTE D.
and : test 1 2 [ ' DUP COMPILE, ] ROT . . . ; and with some
practice you will know exactly what happens and what is compiled.
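For anyone who wants to try the second example on another system, here is a
minimal sketch using only standard words (COMPILE, is from the CORE extension
set). It assumes nothing about iForth itself; what a native compiler actually
emits for it will differ from system to system.

```forth
\ Sketch only: standard Forth, not tied to iForth.
\ [ ' DUP COMPILE, ] appends DUP to the definition by hand,
\ so TEST behaves as if it had been written  1 2 DUP ROT . . .
: TEST ( -- )  1 2 [ ' DUP COMPILE, ] ROT . . . ;

TEST   \ stack before the dots is 2 2 1, so this prints 1 2 2
```

The first example, 1 ' DUP EXECUTE D. , prints the two copies of 1 as a single
double number, so its output depends on the cell width of the system.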
> How about this one:
> : thing do some code @ then some more ;
> Then I go on to define fifty more words using thing, before realising
> that I've pulled out a single indirection rather than the double
> indirection I need. (Why I haven't noticed before, and why I've keyed in
> all that code at the keyboard rather than using a file, are beside the
> point here.)
Or why you don't simply scroll back the console or copy the keyboard
history or the screen to the clipboard or .. :-)
> Bugger.
> : @@ @ @ ; ' @@ ' thing 4 CELLS + !
> Problem solved. Your turn. (And if I can do this that easily, so can my
> machine. It may be worth looking for the paper on Synthesis for why this
> may be a desirable property.)
We're getting nearer. So not only do you want an indirect-threaded Forth,
you also want this fact to be exposed, so you can make use of it.
:NONAME do some code @ @ then some more ;
HERE SWAP d#) jmp, ' thing 5 MOVE
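Both patches above depend on the memory layout of one particular system. As a
portable illustration of the idea only, the sketch below models a threaded
word as a table of execution tokens and patches one slot in place. THING
mirrors the name in the post; RUN-THREAD, CELL-A and CELL-B are hypothetical
helpers, and slot 1 here stands in for the system-specific 4 CELLS offset.

```forth
VARIABLE CELL-B   42 CELL-B !
VARIABLE CELL-A   CELL-B CELL-A !        \ CELL-A holds the address of CELL-B
: @@ ( addr -- x )  @ @ ;

\ Execute n consecutive execution tokens stored at addr.
: RUN-THREAD ( addr n -- )
   CELLS OVER + SWAP ?DO  I @ EXECUTE  1 CELLS +LOOP ;

CREATE THING  ' CELL-A ,  ' @ ,  ' . ,   \ the "body": three threaded slots
THING 3 RUN-THREAD                       \ single fetch: prints CELL-B's address
' @@  THING 1 CELLS +  !                 \ patch slot 1 in place
THING 3 RUN-THREAD                       \ double fetch: prints 42
```

Every word already built on THING picks up the fix, because only the table
changed. That is the property being argued for; whether it is worth the speed
cost is the open question in this thread.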
> :> It's trivial to decompile and
> :> debug. If you need to drop down to assembler, you can do so, and
> :> you have a new primitive.
> :Decompiling is trivial, writing a decompiler is not (Bernd Paysan
> :probably doesn't agree :-)
> How can decompilation be made trivial when some of your source words may
> not even be there, and when your compiler may generate identical code
> for two different basic blocks? Instead of getting what you actually
> told your computer to do, you get the decompiler writer's best guess at
> it. That's not quite good enough.
Actually, I do not want to see the Forth source code that I entered, because
I know that already. I want to see what the CPU is executing. So a smart
decompiler, although very neat and according to Bernd possible, is of
absolutely no value to me. I want a disassembler.
> :I admit I don't see any value in
> :decompiling code I have the source code of (given the compiler is
> :correct, which is beyond discussion.)
> No it isn't, it's directly at the heart of the discussion. It's a lot
> easier to verify a compiler for ITC than for NCC.
Is that your problem or mine? Do you think iForth (or any native code
compiler) has more bugs than you-name-it-indirect?
> :Why would debugging
> :indirect-threaded code be easier? Why can't you use assembler in a
> :native code compiler (didn't I give a nice enough example?)
> Debugging ITC can be done in a completely sandboxed environment. You can
> even treat the word you're debugging as an array and manually step
> through it. And of course, it's a lot easier to patch it up once you've
> found an error (at which point you can verify that the patch works, and
> go and change the source code).
This is how it worked 15 years ago, when loading a screen from cassette
took one minute or more. It may still be advantageous on small controllers
but frankly I wouldn't want to debug a saw-mill with still living human
beings in it in this way :-)
Seriously, a bug in a word can have icky side-effects (like
overwriting part of your Forth). It is much better to fix the source and
restart and reload, unless the application is not critical and doing it
the correct way is very inconvenient. I almost never patch. The bugs I see
are almost always stack problems and logic errors, which are simply
found with FORGET <recompile> or <execute> when the problem is reasonably
factored. Because compilation is near instantaneous, I can insert .S before
and after a suspicious word anywhere in the source and re-run in the same
time as it would take to use a debugger / singlestepper interface.
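A minimal sketch of that .S technique, using made-up words (MEAN-BUGGY, MEAN
and PROBE are illustrative names, not taken from any real program). The exact
format of the .S display is system-dependent, so no output is shown for it.

```forth
: MEAN-BUGGY ( a b -- avg )  OVER + 2/ ;   \ bug: OVER leaves an extra copy of a
: MEAN       ( a b -- avg )  + 2/ ;        \ the corrected version

\ Bracket the suspicious word with .S to compare the stack before and after.
: PROBE ( a b -- )  .S MEAN-BUGGY .S 2DROP ;

10 20 PROBE   \ first .S shows two items; second .S also shows two, not one
```

Seeing two items after a word that should leave one points straight at the
stack error; fix the source, reload, and run PROBE again.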
> As far as I can tell, without a *lot* of extra complexity, debugging
> native code can only be done by letting the native chip execute the
> instructions - maybe only one at a time.
I'll admit that I have problems debugging recursive CALLBACKs that Windows
does to Forth words, but I can handle those using the MS-C or Linux tools
from the iForth server side *much* better than any current Forth debugger
could.
[..]
> if someone wants to learn the language, we
> *need*, as a community, to be able to point at a free version and go
> "Try that. You'll like it, and it's as well supported as you could wish
> for." It would also be nice if it were delightfully simple, given the
> convolution of F-PC. Hopefully Gforth (despite the ugly name) will be
> that standard.
I don't think Gforth is simple in the sense you seem to want in this
discussion. But it certainly fits the bill for learning and using
Forth on a unix system. (On WIN32 at least 0.3.0 was not fool-proof yet.)
[..]
> :Then we differ: I still see and have to use computers and software
> :that are at least 10 times too slow or too small for what I want to
> :do *now*.
> Do tell.
Electronic circuit simulation. One circuit I'm interested in now takes
about 20 minutes to complete a run on a 350 MHz PII, using SPICE.
Another one takes half a day to simulate 20 ms real-time on a fast
HP workstation, using Saber. To be really useful for synthesis and
optimization hundreds or even thousands of these runs should be made.
Did you ever use MS Word to scroll through a 10 page document containing
8 1024x768x10^16 bitmaps? Or format a 200 page report? Use Corel Draw
to convert a 10 page Postscript file to EMF?
Did you ever try to use that 360 degree Marslander NASA picture as a
desktop background?
I have a non-linear fitting / neural-net program to aid in developing
HID lamps. It was written for two transputers and ran for three days
before I stopped it. It could not be debugged because it was too slow :-)
What about editing streaming video (and audio) in high-level Forth?
Speech-recognition? Ray-tracing? Radiosity? (Haven't yet found the
time to try any of these seriously).
[..]
> :Is it possible to write a high-level indirect-threaded Forth *in
> :itself* (not using lots of assembly language or "C") with none of
> :the characteristics mentioned?
> Is it possible to write a high-level native Forth in itself with similar
> characteristics? How would you handle copying when you're not even sure
> how addresses are represented? Besides, many ITC Forths bundle up the
A native Forth generates machine code and doesn't need to explicitly copy,
that's what it does implicitly all of the time. I'm not sure I understand
your question, but iForth's metacompiler is written in iForth. Each time
iForth gets better, the metacompiler gets smarter and generates a still
smarter new iForth, solely from source code (not by copying). The key point
is that the only tool I need is iForth, and that the code always is a
string of subroutine calls combined with the usual machine codes.
> kernel into about 1k of tight code; make that relocatable and copy it
> into the target image, and you have it. You could even define the kernel
> as a word in itself.
I'm sure you realize that 1K as such doesn't tell the whole picture. What
can that 1K do? E.g. eForth is 5K and doesn't really do anything yet.
> :And doesn't a 68000 in some environments have these charming 64K
> :segments? (The second native-code Forth I wrote was for the 68000).
> And is a 68000 current today? Indeed, has the 64k limitation been
> current since the 68020? Admittedly, Quartus Forth is still 16-bit - but
> then as Quartus is NCC, perhaps that's not something to shout about. :>
> (Sorry, Neal - no slight intended; I'm sure it was the right choice for
> the platform.)
Segmented code is easy to relocate and move around, in fact that's probably
(one of the reasons) why it was invented.
[..]
> :Honestly, to "see it" was my only reason for continuing the thread.
> Then I don't think I can help. All I can do is tell you why I prefer it.
> I don't think there is a killer advantage in the same way as speed is
> the killer advantage of NCC. Perhaps this is the nature of evolution -
> if so, I remain your humble servant P. Terandon.
There is a killer advantage (see above) but it doesn't seem to be relevant
anymore.
> :Speed is not my ultimate goal, in that case I wouldn't be using
> :Forth (or would be using it differently).
> :I've decided the "tolerable" difference between Forth an C for me to
> :be a factor 2, because it is easily bridged by simple techniques (as
> :already shown).
> Fair enough. But for me, the difference doesn't even enter the equation,
> because of the inbuilt advantages of Forth. A better comparison, for me,
Preference or fact? This is the question we started with.
> would be between Forth and Visual Basic, or Forth and ELisp. C is just
> beyond help (and the pail).
These languages seem complementary, although VB is definitely Windows, so a
bit limited. I get along fine with Forth, assembler, C, FORTRAN and a tiny
bit of Tcl/Tk (which I'll gladly dump if a nicer platform-independent GUI
kit comes along).
> :I have no problem with that either. But in order to prevent this
> :conversation from being a fruitless exchange of useless information,
> :I hope you will fill in some of the details above a little bit more.
> :If you don't want to, that's OK too.
> I hope I've filled them in sufficiently for you to be able to understand
> my preference a little better.
I was not after preferences because they are beyond discussion. But
still I learned something, thank you.
-marcel
On 1999-07-23 era...@forth.com said:
:m...@euphrates.f9.co.uk wrote in message ...
:>... What puzzles me is
:>that now computers have speed to spare, NCC has suddenly caught
:>on; why is what was good enough for the Honeywell DDP-116 now not
:>good enough on a machine that is potentially thousands of times
:>faster?
:In our practice, the major change that's occurred in 15 years is
:_who_ is doing the compiling. In early Forths, the major
:characteristic was that the compiler, assembler, user/debugging
:support, source management, everything -- was running on a tiny, 8-
:or 16-bit machine along with its application. In that environment,
:there was a great need for keeping all this as simple and clean as
:possible.
:Today, in contrast, our practice at FORTH, Inc. is to use a
:Windows-based PC as a host for developing for all targets
:(including, of course, itself). This platform is awesomely powerful
:in comparison with the past. This changes the tradeoffs
:significantly: our goal is to make a system that's simple and easy
:for users, and produces target code optimized for the needs of the
:application. This means the host software can be as complex as
:necessary to achieve these goals. More error checking, more
:debugging aids (including a source-level single-stepper, see below),
:and optimizing compilers, for example.
Fair enough, if you're compiling to a target environment. Fair enough, I
dare say, if you're Forth, Inc. and actually have to sell development
environments to a Windows-dominated world, too. Those considerations
don't apply to me, though.
:If you're an amateur compiler writer "rolling your own" Forth,
:that's another good reason for keeping the compiler simple. It's a
:reasonable tradeoff to have a slower, possibly larger program in
:exchange for simpler development of the environment. This logic
:doesn't, however, apply to those of us making professional tools
:for professional users; if we're going to ask people to spend money
:on our software and commit significant projects to it, we have to
:do whatever's necessary at our end to give them a product that will
:bring a good return on their investment in terms of programming
:ease and target performance.
Indeed. (Although I don't think there's anything about programming ease
that necessitates a native-compiled system, as Microsoft have
demonstrated with Visual Basic. They do now offer a target compiler, but
reports indicate that it's not actually very much faster because of the
reliance on the VB libs.)
:>As far as I can tell, without a *lot* of extra complexity,
:>debugging native code can only be done by letting the native chip
:>execute the instructions - maybe only one at a time.
:It can also be done with a source-level single-step debugger such
:as we supply with SwiftForth and our SwiftX cross-compilers. The
:debugger does compile things a little differently, but it's
:intended to let you test your logic, and the logic doesn't change
:when it's optimized.
Actually, the main problem cited with optimising compilers is that they
*do* change the logic of the program - especially the more complicated
forms of optimisation, like strength reduction and common subexpression
elimination. However, most native Forths seem to be more micro-
optimised, in that they don't attempt this kind of global analysis.
Maybe you get away with it. :> But you're still not letting users debug
the exact code that they will be running in a target environment.
(Although I grant that it wouldn't be impossible to do so.)
--
the desk lisard communa time's taught the killing game herself
On 1999-07-22 jya...@erols.com said:
:Would you consider making it available generally? I imagine that
:one of the usual repositories would be glad to host it.
Sure; I'll mail Skip a copy so he can upload it to taygeta. (He can
worry about the thornier aspects of copyright and things. :> )
On 1999-07-23 ec...@cc.newcastle.edu.au(BruceR.McFarling) said:
:It struck me that if @R+ and !R+ is the best way to do that,
:then the best way to do @A+ and !A+ is just @+ and !+. That is, if
:a latch is used to access memory for one stack, why not for the
:other? If a latch is superfluous for accessing memory for one stack,
:why not for the other? The provision of a latch for each stack is
:at least consistent:
:A@ A!
:@A @A+
:!A !A+
:PUSH-B PULL-B
:@B+ !B+
:PUSH PULL
I did consider that for a while, but that's probably too many operators.
It's probably sufficient to have B as write-only, given that a readable
A is necessary and sufficient if you want to test against a terminating
address. So:
S >S >D
@S @S+ !S
@D !D+ !D
: MOVE ( s' s d ) >D >S BEGIN @S+ !D+ DUP S -OR WHILE REPEAT DROP ;
: BLOCK-AND >D >S BEGIN @S+ @D AND !D+ DUP S -OR WHILE REPEAT DROP ;
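To experiment with the first of these on an ordinary Forth, the S and D
latches can be emulated with two variables. This is a sketch under
assumptions: it takes -OR to mean exclusive-or (as in Machine Forth), assumes
@S+ and !D+ step by one cell, and renames the word COPY-CELLS to avoid
clashing with the standard MOVE. SREG, DREG, SRC and DST are hypothetical
names.

```forth
VARIABLE SREG   VARIABLE DREG            \ stand-ins for the S and D latches
: >S  ( addr -- )  SREG ! ;
: >D  ( addr -- )  DREG ! ;
: S   ( -- addr )  SREG @ ;
: @S+ ( -- x )     SREG @ @  1 CELLS SREG +! ;   \ fetch via S, then advance S
: !D+ ( x -- )     DREG @ !  1 CELLS DREG +! ;   \ store via D, then advance D

\ Copy cells from s up to (not including) s-end into d. Copies at least one.
: COPY-CELLS ( s-end s d -- )
   >D >S  BEGIN  @S+ !D+  DUP S XOR  WHILE REPEAT  DROP ;

CREATE SRC  11 , 22 , 33 ,
CREATE DST  3 CELLS ALLOT
SRC 3 CELLS +  SRC  DST  COPY-CELLS
DST @ .  DST CELL+ @ .  DST 2 CELLS + @ .        \ prints 11 22 33
```

The loop tests after the transfer, so it always moves at least one cell, and
it only terminates if S lands exactly on s-end.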
You made 12 screens out of my 9 (guess). Actually, the tables are 10
screens, and one screen defines the magic constants to describe macros
(indexes for the tables).
Compared with the size of a Forth system (primitives, system-independent
parts of the compiler), this is small. I'm willing to increase the size
of my source by a few percent to improve speed by a factor of four. So far
I haven't been willing to increase the complexity further, because it
would give me at best another twofold improvement. Since I haven't
really tried hard to get away from the macro-copying to a
register-allocating code generation, I really don't know how much it
will cost. It might be even cheaper, since it can side-step around the
assembler part.
I also want to add that Chuck Moore, the master of simplification,
himself now teaches "machine Forth", which is a sort of simplified
native code Forth.
<shameless plug>
If you want to use a 'non-professional' (read: free) version of a native
code Forth system *with* source code, instead of buying a professional
one with source (expensive), you'd better look at bigFORTH (download from my
homepage). It also allows you to decompile most of the code back to
Forth. Step-trace is only available for Linux now, since I didn't get
around to implementing the debug trap hooks in Windows. Actually, I don't
debug my code with step trace, so this is a really low-priority issue
for me.
</shameless plug>
A native code Forth doesn't create gaps between compiler writer, app
writer and app user. All the tricks of standard threaded code Forth are
available, including, but not limited to return stack tricks. Actually,
you can create macros in yet another way, by creating words with :,
POSTPONE and MACRO (or INLINE, there doesn't seem to be a common
syntax), and when you are lucky, something like
: postpone Literal postpone + postpone @ postpone ; macro
creates macros in the form of
mov AX,xxx(AX)
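MACRO and INLINE are system-specific. On a standard system the nearest
portable approximation is an IMMEDIATE word built from POSTPONE, sketched
below. OFFSET@, REC and THIRD are hypothetical names, and whether the result
collapses into a single machine instruction depends entirely on the compiler.

```forth
\ At compile time OFFSET@ takes an offset from the stack and compiles
\ "offset + @" into the word being defined.
: OFFSET@ ( compile: offset -- ) ( run: addr -- x )
   POSTPONE LITERAL  POSTPONE +  POSTPONE @ ; IMMEDIATE

CREATE REC  10 , 20 , 30 ,
: THIRD ( -- x )  REC [ 2 CELLS ] OFFSET@ ;   \ same as  REC 2 CELLS + @

THIRD .   \ prints 30
```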
...and even this isn't necessarily the case: on a machine where addresses
are the same size as CALL instructions you can get a subroutine-threaded
Forth in the same size as an indirect-threaded one (e.g. on the ARM); the
disadvantage is only the lack of addressing range for code, but 26 bits
(again, on the ARM) is hardly a restriction.
>> trivial; even disposing of jumps backwards allows this.) Not only this,
>> but code can alter itself with an innate understanding of what it is
>> doing; eliminate the one-to-one correspondence between source and object
>> and you lose this, having to worry about variable-length instruction
>> blocks and bits of code that have simply disappeared inside the
>> optimiser.
This is really odd. No-one worries about how the code is represented in most
reflexive languages. And C programmers rarely worry about how the code is
laid out when trying to debug them (I hardly ever use a debugger to do more
than find out exactly where my program crashed; debugging the binary rather
than the source is a dangerous, if seductive, path to follow, as it results
in fixing symptoms rather than curing causes).
>I do not want to see the Forth source code that I entered, because
>I know that already.
Exactly. I never complain that gdb simply displays my C source instead of
reconstructing it from the binary. Having a decompiler is a useful check on
the correctness of the compiler, but I'd much rather see my source rather
than something which can only be an approximation, even in an indirect
threaded Forth (waaah! where's my layout gone?).
>> No it isn't, it's directly at the heart of the discussion. It's a lot
>> easier to verify a compiler for ITC than for NCC.
Depends entirely on the complexity of the compiler. If you compile to
Machine Forth, which is a form of NCC, your compiler need be no more complex
than an ITC compiler. In any case, it's still (AFAIK) a hard problem to
verify pretty much anything. And there's no point verifying a compiler for a
language that doesn't have a formal definition; the only such language I
know of is Standard ML.
>Is that your problem or mine? Do you think iForth (or any native code
>compiler) has more bugs than you-name-it-indirect?
It probably does just because of the extra complexity. For example, I always
have to keep in mind when programming in C that bugs *may* be the compiler's
fault, and in some cases they have been (GCC for ARM; I've not yet found a
bug in GCC for Intel). OTOH, I'd be very surprised if my naive
subroutine-threaded Forth compiler ever produced incorrect code (barring a
bug in the primitives). I've never caught any of the naive compilers I've
written or used doing so, and none is production-quality...
--
http://sc3d.org/rrt/ | certain, a. insufficiently analysed
> >Is that your problem or mine? Do you think iForth (or any native
> >code compiler) has more bugs than you-name-it-indirect?
> It probably does just because of the extra complexity.
What I meant is that the number of bugs found goes down exponentially with
debugging time. So my "problem" is that I can't release a native Forth as
quickly as a simpler implementation (which reasonably speaking may have fewer
initial bugs). We tested tForth 2 to 3 years before it was stable (2
testers, almost daily use). iForth used a lot of tForth's concepts, so it
took only two years before it became useful. BTW, surprisingly few bugs in
i/tForth had to do with code generation. If anybody is interested I can mail
the buglog file (89 K).
-marcel
If your layout follows systematic rules those rules can be incorporated into
the decompiler. Then you get not just your own code laid out to your own
standards but other people's code as well. The only decompilation problem I
have come across that cannot be solved without some additional help from the
compiler is identifying the exact original position of the word CASE in the
source code.
But what's the point when you can just display the source (unless it's not
available)? It seems much simpler to do that, which is surely the Forth way
anyway (like LOCATE).
On 1999-07-24 r...@sc3d.org said:
:....and even this isn't necessarily the case: on a machine where
:addresses are the same size as CALL instructions you can get a
:subroutine-threaded Forth in the same size as an indirect-threaded
:one (e.g. on the ARM); the disadvantage is only the lack of
:addressing range for code, but 26 bits (again, on the ARM) is
:hardly a restriction.
But then you lose on the inline expansion. I downloaded the Machine
Forth you produced for the ARM. Lacking an ARM I was unable to give it a
test, but it looked as though most of the primitives would code into two
instructions, rather than one.
:This is really odd. No-one worries about how the code is
:represented in most reflexive languages. And C programmers rarely
Well, *I* worry about it, OK? I'm giving my reasons for a preference;
even if those reasons seem odd to you, that doesn't make my preference
void.
:worry about how the code is laid out when trying to debug them (I
:hardly ever use a debugger to do more than find out exactly where
:my program crashed; debugging the binary rather than the source is
:a dangerous, if seductive, path to follow, as it results in fixing
:symptoms rather than curing causes).
Well, in ITC there's no difference, until you get down to primitive
level.
:Exactly. I never complain that gdb simply displays my C source
:instead of reconstructing it from the binary. Having a decompiler
:is a useful check on the correctness of the compiler, but I'd much
:rather see my source rather than something which can only be an
:approximation, even in an indirect threaded Forth (waaah! where's
:my layout gone?).
That can, of course, be coded in, as can comments. In any case, Lisp
will obligingly prettyprint definitions, too.
Suffice to say that I want my Forth to look exactly like a Lisp
interpreter, in terms of what I can do with it. In fact, having defended
ITC up until now, I think what would be even more fun is list threading,
which adds an extra level of indirection. EG. for the x86:
ITC - mov edi, (4*esi)      LTC - mov esi, (esi+4)
      inc esi                     mov edi, (8*esi)
      jmp (edi)                   jmp (edi)
Now that wouldn't be at all compact, and nor would it stand a chance of
keeping up, but I really do think it would be fun... ;> And hell, if
Lisp can get away with it, I don't see why Forth can't.
You seem to be talking about introducing Forth to the people bringing
up new systems from scratch. I can easily see how introducing Forth
to *that* community wouldn't be all that hard. Those people demand
quick and scriptable means to manipulate the processor and I/O so they
can verify and validate the hardware design. I frankly don't see
introducing Forth at that level to be terribly difficult or
controversial.
The issue is what happens next.
Where I currently work (and where I have worked in the past), the test
engineers who are verifying and validating the hardware design
generally have little to do with the final application code that runs
on the system. They might have some input-- they might for example
give the people writing device drivers timing constraints when
fiddling with a peripheral, or they might suggest to the
application-level software people notes on how best to utilize
resources. But that's generally pretty much it.
Maybe a wave of Forth-enthusiasm will sweep over the company and
everyone will toss their C and ADA compilers out the window. More
likely, the software engineers who are working on the project plan to
reuse past code, and will see Forth as an impediment to code reuse.
Or, they will marginalize Forth as just a sophisticated "monitor" and
turn to a "real" language for the application-level code.
That's why in my attempts to introduce Forth, I have decided not to go
from the bottom-up as you have, but from the top-down. Introducing
Forth at this level has the benefit of increasing visibility of the
language, and showing it being used for more than just interactive
debugging with simple scripts.
Decompilation is a good way of exploring a system. Even where the kernel
source is available it has to be understood in the context of a metacompiler
or somesuch, whereas decompilation emphasizes the seamlessness of system and
application.
IMO threaded implementations provide some useful rungs on the ladder of
understanding Forth and those who have already climbed the ladder should not
be too quick to kick it away behind them. On the other hand I think an
obsession with execution speed is probably misplaced except, perhaps, for
niche markets. One need look no further than the success of Java to see that
speed is not the key to widespread acceptance.
[rrrrrzzzzzzzzz]
>niche markets. One need look no further than the success of Java to see that
>speed is not the key to widespread acceptance.
Odd thing about Java is that everyone's been talking about it for years, and
it's the main conventional programming language taught in my university, but
I've yet to see it in anything I use (obviously). I've only once
successfully run a non-trivial applet in Netscape (4.51), not that I try:
Netscape crawls whenever Java starts and frequently crashes when Java is
switched on, so I've turned it off. I've also had little success playing with
the JDK. Compared with other widespread interpreted languages, it's a loser
for me (Python, Perl), though I still think the ideas are good (in
particular the way in which the designers resisted the temptation to include
any new technologies in it, but simply recombined tried-and-tested stuff).
People I know who use it a lot complain most about its speed and resource
hunger (even on large development machines). This is odd considering it was
invented for embedded devices. I don't understand why JIT has taken so long
to arrive in mass-market form (I understand it's in IE 5, and presumably
Netscape 5, but why so long?) given the effort expended on Java.
For every corporation I read of rolling out Java systems, four seem to be
playing "wait-and-see".
In short, it was really exciting when it was announced, but I'm puzzled both
by how slowly it seems to be delivering on that promise, and the fact that
it's still hot (excuse the pun).
One of my clients decided to port their application (part Forth and part
C) to Java. They spent about six months working on it before giving up.
--
Not true--I believe Win32Forth was written in C++ and it certainly
has a working assembler (with some minor bugs and lacunae). I just
wrote an all-assembler word for computing Bessel functions quickly.
It wasn't hard. And it was fast.
[ rest deleted ]
--
Julian V. Noble
j...@virginia.edu
"Elegance is for tailors!" -- Ludwig Boltzmann
Also a Forth that will fit in a very small cpu/ram. As I have noted
many times previously and can document, most code optimization is
a waste of time, since it is seldom used. The 90/10 rule (ok, maybe
80/20) is really true for most programs. That means if you identify
the bottlenecks and hand-code them (after the whole program is done
in hi-level) in assembler, first, it takes little extra programmer
effort; and second, the resulting program will run as fast as a
fully-optimized compiled C or FORTRAN program. Marcel's matrix
example (like the matrix example in my book!) is a case in point.
> Not true--I believe Win32Forth was written in C++ and it certainly has a
> working assembler (with some minor bugs and lacunae).
The important parts of Win32Forth are definitely not written in C++ but in
Forth assembler: view +
CODE + ( n1 n2 -- n3 ) \ add n1 to n2, return sum n3
pop eax
add ebx, eax
next c;
The unimportant parts of Win32Forth (the GUI) are written in C++ :-)
-marcel
PS: A few assemblers for Gforth were announced, so I was wrong anyway.