Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Review of SwiftX and VFX Forth

422 views
Skip to first unread message

Nihar

unread,
Jun 21, 2004, 11:16:05 AM6/21/04
to
Hi,

A couple of days back I had written to the Forum about switching from
LMI Forth to a newer well supported Forth for 68hc12 processor.

All the replies I got were immensely helpful. Finally it seems that
there are two companies which are in contention. SwiftX from Forth
Inc. and VFX Forth 6 from MPE. I tried searching on the net about any
comparitive reviews about these two Forth's, but I was unable to find
one.

I would be really greatful if you can point me to a comparison of the
two compilers or share you experience of using them.

Thanks in advance.
Regards
Nihar Desai

Krishna Myneni

unread,
Jun 21, 2004, 9:41:18 PM6/21/04
to

Some benchmark comparison testing is shown in my presentation,
"Introduction to Forth...", at the following link:

http://ccreweb.org/documents/programming/programming.html

You may download the presentation in one of several formats
or view the corresponding html files. Comparsion benchmarks
for several Forths are at the link

http://ccreweb.org/documents/programming/html/Forth-Presentation/text29.html

(Note: The benchmark for kForth has improved considerably, 20 to 50%,
depending on the application, for the latest release).

Krishna Myneni

rickman

unread,
Jun 22, 2004, 1:46:52 AM6/22/04
to

Your benchmark info is very useful. I might suggest that you provide
not only the charts, but also the data printed. This allows for
comparisons that are not clear on a chart with such a wide range of
data. The bars for VFX and SwiftForth are both rather short and hard to
compare.


--

Rick "rickman" Collins

rick.c...@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX

Alex McDonald

unread,
Jun 22, 2004, 4:50:07 AM6/22/04
to
"rickman" <spamgo...@yahoo.com> wrote in message
news:40D7C7CC...@yahoo.com...
> Krishna Myneni wrote:
> >

===snipped

> >
> > Some benchmark comparison testing is shown in my presentation,
> > "Introduction to Forth...", at the following link:
> >
> > http://ccreweb.org/documents/programming/programming.html
> >
> > You may download the presentation in one of several formats
> > or view the corresponding html files. Comparsion benchmarks
> > for several Forths are at the link
> >
> >
http://ccreweb.org/documents/programming/html/Forth-Presentation/text29.html
> >
> > (Note: The benchmark for kForth has improved considerably, 20 to 50%,
> > depending on the application, for the latest release).
>
> Your benchmark info is very useful. I might suggest that you provide
> not only the charts, but also the data printed. This allows for
> comparisons that are not clear on a chart with such a wide range of
> data. The bars for VFX and SwiftForth are both rather short and hard to
> compare.
>

I'd be interested in the benchmark data too; it appears that Win32Forth
didn't or couldn't run the matrix-mult benchmark, and I'd like to know what
version of Win32Forth was used for the test?

--
Regards
Alex McDonald


Krishna Myneni

unread,
Jun 22, 2004, 10:42:26 PM6/22/04
to
rickman wrote:

> ...

> Your benchmark info is very useful. I might suggest that you provide
> not only the charts, but also the data printed. This allows for
> comparisons that are not clear on a chart with such a wide range of
> data. The bars for VFX and SwiftForth are both rather short and hard to
> compare.
>
>

Tests run on 7 Jan. 2004

Bubble-Sort Matrix-Mult Sieve Fib
------------------------------------------------------
gcc 0.55 0.16 0.34 1.00
VFX Forth 1.01 0.42 0.81 0.64
iForth 1.18 0.54 0.63 1.17
SwiftForth 1.40 0.60 1.08 0.60
gforth-fast 3.7 1.7 2.7 3.6
gforth 4.1 2.1 3.0 4.1
Win32Forth 4.89 --- 3.73 5.22
kforth-fast 8.09 8.57 5.68 12.46
PFE 16.8 12.4 15.2 19.6
kforth 19.54 16.54 16.57 24.98
-----------------------------------------------------

Notes:
------
1) gforth, gforth-fast, PFE, and kforth-fast were timed under Linux
on a 330 MHz Pentium II. The others were timed under Windows 98,
running on the same machine (dual-boot system). The relative speed
of iForth to VFX Forth was communicated to me by Marcel Hendrix
and I used the supplied values to estimate the iForth
execution times on my platform.

2) Versions: gcc 3.3, VFX Forth 3.60, iForth 2.?, SwiftForth 2.2.2.9,
gforth-fast 0.6.0, gforth 0.6.0, Win32Forth 6.01,
kforth-fast (development version based on 1.0.14-2), PFE 0.32.94,
kforth 1.0.14-2

3) Win32Forth ran out of application memory when attempting to run
the matrix-mult benchmark.

KM


Krishna Myneni

unread,
Jun 22, 2004, 10:44:42 PM6/22/04
to
Alex McDonald wrote:
> "rickman" <spamgo...@yahoo.com> wrote in message
> news:40D7C7CC...@yahoo.com...
>
>>Krishna Myneni wrote:
>>
>
> ===snipped
>
>
> ...
>
> http://ccreweb.org/documents/programming/html/Forth-Presentation/text29.html
>

>
> I'd be interested in the benchmark data too; it appears that Win32Forth
> didn't or couldn't run the matrix-mult benchmark, and I'd like to know what
> version of Win32Forth was used for the test?
>

See my response to "rickman". Yes Win32Forth 6.01 failed to run the
matrix-mult benchmark, apparently because of dictionary overflow.
I didn't investigate to see if Win32Forth could be reconfigured to
have a larger dictionary space.

Cheers,
Krishna

George Hubert

unread,
Jun 23, 2004, 8:31:43 AM6/23/04
to

"Krishna Myneni" <krishn...@compuserve.com> wrote in message
news:cbajcb$ih0$2...@ngspool-d02.news.aol.com...

> Alex McDonald wrote:
> > "rickman" <spamgo...@yahoo.com> wrote in message
> > news:40D7C7CC...@yahoo.com...
> >
> >>Krishna Myneni wrote:
> >>
> >
> > ===snipped
> >
>http://ccreweb.org/documents/programming/html/Forth-Presentation/text29.htm

l
> >
>
> >
> > I'd be interested in the benchmark data too; it appears that Win32Forth
> > didn't or couldn't run the matrix-mult benchmark, and I'd like to know
what
> > version of Win32Forth was used for the test?
> >
>
> See my response to "rickman". Yes Win32Forth 6.01 failed to run the
> matrix-mult benchmark, apparently because of dictionary overflow.
> I didn't investigate to see if Win32Forth could be reconfigured to
> have a larger dictionary space.
>
> Cheers,
> Krishna

The dictionary can be increased by running the meta-compiler. 6.01 is quite
an old version of W32F ( we're up to V6.08 on production versions and
V6.09.07 on beta ). The V6.09 versions feature a setup program which makes
it easy to re-meta-compile with a larger dictionary ( just follow the
instructions ) and use a fixed loadpoint which results in about 25-30 %
speed reduction ( probably a bit more with some other optimizations ).

George Hubert


Marcel Hendrix

unread,
Jun 25, 2004, 10:31:04 AM6/25/04
to
Krishna Myneni <krishn...@compuserve.com> writes Re: Review of SwiftX and VFX Forth

| Bubble-Sort Matrix-Mult Sieve Fib
| ------------------------------------------------------
| gcc 0.55 0.16 0.34 1.00
| VFX Forth 1.01 0.42 0.81 0.64
| iForth 1.18 0.54 0.63 1.17

[..]

I protest. The algorithm used in the C code for gcc is (probably)
different from the Forth one. Then the comparison between gcc
and the rest is misleading.

-marcel

Krishna Myneni

unread,
Jun 25, 2004, 9:03:45 PM6/25/04
to

The C programs compiled by gcc were apparently
generated *from* the Forth code by using a program
called forth2c. If anything, the gcc results
given above are likely worse in execution time
than if the Forth algorithms were coded from
scratch in C. As an example, below is the C version
of the sieve benchmark. Compare it to the Forth
version, and it's hard to argue that this is somehow
optimized C code.


/* Generated by forth2c, (C) Martin Maierhofer 1994 */

/* size of data space, can be changed ! */
int _C_DATA_SIZE = 8192;

#include "forth.h" /* general forth definitions */
#include "forth2c.h" /* forth2c specific definitions */


Char flags[8212];


Char eflag[4];


Cell primes(void)
{
Cell _c_result;
Cell x0;
Cell x1;
Cell x2;
Cell x3;
Cell x4;
Cell r0;
Cell r1;
Cell r2;
Cell r3;

{
x0 = (Cell) (&flags[0]);
}

{ /* Literal */
x1 = 8190;
}

{ /* Literal */
x2 = 1;
}

{ /* fill */
UCell u;
Char *addr, c;

c = (Char) x2;
u = (UCell) x1;
addr = (Char *) x0;
memset(addr, c, u);
}

{ /* Literal */
x0 = 0;
}

{ /* Literal */
x1 = 3;
}

{
x2 = (Cell) (&eflag[0]);
}

{ /* fetch */
Cell *a;

a = (Cell *) x2;
x2 = *a;

}

{
x3 = (Cell) (&flags[0]);
}

{ /* do */
Cell bound, index;

index = x3;
bound = x2;

r0 = bound;
r1 = index;

}

label0:


{ /* i */
Cell n = r1;
x2 = n;

}

{ /* cfetch */
Char *a;

a = (Char *) x2;
x2 = (Cell) *a;

}

if (!x2) goto label1;

{ /* dup */
Cell n;

n = x1;

x1 = n;
x2 = n;

}

{ /* i */
Cell n = r1;
x3 = n;

}

{ /* plus */
Cell n1, n2, n;
n1 = x3;
n2 = x2;

n = n2 + n1;
x2 = n;

}

{ /* dup */
Cell n;

n = x2;

x2 = n;
x3 = n;

}

{
x4 = (Cell) (&eflag[0]);
}

{ /* fetch */
Cell *a;

a = (Cell *) x4;
x4 = *a;

}

{ /* less-than */
Cell n1, n2, n;
n1 = x4;
n2 = x3;

n = FLAG(n2 < n1);
x3 = n;

}

if (!x3) goto label2;

{
x3 = (Cell) (&eflag[0]);
}

{ /* fetch */
Cell *a;

a = (Cell *) x3;
x3 = *a;

}

{ /* swap */
Cell n1, n2;

n1 = x3;
n2 = x2;

x2 = n1;
x3 = n2;

}

{ /* do */
Cell bound, index;

index = x3;
bound = x2;

r2 = bound;
r3 = index;

}

label3:


{ /* Literal */
x2 = 0;
}

{ /* i */
Cell n = r3;
x3 = n;

}

{ /* cstore */
Char *a;
Cell c;

a = (Char *) x3;
c = (Char) x2;

*a = c;
}

{ /* dup */
Cell n;

n = x1;

x1 = n;
x2 = n;

}

{ /* +loop */
Cell step, index, bound;

index = r3;
bound = r2;

step = x2;
index += step;
if ((step > 0 && index < bound) ||
(step <= 0 && index >= bound))
{
r2 = bound;
r3 = index;
goto label3;

}

}

{ /* unloop */

}

goto label4;

label2:


{ /* drop */

}

label4:


{ /* swap */
Cell n1, n2;

n1 = x1;
n2 = x0;

x0 = n1;
x1 = n2;

}

{ /* one_plus */
Cell n;

n = x1;
x1 = n + 1;

}

{ /* swap */
Cell n1, n2;

n1 = x1;
n2 = x0;

x0 = n1;
x1 = n2;

}

label1:


{ /* Literal */
x2 = 2;
}

{ /* plus */
Cell n1, n2, n;
n1 = x2;
n2 = x1;

n = n2 + n1;
x1 = n;

}

{ /* loop */
Cell index, bound;

index = r1;
bound = r0;

index++;
if (index != bound)
{
r0 = bound;
r1 = index;
goto label0;

}

}

{ /* unloop */

}

{ /* drop */

}

{ /* exit */
_c_result = x0;
return (_c_result);

}

}

Cell benchmark(void)
{
Cell _c_result;
Cell x0;
Cell x1;
Cell x2;
Cell r0;
Cell r1;

{ /* Literal */
x0 = 0;
}

{ /* Literal */
x1 = 1000;
}

{ /* Literal */
x2 = 0;
}

{ /* do */
Cell bound, index;

index = x2;
bound = x1;

r0 = bound;
r1 = index;

}

label0:


{
Cell _C_locret;

_C_locret = primes();
x1 = _C_locret;

}

{ /* nip */
Cell n1, n2;

n2 = x1;
n1 = x0;
x0 = n2;

}

{ /* loop */
Cell index, bound;

index = r1;
bound = r0;

index++;
if (index != bound)
{
r0 = bound;
r1 = index;
goto label0;

}

}

{ /* unloop */

}

{ /* exit */
_c_result = x0;
return (_c_result);

}

}

void main(void)
{
Cell x0;
Cell x1;

{
x0 = (Cell) (&flags[0]);
}

{ /* Literal */
x1 = 8190;
}

{ /* plus */
Cell n1, n2, n;
n1 = x1;
n2 = x0;

n = n2 + n1;
x0 = n;

}

{
x1 = (Cell) (&eflag[0]);
}

{ /* store */
Cell *a, n;

a = (Cell *) x1;
n = x0;

*a = n;
}

{
Cell _C_locret;

_C_locret = benchmark();
x0 = _C_locret;

}

{ /* dot */
printf("%d ", x0);
fflush(stdout);
}

{ /* exit */
return;

}

}


-----------------

Forth version:

\ #! /usr/stud/paysan/bin/forth

DECIMAL
\ : SECS TIME&DATE SWAP 60 * + SWAP 3600 * + NIP NIP NIP ;
CREATE FLAGS 8190 ALLOT
variable eflag
\ FLAGS 8190 + CONSTANT EFLAG

: PRIMES ( -- n ) FLAGS 8190 1 FILL 0 3 EFLAG @ FLAGS
DO I C@
IF DUP I + DUP EFLAG @ <
IF EFLAG @ SWAP
DO 0 I C! DUP +LOOP
ELSE DROP THEN SWAP 1+ SWAP
THEN 2 +
LOOP DROP ;

: BENCHMARK 0 1000 0 DO PRIMES NIP LOOP ;
\ SECS BENCHMARK . SECS SWAP - CR . .( secs)
: main
flags 8190 + eflag !
benchmark .
;

\ HPPA/720, 50 MHz: user 3.90s

Marcel Hendrix

unread,
Jun 25, 2004, 9:06:58 PM6/25/04
to
Krishna Myneni <krishn...@compuserve.com> writes Re: Review of SwiftX and VFX Forth

> Marcel Hendrix wrote:
>> Krishna Myneni <krishn...@compuserve.com> writes Re: Review of SwiftX and VFX Forth

[..]
>> I protest. The algorithm used in the C code for gcc is (probably)
>> different from the Forth one. Then the comparison between gcc
>> and the rest is misleading.

[..]

> The C programs compiled by gcc were apparently
> generated *from* the Forth code by using a program
> called forth2c. If anything, the gcc results
> given above are likely worse in execution time
> than if the Forth algorithms were coded from
> scratch in C. As an example, below is the C version
> of the sieve benchmark. Compare it to the Forth
> version, and it's hard to argue that this is somehow
> optimized C code.

Maybe Anton Ertl will care to make an authorative
comment here.

IMHO: check the dragon books. Using that enormous number
of variables and no pointers is exactly what a classical
C optimizer likes.

It's amazing that gcc can be *twice* faster than iForth
and VFX on the Sieve. I have looked very intensively into
that particular code and I couldn't improve iForth's output,
not even using in-line (or even pure) assembler. My experiments
with VC++ and the original(*) C Sieve code also didn't turn up
such large differences.

*) the algorithm used above is not the original but a variant
due to Bernd Paysan.

-marcel

Anton Ertl

unread,
Jun 26, 2004, 5:24:02 AM6/26/04
to
m...@iae.nl (Marcel Hendrix) writes:
>Krishna Myneni <krishn...@compuserve.com> writes Re: Review of SwiftX and VFX Forth
>
>> Marcel Hendrix wrote:
>>> Krishna Myneni <krishn...@compuserve.com> writes Re: Review of SwiftX and VFX Forth
>[..]
>>> I protest. The algorithm used in the C code for gcc is (probably)
>>> different from the Forth one. Then the comparison between gcc
>>> and the rest is misleading.
>[..]
>
>> The C programs compiled by gcc were apparently
>> generated *from* the Forth code by using a program
>> called forth2c. If anything, the gcc results
>> given above are likely worse in execution time
>> than if the Forth algorithms were coded from
>> scratch in C. As an example, below is the C version
>> of the sieve benchmark. Compare it to the Forth
>> version, and it's hard to argue that this is somehow
>> optimized C code.
>
>Maybe Anton Ertl will care to make an authorative
>comment here.

It is the same algorithm, because the "C" program was generated
automatically by translating Forth source.

There are also manually-coded C programs for (mostly) the same
algorithms, and they resulted (on the 486) in very similar timings to
the translated code. Read all about the translator and the timings on

http://www.complang.tuwien.ac.at/papers/ertl%26maierhofer95.ps.gz

You can get the benchmarks (Forth code, forth2c-generated C code, and
manually written C code) at

http://www.complang.tuwien.ac.at/forth/bench.zip

and do your own timings and check the generated native code.

Somewhat updated results (from 1999, still on the 486) can be found at
http://www.complang.tuwien.ac.at/forth/performance.html

>IMHO: check the dragon books. Using that enormous number
>of variables and no pointers is exactly what a classical
>C optimizer likes.

At least it does not hurt the run-time of code produced by "gcc -O",
and it made the translator simpler (the idea was to generate it mostly
automatically from the Gforth primitive descriptions). It does not
make the code faster than using fewer variables and eliminating some
of the variable-to-variable copies. Gcc without optimization and
Borland C suck on the forth2c-generated code, however.

Representing the Forth stack as local C variables (i.e., hopefully
machine registers eventually) certainly results in faster code than
representing it as an array with a stack pointer.


>It's amazing that gcc can be *twice* faster than iForth
>and VFX on the Sieve. I have looked very intensively into
>that particular code and I couldn't improve iForth's output,
>not even using in-line (or even pure) assembler. My experiments
>with VC++ and the original(*) C Sieve code also didn't turn up
>such large differences.

Krishna Myeni produced the iforth numbers for his machine by scaling
some results you produced. This scaling can introduce an error
(especially since your results did not come from a P6-based CPU),
which could be an explanation for unexpected results. So better do
your own timings.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html

Doug Hoffman

unread,
Jun 26, 2004, 11:20:49 AM6/26/04
to

> CREATE FLAGS 8190 ALLOT
> variable eflag
>

> : PRIMES ( -- n ) FLAGS 8190 1 FILL 0 3 EFLAG @ FLAGS
> DO I C@
> IF DUP I + DUP EFLAG @ <
> IF EFLAG @ SWAP
> DO 0 I C! DUP +LOOP
> ELSE DROP THEN SWAP 1+ SWAP
> THEN 2 +
> LOOP DROP ;

> flags 8190 + eflag !

This is pretty ugly code. Not readily apparent without study:

1) The 0 3 on the first line must be maintained inside both DO loops and to
the finish. The 0 is the count (# of primes found), the 3 is just re-used
again and again.

2) The count is incremented after the ELSE using "swap 1+ swap". The two
swaps are needed because the "3" is in the way. Stack juggling.

3) All prime numbers between 3 and 16,381 are found. Since all even numbers
greater than 2 are non-prime, an array that is half the size of this range
can be used as the "sieve". The indices for this array are not the prime
numbers themselves, but are indexes to the primes ( prime=(2*index)+3 ).

4) The basic flow of the sieve is to step through an array of 8190 flags,
one at a time. Only if the flag is set, then indicate that a prime has been
found by incrementing the counter and then use an inner loop to mark each
multiple of that prime as being non-prime by clearing the corresponding
flags in the array. The "0 i c!" after the second DO clears the flags.

5) The DUPs peppered about are needed to maintain the count and the "3" on
the stack. And the final DROP is needed to get rid of the "3" so the count
is left on the stack at the end.

6) Quick, what is the 2 being added to after the second THEN and why?

Actually, this code is better than that first presented by Jim Gibreath in
the Sept. 1981 Byte article. Gilbreath claimed to need help in codeing it
in Forth and he still got it wrong. He corrected it in the Jan 1983 Byte
follow-up article, but it was still very ugly. Gilbreath commented that he
"didn't have the kind of brain" that could write Forth code. I haven't the
kind of brain that likes this example. With stuff like that Byte article
it's no wonder Forth has gotten a bum rap over the years. IMHO

Personally, I would code it with local variables even if it meant a (likely
small) speed hit. There is more to programming than drag speed contests.

Regards,

-Doug

Stephen Pelc

unread,
Jun 26, 2004, 2:57:43 AM6/26/04
to comp.lang.forth
On Sat, 26 Jun 2004 01:06:58 GMT, m...@iae.nl (Marcel Hendrix) wrote:

>It's amazing that gcc can be *twice* faster than iForth
>and VFX on the Sieve. I have looked very intensively into
>that particular code and I couldn't improve iForth's output,
>not even using in-line (or even pure) assembler. My experiments
>with VC++ and the original(*) C Sieve code also didn't turn up
>such large differences.

If the tests are being run on a P3 or P4 try enabling the
P4 optimisations with the +IDATA switch in VFX Forth. This
keeps data well away from code. Without this switch I have
seen a 7:1 variation in a benchmark that uses a single
variable. Most C compilers keep data well away from code.

Stephen
--
Stephen Pelc, steph...@INVALID.mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeltd.demon.co.uk - free VFX Forth downloads

Doug Hoffman

unread,
Jun 26, 2004, 7:50:58 PM6/26/04
to
Earlier I wrote:

> Personally, I would code it with local variables even if it meant a (likely
> small) speed hit. There is more to programming than drag speed contests.

Actually, it is faster with local variables, using Mops on a G3 iMac 500MHz
(YMMV):

8190 constant size
variable flags size allot

: sieve { \ prime k index cnt -- cnt }
\ prime k index cnt all initialized to 0
flags size 1 fill
BEGIN
size index >
WHILE
index flags + c@
IF
index dup + 3 + -> prime
1 ++> cnt
index prime + -> k
BEGIN
size k >
WHILE
0 k flags + c!
prime ++> k
REPEAT
THEN
1 ++> index
REPEAT cnt ;

: BENCHMARK 1000 0 DO sieve drop LOOP ;


0.317 seconds with locals
0.350 seconds with original code


Leo Wong

unread,
Jun 26, 2004, 10:41:09 PM6/26/04
to
Doug Hoffman wrote:
> Earlier I wrote:
>
>
>>Personally, I would code it with local variables even if it meant a (likely
>>small) speed hit. There is more to programming than drag speed contests.
>
>
> Actually, it is faster with local variables, using Mops on a G3 iMac 500MHz
> (YMMV):
>

A possibly rewarding exercise is to start with the clearest code you can
write, regardless of speed, and optimize by stages.

Leo Wong
--
http://www.albany.net/~hello/
"Truth will sooner come out from error than from confusion."
-- Francis Bacon, quoted in Barzun and Graff, The Modern Researcher,
6th edition, p. 308.

Steve Fisher

unread,
Jun 26, 2004, 10:41:13 PM6/26/04
to
> Personally, I would code it with local variables even if it meant a (likely
> small) speed hit. There is more to programming than drag speed contests.

Using Gforth-fast, this version takes about 14% longer:

variable pcount
: primes ( -- n )
3 { step }
flags 8190 1 fill
0 pcount !
eflag @ flags
do
i c@
if step i + dup eflag @ < ( step+i bool)
if
eflag @ swap
do
0 i c!
step
+loop
else drop
endif
1 pcount +!
endif
step 2 + to step
loop
pcount @ ;

While we're speaking of ugly code, the Rot13 site
http://www.miranda.org/~jkominek/rot13/ has these C programs:

int main ()
{
register char byte, cap;
for(;read (0, &byte, 1);)
{
cap = byte & 32;
byte &= ~cap;
byte = ((byte >= 'A') && (byte <= 'Z') ? ((byte - 'A' + 13) % 26
+ 'A') : byte) | cap;
write (1, &byte, 1);
}
}

main(a){while(a=~getchar())putchar(~a-1/(~(a|32)/13*2-11)*13);}


I think this Forth version is easier to understand:

13 constant eol
: rot13
begin key dup dup toupper dup ( c c upcased upcased)
'N 'Z 1+ within swap
'A 'N within ( c c 0|-1 0|-1 )
- 13 * ( c c offset) + emit ( c)
eol =
until cr ;

George Morrison

unread,
Jun 27, 2004, 5:41:45 AM6/27/04
to
Steve Fisher wrote:
> While we're speaking of ugly code, the Rot13 site
> http://www.miranda.org/~jkominek/rot13/ has these C programs:
>
> int main ()
> {
> register char byte, cap;
> for(;read (0, &byte, 1);)
> {
> cap = byte & 32;
> byte &= ~cap;
> byte = ((byte >= 'A') && (byte <= 'Z') ? ((byte - 'A' + 13) % 26
> + 'A') : byte) | cap;
> write (1, &byte, 1);
> }
> }
>
> main(a){while(a=~getchar())putchar(~a-1/(~(a|32)/13*2-11)*13);}
>
>
> I think this Forth version is easier to understand:
>
> 13 constant eol
> : rot13
> begin key dup dup toupper dup ( c c upcased upcased)
> 'N 'Z 1+ within swap
> 'A 'N within ( c c 0|-1 0|-1 )
> - 13 * ( c c offset) + emit ( c)
> eol =
> until cr ;

A table lookup version.

13 constant eol
create table 256 chars allot
: populate ( addition start end -- )
1+ swap do i over + i chars table + c! loop drop ;
0 0 255 populate
13 char A char M populate
-13 char N char Z populate
13 char a char m populate
-13 char n char z populate

: rotchar ( ch -- ch' ) chars table + c@ ;
: rot13 ( -- ) begin key dup rotchar emit eol = until cr ;

--
George Morrison | george morrison gedamo demon co uk
Aberdeen, Scotland | . @ . . .

Marcel Hendrix

unread,
Jun 27, 2004, 5:58:44 AM6/27/04
to
Because of newsserver problems I cannot comment on Anton's
recommendation to recheck the Forth2C code.

I copied the Forth benchmark code and 4 Forth compilers
to three of my machines and ran all combinations.

I also compiled SIEV.C from the Forth2C distributions with MS VC6,
without definining any SYMBOLS, but setting VC's switches for maximum
speed (i.e. framepointer removal etc.)

I checked the ASM file for SIEVE.exe and it looked correct to me
(that is, every variable was kept in registers, code looked extremely
efficient, no trace of the 'virtual machine' left).

This is the result for the Gforth Sieve (with eflag):

GForth Sieve | P54C 166 MHz Athlon 900 MHz P4 3 GHz HT
----------------+-------------------------------------------------
mxForth 3.0 | 0.767 sec 0.156 sec 0.069 sec
MS VC6 max. opt | 0.880 sec 0.160 sec 0.060 sec
iForth 2.0 | 1.792 sec 0.328 sec 0.122 sec
VFX 3.40.0851 | 2.093 sec 0.411 sec 0.172 sec
GForth 0.6.1 | 6.101 sec 0.990 sec 0.411 sec

So indeed, C is two times faster than iForth and VFX on all my hardware.
However, mxForth is equally fast as C. (mxForth's model allows it to use
one more register than iForth).

I had to leave an unexpectedly large gap between the flags array and
PRIMES on the P4.

Also I found that the original Gilbreath Sieve is quite a lot faster
than the GForth variant on most other Forth/hardware combinations.

-marcel

-- ---------------------------------------------------------------------
[DEFINED] TIMER-RESET 0= [IF] INCLUDE compat.fth [THEN]

( selection -- )

ANEW -sieve

BASE @ DECIMAL

8190 CONSTANT size
CREATE flags size ALLOT
flags size + CONSTANT eflag
1024 ALLOT ALIGN

( sel base -- ) SWAP VALUE sel

sel 0= [IF] CR .( GForth Sieve)


: PRIMES ( -- n )

flags size 1 FILL
0 3
eflag flags DO I C@ IF DUP I + DUP
eflag < IF eflag SWAP DO 0 I C! DUP +LOOP


ELSE DROP
THEN SWAP 1+ SWAP
THEN 2 +
LOOP DROP ;

[THEN]

sel 1 = [IF] CR .( Gilbreath Sieve)
: PRIMES ( -- )
flags size 1 FILL
0 size 0 DO I flags +
C@ IF I 2* 3 +
DUP I +
BEGIN DUP size
U< WHILE 0 OVER flags + C!
OVER +
REPEAT
2DROP 1+
ENDIF
LOOP ;
[THEN]

sel 2 = [IF] CR .( GForth + local Sieve)


: PRIMES ( -- n )

3 LOCALS| step |
0 ( pcount)
flags size 1 FILL
eflag flags
DO I C@ IF step I + DUP eflag < ( step+i bool)
IF eflag SWAP DO 0 I C! step +LOOP
ELSE DROP
ENDIF
1+ ( pcount)
ENDIF
step 2 + TO step
LOOP ;
[THEN]

sel 3 = [IF] CR .( Mops Sieve)
: PRIMES ( -- u )
0 0 0 LOCALS| prime k index |
flags size 1 FILL
0 ( cnt)
BEGIN
size index >
WHILE
index flags + C@
IF index 2* 3 + TO prime
1+ ( cnt)
index prime + TO k
BEGIN size k >
WHILE 0 k flags + C!
k prime + TO k
REPEAT
THEN
index 1+ TO index
REPEAT ;

\ G3 iMac 500 MHz with MOPS
\ 0.317 seconds with locals


\ 0.350 seconds with original code

[THEN]

: BENCHMARK ( -- )
BASE @ >R DECIMAL
24 HTAB TIMER-RESET
0 1000 0 DO PRIMES NIP LOOP DROP
.ELAPSED
R> BASE ! ;

benchmark

BASE !

Anton Ertl

unread,
Jun 27, 2004, 6:07:06 AM6/27/04
to
m...@iae.nl (Marcel Hendrix) writes:
>This is the result for the Gforth Sieve (with eflag):
>
>GForth Sieve | P54C 166 MHz Athlon 900 MHz P4 3 GHz HT
...

>GForth 0.6.1 | 6.101 sec 0.990 sec 0.411 sec

That's slow for Gforth-0.6 (bad compiler/flags?). With an optimally
compiled version (gcc-2.95.*, configure --enable-force-reg) and
gforth-fast you should get something like:

Gforth 0.6 | 0.49s 0.20s

>Also I found that the original Gilbreath Sieve is quite a lot faster
>than the GForth variant on most other Forth/hardware combinations.

Looking at the code, it appears that the "Gforth sieve" uses +LOOP to
reduce the number of words in the inner loop, which helps
threaded-code systems and probably also simple native-code compilers,
but not sophisticated native-code compilers like iforth/mxforth/VFX
(i.e.most other:-); the slowdown may come from the fact that the +LOOP
test is more complex than U<, or from other differences.

I am not sure where the "Gforth sieve" comes from. Probable sources
are Bernd Paysan, or Marty Fraemans Forth versions of the Stanford
benchmarks (bubble and matrix come from there).

Doug Hoffman

unread,
Jun 27, 2004, 6:28:45 AM6/27/04
to
Marcel Hendrix wrote:

> sel 3 = [IF] CR .( Mops Sieve)
> : PRIMES ( -- u )
> 0 0 0 LOCALS| prime k index |
> flags size 1 FILL
> 0 ( cnt)
> BEGIN
> size index >
> WHILE
> index flags + C@
> IF index 2* 3 + TO prime
> 1+ ( cnt)
> index prime + TO k
> BEGIN size k >
> WHILE 0 k flags + C!
> k prime + TO k
> REPEAT
> THEN
> index 1+ TO index
> REPEAT ;
>
> \ G3 iMac 500 MHz with MOPS
> \ 0.317 seconds with locals
> \ 0.350 seconds with original code

Thanks for that, Marcel.

The "2*" you used instead of the "dup +" I used sped it up a bit more to
0.300 seconds. More readable too. Keeping the count on the stack had no
measurable effect on the time, on my setup.

Regards,

-Doug

Marcel Hendrix

unread,
Jun 28, 2004, 1:58:45 AM6/28/04
to
an...@mips.complang.tuwien.ac.at (Anton Ertl) writes Re: Review of SwiftX and VFX Forth

> m...@iae.nl (Marcel Hendrix) writes:
>> This is the result for the Gforth Sieve (with eflag):

>> GForth Sieve | P54C 166 MHz Athlon 900 MHz P4 3 GHz HT
> ...
>> GForth 0.6.1 | 6.101 sec 0.990 sec 0.411 sec

> That's slow for Gforth-0.6 (bad compiler/flags?). With an optimally
> compiled version (gcc-2.95.*, configure --enable-force-reg) and
> gforth-fast you should get something like:

> Gforth 0.6 | 0.49s 0.20s
[..]

I did all tests under Windows NT/2000/XP, so I used the windows
binary (see thread in gforth mailinglist when Gforth 0.6 came out).
AFAIK this one was made by Bernd Paysan. I don't have the tools
to compile Gforth for windows myself.

The speeds you show indicate that Gforth and VFX produce code that is
equally fast...

-marcel

LeoWong

unread,
Jun 28, 2004, 11:08:59 AM6/28/04
to
\ primes.f Leo Wong, June 28, 2004 +
\ Slow and less slow versions of the Sieve of Eratosthenes
\ Assumes 1 CHAR = 1 address unit

\ Slow version
16384 CONSTANT #nums \ includes 1900 prime numbers
CREATE nums #nums ALLOT

\ Address of n
: num ( n -- a ) nums + ;

\ Is n a prime number?
: prime? ( n -- flag ) num C@ ;

\ Multiples of n are not prime
: -multiples ( n -- )
DUP BEGIN OVER + DUP #nums U<
WHILE FALSE OVER num C! REPEAT 2DROP ;

: primes-a ( -- u )
nums #nums TRUE FILL
0 #nums 2 DO I prime? IF I -multiples 1+ THEN LOOP ;

\ Somewhat faster
8190 CONSTANT size \ consider this many numbers only


CREATE flags size ALLOT
flags size + CONSTANT eflag

: squared ( n -- n*n ) DUP * ;
\ square root after Wil Baden
: sqrt ( n1 -- n2 )
0 TUCK DO 1+ DUP 2* 1+ +LOOP ;

size 2* sqrt 2/ CONSTANT maxfactor

\ Increment count of primes
: +primes ( n1 a -- n1+1 a ) >R 1+ R> ;

\ Return number represented by ith flag
: >n ( i -- n ) 2* 3 + ;

\ Reset flags of odd multiples of n
: -flags ( n -- )
>n eflag OVER squared 3 - 2/ flags +
DO FALSE I C! DUP +LOOP DROP ;

: primes-b ( -- u )
flags size TRUE FILL
1 flags maxfactor 0 DO COUNT IF +primes I -flags THEN LOOP
size maxfactor DO COUNT IF +primes THEN LOOP DROP ;

Leo Wong
http://www.albany.net/~hello/
"There is more to life than increasing its speed" -- Gandhi

LeoWong

unread,
Jun 28, 2004, 1:59:28 PM6/28/04
to
Looking at the Forth versions there, it may be a comedy site.

How much nicer is 'c than [CHAR] c

\ rot13.f Leo Wong, June 28, 2004 +

: between ( n1 n2 n3 -- flag ) 1+ WITHIN ;

: rot13 ( c1 -- c2 )
DUP [CHAR] a [CHAR] m between IF 13 + EXIT THEN
DUP [CHAR] n [CHAR] z between IF 13 - EXIT THEN
DUP [CHAR] A [CHAR] M between IF 13 + EXIT THEN
DUP [CHAR] N [CHAR] Z between IF 13 - EXIT THEN ;

: test ( ca u )
CR 0 ?DO COUNT rot13 EMIT LOOP DROP CR ;

s" Ybeq Wrfhf Puevfg, unir zrepl ba zr, n fvaare." test

Leo Wong
http://www.albany.net/~hello/
He lived. He loved. He prayed.

Ian Osgood

unread,
Jun 28, 2004, 2:01:54 PM6/28/04
to
Leo Wong <he...@albany.net> wrote in message news:<s4-dnb2YLs5...@thebiz.net>...

> Doug Hoffman wrote:
> > Earlier I wrote:
> >
> >
> >>Personally, I would code it with local variables even if it meant a (likely
> >>small) speed hit. There is more to programming than drag speed contests.
> >
> >
> > Actually, it is faster with local variables, using Mops on a G3 iMac 500MHz
> > (YMMV):
> >
>
> A possibly rewarding exercise is to start with the clearest code you can
> write, regardless of speed, and optimize by stages.
>
> Leo Wong

There is a more efficient, better factored, and better commented
version on the Forth Freak Wiki:

http://www.forthfreak.net/wiki/index.cgi?SieveOfEratosthenes

You do extra work by counting and setting flags in the same loop. You
only need sqrt(n)/2 iterations to set flags for prime numbers up to n.
It is better to have seperate words for setting the flags, counting
the primes, and listing the primes.

Ian

LeoWong

unread,
Jun 28, 2004, 2:56:44 PM6/28/04
to
Good advice, thanks.

LeoWong

unread,
Jun 28, 2004, 2:58:20 PM6/28/04
to
Wil Baden would certainly use a table lookup in Forth. Maybe he still does
in Java.

Thanks,

Marcel Hendrix

unread,
Jun 28, 2004, 3:39:48 PM6/28/04
to
ia...@quirkster.com (Ian Osgood) writes Re: Sieve (was Re: Review of SwiftX and VFX Forth)

> There is a more efficient, better factored, and better commented
> version on the Forth Freak Wiki:

> http://www.forthfreak.net/wiki/index.cgi?SieveOfEratosthenes

The only thing it lacks is a final test before production :-)

\ : clear-multiples ( prime start -- prime )
\ maxp bounds do 0 I c! dup +loop ;

This writes all over the end of the flags array.
As a quick hack, it can be fixed with:

: clear-multiples ( prime start -- prime )
flags maxp + swap
do 0 I c! dup +loop ;

I guess the Wiki stuff was tried on a system where DATA and CODE are in
very widely separated areas or even in different 'segments.' On iForth
it overwrote its own code.

> You do extra work by counting and setting flags in the same loop. You
> only need sqrt(n)/2 iterations to set flags for prime numbers up to n.
> It is better to have seperate words for setting the flags, counting
> the primes, and listing the primes.

The overwriting implies clear-multiples does too much work. On my system
the new Sieve is slightly slower than Gilbreath's original.

-marcel


LeoWong

unread,
Jun 28, 2004, 4:02:11 PM6/28/04
to
This is just another way of building a table.

: +fill ( startvalue startaddress n -- )
0 DO OVER I + OVER C! CHAR+ LOOP 2DROP ;
: enumerated
CREATE ( startvalue n -- ) HERE SWAP DUP CHARS ALLOT +fill
DOES> ( n -- a ) SWAP CHARS + ;

0 256 enumerated table

CHAR A 13 + CHAR A table 13 +fill
CHAR N 13 - CHAR N table 13 +fill
CHAR a 13 + CHAR a table 13 +fill
CHAR n 13 - CHAR n table 13 +fill

Leo Wong

unread,
Jun 28, 2004, 8:09:49 PM6/28/04
to
Marcel Hendrix wrote:
>
> The overwriting implies clear-multiples does too much work. On my system
> the new Sieve is slightly slower than Gilbreath's original.
>

With Marcel's correction it is quite speedily on my machine.

"All artists argue about their ways of performing. This is taken for
granted." -- Jacques Barzun, Simple & Direct

Bernd Paysan

unread,
Jun 29, 2004, 6:48:54 AM6/29/04
to
Doug Hoffman wrote:
> 1) The 0 3 on the first line must be maintained inside both DO loops and
> to
> the finish. The 0 is the count (# of primes found), the 3 is just re-used
> again and again.

Nope, the 3 is incremented by 2 each time. It's 2i+3.

The code wasn't written by me, but came with volksForth. I didn't change it
to allow comparison (if any, there were only modest changes). There are
several important mistakes. First of all, EFLAGS should not be a variable,
but a constant (it points to the end of the array). Then, the loop should
go to the square root of the last number (16k, so 128, which means "first
64 entries"). Counting the number of primes found should be separated. The
inner EFLAGS comparison is completely superfluous (when getting the maximum
index right). There's no factoring, but this function is screaming for
factoring:

Create flags 8190 allot
here Constant eflags

: init-flags ( -- ) flags 8190 1 fill ;
: clear-multiples ( n addr -- n )
eflags swap DO 0 I c! dup +LOOP ;
\ or: BEGIN 0 over c! over + dup eflags u>= UNTIL drop ;
: count-primes ( -- n )
0 eflags flags DO I c@ + LOOP ;
: list-primes ( -- ) 2 .
3 eflags flags DO I c@ IF dup . THEN 2 + LOOP drop ;
: find-primes ( -- ) 3 flags 64 bounds DO
I c@ IF dup I + clear-multiples THEN 2 +
LOOP drop ;
: primes ( -- 1899 ) init-flags find-primes count-primes ;
: benchmark 1000 0 DO primes drop LOOP ;

Note that this is not that much of an improvement, since the clear-multiples
still takes most of the time for the smaller numbers. Since many larger
numbers have several small factors, this is not a really efficient way to
do sieve.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

Ian Osgood

unread,
Jun 29, 2004, 12:44:58 PM6/29/04
to
m...@iae.nl (Marcel Hendrix) wrote in message news:<8386342...@frunobulax.edu>...

> ia...@quirkster.com (Ian Osgood) writes Re: Sieve (was Re: Review of SwiftX and VFX Forth)
>
> > There is a more efficient, better factored, and better commented
> > version on the Forth Freak Wiki:
>
> > http://www.forthfreak.net/wiki/index.cgi?SieveOfEratosthenes
>
> The only thing it lacks is a final test before production :-)
>
> \ : clear-multiples ( prime start -- prime )
> \ maxp bounds do 0 I c! dup +loop ;
>
> This writes all over the end of the flags array.
> As a quick hack, it can be fixed with:
>
> : clear-multiples ( prime start -- prime )
> flags maxp + swap
> do 0 I c! dup +loop ;
>
> I guess the Wiki stuff was tried on a system where DATA and CODE are in
> very widely separated areas or even in different 'segments.' On iForth
> it overwrote its own code.

Thanks for the fix! This was a leftover from replacing the pointer
eflags with the limit maxp. The broken code worked on GForth
(amazingly). I have updated the wiki page.

> > You do extra work by counting and setting flags in the same loop. You
> > only need sqrt(n)/2 iterations to set flags for prime numbers up to n.
> > It is better to have seperate words for setting the flags, counting
> > the primes, and listing the primes.
>
> The overwriting implies clear-multiples does too much work. On my system
> the new Sieve is slightly slower than Gilbreath's original.
>
> -marcel

Interesting! Does that imply that the overhead of a second loop to
count the flags is higher than the extra n-sqrt(n) calls to
clear-multiples? Just goes to show that profiling trumps theory.

Ian

Marcel Hendrix

unread,
Jun 29, 2004, 5:55:56 PM6/29/04
to
ia...@quirkster.com (Ian Osgood) writes Re: Sieve (was Re: Review of SwiftX and VFX Forth)

> m...@iae.nl (Marcel Hendrix) wrote in message news:<8386342...@frunobulax.edu>...
>> ia...@quirkster.com (Ian Osgood) writes Re: Sieve (was Re: Review of SwiftX and VFX Forth)

>> > There is a more efficient, better factored, and better commented
>> > version on the Forth Freak Wiki:

>> > http://www.forthfreak.net/wiki/index.cgi?SieveOfEratosthenes

[..]


>> The overwriting implies clear-multiples does too much work. On my system
>> the new Sieve is slightly slower than Gilbreath's original.

> Interesting! Does that imply that the overhead of a second loop to


> count the flags is higher than the extra n-sqrt(n) calls to
> clear-multiples? Just goes to show that profiling trumps theory.

Apparently, yes.

This, however, is twice faster ( 607 ms instead of 1349 ms for 1000 times
on a P54c 166 MHz under Windows NT).

( ./examples/benchmar/sieveros.frt; Douglas Ross, FD Vol XII, No 2, page 6 )
#1000 constant #times
#90 constant root
#8190 constant size

CREATE flags size ALLOT

: DO-PRIME
flags size 1 FILL
root 1 DO flags I +
C@ IF I 2* 1+
DUP 2* SWAP DUP *
BEGIN DUP size 2*
U< WHILE 0 OVER 2/ flags + C!
OVER +
REPEAT
2DROP
ENDIF
LOOP ;

: PRIMES CR #times DEC. ." iterations." TIMER-RESET
#times 0 DO DO-PRIME LOOP
CR .ELAPSED ;

: #PRIMES DO-PRIME
1 size 1 DO flags I + C@ + LOOP . ." primes." ;

: .PRIMES 1 LOCAL cnt \ <size> --- <>
DEPTH 0= ABORT" No limit"
DUP size U> ABORT" above SIZE "
DO-PRIME
CR 2 8 U.R
1 DO
flags I + C@ IF I 2* 1+ 8 U.R
cnt 8 = IF CR CLEAR cnt
ELSE 1 +TO cnt
ENDIF
ENDIF
LOOP ;

-marcel

Albert van der Horst

unread,
Jun 29, 2004, 3:30:58 AM6/29/04
to
In article <s4-dnb2YLs5...@thebiz.net>,

Leo Wong <he...@albany.net> wrote:
>Doug Hoffman wrote:
>> Earlier I wrote:
>>
>>
>>>Personally, I would code it with local variables even if it meant a (likely
>>>small) speed hit. There is more to programming than drag speed contests.
>>
>>
>> Actually, it is faster with local variables, using Mops on a G3 iMac 500MHz
>> (YMMV):
>>
>
>A possibly rewarding exercise is to start with the clearest code you can
>write, regardless of speed, and optimize by stages.

I don't like all those people tweaking the original benchmark to fit
their compiler or their coding habits. It also doesn't matter a bit
whether it is ugly code.

I would like to have a version that could be a model of good factoring,
and then see how it fares one different optimising compilers.
First decide on it. Then run it. Never allow to run changed versions.
Or at least hit everybody on the head when they are presenting benchmark
results of changed versions.

(I tried something like that with a recursion benchmark, where the
copyright tries to forbid distributing without notice of changed
versions.)

>Leo Wong

Groetjes Albert
--
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
One man-hour to invent,
One man-week to implement,
One lawyer-year to patent.

Elizabeth D Rather

unread,
Jun 29, 2004, 8:51:41 PM6/29/04
to
"Nihar" <nihar...@gmail.com> wrote in message
news:f7d92a3b.04062...@posting.google.com...
> Hi,
>
> A couple of days back I had written to the Forum about switching from
> LMI Forth to a newer well supported Forth for 68hc12 processor.
>
> All the replies I got were immensely helpful. Finally it seems that
> there are two companies which are in contention. SwiftX from Forth
> Inc. and VFX Forth 6 from MPE. I tried searching on the net about any
> comparitive reviews about these two Forth's, but I was unable to find
> one.
>
> I would be really greatful if you can point me to a comparison of the
> two compilers or share you experience of using them.

I'm sorry to see that no one has responded to your real question.
Benchmarks involving running the Sieve on SwiftForth and other Windows
Forths have no bearing at all on your 68HC12 question! Not to mention that
performance benchmarks are only one measure of the usefulness of a Forth
product: ease of use, compactness of code, and library support are surely
also important issues.

I invite you to look at our web site www.forth.com for more information of
SwiftX, if you haven't already... you'll find some more information there,
and you can also order a $15 SwiftX evaluation CD that contains a lot of
detailed information, including a complete documentation set, and demo
versions of SwiftX for the 68HC11 and other targets.

Cheers,
Elizabeth
--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310-491-3356
5155 W. Rosecrans Ave. #1018 Fax: +1 310-978-9454
Hawthorne, CA 90250
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================


Steve Fisher

unread,
Jul 3, 2004, 4:31:07 AM7/3/04
to
"LeoWong" <he...@albany.net> wrote in message

> How much nicer is 'c than [CHAR] c

If one were using F83 by LaXen & Perry, this would work:

=========================================================

1 width !
\ Since width is 1, only the first letter of the name and its
\ length will be stored.
: 'a ( -- ascii)
>in @ 1-
blk @ ?dup ( Are we loading a file?)
if block
else
tib ( No, input is from keyboard.)
then +
begin
dup c@ BL =
while
1-
repeat
c@
state @ if [compile] literal then
; immediate
31 width !
\ Now when Forth parses 'B or 'x or '= or any two-letter name
\ beginning with ' , the word we just defined will be executed.

===============================================================

Of course, this won't work when using an ANS Forth.

George Morrison

unread,
Jul 3, 2004, 1:56:51 PM7/3/04
to

This creates a set of constants, 'A 'B etc.

: $'char s" char ! constant '!" ;
: ('char) $'char drop 2dup 5 chars + c! 17 chars + c! ;
: 'char 127 33 do i ('char) $'char evaluate loop ;

Of course, this won't work if your Forth is case insensitive (no
difference between 'A and 'a).

Albert van der Horst

unread,
Jul 3, 2004, 11:16:50 AM7/3/04
to
In article <240eba29895c250b...@localhost.talkaboutprogramming.com>,

LeoWong <he...@albany.net> wrote:
>Looking at the Forth versions there, it may be a comedy site.
>
>How much nicer is 'c than [CHAR] c

This c-ism is ill-advised in Forth, where ' is associated with
execution tokens, and 'something is conventionally either an execution
token, or an address that contains one.
I dislike
db char x char a char c 4
in a disassembler, (where it would run for pages on end),
but replace it with
db &x &a &c 4
which is quite common too.

Steve Fisher

unread,
Jul 3, 2004, 10:08:15 PM7/3/04
to
Very clever, but naughty! Under the definition of S" the ANS Forth
Standard says "A program shall not alter the returned string."
I am tempted to sic the ANS police on you. :)

George Morrison

unread,
Jul 4, 2004, 4:40:05 AM7/4/04
to

"It's a fair cop guv, but society is to blame."
"Agreed. We will be charging them too."
-- Monty Python

Steve Fisher

unread,
Jul 4, 2004, 11:01:54 AM7/4/04
to
All Forth programmers should be arrested, if this definition is correct:

arrest, v.t. Formally to detain one accused of unusualness.
-- Ambrose Bierce

Anton Ertl

unread,
Jul 25, 2004, 7:09:45 AM7/25/04
to
ia...@quirkster.com (Ian Osgood) writes:
>There is a more efficient, better factored, and better commented
>version on the Forth Freak Wiki:
>
>http://www.forthfreak.net/wiki/index.cgi?SieveOfEratosthenes

For 1000 times counting the primes up to 16384 (or so, defining maxp
as 8191), it takes 0.15s instead of 0.25s with gforth-fast on a
2.26GHz Pentium 4. Very nice. I just wonder why it counts 1900
primes while the other version counts 1899.

In any case, I have put up the program with SQRT, changed maxp, and a
BENCHMARK word at

http://www.complang.tuwien.ac.at/forth/sieve-factored.fs

EuroForth 2004: http://www.complang.tuwien.ac.at/anton/euroforth2004/
Deadline (refereed): August 28, 2004; Conference: November 19-21, 2004

Anton Ertl

unread,
Jul 26, 2004, 2:26:00 AM7/26/04
to
m...@iae.nl (Marcel Hendrix) writes:
>an...@mips.complang.tuwien.ac.at (Anton Ertl) writes Re: Review of SwiftX and VFX Forth
>
>> m...@iae.nl (Marcel Hendrix) writes:
>>> This is the result for the Gforth Sieve (with eflag):
>
>>> GForth Sieve | P54C 166 MHz Athlon 900 MHz P4 3 GHz HT
>> ...
>>> GForth 0.6.1 | 6.101 sec 0.990 sec 0.411 sec
>
>> That's slow for Gforth-0.6 (bad compiler/flags?). With an optimally
>> compiled version (gcc-2.95.*, configure --enable-force-reg) and
>> gforth-fast you should get something like:
>
>> Gforth 0.6 | 0.49s 0.20s
>[..]
>
>I did all tests under Windows NT/2000/XP, so I used the windows
>binary (see thread in gforth mailinglist when Gforth 0.6 came out).

Ok, I have now tested under Windows ME, where the gforth-fast.exe from
our 0.6.2 package is really slow (0.5s-0.8s real time for siev.fs on a
2GHz Athlon 64), no CPU times available), and then taken the same
binary to a W2K machine, where I get 0.4s user time on a 2GHz Athlon
MP. More precisely, for the small benchmarks I get (user time):

siev bubble matrix fib
0.405 0.452 0.249 0.499 2GHz Athlon MP, gcc-3.2, W2K
0.29 0.34 0.14 0.37 2GHz Athlon MP, gcc-3.2, Linux
0.22 0.33 0.14 0.35 2GHz Athlon MP, gcc-2.95.1, Linux

So, for a good part Windows (or Cygwin) seems to be responsible for
the slowness. I wonder what's going on there; differences in system
time would not be a big surprise, but I have no explanation for
differences in user time.

Anyway, if I scale the 0.405s to a 900MHz machine, you should still be
seeing a slightly faster result with the more recent

http://www.complang.tuwien.ac.at/forth/gforth/gforth-0.6.2.exe

package.

Thomas Pornin

unread,
Jul 26, 2004, 3:53:02 AM7/26/04
to
According to Anton Ertl <an...@mips.complang.tuwien.ac.at>:

> For 1000 times counting the primes up to 16384 (or so, defining maxp
> as 8191), it takes 0.15s instead of 0.25s with gforth-fast on a
> 2.26GHz Pentium 4. Very nice. I just wonder why it counts 1900
> primes while the other version counts 1899.

If that is any help: a small sloppy C program of mine, not using a sieve
but a simple loop and primality test (by trial division), finds exactly
1900 primes lower than 16384. My guess is that the programs which find
only 1899 primes forget the number "2".


--Thomas Pornin

Doug Hoffman

unread,
Jul 26, 2004, 5:16:15 AM7/26/04
to
Thomas Pornin wrote:

It depends on the stated conditions. The Sept. 1981 Byte Sieve article by
Jim Gilbreath, which may have popularized the Sieve as a benchmark, stated
the conditions as: "... which finds all of the prime numbers between 3 and
16,381."

Regards,

-Doug

Julian V. Noble

unread,
Jul 26, 2004, 6:43:33 PM7/26/04
to

Maybe a difference in the C compilers (or their associated libraries)
used to create GForth?

--
Julian V. Noble
Professor Emeritus of Physics
j...@lessspamformother.virginia.edu
^^^^^^^^^^^^^^^^^^
http://galileo.phys.virginia.edu/~jvn/

"For there was never yet philosopher that could endure the
toothache patiently."

-- Wm. Shakespeare, Much Ado about Nothing. Act v. Sc. 1.

Anton Ertl

unread,
Jul 27, 2004, 1:57:20 AM7/27/04
to
"Julian V. Noble" <j...@virginia.edu> writes:

>Anton Ertl wrote:
>> siev bubble matrix fib
>> 0.405 0.452 0.249 0.499 2GHz Athlon MP, gcc-3.2, W2K
>> 0.29 0.34 0.14 0.37 2GHz Athlon MP, gcc-3.2, Linux
>> 0.22 0.33 0.14 0.35 2GHz Athlon MP, gcc-2.95.1, Linux
>>
>> So, for a good part Windows (or Cygwin) seems to be responsible for
>> the slowness. I wonder what's going on there; differences in system
>> time would not be a big surprise, but I have no explanation for
>> differences in user time.
...

>Maybe a difference in the C compilers (or their associated libraries)
>used to create GForth?

As indicated above, for the first two lines the C compilers are the
same; I have also checked the code of some primitives, and it is the
same.

Since hardly any time is spent in libraries in these benchmarks, they
should not matter; and anyway, I am pretty sure that the Cygwin
library is derived from the glibc that's also used on Linux.

Doug Hoffman

unread,
Jul 27, 2004, 8:15:13 PM7/27/04
to
alb...@spenarnc.xs4all.nl (Albert van der Horst) wrote in message news:<I027JM.6CJ...@spenarnc.xs4all.nl>...

> In article <s4-dnb2YLs5...@thebiz.net>,
> Leo Wong <he...@albany.net> wrote:
> >Doug Hoffman wrote:
> >> Earlier I wrote:
> >>
> >>
> >>>Personally, I would code it with local variables even if it meant a (likely
> >>>small) speed hit. There is more to programming than drag speed contests.
> >>
> >>
> >> Actually, it is faster with local variables, using Mops on a G3 iMac 500MHz
> >> (YMMV):
> >>
> >
> >A possibly rewarding exercise is to start with the clearest code you can
> >write, regardless of speed, and optimize by stages.
>
> I don't like all those people tweaking the original benchmark to fit
> their compiler or their coding habits.

I agree that the generic (non language specific) algorithm should not
be changed for benchmarks. But I see nothing wrong with recoding that
algorithm. The same algorithm can be coded several different ways, at
least with Forth. For example: using locals vs stack gymnastics.


>It also doesn't matter a bit whether it is ugly code.

When the code gets published in a widely read magazine (as was Byte at
one time) it can leave a lasting positive or negative impression with
those not familiar with Forth.

Sorry for the late response but I only just now noticed this.

Regards,

-Doug

0 new messages