Incremental Averaging

Arnold Doray

unread,

May 9, 2012, 12:45:55 AM5/9/12

to

Dear Forthers,

I need to do incremental averaging. I came up with this:

: (average) ( n An Xn+1 -- n+1 An+1 )
swap dup >R
- >R
1 + dup
R> swap /
R> + ;

: average ( seq -- d )
0 0 ['] (average) 2 reduce swap drop ;

REDUCE is a higher order function that takes a stream/"seq", a set of
accumulators, an XT that updates the accumulators and the arity of the
XT. It reads the stream to its end.

Note that in this Forth / also handles floating point division.

My problem is that (AVERAGE) looks really ugly. Any suggestions for
improvement?

Thanks,
Arnold

Elizabeth D. Rather

unread,

May 9, 2012, 3:42:44 AM5/9/12

to

This is the kind of problem M*/ was made for. AVERAGE should accept the
address of an array and its length. Add everything into a double-length
accumulator, and then do an M*/ with a ratio that includes a scale
factor (you know how many actual digits of accuracy you have) and 'n' as
the divisor.

Something like this: (don't have access to a program to test, sorry)

: AVG ( addr n -- n-avg) DUP >R \ Save copy of n
0. 2SWAP 0 DO \ 0. is initial accumulator
>R R@ I CELLS + @ M+ R> LOOP \ Add Ith value to accumulator
10 R> M*/ ; \ Scale average in tenths.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Arnold Doray

unread,

May 9, 2012, 4:29:32 AM5/9/12

to

On Tue, 08 May 2012 21:42:44 -1000, Elizabeth D. Rather wrote:

>
> This is the kind of problem M*/ was made for. AVERAGE should accept the
> address of an array and its length. Add everything into a double-length
> accumulator, and then do an M*/ with a ratio that includes a scale
> factor (you know how many actual digits of accuracy you have) and 'n' as
> the divisor.
>
> Something like this: (don't have access to a program to test, sorry)
>
> : AVG ( addr n -- n-avg) DUP >R \ Save copy of n
> 0. 2SWAP 0 DO \ 0. is initial accumulator
> >R R@ I CELLS + @ M+ R> LOOP \ Add Ith value to accumulator
> 10 R> M*/ ; \ Scale average in tenths.
>
> Cheers,
> Elizabeth

Thank you Elizabeth, but your solution is the plain-vanilla averging, not
incremental averaging. Incremental averaging is less prone to overflow
problems (when you need to read many numbers from a large file or have to
sum squares , eg for RMS or variance calculations).

But thank you for pointing out the use of M*/ . Very useful.

Cheers,
Arnold

Paul Rubin

unread,

May 9, 2012, 4:54:12 AM5/9/12

to

Arnold Doray <inv...@invalid.com> writes:
> REDUCE is a higher order function that takes a stream/"seq", a set of
> accumulators, an XT that updates the accumulators and the arity of the
> XT. It reads the stream to its end.

1. That style doesn't seem very Forthish, but whatever.

2. If you just want the final average, it's probably better to add up
everything to a double-width value, and divide at the end. The method
you posted seems to lose precision at every step.

Hugh Aguilar

unread,

May 9, 2012, 6:46:50 AM5/9/12

to

On May 9, 2:54 am, Paul Rubin <no.em...@nospam.invalid> wrote:

> Arnold Doray <inva...@invalid.com> writes:
> > REDUCE is a higher order function that takes a stream/"seq", a set of
> > accumulators, an XT that updates the accumulators and the arity of the
> > XT. It reads the stream to its end.
>
> 1. That style doesn't seem very Forthish, but whatever.

I do stuff like that all of the time in the novice package (EACH
etc.).

What Forth is this?

Maybe I should provide something like your REDUCE in the novice
package. Right now though, I'm confused about what this code does. Is
a "seq" a sequential ascii file with one number on each line? These
are all single-precision integers, right?

> 2. If you just want the final average, it's probably better to add up
> everything to a double-width value, and divide at the end. The method
> you posted seems to lose precision at every step.

That is what I would do --- simple and straightforward.

Anton Ertl

unread,

May 9, 2012, 7:22:02 AM5/9/12

to

Arnold Doray <inv...@invalid.com> writes:
>Dear Forthers,
>
>I need to do incremental averaging. I came up with this:
>
>: (average) ( n An Xn+1 -- n+1 An+1 )
> swap dup >R
> - >R
> 1 + dup
> R> swap /
> R> + ;

[...]

>My problem is that (AVERAGE) looks really ugly. Any suggestions for
>improvement?

With locals:

: (average) { n1 An Xn+1 -- n+1 An+1 }
Xn+1 An - n 1+ tuck / An + ;

Without:

: (average) ( n An Xn+1 -- n+1 An+1 )

over - rot 1+ dup >r / + r> swap ;

Both untested.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2011: http://www.euroforth.org/ef11/

Arnold Doray

unread,

May 9, 2012, 7:53:26 AM5/9/12

to

On Wed, 09 May 2012 01:54:12 -0700, Paul Rubin wrote:

> Arnold Doray <inv...@invalid.com> writes:
>> REDUCE is a higher order function that takes a stream/"seq", a set of
>> accumulators, an XT that updates the accumulators and the arity of the
>> XT. It reads the stream to its end.
>
> 1. That style doesn't seem very Forthish, but whatever.

Yes, not mainstream forthish but Reduce, Map, Zip/ZipWith, etc work much
better in Forth than in languages that do support them like Haskell or
Lisp. They are useful because they obviate the need to have explicit
loops, which means they can handle infinite streams.

For example, to sum the first 10,000 squares divsible by 41 :

: ... ( n1 n2 -- seq ) .... ;
: square dup * ;
: pass? 41 mod 0= ;
: sum 0 ['] + 1 REDUCE ;

1 2 ... ' square MAP ' pass? FILTER 10000 TAKE sum

560341738361350000 ok

How to do this cleanly in mainstream Forth? While you can do this in
Haskell, I feel the Forth version is better. YMMV of course.

>
> 2. If you just want the final average, it's probably better to add up
> everything to a double-width value, and divide at the end. The method
> you posted seems to lose precision at every step.

I am dealing with large geophysical datasets, with a few hundred million
entries. I wanted to see the current average as the calculation is being
performed. The incremental version allows this with just a little
tweaking.

Cheers,
Arnold

Andrew Haley

unread,

May 9, 2012, 7:57:47 AM5/9/12

to

It needs a bit of factoring, and I think you're doing too much on the
stack. It's easy if you make the accumulator a structure in memory
with count and average, like so:

0
field: .count
field: .ave
constant /accum

and then:

\ Increment a counter, return its value
: +counter ( a - n) dup @ 1+ tuck ! ;

\ Incremental average
: (average) ( x accum)
dup >r .ave @ -
r@ .count +counter /
r> .ave +! ;

This will also mean less stack thrashing in the word that calls
(AVERAGE) .

Andrew.

Doug Hoffman

unread,

May 9, 2012, 8:09:20 AM5/9/12

to

Interestingly, I came up with locals solution that almost identical to
Anton's, and a non-locals solution that was identical.

In either case, it might help to provide a readable comment such as:

\ An+1 = An + (Xn+1 - An)/(n+1)

Both the locals solution and the comment should help to make this a very
easy-to-maintain program.

-Doug

Arnold Doray

unread,

May 9, 2012, 8:18:01 AM5/9/12

to

On Wed, 09 May 2012 03:46:50 -0700, Hugh Aguilar wrote:
>>
>> 1. That style doesn't seem very Forthish, but whatever.
>
> I do stuff like that all of the time in the novice package (EACH etc.).
>
> What Forth is this?
>

It's a Forth I wrote over Java. I am using it now for real apps at work.
It certainly takes the pain out of coding in Java. :) At least for the
kind of work I do (dealing with geophysical data), this approach looks
promising.

If you are implementing this in "Straight Forth", note that for the
higher order functions to be useful, they have to be lazy, since they act
on infinite streams, a notion I learnt from Clojure.

Some higher order functions I have now are :

REDUCE ( seq <acc> xt <num-cc> -- v )
MAP ( seq xt -- seq' )
ZIP ( seq<a> seq<b> -- seq3<a,b> )
ZIPWITH ( seq<a> seq<b> XT<op> -- seq<op(a,b)> )
TAKE ( seq n -- seq' )
etc.

I don't have folds yet, because I suspect they will be just too slow over
Java.

> Maybe I should provide something like your REDUCE in the novice package.
> Right now though, I'm confused about what this code does. Is a "seq" a
> sequential ascii file with one number on each line? These are all
> single-precision integers, right?

A "seq" is an abstract datastream which has operators "head" and "tail".
It is like a Lisp list, but potentially infinitely long. An idea I stole
from Clojure.

Cheers,
Arnold

Paul Rubin

unread,

May 9, 2012, 2:02:40 PM5/9/12

to

Arnold Doray <inv...@invalid.com> writes:
> Yes, not mainstream forthish but Reduce, Map, Zip/ZipWith, etc work much
> better in Forth than in languages that do support them like Haskell or
> Lisp. They are useful because they obviate the need to have explicit
> loops, which means they can handle infinite streams.

Does your Forth have garbage collection? It's been done, but of course
it's not traditional. Without GC it's difficult to use HOF's with much
flexibility. How do you operate on sequences of heap-allocated
structures?

> For example, to sum the first 10,000 squares divsible by 41 : ...

> 560341738361350000 ok

But that answer is wrong, it's actually the sum of the first 100000 of
those squares. Some cut-and-paste error, or a program bug?

> How to do this cleanly in mainstream Forth? While you can do this in
> Haskell, I feel the Forth version is better. YMMV of course.

s = sum $ take 10000 $ [a2 | a <- [1..], let a2 = a*a, a2`mod`41 == 0]

looks natural to me. But say instead of the first 100000 you just want
the ones up to 100000:

t = takeWhile (<= 100000) [a2 | a <- [1..], let a2 = a*a, a2`mod`41 == 0]

and now in addition to the sum, you also want the count:

print (sum t, length t)

How would you do that in your setup? I guess that asks if the right
thing happens if you DUP a sequence.

> I am dealing with large geophysical datasets, with a few hundred million
> entries. I wanted to see the current average as the calculation is being
> performed. The incremental version allows this with just a little
> tweaking.

Well you should probably use floating point, but unless you're worried
about int overflow, keeping a running count and sum seems easier than
that incremental method.

> A "seq" is an abstract datastream which has operators "head" and "tail".
> It is like a Lisp list, but potentially infinitely long. An idea I stole
> from Clojure.

If you want Clojure why don't you use it? Anyway Clojure probably
got it from Scheme, which got it from even older implementations.

Marcel Hendrix

unread,

May 9, 2012, 3:43:05 PM5/9/12

to

anew -average

DOC
(*

> I need to do incremental averaging. I came up with this:
>
> : (average) ( n An Xn+1 -- n+1 An+1 )
> swap dup >R
> - >R
> 1 + dup
> R> swap /
> R> + ;
>
> : average ( seq -- d )
> 0 0 ['] (average) 2 reduce swap drop ;

*)
ENDDOC

: (av) ( An 'Xn+1 n -- An+{Xn+1-An}/n+1 'Xn+2 )
1+ local n+1 @+ local Xn+1 local addr
Xn+1 over - n+1 / + addr ;

CREATE seq #24 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,
2 , 2 , 2 , 2 , 2 , 2 , 2 , 2 ,
3 , 3 , 3 , 3 , 3 , 3 , 3 , 3 ,

: average ( addr -- ave ) 0 swap @+ 0 ?DO I (av) LOOP DROP ;

cr .( seq average ) seq average .

: .count ; : .ave cell+ ;
: +counter ( a - n) dup @ 1+ tuck swap ! ;

\ Incremental average
: (average) ( x accum)
dup >r .ave @ -
r@ .count +counter /
r> .ave +! ;

create accum 0 , 0 ,
: 2average ( addr -- ave )
0. accum 2!
@+ 0 ?do @+ accum (average) loop drop
accum .ave @ ;

cr .( seq 2average ) seq 2average .

: (fav) ( F: An -- An+{Xn+1-An}/n+1 ) ( 'Xn+1 n -- 'Xn+2 )
1+ local n+1 @+ local Xn+1 local addr
Xn+1 S>F FOVER F- n+1 S>F F/ F+ addr ;

: faverage ( addr -- ) ( F: -- fave ) 0e @+ 0 ?DO I (fav) LOOP DROP ;

cr .( seq faverage ) seq faverage f.

\ seq average 1
\ seq 2average 1
\ seq faverage 2.000000 ok

-marcel

Hugh Aguilar

unread,

May 9, 2012, 7:49:35 PM5/9/12

to

On May 9, 6:18 am, Arnold Doray <inva...@invalid.com> wrote:
> If you are implementing this in "Straight Forth", note that for the
> higher order functions to be useful, they have to be lazy, since they act
> on infinite streams, a notion I learnt from Clojure.

Lazy evaluation is way beyond anything that Straight Forth is going to
support. :-) I'm mostly interested in micro-controllers, you know ---
I'm not trying to compete against Lisp --- in fact, I intend to make
Racket my "sister language" that the Straight Forth users will be
encouraged to use for this kind of program.

If you want to run on the JVM, then you should just go with Closure.
Was there something that you didn't like about Closure?

You might be interested in Factor. It has "sequences" with REDUCE and
MAP and so forth. They aren't infinite in size though. Factor does
have GC so they can change size dynamically. My novice package has
lists, which were inspired by Factor's sequences. They aren't infinite
in size either. I don't have GC; I just require the user to explicitly
call ALLOC and DEALLOC as necessary.

I don't really understand why you want lazy evaluation. Isn't that for
situations in which the data is calculated (usually some mathematical
series, such as the Fibonacci or some such), and the lazy evaluation
allows the calculation to be postponed until the data is needed? This
isn't your situation at all! You just have a lot of data. It is not
calculated though, but it came from measurements taken by some guy
somewhere. I think that what you really need is virtual memory ---
this way your data doesn't have to all be in memory at one time (as it
does in Factor's sequences, or in my lists in my novice package). For
this, I recommend that you use ye olde Forth 1K blocks --- that is
what they were invented for!

Hugh Aguilar

unread,

May 9, 2012, 7:50:47 PM5/9/12

to

On May 9, 5:49 pm, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
> If you want to run on the JVM, then you should just go with Closure.
> Was there something that you didn't like about Closure?

I meant Clojure --- it has a 'j' in the name.

Arnold Doray

unread,

May 10, 2012, 12:56:54 AM5/10/12

to

On Wed, 09 May 2012 11:02:40 -0700, Paul Rubin wrote:
>
> Does your Forth have garbage collection? It's been done, but of course
> it's not traditional. Without GC it's difficult to use HOF's with much
> flexibility. How do you operate on sequences of heap-allocated
> structures?
>

Yes, I built it on top of Java, so it implicitly uses Java's GC. I'm not
sure what you mean by "operate on sequences of heap-allocated
structures".

>> For example, to sum the first 10,000 squares divsible by 41 : ...
>> 560341738361350000 ok
>
> But that answer is wrong, it's actually the sum of the first 100000 of
> those squares. Some cut-and-paste error, or a program bug?
>

My bad. I ran a version on my PC with the first 100,000 squares. Didn't
feel it was fair for most forths, so I took it down a notch but forgot to
update the answer.

>> How to do this cleanly in mainstream Forth? While you can do this in
>> Haskell, I feel the Forth version is better. YMMV of course.
>
> s = sum $ take 10000 $ [a2 | a <- [1..], let a2 = a*a, a2`mod`41 == 0]
>
> looks natural to me. But say instead of the first 100000 you just want
> the ones up to 100000:
>

Haskell is built for this sort of thing. How would you do it in
"traditional" Forth, which was my question.

> t = takeWhile (<= 100000) [a2 | a <- [1..], let a2 = a*a, a2`mod`41 ==
> 0]
>
> and now in addition to the sum, you also want the count:
>
> print (sum t, length t)
>
> How would you do that in your setup? I guess that asks if the right
> thing happens if you DUP a sequence.
>

Yes, the "right thing" certainly happens when you DUP a sequence. That's
the whole point of using seqs. They are not like Java's iterators.

Using :> as my preferred shorthand for :NONAME here's how you would do it
in Forth:

: squares ( seq -- seq' )
['] square map ;

: sum ( seq -- n )
0 ['] + 1 reduce ;

: length ( seq -- n )
0 ['] 1+ 1 reduce ;

1 2 ... squares :> 1000000 <= ; take-while dup sum . length .

333833500
1000001 ok

How would you do this in "mainstream" forth?

>> A "seq" is an abstract datastream which has operators "head" and
>> "tail".
>> It is like a Lisp list, but potentially infinitely long. An idea I
>> stole from Clojure.
>
> If you want Clojure why don't you use it? Anyway Clojure probably got
> it from Scheme, which got it from even older implementations.

Because I need many other things that Clojure doesn't provide.

I needed a number tower (this Forth handles big integers, reals,
rationals & complex numbers), multitasking, REPL, executing on multiple
remote machines, etc. All of which this Forth supports.

The two big takeaways I got from Clojure was the idea of Seqs and that
lazy HOFs with seqs are a good idea, after listening to a talk by its
designer Rich Hickey (It's somewhere on U-Tube, but I can't find a
link).

In any case, I couldn't bring myself to use Clojure on a daily basis.
It's just too ugly. IMHO. But it does have some nice ideas under the
hood.

Forth is capable of presenting the same ideas much more clearly, I feel.
I think the example above proves it to some extent.

Cheers
Arnold

Paul Rubin

unread,

May 10, 2012, 3:03:01 AM5/10/12

to

Arnold Doray <inv...@invalid.com> writes:
> Yes, I built it on top of Java, so it implicitly uses Java's GC. I'm not
> sure what you mean by "operate on sequences of heap-allocated
> structures".

Forth stacks traditionally have unboxed integers that are also used as
pointers, characters, etc. If you want non-statically-allocated
multi-word structures they have to be managed somehow, either manually
or with GC. Manual allocation doesn't mix very well with HOF-oriented
programming style, so you want GC and ok, it sounds like you have it.

> Haskell is built for this sort of thing. How would you do it in
> "traditional" Forth, which was my question.

Old fashioned imperative loops and counters.

> Yes, the "right thing" certainly happens when you DUP a sequence. That's
> the whole point of using seqs. They are not like Java's iterators.

Interesting. You probably know about Factor (factorcode.org). I don't
know if it supports that though. I'd say what you're doing is way
outside of normal Forth practice.

> 333833500
> 1000001 ok
> How would you do this in "mainstream" forth?

Loops and stack juggling or accumulation variables.

> I needed a number tower (this Forth handles big integers, reals,
> rationals & complex numbers), multitasking, REPL, executing on multiple
> remote machines, etc. All of which this Forth supports.

I thought Scheme and Common Lisp have that numeric tower and am
surprised if Clojure doesn't (it has a CL compatibility layer, I
thought). I'm told Clojure has good multitasking. I don't know about
the other stuff but I'd expect you can get to the usual Java stuff,
since that's the whole point of running under the JVM.

> Forth is capable of presenting the same ideas much more clearly, I feel.
> I think the example above proves it to some extent.

I think the example is quite far from the traditional concepts of Forth.
That's neither a good nor a bad thing, of course. No language is the
be-all or end-all. Forth's strengh is extreme simplicity of
implementation and predictable execution semantics including timing. I
don't think it's realistic to do anything high speed and hard-real-time
in any JVM environment, but Forth is used for that all the time.

Arnold Doray

unread,

May 10, 2012, 4:17:30 AM5/10/12

to

On Thu, 10 May 2012 00:03:01 -0700, Paul Rubin wrote:

>
>> Yes, the "right thing" certainly happens when you DUP a sequence.
>> That's the whole point of using seqs. They are not like Java's
>> iterators.
>
> Interesting. You probably know about Factor (factorcode.org). I don't
> know if it supports that though. I'd say what you're doing is way
> outside of normal Forth practice.
>

Yes, you're right, because "normal" practice is focussed on a much lower
level of absraction - embedded programming, which is Forth's normal
mainstay.

What I saw immediately is that Forth naturally supports much higher
levels , just like Lisp, but far more effectively.

>> 333833500 1000001 ok How would you do this in "mainstream" forth?
>
> Loops and stack juggling or accumulation variables.
>

Of course, Forth is "Turing Complete". However I'd bet the result would
be much harder to debug and understand.

>> I needed a number tower (this Forth handles big integers, reals,
>> rationals & complex numbers), multitasking, REPL, executing on multiple
>> remote machines, etc. All of which this Forth supports.
>
> I thought Scheme and Common Lisp have that numeric tower and am
> surprised if Clojure doesn't (it has a CL compatibility layer, I
> thought). I'm told Clojure has good multitasking. I don't know about
> the other stuff but I'd expect you can get to the usual Java stuff,
> since that's the whole point of running under the JVM.
>

Part of the reason for doing this is that we have a huge load of legacy
code (in Java), which needs to be accessed by non-programmers. I really
don't think they could use Clojure effectively in their work. I can see
them using Forth. It's very simple, and the core language can be easily
extended.

>> Forth is capable of presenting the same ideas much more clearly, I
>> feel.
>> I think the example above proves it to some extent.
>
> I think the example is quite far from the traditional concepts of Forth.
> That's neither a good nor a bad thing, of course. No language is the
> be-all or end-all. Forth's strengh is extreme simplicity of
> implementation and predictable execution semantics including timing. I
> don't think it's realistic to do anything high speed and hard-real-time
> in any JVM environment, but Forth is used for that all the time.

I'm simply saying -- and I hope, proving with real code -- that Forth is
capable of flourishing outside its original design context, and can be
very relevant to solving modern programming problems.

Here is a practical example of reading and parsing a CSV file containing
energy estimates and getting the RMS errors:

: => ( val [name] -- )
create , ;

"scs.csv" open-text => input

\ NB: :> is :NONAME

:> input @ readline ; seq => data

: column ( <string> n -- <string> )
>R " " split to-tuple R> @@ ;

data :> 2 column ; map => E
data :> 3 column ; map => Ei
data :> 4 column ; map => Ew-deficit

: -- ( seq1 seq2 -- seq3 )
' - zipwith ;

Ei Ew-deficit -- => Ew
E Ei -- => Ei-error
E Ew -- => Ew-error

: square ( x -- x*x )
dup * ;

: ^2 ( seq -- seq' )

['] square map ;

: sum ( seq -- n )
0 ['] + 1 reduce ;

: length ( seq -- n )
0 ['] 1+ 1 reduce ;

: average ( seq -- n )
dup sum swap length / ;

Ei-error ^2 average sqrt => RMS-Ei
Ew-error ^2 average sqrt => RMS-Ew

RMS-Ei ? RMS-Ew ?

Cheers,
Arnold

Hugh Aguilar

unread,

May 10, 2012, 4:40:21 AM5/10/12

to

On May 10, 2:17 am, Arnold Doray <inva...@invalid.com> wrote:
> I'm simply saying -- and I hope, proving with real code -- that Forth is
> capable of flourishing outside its original design context, and can be
> very relevant to solving modern programming problems.

Is this Forth of yours proprietary? Or can we play with it? I'm
definitely interested in looking at it.

Hugh Aguilar

unread,

May 10, 2012, 5:07:59 AM5/10/12

to

On May 9, 10:56 pm, Arnold Doray <inva...@invalid.com> wrote:
> I needed a number tower (this Forth handles big integers, reals,
> rationals & complex numbers),

If you want all of these data types, Forth would seem to be an
unlikely choice, what with Forth being an untyped language. :-)

Have you considered making your system a super-set of PostScript? It
has limited dynamic-OOP, which could be extended to include all of
your "number tower" types. Doing PostScript in Java would certainly
put a smile on Don Lancaster's face! PostScript is written in C. I
don't know much about the implementation, but I think that it is
fairly simple --- a lot simpler than Factor, anyway --- you might be
able to port this into Java and also upgrade it to do that other stuff
that you want. Apparently you have already done a huge amount of work
in implementing Forth though, so maybe you are pot committed now.

BTW, I do have rationals in my novice package.

Jython has all of those data types --- have you thought about it? ---
or do you consider it to be ugly like Clojure?

Hans Bezemer

unread,

May 11, 2012, 4:23:07 AM5/11/12

to

Arnold Doray wrote:
> Thank you Elizabeth, but your solution is the plain-vanilla averging, not
> incremental averaging. Incremental averaging is less prone to overflow
> problems (when you need to read many numbers from a large file or have to
> sum squares , eg for RMS or variance calculations).

4tH has got an FP statistical package, which can be obtained here:
http://4th.googlecode.com/svn/trunk/4th.src/lib/statist.4th

It shouldn't be too hard to convert it to vanilla Forth. It features several
averages, standard deviation, variance, etc. And yes, it's incremental.

Hans Bezemer

Albert van der Horst

unread,

May 11, 2012, 5:29:06 AM5/11/12

to

In article <jofhql$2v6$1...@dont-email.me>,

Interesting!
In my Forth I can do
' ID. <wid> FOR-WORDS
and all words from wordlist wid are printed.
That is probably a similar idea.

I think the crux lies in generators like ... (or is that even a good
name).
Can you elaborate a bit on that?

>
>Cheers
>Arnold

Groetjes Albert

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Paul Rubin

unread,

May 11, 2012, 4:49:57 AM5/11/12

to

Albert van der Horst <alb...@spenarnc.xs4all.nl> writes:
> Interesting!
> In my Forth I can do
> ' ID. <wid> FOR-WORDS
> and all words from wordlist wid are printed.
> That is probably a similar idea.
>
> I think the crux lies in generators like ... (or is that even a good
> name). Can you elaborate a bit on that?

It's more like delayed-evaluation streams in Scheme, I think:

http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-24.html#%_sec_3.5

This is also the basic data structure of Haskell, more or less.

Arnold Doray

unread,

May 11, 2012, 6:58:21 AM5/11/12

to

On Fri, 11 May 2012 09:29:06 +0000, Albert van der Horst wrote:
>
> Interesting!
> In my Forth I can do
> ' ID. <wid> FOR-WORDS
> and all words from wordlist wid are printed.
> That is probably a similar idea.
>
> I think the crux lies in generators like ... (or is that even a good
> name).
> Can you elaborate a bit on that?
>
>

Yes, it's a similar idea. But it only scratches the surface. When you
break it down, there are really 4 separate ideas at play.

The first is Higher Order Functions (HOF), which have been the staple of
Lisp/Scheme. HOFs essentially are way to abstract away looping. Instead
of looping constructs like DO...LOOP, the HOFs put the looping into a
function like MAP, REDUCE, FILTER, FOLDL, FOLDR, etc. One theorerical
motivation for this approach is functional programming, which makes
mutable state very hard. Looping constructs need state, so they are
usually verboten in functional languages. But despite the abtuse reason,
it turns out that HOFs are very useful in practical programming.

You can actually have HOFs to a limited extent in "traditional" Forth. Eg:

: terminated? ( x -- f ) .... ;

: generator ( -- x ) ... ;

: transformer ( x -- y ) ... ;

: map { gen trans }
begin
gen execute
dup terminated? invert
while
trans execute
repeat ;

\ Usage:
' generator ' transformer map

This example is not terribly convincing. (Yours is a better example, but
still lacks the other 3 ingredients.)

For one, MAP executes *immediately*, when it is called. It is not "lazy",
ie, evaluated only it is needed. The other problem is that you can't
"save" MAP's result to a variable where it could be reused and composed
with other HOFs. These two aspects are intimately linked. This is the
second ingredients.

The third ingredient is the ability to create/represent lists, which may
be potentially infinitely long. In honor of Clojure, I call them "seq"s.
These are NOT simply functions with state. For example:

variable count
0 count !

: counter ( -- count )
count @
1 count +! ;

While COUNTER is stateful, is NOT a seq, because it cannot be DUPed.

With these 3 ingredients, you can find the sum and number of squares
divisible by 41 and less than 100,000.

\ creates an arithmetic series
: ... ( a b -- seq )

: square dup * ;
: squares ['] square map ;
: sum 0 ['] + 1 reduce ;
: length 0 ['] 1+ 1 reduce ;
: pass? 41 mod 0= ;
: divisible-by-41 ['] pass? filter ;

1 2 ... squares divisible-by-41 :noname 100000 < ; take-while dup sum .
length .

You need all 4 ingredients above to do something like this.

But this is still not compelling because it involves just integers.

So, the last ingredient (I feel) is "boxed" quantities. ie, you can put
more on the stack than just integers. This allows a richer set of things
you can do with HOFs. For example, the Seqs could give you a sequence of
images, lines in a file, etc. Managing boxed quantities effectively means
you need automatic garbage collection.

Cheers,
Arnold

Marcel Hendrix

unread,

May 11, 2012, 4:53:47 PM5/11/12

to

Albert van der Horst <alb...@spenarnc.xs4all.nl> writes Re: Incremental Averaging

> In article <jofhql$2v6$1...@dont-email.me>,
> Arnold Doray <inv...@invalid.com> wrote:

[..]

>>>> For example, to sum the first 10,000 squares divsible by 41 : ...
>>>> 560341738361350000 ok

[..]

: SQR_41MOD? ( ix -- dn TRUE=41_divisible ) DUP UM* 2DUP #41 UM/MOD DROP 0= ;

: SumSQ ( u -- d )
>R 0. 0 BEGIN R@
WHILE DUP >R SQR_41MOD? IF D+ R> R> 1- >R
ELSE 2DROP R>
ENDIF 1+
REPEAT R> 2DROP ;

CR .( Try: #100000 SumSQ D. \ 560249286135000 ok)
CR .( #100000 SumSQ D. \ 560324928361350000 ok)
CR .( #100000000 SumSQ D. \ 560333324928333361350000000 ok)

-marcel

Paul Rubin

unread,

May 11, 2012, 8:01:02 PM5/11/12

to

Arnold Doray <inv...@invalid.com> writes:
>>> 333833500 1000001 ok How would you do this in "mainstream" forth?
>> Loops and stack juggling or accumulation variables.
> Of course, Forth is "Turing Complete". However I'd bet the result would
> be much harder to debug and understand.

Not when you count implementing all the infrastructure (such as the JVM)
needed to get the "abstract" version to do anything. At least one
aspect of Forth culture seems to include blurring/erasing the boundary
between the user program and the system environment, as part of getting
rid of abstraction. That makes certain sense in an embedded product: if
the software does the wrong thing, it must be fixed, no matter what part
of the software is at fault. Taking that approach means "debugging" has
to include fixing any JVM or Java compiler bugs that interfere with your
program's operation (or similarly fixing any bugs in a traditional Forth
that you might use instead). From that perspective, the traditional
Forth is probably easier to debug and understand, just because there's
orders of magnitude less code.

> Part of the reason for doing this is that we have a huge load of legacy
> code (in Java), which needs to be accessed by non-programmers. I really
> don't think they could use Clojure effectively in their work. I can see
> them using Forth.

I'm skeptical but it will be interesting to find out. I think Lisp is
far easier to use because functions have named parameters that you can
refer to without having to juggle the stack (locals in Forth aren't
traditional). I personally find Forth rather difficult (though
enjoyable) even though I'm a programmer relatively comfortable with
unusual languages. Lazy sequences are fundamental to Haskell and
reasoning about their behavior is a notorious obstacle to becoming
productive in Haskell. I have trouble with it myself, which is why I
consider myself not really that far out of the beginner zone in Haskell.

So I think expecting non-programmers to deal with stack manipulation,
lazy evaluation, and (this didn't come up so I'm presuming) the absence
of a type system to explain their errors, sounds wildly optimistic.

> I'm simply saying -- and I hope, proving with real code -- that Forth is
> capable of flourishing outside its original design context, and can be
> very relevant to solving modern programming problems.

I'd say you've got something interesting that isn't really Forth in
spirit, but that's just me and I'm surely no authority on such things.

Besides Factor you might like Joy:

http://en.wikipedia.org/wiki/Joy_%28programming_language%29

> Here is a practical example of reading and parsing a CSV file containing
> energy estimates and getting the RMS errors:

I found that example 1) hard to read, and 2) reminds me of APL. If you
want APL, why not use it?

Hugh Aguilar

unread,

May 12, 2012, 1:27:39 AM5/12/12

to

On May 11, 3:58 am, Arnold Doray <inva...@invalid.com> wrote:
> The other problem is that you can't
> "save" MAP's result to a variable where it could be reused and composed
> with other HOFs.

You really should take a look at my novice package. My EACH and other
similar HOFs (not a term I knew about, but whatever) allow the
"toucher" (my term for the function that gets called for each node) to
access parameters on the stack. This is EACH:

: each ( i*x head 'toucher -- j*x ) \ toucher: i*x node -- j*x

Notice how the toucher has access to the i*x stuff, which it can
change into j*x stuff. EACH hides its own working data on the return
stack during the execution of the toucher so that that data doesn't
get in the way of the toucher accessing the i*x stuff underneath.
Also, the toucher can REMOVE the node from the list, as EACH has
already acquired the next node and stashed its address on the return
stack out of the way.

These functions demonstrate how all of this works:
count-C
purge-C
collect-C
first-C

Arnold Doray

unread,

May 12, 2012, 2:10:09 AM5/12/12

to

On Fri, 11 May 2012 17:01:02 -0700, Paul Rubin wrote:

> Arnold Doray <inv...@invalid.com> writes:
>>>> 333833500 1000001 ok How would you do this in "mainstream" forth?
>>> Loops and stack juggling or accumulation variables.
>> Of course, Forth is "Turing Complete". However I'd bet the result would
>> be much harder to debug and understand.
>
> Not when you count implementing all the infrastructure (such as the JVM)
> needed to get the "abstract" version to do anything. At least one
> aspect of Forth culture seems to include blurring/erasing the boundary
> between the user program and the system environment, as part of getting
> rid of abstraction. That makes certain sense in an embedded product: if
> the software does the wrong thing, it must be fixed, no matter what part
> of the software is at fault. Taking that approach means "debugging" has
> to include fixing any JVM or Java compiler bugs that interfere with your
> program's operation (or similarly fixing any bugs in a traditional Forth
> that you might use instead). From that perspective, the traditional
> Forth is probably easier to debug and understand, just because there's
> orders of magnitude less code.
>

What I'm advocating here is that Forth is capable of much more than just
being relevant to embedded programming. So the question is "what is
Forth?". I don't know. But what I am coming to appreciate is CM's spirit
of innovation - he's always advocated finding your own Forth. I used to
think that this "zeitgeist" was counterproductive for Forth as a
language, for obvious reasons. But perhaps there is a middle ground
between standardization and innovation?

I know that Java at least has taken this tack, with separate editions for
different environements.

Coming back to your objection above, I cannot agree. I wrote this Forth
in Java in a few days. I am in no way an expert in these matters, but it
just worked. A lot of it was just bootstrapped using Forth.

From this experience, I believe once you get the VM right (in this case I
used the JVM), the rest is dead easy.

I must empahasize that I have not changed Forth -- I can't agree with
approaches like Reva that change the language substantially.

In the Forth I wrote, all the constructs you are used to in "traditional"
Forth will work. Well, at least the ones I've bothered to implement :). I
have only added other things like boxed quantities, seqs, etc. But the
thinking about these things is trivially the same.

There are some things that are a little different, one of which is
operator overloading (not in the C++ sense). For example, the / operator
will work on ints, floats, rationals and complex numbers. Eg, to divide
45 + 3i and 2/3 + 3i, you'd do just:

45::3 2/3::3 / .

{351/85 , -1197/85i} ok

But this change is not too hard to grasp surely. It simplifies the
language because there is no need for the additional words for floats,
etc.

One thing I have stayed away from is some notion of type. There are
*huge* benefits with types - eg, the ability to define sets of operations
on a type. For example, you might want to implement quarternions (or
complex numbers), instead of having them be natively defined. But I can't
think of a way to do this that would still feel like "Forth". Any
pointers?

>> Part of the reason for doing this is that we have a huge load of legacy
>> code (in Java), which needs to be accessed by non-programmers. I really
>> don't think they could use Clojure effectively in their work. I can see
>> them using Forth.
>
> I'm skeptical but it will be interesting to find out. I think Lisp is
> far easier to use because functions have named parameters that you can
> refer to without having to juggle the stack (locals in Forth aren't
> traditional). I personally find Forth rather difficult (though
> enjoyable) even though I'm a programmer relatively comfortable with
> unusual languages. Lazy sequences are fundamental to Haskell and
> reasoning about their behavior is a notorious obstacle to becoming
> productive in Haskell. I have trouble with it myself, which is why I
> consider myself not really that far out of the beginner zone in Haskell.
>

All you need are locals really, and most stack juggling disappears.This
makes Forth accessible to even beginners. The lack of syntax makes it
easy to pick up.

I feel "traditional" Forth is only "difficult" when it is pressed to do
work outside its narrow embedded domain. But this could be said of any
language.

I don't think lazy sequences are the problem with Haskell. The problem
with Haskell is that it is lazy everywhere. That's why it's easy to get
lost.

> So I think expecting non-programmers to deal with stack manipulation,
> lazy evaluation, and (this didn't come up so I'm presuming) the absence
> of a type system to explain their errors, sounds wildly optimistic.
>

No, I don't think it is hard to understand. I would go so far to say that
it is actually very intuitive for beginners.

What most people find hard is recursive thinking -- upon which much of
the elegance of Lisp/Scheme hinges on. I know because I have had to
teach domain experts. Not something I want to repeat.

>
>> Here is a practical example of reading and parsing a CSV file
>> containing energy estimates and getting the RMS errors:
>
> I found that example 1) hard to read, and 2) reminds me of APL. If you
> want APL, why not use it?

LOL. I'll probably get the same response from comp.lang.apl -- "What
you're doing sounds like Forth. Why not use Forth?" :)

Cheers,
Arnold

Paul Rubin

unread,

May 12, 2012, 2:39:37 AM5/12/12

to

Arnold Doray <inv...@invalid.com> writes:
> What I'm advocating here is that Forth is capable of much more than just
> being relevant to embedded programming. So the question is "what is
> Forth?". I don't know. But what I am coming to appreciate is CM's spirit
> of innovation - he's always advocated finding your own Forth.

I dunno. Maybe you're right. Jeff Fox was the big Forth philosopher
around here til we lost him earlier this year, so unfortunately he can't
weigh in. Maybe Andrew has some thoughts. But, these days Chuck seems
to go beyond wanting Forth to be close to the hardware--he wants it to
BE the hardware (e.g. Green Array chips).

> In the Forth I wrote, all the constructs you are used to in "traditional"
> Forth will work. Well, at least the ones I've bothered to implement :).

Nice, I guess. Forth has subtleties that I've personally never really
understood, like the arguments in another thread about POSTPONE. Do
you handle all that? Can your program pass ANS test suites?

> One thing I have stayed away from is some notion of type. ... But I can't

> think of a way to do this that would still feel like "Forth". Any
> pointers?

1. StrongForth http://home.vrweb.de/stephan.becher/forth/
2. Peter Knaggs' thesis http://www.rigwit.co.uk/thesis/index.html
3. Various Forth OO libraries
4. Factor (runtime types) - link given earlier
5. Joy(?) - link given earlier

> What most people find hard is recursive thinking -- upon which much of
> the elegance of Lisp/Scheme hinges on. I know because I have had to
> teach domain experts. Not something I want to repeat.

You're planning to explain how to do everything with HOF's instead.
That doesn't sound easy.

Elizabeth D. Rather

unread,

May 12, 2012, 3:13:01 AM5/12/12

to

On 5/11/12 8:10 PM, Arnold Doray wrote:
> One thing I have stayed away from is some notion of type. There are
> *huge* benefits with types - eg, the ability to define sets of operations
> on a type. For example, you might want to implement quarternions (or
> complex numbers), instead of having them be natively defined. But I can't
> think of a way to do this that would still feel like "Forth". Any
> pointers?

Don't know about quarternions, but I've worked a lot with complex
numbers, 2-vectors, 3-vectors, and other specialized data types. The
Forth approach is simply to add operators with appropriate prefixes to
do operations on them. For example, V+ (add 2 vectors), V*/ (multiply a
vector by a ratio of scalars), etc., to suit the application at hand.

Type very definitely has a place in Forth. The important thing is,
though, that it's managed by the programmer as appropriate for the
situation at hand, as opposed to being a concept implemented and policed
by an all-powerful (which cannot always be all-knowing) compiler.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Arnold Doray

unread,

May 12, 2012, 3:21:21 AM5/12/12

to

Neat, but that was an easy problem. :)

How about this: Find all the pairwise products of squares divisible by 41
and cubes divisible by 63. Sum all such products which are less than
100,000,000,000.

Not a fair challenge at all since this is where HOFs+Seqs shine. But here
is the solution anyway:

: square dup * ;
: cube dup square * ;
: squares ['] square map ;
: cubes ['] cube map ;
: /41? mod 41 0= ;
: /63? mod 63 0= ;
: ** ['] * zipwith ;
: sum 0 ['] + 1 reduce ;

1 2 ...
dup squares ' /41? filter
swap cubes ' /63? filter
**
:noname 100000000000 < ; take-while
sum
.

68887253925 ok

The Forth HOF/seq solution is simple because it is constructive; You only
need to build the parts and put them together like "Lego blocks". There
is no need to explicitly manage state.

Cheers,
Arnold

A. K.

unread,

May 12, 2012, 3:30:10 AM5/12/12

to

I know it's fun, but it's not Forth either. ( prove the contrary! ;-) )

I could have done it easier with CLP, but that's not Forth either. :-)

Arnold Doray

unread,

May 12, 2012, 3:51:34 AM5/12/12

to

On Fri, 11 May 2012 21:13:01 -1000, Elizabeth D. Rather wrote:

> On 5/11/12 8:10 PM, Arnold Doray wrote:
>> One thing I have stayed away from is some notion of type. There are
>> *huge* benefits with types - eg, the ability to define sets of
>> operations on a type. For example, you might want to implement
>> quarternions (or complex numbers), instead of having them be natively
>> defined. But I can't think of a way to do this that would still feel
>> like "Forth". Any pointers?
>
> Don't know about quarternions, but I've worked a lot with complex
> numbers, 2-vectors, 3-vectors, and other specialized data types. The
> Forth approach is simply to add operators with appropriate prefixes to
> do operations on them. For example, V+ (add 2 vectors), V*/ (multiply a
> vector by a ratio of scalars), etc., to suit the application at hand.
>
> Type very definitely has a place in Forth. The important thing is,
> though, that it's managed by the programmer as appropriate for the
> situation at hand, as opposed to being a concept implemented and policed
> by an all-powerful (which cannot always be all-knowing) compiler.
>

"Type" probably isn't the only concept operative here. OCaml has a types,
but you "still" have to explicitly specify / vs /. + vs +. etc.

What I have in mind is an explicit way for conversions to take place --
eg, the *programmer* specifies what happens when a + takes place between
an Integer and a Float. Is an exception thrown? Is the Int cast to a
Float? etc. These decisions are specified by the *programmer* when she
defines "+".

These decisions are NOT made automatically by the compiler.

I have a very limited form of this happening with the Forth I use -- it
has a built-in, self consistent number system for the basic arithmetic
operations ( + , / , * , etc) on ints, floats, rationals and complex
numbers.

But it is hardcoded into the compiler.

What I want to be able to do is to take these specifications out of the
compiler and into the programmer's hands. That would make the compiler
much simpler, and give the programmer the chance to define custom
algerabic systems (matrix multiplications, tensors, etc) with fancy
operations beyond the usual arithmetic ones.

But I don't think this is possible without some type of explicit type
system, which would change Forth too much, IMHO.

Cheers,
Arnold

Arnold Doray

unread,

May 12, 2012, 4:56:13 AM5/12/12

to

On Sat, 12 May 2012 09:30:10 +0200, A. K. wrote:

> I know it's fun, but it's not Forth either. ( prove the contrary! ;-) )

It is an invitation to expand our ideas of Forth.

Really, what is Forth?

Is Forth merely decided by a committee of a select few? Is that "Forth"?
So, if we have a future Forth does that mean that our earlier Forths are
suddenly "not Forths"?

How about Fig-Forth? It's not ANS-compliant. Would you refuse to call it
Forth?

Swift and VFX both have non-ANS module systems. Are they are not Forths?

Or what about Forth with OO? Is it Forth?

You can build HOFs, seqs lazy evaluation etc. (with a lot of hard work)
*entirely* in ANS Forth. From that perspective, it is Forth. I simply
sidestepped the hard work by using Java.

For me, Forth is about simplicity.
Stripping things to their bare essentials.
That is my first guide.

The other is that it has to "feel" like Forth, when you're programming in
it, which is a very subjective thing. Or perhaps, more precisely, it has
to feel "natural".

HOFs and Seqs greatly simplify programming on datastreams. The examples
prove it.

They in no way feel contrived when used with ordinary Forth syntax. On
the contrary, it is the "non-HOF/Seq" method that feels strained, for
this class of problems.

So, I would conclude that Yes, this is Forth.

QED :)

Cheers,
Arnold

Hugh Aguilar

unread,

May 12, 2012, 5:21:47 AM5/12/12

to

On May 11, 11:10 pm, Arnold Doray <inva...@invalid.com> wrote:
> There are some things that are a little different, one of which is
> operator overloading (not in the C++ sense). For example, the / operator
> will work on ints, floats, rationals and complex numbers. Eg, to divide
> 45 + 3i and 2/3 + 3i, you'd do just:
>
> 45::3 2/3::3 / .
>
> {351/85 , -1197/85i} ok
>
> But this change is not too hard to grasp surely. It simplifies the
> language because there is no need for the additional words for floats,
> etc.

I find it hard to grasp. How does / know what data types it is working
with? Is this dynamic-OOP in the sense that the datums on the stack
are tagged with identifiers indicating what type they are? That is
what PostScript does, which is why I suggested that you go with
PostScript. Factor does this also. There are many languages that do
this --- but Forth isn't one of them.

A. K.

unread,

May 12, 2012, 6:03:33 AM5/12/12

to

With all respect, I'd call your Forth extension a domain specific language.

Because using your argumentation, one could as well easily conclude that
everything is assembler.

Howdy
Andreas

Coos Haak

unread,

May 12, 2012, 6:20:42 AM5/12/12

to

Op Sat, 12 May 2012 07:21:21 +0000 (UTC) schreef Arnold Doray:

<snip>

>: square dup * ;
>: cube dup square * ;
>: squares ['] square map ;
>: cubes ['] cube map ;
>: /41? mod 41 0= ;

Prefix Forth?
: /41? 41 mod 0= ;

>: /63? mod 63 0= ;
: /63? 63 mod 0= ;'

>: ** ['] * zipwith ;
>: sum 0 ['] + 1 reduce ;

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html

Arnold Doray

unread,

May 12, 2012, 6:24:30 AM5/12/12

to

On Sat, 12 May 2012 12:20:42 +0200, Coos Haak wrote:

> Op Sat, 12 May 2012 07:21:21 +0000 (UTC) schreef Arnold Doray:
>
> <snip>
>>: square dup * ;
>>: cube dup square * ;
>>: squares ['] square map ; : cubes ['] cube map ;
>>: /41? mod 41 0= ;
>
> Prefix Forth?
> : /41? 41 mod 0= ;
>
>>: /63? mod 63 0= ;
> : /63? 63 mod 0= ;'
>
>>: ** ['] * zipwith ;
>>: sum 0 ['] + 1 reduce ;

Typo. It is postfix. -AD

Arnold Doray

unread,

May 12, 2012, 6:42:01 AM5/12/12

to

On Sat, 12 May 2012 12:03:33 +0200, A. K. wrote:

> With all respect, I'd call your Forth extension a domain specific
> language.
>
> Because using your argumentation, one could as well easily conclude that
> everything is assembler.
>
> Howdy Andreas

Perhaps you are right Andreas, my experience is limited, and I only use
HOFs to process data streams. That's a pretty broad usage, but you could
call it domain specific.

But also bear in mind that these techniques are used very often in
functional languages, like Haskell. It is likely impossible to program in
Haskell without using HOFs. Which I feel is too extreme.

They have their roots in old languages like Lisp (which is not really a
functional language), and have widespread usage there.

So it cannot be domain specific -- they really apply to many domains.

All I am saying is that these ideas are also very "natural" when
expressed in Forth. It surprises me that they are even more
"natural" (IMHO) than languages that traditionally make heavy use of
them.

Cheers,
Arnold

Marcel Hendrix

unread,

May 12, 2012, 7:41:09 AM5/12/12

to

Arnold Doray <inv...@invalid.com> writes Re: Incremental Averaging

> On Fri, 11 May 2012 22:53:47 +0200, Marcel Hendrix wrote:

[..]

> Neat, but that was an easy problem. :)

> How about this: Find all the pairwise products of squares divisible by 41
> and cubes divisible by 63. Sum all such products which are less than
> 100,000,000,000.

This description is not clear enough.

I assume all squares MOD 41 are needed (a), and all cubes MOD 63 (b).
Then look at alle possible cross multiplications of elements from and b.
When these have a result less than 100,000,000,000, add them to a running sum.

> Not a fair challenge at all since this is where HOFs+Seqs shine. But here
> is the solution anyway:

[..]

> 68887253925 ok

> The Forth HOF/seq solution is simple because it is constructive; You only
> need to build the parts and put them together like "Lego blocks". There
> is no need to explicitly manage state.

I get a different result. But you can reproduce my code with HOFs+Seqs
to get a middle ground.

Apparently your compiler decided to not evaluate a and b upto infinity,
based on the "inconvenient" condition that a[i]*b[j] < 100,000,000,000.
It is not obvious to me how many elements of a and b HAVE been calculated
for your result, but clearly less than I do below. (I assumed that list
a's lowest element is 41 and list b's lowest element is 63, which is very
pessimistic).

If you calculated the results in less time than the below Forth code, the
mentioned constructs are clearly worth considering.

-marcel

-- ---------------
#100000000000 =: n
n #63 / SQRT 1+ =: sqrtn
n #41 / S>F 3e 1/F F** F>S 1+ =: cubrn

CREATE _squares HERE sqrtn 2* CELLS DUP ALLOT ERASE
CREATE _cubes HERE cubrn 2* CELLS DUP ALLOT ERASE

:NONAME sqrtn 0 ?DO I I * DUP #41 MOD 0= _squares I DOUBLE[] 2! LOOP ; EXECUTE
:NONAME cubrn 0 ?DO I I * I * DUP #63 MOD 0= _cubes I DOUBLE[] 2! LOOP ; EXECUTE

: square_41mod? ( ix -- n TRUE=41_divisible ) _squares []DOUBLE 2@ ;
: cube_63mod? ( ix -- n TRUE=63_divisible ) _cubes []DOUBLE 2@ ;

: ?use ( dsum1 n1 t1 n2 t2 -- dsum2 )
ROT AND IF UM* 2DUP n S>D D< IF D+ ELSE 2DROP ENDIF
ELSE 2DROP
ENDIF ;

: SumSQ+CUBE ( -- d ) 0.
sqrtn 0 DO
cubrn 0 DO I cube_63mod? J square_41mod? ?use
LOOP
LOOP ;

\ FORTH> timer-reset SumSQ+CUBE (n,3) cr .elapsed 5,927,183,869,635
\ 0.278 seconds elapsed. ok

Paul Rubin

unread,

May 12, 2012, 9:18:58 AM5/12/12

to

Arnold Doray <inv...@invalid.com> writes:
> The Forth HOF/seq solution is simple because it is constructive; You
> only need to build the parts and put them together like "Lego
> blocks". There is no need to explicitly manage state.

That would perhaps most easily be done in normal Forth with an OO
library. Maybe it would even be worth it to create a general stream /
iterator OO class. Instnaces could be plugged together similar to those
sequences.

Paul Rubin

unread,

May 12, 2012, 11:11:31 AM5/12/12

to

m...@iae.nl (Marcel Hendrix) writes:
> I assume all squares MOD 41 are needed (a), and all cubes MOD 63 (b).
> Then look at alle possible cross multiplications of elements from and b.

No it's enumerate the squares ==0 mod 41 in order [a1, a2, ...],
and similarly with the cubes ==0 mod 63 [b1, b2..] and look at
the sequence of pairs [(a1,b1), (a2, b2), ...]
and multiply the terms of the pairs [(a1*b1), (a2*b2), ...]
There are no cross terms like a1*b2 or b2*a1.

> Apparently your compiler decided to not evaluate a and b upto infinity,
> based on the "inconvenient" condition that a[i]*b[j] < 100,000,000,000.

The product sequence is increasing order so he just stops once a product
larger than 10^11 is seen.

BruceMcF

unread,

May 12, 2012, 11:11:39 AM5/12/12

to

If you have a workspace system, where the type information is attached
to the workspace data, then you would have workspace versions of + / *
etc. that take addresses into the workspace and return results as
addresses into the workspace. You'd need a garbage collection system,
but only for items imported into and generated within the workspace.

Elizabeth D. Rather

unread,

May 12, 2012, 1:37:07 PM5/12/12

to

On 5/11/12 9:51 PM, Arnold Doray wrote:
> On Fri, 11 May 2012 21:13:01 -1000, Elizabeth D. Rather wrote:
>

...

>> Type very definitely has a place in Forth. The important thing is,
>> though, that it's managed by the programmer as appropriate for the
>> situation at hand, as opposed to being a concept implemented and policed
>> by an all-powerful (which cannot always be all-knowing) compiler.
>>
>
> "Type" probably isn't the only concept operative here. OCaml has a types,
> but you "still" have to explicitly specify / vs /. + vs +. etc.
>
> What I have in mind is an explicit way for conversions to take place --
> eg, the *programmer* specifies what happens when a + takes place between
> an Integer and a Float. Is an exception thrown? Is the Int cast to a
> Float? etc. These decisions are specified by the *programmer* when she
> defines "+".
>
> These decisions are NOT made automatically by the compiler.
>
> I have a very limited form of this happening with the Forth I use -- it
> has a built-in, self consistent number system for the basic arithmetic
> operations ( + , / , * , etc) on ints, floats, rationals and complex
> numbers.
>
> But it is hardcoded into the compiler.
>
> What I want to be able to do is to take these specifications out of the
> compiler and into the programmer's hands. That would make the compiler
> much simpler, and give the programmer the chance to define custom
> algerabic systems (matrix multiplications, tensors, etc) with fancy
> operations beyond the usual arithmetic ones.
>
> But I don't think this is possible without some type of explicit type
> system, which would change Forth too much, IMHO.

If you want the programmer to select the appropriate form of +, what is
your objection to simply providing the forms relevant in the application
in question, with different names? Dead simple to implement, and it has
the charm of being immediately obvious from looking at the source which
form of + is being used in each instance.

Elizabeth D. Rather

unread,

May 12, 2012, 1:44:21 PM5/12/12

to

On 5/12/12 12:03 AM, A. K. wrote:
...

> With all respect, I'd call your Forth extension a domain specific language.

Which, after all, is what Forth is intended for: a *starting point* from
which you build the tools and the commands to work in the application
domain you're working in, whether it's instrument control, shipping
packages, analyzing municipal bonds, or mathematical analysis.

The point of standards is not to limit you in any way, but to provide a
known baseline for those who wish to port their "domain-specific
language" from one Forth to another.

Marcel Hendrix

unread,

May 12, 2012, 3:20:37 PM5/12/12

to

Paul Rubin <no.e...@nospam.invalid> writes Re: Incremental Averaging

That is barely worth thinking about.

-- -------
#100.000.000.000 ==: n
0 VALUE ix41
0 VALUE ix63

: startup ( -- ) CLEAR ix41 CLEAR ix63 TIMER-RESET ;
: finish ( -- ) (n,3) CR .ELAPSED ;
: square_41mod? ( -- next_n ) BEGIN 1 +TO ix41 ix41 DUP * DUP #41 MOD WHILE DROP REPEAT ;
: cube_63mod? ( -- next_n ) BEGIN 1 +TO ix63 ix63 DUP DUP * * DUP #63 MOD WHILE DROP REPEAT ;
: ?use ( dsum1 n1 n2 -- dsum2 bool ) UM* 2DUP n D< IF D+ FALSE ELSE 2DROP TRUE ENDIF ;
: SumSQ+CUBE ( -- d ) startup 0. BEGIN cube_63mod? square_41mod? ?use UNTIL finish ;

\ FORTH> SumSQ+CUBE 68,887,253,925
\ 0.001 seconds elapsed. ok
-- -------

-marcel

Arnold Doray

unread,

May 13, 2012, 12:06:05 PM5/13/12

to

That's another neat solution.

You made a good observation in your earlier post -- that for an
exhaustive search over all pairs, HOFs are not the way to go. But that
does not invalidate their utility. HOFs are just one tool to use in the
appropriate circumstance.

I feel your solutions are far more "opaque" compared to the HOF ones I
presented, which read like a recipe. YMMV of course. But I think the
widespread usage of HOFs in other languages validates their general
utility.

The HOF solution is comparable in speed to your imperative one above. I
used a version of your solution on my slow Forth (it runs on Java, uses
boxed integers, is not compiled to bytecode and has no optimization
whatsoever) at 3ms - 8ms (once JIT kicks in) for both solutions.

Cheers,
Arnold

Arnold Doray

unread,

May 13, 2012, 12:39:07 PM5/13/12

to

Firstly, in many instances, you can't tell ahead of time what to expect.
For example, suppose you have an intelligent SQRT that gives i for -1
SQRT. Anything downstream from that needs to act appropriately to handle
complex numbers. Or more mundanely, you might want to add a float and an
int. The method I advocate obviates the need to sprinkle your code with
explicit type casts.

Second, it can avoid creating separate functions for each type. So, this
has the potential of cutting down a *lot* of repetition in your code. For
example, suppose you write code to sort a set of integers. If your
comparison operator only works for integers, then you have to repeat the
same code to sort floats.

The third advantage is that you can incrementally create more complex
closed systems. For example, floats are closed with respect to * and +
(ie, float float * = float , float float + = float ). If you had a system
like I describe, the programmer could incrementally add new definitions
for * and + to expand them to include complex numbers. Or matrices, etc.

Many languages (including Java) partially "solve" item (1) by getting the
compiler to automatically perform type casting. This is a hack, IMHO.
Languages like Haskell go to the other extreme by requiring types to be
specified for every function, which can be tedious. The situation is
compounded in Forth, since Forth definitions are short and numerous.
From the little I've seen of it, StrongForth goes this route too.

Cheers,
Arnold

A. K.

unread,

May 13, 2012, 1:31:20 PM5/13/12

to

On 13.05.2012 18:39, Arnold Doray wrote:

> The third advantage is that you can incrementally create more complex
> closed systems. For example, floats are closed with respect to * and +
> (ie, float float * = float , float float + = float ). If you had a system
> like I describe, the programmer could incrementally add new definitions
> for * and + to expand them to include complex numbers. Or matrices, etc.
>
> Many languages (including Java) partially "solve" item (1) by getting the
> compiler to automatically perform type casting. This is a hack, IMHO.
> Languages like Haskell go to the other extreme by requiring types to be
> specified for every function, which can be tedious. The situation is
> compounded in Forth, since Forth definitions are short and numerous.
> From the little I've seen of it, StrongForth goes this route too.
>
> Cheers,
> Arnold

In the end with operator overloading one has to make a choice between
early binding and late binding. Late binding makes for pretty code but
has a runtime penalty. With early binding you have to give the compiler
some type information. And overloading has a tendency to exponential
growing complexity, the more types the operators have to digest.

Eg. say one wants to use + for string concatenation and also for scalar
numeric vector addition. One has to tell the compiler that character
vectors are strings and no numeric vectors...

Elizabeth D. Rather

unread,

May 13, 2012, 2:29:51 PM5/13/12

to

I obviously misunderstand you. On the one hand, you wrote:

On 5/13/12 6:39 AM, Arnold Doray wrote:
> On Sat, 12 May 2012 07:37:07 -1000, Elizabeth D. Rather wrote:
>
>> On 5/11/12 9:51 PM, Arnold Doray wrote:

...

>>> What I have in mind is an explicit way for conversions to take place --
>>> eg, the *programmer* specifies what happens when a + takes place
>>> between an Integer and a Float. Is an exception thrown? Is the Int cast
>>> to a Float? etc. These decisions are specified by the *programmer* when
>>> she defines "+".
>>>
>>> These decisions are NOT made automatically by the compiler.

...

>>>
>>> What I want to be able to do is to take these specifications out of the
>>> compiler and into the programmer's hands. That would make the compiler
>>> much simpler, and give the programmer the chance to define custom
>>> algerabic systems (matrix multiplications, tensors, etc) with fancy
>>> operations beyond the usual arithmetic ones.

...and I said:

>> If you want the programmer to select the appropriate form of +, what is
>> your objection to simply providing the forms relevant in the application
>> in question, with different names? Dead simple to implement, and it has
>> the charm of being immediately obvious from looking at the source which
>> form of + is being used in each instance.

Now you seem to want some kind of automatic type detection:

> Firstly, in many instances, you can't tell ahead of time what to expect.
> For example, suppose you have an intelligent SQRT that gives i for -1
> SQRT. Anything downstream from that needs to act appropriately to handle
> complex numbers. Or more mundanely, you might want to add a float and an
> int. The method I advocate obviates the need to sprinkle your code with
> explicit type casts.
>
> Second, it can avoid creating separate functions for each type. So, this
> has the potential of cutting down a *lot* of repetition in your code. For
> example, suppose you write code to sort a set of integers. If your
> comparison operator only works for integers, then you have to repeat the
> same code to sort floats.
>
> The third advantage is that you can incrementally create more complex
> closed systems. For example, floats are closed with respect to * and +
> (ie, float float * = float , float float + = float ). If you had a system
> like I describe, the programmer could incrementally add new definitions
> for * and + to expand them to include complex numbers. Or matrices, etc.
>
> Many languages (including Java) partially "solve" item (1) by getting the
> compiler to automatically perform type casting. This is a hack, IMHO.
> Languages like Haskell go to the other extreme by requiring types to be
> specified for every function, which can be tedious. The situation is
> compounded in Forth, since Forth definitions are short and numerous.
> From the little I've seen of it, StrongForth goes this route too.

I'm sorry, but in most cases you need different code to operate on
different kinds of data. Are you advocating a + operator that would
examine the data (where? the data stack? float stack? how many cells?
how does it know?) and somehow pick a code path that will do the right
thing? How is this either "programmer choice" or "simple"?

Marcel Hendrix

unread,

May 13, 2012, 5:32:35 PM5/13/12

to

Arnold Doray <inv...@invalid.com> writes Re: Incremental Averaging

> On Sat, 12 May 2012 21:20:37 +0200, Marcel Hendrix wrote:

[..]

> That's another neat solution.

> The HOF solution is comparable in speed to your imperative one above. I
> used a version of your solution on my slow Forth (it runs on Java, uses
> boxed integers, is not compiled to bytecode and has no optimization
> whatsoever) at 3ms - 8ms (once JIT kicks in) for both solutions.

The code doesn't run long enough to time it properly. With a special
ns timer utility:

FORTH> SumSQ*CUBE 24,616,520,368,496,750,925
248 microseconds elapsed. ok

This is about 32 times faster than your timing, the speed ratio of a
jetplane to a scooter :-)

-marcel

Arnold Doray

unread,

May 15, 2012, 3:43:02 AM5/15/12

to

On Thu, 10 May 2012 02:07:59 -0700, Hugh Aguilar wrote:

> On May 9, 10:56 pm, Arnold Doray <inva...@invalid.com> wrote:
>> I needed a number tower (this Forth handles big integers, reals,
>> rationals & complex numbers),
>
> If you want all of these data types, Forth would seem to be an unlikely
> choice, what with Forth being an untyped language. :-)
>
> Have you considered making your system a super-set of PostScript? It has
> limited dynamic-OOP, which could be extended to include all of your
> "number tower" types. Doing PostScript in Java would certainly put a
> smile on Don Lancaster's face! PostScript is written in C. I don't know
> much about the implementation, but I think that it is fairly simple ---
> a lot simpler than Factor, anyway --- you might be able to port this
> into Java and also upgrade it to do that other stuff that you want.
> Apparently you have already done a huge amount of work in implementing
> Forth though, so maybe you are pot committed now.
>
> BTW, I do have rationals in my novice package.
>
> Jython has all of those data types --- have you thought about it? --- or
> do you consider it to be ugly like Clojure?

I have been on a long trek down a long road trying to find the simplest
language for impatient domain experts. I have inflicted on them Lisp,
OCaml and Ruby. These are all much too hard, since all they want is to
get on with their work, not program! Most know Fortran, but the nature of
the work is such that it isn't suited to Fortran. I can't see them using
Clojure -- too much hinges on recursion. I can't see them using P/Jyton
-- it's harder than Ruby.

I needed a language that I could use conveniently, change conveniently,
leverage on legacy code and which they could use too. Having found Forth,
I can safely say that it meets all these needs.

I needed to soften the edges a little, by using HOFs, etc, I also needed
multitasking, RPCs and Java connectivity. So I couldn't use ANS Forth.

The big revelation for me is how easy it is to write this kind of Forth
in Java. It gives you the facilities out of the box (like the GC).

Cheers,
Arnold

Arnold Doray

unread,

May 15, 2012, 3:52:30 AM5/15/12

to

On Fri, 11 May 2012 22:27:39 -0700, Hugh Aguilar wrote:

> On May 11, 3:58 am, Arnold Doray <inva...@invalid.com> wrote:
>> The other problem is that you can't "save" MAP's result to a variable
>> where it could be reused and composed with other HOFs.
>
> You really should take a look at my novice package. My EACH and other
> similar HOFs (not a term I knew about, but whatever) allow the "toucher"
> (my term for the function that gets called for each node) to access
> parameters on the stack. This is EACH:
>
> : each ( i*x head 'toucher -- j*x ) \ toucher: i*x node -- j*x
>
> Notice how the toucher has access to the i*x stuff, which it can change
> into j*x stuff. EACH hides its own working data on the return stack
> during the execution of the toucher so that that data doesn't get in the
> way of the toucher accessing the i*x stuff underneath. Also, the toucher
> can REMOVE the node from the list, as EACH has already acquired the next
> node and stashed its address on the return stack out of the way.
>
> These functions demonstrate how all of this works:
> count-C purge-C collect-C first-C

Hugh, I understand what you are doing, but this is only going it half
way.

For HOFs to be really effective, your system needs 2 more ingredients --
lazy evaluation (or "delayed" evaluation as it also called), and for
*that* to really be practical, you need boxed quantities (ie, you need to
keep the sequence in memory somehow, not directly on the stack). The
stack is used to keep track of these boxed entities. Of course, having
such boxed entities presumes a GC.

I wrote in another post on this in much more detail (also in this
thread).

Cheers,
Arnold

Arnold Doray

unread,

May 15, 2012, 4:59:58 AM5/15/12

to

Aren't you are comparing apples to oranges?

You have a very fast Forth. I am using a forth cobbled together in a few
days. A direct comparison is silly and it proves nothing of consequence.

The correct comparison is to compare timings of your imperative algorithm
against the HOF method *both* on MY forth (since your forth does not have
HOFs).

I have done exactly this, and on that basis, your imperative algorithm
and HOF algoritm performed similarly.

I re-ran it for larger N = 10^30:

6073475610503260942236987344898741

The HOF is faster (4400ms) vs the imperative solution (5500ms). I made
sure to run both 10 times to get the JIT to kick in. I think this is
because of my unoptimized DO...LOOP which is in Forth.

The listings are attached below. Remember, this is my Forth where * is
overloaded.

Cheers,
Arnold

\ ------------ IMPERATIVE SOLUTION

: => create , ;

: ^ 1 swap 0 do over * loop swap drop ;
10 30 ^ => n
0 => ix41
0 => ix63

: clear ( xt -- )
0 swap ! ;

: startup ( -- )
ix41 CLEAR ix63 CLEAR TIMER ;

: finish ( n -- )
ELAPSED . . ;

: square_41mod? ( -- next_n )

BEGIN ix41 @ 1 + DUP ix41 !
DUP * DUP 41 MOD WHILE DROP REPEAT ;

: cube_63mod? ( -- next_n )

BEGIN ix63 @ 1 + DUP ix63 !
DUP DUP * * DUP 63 MOD WHILE DROP REPEAT ;

: ?use ( dsum1 n1 n2 -- dsum2 bool )

* 2DUP n @ < IF + FALSE ELSE 2DROP TRUE THEN ;

: SumSQ+CUBE ( -- d )

startup 0 BEGIN cube_63mod? square_41mod? ?use UNTIL finish ;

\ --- HOF Solution

: square dup * ;
: cube dup square * ;
: squares ['] square map ;
: cubes ['] cube map ;

: /41? 41 mod 0= ;

: /63? 63 mod 0= ;
: ** ['] * zipwith ;
: sum 0 ['] + 1 reduce ;

: N< n @ < ;

: SumSQ+CUBE-HOF
timer

1 2 ...
dup squares ['] /41? filter
swap cubes ['] /63? filter
**

['] N< take-while
sum
elapsed . . ;

Arnold Doray

unread,

May 15, 2012, 5:08:56 AM5/15/12

to

On Thu, 10 May 2012 01:40:21 -0700, Hugh Aguilar wrote:

> On May 10, 2:17 am, Arnold Doray <inva...@invalid.com> wrote:
>> I'm simply saying -- and I hope, proving with real code -- that Forth
>> is capable of flourishing outside its original design context, and can
>> be very relevant to solving modern programming problems.
>
> Is this Forth of yours proprietary? Or can we play with it? I'm
> definitely interested in looking at it.

I think it will be ok to release it under the GPL. Give me a week to get
this sorted. If anyone else is interested, mail me at arnold at terra
hyphen weather dot com.

Arnold Doray

unread,

May 15, 2012, 5:18:35 AM5/15/12

to

On Sun, 13 May 2012 19:31:20 +0200, A. K. wrote:
>
> In the end with operator overloading one has to make a choice between
> early binding and late binding. Late binding makes for pretty code but
> has a runtime penalty. With early binding you have to give the compiler
> some type information. And overloading has a tendency to exponential
> growing complexity, the more types the operators have to digest.
>
> Eg. say one wants to use + for string concatenation and also for scalar
> numeric vector addition. One has to tell the compiler that character
> vectors are strings and no numeric vectors...

Very well put.

I use late binding internally.

But with Forth, you can have your cake and eat it.

If speed is your concern, then go "traditional" and define separate
operators for each "type".

If flexibility is your concern, then go with late binding-type operator
overloading.

Either way, that same Forth would allow you to do what you want. It need
not be an either/or decision.

I don't like early binding for obvious reasons. I've heard it said that
static type checking saves you trouble, but in truth, I feel (having used
"duck" typed languages like Lisp or Ruby and strongly typed ones like
Java and Haskell) they are more bother than a help.

The best reason for Types, (IMHO) is the ability to create new types
complete with their operators. I don't think early binding is needed for
this.

Cheers,
Arnold

Arnold Doray

unread,

May 15, 2012, 5:38:13 AM5/15/12

to

On Wed, 09 May 2012 16:49:35 -0700, Hugh Aguilar wrote:

>
> I don't really understand why you want lazy evaluation. Isn't that for
> situations in which the data is calculated (usually some mathematical
> series, such as the Fibonacci or some such), and the lazy evaluation
> allows the calculation to be postponed until the data is needed? This
> isn't your situation at all! You just have a lot of data. It is not
> calculated though, but it came from measurements taken by some guy
> somewhere. I think that what you really need is virtual memory --- this
> way your data doesn't have to all be in memory at one time (as it does
> in Factor's sequences, or in my lists in my novice package). For this, I
> recommend that you use ye olde Forth 1K blocks --- that is what they
> were invented for!

Do read that other post to Albert where I explain this at some length.

In summary, no, you *don't* want lazy evaluation in general.

But you DO want lazy evaluation specifically when dealing with infinite
sequences. In other words, you want your HOFs to be lazy. There are two
reasons: It allows you to duplicate & pass the transformed sequence, and
that allows you to compose things to make complicated sequences.

The example below should answer your question. The data is from a CSV
file on disk.

Cheers,
Arnold

: => create , ;

"scs.csv" open-text => input

\ NB: :> is :NONAME

:> input @ readline ; seq => data

\ Extracts a given column of data from a row of CSV text.
: column ( <string> n -- <string> )
>R "," split R> @@ ;

data :> 2 column ; map => E
data :> 3 column ; map => Ei
data :> 4 column ; map => Ew-deficit

\ Pairwise subtraction of two sequences
: -- ( seq1 seq2 -- seq3 )
' - zipwith ;

Ei Ew-deficit -- => Ew
E Ei -- => Ei-error
E Ew -- => Ew-error

: square ( x -- x*x )
dup * ;

: ^2 ( seq -- seq' )
['] square map ;

: sum ( seq -- n )
0 ['] + 1 reduce ;

: length ( seq -- n )
0 ['] 1+ 1 reduce ;

: average ( seq -- n )
dup sum swap length / ;

Ei-error ^2 average sqrt => RMS-Ei
Ew-error ^2 average sqrt => RMS-Ew

\ Print the two Root Mean Squares.
RMS-Ei ? RMS-Ew ?

Arnold Doray

unread,

May 15, 2012, 6:29:58 AM5/15/12

to

>
> I'm sorry, but in most cases you need different code to operate on
> different kinds of data.

Yes, but.... !

IF you had a way to specify type, you could imbue either your operators
with intelligence to know what to do at runtime (late-binding) or at
compile time by your compiler (early-binding, which I am not advocating).

Let's say you have +. for addition of floats and + for addition of
integers.

In our *hypothetical* Forth supporting types, to get + to do double duty
(to act on floats and ints), you might do this:

\ Definition for Floats
: + (( d d )) +. ;

\ Definition for Float + Int
: + (( d i )) real +. ;

\ Definition for Int + Anything else
: + (( i _ )) swap + ;

And presto, + would work for both integers and floats. The compiler takes
the special (( ... )) stack comments and converts them to late binding
decisions. Because I am using late binding, I am not interested in the
output type.

This is similar to what StrongForth does but (I am not sure of this),
StrongForth uses early binding. This is obviously faster, but requires
"discipline" on behalf of the programmer, since every word needs to be
typed -- both input and output. Which IMHO is rather tedious.

Let's say we now want to extend + to complex numbers.

\ Tells the compiler about a complex number. The type is called CPX.
type cpx

\ makes a boxed complex number
: complex ( n1 n2 -- n1+in2 )
... / Code to box the complex number somehow
cpx set-type ; / Set the type to CPX.

\ This word deconsructs the complex number
: ~complex ( c -- re img ) .... ;

\ Extend + to CPX domain
: + (( cpx cpx ))
~complex >R >R
~complex
swap R> +
swap R> +
complex ;

: + (( cpx _ ))
swap ~complex >R + R> complex ;

And now your + would work on complex numbers , floats, integers, etc.
If this isn't "simple" for professional programmers, I don't know what
is.

The drawback is that late binding is slow, and so is boxing/unboxing. A
GC is likely needed. But if speed is a serious concern, then nothing
stops you from "going traditional" and using separate + operators for
each type.

Cheers,
Arnold

humptydumpty

unread,

May 15, 2012, 1:59:24 PM5/15/12

to

On Sat, 12 May 2012 07:21:21 +0000, Arnold Doray wrote:

> How about this: Find all the pairwise products of squares divisible by
> 41 and cubes divisible by 63. Sum all such products which are less than
> 100,000,000,000.
>

> Not a fair challenge at all since this is where HOFs+Seqs shine. But
> here is the solution anyway:
>

> : square dup * ;
> : cube dup square * ;
> : squares ['] square map ;
> : cubes ['] cube map ;

> : /41? mod 41 0= ;
> : /63? mod 63 0= ;

> : ** ['] * zipwith ;
> : sum 0 ['] + 1 reduce ;
>

> 1 2 ...
> dup squares ' /41? filter
> swap cubes ' /63? filter
> **

> :noname 100000000000 < ; take-while
> sum
> .
>
> 68887253925 ok

>
> The Forth HOF/seq solution is simple because it is constructive; You
> only need to build the parts and put them together like "Lego blocks".
> There is no need to explicitly manage state.
>

> Cheers,
> Arnold

Hi!

Just for maintaining diversity of solutions:
8<---
\ Generator&filters solution; `|' - pipe prefix means filter-word

\ --- Helper Words
\ Borowed from M.L.Gasanenko backtracking papers:
: enter >r ;
: FORWARD postpone r@ postpone enter ; immediate
: BACKWARD postpone rdrop postpone exit ; immediate
\ ---
: :=0e 0e0 f! ;
: f+! dup f@ f+ f! ;
: fdiv? f/ fdup floor f= ;
: sqr fdup f* ;
: cub fdup sqr f* ;
: f2dup fover fover ;
: f2drop fdrop fdrop ;
\ --- Consumable Variables ---
: consumable -1 , here 0e0 f, constant ;
: consumed ( aconsumable -- aflag ) [ -1 cells ] literal + ;
: consumed? ( aconsumable -- flag.consumed ) consumed @ ;
: store ( aconsumable -- ; F: value -- )
dup consumed? IF dup f! consumed off ELSE fdrop drop THEN ;
: consume ( aconsumable -- ; F: -- value )
dup consumed? IF drop 0e0 ELSE dup f@ consumed on THEN ;

\ Problem specific words \
41e0 fconstant M 63e0 fconstant N
1e11 fconstant MAXPRODUCT
\ ======================= \
fvariable sump sump :=0e
consumable Sqrs consumable Cubs

: cub/N ( -- ; F: CB -- )
cub fdup N fdiv? IF Cubs store ELSE fdrop THEN ;

: sqr/M ( -- fincrSQ ; F: SQ -- )
sqr fdup M fdiv? IF Sqrs store ELSE fdrop THEN ;

: |updflags FORWARD nip nip Sqrs consumed? swap Cubs consumed? swap
BACKWARD ;

: |** ( forwd: F: -- product ) ( backw: F: -- )
Sqrs consumed? Cubs consumed? OR 0=
IF Sqrs consume Cubs consume f* FORWARD THEN
BACKWARD ;

: |product<MAX ( forwd: -1 -- -1; F: x -- x ) ( backw: -1 -- 0; F: x
-- )
fdup MAXPRODUCT f< IF FORWARD ELSE fdrop 0= THEN BACKWARD ;

: roots ( -- ; control flows like water in a plant - from roots to stem
and backwards )
1e0 1e0 -1 -1 -1 ( fincrSQ fincrCB fgen ; F: SQcnt CBcnt )
BEGIN dup
WHILE FORWARD
rot dup IF fswap 1e0 f+ fswap THEN
rot dup IF 1e0 f+ THEN
rot
REPEAT f2drop 2drop drop BACKWARD ;

: grows sump :=0e
roots |updflags fdup cub/N fover sqr/M |** |product<MAX sump f+! ;

: 1000harvest
utime 1000 0 do grows loop utime
2swap d- 1000 um/mod nip . ." miliseconds elapsed for 1000 grows." cr
sump f@ f>d d. cr ;
--->8

Test for spforth and gforth:
bo@ou:~/Software/src/forth$ spf4 adclf.f 1000harvest bye ; gforth adclf.fs
-e "1000harvest bye"
60 miliseconds elapsed for 1000 grows.
68887253925
252 miliseconds elapsed for 1000 grows.
68887253925

Have a nice day,
humptydumpty

WJ

unread,

Mar 4, 2013, 3:39:54 AM3/4/13

to

Paul Rubin wrote:

> Arnold Doray <inv...@invalid.com> writes:
> > Yes, not mainstream forthish but Reduce, Map, Zip/ZipWith, etc work much
> > better in Forth than in languages that do support them like Haskell or
> > Lisp. They are useful because they obviate the need to have explicit
> > loops, which means they can handle infinite streams.
>
> Does your Forth have garbage collection? It's been done, but of course
> it's not traditional. Without GC it's difficult to use HOF's with much
> flexibility. How do you operate on sequences of heap-allocated
> structures?
>
> > For example, to sum the first 10,000 squares divsible by 41 : ...
> > 560341738361350000 ok
>
> But that answer is wrong, it's actually the sum of the first 100000 of
> those squares. Some cut-and-paste error, or a program bug?
>
> > How to do this cleanly in mainstream Forth? While you can do this in
> > Haskell, I feel the Forth version is better. YMMV of course.
>
> s = sum $ take 10000 $ [a2 | a <- [1..], let a2 = a*a, a2`mod`41 == 0]

Clojure:

(defn sq [n] (* n n))
(defn pass? [n] (zero? (rem n 41)))

(reduce + (take 10000 (filter pass? (map sq (iterate inc 1)))))

>
> looks natural to me. But say instead of the first 100000 you just want
> the ones up to 100000:
>
> t = takeWhile (<= 100000) [a2 | a <- [1..], let a2 = a*a, a2`mod`41 == 0]
>
> and now in addition to the sum, you also want the count:
>
> print (sum t, length t)
>
> How would you do that in your setup? I guess that asks if the right
> thing happens if you DUP a sequence.
>
> > I am dealing with large geophysical datasets, with a few hundred million
> > entries. I wanted to see the current average as the calculation is being
> > performed. The incremental version allows this with just a little
> > tweaking.
>
> Well you should probably use floating point, but unless you're worried
> about int overflow, keeping a running count and sum seems easier than
> that incremental method.
>
> > A "seq" is an abstract datastream which has operators "head" and "tail".
> > It is like a Lisp list, but potentially infinitely long. An idea I stole
> > from Clojure.
>
> If you want Clojure why don't you use it? Anyway Clojure probably
> got it from Scheme, which got it from even older implementations.

WJ

unread,

Dec 25, 2014, 2:54:27 AM12/25/14

to

Factor:

USING: locals math.functions ;

1 0 0 ! i sum count
[ dup 10000 < ]
[| i sum count |
i sq :> squared
i 1 +
squared 41 divisor?
[ sum squared + count 1 + ]
[ sum count ]
if
]
while drop nip .

560417386135000