GA144 polyForth


Howerd

Sep 28, 2012, 3:41:27 PM
to Color...@googlegroups.com
Hi All,

I have just downloaded the latest arrayForth and polyForth systems for the GreenArrays GA144 EV001 evaluation board, dusted down the eval board and installed it...

Back in ~1978 I accidentally came across microForth for the COSMAC computer, with a 2 MHz CDP1802 and 12K of RAM, and typed  1 1 + .  for the first time.
Since then I have used various flavours of Forth, and I use this simple test to confirm that I can interact with the computer.

So nothing new here, I still get the answer 2, and Greg's saneForth Terminal looks very DOS-like (except that it actually runs under Win7 64 bit).

But this is actually something very different :
1. There is no assembler because this is running in a few of the F18 cores in one of the GA144 chips which have a Forth instruction set.
2. There is no cross compiler because the GA144 polyForth compiles itself on the chip - the PC is only a terminal.
3. There is no "inner interpreter" AKA "address interpreter" because the F18 cores are programmed to be the polyForth virtual machine. OK, you can argue semantics here...
4. Contradicting point 1, there is a sort of assembler, in that you can define extensions to the virtual machine, and also access any of the other 100+ F18's via Ganglia and Snorkels ( whatever they are - more docs please GA guys :-)

I also ran a speed test :
: asd  1000 for 1000 for 0 drop next next ;
takes about 3 seconds.
IIRC a 16 MHz Novix takes less than 1 second for this, and most 8-bit processors take some tens of seconds.
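As a rough check on those numbers, here is a small Python sketch of the per-pass cost. It assumes each `for ... next` runs its body exactly as many times as the count on the stack (some Forths run one extra pass, which would change the result by about 0.2%), and the function names are made up for illustration.

```python
# Timing sketch for `1000 for 1000 for 0 drop next next`.
# Assumption: each `for ... next` runs its body exactly `count` times.
# Some Forths run count+1 passes; that changes the result by about 0.2%.

def seconds_per_pass(total_seconds, outer, inner):
    """Average time for one pass through the inner loop body."""
    return total_seconds / (outer * inner)

def cycles_per_pass(total_seconds, clock_hz, outer, inner):
    """Clock cycles spent per pass, given the processor clock."""
    return total_seconds * clock_hz / (outer * inner)

polyforth_pass = seconds_per_pass(3.0, 1000, 1000)     # GA144 polyForth, about 3 s total
novix_cycles = cycles_per_pass(1.0, 16e6, 1000, 1000)  # upper bound from "less than 1 second"

print(f"polyForth on GA144: about {polyforth_pass * 1e6:.1f} microseconds per pass")
print(f"16 MHz Novix: at most {novix_cycles:.0f} clock cycles per pass")
```

The Novix figure is only an upper bound, since the recollection above is "less than 1 second".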

Speed wise, the combination of GA144, SPI EEPROM and SRAM, running polyForth looks plenty fast enough for the sort of embedded apps I usually work with.
Power wise, it looks good too.
Cost wise, well maybe I can haggle with GA...
Peripherals - there are plenty of fast counters, F18s in abundance that can be programmed to do simple serial or even 10M Ethernet, or you can use the built-in SERDES.

All in all it could compete with an MSP430, 8051 or a PIC except that only with the GA144 do you get so many fast cores with fast I/O to play with.

The GA144 with polyForth seems to be too good not to use...

Just sharing my excitement - well done to all the GreenArray folks!

Best regards,
Howerd

NickM

Aug 1, 2013, 2:49:49 PM
to color...@googlegroups.com
On 28 September 2012 21:41, Howerd <how...@yahoo.co.uk> wrote:
Hi All,

I have just downloaded the latest arrayForth and polyForth systems for the GreenArrays GA144 EV001 evaluation board, dusted down the eval board and installed it... (snip)
   I also ran a speed test :
: asd  1000 for 1000 for 0 drop next next ;
takes about 3 seconds.
IIRC a 16 MHz Novix takes less than 1 second for this, and most 8-bit processors take some tens of seconds.

(snip)


All in all it could compete with an MSP430, 8051 or a PIC except that only with the GA144 do you get so many fast cores with fast I/O to play with.

The GA144 with polyForth seems to be too good not to use...

Just sharing my excitement - well done to all the GreenArray folks!

Best regards,
Howerd

Nick Maroudas here:  Early reviewers often wondered what one could do with all those cores.  I am intrigued by the possibility that, given a board with hundreds of fast cores (plus a few inbuilt DACs), one could build a much more realistic synth than any currently on the market, because one could model two aspects of musical instruments:
1. distributed systems (vibrating in 2D or 3D space) with cores as "lumped constants"
2. and/or parallel processing (of harmonic partials) with each voice having its own set of cores to handle its own set of partials.
Commercial synths sound terrible to my ears, and their sample rates are too slow to match the realism of modern recording technology (192-384kHz).
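To make the second idea concrete, here is a minimal additive-synthesis sketch in Python. It is only an illustration under stated assumptions: the partials are taken to be exact integer multiples of the fundamental, each call to the hypothetical `partial` function stands in for the work one core might do, and the 192 kHz rate is the lower end of the range mentioned above. None of it is GA144 code.

```python
import math

SAMPLE_RATE = 192_000  # Hz; lower end of the 192-384 kHz range (assumed choice)

def partial(freq_hz, amplitude, n_samples, sample_rate=SAMPLE_RATE):
    """Work for one hypothetical core: a single sine partial."""
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [amplitude * math.sin(step * n) for n in range(n_samples)]

def voice(fundamental_hz, amplitudes, n_samples):
    """Sum the partials of one voice; partial k sits at k * fundamental."""
    parts = [
        partial(fundamental_hz * k, amp, n_samples)
        for k, amp in enumerate(amplitudes, start=1)
    ]
    # Each output sample is the sum of the matching samples from every partial.
    return [sum(column) for column in zip(*parts)]

# 10 ms of A440 with three partials of decreasing strength.
samples = voice(440.0, [1.0, 0.5, 0.25], n_samples=SAMPLE_RATE // 100)
print(len(samples), "samples generated")
```

On a parallel chip the list comprehension in `voice` is the part that would be spread across cores, with one further core doing the summation.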

Re your speed test, I make that 3 seconds per million do loops.  Which is the same order as your 1 sec on a 16MHz Novix.  Always glad to hear of the efficient Novix, though  mine was only 4MHz:).  But is that the best Chuck's latest cpu the GA144 can do?  For comparison, I ran a 1000*1000 do 0 drop loop, on my desktop with MPE's vfxforth running on a 3GHz Pentium, and it took only a couple of millisec. MPE say that their vfxforth compiles to native code. Would the GA144 be faster if it were compiled from native ColorForth?

Caritas,
Nick 




--
Dr NG Maroudas, 47 Hachoresh, Kyriat Tivon, Lower Galilee, Israel 36051 
Tel  Home +972 48 337 315     Cellular +972 547 602 687

*****

 Every action of our lives touches on some chord that will vibrate in eternity.
Sean O'Casey


Dennis Ruffer

Aug 1, 2013, 4:06:11 PM
to Color...@googlegroups.com

Nick,

 

Glad to see some interest and I encourage you to get yourself a chip and see what you can do with it.

 

It is not intended to be a speed demon in its current form, and it certainly isn't expected to beat a desktop computer, unless you consider the number of watts each instruction takes.  I'm not good at calculating all the numbers or even remembering them for that matter, but you can certainly read about all of that on their web site.  I just know that it takes working with the chip before you can appreciate its capabilities.

 

I do know that I can make a faster loop with unext, but putting something onto the stack from within that loop gets to be interesting.

 

I also know that the "native" language of ColorForth is actually Intel x86 machine code, which has nothing to do with the machineForth that the GA144 runs natively.  What Howerd showed was pretty close to "native", but there are all sorts of optimizations that could be applied, to the point that the loop just disappears.  I'm sure that's not your point. ;)

 

I have done some timing with earlier generations of this chip and they tend to be rather complex to do properly.  The hardest part is to eliminate the observer from the results.  However, once again, the GA web site has a lot of information regarding the measurements they have obtained, which are consistent with my observations.

 

So, try it!  You just might like it. ;)

 

DaR


NickM

Aug 2, 2013, 7:45:03 AM
to Color...@googlegroups.com
NickM here:  Pardon my confusion, but the numbers from Chuck's colorforth website are:
"
GreenArrays' Evaluation Board. It has 2 GA144 multi-computer chips, each with 144 f18a computers. Total of 288 computers [each] running at 650 Mips or 194 Gips [total ips for 288 cores]."

I am puzzled because Chuck's original Novix chip was extremely efficient (1 or 2 instructions per cycle) and Howerd's test results confirm its efficiency:  1 loop per 16 clock cycles.  Howerd's Novix board running at 16 MHz delivers loops at a rate near 1 MHz.  So a GA144 board, with cores rated at 650 Mips, should be delivering at least 40M loops/sec, whereas it appears to be working at only about the same rate as a 16 Mips Novix.  That is why I asked whether the GA144 board might possibly run faster if programmed in "native" ColorForth.
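The 40M loops/sec expectation can be reproduced with a few lines of Python. This is only a sketch of the reasoning in the paragraph above: it assumes one F18A instruction does roughly the work of one Novix clock cycle, which is the premise being questioned rather than an established fact, and the constant names are invented for the example.

```python
NOVIX_CLOCK_HZ = 16e6              # 16 MHz Novix
NOVIX_LOOPS_PER_SEC = 1e6          # one million loops in (just under) one second
F18A_RATED_IPS = 650e6             # rating quoted from the colorforth site
OBSERVED_LOOPS_PER_SEC = 1e6 / 3.0 # one million polyForth loops in about 3 s

# Assumption: one F18A instruction is comparable to one Novix clock cycle.
cycles_per_loop = NOVIX_CLOCK_HZ / NOVIX_LOOPS_PER_SEC
expected_loops_per_sec = F18A_RATED_IPS / cycles_per_loop
shortfall = expected_loops_per_sec / OBSERVED_LOOPS_PER_SEC

print(f"Novix: {cycles_per_loop:.0f} cycles per loop")
print(f"Expected at 650 Mips: {expected_loops_per_sec / 1e6:.1f}M loops/sec")
print(f"Observed rate is lower by a factor of about {shortfall:.0f}")
```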

As regards native code, again pardon my confusion, but I read on an early CF blog that CF was the machine language for Chuck's latest chip; so I assumed that CF was only launched on Intel to get PC programmers familiar with CF.  There are only a dozen or so hooks between CF and a PC:  the most elementary Intel instructions plus a few BIOS calls.  I assumed these hooks would be removed and CF would run directly on the new chip. Is this not so?

Having programmed music happily in "native" CF on a PC for a few years, I am looking forward to transferring my skill to the real thing on the real GA144 board.  My questions are not meant to criticize, and I have no objection to shelving my hard-won CF skills and taking up PolyForth, but I am puzzled by these speed tests.

Caritas,
Nick
    


Dennis Ruffer

Aug 2, 2013, 10:23:00 AM
to Color...@googlegroups.com

Ah, I see your confusion.  650 Mips does not translate to faster loops/sec on one "thread" of execution, but multiply the speed times 288 "threads" of execution.  It's a parallel processor and a single core is roughly equivalent to a Novix, but each chip holds the equivalent of 144 Novix chips.  I'm not sure I can explain the speed increase properly, but it's not 1-to-1.  You have to be able to expand the problem up (or down depending on your perspective) before you can take advantage of the parallel concepts.  Simplistically, you just multiply times 144, but reality turns out to be much more complex.
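One common way to put numbers on "it's not 1-to-1" is Amdahl's law, which is not mentioned in this thread and is used here only as an illustration. The sketch assumes the work splits cleanly into a serial part and a perfectly parallel part, and it ignores the cost of passing data between cores, so real results would be lower.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Ideal speedup when only part of the work can be spread over n_cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)

# How much of the 144x is reachable for different degrees of parallelism.
for fraction in (1.0, 0.95, 0.5):
    print(f"{fraction:.0%} parallel on 144 cores: {amdahl_speedup(fraction, 144):.1f}x")
```

Only a problem that is entirely parallel gets the full factor of 144; even a small serial part cuts the gain sharply.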

 

Some of the ColorForth concepts (like for…next) made it into MachineForth, but initially, we did VentureForth on SwiftForth, gforth and MPE Forth, so CF is not required to do MF.  In the end, you only need to know the 32-instruction MF set.  See: http://www.greenarraychips.com/home/documents/greg/DB001-110412-F18A.pdf

--

NickM

Aug 2, 2013, 3:26:32 PM
to Color...@googlegroups.com
NickM here:  Thanks for the explanation, Dennis, but I still don't get it.  In your previous posting you mentioned that one could get a speed increase by using unext. Now Chuck rates unext as follows, for a single node with a single pin driving an oscillator: 

http://www.greenarraychips.com/home/documents/pub/AP002-OSC.html

"At roughly 2.5 nS per unext iteration, the period would be roughly 12,207 loop iterations per cycle [of ~30 us for a 32kHz osc]"


I make this ~2.5 ms for 1 million unext loops; about 1200 times less than the 3 sec delivered by Howerd's 1 million for...next loops.  So I ask again, where does that huge discrepancy in speed arise? Is it in the PolyForth virtual machine or in the SaneForth kernel?  Would it also arise if the GA144 chip were compiled by Intel ColorForth with an interpreter written in native CF?  (Which, I believe, was the case a couple of years ago, with CF & Okad becoming ArrayForth).
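The figures in the AP002 quote can be checked with a short Python sketch. It assumes that "32kHz" means a standard 32.768 kHz watch crystal, which the quote does not state; with that assumption the 12,207 iterations per cycle fall out directly. At 2.5 ns per iteration, one million unext iterations come to 2.5 ms, and 3 seconds is about 1200 times that.

```python
UNEXT_NS = 2.5        # nanoseconds per unext iteration, from the AP002 quote
CRYSTAL_HZ = 32_768   # assumption: "32kHz" is a 32.768 kHz watch crystal
POLYFORTH_SECONDS = 3.0  # Howerd's time for one million for...next loops

period_ns = 1e9 / CRYSTAL_HZ                   # one oscillator cycle in ns
iterations_per_period = period_ns / UNEXT_NS   # compare with the quoted 12,207

million_unext_ms = 1_000_000 * UNEXT_NS / 1e6  # ns converted to ms
ratio = POLYFORTH_SECONDS / (million_unext_ms / 1e3)

print(f"Oscillator period: {period_ns / 1e3:.2f} microseconds")
print(f"unext iterations per period: {iterations_per_period:.0f}")
print(f"One million unext iterations: {million_unext_ms} ms")
print(f"polyForth loop is slower by a factor of about {ratio:.0f}")
```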

Caritas,

Nick




Dennis Ruffer

Aug 2, 2013, 5:22:03 PM
to Color...@googlegroups.com

Nick,

 

I don't see where you are seeing any reference to PolyForth or SaneForth in that article.  The code is obviously ColorForth, based on its color syntax and compiles directly to MachineForth.

 

As for any discrepancy between the article's author and Howerd's measurements, I have no opinion, but I recommend that you do your own measurements, citing my common belief: http://www.spec.org/osg/news/articles/news9412/lies.html

 

DaR

 


Howerd

Aug 2, 2013, 5:29:28 PM
to Color...@googlegroups.com
Hi Nick,

BTW thanks for the email - I only check this group occasionally :-)

The loop speed that I mentioned is for a polyForth running on five F18 cores, using off-chip RAM - it is much slower than a loop on 1 core, which could run 1M loops in ~2.5 ms.
So the : asd ... ; code is high level polyForth code, nothing to do with F18 instructions directly.

> Is it in the PolyForth virtual machine
Yes :-)


> Would it also arise if the G144 chip were compiled by Intel ColorForth with an interpreter written in native CF?
I think you are making this too complicated ;-)

Using ArrayForth ( or anything else ) to create F18 instructions would create code that would run at the F18 speed ( ~600 MIPS).
The point in my original post was that even running a 5-core polyForth gives speeds comparable to an 8051 or MSP430, at similar power levels.
You can always drop down to the F18 assembler code, for speed, too.

BTW I am very interested in music applications in CF and/or GA144 :-)

Best regards,
Howerd

John Drake

Aug 2, 2013, 11:16:00 PM
to Color...@googlegroups.com
Hi guys.  I assumed that was the case.  Thank you for clearing that up.  If Jeff were still with us
I would expect an anecdote from the 4os days.  (I miss Jeff).  I remember years ago Jeff stating
that when implementing eForth on the F21 there were a couple of instructions that were difficult
to implement.  (Pick, roll, etc).  I wonder how close a Forth on the F18 could be to ANS Forth
and still be fast?


NickM

Aug 4, 2013, 3:18:42 PM
to Color...@googlegroups.com
Dear Howerd,

Many thanks for giving a plain answer to a simple question.  I thought this was the case, but
Dennis Ruffer answered differently, or in a way that I could not follow. Having cleared that up, with such a high core speed plus its being a massively parallel processor, the GA144 board looks like a diamond on a Woolworth tray (or a lonely little petunia in an onion patch).  It is also reassuring that high level Forth is not obligatory:  CF is still there for them as learnt to love its sweet simplicity.

Not surprised to hear you are interested in music for GA or CF.  Your Handel on CF4DOS was a treat.  I took it out of DOS and incorporated it into native CF05, where it made a useful  signature tune; Handel would ring out that a floppy was booting properly even when the screen was blank.     


If you email me a postal address I can snailmail a CD that explains what I am doing.  The CD also boots a program in MPE Forth running on Puppy Linux.  I know you are familiar with both because I read your nail-biting account of  bomb disposal work with Michael Pelc of MPE. 
My email: ngmaroudas at gmail dot com

Briefly, muggins had a notion 15 years ago that he would need MHz sample rates to synthesize accurate harmony (Nyquist? just ignore him!).  CF on a 3.2 GHz P4 proved fast enough to sound four-part harmony at bandwidth 16Hz - 32kHz and sampling rate 300 kHz.  It was enough to satisfy my own ears; and to demonstrate that even a 2-core cpu at 3 GHz was too slow; further progress lay in the direction that Chuck was leading us:  the massively parallel processing cpu that GA now offer on a plate.

Ray StMarie and I discussed this a few years ago; unfortunately, illness, relocating, ageing, etc (not to mention Chuck's own trouble with SeaForth) brought a halt to my work with CF.  On the positive side, progress in recording technology has gone way above 300 kHz and sounds great (just listen to free DXD samples from 2L of Norway, on a 200 euro DAC from M2Tech of Italy).  It now looks like Nyquist needs to be supplemented by some new criterion, possibly in the time domain rather than pure frequency.

So it seems to me time to transfer my old CF string quartet app from Intel to GA, and exploit massive parallelism firstly to make a more realistic string sound. I do not know whether better sound will come from physical modelling, using a string of multiple cores as finite elements to model a real instrument; or from present practice, having a matrix of cores to manipulate the frequency response of a recorded instrument; or from a combination of the two approaches.  But it seems like fun to find out.  One thing is certain: a high res recording at 192 - 384 kHz is a memory monster.  In contrast, the score of a really good synthesizer (ie, same high res sample rate) demands much less bulk storage.
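To put rough numbers on the storage point, here is a Python sketch comparing one minute of uncompressed audio with one minute of a note-event score. Every parameter is an assumption chosen for illustration: 24-bit stereo PCM, and a score stored as 3-byte events with two events (on and off) per note, loosely modelled on MIDI, at 40 notes per second across four parts.

```python
def pcm_bytes(sample_rate_hz, seconds, bits_per_sample=24, channels=2):
    """Uncompressed audio size; bit depth and channel count are assumed."""
    return sample_rate_hz * seconds * (bits_per_sample // 8) * channels

def score_bytes(notes_per_second, seconds, bytes_per_event=3, events_per_note=2):
    """Size of a note-event score; the event format is a hypothetical one."""
    return notes_per_second * seconds * bytes_per_event * events_per_note

recording_192k = pcm_bytes(192_000, 60)
recording_384k = pcm_bytes(384_000, 60)
score = score_bytes(40, 60)  # assumed: four parts at ten notes per second each

print(f"1 minute at 192 kHz: {recording_192k / 1e6:.1f} MB")
print(f"1 minute at 384 kHz: {recording_384k / 1e6:.1f} MB")
print(f"1 minute of score:   {score / 1e3:.1f} kB")
```

Even with a generous note rate, the score is several thousand times smaller than the recording under these assumptions.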

Caritas,
Nick



jmdrake

Aug 26, 2014, 8:51:00 AM
to Color...@googlegroups.com
Hello.  Sorry for a reply to such an old post but I haven't been doing a lot of ColorForth/arrayForth lately.  That said, I think it's inaccurate to say that a single 
GA144 core is only as powerful as a single Novix CPU.  According to GreenArrays documentation a SINGLE F18A core approaches 

So why is polyForth so slow on a chip designed to run Forth?  Because the polyForth environment is a multicore virtual machine that emulates the type
of computer that polyForth was designed to run on.  polyForth is an interpreter, not a compiler.  It's to the GA144 what the bash shell is to a Unix machine.
Nobody would expect a Unix shell script to come close to compiled C code and nobody should expect polyForth or eForth to be anything close to native
code on the GA144.  By contrast, the Novix architecture was designed to match what polyForth expects.  I would expect an F18A VM
running on top of a Novix to be slow as well.

As for machineforth versus colorforth, sure you can program a GA144 from an ANS Forth environment.  You could theoretically program it from a
Python environment.  I think that misses the point of Nick's question though.  A loop compiled into native F18A code would be much much faster
than one running on top of the VM needed to support polyForth.  Additionally, ColorForth, while not machineForth, does borrow some ideas from
it such as multiple entry / exit points and tail recursion.  ColorForth is closer to machineForth than is ANS Forth.  Multiple entry points allows
for something I call "fall through factoring".


It's an interesting enough trick, although I haven't had a whole lot of times I've felt compelled to use it.

NickM

Aug 27, 2014, 8:29:27 AM
to Color Forth
Hi, John.
Nick here:  No need to apologize for the late reply; it is good to see continued interest in ColorForth & ArrayForth.  I am still convinced that Chuck's multicore chip could make an outstandingly realistic music synthesizer (required sample freq 96-392 kHz) and your post reassures me that CF is a good choice for speed, being nearest to machine code. But, due to age and ill health, I fear that conventional multi-core technology will market such synthesizers before I can get going on my little GA hobby-board. For anyone who might be interested in a high-resolution synth project, here is a snip from my post of 4 August 2013:

"Briefly, muggins had a notion 15 years ago that he would need MHz sample rates to synthesize accurate harmony (Nyquist? just ignore him!).  CF on a 3.2 GHz P4 proved fast enough to sound four-part harmony at bandwidth 16Hz - 32kHz and sampling rate 300 kHz.  It was enough to satisfy my own ears; and to demonstrate that even a 2-core cpu at 3 GHz was too slow; further progress lay in the direction that Chuck was leading us:  the massively parallel processing cpu that GA now offer on a plate. Ray StMarie and I discussed this a few years ago; unfortunately, illness, relocating, ageing, etc (not to mention Chuck's own trouble with SeaForth) brought a halt to my work with CF.  On the positive side, progress in recording technology has gone way above 300 kHz and sounds great (just listen to free DXD samples from 2L of Norway, on a 200 euro DAC from M2Tech of Italy).  It now looks like Nyquist needs to be supplemented by some new criterion, possibly in the time domain rather than pure frequency.  So it seems to me time to transfer my old CF string quartet app from Intel to GA, and exploit massive parallelism firstly to make a more realistic string sound. I do not know whether better sound will come from physical modelling, using a string of multiple cores as finite elements to model a real instrument; or from present practice, having a matrix of cores to manipulate the frequency response of a recorded instrument; or from a combination of the two approaches.  But it seems like fun to find out.  One thing is certain: present high res recordings at 192 - 384 kHz are memory monsters.  In stark contrast, the midi "score" of a really high-res synthesizer (same sample rate) would demand much less bulk storage."
 
Now I can confirm my guess from 20 years ago: Nyquist 2*24000 Hz is nowadays considered inadequate for realistic sound; modern high-speed DAC designers recommend at least 4*24000 (96 kHz) to reproduce such complex sinusoidal wave forms. DXD recordings (353kHz) are nowadays cheaper (in real terms) than 44kHz CDs were in the 70s. You can now buy a DXD player for 200 euro (eg, Geek or M2Tech).  The only thing you cannot buy yet (at any price) is a DXD synth; and when they come they will be expensive.  Which is why the GA synth still seems a worthwhile project despite agonizingly slow progress.

Caritas,
Nick





--
Dr NG Maroudas, 47 Hachoresh, Kyriat Tivon, Lower Galilee, Israel 36051 
Tel  Home +972 48 337 315     Cellular +972 547 602 687

***

Das tuchtige, und wenn auch falsch,
Wirkt tag fur tag, von haus zu haus;
Das tuchtige, wenns wahrhaft ist,
Wirk uber alle zeiten hinaus. 

-- Wolfgang von Goethe, evolutionary biologist 1749-1832

John Drake

Aug 27, 2014, 10:23:42 AM
to Color...@googlegroups.com
Hey Nick.  Sounds like an interesting project.  Sadly I know little about signal processing.  But your
post brings up another problem that I've kicked around a solution for for several years.  The biggest
problem with ColorForth is code sharing.  There's nothing like SourceForge or Github for ColorForth.
That's largely due to how ColorForth files are stored.  I know blocks can be passed around, but that's
not a user friendly solution.  So here's my proposal.  We need a webserver that understands
ColorForth blocks, can decode and display them.  Then you could have an account, similar to a
Github account, where you could upload CF blocks containing your code.  Eventually I'd like to
add a block based version control system where each time you upload your blocks it saves the
ones that are different.  
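As a rough sketch of the "save only the blocks that are different" idea, the Python below splits two uploads into fixed-size blocks and keeps the ones whose hashes changed. It assumes classic 1024-byte blocks and treats them as opaque bytes; it does not decode ColorForth tokens, it does not record blocks that were removed, and the function names are made up for the example.

```python
import hashlib

BLOCK_SIZE = 1024  # assumption: classic 1024-byte Forth blocks

def split_blocks(image, block_size=BLOCK_SIZE):
    """Cut an uploaded image (bytes) into consecutive fixed-size blocks."""
    return [image[i:i + block_size] for i in range(0, len(image), block_size)]

def changed_blocks(old, new):
    """Return {block number: block bytes} for blocks that are new or different."""
    old_hashes = [hashlib.sha256(block).digest() for block in split_blocks(old)]
    changed = {}
    for number, block in enumerate(split_blocks(new)):
        is_new = number >= len(old_hashes)
        if is_new or hashlib.sha256(block).digest() != old_hashes[number]:
            changed[number] = block
    return changed

# Example: edit one byte in block 1 and append a fourth block.
old_upload = bytes(3 * BLOCK_SIZE)
edited = bytearray(old_upload)
edited[BLOCK_SIZE + 5] = 0xFF
new_upload = bytes(edited) + b"\x01" * BLOCK_SIZE

to_store = changed_blocks(old_upload, new_upload)
print("Blocks to store:", sorted(to_store))
```

Storing the hash list alongside each upload would let a server compare against the previous version without keeping every full image.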

I've found Python code that can decode CF blocks.  


I'll post more about this going forward.

daru...@gmail.com

Aug 27, 2014, 10:53:08 AM
to Color...@googlegroups.com
Please look at my talk last weekend at SVFIG.  I've been converting colorForth to ASCII for many years now.


It provides a complete round trip of cf->af->cf, including the stand alone kernel.

DaR

jmdrake

Aug 27, 2014, 11:19:39 AM
to Color...@googlegroups.com
Thanks!  I've downloaded it and I am reading through the PDF docs now.