Interpreters and parsing in Forth


none albert

Mar 29, 2022, 6:38:32 AM
I hate seeing separate parsers/interpreters written in Forth.
The following example shows how easily it can be done without one,
**but it is not meant to be a portable program**.

The first part, between the ---- lines, is the only addition ciforth needs
to understand, for example, Pascal programs up to lexical analysis.
The crux is the revectored NAME, which takes the place of WORD, and the
word PREFIX, which removes the need for space-separated keywords
(which look silly in Pascal).
-------------------------------------------------------------
\ NOTE: Charles Moore convention for parameters. (Extra spaces
\ around parameters.) A verb surrounded by spaces denotes truth.
\ Example of an uglified stack comment:
\ For a char return : a (Pascal) word may start with this.
\ ( char -- flag )

\ Return the address just past the parse area.
\ ( -- address ) \ uglified
: EOP SRC CELL+ @ ;

\ Push back the parse pointer by one character.
\ ( -- ) \ uglified
: PP-- -1 PP +! ;

\ All delimiters are prefixes, i.e. they do their own parsing.
DATA delimiters 0 , 128 ALLOT

\ For a char return : a (Pascal) word may start with this.
\ Now do the uglifying yourself.
: ?START delimiters $@ ROT $^ 0<> ;

\ A name now starts with the next non-blank, but ends on a blank
\ or delimiter. Return token , a string constant.
: TOKEN
\ Find first non-blank
_ BEGIN DROP PP@@ ?BLANK NOT OVER EOP = OR UNTIL
\ If we are at starter, return whole parse area.
DUP C@ ?START IF EOP OVER - EXIT THEN
\ A token ends at a blank or the next delimiter
_ BEGIN DROP PP@@ DUP ?BLANK SWAP ?START OR UNTIL
\ A delimiter is not part of the current token
DUP C@ ?START IF PP-- THEN
\ start end --> stringconstant
OVER - ;
'TOKEN 'NAME 2 CELLS MOVE

-----------------------------------------------------------------------
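For readers without ciforth, the scanning rule above (skip blanks, then stop at a blank or a delimiter) can be sketched in standard Forth. This is only an illustration under assumptions: the names `delims`, `delim?`, `blank?`, `skip-blanks`, `token` and `.tokens` are hypothetical, the code works on a plain string instead of revectoring NAME, and a delimiter is returned as a one-character token because standard Forth has no PREFIX mechanism.

```forth
\ Hypothetical portable sketch, not ciforth's TOKEN.
: delims ( -- c-addr u ) S" :[](){};" ;

\ True if char is one of the delimiters.
: delim? ( char -- flag )
  delims 0 ?DO
    2DUP I + C@ = IF 2DROP TRUE UNLOOP EXIT THEN
  LOOP 2DROP FALSE ;

\ BL and control characters count as blanks.
: blank? ( char -- flag ) BL 1+ U< ;

\ Drop leading blanks from a string.
: skip-blanks ( c-addr u -- c-addr' u' )
  BEGIN DUP IF OVER C@ blank? ELSE FALSE THEN WHILE 1 /STRING REPEAT ;

\ Split the first token off a string; an empty token means end of input.
: token ( c-addr u -- rest-addr rest-u tok-addr tok-u )
  skip-blanks
  DUP 0= IF 2DUP EXIT THEN          \ nothing left
  OVER C@ delim? IF                 \ a delimiter is a token of its own
    OVER >R 1 /STRING R> 1 EXIT
  THEN
  OVER >R                           \ remember where the token starts
  BEGIN
    DUP IF OVER C@ DUP blank? SWAP delim? OR 0= ELSE FALSE THEN
  WHILE 1 /STRING REPEAT
  OVER R@ - R> SWAP ;               \ length = end - start

\ Print every token of a string between bars, one per line.
: .tokens ( c-addr u -- )
  BEGIN token DUP WHILE ." |" TYPE ." |" CR REPEAT 2DROP 2DROP ;

S" a[i] := b;" .tokens
```

Two-character operators such as `:=` are not recognized by this sketch; the original handles them through PREFIX words that hide the single-character ones.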
Example, by Hopkins test convention:
{ : pascal-delimiters ":[](){};" ; -> }
{ pascal-delimiters delimiters $! -> }
{ BL ?START &A ?START &[ ?START -> FALSE FALSE TRUE }
{ 0 delimiters ! -> }

\ This is the existing definition for NAME as a reference.
\ Return next word from the input stream. ( -- add len)
: NAME2
_ BEGIN DROP PP@@ ?BLANK NOT OVER EOP = OR UNTIL \ ( -- start )
_ BEGIN DROP PP@@ ?BLANK UNTIL \ ( -- start end )
OVER - ;
---------------------------------
\ Now introduce some auxiliary definitions and thunks;
\ the Pascal part goes only to the level of lexical analysis.
\ The thunks show that the analysis is correct.

\ For body of `CREATE word, return name of that word.
: $ego BODY> >NFA @ $@ ;

\ Define some (Pascal or other) keyword.
: some-keyword CREATE DOES> $ego "|" TYPE TYPE "|" TYPE CR ;

\ Apply xt (a defining word) to all names on the same line.
: a_row: ^J PARSE SAVE SET-SRC BEGIN DUP CATCH UNTIL DROP RESTORE ;
\ With xt (just a word) use all names on the same line to define an alias.
: aliases:
^J PARSE SAVE SET-SRC BEGIN DUP 'ALIAS CATCH UNTIL 2DROP RESTORE ;

-------------------------------------------
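The thunk idea (a defining word whose children only report their own name) can also be sketched in standard Forth. Here `$ego` is not available, so this sketch assumes the name is copied into the body at definition time; `name,` and `name@` are hypothetical helpers, and this `keyword` is a stand-in for the ciforth versions, not a replacement.

```forth
\ Hypothetical standard-Forth sketch of a self-announcing keyword.

\ Store a string in data space as: length cell, then the characters.
: name, ( c-addr u -- ) DUP , HERE OVER ALLOT SWAP MOVE ;

\ Fetch the string stored by name, from a body address.
: name@ ( body -- c-addr u ) DUP CELL+ SWAP @ ;

\ Define a word that prints its own name when executed.
: keyword ( "name" -- )
  >IN @ CREATE >IN !      \ let CREATE take the name, then rewind
  PARSE-NAME name,        \ read the same name again and keep a copy
  DOES> ( -- ) name@ ." Keyword: " TYPE CR ;

keyword program
keyword procedure

program     \ prints "Keyword: program"
```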
Parsing pascal goes as follows:

NAMESPACE pascal
INCLUDE pascal.frt

\ Parse a string with only words from `pascal defined.
\ Those words define what actually happens.
: parse-pascal pascal-delimiters delimiters $!
pascal EVALUATE PREVIOUS 0 delimiters ! ;

-----------------------------------
This is the lexer for pascal. It has been tested by
analysing a lisp implementation in pascal.

\ $Id: pascal.frt,v 1.13 2017/05/30 22:11:03 albert Exp $
\ Copyright (2016): Albert van der Horst {by GNU Public License version 2}
\ Protected by GPL, quality but no warranty.
\ Definition of Pascal tokenizer, part of the alif project.

: pascal-delimiters "-+*/=<>:;.{}()[]," ;
\ Print class and identification of token.
: .cl+id TYPE ": " TYPE TYPE CR ;

: keyword CREATE DOES> $ego "Keyword" .cl+id ;
: keyword-decl CREATE DOES> $ego "Keyword-decl" .cl+id ;
: ascii-operator CREATE DOES> $ego "Operator keyword" .cl+id ;
: operator CREATE PREFIX DOES> $ego "operator" .cl+id ;
: interpunction CREATE PREFIX DOES> $ego "Interpunction" .cl+id ;
: bracket CREATE PREFIX DOES> $ego "Bracket" .cl+id ;
: number -1 PP +! TOKEN "Number" .cl+id ; PREFIX


pascal DEFINITIONS PREVIOUS \ Make `pascal CURRENT/
: catch-all TOKEN "Identifier" .cl+id ; PREFIX
LATEST >NFA \ Keep on stack to seal the wordlist
'FORTH ALIAS FORTH
'keyword a_row: program begin end procedure function var const
'keyword a_row: if then else do while elsif repeat until
'keyword-decl a_row: packed array of label
'ascii-operator a_row: and or mod div not
'number aliases: 0 1 2 3 4 5 6 7 8 9
'interpunction a_row: . ; : , ..
'operator a_row: - + * /
'operator a_row: = < >
\ Two char operators must hide single char operators/interpunction.
'operator a_row: <> <= >= :=
'bracket a_row: [ ] ( )
: { &} PARSE "Comment" .cl+id ; PREFIX
: ' &' PARSE "String: " TYPE &' EMIT TYPE &' EMIT CR ; PREFIX
: (* "comment(*" TYPE CR
PP @ BEGIN &* PARSE 2DROP PP@@ NIP &) = UNTIL
PP @ OVER - 2 - TYPE CR "*)comment" TYPE CR ; PREFIX
\ Seal the pascal namespace using the address left on the stack,
\ by voiding catch-all's name
HERE 0 , ( nfa emptystring -- ) SWAP !
DEFINITIONS
: pWORDS pascal WORDS PREVIOUS ;
--------------------------------------

Sorry for shouting.

FORTH IS ITS OWN INTERPRETER.
IMPLEMENTING AN INTERPRETER IN FORTH IS AN ABOMINATION.

Groetjes.
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Zbig

May 31, 2022, 1:41:50 PM
> FORTH IS ITS OWN INTERPRETER.
> IMPLEMENTING AN INTERPRETER IN FORTH IS AN ABOMINATION.

I wonder whether, following that rule, one could implement a Forth-specific
(even simple) filesystem using the existing dictionary facility (I mean „its
logic”). You know: creating a file by analogy with word creation, creating a
(sub)directory = creating a vocabulary, etc.
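A minimal in-memory sketch of that analogy, assuming standard Forth wordlists stand in for directories and CREATEd words for files. The names `file:`, `docs` and `cat` are hypothetical, and nothing here touches a real disk.

```forth
\ Hypothetical sketch: directory = wordlist, file = created word.

\ Create a "file" whose body holds a copy of the given string.
: file: ( c-addr u "name" -- )
  CREATE DUP , HERE OVER ALLOT SWAP MOVE
  DOES> ( -- c-addr u ) DUP CELL+ SWAP @ ;

\ A "directory" is just a wordlist.
WORDLIST CONSTANT docs

\ Make docs the compilation wordlist, create a file in it, switch back.
GET-CURRENT  docs SET-CURRENT
S" hello, world" file: readme
SET-CURRENT

\ Look a file up by name in a directory and print its contents.
: cat ( c-addr u wid -- )
  SEARCH-WORDLIST IF EXECUTE TYPE CR ELSE ." no such file" CR THEN ;

S" readme" docs cat
```

Because `readme` lives only in `docs`, it is not found through the normal search order unless `docs` is added to it, which is roughly what changing directory would mean in this scheme.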

minf...@arcor.de

May 31, 2022, 2:47:35 PM
Just take it as a tongue-in-cheek comment. After all, Forth is a wonderful tool
to build your own DSL interpreter.

dxforth

Jun 1, 2022, 10:24:56 PM
Even better if one can avoid it. Moore once wrote a BASIC compiler in Forth.
Was everyone impressed? Yes - but a lot of effort for something neither he
nor they actually needed.

Marcel Hendrix

Jun 2, 2022, 1:17:47 AM
The most sophisticated tools I know (MATLAB, Octave, Scilab, NGSPICE,
Mathematica, ...) have some kind of interpreter on board. It makes using
them extremely easy and convenient.

-marcel

dxforth

Jun 2, 2022, 2:37:42 AM
By sophisticated you mean complicated? You do all the hard work so that
others can have it easy. Either one is getting handsomely paid for it -
or it's a labour of love :)

Marcel Hendrix

Jun 2, 2022, 4:54:04 AM
On Thursday, June 2, 2022 at 8:37:42 AM UTC+2, dxforth wrote:
> On 2/06/2022 15:17, Marcel Hendrix wrote:
> > On Thursday, June 2, 2022 at 4:24:56 AM UTC+2, dxforth wrote:
[..]
> > The most sophisticated tools I know (MATLAB, Octave, Scilab, NGSPICE,
> > Mathematica, ...) have some kind of interpreter on board. It makes using
> > them extremely easy and convenient.
> By sophisticated you mean complicated? You do all the hard work so that
> others can have it easy. Either one is getting handsomely paid for it -
> or it's a labour of love :)

A side-benefit of having an interpreter is that they have to open up their API,
and keep it that way. Most of the algorithms are based on open source
anyway, but it saves me a lot of time just looking at it or copying bits and
pieces.

-marcel

none albert

Jun 2, 2022, 5:40:45 AM
I guessed that the BASIC compiler originated from Moore himself,
but I found no confirmation of that. What is your source of information?

Groetjes Albert

dxforth

Jun 2, 2022, 9:48:12 PM
On 2/06/2022 19:40, albert wrote:
> In article <t7971l$1csa$1...@gioia.aioe.org>, dxforth <dxf...@gmail.com> wrote:
>>On 1/06/2022 04:47, minf...@arcor.de wrote:
>>> Zbig schrieb am Dienstag, 31. Mai 2022 um 19:41:50 UTC+2:
>>>> > FORTH IS ITS OWN INTERPRETER.
>>>> > IMPLEMENTING AN INTERPRETER IN FORTH IS AN ABOMINATION.
>>>> I wonder if on that rule one could implement Forth-specific (even simple)
>>>> filesystem, using existing dictionary facility (I mean „its logic”). You know:
>>>> creating a file as analogy to word creation, create (sub)directory = create
>>>> vocabulary etc.
>>>
>>> Just take it as tongue-in-cheek comment. After all Forth is a wonderful tool
>>> to build your own DSL interpreter.
>>
>>Even better if one can avoid it. Moore once wrote a BASIC compiler in Forth.
>>Was everyone impressed? Yes - but a lot of effort for something neither he
>>nor they actually needed.
>
> I guessed that the BASIC compiler originated from Moore himself,
> but I found no confirmation of that. What is your source of information?

FD V3N6 "Charles Moore's Basic Compiler Revisited" - M. Perry

It references Moore's "BASIC Compiler in FORTH" FORML Proceedings 1981

dxforth

Jun 2, 2022, 11:00:23 PM
I'm all for other folk writing and maintaining programs for the masses
as it leaves me free to write the small things I need, as I please.

But why go down the hard path of emulating the MATLABs and SPICE in
Forth if those tools already exist? Are they flawed? When I began
my forth compiler, the systems I could afford were rather miserable
so I felt some justification. OTOH I severely miscalculated both
the time it would take and my ability.

Marcel Hendrix

Jun 3, 2022, 1:23:30 AM
On Friday, June 3, 2022 at 5:00:23 AM UTC+2, dxforth wrote:
[..]
> But why go down the hard path of emulating the MATLABs and
> SPICE in Forth if those tools already exist? Are they flawed?

Well, in my view, yes.

And I don't emulate them, I use them by sending a few lines
of text and catch the output (or access their API).

Are all the tools we use just perfect and is there
nothing in them that even slightly bothers us?

-marcel

minf...@arcor.de

Jun 3, 2022, 2:58:23 AM
I used a linear algebra DSL in Forth for adaptive filter coefficient
calculation in the loop. A descendant of Kalman filters.

The DSL emulates classic Matlab syntax as far as it makes sense.
It couldn't have been done with Matlab coder or compiler.

Anton Ertl

Jun 3, 2022, 6:44:33 AM
dxforth <dxf...@gmail.com> writes:
>On 2/06/2022 15:17, Marcel Hendrix wrote:
>> The most sophisticated tools I know (MATLAB, Octave, Scilab, NGSPICE,
>> Mathematica, ...) have some kind of interpreter on board.

And if the interpreter is not designed in from the start, it still
appears in an ad-hoc form, according to Greenspun's tenth rule.

>You do all the hard work so that
>others can have it easy.

Division of labour and toolmaking are new concepts? [Meta: Oh boy,
dxforth's style of "discussion" is contagious.]

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

Marcel Hendrix

Jun 3, 2022, 7:31:58 AM
On Friday, June 3, 2022 at 12:44:33 PM UTC+2, Anton Ertl wrote:
[..]
> And if the interpreter is not designed in from the start, it still
> appears in an ad-hoc form, according to Greenspun's tenth rule.

"Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified,
bug-ridden, slow implementation of half of Common Lisp."

*Exactly* on point for NGSPICE :--(

-marcel

Andy Valencia

Jun 3, 2022, 10:35:19 AM
Marcel Hendrix <m...@iae.nl> writes:
> On Friday, June 3, 2022 at 12:44:33 PM UTC+2, Anton Ertl wrote:
> [..]
> "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified,
> bug-ridden, slow implementation of half of Common Lisp."

We found the same thing, but for programs which do search-for-solution.
In that case, the same statement applies, except "Lisp" -> "Prolog".

Andy Valencia
Home page: https://www.vsta.org/andy/
To contact me: https://www.vsta.org/contact/andy.html