I hate seeing separate parsers/interpreters written in Forth.
The following example shows how easily it can be done,
**but it is not meant to be a portable program**.
The first part, between the ---- lines, is the only addition ciforth needs
to process, for example, Pascal programs up to the level of lexical analysis.
The crux is the revectored NAME, which takes the place of WORD, and the
word PREFIX, which removes the need for space-separated keywords
(which look silly in Pascal).
-------------------------------------------------------------
\ NOTE: Charles Moore convention for parameters. (Extra spaces
\ around parameters.) A verb surrounded by spaces denotes truth.
\ Example of an uglified stack comment:
\ For a char return : a (Pascal) word may start with this.
\ ( char -- flag )
\ Return the address just past the parse area.
\ ( -- address ) \ uglified
: EOP SRC CELL+ @ ;
\ Push back the parse pointer by one character.
\ ( -- ) \ uglified
: PP-- -1 PP +! ;
\ All delimiters are prefixes, i.e. they do their own parsing.
DATA delimiters 0 , 128 ALLOT
\ For a char return : a (Pascal) word may start with this.
\ Now do the uglifying yourself.
: ?START delimiters $@ ROT $^ 0<> ;
\ A name now starts with the next non-blank, but ends on a blank
\ or delimiter. Return token , a string constant.
: TOKEN
\ Find first non-blank
_ BEGIN DROP PP@@ ?BLANK NOT OVER EOP = OR UNTIL
\ If we are at starter, return whole parse area.
DUP C@ ?START IF EOP OVER - EXIT THEN
\ A token ends at a blank or the next delimiter
_ BEGIN DROP PP@@ DUP ?BLANK SWAP ?START OR UNTIL
\ A delimiter is not part of the current token
DUP C@ ?START IF PP-- THEN
\ start end --> stringconstant
OVER - ;
'TOKEN 'NAME 2 CELLS MOVE
-----------------------------------------------------------------------
Example, in the Hopkins (Hayes tester) test convention:
{ : pascal-delimiters ":[](){};" ; -> }
{ pascal-delimiters delimiters $! -> }
{ BL ?START &A ?START &[ ?START -> FALSE FALSE TRUE }
{ 0 delimiters ! -> }
\ This is the existing definition for NAME as a reference.
\ Return next word from the input stream. ( -- add len)
: NAME2
_ BEGIN DROP PP@@ ?BLANK NOT OVER EOP = OR UNTIL \ ( -- start )
_ BEGIN DROP PP@@ ?BLANK UNTIL \ ( -- start end )
OVER - ;
---------------------------------
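For readers without ciforth, the scanning rule of TOKEN (skip blanks, then stop at a blank or a delimiter) can be sketched in standard Forth on an ordinary string. This is only an illustration under assumptions: the names `delimiter?`, `blank?`, `token-end?`, `skip-blanks` and `split-token` are hypothetical, the delimiter set is hard-coded, and unlike TOKEN it does not touch the parse pointer or return the whole parse area when it starts on a delimiter (it returns an empty token instead).

```forth
\ Sketch only, standard Forth (Core + String word sets). All names are hypothetical.
: delimiter? ( char -- flag )        \ true if char is one of the delimiters
   >R S" :[](){};"
   BEGIN DUP WHILE
      OVER C@ R@ = IF 2DROP R> DROP TRUE EXIT THEN
      1 /STRING
   REPEAT
   2DROP R> DROP FALSE ;

: blank? ( char -- flag )  BL 1+ < ; \ space and control characters

: token-end? ( char -- flag )  DUP blank? SWAP delimiter? OR ;

: skip-blanks ( c-addr u -- c-addr' u' )
   BEGIN DUP WHILE OVER C@ blank? WHILE 1 /STRING REPEAT THEN ;

\ Split off the leading token; the rest of the string stays underneath.
: split-token ( c-addr u -- c-addr-rest u-rest c-addr-tok u-tok )
   skip-blanks  OVER >R              \ remember where the token starts
   BEGIN DUP WHILE OVER C@ token-end? 0= WHILE 1 /STRING REPEAT THEN
   OVER R@ -  R> SWAP ;              \ token length = rest start - token start

S" foo[3]" split-token TYPE CR TYPE CR
\ prints foo
\ prints [3]
```

The point of the ciforth version is that none of this string juggling is needed there: TOKEN simply replaces NAME and the regular interpreter does the rest.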
\ Now introduce some auxiliary definitions (thunks);
\ the Pascal processing goes only to the level of lexical analysis.
\ The thunks show that the analysis is correct.
\ For body of `CREATE word, return name of that word.
: $ego BODY> >NFA @ $@ ;
\ Define some (Pascal or other) keyword.
: some-keyword CREATE DOES> $ego "|" TYPE TYPE "|" TYPE CR ;
\ Apply xt (a defining word) to all names on the same line.
: a_row: ^J PARSE SAVE SET-SRC BEGIN DUP CATCH UNTIL DROP RESTORE ;
\ With xt (just a word) use all names on the same line to define an alias.
: aliases:
^J PARSE SAVE SET-SRC BEGIN DUP 'ALIAS CATCH UNTIL 2DROP RESTORE ;
-------------------------------------------
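The idea behind a_row: (apply a defining word to every name on the rest of the line) can be approximated in standard Forth without SAVE, SET-SRC and CATCH. A minimal sketch, assuming a Forth-2012 system that provides PARSE-NAME; the name `a-row:` is hypothetical and this is not how ciforth does it.

```forth
\ Sketch only: run the defining word xt once for each name left on the line.
: a-row: ( xt "name1 name2 ..." -- )
   BEGIN
      >IN @ PARSE-NAME NIP           \ peek: is there another name?
   WHILE
      >IN !                          \ un-parse it, so the defining word sees it
      DUP EXECUTE
   REPEAT
   2DROP ;                           \ drop saved >IN and xt

' VARIABLE a-row: aa bb cc           \ defines three variables
7 bb !  bb @ .                       \ prints 7
```

The ciforth version instead lets the defining word run until it throws at the end of the line, which is why it needs CATCH.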
Parsing Pascal goes as follows:
NAMESPACE pascal
INCLUDE pascal.frt
\ Parse a string with only words from `pascal defined.
\ Those words define what actually happens.
: parse-pascal pascal-delimiters delimiters $!
pascal EVALUATE PREVIOUS 0 delimiters ! ;
-----------------------------------
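The same "let the interpreter do the work" idea can be shown with the standard Search-Order word set. This is a hedged sketch, not the ciforth mechanism: `pascal-wl` and `lex` are hypothetical names, tokens must be separated by spaces because standard Forth has no PREFIX, and the search order is reset to FORTH-WORDLIST afterwards rather than restored.

```forth
\ Sketch only: keywords are plain Forth words living in their own wordlist.
WORDLIST CONSTANT pascal-wl
GET-CURRENT  pascal-wl SET-CURRENT   \ new definitions go into pascal-wl
: program    ." Keyword: program" CR ;
: procedure  ." Keyword: procedure" CR ;
: function   ." Keyword: function" CR ;
SET-CURRENT                          \ back to the previous compilation wordlist

\ Interpret a string with only pascal-wl in the search order.
: lex ( c-addr u -- )
   pascal-wl 1 SET-ORDER
   ['] EVALUATE CATCH
   FORTH-WORDLIST 1 SET-ORDER        \ reset the order, also after an error
   THROW ;

S" program procedure function" lex
\ prints Keyword: program
\ prints Keyword: procedure
\ prints Keyword: function
```

A name that is not in `pascal-wl` is rejected by the interpreter itself, and a number is simply pushed on the stack. The catch-all and number prefixes in pascal.frt are what handle those cases in the real lexer.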
This is the lexer for Pascal. It has been tested by
analysing a Lisp implementation in Pascal.
\ $Id: pascal.frt,v 1.13 2017/05/30 22:11:03 albert Exp $
\ Copyright (2016): Albert van der Horst {by GNU Public License version 2}
\ Protected by GPL, quality but no warranty.
\ Definition of Pascal tokenizer, part of the alif project.
: pascal-delimiters "-+*/=<>:;.{}()[]," ;
\ Print class and identification of token.
: .cl+id TYPE ": " TYPE TYPE CR ;
: keyword CREATE DOES> $ego "Keyword" .cl+id ;
: keyword-decl CREATE DOES> $ego "Keyword-decl" .cl+id ;
: ascii-operator CREATE DOES> $ego "Operator keyword" .cl+id ;
: operator CREATE PREFIX DOES> $ego "operator" .cl+id ;
: interpunction CREATE PREFIX DOES> $ego "Interpunction" .cl+id ;
: bracket CREATE PREFIX DOES> $ego "Bracket" .cl+id ;
: number -1 PP +! TOKEN "Number" .cl+id ; PREFIX
pascal DEFINITIONS PREVIOUS \ Make `pascal CURRENT.
: catch-all TOKEN "Identifier" .cl+id ; PREFIX
LATEST >NFA \ Keep on stack to seal the wordlist
'FORTH ALIAS FORTH
'keyword a_row: program begin end procedure function var const
'keyword a_row: if then else do while elsif repeat until
'keyword-decl a_row: packed array of label
'ascii-operator a_row: and or mod div not
'number aliases: 0 1 2 3 4 5 6 7 8 9
'interpunction a_row: . ; : , ..
'operator a_row: - + * /
'operator a_row: = < >
\ Two char operators must hide single char operators/interpunction.
'operator a_row: <> <= >= :=
'bracket a_row: [ ] ( )
: { &} PARSE "Comment" .cl+id ; PREFIX
: ' &' PARSE "String: " TYPE &' EMIT TYPE &' EMIT CR ; PREFIX
: (* "comment(*" TYPE CR
PP @ BEGIN &* PARSE 2DROP PP@@ NIP &) = UNTIL
PP @ OVER - 2 - TYPE CR "*)comment" TYPE CR ; PREFIX
\ Seal the pascal namespace using the address left on the stack,
\ by voiding catch-all's name
HERE 0 , ( nfa emptystring -- ) SWAP !
DEFINITIONS
: pWORDS pascal WORDS PREVIOUS ;
--------------------------------------
Sorry for shouting.
FORTH IS ITS OWN INTERPRETER.
IMPLEMENTING AN INTERPRETER IN FORTH IS AN ABOMINATION.
Greetings.
--
albert@spe&ar&
c.xs4all.nl &=n
http://home.hccnet.nl/a.w.m.van.der.horst