
Forth Standard Library? (was "Requirements on the Forth standard")


Jim Peterson

Jun 17, 2021, 3:14:43 PM

[moved to a new thread, as I'm drifting off topic]

On Friday, June 11, 2021 at 4:33:00 PM UTC-4, Paul Rubin wrote:
> These days you'd bring up a new target using a cross compiler on a
> separate computer...

Yes, you would definitely do this if a cross compiler exists. I'm
more talking about novel, prototypical processors, for which
retargeting gcc (or lcc) would be more of a burden than getting
a Forth system going in assembly.

> I'd also say that the
> Forths used by professionals are way too complicated for the approach
> you're describing.

I'm definitely feeling this, too. There is something to be said for
the forth-on-forth work, that takes a system having a small subset of
Forth implemented and subsequently develops more of the standard using
just the subset words.

On Friday, June 11, 2021 at 7:14:04 PM UTC-4, Ruvim wrote:
> - Forth foundation library (FFL)
> https://irdvo.nl/FFL/

This! Absolutely! I'm not sure why it took me so long to hear about
this. It uses unfortunate naming conventions, and for some reason
the module docs are sorted alphabetically rather than by the categories
given on the front page, but this is the sort of stuff I'm asking
about. Why is something like this not standardized?.. at least
some of the more generic modules (maybe not the GTK-server one).

Also, it would be better to implement it in modules/namespaces like
Krishna mentions, below:

On Friday, June 11, 2021 at 7:54:41 AM UTC-4, Krishna Myneni wrote:
> A modular programming
> framework which supports name reuse, built on top of standard Forth, is
> described at the link below. We have used this framework successfully to
> write and use Forth source libraries (widely used throughout the kForth
> examples) for different applications.
>
> Krishna
>
>
> https://github.com/mynenik/kForth-64/blob/master/doc/modular-forth.pdf

The capabilities mentioned in that paper should probably be standardized,
too.

I know... I know.... I'm pushing to add all these existing things to the
standard, but I really like the language, and dream about its potential
and I think that focusing its progress down a road that brings a lot
of built-in capabilities would really help it succeed, versus the
herding-of-cats approach that gives us the many different implementations
of basic things that I've come across.

On Friday, June 11, 2021 at 7:14:04 PM UTC-4, Ruvim wrote:
> I use names that are qualified by namespaces in the following form:
>
> ns-a::ns-ab::ns-abc::wordname
>
>
> Each namespace is represented by a word list.
> Usually, I define a short alias (with limited scope).
>
> E.g.
> ns-a::ns-ab::ns-abc constant x
> And then use
> x::wordname

I'm not sure how your code works. Perhaps it's something to do with
the recognizers you mention, as to me, "ns-a::ns-ab::ns-abc::wordname"
would be a single (very long) word. I could see maybe "ns-a::" being
a word with the meaning "parse the next word and compile/interpret it
as though ns-a is the first thing in the search order". Then you
could "ns-a:: ns-ab:: ns-abc" all you wanted (but "x::" wouldn't exist.)
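
A minimal sketch of the "ns-a::" idea in standard Forth (Forth-2012
PARSE-NAME plus the Search-Order word set). The wordlist ns-a and the word
answer are hypothetical names for illustration, and for simplicity the
sketch searches only ns-a instead of temporarily placing it first in the
search order:

```forth
wordlist constant ns-a              \ hypothetical namespace

get-current  ns-a set-current       \ compile the next definition into ns-a
: answer ( -- n ) 42 ;
set-current                         \ restore the previous compilation wordlist

\ Parse the next name, look it up in ns-a only, then interpret or compile it.
: ns-a:: ( "name" -- i*x )
  parse-name ns-a search-wordlist   \ -- 0 | xt 1 | xt -1
  dup 0= abort" not found in ns-a"
  0> if execute exit then           \ immediate word: execute in either state
  state @ if compile, else execute then
; immediate

: test ( -- n ) ns-a:: answer ;     \ compiles ns-a's ANSWER into TEST
```

With these definitions, both "ns-a:: answer ." and "test ." should print 42,
even though ns-a never appears in the search order.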

On Friday, June 11, 2021 at 10:40:43 PM UTC-4, Hugh Aguilar wrote:
> Anyway --- please show us your linked-list code --- implementing a linked-list was attempted previously
> by Peter Knaggs at EuroForth, but he failed badly. I doubt that there is anybody on the Forth-200x committee
> who knows how to implement a linked list.

That seems strange. In my mind, linked lists are about the easiest non-
native data structures to implement in Forth (second only to cell arrays).
Anyhow the code is here, with linked lists starting around line 341
(though I now defer to the snl/snn modules in the FFL):

https://pastebin.com/UWb4pZQq

This was an effort I was undertaking to explore the minimal structures
that must already exist within a Forth system, in an effort to say "Hey,
since Forth systems already have these, internally, why don't we expose
them via a standard interface for the user to use" (excluding the stk.*,
string.* and array.* words, that I was looking at as additional work).


> I don't think it is a good idea for the Forth-200x Standard to provide code-libraries as part of the Standard,
> primarily because the Forth-200x committee members are idiots not capable of writing code that works.

I think this misses the point, though. It's only the interface that needs
to be standardized. Proposing implementation source would be more of a
free-for-all from all contributors, where systems developers choose which
source set is the best in their opinion... a sort of friendly competition,
if you will. Source that violates the interface spec or demonstrates a
hole in the spec would, as always, cause debate.


--Jim

none albert

Jun 17, 2021, 4:24:16 PM

In article <53ff0e8d-ea72-4105...@googlegroups.com>,
Jim Peterson <elk...@gmail.com> wrote:
<SNIP>
>On Friday, June 11, 2021 at 10:40:43 PM UTC-4, Hugh Aguilar wrote:
>> Anyway --- please show us your linked-list code --- implementing a linked-list was attempted previously
>> by Peter Knaggs at EuroForth, but he failed badly. I doubt that there is anybody on the Forth-200x committee
>> who knows how to implement a linked list.
>
>That seems strange. In my mind, linked lists are about the easiest non-
>native data structures to implement in Forth (second only to cell arrays).
>Anyhow the code is here, with linked lists starting around line 341
>(though I now defer to the snl/snn modules in the FFL):
>
>https://pastebin.com/UWb4pZQq
>

Each time there is a structure with a field that is a pointer you can make it
into a linked list by filling in the pointer with address of another such
struct. In many if not all Forths, the dictionary consists of structs
that de facto form a linked list, because they point to each other.
Do not take Hugh too seriously.

>
>--Jim

Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Hugh Aguilar

Jun 17, 2021, 11:41:48 PM

On Thursday, June 17, 2021 at 1:24:16 PM UTC-7, none albert wrote:
> In article <53ff0e8d-ea72-4105...@googlegroups.com>,
> Jim Peterson <elk...@gmail.com> wrote:
> <SNIP>
> >On Friday, June 11, 2021 at 10:40:43 PM UTC-4, Hugh Aguilar wrote:
> >> Anyway --- please show us your linked-list code --- implementing a linked-list was attempted previously
> >> by Peter Knaggs at EuroForth, but he failed badly. I doubt that there is anybody on the Forth-200x committee
> >> who knows how to implement a linked list.
> >
> >That seems strange. In my mind, linked lists are about the easiest non-
> >native data structures to implement in Forth (second only to cell arrays).
> >Anyhow the code is here, with linked lists starting around line 341
> >(though I now defer to the snl/snn modules in the FFL):

FFL has bugs.

> >https://pastebin.com/UWb4pZQq

Your code has a bug in it:
----------------------------------------------------------
\ Call xt ( ple -- ) for each element in the list.
: list.foreach ( xt pList -- )
  swap >r
  begin
    list.next
    dup
  while
    >r
    2r@ swap execute
    r>
  repeat
  drop
  r> drop
;
----------------------------------------------------------
If you put the node into another list, doing so will modify the link field (zero it if the node is appended, or set it to the previous first-node if the node is prepended). You have not yet obtained your next-node link from the node, however, so when you go back to the start of your loop and do LIST.NEXT you will get the new link rather than the old link.

This is my code:
----------------------------------------------------------
: each ( i*x head 'toucher -- j*x ) \ toucher: i*x node -- j*x
>r
begin dup while \ -- node \r: -- 'toucher
r@ over .fore @ >r \ -- node 'toucher \r: -- 'toucher next
execute r> repeat drop
rdrop ;
----------------------------------------------------------

> Each time there is a structure with a field that is a pointer you can make it
> into a linked list by filling in the pointer with address of another such
> struct. In many if not all Forth's the dictionary consists of structs
> that de facto form a linked list, because they point to each other.
> Do not take Hugh too seriously.

Albert van der Horst doesn't know what a general--purpose data-structure is.
None of the Forth-200x committee know either.

Don't take the ANS-Forth cult too seriously --- in general, don't take any self-proclaimed programming expert seriously if they are lacking any source-code to show you --- the internet is full of fakes who claim programming expertise but don't have any source-code because they don't know how to write source-code.

A good example of a fake-expert is Peter Knaggs who went to EuroForth and bragged that he is the world's expert on linked-lists in Forth, but he didn't have any source-code and his design was a jumble of nonsense:
https://groups.google.com/g/comp.lang.forth/c/cMa8wV3OiY0/m/INBDVBh0BgAJ

If you want a copy of my novice-package, I will email it to you.
Don't give it to Stephen Pelc --- I don't want him to learn how to program in Forth.

Jim Peterson

Jun 18, 2021, 9:58:48 AM

On Thursday, June 17, 2021 at 11:41:48 PM UTC-4, Hugh Aguilar wrote:
> FFL has bugs.

Maybe it's a fixer-upper :)

Plenty of software has bugs. As programmers with access to the source, it
is our social responsibility to hunt down and eradicate those bugs... or at
least illuminate them. Case in point:

> Your code has a bug in it:

> If you put the node into another list, doing so will modify the link field (zero it if the node is appended, or set it to the previous first-node if the node is prepended). You have not yet obtained your next-node link from the node, however, so when you go back to the start of your loop and do LIST.NEXT you will get the new link rather than the old link.

I see what you're saying. Having a C++ background, I'm of the inclination
to never ever alter the structure of a storage container while iterating
through it, so the idea of doing so didn't occur to me. I imagine adding a
comment that such alterations are not acceptable would be one fix, but I
like your version better.

I attempted to adjust my code to have a similar capability and I believe I
wound up having something virtually identical to yours:

\ Call xt ( ple -- ) for each element in the list.
: list.foreach ( xt pList -- )
  swap >r
  list.next
  begin
    dup
  while
    r@ over list.next >r
    execute
    r>
  repeat
  drop
  r> drop
;

I believe the only difference comes about from what we individually define
to be "a list", where mine is one level of indirection separated from
yours.

It seems to me, though, that if you rewrite the list.next pointers (in
your notation, .fore) for *any* element in the list, you'll have to
rewrite them for *all* elements in the list. When passed a list1 like:

list1: A->B->C->D->E

if 'toucher moves B and D to a new list2, would this not alter the original
list1 so that you end up with:

list1: A->B

list2: D->B

and nodes C and E are forever lost? (This assumes you prepend new items to
list2, as with my list.push; otherwise you get A->B->D and B->D (and a lot
of run-time) if you list.app.)

Unless 'toucher actually looks at the *next* item in the list, not the one
passed in (doing ".fore @" to inspect its contents, decide its fate, and
then adjust lists accordingly using the value upon which you called .fore).
That would also explain the extra level of indirection that I have but
appears to be missing from yours: you pushed it in to 'toucher in order to
have that capability? But in that case, aren't you hitting the same bug by
preemptively storing the next element with ".fore @ >r"?

Or maybe yours is a doubly-linked list? That would explain the presence of
".fore". Perhaps there's an equivalent ".back"?
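
The A->B->C->D->E scenario can be tried out with a small sketch. It assumes
list.next is a plain fetch of a link cell at offset 0 and that list.push
prepends; both definitions, and the node names nA..nE, are stand-ins written
for this example rather than the pastebin code:

```forth
: list.next ( ple -- ple' ) @ ;                  \ assumed: link cell at offset 0
: list.push ( node pList -- ) 2dup @ swap ! ! ;  \ assumed: prepend node to list

\ The revised iterator: fetch the next link before calling xt.
: list.foreach ( xt pList -- )
  swap >r list.next
  begin dup while
    r@ over list.next >r execute r>
  repeat
  drop r> drop ;

: node ( char "name" -- ) create 0 , , ;         \ link cell, then a one-char label
char A node nA  char B node nB  char C node nC
char D node nD  char E node nE

variable list1  0 list1 !
variable list2  0 list2 !
nE list1 list.push  nD list1 list.push  nC list1 list.push
nB list1 list.push  nA list1 list.push           \ list1: A->B->C->D->E

: move-b-and-d ( node -- )                       \ plays the role of 'toucher
  dup nB = over nD = or
  if list2 list.push else drop then ;

: .list ( pList -- )                             \ print the labels in list order
  list.next begin dup while dup cell+ @ emit space list.next repeat drop ;

' move-b-and-d list1 list.foreach
list1 .list cr     \ prints: A B
list2 .list cr     \ prints: D B
```

Under these assumptions the iterator still visits all five nodes, because each
next link is saved before xt runs, but afterwards C and E are no longer
reachable from either list head.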

> If you want a copy of my novice-package, I will email it to you.

I'd be happy to look at it, as now I seem to have more questions than at the
start.

> Don't give it to Stephen Pelc --- I don't want him to learn how to program in Forth.

I won't, though I don't understand the reasoning. We should want everyone to
learn how to program in Forth.

On Thursday, June 17, 2021 at 4:24:16 PM UTC-4, none albert wrote:
> Each time there is a structure with a field that is a pointer you can make it
> into a linked list by filling in the pointer with address of another such
> struct.

This is true, but in order to reuse linked-list code like the list.foreach
routine and others, the link pointer should be at offset 0 in the node
structure, with the user's node data at other offsets. Anything else makes
it a bit awkward to find the actual node structure from a pointer to the node,
and routines like list.foreach, list.find, etc. need to be specialized for
every non-zero offset.
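
As a rough illustration of the offset-0 convention: the node layouts and the
words int-node, pair-node and list.length below are hypothetical, written for
this example only. Because every node starts with its link cell, the same
traversal code serves both layouts:

```forth
: list.next ( ple -- ple' ) @ ;                  \ link cell is at offset 0
: list.push ( node pList -- ) 2dup @ swap ! ! ;  \ prepend node to list

\ Generic: counts the nodes of any list whose nodes keep the link first.
: list.length ( pList -- n )
  0 swap list.next
  begin dup while swap 1+ swap list.next repeat
  drop ;

\ Two unrelated node layouts; the payload follows the link cell.
: int-node  ( n -- node )   align here 0 , swap , ;
: pair-node ( x y -- node ) align here >r 0 , swap , , r> ;

variable ints   0 ints !
variable pairs  0 pairs !
10 int-node ints list.push   20 int-node ints list.push
1 2 pair-node pairs list.push

ints  list.length .    \ prints 2
pairs list.length .    \ prints 1
```

Had the link lived at some other offset, list.next (and every word built on
it) would need that offset passed in or hard-coded per node type.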

> In many if not all Forth's the dictionary consists of structs
> that de facto form a linked list, because they point to each other.

Yes. This was my original motivation for writing the code. I was really just
exploring possible interfaces that standard Forth systems might be able to
expose to the user with minimal effort, as the code likely already exists,
internally.

--Jim

Ruvim

Jun 18, 2021, 1:07:53 PM

On 2021-06-17 22:14, Jim Peterson wrote:
> [moved to a new thread, as I'm drifting off topic]

> On Friday, June 11, 2021 at 7:14:04 PM UTC-4, Ruvim wrote:
>> - Forth foundation library (FFL)
>> https://irdvo.nl/FFL/
>
> This! Absolutely! I'm not sure why it took me so long to hear about
> this. It uses unfortunate naming conventions, and for some reason
> the module docs are sorted alphabetically rather than by the categories
> given on the front page, but this is the sort of stuff I'm asking
> about. Why is something like this not standardized?.. at least
> some of the more generic modules (maybe not the GTK-server one).

1. Why do you need to standardize this? What is a gain? And what does
this standardization mean at all?

2. A very good reason is required to include something into the standard
for a language, when this thing can be already implemented in a standard
way.

[...]


> It's only the interface that needs to be standardized.

What words do you lack to implement such an interface?



[...]

> On Friday, June 11, 2021 at 7:14:04 PM UTC-4, Ruvim wrote:
>> I use names that are qualified by namespaces in the following form:
>>
>> ns-a::ns-ab::ns-abc::wordname
>>
>>
>> Each namespace is represented by a word list.
>> Usually, I define a short alias (with limited scope).
>>
>> E.g.
>> ns-a::ns-ab::ns-abc constant x
>> And then use
>> x::wordname
>
> I'm not sure how your code works. Perhaps it's something to do with
> the recognizers you mention,

Yes, it's implemented via a kind of recognizers: https://git.io/JnET3

> as to me, "ns-a::ns-ab::ns-abc::wordname"
> would be a single (very long) word.

Yes, and it behaves like a single ordinary word.
For example, "POSTPONE" and "'" can be applied to this word.

E.g. the phrase:

' x::wordname execute

or, using a recognizer to obtain the xt:

'x::wordname execute

works as expected.



> I could see maybe "ns-a::" being
> a word with the meaning "parse the next word and compile/interpret it
> as though ns-a is the first thing in the search order". Then you
> could "ns-a:: ns-ab:: ns-abc" all you wanted (but "x::" wouldn't exist.)

The approach "first thing in the search order" is not hygienic enough,
and it can produce unexpected effects if the word changes the search order.

And if you make it hygienic, you probably cannot apply "POSTPONE" and
Tick "'" to this thing.



--
Ruvim

Jim Peterson

Jun 18, 2021, 1:31:32 PM

On Friday, June 18, 2021 at 1:07:53 PM UTC-4, Ruvim wrote:
> 1. Why do you need to standardize this? What is a gain? And what does
> this standardization mean at all?
>
> 2. A very good reason is required to include something into the standard
> for a language, when this thing can be already implemented in a standard
> way.

The very good reason is the enhanced productivity to the users of the
language. There are many standardized words in Forth that could already
be defined with other standard words, so clearly the fact that it can
already be implemented is not a deterrent.

Take, for instance, C++. It has many standard libraries for linked lists,
maps, strings, vectors, deques, etc. All of these things could have been
left for implementation by the user, as far as the C++ standardization
committee was concerned, but they weren't. They were included as part of
the standard. I propose that Forth should do something very similar, since
whenever I start to think about doing something serious in it, I suddenly
want such capability. FFL certainly raises the bar, which is why I think
the standard should condone it, or something like it, and it should become
more integrated into most Forth systems.

> > It's only the interface that needs to be standardized.
> What words do you lack to implement such an interface?

I'm not sure what you're asking, here.

> > I could see maybe "ns-a::" being
> > a word with the meaning "parse the next word and compile/interpret it
> > as though ns-a is the first thing in the search order". Then you
> > could "ns-a:: ns-ab:: ns-abc" all you wanted (but "x::" wouldn't exist.)
> The approach "first thing in the search order" is not hygienic enough,
> and it can produce unexpected effects if the word changes the search order.
>
> And if you make it hygienic, you probably cannot apply "POSTPONE" and
> Tick "'" to this thing.

Yes. That's why "x::" wouldn't exist in the scheme. It's definitely a
hacked scheme, but not knowing how recognizers work, I don't know of any
other way to succinctly approach the usability you mention.

--Jim

Ruvim

Jun 18, 2021, 4:17:25 PM

On 2021-06-18 20:31, Jim Peterson wrote:
> On Friday, June 18, 2021 at 1:07:53 PM UTC-4, Ruvim wrote:
>> 1. Why do you need to standardize this? What is a gain? And what does
>> this standardization mean at all?
>>
>> 2. A very good reason is required to include something into the standard
>> for a language, when this thing can be already implemented in a standard
>> way.
>
> The very good reason is the enhanced productivity to the users of the
> language.

If a user can use already some library in any standard system, why do
you need to standardize this library? How can this standardization
enhance productivity of the user?

And then, how to determine, what should be included into the standard,
and what should not?

And after all, you can try to design a proposal. Why not?


> There are many standardized words in Forth that could already
> be defined with other standard words,

Yes, but they are already there due to the historical reasons.

> so clearly the fact that it can
> already be implemented is not a deterrent.

I think it's a deterrent to some extent, but not a critical factor.



> Take, for instance, C++. It has many standard libraries for linked lists,
> maps, strings, vectors, deques, etc. All of these things could have been
> left for implementation by the user, as far as the C++ standardization
> committee was concerned, but they weren't. They were included as part of
> the standard. I propose that Forth should do something very similar, since
> whenever I start to think about doing something serious in it, I suddenly
> want such capability. FFL certainly raises the bar, which is why I think
> the standard should condone it, or something like it, and it should become
> more integrated into most Forth systems.



Actually, I think that a standard library is a good thing.
But it can (and should, if any) be standardized separately.


--
Ruvim

dxforth

Jun 19, 2021, 5:33:18 AM

On 19/06/2021 06:17, Ruvim wrote:
> ...
> Actually, I think that a standard library is a good thing.
> But it can (and should, if any) be standardized separately.

When folks refer to a standard they usually mean a standard as they see
it in other languages - namely a comprehensive set of functions designed
to address everyday needs. It's unlikely they're interested in arcane
things clf likes to talk about. They're less interested in clever.
Differences between systems is anathema to them. I understand the appeal
- I just doubt Forth is capable of delivering it. Forth is a language
for geeks; and it has delivered a standard for geeks. Being geeks, of
course they're unhappy with it. Nobody has ever asked 'newbies' what
they expect from a Forth Standard because it's never been about them.

Ruvim

Jun 19, 2021, 8:47:06 AM

On 2021-06-19 12:33, dxforth wrote:
> On 19/06/2021 06:17, Ruvim wrote:
>> ... Actually, I think that a standard library is a good thing.
>> But it can (and should, if any) be standardized separately.
>
> When folks refer to a standard they usually mean a standard as they see
> it in other languages - namely a comprehensive set of functions designed
> to address everyday needs.


BTW, there are plenty of functions (words) in the Forth standard that
address the everyday needs. E.g., arithmetic, logic, control flow,
memory access, creating new definitions, etc.


The most common (and low level) nonstandard functions for my everyday
needs are substring matching words: https://git.io/Jngre

What is your selection?




> It's unlikely they're interested in arcane
> things clf likes to talk about.  They're less interested in clever.
> Differences between systems is anathema to them.  I understand the appeal
> - I just doubt Forth is capable of delivering it.

Do you mean there are not enough people involved with Forth to deliver that?


> Forth is a language
> for geeks; and it has delivered a standard for geeks.  Being geeks, of
> course they're unhappy with it.  Nobody has ever asked 'newbies' what
> they expect from a Forth Standard because it's never been about them.

Usually, newbies don't use any standards at all. Newbies use tutorials,
manuals, forums, books, etc. (as sources for knowledge or learning)


--
Ruvim

Marcel Hendrix

Jun 20, 2021, 1:52:27 AM

On Saturday, June 19, 2021 at 2:47:06 PM UTC+2, Ruvim wrote:
> On 2021-06-19 12:33, dxforth wrote:
> > On 19/06/2021 06:17, Ruvim wrote:
> >> ... Actually, I think that a standard library is a good thing.
> >> But it can (and should, if any) be standardized separately.
[..]
> The most common (and low level) nonstandard functions for my everyday
> needs are substring matching words: https://git.io/Jngre
>
> What is your selection?

It depends on what I'm busy with. For my iSPICE project:

FPREM \ ( F: r1 r2 -- rem[r1/r2] ) take FP remainder
FP/REM \ ( F: r1 r2 -- rem[r1/r2] ) ( -- div[r1/r2] ) take FP remainder and quotient
+E.R \ ( fieldwidth> -- ) ( F: r -- ) print in field
+E. \ ( -- ) ( F: r -- ) print in default field
(F.N1) \ ( -- c-addr u ) ( F: r -- ) format in engineering notation
F.N1 \ ( F: r -- ) print in engineering notation with trailing space
(F.N2) \ ( -- c-addr u ) ( F: r -- ) format in engineering notation, no space
F.N2 \ ( F: r -- ) print in engineering notation, no space
F.N2+ \ ( F: r -- ) 3 decimal places
FDUMP \ ( a-addr +n -- ) print +n floats from a-addr
DFDUMP \ ( a-addr +n -- ) print +n dfloats from a-addr

CELLPACK \ ( c-addr1 u1 addr2 -- addr2 )
CELLPLACE+ \ ( c-addr1 u a2 -- )
CELLCHAR+ \ ( char addr -- )
CHAR-APPEND \ ( c-addr u c -- c-addr2 u2 )
CHAR-PREPEND \ ( c-addr u c -- c-addr2 u2 )
Split-At-WS \ ( addr1 n1 -- addr1 n2 addr1+n2 n1-n2 )
Split-At-Char \ ( addr1 n1 char -- addr1 n2 addr1+n2 n1-n2 )
Split-At-Char-NC \ ( addr1 n1 char -- addr1 n2 addr1+n2 n1-n2 ) not case-sensitive
Split-At-LastChar \ ( addr1 n1 char -- addr1 n2 addr1+n2 n1-n2 )
Split-At-LastChar-NC \ ( addr1 n1 char -- addr1 n2 addr1+n2 n1-n2 ) not case-sensitive
Split-At-Word \ ( addr1 n1 addr2 n2 -- addr1 n3 addr1+n4 n1-n4 )
Split-At-Word-NC \ ( addr1 n1 addr2 n2 -- addr1 n3 addr1+n4 n1-n4 ) not case-sensitive
-LEADING \ ( c-addr u -- c-addr2 u2 )
-LEADING-WS \ ( c-addr u -- c-addr2 u2 )
-TRAILING-WS \ ( c-addr u -- c-addr2 u2 )
TRIM \ ( c-addr1 u1 -- c-addr2 u2 ) -LEADING -TRAILING
DNIP \ ( str1 str2 -- str2 ) 2SWAP 2DROP
LAST-CHAR \ ( c-addr u -- c ) 1- DUP 0< IF 2DROP 0 ELSE + C@ ENDIF

ONLY-PATH \ ( c-addr u1 -- c-addr u2 ) Only path
ONLY-FILENAME \ ( c-addr u1 -- c-addr u2 ) Only filename
ONLY-BASEFILENAME \ ( c-addr1 u1 -- c-addr2 u2 ) filename without extension ( .xxxxxx )
FILE-EXISTS? \ ( c-addr u -- TRUE=exists ) TRUE if the file exists (See: FILE-STATUS).
FILE-TIME \ ( c-addr u -- ftime|0 ) 100ns increments since the last write to this file (0 is probably FILE NOT FOUND).
win->linux \ ( in-str buf -- c-addr u ) -- in goes a FULL path, out comes a FULL path

UNTAB \ ( c-addr u -- addr2 u2 ) remove TABs from this string (note: in-place!)
/SLASHING \ ( c-addr u -- addr2 u2 ) normalize to Linux slashes (note: in-place!)
\SLASHING \ ( c-addr u -- addr2 u2 ) normalize to Windows slashes (note: in-place!)
UNSLASH \ ( c-addr u -- addr2 u2 ) according to the OS, exchange '\' and '/' in this path-string (note: in-place!)
UNQUOTE \ ( c-addr u -- addr2 u2 ) Remove a leading and trailing double quote character
UNEXE \ ( c-addr u -- addr2 u2 ) Remove a trailing ".exe" string
+SLASH? \ ( c-addr u -- addr2 u2 ) Add a trailing "\" or "/" when it is not there yet
PLACE+ \ ( c-addr u 'sum$ -- ) add string to sum
$+ \ ( c-addr1 u1 c-addr2 u2 -- c-addr3 u3 ) add str2 to str1. Result str3 is member of a bufferpool
$>UPC \ ( c-addr u -- c-addr u ) convert (in place) to uppercase
$>LWC \ ( c-addr u -- c-addr u ) convert (in place) to lowercase
MUNCH ( char -- ) \ Like SKIP, but for the input stream.
SPLIT-cc \ ( c-addr u char1 char2 -- caddr1 u1 $rest ) Matches char2, skipping char1 char2 pairs.

LOCATE name \ find file and source line for `name'
INSPECT \ after a successful LOCATE, jump to file at line
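
For readers unfamiliar with the Split-At-* stack effects, here is a sketch of
a word with the same stack effect as Split-At-Char, written in standard Forth.
It is an illustration only, not the miscutil.frt source, and it assumes the
remainder starts at the delimiter itself (as the addr1+n2 n1-n2 notation
suggests):

```forth
\ Split at the first occurrence of char; if char is absent, the
\ remainder is empty and the head is the whole string.
: split-at-char ( c-addr1 u1 char -- c-addr1 u2 c-addr1+u2 u1-u2 )
  >r 2dup                      \ -- a u a' u'   R: char
  begin dup while              \ characters left to scan?
    over c@ r@ <> while        \ stop at the delimiter
    1 /string                  \ advance one character
  repeat then
  r> drop                      \ -- a u a' u'
  dup >r 2swap r> - 2swap ;    \ -- a u-u' a' u'

: demo ( -- ) s" key=value" [char] = split-at-char 2swap type ." |" type ;
demo    \ prints: key|=value
```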

-marcel

none albert

Jun 20, 2021, 4:35:36 AM

In article <646fae32-6e60-4eff...@googlegroups.com>,
What interests me: were these added for completeness' sake, or are
they all used on a more or less regular basis?

>
>-marcel

Doug Hoffman

Jun 20, 2021, 6:07:49 AM

On 6/19/21 8:47 AM, Ruvim wrote:


> The most common (and low level) nonstandard functions for my everyday
> needs are substring matching words: https://git.io/Jngre
>
> What is your selection?

For strings I use the following, which work on both allotted and
allocated ASCII strings:

:! ( c-addr u o -- ) replaces entire text of string with given text
:@ ( o -- c-addr u ) retrieves entire text of string in c-addr u format
:size ( o -- u ) retrieves the length of the entire string
:add ( c-addr u o -- ) appends text to end of string
:prepend ( c-addr u o -- ) prepends text to start of string
:+ ( o2 o -- ) concatenates str2 to end of str
:at ( idx o -- c ) returns char at idx
:to ( c idx o -- ) stores char to idx
:each ( o -- c t | f ) retrieves next char under true, or false if end
is reached
:uneach ( o -- ) resets each pointer to first char
:upper ( o -- ) converts entire string to upper-case
:lower ( o -- ) converts entire string to lower-case
:search ( c-addr u o -- t | f ) Boyer-Moore search, case sensitive
:searchCI case insensitive version of :search
Note that :search and :searchCI set internal pointers for START and END
of found text. If text is not found, pointers are unchanged. Subsequent
use of :search/:searchCI begin just after END regardless.
:insert ( c-addr u o -- ) insert text just after END. START and END are
moved to just after inserted text
:replace ( c-addr u o -- ) replace text delimited by START and END
:delete ( o -- ) delete text delimited by START and END
:= ( c-addr u o -- flag ) is given text equal to the entire string?
:=CI case insensitive version of :=
:@sub ( o -- c-addr u ) return substring delimited by START and END
:=sub ( c-addr u o -- flag ) is given text equal to substring delimited
by START and END? case sensitive
:=subCI case insensitive version of :=sub
:selectAll ( o -- )
:ch+ ( c o -- ) appends char to end of string
:chsearch ( c o -- flag ) case sensitive char search, if found START and
END delimit the char
:chsearchCI case insensitive version of :chsearch
:chinsert ( c o -- ) inserts char at END, START and END are moved to
just past inserted char
:first ( o -- c )
:second ( o -- c )
:last ( o -- c )
:split ( c o -- array-obj ) splits entire string, delimited by given
char, into an allocated array of strings
:sch&repl ( c-addr1 u1 c-addr2 u2 o -- flag ) search for text1 starting
at END. If found replace with text2. Success flag is returned. Will only
replace one occurrence. case sensitive
:sch&replCI case insensitive version of :sch&repl
:replall ( c-addr1 u1 c-addr2 u2 o -- flag ) replaces ALL occurrences
:replallCI case insensitive version of :replall
:reset ( o -- ) sets START and END to zero
:start ( o -- addr ) returns addr of START
:end ( o -- addr ) returns addr of END
:copy ( o -- o2 ) returns an allocated copy of the string
<free ( o -- ) frees all memory for an allocated string
is-a string ( o -- flag )
is-a-kindOf string ( o -- flag )

For any of the above that try to increase the size of the string, buffer
bounds checking is performed (only for allotted strings). Allocated
strings always have their sizes increased. There is no string stack.



Note that many of these words are also used on different data types, in
particular the following for arrays:
:size
:add
:+
:at
:to
:each
:uneach
:delete
:insert
:remove
:first
:second
:last
:search
<free
is-a array
is-a-kindOf array ( o -- flag )

-Doug

Marcel Hendrix

Jun 20, 2021, 7:05:39 AM

On Sunday, June 20, 2021 at 12:07:49 PM UTC+2, Doug Hoffman wrote:
> On 6/19/21 8:47 AM, Ruvim wrote:
[..]
> :split ( c o -- array-obj ) splits entire string, delimited by given
> char, into an allocated array of strings

I have something like that but in practice I find too many exception
cases for the split, e.g. "(aap) (noot) (mies)" needs both '(' and ')'
as paired delimiters, a blank is not always equivalent to white-space
etc.. Still looking for an argument list parser with a simple and
powerful interface.

-marcel

Marcel Hendrix

Jun 20, 2021, 7:08:05 AM

On Sunday, June 20, 2021 at 10:35:36 AM UTC+2, none albert wrote:
> In article <646fae32-6e60-4eff...@googlegroups.com>,
> Marcel Hendrix <m...@iae.nl> wrote:
> >On Saturday, June 19, 2021 at 2:47:06 PM UTC+2, Ruvim wrote:
> >> On 2021-06-19 12:33, dxforth wrote:
> >> > On 19/06/2021 06:17, Ruvim wrote:
> >> >> ... Actually, I think that a standard library is a good thing.
> >> >> But it can (and should, if any) be standardized separately.
> >[..]
> >> The most common (and low level) nonstandard functions for my everyday
> >> needs are substring matching words: https://git.io/Jngre
> >>
> >> What is your selection?
> >
> >It depends on what I'm busy with. For my iSPICE project:
[..]
> What interests me: were these added for completeness' sake, or are
> they all used on a more or less regular basis?

They were not written for completeness' sake; I really needed them
(at least once :-) in writing a ~6000 line program over a period of,
by now, 4 years.

The ones shown are in miscutil.frt because I believe they have
some general appeal. There are many specialized but heavily used
string words that are not in miscutil.frt.

Sometimes I find a way to do a whole class of operations in a
much simpler way (e.g. switchable I/O device drivers simplify
string ops because you can 'TYPE to memory' instead). I did
not clean up miscutil.frt in such cases.

-marcel

Doug Hoffman

unread,
Jun 20, 2021, 9:16:31 AM6/20/21
to
Yes. It is not trivial code and I have had the same problems. But my
current version seems to be working well, though I admit to using it
almost always for commas and whitespace as the split "character". Makes it
quite convenient to read in a .csv file.

-Doug

Krishna Myneni

unread,
Jun 20, 2021, 11:46:58 AM6/20/21
to
I rarely hear of newbies wanting to learn COBOL, a language which is
probably still in use. New programmers can certainly learn and use Forth
effectively to accomplish many ordinary tasks, without having to worry
much about "arcane" topics such as anonymous definitions and namespace
control. However, it is likely that they will encounter such topics when
using modern languages. The days of the single-language programmer are over.

Krishna



Jim Peterson

unread,
Jun 22, 2021, 2:41:02 PM6/22/21
to
On Saturday, June 19, 2021 at 5:33:18 AM UTC-4, dxforth wrote:
> On 19/06/2021 06:17, Ruvim wrote:
> > ...
> > Actually, I think that a standard library is a good thing.
> > But it can (and should, if any) be standardized separately.
> When folks refer to a standard they usually mean a standard as they see
> it in other languages - namely a comprehensive set of functions designed
> to address everyday needs. It's unlikely they're interested in arcane
> things clf likes to talk about. They're less interested in clever.
> Differences between systems is anathema to them. I understand the appeal
> - I just doubt Forth is capable of delivering it. Forth is a language
> for geeks; and it has delivered a standard for geeks. Being geeks, of
> course they're unhappy with it. Nobody has ever asked 'newbies' what
> they expect from a Forth Standard because it's never been about them.

I understand this concept and see the appeal of keeping the Forth language
"clean" by not addressing the needs of newbies overmuch. On the other
hand, what I'm proposing is not that we define mandatory words that are
always present or even optional wordsets, but libraries of well-constructed
and designed code that would need to be "included" or "required" (as per
the REQUIRED word) in order to even be available from the dictionary.
While one might argue "can't a user simply decide to have some third-party
library as a dependency and say as much", I think that having the standard
identify and promote a particular set of code and interfaces for the most
basic purposes (strings, lists, arrays, hash tables, etc.) would really
expand the scope and readability of what could be seen as "standard-
compliant" Forth programs... not to mention open up tutorial possibilities
for newcomers.

If anything, I would argue that the standard Forth words should be *reduced*
in scope, making a minimal kernel from which it is proven that all other
(currently-standard) words could be defined. From there, if a system
developer wishes to define more words than just that kernel, for efficiency's
sake, they could do so. All other words could have standard, source-level
definitions to fill in the gaps and make a basic system:

[UNDEFINED] nip [IF] : nip swap drop ; [THEN]
[UNDEFINED] rot [IF] : rot >r swap r> swap ; [THEN]
... etc. ...
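
A few more definitions of the same shape, as a sketch only. It assumes the kernel already provides SWAP, DROP, OVER and ROT, and that [UNDEFINED], [IF] and [THEN] are available; which words belong in such a kernel is exactly the open question, so this choice is illustrative.

```forth
\ Each definition is compiled only when the system lacks the word.
[UNDEFINED] tuck  [IF] : tuck  ( x1 x2 -- x2 x1 x2 )    swap over ; [THEN]
[UNDEFINED] -rot  [IF] : -rot  ( x1 x2 x3 -- x3 x1 x2 ) rot rot ;   [THEN]
[UNDEFINED] 2dup  [IF] : 2dup  ( x1 x2 -- x1 x2 x1 x2 ) over over ; [THEN]
[UNDEFINED] 2drop [IF] : 2drop ( x1 x2 -- )             drop drop ; [THEN]
```

Note that -ROT is not a standard word, though many systems provide it; the other three are in the Core word set.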


I, myself, have created systems using just the standard words:

STATE ! @ , [ ] : ;

as well as the non-standard words (ala jonesforth):

DP LATEST

and an intimate knowledge of the processor on which it was implemented and
the format of the dictionary entries. In my mind, much of the appeal of
Forth is the ability to get such rudimentary capabilities in place and
subsequently bootstrap to a full-blown, capable language.



On Saturday, June 19, 2021 at 8:47:06 AM UTC-4, Ruvim wrote:
> The most common (and low level) nonstandard functions for my everyday
> needs are substring matching words: https://git.io/Jngre
>
> What is your selection?

On Sunday, June 20, 2021 at 1:52:27 AM UTC-4, Marcel Hendrix wrote:
> On Saturday, June 19, 2021 at 2:47:06 PM UTC+2, Ruvim wrote:
> > What is your selection?
> It depends on what I'm busy with. For my iSPICE project:
>
> FPREM \ ( F: r1 r2 -- rem[r1/r2] ) take FP remainder
> FP/REM \ ( F: r1 r2 -- rem[r1/r2] ) ( -- div[r1/r2] ) take FP remainder and quotient
> +E.R \ ( fieldwidth> -- ) ( F: r -- ) print in field
> [ ... ]

On Sunday, June 20, 2021 at 6:07:49 AM UTC-4, Doug Hoffman wrote:
> On 6/19/21 8:47 AM, Ruvim wrote:
> > What is your selection?
> For strings I use the following, which work on both allotted and
> allocated ASCII strings:
>
> :! ( c-addr u o -- ) replaces entire text of string with given text
> :@ ( o -- c-addr u ) retrieves entire text of string in c-addr u format
> :size ( o -- u ) retrieves the length of the entire string
> :add ( c-addr u o -- ) appends text to end of string
> [ ... ]

So, I'm seeing a lot of words for doing string operations, along with some
floating-point and arrays. What is everyone's opinion of FFL's string module,
str, (and arrays: car... I don't see any FFL math stuff except cpx)? If you
had to rewrite your code to use FFL for your processing, could you do it
easily? Is there a route for getting your code for the parts that are
missing from FFL into that library? I see that Doug Hoffman's string
objects appear to have START and END internal state variables. I'm not sure
how to fit that in to the FFL... maybe another level of indirection that
points to an FFL str and also has start and end?

--Jim

Paul Rubin

unread,
Jun 22, 2021, 8:38:41 PM6/22/21
to
Jim Peterson <elk...@gmail.com> writes:
> I think that having the standard identify and promote a particular set
> of code and interfaces for the most basic purposes (strings, lists,
> arrays, hash tables, etc.) would really expand the scope and
> readability of what could be seen as "standard-compliant" Forth

I wonder how many practical Forth programs use those datatypes. The
Fortran standard might specify a complete set of advanced math
functions, but the Perl standard (if it existed) probably wouldn't,
since Perl is targeted to a different set of application areas.

Similarly, maybe it's just me, but I think of Forth as targeted to the
type of program that doesn't make much use of strings, lists, or hash
tables. Arrays are simple enough that it's usually been enough to just
have a few idioms, but maybe a standard lib could help. There are
already a few string words and I guess there should be some more.

> If anything, I would argue that the standard Forth words should be
> *reduced* in scope, making a minimal kernel from which it is proven
> that all other (currently-standard) words could be defined.

The standard is already mostly written that way, with a core wordset and
some optional extensions. A little bit of reorganization would help in
minimizing the core while putting enough into it to enable implementing
the extensions, but the current situation in that regard is not too bad.

Going further, I think, would best be done by radically departing from
Forth as the standard currently describes. Of course many
implementations like 8th and Oforth do go in that direction.

> So, I'm seeing a lot of words for doing string operations, along with
> some floating-point and arrays. What is everyone's opinion of FFL's
> string module, str

I don't remember how it works so I might look into it again, but I
haven't felt the need for it in the limited amount of Forth programming
I've done. I'd frankly rather see standardized wordsets for
multitasking and cross-compilation.

minf...@arcor.de

unread,
Jun 23, 2021, 3:22:11 AM6/23/21
to
It all depends on the application domain. I use arrays and multitasking a lot,
but this multitasking is very hardware-specific because it is controlled by timers
and interrupts. Can't be standardized.

If I used a fat Forth on a fat OS, I'd use OS-specific threads. Can't be
standardized either.

Some folks here put too much emphasis on standardizing bits and pieces.
I wonder where this strange legalistic attitude comes from. Progress and invention
always go hand in hand with leaving old stuff behind...

Paul Rubin

unread,
Jun 23, 2021, 3:50:25 AM6/23/21
to
"minf...@arcor.de" <minf...@arcor.de> writes:
> but this multitasking is very hardware-specific because it is
> controlled by timers and interrupts. Can't be standardized.

What is the obstacle? Timers and interrupts are reasonably abstractable
across implementations and cpus.

dxforth

unread,
Jun 23, 2021, 4:29:56 AM6/23/21
to
On 23/06/2021 17:22, minf...@arcor.de wrote:
> ...
> Some folks here put too much emphasis on standardizing bits and pieces.
> I wonder where this strange legalistic attitude comes from.

Safety in conformity?

"MinForth V3.4 is widely conformant to the Forth-2012 draft standard (see
enclosed file forth-2012.pdf) including all wordsets (but for xchars). It
passes the standard test suite (and more)"

Stephen Pelc

unread,
Jun 23, 2021, 5:56:39 AM6/23/21
to
On Tue, 22 Jun 2021 17:38:38 -0700, Paul Rubin
<no.e...@nospam.invalid> wrote:

>Jim Peterson <elk...@gmail.com> writes:
>> I think that having the standard identify and promote a particular set
>> of code and interfaces for the most basic purposes (strings, lists,
>> arrays, hash tables, etc.) would really expand the scope and
>> readability of what could be seen as "standard-compliant" Forth
>
>I wonder how many practical Forth programs use those datatypes. The
>Fortran standard might specify a complete set of advanced math
>functions, but the Perl standard (if it existed) probably wouldn't,
>since Perl is targeted to a different set of application areas.

Our client with a Forth office application of 1.4 million lines
of Forth source code certainly uses strings, lists and arrays
very heavily.

Stephen

--
Stephen Pelc, ste...@vfxforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612, +34 649 662 974
web: http://www.mpeforth.com - free VFX Forth downloads

none albert

unread,
Jun 23, 2021, 7:18:28 AM6/23/21
to
In article <0c439373-7bcc-4976...@googlegroups.com>,
I would argue that the standard is already much like this.
Only the Core words are required for a Forth system.
So the core words are the kernel you're talking about.
Then to the standard is added a number of cooperating wordsets,
or what normal people (c, python, FORTRAN, Ada) would call libraries.
What is vehemently obnoxious is that commercial vendors and free suppliers
alike, mistake those libraries for base facilities, and think that
it is a good idea to jumble it all together in a fat Forth system.

What has to be done is taking seriously the point of view that wordsets
are optional. The standard is missing wording like:
"
A standard system shall document the way a library ("wordset") can
be used.
" not assuming all the wordsets are in the kernel.
The idea was hinted at by ENVIRONMENT c.s. but

Thus translated, your plea is for adding more wordsets to the
standard. Whether in a separate document or not is t.b.d.

Let us talk about strings.
Everybody serious about their string wordset should document it
in a fashion mimicking other wordsets in the standard.
Everybody should stop trying to get their version of strings accepted and added
to a standard document. Instead competing string packages should
be properly documented and application writers should be able to
choose, like the situation with graphic libraries in most languages.

If we look at C, there is the kernel that is unusable in itself (!!!).
You may write a program without output, but it cannot be
compiled without the documentation of the c-implementation.
Then there is the string library, the standard library, etc. and
the prescript to use #include in the source and a -I option
in the linker.

Why would Forth be afraid of a kernel that is of little use in itself?

>I, myself, have created systems using just the standard words:
>
>STATE ! @ , [ ] : ;
>
>as well as the non-standard words (ala jonesforth):
>
>DP LATEST
>
>and an intimate knowledge of the processor on which it was implemented and
>the format of the dictionary entries. In my mind, much of the appeal of
>Forth is the ability to get such rudimentary capabilities in place and
>subsequently bootstrap to a full-blown, capable language.

Right so. I designed ciforth in a similar way.
If the language need not be ISO-compliant, educational rather than
practical, you arrive at "jonesforth" or my "yourforth".

>
>--Jim

Ilya Tarasov

unread,
Jun 23, 2021, 11:12:26 AM6/23/21
to
Visions about 2100 A.D. - no Forth apps, Forth-2100 draft, discussions about CORE EXT words...

dxforth

unread,
Jun 23, 2021, 11:55:39 PM6/23/21
to
On 23/06/2021 21:18, albert wrote:
>
> Why would Forth be afraid of a kernel that is of little use in itself?

Of what use is a kernel that can't communicate with the OS? What
file functions and numeric conversion don't involve string handling?
SKIP SCAN PACK PLACE +STRING -BLANKS >BLANKS C/STRING TRIM (factor
of -TRAILING) are non-standard string functions found in my kernel.
Most are written in assembler for speed/size.
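
For readers who have not met them, SKIP and SCAN are commonly defined along the following lines in high-level Forth. This is a sketch with the usual stack effects, not the assembler versions from the kernel mentioned above; the names skip-char and scan-char are made up here so that nothing in an existing system is redefined.

```forth
\ Advance to the first occurrence of char; u' is 0 if it is absent.
: scan-char ( c-addr u char -- c-addr' u' )
   >r
   begin  dup while            \ any characters left?
      over c@ r@ <>  while     \ current character is not the target?
      1 /string                \ step past it
   repeat then
   r> drop ;

\ Step over leading occurrences of char.
: skip-char ( c-addr u char -- c-addr' u' )
   >r
   begin  dup while
      over c@ r@ =  while
      1 /string
   repeat then
   r> drop ;

s" key=value" char = scan-char type    \ prints: =value
s"    indented" bl skip-char type      \ prints: indented
```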

Paul Rubin

unread,
Jun 24, 2021, 3:52:55 AM6/24/21
to
dxforth <dxf...@gmail.com> writes:
> Of what use is a kernel that can't communicate with the OS?

You could use library functions to communicate with the OS. Preventing
the kernel from doing so is helpful for sandboxing, though maybe that
concept isn't too relevant for Forth. That the Lua kernel can't
communicate with the OS is considered a good feature of Lua though. It
means you can use Lua as an embedded extension language for your
whatever-app, and not worry about the user's Lua script going off and
doing uncontrolled things with the system. The script can only compute
and return values (numeric, string, etc.) unless you link additional
functions into the Lua interpreter that can do things like access files.

dxforth

unread,
Jun 24, 2021, 5:22:35 AM6/24/21
to
Doing anything useful means trusting the programmer and forth is premised
on that. What I disagree with are these arbitrary choices of words and
categories that Forth standards have ordained. It may do for novice
implementers, and it has resulted in a bureaucracy to manage it, but that's
about it. A forth consisting of nothing but ANS/200x words raises a red flag.

"One original thought is worth a thousand mindless [ANS] quotings."
- Diogenes Laertius

Ruvim

unread,
Jun 24, 2021, 10:17:39 AM6/24/21
to
On 2021-06-22 21:41, Jim Peterson wrote:

> If anything, I would argue that the standard Forth words should be *reduced*
> in scope, making a minimal kernel from which it is proven that all other
> (currently-standard) words could be defined. From there, if a system
> developer wishes to define more words than just that kernel, for efficiency's
> sake, they could do so. All other words could have standard, source-level
> definitions to fill in the gaps and make a basic system:
>
> [UNDEFINED] nip [IF] : nip swap drop ; [THEN]
> [UNDEFINED] rot [IF] : rot >r swap r> swap ; [THEN]
> ... etc. ...

To my taste, it looks bad to wrap every definition from the optional
word sets into this construct.

Is there another option?


--
Ruvim

Jim Peterson

unread,
Jun 24, 2021, 10:59:08 AM6/24/21
to
Personally, I've considered the possibility of "yet another word" of the
form "?:", in the spirit of "?do" and "?dup", except for defining words that
don't exist. The above would become:

?: nip swap drop ;
?: rot >r swap r> swap ;

It would look cleaner, but ultimately would just feel like syntactic sugar.

--Jim

Ruvim

unread,
Jun 24, 2021, 11:15:59 AM6/24/21
to
Ideally, the definitions should be written as usual:

: nip ( x2 x1 -- x1 ) swap drop ;
: rot ( x3 x2 x1 -- x2 x1 x3 ) >r swap r> swap ;


But a special loader should be used that loads each definition if this
definition isn't yet available. Even better, if this loader also skips
intermediate definitions (helpers) that are not exported, if they are
not used.


--
Ruvim

Ruvim

unread,
Jun 24, 2021, 11:45:19 AM6/24/21
to
On 2021-06-24 17:59, Jim Peterson wrote:
How do you skip a multi-line definition?


--
Ruvim

Jim Peterson

unread,
Jun 24, 2021, 1:01:12 PM6/24/21
to
On Thursday, June 24, 2021 at 11:15:59 AM UTC-4, Ruvim wrote:
> Ideally, the definitions should be written as usual:
>
> : nip ( x2 x1 -- x1 ) swap drop ;
> : rot ( x3 x2 x1 -- x2 x1 x3 ) >r swap r> swap ;
>
>
> But a special loader should be used that loads each definition if this
> definition isn't yet available. Even better, if this loader also skips
> intermediate definitions (helpers) that are not exported, if they are
> not used.

I think this is misleading, to have something look like a colon definition
yet not operate as such. I had thought about a forward-definition
capability, where you could say:

<somekey1> forward nip
<somekey2> forward rot

and <somekey1> and <somekey2> are used, if ever the words "nip" and "rot"
appear, to somehow track down source for "nip" and "rot" and define them
on the spot, but it feels like this would lead to defining a word in the
middle of another word definition (where "nip" and "rot" had appeared), and
I doubt the traditionally linear dictionary space would cope with that
politely. If word definitions were relocatable, the system might be able
to shift the current word definition to make room for "nip" and "rot", but
that's a largish "if", especially given the potential for "orig" and "dest"
objects on the compilation stack.

On Thursday, June 24, 2021 at 11:45:19 AM UTC-4, Ruvim wrote:
> How do you skip a multi-line definition?

I was envisioning leveraging something along the lines of POSTPONE :NONAME,
in the instances where the word did not need to be defined again, but that
would also likely need a S" MARKER UNDO" EVALUATE beforehand, as well as
S" DROP UNDO" EVALUATE afterward, in order to restore to a state where that
code was not injected into the dictionary in any manner (otherwise, why
not simply redefine them). It is unfortunate that one can't simply,
iteratively PARSE-NAME / REFILL until getting to a ";". The loop would
have to account for "\"- and "("-style comments and occurrences of S", C",
S\", and .(, as well as any possible, user-defined words with similar
behavior.

Alternatively, one could simply declare that an isolated semicolon is not
allowed anywhere except at the actual end of a ?: definition.
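
To make the difficulty concrete, here is a sketch of a skipper that copes with the built-in cases only. The names tok=, quote-word?, skip-body and skip-definition are invented for this example. It treats any token ending in a double quote as a string opener, which is a heuristic; it does not understand the escapes of S\", it only handles "(" comments that close on the same line, and user-defined parsing words still defeat it.

```forth
\ Case-sensitive comparison of a parsed token with a literal string.
: tok= ( c-addr1 u1 c-addr2 u2 -- flag )  compare 0= ;

\ Heuristic: does the token end in a double quote, like S" or ." ?
: quote-word? ( c-addr u -- flag )  + 1- c@ [char] " = ;

\ If the token opens a comment or string, consume its body.
\ Returns true when the token was such a word.
: skip-body ( c-addr u -- flag )
   2dup s" \"  tok= if  2drop  source nip >in !      true exit  then
   2dup s" ("  tok= if  2drop  [char] ) parse 2drop  true exit  then
   2dup s" .(" tok= if  2drop  [char] ) parse 2drop  true exit  then
   quote-word?      if  [char] " parse 2drop         true exit  then
   false ;

\ Discard input up to and including the next ";" that is not inside
\ a recognized comment or string.
: skip-definition ( "ccc ;" -- )
   begin
      parse-name dup if
         2dup s" ;" tok= if  2drop exit  then
         skip-body drop
      else
         2drop  refill 0= abort" end of input while skipping a definition"
      then
   again ;
```

With this in place, the semicolons in `skip-definition x ( ; ) ." ; " ;` are passed over until the final one.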

--Jim

Paul Rubin

unread,
Jun 24, 2021, 3:49:29 PM6/24/21
to
dxforth <dxf...@gmail.com> writes:
> Doing anything useful means trusting the programmer

No I don't agree with that. Look at how web browsers have to run
untrusted Javascript code without letting it run amuck, for example.
Same thing with PostScript (which even famously resembles Forth) in
printable documents. At best it depends on the application area.

> and forth is premised on that.

Then either Forth is based on mistaken premises, or Forth is unsuited
to some application areas.

> A forth consisting of nothing but ANS/200x words raises a red flag.

Seems like a separate issue, but maybe true.

Paul Rubin

unread,
Jun 24, 2021, 3:52:27 PM6/24/21
to
Ruvim <ruvim...@gmail.com> writes:
> : nip ( x2 x1 -- x1 ) swap drop ;
> : rot ( x3 x2 x1 -- x2 x1 x3 ) >r swap r> swap ;

: nip {: x2 x1 :} x1 ;
: rot {: x3 x2 x1 :} x2 x1 x3 ;
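
A quick way to try these without redefining the system's own words, assuming the Forth-2012 Locals word set ({: ... :}) is present. The names nip-l and rot-l are made up for this check.

```forth
\ Locals-based versions under new names; the last local named
\ receives the top of the stack.
: nip-l ( x2 x1 -- x1 )          {: x2 x1 :}     x1 ;
: rot-l ( x3 x2 x1 -- x2 x1 x3 ) {: x3 x2 x1 :}  x2 x1 x3 ;

10 20 nip-l .        \ prints: 20
1 2 3 rot-l . . .    \ prints: 1 3 2
```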

Jim Peterson

unread,
Jun 24, 2021, 7:14:55 PM6/24/21
to
I would worry about such an implementation. I could see the resulting code being only as efficient as:

: nip
_st0 ! _st1 !
_st0 @
;

though I could see highly optimized implementations of {: on general-purpose processors maybe block-copying the data stack to a temporary buffer, adjusting the stack pointer, then pushing a chunk of the temporary buffer back on to the stack.

I think we'd all be a lot better off if words were required to declare their stack effects, and the implementation of {: could then simply track the effects through a definition and do an appropriate PICK to get the values as needed (where an equivalent PLACE would do the opposite... by my definition of PLACE, anyway).

Words with variable stack effect (like N>R, or GET-ORDER) are unsettling to me. What's more, if the static stack effects were known/declared, the system could go a long ways towards enforcing them and marking errors (e.g., when a loop leaks values on to the stack, or when you forget to drop a counter after the end of a loop, or even when an if/else or an early exit might not have cleaned up properly). I don't have any feel for a downside to requiring such a thing.

--Jim

dxforth

unread,
Jun 24, 2021, 8:58:04 PM6/24/21
to
On 25/06/2021 05:49, Paul Rubin wrote:
> dxforth <dxf...@gmail.com> writes:
>> Doing anything useful means trusting the programmer
>
> No I don't agree with that. Look at how web browsers have to run
> untrusted Javascript code without letting it run amuck, for example.
> Same thing with PostScript (which even famously resembles Forth) in
> printable documents. At best it depends on the application area.
>
>> and forth is premised on that.
>
> Then either Forth is based on mistaken premises, or Forth is unsuited
> to some application areas.

It's possible ANS/200x is unsuited but then who imagines one language
can/should do it all? 'Forth' I would define more broadly than what
we know as computer language.

>
>> A forth consisting of nothing but ANS/200x words raises a red flag.
>
> Seems like a separate issue, but maybe true.
>

ANS/200x has presented itself as a standardized language like the others
and will be compared with them. I think individual Forth implementations
have demonstrated better success in the fields they've targeted.

Ruvim

unread,
Jun 25, 2021, 2:58:38 AM6/25/21
to
On 2021-06-24 20:01, Jim Peterson wrote:
> On Thursday, June 24, 2021 at 11:15:59 AM UTC-4, Ruvim wrote:
>> Ideally, the definitions should be written as usual:
>>
>> : nip ( x2 x1 -- x1 ) swap drop ;
>> : rot ( x3 x2 x1 -- x2 x1 x3 ) >r swap r> swap ;
>>
>>
>> But a special loader should be used that loads each definition if this
>> definition isn't yet available. Even better, if this loader also skips
>> intermediate definitions (helpers) that are not exported, if they are
>> not used.
>
> I think this is misleading, to have something look like a colon definition
> yet not operate as such.

It isn't.

The idea is as follows.

When "included" is applied to a file name, the file is loaded as usual.

When "included-undefined" (the name is for example) is applied to the
file name, only the definitions that are undefined are loaded.



> On Thursday, June 24, 2021 at 11:45:19 AM UTC-4, Ruvim wrote:
>> How do you skip a multi-line definition?
>
> I was envisioning leveraging something along the lines of POSTPONE :NONAME,
> in the instances where the word did not need to be defined again, but that
> would also likely need a S" MARKER UNDO" EVALUATE beforehand, as well as
> S" DROP UNDO" EVALUATE afterward, in order to restore to a state where that
> code was not injected into the dictionary in any manner (otherwise, why
> not simply redefine them). It is unfortunate that one can't simply,
> iteratively PARSE-NAME / REFILL until getting to a ";". The loop would
> have to account for "\"- and "("-style comments and occurrences of S", C",
> S\", and .(, as well as any possible, user-defined words with similar
> behavior.


It shows the need for lexical blocks in Forth.


>
> Alternatively, one could simply declare that an isolated semicolon is not
> allowed anywhere except at the actual end of a ?: definition.


Alternatively, one could introduce full fledged lexical blocks.


--
Ruvim

Anton Ertl

unread,
Jun 25, 2021, 4:59:39 AM6/25/21
to
Jim Peterson <elk...@gmail.com> writes:
>On Thursday, June 24, 2021 at 3:52:27 PM UTC-4, Paul Rubin wrote:
>> Ruvim <ruvim...@gmail.com> writes:
>> > : nip ( x2 x1 -- x1 ) swap drop ;
>> > : rot ( x3 x2 x1 -- x2 x1 x3 ) >r swap r> swap ;
>> : nip {: x2 x1 :} x1 ;
>> : rot {: x3 x2 x1 :} x2 x1 x3 ;
>
>I would worry about such an implementation. I could see the resulting code
>being only as efficient as:
>
>: nip
> _st0 ! _st1 !
> _st0 @
>;
>
>though I could see highly optimized implementations of {: on general-purpose
>processors maybe block-copying the data stack to a temporary buffer,
>adjusting the stack pointer, then pushing a chunk of the temporary buffer
>back on to the stack.

Why would you call such an implementation "highly optimized"?

Anyway, let's see:

Linux/FORTH (C) 2005 Peter Fälth Version 1.6-982-823 Compiled on 2017-12-03
Running on Linux 4.5.0-0.bpo.2-amd64 on x86_64
Current directory is /home/anton no block file
see nip
868D1BC 88D62DD 4 88D1EA3 21 prim NIP

88D62DD 8D6D04 lea ebp , [ebp+4h]
88D62E0 C3 ret near
ok
: nip {: x2 x1 :} x1 ; NIP redefined ok
see nip
8691F10 804FBE4 4 88C8000 5 normal NIP

804FBE4 8D6D04 lea ebp , [ebp+4h]
804FBE7 C3 ret near
ok
see rot
868D0EC 88D6226 15 88D1CD2 51 prim ROT

88D6226 8B4504 mov eax , [ebp+4h]
88D6229 8B4D00 mov ecx , [ebp]
88D622C 894D04 mov [ebp+4h] , ecx
88D622F 895D00 mov [ebp] , ebx
88D6232 8BD8 mov ebx , eax
88D6234 C3 ret near
ok
: rot {: x3 x2 x1 :} x2 x1 x3 ; ROT redefined ok
see rot
8691F24 804FBE8 15 88C8000 5 normal ROT

804FBE8 8B4504 mov eax , [ebp+4h]
804FBEB 8B4D00 mov ecx , [ebp]
804FBEE 894D04 mov [ebp+4h] , ecx
804FBF1 895D00 mov [ebp] , ebx
804FBF4 8BD8 mov ebx , eax
804FBF6 C3 ret near

As you can see, the "normal" definitions have exactly the same code as
the original "prim" definitions. Ok, maybe the "prim" definitions are
bad, so let's see if other Forth systems are any better:

VFX 4.72:
NIP
( 0804EA68 8D6D04 ) LEA EBP, [EBP+04]
( 0804EA6B C3 ) NEXT,
( 4 bytes, 2 instructions )
ok
see rot
ROT
( 0804EBC4 8BD3 ) MOV EDX, EBX
( 0804EBC6 8B5D04 ) MOV EBX, [EBP+04]
( 0804EBC9 8B4D00 ) MOV ECX, [EBP]
( 0804EBCC 894D04 ) MOV [EBP+04], ECX
( 0804EBCF 895500 ) MOV [EBP], EDX
( 0804EBD2 C3 ) NEXT,
( 15 bytes, 6 instructions )

SwiftForth i386-Linux 3.11.0 23-Feb-2021
see nip
804B50F 4 [EBP] EBP LEA 8D6D04
804B512 RET C3 ok
see rot
804B5EF EBX ECX MOV 8BCB
804B5F1 4 [EBP] EBX MOV 8B5D04
804B5F4 0 [EBP] EAX MOV 8B4500
804B5F7 EAX 4 [EBP] MOV 894504
804B5FA ECX 0 [EBP] MOV 894D00
804B5FD RET C3 ok

iforth:
$10131188 : NIP 488BC04883ED088F4500 H.@H.m..E.
$10131192 pop rbx 5B [
$10131193 add rsp, 8 b# 4883C408 H.D.
$10131197 push rbx 53 S
$10131198 ; 488B45004883C508FFE0 H.E.H.E..` ok

$101311A8 : ROT 488BC04883ED088F4500 H.@H.m..E.
$101311B2 pop rbx 5B [
$101311B3 pop rax 58 X
$101311B4 pop rcx 59 Y
$101311B5 push rax 50 P
$101311B6 push rbx 53 S
$101311B7 mov rbx, rcx 488BD9 H.Y
$101311BA push rbx 53 S
$101311BB ; 488B45004883C508FFE0 H.E.H.E..` ok

>I think we'd all be a lot better off if words were required to declare
>their stack effects, and the implementation of {: could then simply track
>the effects through a definition and do an appropriate PICK to get the
>values as needed (where an equivalent PLACE would do the opposite... by my
>definition of PLACE, anyway).

As lxf demonstrates, that is not necessary.

>Words with variable stack effect (like N>R, or GET-ORDER) are unsettling
>to me.

Yes, they are unidiomatic. Fortunately, they are used rarely.

>What's more, if the static stack effects were known/declared, the system
>could go a long ways towards enforcing them and marking errors (e.g., when
>a loop leaks values on to the stack, or when you forget to drop a counter
>after the end of a loop, or even when an if/else or an early exit might
>not have cleaned up properly).

For the leaking loop, the unbalanced if, and the early exit no
declaration is necessary: You can see that there is something fishy
going on by seeing that different paths result in different stack
depths. A declaration is useful for cases where the stack mistake is
on a code fragment that is always executed by the word.
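
A small run-time illustration of paths with different stack depths, offered as a sketch rather than as the static check described above. Both leaky and depth-change are hypothetical names; depth-change simply reports how many cells an execution token adds to or removes from the data stack.

```forth
\ Intended effect is ( n -- ), but the path where IF is not taken
\ forgets to drop n.
: leaky ( n -- )  dup 0> if  drop  then ;

\ Execute xt and report the net change in data stack depth.
: depth-change ( i*x xt -- j*x n )
   depth 1- >r      \ depth before the call, not counting xt itself
   execute
   depth r> - ;     \ depth after minus depth before

 5 ' leaky depth-change .      \ prints: -1   (n was consumed)
-5 ' leaky depth-change . .    \ prints: 0 -5 (n was left behind)
```

The two calls disagree, which is the kind of mismatch a compile-time checker could find by comparing the depth at the end of each branch.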

>I don't have any feel for a downside to requiring such a thing.

1) There are occasional intentional cases that a stack-depth checker
would warn about. I think these are rare enough that it is
acceptable to insert something in these few places that shuts up
the checker.

2) There may be an effect like the one observed by Bjarne Stroustrup:

|As programmers learned C with Classes or C++, they lost the ability to
|quickly find the ``silly errors'' that creep into C programs through
|the lack of checking. Further, they failed to take the precautions
|against such silly errors that good C programmers take as a matter of
|course. After all, ``such errors don't happen in C with Classes.''
|Thus, as the frequency of run-time errors caused by uncaught argument
|type errors goes down, their seriousness and the time needed to find
|them goes up.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2021: https://euro.theforth.net/2021

Jim Peterson

unread,
Jun 25, 2021, 8:34:40 AM6/25/21
to
On Friday, June 25, 2021 at 2:58:38 AM UTC-4, Ruvim wrote:
> On 2021-06-24 20:01, Jim Peterson wrote:
> > On Thursday, June 24, 2021 at 11:15:59 AM UTC-4, Ruvim wrote:
> >> Ideally, the definitions should be written as usual:
> >>
> >> : nip ( x2 x1 -- x1 ) swap drop ;
> >> : rot ( x3 x2 x1 -- x2 x1 x3 ) >r swap r> swap ;
> >>
> >> [snip]
> >
> > I think this is misleading, to have something look like a colon definition
> > yet not operate as such.
> It isn't.
>
> The idea is as follows.
>
> When "included" is applied to a file name, the file is loaded as usual.
>
> When "included-undefined" (the name is for example) is applied to the

I still don't think this is a good idea. Having some code that looks
extraordinarily similar to one, very familiar thing but is actually a
different thing due to the manner in which the code is incorporated
("included-undefined") feels far more prone to misinterpretation than using a
"?:" operator, which might put the reader off for a moment until the
definition/intent of that operator is discovered, but ultimately would serve
as a constant reminder of what the writer intended.

I took a crack at "?:"... it maybe works? (don't put a bare semicolon
anywhere but the end, though):

: ?: ( "name ... ;" -- )
   >in @
   bl word find nip
   if
      \ skip definition:
      drop
      begin
         parse-name
         dup if
            s" ;" compare 0= if exit then
         else
            2drop
            refill 0= abort" end of input during ?: definition"
         then
      again
   else
      \ rewind and define:
      >in ! :
   then
;

> > It is unfortunate that one can't simply,
> > iteratively PARSE-NAME / REFILL until getting to a ";". The loop would
> > have to account for "\"- and "("-style comments and occurrences of S", C",
> > S\", and .(, as well as any possible, user-defined words with similar
> > behavior.
> It shows the need for lexical blocks in Forth.
> >
> > Alternatively, one could simply declare that an isolated semicolon is not
> > allowed anywhere except at the actual end of a ?: definition.
> Alternatively, one could introduce full fledged lexical blocks.

Adding lexical blocks feels like a route towards some language that is no
longer Forth. I don't think it would make it any easier to skip over a word
definition, either, since the skipper would still need to recognize when some
text is in a string or comment, versus not.



On Friday, June 25, 2021 at 4:59:39 AM UTC-4, Anton Ertl wrote:
> Why would you call such an implementation "highly optimized"?
>
> Anyway, let's see:
>
> Linux/FORTH (C) 2005 Peter Fälth Version 1.6-982-823 Compiled on 2017-12-03
> [shockingly optimized code]
>
> VFX 4.72:
> [more shockingly optimized code]
>
> SwiftForth i386-Linux 3.11.0 23-Feb-2021
> [still more shockingly optimized code]
>
> iforth:
> [lots more shockingly optimized code]

I'm shocked. Also, I get totally different results, that are more along the
lines of what I expect:


0[sparkle:~/personal/src/repos/j3]1137: gforth
Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
: nip { x1 x0 } x0 ; redefined nip ok
see nip
: nip
>l >l @local1 lp+2 ; ok
see >l
Code >l
0x00005615e4c4f147 <gforth_engine+775>: mov %r15,0x1d812(%rip) # 0x5615e4c6c960 <saved_ip>
0x00005615e4c4f14e <gforth_engine+782>: mov (%r14),%rax
0x00005615e4c4f151 <gforth_engine+785>: sub $0x8,%rbx
0x00005615e4c4f155 <gforth_engine+789>: add $0x8,%r14
0x00005615e4c4f159 <gforth_engine+793>: add $0x8,%r15
0x00005615e4c4f15d <gforth_engine+797>: mov %rax,(%rbx)
0x00005615e4c4f160 <gforth_engine+800>: mov -0x8(%r15),%rcx
0x00005615e4c4f164 <gforth_engine+804>: jmpq *%rcx
end-code
ok
see @local1
Code @local1
0x00005615e4c4f292 <gforth_engine+1106>: mov %r15,0x1d6c7(%rip) # 0x5615e4c6c960 <saved_ip>
0x00005615e4c4f299 <gforth_engine+1113>: mov 0x8(%rbx),%rax
0x00005615e4c4f29d <gforth_engine+1117>: sub $0x8,%r14
0x00005615e4c4f2a1 <gforth_engine+1121>: add $0x8,%r15
0x00005615e4c4f2a5 <gforth_engine+1125>: mov %rax,(%r14)
0x00005615e4c4f2a8 <gforth_engine+1128>: mov -0x8(%r15),%rcx
0x00005615e4c4f2ac <gforth_engine+1132>: jmpq *%rcx
end-code
ok
see lp+2
Code lp+2
0x00005615e4c4f17e <gforth_engine+830>: mov %r15,0x1d7db(%rip) # 0x5615e4c6c960 <saved_ip>
0x00005615e4c4f185 <gforth_engine+837>: add $0x10,%rbx
0x00005615e4c4f189 <gforth_engine+841>: add $0x8,%r15
0x00005615e4c4f18d <gforth_engine+845>: mov -0x8(%r15),%rcx
0x00005615e4c4f191 <gforth_engine+849>: jmpq *%rcx
end-code
ok


I have no idea how to begin to write a system that would be able to interpret
the given source into such tight machine language as you've shown. The appeal
of Forth, to me, was the simplicity with which I believed word definitions
simply placed down a list of tokens (XTs) corresponding to words within the
definition, for the most part (with some special ones for if/else, literals,
etc). Such simplicity allows people like me to implement a relatively
powerful system on prototype processors that don't have cross-compilation
support.
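
As a toy picture of that model (not how any particular system actually
lays out its dictionary), a definition body can be mimicked by a table of
XTs that a loop executes one after another. square-body and run-tokens
are names invented for this sketch:

```forth
\ Toy model only: a "definition" as a bare list of execution tokens.
create square-body  ' dup ,  ' * ,       \ the body of a would-be SQUARE

: run-tokens ( i*x addr n -- j*x )       \ execute the n XTs stored at addr
  cells over + swap                      \ -- end start
  ?do  i @ execute  1 cells +loop ;

5 square-body 2 run-tokens .             \ runs DUP and then * on the 5
```

A real threaded system adds the special cases I mentioned (branches,
literals, and so on), but the core loop is about this small.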

> For the leaking loop, the unbalanced if, and the early exit no
> declaration is necessary: You can see that there is something fishy
> going on by seeing that different paths result in different stack
> depths. A declaration is useful for cases where the stack mistake is
> on a code fragment that is always executed by the word.

> |As programmers learned C with Classes or C++, they lost the ability to
> |quickly find the ``silly errors'' that creep into C programs through
> |the lack of checking. Further, they failed to take the precautions
> |against such silly errors that good C programmers take as a matter of
> |course. After all, ``such errors don't happen in C with Classes.''
> |Thus, as the frequency of run-time errors caused by uncaught argument
> |type errors goes down, their seriousness and the time needed to find
> |them goes up.

I feel like you're trying to say that I should just be able to look at the
code I've written and notice there is a problem with the stack. I am not
at that level of expertise. Stack imbalances are the number one cause of
errors in the Forth stuff I write, and I'd rather get an alert from the
compiler. Maybe just some training wheels that tell me "oh, this word exited
without the proper differential to DEPTH" at runtime? Even my attempt at
defining "?:", above, was originally flawed because I had forgotten the nip
after find and had drop instead of 2drop before the refill. A simple stack
check would have alerted me. Maybe something like below (though EXIT and
THROW will cause trouble):

--Jim


[UNDEFINED] cell- [IF] : cell- -1 cells + ; [THEN]

\ Make a temporary stack for depth checks:
create _depth_stack 128 cells allot
_depth_stack value _ds_ptr
: >d _ds_ptr ! _ds_ptr cell+ to _ds_ptr ; \ no overflow checking!
: d> _ds_ptr cell- to _ds_ptr _ds_ptr @ ; \ no underflow checking!

\ Optionally trap calls to ; :
: orig; postpone ; ;
defer ; immediate
' orig; is ;

\ two code fragments to be postponed:
: delta_depth depth + >d ;
: check_depth d> depth <> abort" stack depth mis-match" ;

\ The ; trap:
: check;
  postpone check_depth
  orig;
  ['] orig; is ;
;


\ The main word:
: check(
  -1 0 ( dir sum )
  begin
    parse-name

    2dup s" --" compare 0= if
      2drop
      swap negate swap
      false
    else
      s" )" compare if
        over +
        false \ continue loop
      else
        true \ exit loop
      then
    then
  until
  nip

  postpone literal
  postpone delta_depth

  ['] check; is ;

; immediate



\ A demonstration:
: myword check( a b -- c )
  over - \ oops, should have been "swap -"
;

1 2 myword \ causes an abort with appropriate stack trace

Anton Ertl
Jun 25, 2021, 12:57:47 PM
Jim Peterson <elk...@gmail.com> writes:
>On Friday, June 25, 2021 at 4:59:39 AM UTC-4, Anton Ertl wrote:
>> Why would you call such an implementation "highly optimized"?
>>
>> Anyway, let's see:
>>
>> Linux/FORTH (C) 2005 Peter Fälth Version 1.6-982-823 Compiled on 2017-12-03
>> [shockingly optimized code]
>>
>> VFX 4.72:
>> [more shockingly optimized code]
>>
>> SwiftForth i386-Linux 3.11.0 23-Feb-2021
>> [still more shockingly optimized code]
>>
>> iforth:
>> [lots more shockingly optimized code]

Only the LXF code was optimized, the others were the disassemblies of
the primitives. I tried the NIP definition on the others, and the
code was worse than their primitive implementation.

>I'm shocked. Also, I get totally different results, that are more along the
>lines of what I expect:
>
>
>0[sparkle:~/personal/src/repos/j3]1137: gforth

Gforth is not highly optimized; in particular, it does not optimize
locals. Also, the gforth engine is not the most efficient gforth
engine; gforth-fast is more efficient.

>I have no idea how to begin to write a system that would be able to interpret
>the given source into such tight machine language as you've shown.

You could read the papers referenced from
<http://www.complang.tuwien.ac.at/projects/rafts.html>. They don't
deal with locals, but you can model values in locals just like these
papers model values on the stack.

>The appeal
>of Forth, to me, was the simplicity with which I believed word definitions
>simply placed down a list of tokens (XTs) corresponding to words within the
>definition, for the most part (with some special ones for if/else, literals,
>etc).

If you want maximal simplicity, you will forego "highly optimized".
Maybe Peter Fälth can tell us how many lines of code his compiler has.

>> |As programmers learned C with Classes or C++, they lost the ability to
>> |quickly find the ``silly errors'' that creep into C programs through
>> |the lack of checking. Further, they failed to take the precautions
>> |against such silly errors that good C programmers take as a matter of
>> |course. After all, ``such errors don't happen in C with Classes.''
>> |Thus, as the frequency of run-time errors caused by uncaught argument
>> |type errors goes down, their seriousness and the time needed to find
>> |them goes up.
>
>I feel like you're trying to say that I should just be able to look at the
>code I've written and notice there is a problem with the stack.

No. Currently we test the words and look for stack depth errors, and
we have enough of those to find and debug them quickly. If the Forth
system warned us about most stack depth mistakes, we might become less
proficient at dealing with the rest (and EXECUTE ensures that there is
a non-empty rest).

Your run-time test is a good one: We need to continue writing tests to
benefit from it, and it also catches depth bugs in using EXECUTE (at
least those that are tested).

P Falth
Jun 25, 2021, 1:27:02 PM
For sure I can.

The compiler is 1350 lines, 40060 bytes.
To that we should also add the definitions of the words that use the compiler:
400 lines of code and 12756 bytes. These are like the primitives in a normal
Forth built with an assembler for the core words.

BR
Peter

dxforth
Jun 25, 2021, 10:28:40 PM
Nor would I wish it to. I enjoy Forth's 'closeness to the metal' and
the ability to control what is being compiled. I suppose demonstrating
that stack operations can be just as effective as locals is good, but who
will interpret it that way? For myself, I can't imagine writing NIP
in anything but assembler; stack if I'm desperate; locals never.