
string-stack.4th


hughag...@gmail.com

Mar 4, 2016, 12:42:53 AM
On Saturday, November 17, 2012 at 10:24:26 AM UTC-7, Elizabeth D. Rather wrote:
> On 11/17/12 2:56 AM, Alex McDonald wrote:
> > On Nov 17, 12:18 pm, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
> ...
> >> I have multiple stacks, with each stack associated to a particular
> >> data-type --- well, at least to a particular data-type size, as the
> >> double-wide stack will be for both double-precision integers and
> >> adr,len strings which are different data types but which both fit
> >> nicely on the double-wide stack. The point of doing this is to reduce
> >> the amount of stack-juggling that is done --- stack-juggling is a big
> >> part of why ANS-Forth is widely considered to be unreadable --- it is
> >> a major mess to have double-wide data intermingled with single-wide
> >> data on the parameter stack. I will also support local variables of
> >> course, but they won't be needed as a crutch to avoid stack-juggling
> >> as is often true in ANS-Forth --- with the multiple stacks, Forth
> >> purists who dislike locals will be able to write code without locals,
> >> while yet avoiding all of that ugly stack-juggling that typically
> >> plagues ANS-Forth.
> >
> > : foo s" abcd" over 1 ." the character " type ." is the first
> > character of " type ;
> > : hugh ??? ;
> >
> > Fill in the ??? please without introducing stack to stack moves.
> >
>
> The addr-len parameters for a string are not an example of "double-wide"
> data, they are a pair of single-cell items each of which has a meaning
> and potential use independent of the fact that together they describe a
> string.
>
> For example:
>
> S" Here is some text" PAD SWAP MOVE
>
> ...uses them individually but in a relevant way, as does Alex's example.
>
> Double-length integers are less likely to need their components treated
> individually, but even there it's awfully nice to be able to convert a
> double that you know (because of application knowledge) to a single by
> doing a DROP.
>
> Cheers,
> Elizabeth

I have written another file for the novice-package. This provides a stack of strings. My goal here was to get ANS-Forth up to the level of QBASIC in regard to text handling. I remember reading an article comparing Forth to QBASIC written in the early 1990s. The author said that Forth is a better language because it has features such as meta-compilation that allow the language to be extended, but that QBASIC was far superior in out-of-the-box capability, especially in regard to strings. Forth has always been like this --- a lot of potential, but little or no capability --- more thunder than lightning.

Using string-stack.4th, for example, we can do things like this:

: hugh ( -- )
s" abcd" >$
." the character: " dup$ 1 left$ .$
." is the first character of: " .$ ;

I do have stack-to-stack moves (the >$ above), despite the challenge to write the code without. I don't really understand why this is a problem. S" is the standard way to generate constant strings (note that the novice-package provides an S" that works in interpretive mode, and there can be more than one of them, all of which is an enhancement to ANS-Forth).

The strings are held on the heap. The code is efficient however, in that it minimizes the number of ALLOC and DEALLOC calls. In the above function, for example, the DUP$ does not allocate a new heap element.

I can provide string-stack.4th to anybody who wants it. It will also be in the next novice-package release, which is coming soon.

hughag...@gmail.com

Mar 4, 2016, 12:54:32 AM
Here is another fun quote:

On Thursday, November 15, 2012 at 5:36:18 PM UTC-7, Stephen Pelc wrote:
> For practical applications, all Forth standards to date have been
> inadequate for commercial level string handling.

Both of these quotes come from this thread:
https://groups.google.com/forum/#!topic/comp.lang.forth/IAyf4wHMTig%5B151-175%5D

Note that my string-stack.4th is written entirely in ANS-Forth, as is all of the novice-package. Stephen Pelc's quote is yet another example of the committee members telling the Forth community that what they want to do is impossible --- the committee members seem to be obsessed by the desire to restrict what can be done with Forth.

hughag...@gmail.com

Mar 4, 2016, 9:01:30 PM
On Thursday, March 3, 2016 at 10:42:53 PM UTC-7, hughag...@gmail.com wrote:
> On Saturday, November 17, 2012 at 10:24:26 AM UTC-7, Elizabeth D. Rather wrote:
> > The addr-len parameters for a string are not an example of "double-wide"
> > data, they are a pair of single-cell items each of which has a meaning
> > and potential use independent of the fact that together they describe a
> > string.
> >
> > For example:
> >
> > S" Here is some text" PAD SWAP MOVE
> >
> > ...uses them individually but in a relevant way, as does Alex's example.

My string-stack.4th could easily be rewritten to use UTF-8 or UTF-16 or any other character-coding system. I mention this because Ching mentioned the subject of Chinese text to me, which made me think about UTF-8.

The advantage of abstracting away the concept of a string, is that the internal definition of a string can be changed while the API remains the same. I could write a version of string-stack.4th that worked with UTF-8 and most of the programs using string-stack.4th would continue to work unchanged using either version. The only problems that would arise are when the strings on the string-stack are transferred over to the data-stack as an adr/cnt pair by $> --- but this should not be a problem because $> is only used when strings have to be stored in data-structures --- if the string is going to be stored in a data-structure and later put back on the string-stack, the adr/cnt will be fine, but the adr/cnt is only a problem when the programmer accesses the string assuming carnal knowledge about the string (assuming that it is ascii).

I'll make a note in the documentation explaining this issue --- tell the users that if they use $> to obtain an adr/cnt pair then they should not do anything with the adr/cnt pair except store it somewhere and eventually use >$ to put it back on the string-stack --- they should not access the string's chars manually because doing so involves a lot of assumptions about the string that aren't necessarily true.
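To illustrate why the cnt of an adr/cnt pair says little about a UTF-8 string, here is a minimal sketch of a word that counts characters rather than bytes. The name `utf8-chars` is made up for this example and is not part of string-stack.4th. It assumes well-formed UTF-8 and a system where one char occupies one address unit.

```forth
\ Count UTF-8 code points in a byte string (sketch, hypothetical name).
\ A continuation byte has the bit pattern 10xxxxxx; every other byte
\ starts a new character.
: utf8-chars ( c-addr u -- n )
  0 rot rot              \ n c-addr u
  over + swap            \ n end start
  ?do
    i c@ 192 and 128 <>  \ true unless this is a continuation byte
    if 1+ then
  loop ;
```

For plain ascii text the result equals the cnt. For anything else it is smaller, which is why code that indexes into the adr/cnt pair by character position can go wrong.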

Another problem with UTF-8 (or UTF-16 or any other UTF format) is that UCASE$ and LCASE$ are quite difficult to write, as there is no simple way to distinguish uppercase from lowercase in the myriad languages supported (some languages don't even have uppercase and lowercase; I don't think Chinese does). This is another issue I will have to mention in the documentation --- if people want their programs to work with UTF-8 or any encoding other than ascii, they have to avoid UCASE$ and LCASE$ as well as all of the case-insensitive comparison functions.

Mark Wills

Mar 5, 2016, 5:52:48 AM
Good job. Pleased to see someone else working with string stacks.

hughag...@gmail.com

Mar 5, 2016, 6:17:20 PM
On Saturday, March 5, 2016 at 3:52:48 AM UTC-7, Mark Wills wrote:
> Good job. Pleased to see someone else working with string stacks.

For the record, I was inspired to write string-stack.4th by Mark Wills (this is mentioned in my documentation). He has Turbo Forth code and an ANS-Forth implementation as well. My internal workings are different, but the general idea of a stack for strings came from him. To a large extent my API is the same as his, but there are some differences --- the API is largely derived from QBASIC with MID$ and +$ etc.

foxaudio...@gmail.com

Mar 6, 2016, 11:50:23 AM
And for the record, I wrote these kinds of concepts in the 1980s for TI-Forth for the TI-99 as a recovering BASIC programmer and submitted an article to Dr. Dobb's but it was declined. I re-wrote the code for HS/Forth in the 1990s for use in real world applications. In the 2000s somewhere I shared that code with Mark after I saw his cool work on the TI-99. (hard to forget your first machine) Mark re-wrote string stack code based on these concepts for Turbo Forth putting them back into the TI-99. That's kind of amusing. :-)


BF

foxaudio...@gmail.com

Mar 6, 2016, 1:54:30 PM
I stripped my old string stack code down to what I feel is truly useful and also adapted it closer to current Forth thinking on the use of stack strings.
It's not totally compatible with that line of thinking but this code makes my old HS/Forth hobby system better able to compile other people's code today and still gives me my crutch. :-)
This version also compiles on Win32Forth. I think it is pretty standard code.

\ strings.HSF slightly new approach to my old string system

\ Wil Baden's toolbelt
: c+! ( n addr -- ) dup>r c@ + r> c! ;
: append-char ( char string -- ) dup>r count dup 1+ r> c! + c! ;
: +place ( addr1 n addr2 -- ) 2dup 2>r count + swap move 2r> c+! ;
: place ( addr1 n addr2 -- ) 2dup 2>r 1+ swap move 2r> c! ;

HEX
\ build a string stack/array to hold input strings
100 constant ss-width
ss-width 1- constant max$len

: stack-base here ss-width + ;

variable $depth

DECIMAL
: LEN ( $1 -- length) c@ ;
: ITEMS ( -- n) $depth @ ;
: new: ( -- ) 1 $depth +! ;
: drop$ ( -- ) -1 $depth +! ;
: ]stk$ ( ndx -- addr) ss-width * stack-base + ;
: TOP$ ( -- addr) $depth @ ]stk$ ;
: collapse ( -- ) $depth off ;
: >new:top$ ( addr len -- top$) new: top$ dup >r place r> ;
: move$ ( $1 $2 -- ) >r COUNT R> PLACE ;
: push$ ( $ -- ) new: top$ move$ ;

\ used primitives to build counted string functions
: left$ ( $addr #char -- top$ ) >r count drop r> >new:top$ ;
: right$ ( $addr1 n -- $addr2) >r count r> over min /string >new:top$ ;
: +$ ( $1 $2 -- top$ ) SWAP push$ count TOP$ +PLACE TOP$ ;
: mid$ ( adr$ start len -- top$) new: top$ 2>r + 1+ 2r> place top$ ;
: pos$ ( adr$ char -- position ) >r count 2dup r> scan drop nip swap - ;

: compare$ ( $1 $2 -- -n:0:n ) count rot count compare ;
: =$ ( $1 $2 -- flag ) compare$ 0= ;
: +char ( $ char -- $) over append-char ;

: :="" ( caddr -- ) 0 swap c! ;
: :=" ( $addr -- <text> ) [CHAR] " word swap move$ ;

\ these 2 words provide cheap automatic garbage collection
: $. ( $ -- ) count type collapse ;
: $! ( $1 $2 -- ) move$ collapse ;

: split ( str len char -- str1 len1 str2 len2 )
>r 2dup r> scan 2swap 2 pick - ;

: snip ( str len -- ) 1- swap 1+ swap ;
: separate ( str len char -- str1 len1 str2 len2 ) split 2swap snip 2swap ;

hughag...@gmail.com

Mar 6, 2016, 5:08:15 PM
On Sunday, March 6, 2016 at 11:54:30 AM UTC-7, foxaudio...@gmail.com wrote:
> On Sunday, March 6, 2016 at 11:50:23 AM UTC-5, foxaudio...@gmail.com wrote:
> > On Saturday, March 5, 2016 at 6:17:20 PM UTC-5, hughag...@gmail.com wrote:
> > > On Saturday, March 5, 2016 at 3:52:48 AM UTC-7, Mark Wills wrote:
> > > > Good job. Pleased to see someone else working with string stacks.
> > >
> > > For the record, I was inspired to write string-stack.4th by Mark Wills (this is mentioned in my documentation). He has Turbo Forth code and an ANS-Forth implementation as well. My internal workings are different, but the general idea of a stack for strings came from him. To a large extent my API is the same as his, but there are some differences --- the API is largely derived from QBASIC with MID$ and +$ etc.
> >
> > And for the record, I wrote these kinds of concepts in the 1980s for TI-Forth for the TI-99 as a recovering BASIC programmer and submitted an article to Dr. Dobb's but it was declined. I re-wrote the code for HS/Forth in the 1990s for use in real world applications. In the 2000s somewhere I shared that code with Mark after I saw his cool work on the TI-99. (hard to forget your first machine) Mark re-wrote string stack code based on these concepts for Turbo Forth putting them back into the TI-99. That's kind of amusing. :-)

When Mark discussed this topic with me, he recommended that I hold the strings in the heap. This is what I did. My string-stack is only 2 cells wide (the adr/cnt pair). My strings are not limited to 255 chars as yours are.

My own innovation was to work out a method in which I minimize the calls to ALLOC and DEALLOC on the assumption that these are the speed-killers. I mentioned in my first post that DUP$ does not call ALLOC but all it does is make a copy of the adr/cnt of the string that it is duplicating. The LEFT$ just modifies this adr/cnt (adjusts the cnt). The .$ just types out the string and drops the adr/cnt from the string-stack. The next .$ also types out the string but it has to DEALLOC the string before it drops the adr/cnt from the string-stack.

Most likely, my code is going to be faster than yours. You don't list DUP$ in your code, but if you did then it would call MOVE$ which copies the string --- copying a string is also a speed-killer that I avoid.

Anyway, I'm not saying that I invented the concept of the string-stack. I got the idea from Mark Wills. He apparently got the idea from you. The idea may have been independently invented several times (Forth has a stack for cells so a stack for strings is an obvious next step).

All I'm saying is that my string-stack.4th is the most efficient.

HAA

Mar 6, 2016, 6:32:49 PM
hughag...@gmail.com wrote:
> ...
> Anyway, I'm not saying that I invented the concept of the string-stack. I got the idea
> from Mark Wills. He apparently got the idea from you. The idea may have been
> independently invented several times (Forth has a stack for cells so a stack for
> strings is an obvious next step).

The idea for a string stack has been around a long time e.g. "Stacking Strings
in Forth" J.Cassady, BYTE 2/1981.



foxaudio...@gmail.com

Mar 6, 2016, 7:45:39 PM
I am pretty certain there was an article in Forth Dimensions some time in the mid '80s that got me thinking in that fashion.

My goal was to be able to write string expressions without worrying about intermediate results and then quickly collapsing the string stack after getting the result. So the string words leave an address behind for the next function to consume. That's the only thing remotely innovative.

It works ok. I would be very interested in seeing how it compares in speed to using Allocate et al...

Hugh can you run some tests when you have some time?

I implemented Allocate for my own amusement and it seems much more complicated than simply grabbing some free space quickly and then setting a variable to 0 when you are finished.

However the efficiency of cutting up stack strings is very impressive. Beats copying or allocating every time I should think.

BF

hughag...@gmail.com

Mar 6, 2016, 10:09:14 PM
On Sunday, March 6, 2016 at 5:45:39 PM UTC-7, foxaudio...@gmail.com wrote:
> It works ok. I would be very interested in seeing how it compares in speed to using Allocate et al...
>
> Hugh can you run some tests when you have some time?

I would like to get a program that does a lot of string-handling --- perhaps a port of an old QBASIC program --- and use that as a demonstration of my string-stack.4th. A program like this might also make for a good benchmark. Any suggestions would be appreciated.

> I implemented Allocate for my own amusement and it seems much more complicated than simply grabbing some free space quickly and then setting a variable to 0 when you are finished.

Allocate varies a lot in speed from one Forth system to another. This also depends upon the OS underneath. For the most part, allocate is pretty slow. It is necessary to use the heap though, if you are going to support strings of any size --- I think this is important, as that old 255 char limit is something that we want to put in the past and forget about.

> However the efficiency of cutting up stack strings is very impressive. Beats copying or allocating every time I should think.

I have two kinds of strings on the string-stack, which are "unique" and "derivative." A unique string is one that is on the heap. A derivative string is one that is inside of a unique string. A derivative string is indicated by having a negative cnt value. I have these kinds of words:

1.) JUGGLERS --- These include SWAP$ etc. and they just move items around on the string-stack.

2.) DUPLICATORS --- These include DUP$ etc. and they make a derivative item.

3.) CONSUMERS --- These include .$ etc. and they consume an item. If the item is a derivative it is just dropped. If the item is a unique, then FIX-DERIVATIVES is called first (this searches the string-stack for derivatives that are inside of this unique and changes them into uniques) and then it is dropped.

4.) MUTATORS --- These include UCASE$ etc. and they modify an item. If the item is a derivative it gets changed into a unique first, and if the item is a unique then FIX-DERIVATIVES gets called first, then the item is modified.

Because the DUPLICATORS put the derivative on top of the string-stack, you generally have derivatives above their uniques on the string-stack. This is good because a CONSUMER or MUTATOR of a derivative is faster than of a unique. For the most part, we can avoid doing ALLOC and DEALLOC because we are working with derivatives. This plan gets messed up when SWAP$ or ROT$ etc. are used and we end up with a unique above its derivatives on the string-stack --- then when we do a CONSUMER or MUTATOR of the unique we end up having to call FIX-DERIVATIVES which is the speed-killer --- this should be pretty rare in most programs however, because SWAP$ and ROT$ etc. tend to be pretty rare.
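The unique/derivative idea can be sketched in a few lines of ANS-Forth. This is only an illustration of the scheme described above, not the code in string-stack.4th. The helper names `str-stack`, `str-depth`, `str-top`, `push$`, `pop$` and `max-strings` are invented for the example. FIX-DERIVATIVES is left out entirely, so a unique must not be consumed while one of its derivatives is still on the string-stack. The sketch also assumes non-empty strings and counts of at least 1, because a cnt of zero cannot carry the negative mark.

```forth
\ Sketch only: entries are adr/cnt pairs; a negative cnt marks a derivative.
16 constant max-strings                 \ hypothetical capacity, unchecked
create str-stack  max-strings 2* cells allot
variable str-depth   0 str-depth !      \ number of entries in use

: str-top ( -- a-addr )  str-depth @ 1- 2* cells str-stack + ;
: push$ ( adr cnt -- )   1 str-depth +!  str-top 2! ;
: pop$  ( -- adr cnt )   str-top 2@  -1 str-depth +! ;

: >$ ( adr cnt -- )      \ copy a data-stack string to the heap: a unique
  dup allocate throw     \ adr cnt new
  swap 2dup 2>r          \ adr new cnt    R: new cnt
  move  2r> push$ ;

: dup$ ( -- )            \ derivative: same adr, negative cnt, no ALLOCATE
  str-top 2@  abs negate  push$ ;

: left$ ( n -- )         \ keep the first n chars of the top entry, n >= 1
  pop$  dup 0< >r        \ n adr cnt      R: derivative?
  abs rot min            \ adr cnt'
  r> if negate then  push$ ;

: drop$ ( -- )           \ only a unique owns heap memory
  pop$  0< if drop else free throw then ;

: .$ ( -- )  str-top 2@ abs type  drop$ ;

: demo ( -- )  s" abcd" >$  dup$ 1 left$ .$  space  .$ ;
```

Running `demo` makes one ALLOCATE call and one FREE call. The DUP$, LEFT$ and the first .$ only touch the adr/cnt pair on the string-stack.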

You asked above for a speed benchmark. This is somewhat difficult because it depends upon what a typical program is like (how much SWAP$ and ROT$ etc. are used). Since we don't actually have any programs at this time, it is somewhat early to say anything about what a typical program would be like.

BTW: My ultimate goal with all of this is to write a program that can translate Esperanto into English or Spanish or whatever. I thought of this way back in 1984 when I was learning Esperanto. It is too difficult for a program to translate directly from Spanish to English or vice-versa because those languages are difficult to figure out (this was true 30 years ago and is still true). By making the human write in Esperanto however, most of the work can be dumped on the human. Translating from Esperanto to English or Spanish should be possible. All the Esperanto words have a suffix that tells you exactly what part of speech they are. In string-stack.4th I have functions for working with prefixes and suffixes that I provided specifically for taking apart Esperanto words. The idea is that the front-end of the program builds a lexical tree for the Esperanto sentence. There would be a back-end of the program specific to each output language that converts the lexical tree into English, Spanish, etc.

BTW: It was Marcos Cruz who got me thinking about Esperanto --- I haven't given any thought to that subject in almost 30 years --- at one time though, I did know a little about the subject.

Albert van der Horst

Mar 7, 2016, 7:27:32 AM
hughag...@gmail.com writes:

>On Sunday, March 6, 2016 at 5:45:39 PM UTC-7, foxaudio...@gmail.com wrote:
>> It works ok. I would be very interested in seeing how it compares in speed
>> to using Allocate et al...
>>
>> Hugh can you run some tests when you have some time?

>I would like to get a program that does a lot of string-handling --- perhaps
>a port of an old QBASIC program --- and use that as a demonstration of my
>string-stack.4th. A program like this might also make for a good benchmark.
>Any suggestions would be appreciated.

>> I implemented Allocate for my own amusement and it seems much more
>> complicated than simply grabbing some free space quickly and then setting a
>> variable to 0 when you are finished.

>Allocate varies a lot in speed from one Forth system to another. This also
>depends upon the OS underneath. For the most part, allocate is pretty slow.
>It is necessary to use the heap though, if you are going to support strings
>of any size --- I think this is important, as that old 255 char limit is
>something that we want to put in the past and forget about.

>> However the efficiency of cutting up stack strings is very impressive.
>> Beats copying or allocating every time I should think.

>I have two kinds of strings on the string-stack, which are "unique" and
>"derivative." A unique string is one that is on the heap. A derivative string
>is one that is inside of a unique string. A derivative string is indicated
>by having a negative cnt value. I have these kinds of words:

That is a neat trick, and I like the following analysis, which could be
a basis for a good practical package.

>1.) JUGGLERS --- These include SWAP$ etc. and they just move items around
>on the string-stack.

>2.) DUPLICATORS --- These include DUP$ etc. and they make a derivative item.

>3.) CONSUMERS --- These include .$ etc. and they consume an item. If the
>item is a derivative it is just dropped. If the item is a unique, then
>FIX-DERIVATIVES is called first (this searches the string-stack for
>derivatives that are inside of this unique and changes them into uniques)
>and then it is dropped.

>4.) MUTATORS --- These include UCASE$ etc. and they modify an item. If the
>item is a derivative it gets changed into a unique first, and if the item
>is a unique then FIX-DERIVATIVES gets called first, then the item is
>modified.

>Because the DUPLICATORS put the derivative on top of the string-stack, you
>generally have derivatives above their uniques on the string-stack. This is
>good because a CONSUMER or MUTATOR of a derivative is faster than of a
>unique. For the most part, we can avoid doing ALLOC and DEALLOC because we
>are working with derivatives. This plan gets messed up when SWAP$ or ROT$
>etc. are used and we end up with a unique above its derivatives on the
>string-stack --- then when we do a CONSUMER or MUTATOR of the unique we end
>up having to call FIX-DERIVATIVES which is the speed-killer --- this should
>be pretty rare in most programs however, because SWAP$ and ROT$ etc. tend to
>be pretty rare.

I would like to extend the analysis by the following.
If you swap derived strings there is no problem. Only if you juggle
a derived string under a unique string there may be problems. In both
SWAP$ and ROT$ that can only be the case if the new string on top is
unique. It suffices to replace the one (swap) or two (rot) strings
by uniques if they are derived.

>You asked above for a speed benchmark. This is somewhat difficult because
>it depends upon what a typical program is like (how much SWAP$ and ROT$ etc.
>are used). Since we don't actually have any programs at this time, it is
>somewhat early to say anything about what a typical program would be like.

The allocate contained in ciforth would be suitable, as it uses the first
free space available. You can just copy it from lina's forth.lab, or mail
me for a more heavily commented version.

>BTW: My ultimate goal with all of this is to write a program that can
>translate Esperanto into English or Spanish or whatever. I thought of this
>way back in 1984 when I was learning Esperanto. It is too difficult for a
>program to translate directly from Spanish to English or vice-versa because
>those languages are difficult to figure out (this was true 30 years ago and
>is still true). By making the human write in Esperanto however, most of the
>work can be dumped on the human. Translating from Esperanto to English or
>Spanish should be possible. All the Esperanto words have a suffix that tells
>you exactly what part of speech they are. In string-stack.4th I have
>functions for working with prefixes and suffixes that I provided
>specifically for taking apart Esperanto words. The idea is that the
>front-end of the program builds a lexical tree for the Esperanto sentence.
>There would be a back-end of the program specific to each output language
>that converts the lexical tree into English, Spanish, etc.

That is interesting. A colleague of mine, Toon Witkam, had this idea of translating
into Esperanto to disambiguate natural language. It turned into a project
for the EU and earned him a doctorate; later he became a professor.
The EU didn't really like it too much. At the time it was slow (the '80s),
but mostly they didn't want too clear language in their treatises.
That was a multi-man-year project. I don't see you pulling off a similar project
without a team.

>BTW: It was Marcos Cruz who got me thinking about Esperanto --- I haven't
>given any thought to that subject in almost 30 years --- at one time though,
>I did know a little about the subject.

Regards, Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

hughag...@gmail.com

Mar 8, 2016, 1:45:01 AM
On Monday, March 7, 2016 at 5:27:32 AM UTC-7, Albert van der Horst wrote:
> I would like to extend the analysis by the following.
> If you swap derived strings there is no problem. Only if you juggle
> a derived string under a unique string there may be problems. In both
> SWAP$ and ROT$ that can only be the case if the new string on top is
> unique. It suffices to replace the one (swap) or two (rot) strings
> by uniques if they are derived.

Well, that is a possibility. You are saying to have the juggler (SWAP$ ROT$ etc.) fix the problem. By comparison, I am leaving the problem until it really has to be fixed by a consumer or mutator.

I think that you are not considering what happens with a function such as +$ that takes two parameters rather than one. In this case, having a unique above a derivative is not a problem anyway --- +$ handles all four permutations efficiently.

Note that +$ is in my EXCEPTIONS group that don't fit in the other categories. Note also that I have a comment in my source-code mentioning that +$ was one of the most complicated functions I've ever written as a Forth programmer --- efficiency has its price!

Mark Wills

Mar 8, 2016, 2:29:08 PM
There are various attempts at string stacks in the various editions
of Forth Dimensions over the years.

hughag...@gmail.com

Mar 10, 2016, 1:28:00 AM
On Monday, March 7, 2016 at 5:27:32 AM UTC-7, Albert van der Horst wrote:
> hughag...@gmail.com writes:
> >BTW: My ultimate goal with all of this is to write a program that can
> >translate Esperanto into English or Spanish or whatever. I thought of this
> >way back in 1984 when I was learning Esperanto. It is too difficult for a
> >program to translate directly from Spanish to English or vice-versa because
> >those languages are difficult to figure out (this was true 30 years ago and
> >is still true). By making the human write in Esperanto however, most of the
> >work can be dumped on the human. Translating from Esperanto to English or
> >Spanish should be possible. All the Esperanto words have a suffix that tells
> >you exactly what part of speech they are. In string-stack.4th I have
> >functions for working with prefixes and suffixes that I provided
> >specifically for taking apart Esperanto words. The idea is that the
> >front-end of the program builds a lexical tree for the Esperanto sentence.
> >There would be a back-end of the program specific to each output language
> >that converts the lexical tree into English, Spanish, etc.
>
> That is interesting. A colleague of mine, Toon Witkam, had this idea of translating
> into Esperanto to disambiguate natural language. It turned into a project
> for the EU and earned him a doctorate; later he became a professor.
> The EU didn't really like it too much. At the time it was slow (the '80s),
> but mostly they didn't want too clear language in their treatises.
> That was a multi-man-year project. I don't see you pulling off a similar project
> without a team.

I agree that I need a team --- I think that one dozen beautiful women would be adequate --- I'll do the programming!

In the meantime, I upgraded the capability of string-stack.4th a bit. I can do this now:

s" foo:=bar" >$ ok
s" :=" >$ infix$ . -1 ok
extract$ ok
.$ bar ok
.$ foo ok

INFIX$ returns a flag indicating if it found the pattern in the string. The dot gave me -1 (TRUE) so I then used EXTRACT$ to remove the infix and return the prefix and suffix that were surrounding the infix. So, I parsed a little bit of Pascal code there (in Pascal, := is an assignment). I think my string-stack.4th could be used for lexical analysis just as well as Lex is used in the C world.

Coos Haak

Mar 10, 2016, 5:05:28 AM
On Wed, 9 Mar 2016 22:27:58 -0800 (PST), hughag...@gmail.com wrote:
Using strings on a heap, not a stack:

want split-text ok
"foo:=bar" ok
":=" ok
split-text ok
type foo ok
type bar ok

Regards, Coos

hughag...@gmail.com

Mar 10, 2016, 12:28:04 PM
Where is this SPLIT-TEXT defined? Some code-library?

franck....@gmail.com

Mar 10, 2016, 1:42:43 PM
On Monday, March 7, 2016 at 04:09:14 UTC+1, hughag...@gmail.com wrote:

> 1.) JUGGLERS --- These include SWAP$ etc. and they just move items around on the string-stack.
>
> 2.) DUPLICATORS --- These include DUP$ etc. and they make a derivative item.
>
> 3.) CONSUMERS --- These include .$ etc. and they consume an item. If the item is a derivative it is just dropped. If the item is a unique, then FIX-DERIVATIVES is called first (this searches the string-stack for derivatives that are inside of this unique and changes them into uniques) and then it is dropped.
>
> 4.) MUTATORS --- These include UCASE$ etc. and they modify an item. If the item is a derivative it gets changed into a unique first, and if the item is a unique then FIX-DERIVATIVES gets called first, then the item is modified.
>

This is an interesting memory-management scheme. I wonder how it compares to
memory management using a garbage collector.

Did you run any benchmark tests?

Franck

Coos Haak

Mar 10, 2016, 6:10:41 PM
On Thu, 10 Mar 2016 09:28:02 -0800 (PST), hughag...@gmail.com wrote:

<snip>
> Where is this SPLIT-TEXT defined? Some code-library?

Straight high level, standard code:

: SPLIT-TEXT ( c-addr1 u1 c-addr2 u2 -- c-addr3 u3 c-addr1 u4 )
>r >r 2dup r> r@ search
if r> over >r /string 2swap r> -
else 2drop "" 2swap rdrop
then
;

Where "" could be defined as : "" 0 0 ;
You might use 2SWAP afterwards if the second part must be on top.
I use this to generate html from my glossary files.
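On a system that has neither `""` nor RDROP, the same word can be sketched with standard words only. This is a transcription of the definition above with `0 0` and `r> drop` substituted, not a different algorithm. It assumes SEARCH and /STRING from the String word set are loaded.

```forth
\ SPLIT-TEXT using only standard words (sketch).
\ On a match: c-addr3 u3 is the text after the separator and
\ c-addr1 u4 is the text before it. On no match: 0 0 c-addr1 u1.
: split-text ( c-addr1 u1 c-addr2 u2 -- c-addr3 u3 c-addr1 u4 )
  >r >r  2dup  r> r@  search     \ a1 u1 a' u' flag    R: u2
  if    r> over >r               \ remember the length of the match tail
        /string                  \ step over the separator
        2swap  r> -              \ prefix length = u1 - tail length
  else  2drop  0 0  2swap  r> drop
  then ;
```

Both results still point into the original string, so nothing is copied or allocated.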

Regards, Coos

WJ

Mar 11, 2016, 1:27:43 AM
hughag...@gmail.com wrote:

> In the meantime, I upgraded the capability of string-stack.4th
> a bit. I can do this now:
>
> s" foo:=bar" >$ ok
> s" :=" >$ infix$ . -1 ok
> extract$ ok
> .$ bar ok
> .$ foo ok
>
> INFIX$ returns a flag indicating if it found the pattern in
> the string. The dot gave me -1 (TRUE) so I then used EXTRACT$
> to remove the infix and return the prefix and suffix that were
> surrounding the infix. So, I parsed a little bit of Pascal
> code there (in Pascal, := is an assignment).

Don't forget that ":=" can be surrounded by whitespace.
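In Forth terms, the trimming step might look like the sketch below. The names `-leading` and `trim` are made up for this example. It works on an adr/cnt pair on the data stack and assumes -TRAILING and /STRING from the String word set are available. Only spaces are treated as whitespace.

```forth
\ Strip blanks from both ends of a string without copying it (sketch).
: -leading ( c-addr u -- c-addr' u' )
  begin
    dup if over c@ bl = else false then   \ non-empty and starts with a blank?
  while
    1 /string                             \ step past the blank
  repeat ;

: trim ( c-addr u -- c-addr' u' )  -leading -trailing ;
```

Applied to each half after splitting on the separator, this gives the same result as the `strip` and `String.trim` calls in the examples that follow.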

Ruby:

" foo := bar ".split(":=").map(&:strip)
===>
["foo", "bar"]


OCaml:

#load "str.cma";;

List.map String.trim (Str.split (Str.regexp ":=") " foo := bar ") ;;
===>
["foo"; "bar"]

If you want a considerable amount of string-manipulation power,
you'll have to implement a regular-expression engine.

Example: find each integer in a string and increment it.

Ruby:

"9, 10, 11 I see, now that 1983 has come and gone".
gsub(/\d+/){|s| s.to_i + 1}
===>
"10, 11, 12 I see, now that 1984 has come and gone"

OCaml:

#load "str.cma";;

Str.global_substitute (Str.regexp "[0-9]+")
(fun s -> string_of_int ((int_of_string (Str.matched_string s)) + 1))
"9, 10, 11 I see, now that 1983 has come and gone";;
===>
"10, 11, 12 I see, now that 1984 has come and gone"


Instead of splitting a string, shatter it. In other words,
keep the separating strings instead of discarding them.

Ruby:

"foo := bar; write 8*9; end".split(/([^\w]+)/)
===>
["foo", " := ", "bar", "; ", "write", " ", "8", "*", "9", "; ", "end"]


OCaml:

#load "str.cma";;
open Str;;

Str.full_split (Str.regexp "[^a-z0-9]+") "foo := bar; write 8*9; end" ;;
===>
[Text "foo"; Delim " := "; Text "bar"; Delim "; "; Text "write"; Delim " ";
Text "8"; Delim "*"; Text "9"; Delim "; "; Text "end"]


hughag...@gmail.com

Mar 11, 2016, 1:42:12 AM
On Thursday, March 10, 2016 at 4:10:41 PM UTC-7, Coos Haak wrote:
> On Thu, 10 Mar 2016 09:28:02 -0800 (PST), hughag...@gmail.com wrote:
>
> <snip>
> > Where is this SPLIT-TEXT defined? Some code-library?
>
> Straight high level, standard code:
>
> : SPLIT-TEXT ( c-addr1 u1 c-addr2 u2 -- c-addr3 u3 c-addr1 u4 )
> >r >r 2dup r> r@ search
> if r> over >r /string 2swap r> -
> else 2drop "" 2swap rdrop
> then
> ;
>
> Where "" could be defined as : "" 0 0 ;

The problem I notice immediately is that c-addr3 points to inside of a memory-block on the heap --- if you use FREE on this, it will fail --- how do you keep track of which strings can be given to FREE and which can not?

hughag...@gmail.com

Mar 11, 2016, 2:00:08 AM
Here is some code run using .S$ to show the string-stack throughout:

s" foo:=bar" >$ dup$ ok
.s$
STRING STACK:
derivative: |foo:=bar|
unique: |foo:=bar| ok
s" :=" >$ infix$ . -1 ok
.s$
STRING STACK:
derivative: |:=|
derivative: |foo:=bar|
unique: |foo:=bar| ok
extract$ ok
.s$
STRING STACK:
derivative: |bar|
derivative: |foo|
unique: |foo:=bar| ok
third$ 2@ second$ 2@ first$ 2@ .s
DATA STACK
top
-3 FFFF:FFFD
67355897 0403:C4F9
-3 FFFF:FFFD
67355892 0403:C4F4
8 0000:0008
67355892 0403:C4F4
ok-6

Notice that only the original string s" foo:=bar" is put on the heap (also the pattern string s" :=" but it is freed by INFIX$). The |:=| string returned by INFIX$ is not the pattern string given to it, but instead is a derivative string inside of the original string. From there to the very end, nothing more is put on the heap; everything that I'm working with is a derivative string.

At the very end I show what the three strings are, and only the original string is a unique, but the other two strings (the prefix and suffix) are derivatives. I expect that this is faster than a system in which every string is put on the heap and then freed later, and significantly faster than a system in which every string is put on the heap and then GC'd later --- I'm just manipulating adr/cnt pairs and not messing with the heap at all --- that has got to be fast!

If I do need to convert a derivative into a unique however, this gets done automatically. Here is some more code:

3drop 3drop ok
ucase$ ok
.s$
STRING STACK:
unique: |BAR|
derivative: |foo|
unique: |foo:=bar| ok
third$ 2@ second$ 2@ first$ 2@ .s
DATA STACK
top
3 0000:0003
67355916 0403:C50C
-3 FFFF:FFFD
67355892 0403:C4F4
8 0000:0008
67355892 0403:C4F4
ok-6

UCASE$ is a mutator, which means that if it is given a derivative (as in this case) it converts it into a unique before mutating it --- if it is given a unique it calls FIX-DERIVATIVES to fix any derivatives of the string that may be on the string-stack before mutating it --- so, I avoid messing with the heap as much as possible, but when it is necessary then it is done automatically.
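
The bookkeeping described above can be modelled outside Forth. The following Python sketch is not the code in string-stack.4th; the class and method names (Entry, StringStack, narrow, dump) are invented for the illustration, and a boolean flag stands in for the negative count that marks a derivative in the transcripts. It shows only the two mutator rules: a derivative is copied out before it is changed, and a unique detaches its derivatives before it is changed.

```python
class Entry:
    """One string-stack item: a window (offset, length) onto a byte buffer."""

    def __init__(self, buf, offset, length, unique):
        self.buf = buf          # shared by a unique and all of its derivatives
        self.offset = offset
        self.length = length
        self.unique = unique    # stands in for the sign of the count cell

    def text(self):
        return self.buf[self.offset:self.offset + self.length].decode("ascii")


class StringStack:
    def __init__(self):
        self.items = []         # the last element is the top of the stack

    def push_unique(self, s):
        data = bytearray(s, "ascii")
        self.items.append(Entry(data, 0, len(data), True))

    def dup(self):
        """Push a derivative of the top item; no characters are copied."""
        top = self.items[-1]
        self.items.append(Entry(top.buf, top.offset, top.length, False))

    def narrow(self, start, count):
        """Shrink the top derivative to a substring of itself."""
        top = self.items[-1]
        if top.unique or start < 0 or count < 0 or start + count > top.length:
            raise ValueError("narrow needs a derivative and an in-range slice")
        top.offset += start
        top.length = count

    def ucase(self, depth=0):
        """Mutator: upper-case the item `depth` places below the top."""
        item = self.items[-1 - depth]
        if item.unique:
            self._fix_derivatives(item)   # detach anything still pointing here
        else:
            self._make_unique(item)       # give the derivative its own copy
        item.buf[:] = item.buf.upper()

    def _fix_derivatives(self, owner):
        for other in self.items:
            if not other.unique and other.buf is owner.buf:
                self._make_unique(other)

    @staticmethod
    def _make_unique(entry):
        start, end = entry.offset, entry.offset + entry.length
        entry.buf = bytearray(entry.buf[start:end])
        entry.offset = 0
        entry.unique = True

    def dump(self):
        """List the stack from the top down, like .S$ does."""
        return [(e.text(), "unique" if e.unique else "derivative")
                for e in reversed(self.items)]


st = StringStack()
st.push_unique("foo:=bar")
st.dup()
st.narrow(5, 3)      # top is now the derivative |bar|
st.ucase()           # copies |bar| out, then mutates the copy
print(st.dump())     # [('BAR', 'unique'), ('foo:=bar', 'unique')]
```

The model leaves out what happens when a unique is dropped while derivatives of it remain, which the real package also has to handle.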

Anyway --- no --- I haven't run any benchmark tests. For one thing, I'm not familiar enough with oForth to write the oForth code --- also, it depends upon which ANS-Forth compiler is used (VFX generates pretty fast code, but SwiftForth generates abysmally bloated and slow code), so a benchmark against oForth tests the compiler more than the algorithm.

Coos Haak

Mar 11, 2016, 6:29:18 AM
On Thu, 10 Mar 2016 22:42:09 -0800 (PST), hughag...@gmail.com wrote:
No problem, I can type the string or concatenate two strings.

do-debug ok
( ) "foo:=bar" ":=" split-text ok
( "bar" "foo" ) ":=" strcat ok
( "bar" "foo:=" ) 2swap ok
( "foo:=" "bar" ) strcat ok
( "foo:=bar" ) type foo:=bar ok
The strings live in a circular heap.
strcat ensures (by strdup) that the first string sits at the string
pointer, so that it can add the second to its end.
This also works for strings outside the heap, e.g. in colon definitions.
It goes without saying that the length of each string is on the
stack, not stored between the strings as with the old counted strings.

Regards, Coos

JennyB

Mar 11, 2016, 6:47:06 AM
Let's see how this might be done without a heap:

String space grows linearly, like dictionary space.
Unique strings are written once and once only into string space, and a reference pushed on the string stack.
String-handling words make no distinction between unique and derived arguments.
They consume the string stack but do not reduce string space.

String space is only reclaimed when the string stack is empty. At that point it is up to the program to ensure that any unique string that is still required has already been copied elsewhere. A mechanism for named string spaces looks possible, similar to MARKER or Anton's Regions.

That would work, except that for the most part, a unique string /is/ unique; you want it once and once only. It's fairly easy to check whether the string you are about to consume is in fact the last string in string space and remove it if it is:

S" FOO" $" BAR" 2SWAP $push $push $. $.
- easy to leave nothing in string space

S" FOO" $" BAR" $push $push $swap $. $.
- not so easy.
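
To make the two cases concrete, here is a small Python model of the scheme. It is a sketch under assumptions: the class name StringSpace and its methods are invented, only unique strings are pushed (derived strings that point into a unique would need more care before trimming), and pop plays the role of a consuming word such as $. in the examples above.

```python
class StringSpace:
    """Linear string space plus a stack of (offset, length) references."""

    def __init__(self):
        self.space = bytearray()   # grows linearly, like dictionary space
        self.stack = []            # the last element is the top of the stack

    def push(self, s):
        data = s.encode("ascii")
        self.stack.append((len(self.space), len(data)))
        self.space += data

    def swap(self):
        self.stack[-1], self.stack[-2] = self.stack[-2], self.stack[-1]

    def pop(self):
        offset, length = self.stack.pop()
        text = self.space[offset:offset + length].decode("ascii")
        if offset + length == len(self.space):
            del self.space[offset:]    # it was the last string: trim it off
        if not self.stack:
            self.space.clear()         # stack empty: reclaim everything
        return text


a = StringSpace()
a.push("FOO")
a.push("BAR")
print(a.pop(), len(a.space))   # BAR 3
print(a.pop(), len(a.space))   # FOO 0

b = StringSpace()
b.push("BAR")
b.push("FOO")
b.swap()
print(b.pop(), len(b.space))   # BAR 6 (not last in string space, nothing trimmed)
print(b.pop(), len(b.space))   # FOO 0
```

In the second case the space used by BAR stays occupied until the stack empties, which is the "not so easy" part.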

JennyB

Mar 12, 2016, 4:33:05 AM
That is very neat! Just one big circular buffer, and hardly any memory management overhead. I presume that with strcat you can do the following optimisations:

If str1 is at the string pointer, append str2 and bump the pointer.
If str2 is at the string pointer, and str1 is directly below it
(easily checked by simple arithmetic) do NIP +

It should be easy to maintain more than one buffer if you need to cope with growing multiple strings simultaneously.
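
A rough Python model of those two shortcuts, for anyone who wants to see the arithmetic spelled out. It is a guess at the idea, not Coos's code: the class name CircularStrings, the 64-byte default size and the wrap-to-zero policy are all assumptions made for the sketch.

```python
class CircularStrings:
    """Strings as (offset, length) pairs in one buffer with a string pointer."""

    def __init__(self, size=64):
        self.buf = bytearray(size)
        self.here = 0                      # the "string pointer"

    def _alloc(self, n):
        if n > len(self.buf):
            raise ValueError("string too large for the buffer")
        if self.here + n > len(self.buf):
            self.here = 0                  # wrap; older strings get overwritten
        start = self.here
        self.here += n
        return start

    def new(self, s):
        data = s.encode("ascii")
        start = self._alloc(len(data))
        self.buf[start:start + len(data)] = data
        return (start, len(data))

    def text(self, ref):
        off, n = ref
        return self.buf[off:off + n].decode("ascii")

    def strcat(self, s1, s2):
        (a1, n1), (a2, n2) = s1, s2
        # Shortcut: s2 is at the pointer and s1 sits directly below it,
        # so the result is just a longer length (the NIP + case).
        if a2 + n2 == self.here and a1 + n1 == a2:
            return (a1, n1 + n2)
        # Shortcut: s1 is at the pointer and there is room to append s2.
        if a1 + n1 == self.here and self.here + n2 <= len(self.buf):
            data = self.buf[a2:a2 + n2]    # copy first, in case of overlap
            self.buf[self.here:self.here + n2] = data
            self.here += n2
            return (a1, n1 + n2)
        # General case: copy both strings to a fresh spot.
        data = self.buf[a1:a1 + n1] + self.buf[a2:a2 + n2]
        start = self._alloc(len(data))
        self.buf[start:start + len(data)] = data
        return (start, len(data))


cs = CircularStrings()
foo = cs.new("foo:=")
bar = cs.new("bar")
joined = cs.strcat(foo, bar)       # adjacent, so no bytes are moved
print(cs.text(joined), cs.here)    # foo:=bar 8
```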

Another thought:

Instead of having negative counts represent 'derived strings' they could represent 'braids' - strings of addr count pairs that may themselves represent either strings or other braids.

hughag...@gmail.com

Mar 12, 2016, 7:56:45 PM
I already have a circular buffer in the novice package, but I don't like it. It is deprecated now that I have this string-stack.4th package (although I still use it in string-stack.4th for SPLITS$).

A problem with the buffer is that it doesn't handle gigantic strings. Also, the strings have limited lifespan before the buffer wraps around and they get over-written, but there is no way to ensure that this doesn't happen.

I like my string-stack.4th --- I'm going with it for any string-handling software in the future.

> Another thought:
>
> Instead of having negative counts represent 'derived strings' they could represent 'braids' - strings of addr count pairs that may themselves represent either strings or other braids.

Well, my derivative string can be derived from a unique or from another derivative string --- so it is a "braid" already.

s" abcdefgh" >$ ok
dup$ 2 3 mid$ ok
dup$ 1 1 mid$ ok
.s$
STRING STACK:
derivative: |d|
derivative: |cde|
unique: |abcdefgh| ok
third$ 2@ second$ 2@ first$ 2@ .s
DATA STACK
top
-1 FFFF:FFFF
67356647 0403:C7E7
-3 FFFF:FFFD
67356646 0403:C7E6
8 0000:0008
67356644 0403:C7E4
ok-6

hughag...@gmail.com

Mar 22, 2016, 2:26:40 AM
On Monday, March 7, 2016 at 5:27:32 AM UTC-7, Albert van der Horst wrote:
> hughag...@gmail.com writes:
> >BTW: My ultimate goal with all of this is to write a program that can
> >translate Esperanto into English or Spanish or whatever. I thought of this
> >way back in 1984 when I was learning Esperanto. It is too difficult for a
> >program to translate directly from Spanish to English or vice-versa because
> >those languages are difficult to figure out (this was true 30 years ago and
> >is still true). By making the human write in Esperanto however, most of the
> >work can be dumped on the human. Translating from Esperanto to English or
> >Spanish should be possible. All the Esperanto words have a suffix that tells
> >you exactly what part of speech they are. In string-stack.4th I have
> >functions for working with prefixes and suffixes that I provided
> >specifically for taking apart Esperanto words. The idea is that the
> >front-end of the program builds a lexical tree for the Esperanto sentence.
> >There would be a back-end of the program specific to each output language
> >that converts the lexical tree into English, Spanish, etc.
>
> That is interesting. A colleague of mine, Toon Witkam, had this idea of translating
> into Esperanto to disambiguate natural language. It turned into a project
> for the EU and earned him a doctorate, later he became professor.
> The EU didn't really like it too much, at the time it was slow (80's),
> but mostly they didn't want too clear language in their treatises.
> That was a multi-man-year project. I don't see you pulling off a similar project
> without a team.

I'm switching from Esperanto to Ido. These are some reasons:
1.) Ido doesn't have any diacritical marks like Esperanto, so it can be ASCII.
2.) Ido is not sexist like Esperanto or English. It is possible to use pan-gender words to avoid implying gender.
3.) The Ido words seem to be borrowed from more common languages than in Esperanto, so they are easier to learn --- Esperanto borrowed from too many languages in an effort to please everybody, but a lot of those words are just obscure (I don't even know what language they come from) and are hard to remember.
4.) Ido word-roots have only one meaning, so there is no ambiguity. It is never necessary to figure out the meaning of a word from the context. This is really the most important feature!

I think that, using string-stack.4th, I can take apart the words' prefixes and suffixes to obtain the word-root. I can also make a lexical-tree out of the sentence and determine if the sentence is grammatically legal. All of this can be done without knowing what the words mean. This would be the front-end for the program.
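
As a rough idea of what that front-end step involves, here is a Python sketch of stripping a grammatical ending to get a root and a part of speech. Everything in it is an assumption for illustration: the ending table is a partial list of the usual Ido endings and should be checked against a grammar, the sample words are only examples, and a real analyser would also need the prefixes, the other suffixes and a list of short function words.

```python
# Assumed, partial table of Ido grammatical endings (longest endings first).
ENDINGS = [
    ("ar", "verb (infinitive)"),
    ("as", "verb (present)"),
    ("is", "verb (past)"),
    ("os", "verb (future)"),
    ("o", "noun (singular)"),
    ("i", "noun (plural)"),
    ("a", "adjective"),
    ("e", "adverb"),
]


def analyse(word):
    """Return (root, category), or (word, None) if no ending applies.

    A root must keep at least two letters, so short function words such
    as "la" or "de" are left alone instead of being misread.
    """
    w = word.lower()
    for ending, category in ENDINGS:
        if w.endswith(ending) and len(w) > len(ending) + 1:
            return w[:-len(ending)], category
    return w, None


for sample in ["libro", "libri", "bela", "rapide", "lektar", "la"]:
    print(sample, analyse(sample))
```

No dictionary is consulted, which matches the point that this stage can run without knowing what the words mean.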

The back-end would convert the lexical-tree into some natural language such as English or Spanish. This is going to require a big dictionary. Putting this together would likely require a team as there are tens of thousands of words. Also, every target language needs a team who know that language (I only know English).

Anyway, the front-end for the program is something that I should be able to write.

hughag...@gmail.com

Apr 10, 2016, 11:23:29 PM
I made a substantial improvement to the efficiency of the system. It was pretty simple: I made >$ require that it be given a constant string (typically a S" string) and I made it create a derivative rather than a unique on the string-stack. So it is possible now to have derivatives on the string-stack that are not derived from any unique on the string-stack but are instead derived from a constant string in the dictionary. Now, in many cases, I don't have to put any string on the heap. Of course, this is still done automatically when necessary (by the mutators). Here is that code again with the new system:

s" foo:=bar" >$ dup$ ok
.s$
STRING STACK:
derivative: |foo:=bar|
derivative: |foo:=bar| ok
s" :=" >$ infix$ . -1 ok
.s$
STRING STACK:
derivative: |:=|
derivative: |foo:=bar|
derivative: |foo:=bar| ok
extract$ ok
.s$
STRING STACK:
derivative: |bar|
derivative: |foo|
derivative: |foo:=bar| ok
third$ 2@ second$ 2@ first$ 2@ .s
DATA STACK
top
-3 FFFF:FFFD
5050918 004D:1226
-3 FFFF:FFFD
5050913 004D:1221
-8 FFFF:FFF8
5050913 004D:1221
ok-6

I was reading about Stephen Pelc's BNF interpreter that is in VFX. It looks pretty interesting. I notice that his code is not ANS-Forth however.

: || IF R> DROP 1 THEN ;

This is obviously illegal in ANS-Forth. Section 3.2.3.3. says:
A program shall not access values on the return stack (using R@, R>, 2R@ or 2R>) that it did not place there using >R or 2>R;

Pelc should have written this:

macro: || if 1 exit then ;

Or, for improved readability:

macro: || ( flag -- true | ) \ exits the parent function on a true flag
dup if exit then \ if the flag is true, exit and return the flag
drop ; \ if the flag is false, drop the flag and keep going

MACRO: is in the novice-package and, like everything in the novice-package, it is ANS-Forth.

Also, I don't know why he is using 1 rather than -1 for true, but that doesn't matter very much.

My string-stack is a stack, so it supports recursive functions. I think my string-stack.4th can do everything that Pelc's BNF interpreter can do, although I don't have the BNF-like appearance which might be useful to somebody who is working from a language-specification that is in BNF.

The most important point is that my string-stack.4th is written in ANS-Forth whereas Pelc's BNF interpreter is written in VFX-specific code that will trap the user in vendor lock-in. Also, my string-stack.4th is available for free. Pelc's BNF interpreter is described in the manual, but doesn't seem to be available in the evaluation version of VFX, so you have to spend money to get it --- and I doubt that you get the source-code even if you do spend the money.

Has anybody ever written a traditional-language interpreter in Forth, either using Pelc's BNF interpreter or writing it manually?

I'm not much interested in writing a traditional-language interpreter in Forth. I know that Ching has his ABC (Advanced BASIC Compiler) written in Forth, and he likes it. I really just program in Forth myself. It is somewhat useful though to have BASIC or some other well-known language available for users who don't want to program in Forth.

I'm planning on using string-stack.4th primarily for interpreting human-language text, especially Ido.

Stephen Pelc

Apr 11, 2016, 5:45:16 AM
On Sun, 10 Apr 2016 20:23:27 -0700 (PDT), hughag...@gmail.com
wrote:

>I was reading about Stephen Pelc's BNF interpreter that is in VFX. It looks
>pretty interesting.

The world's greatest and only Forth programmer was probably trying to
read the VFX Forth manual. Just to avoid confusion here's an
extract from the VFX Forth manual:

"This article first appeared in ACM SigFORTH Newsletter vol. 2 no. 2.
Since then the code has been updated from the original by staff at
MPE, and this documentation has been derived from the article supplied
by Brad Rodriguez, whose original implementation is a model of Forth
programming."

> Pelc's BNF interpreter is described in the manual, but doesn't seem
>to be available in the evaluation version of VFX, so you have to spend
>money to get it --- and I doubt that you get the source-code even if you
>do spend the money.
>
>Has anybody ever written a traditional-language interpreter in Forth,
>either using Pelc's BNF interpreter or writing it manually?

The world's greatest and only Forth programmer appears not to be able to
read. Try
whereis ::=
in the console of the Eval version. The resource compiler supplied
with VFX Forths for Windows makes heavy use of the BNF parser.

Stephen

--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads

Mark Wills

Apr 11, 2016, 6:16:41 AM
The string stack library is developing into an awesome thing! Really
good!

Re "traditional" language development: In Forth Dimensions, Volume III
issue 6, you'll find "Charles Moore's BASIC Compiler Re-visited" by
Michael Perry.

In Volume VI, issue 4, you'll find a PASCAL P-Code interpreter.

You might find Volume VIII issue 6 interesting: Seven Thousand Seven
Hundred and Seventy Six Limericks, by Nathanial Grossman. An exercise
in string manipulation.

Volume X, Issue 1: IMPROVED STRING HANDLING BY MIKE ELOLA

Volume X, Issue 3: USING A STRING STACK - RON BRAITHWAITE
Continued in Vol X, No 4.

Not related, but this looks really cool, and I'm going to read it later:
Vol XI no. 1
FORTH NEEDS THREE MORE STACKS - AYMAN ABU-MOSTAFA
Standard Forth uses the return stack for a number of disparate tasks: that
is bad programming at best, confusing and error-prone at worst. This article
suggests an aux stack for loop parameters and temporary storage. And a
method of handling conditionals without branching allows use of conditionals
outside colon definitions; this is done with a condition stack and a case
stack.

Mark Wills

Apr 11, 2016, 6:47:34 AM
In addition, there's lots of Pascal to Forth type articles on the web
(and one in Forth Dimensions, too, but I didn't come across it when
looking through the issues):

https://www.google.co.uk/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#safe=active&q=pascal+in+forth

http://tangentstorm.github.io/winfield-pascal-83.html


More on languages in Forth:

From The Rochester Papers (JFAR Vol III)


BNF: A Parser Written in Forth
Leonard Morgenstern
BNF, named in honor of Backus and Naur, is a parser, written in Forth, capable of analyzing complex
input. Parser words can be intermixed with ordinary Forth words, so that BNF can analyze the text for
correctness and also act on it. BNF provides for parallel, series, repetition, and nulls, the latter being elements
that must or may be absent. BNF does not conform to the conventional requirement of "one symbol look-ahead
without backtracking", a feature that provides certain advantages.

An Approach to Natural Language Parsing
Jack Park
Brownsville, CA
A description of the design of an expectation-based parser is given. The parser uses a three-part structure
consisting of primitive structures, expectation procedures and heuristics, and a lexical dictionary. By use of
this structure in a Forth environment, a self-parsing vocabulary is built.


A Prolog Interpreter
C. H. Ting
San Mateo, CA
Prolog is an interesting language with simple syntax structure. It is rather straightforward to implement
it in Forth. Since most of the work in interpreting Prolog queries is in string comparison, the interpreter can
be constructed by running two pointers, one for the query string and the other for the data base. This method
thus eliminates the need to build data structures according to the data types used in Prolog. In this paper, only
relations are allowed in the data base. Both the "does" and "which" queries are implemented to use the data base.
Work to implement rules in data base is in progress.


A Forth LISP
Martin J. Tracy
Micromotion
Los Angeles, CA
Forth is an extensible language and can be extended to handle lists of facts. The LISP language has many
good list operators which can be translated to Forth in a straightforward manner. By choosing appropriate data
structures and algorithms, Forth can have all of the power of LISP without the performance penalties. On the
other hand, Forth pays for this with greater complexity and the accompanying risk of producing incorrect code.
This paper, which freely combines source code with text, describes a hybrid Forth/LISP list handler. It
discusses list data structures, list primitives, "aliases", dynamic garbage collection, determining symbol
uniqueness, and other implementation details. As a demonstration of the power of the hybrid, it is itself used
to implement the "micro-LISP" described in Winston and Horn's LISP, 2nd edition, (Addison-Wesley, 1984).


Andrew Haley

Apr 11, 2016, 6:50:57 AM
Mark Wills <markwi...@gmail.com> wrote:
> Not related, but this looks really cool, and I'm going to read it later:
> Vol XI no. 1
> FORTH NEEDS THREE MORE STACKS - AYMAN ABU-MOSTAFA
> Standard Forth uses the return stack for a number of disparate
> tasks: that is bad programming at best, confusing and error-prone at
> worst. This article suggests an aux stack for loop parameters and
> temporary storage. And a method of handling conditionals without
> branching allows use of conditionals outside colon definitions; this
> is done with a condition stack and a case stack.

IMO this is misguided. Forth design in embedded systems (its
traditional field of use) has tended to use small tasks, each one
dedicated to doing a single job. This has turned out to be a useful
design principle as long as it is applied consistently. And by
small, I mean really small: maybe each task is only a few words long.
The super-lightweight multi-tasker originally from Moore himself, then
from others at Forth, Inc. is designed to make this work well even on
very small systems.

It's not hard to partition memory for arrays of small tasks if each
one has a data stack, a return stack, and a user area. If you have
five stacks it gets to be difficult, not to mention rather space-
inefficient.

Andrew.

Raimond Dragomir

Apr 11, 2016, 7:21:22 AM
Difficult, same for any number of stacks. Space inefficient? Yes, for
every stack or two (if facing each other) you lose some space.
If the performance hit is acceptable you can use any number of stacks
based on linked lists with 100% efficiency but more overhead, on a single
area. Probably overkill unless you really have a lot of stacks :)
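
The linked-list arrangement can be sketched as follows, in Python rather than Forth. The names (StackPool, NIL) and the fixed pool size are assumptions for the illustration. Every cell of the single area can be used by whichever stack needs it, at the price of one link per cell and a little more work per push and pop.

```python
class StackPool:
    """Several stacks sharing one fixed pool of cells via linked lists."""

    NIL = -1

    def __init__(self, cells, stacks):
        # Assumes cells >= 1. Free cells are chained 0 -> 1 -> ... -> NIL.
        self.value = [0] * cells
        self.next = list(range(1, cells)) + [self.NIL]
        self.free = 0
        self.top = [self.NIL] * stacks     # one list head per stack

    def push(self, stack, x):
        cell = self.free
        if cell == self.NIL:
            raise MemoryError("pool exhausted")
        self.free = self.next[cell]        # unlink the cell from the free list
        self.value[cell] = x
        self.next[cell] = self.top[stack]  # link it on top of this stack
        self.top[stack] = cell

    def pop(self, stack):
        cell = self.top[stack]
        if cell == self.NIL:
            raise IndexError("stack empty")
        self.top[stack] = self.next[cell]
        self.next[cell] = self.free        # hand the cell back to the free list
        self.free = cell
        return self.value[cell]


pool = StackPool(cells=4, stacks=3)
pool.push(0, 10)
pool.push(1, 20)
pool.push(0, 11)
pool.push(2, 30)                 # all four cells are now in use
print(pool.pop(0))               # 11
pool.push(2, 31)                 # the freed cell is reused by another stack
print(pool.pop(2), pool.pop(2))  # 31 30
```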

What I found is this: with a good compiler you can have locals and
loop indexes and temporary data (emulated r> r@ etc.) working on the same
return stack with no collisions. This compiler is around 16K for a cortex-m0
system. A simple compiler like eforth or the like will be indeed in the
5K-8K range, but I prefer my ~16K one anytime.




Andrew Haley

Apr 11, 2016, 7:36:26 AM
Raimond Dragomir <raimond....@gmail.com> wrote:
> On Monday, 11 April 2016, 13:50:57 UTC+3, Andrew Haley wrote:
>>
>> It's not hard to partition memory for arrays of small tasks if each
>> one has a data stack, a return stack, and a user area. If you have
>> five stacks it gets to be difficult, not to mention rather space-
>> inefficient.
>
> Difficult, same for any number of stacks.

I'm not quite sure why you think this. When partitioning the system
you have to look at the total data + return size of every task and
partition accordingly. Maybe I'm just a weakling, but I think that
having to figure out the stack sizes for every task would be tricky.

> Space inefficient? Yes, for every stack or two (if facing each
> other) you lose some space. If the performance hit is acceptable
> you can use any number of stacks based on linked lists with 100%
> efficiency but more overhead, on a single area. Probably overkill
> unless you really have a lot of stacks :)

Hmm, surely this is just another way to waste space and CPU time.

> What I found is this: with a good compiler you can have locals and
> loop indexes and temporary data (emulated r> r@ etc.) working on the
> same return stack with no collisions.

Exactly so. The return stack is great!

Andrew.

hughag...@gmail.com

Apr 11, 2016, 11:00:09 PM
On Monday, April 11, 2016 at 2:45:16 AM UTC-7, Stephen Pelc wrote:
> The world's greatest and only Forth programmer appears not to be able to
> read. Try
> whereis ::=
> in the console of the Eval version. The resource compiler supplied
> with VFX Forths for Windows makes heavy use of the BNF parser.

I did this:
whereis <BNF

I was told that it wasn't there, so I didn't bother looking any further.

Now, reading your post, I did this:
whereis ::=
File: %VFXPATH%\VFXBase\BNF.fth, line 747 ok

Unfortunately, there is no BNF.fth file provided with the evaluation version. Maybe if I paid you for the pro version I would get the BNF.fth file, but maybe not as there is no guarantee made on this anywhere.

I'm not realistically going to use your BNF stuff if I don't have the source-code. You documented <BNF and BNF> but they aren't available, so that was misleading. Also, your documentation is very vague. Here is an example:
------------------------------------------------------------------------
::= name starts the definition of the BNF production name.
;; ends a BNF definition.
------------------------------------------------------------------------

That doesn't really tell me anything about what they do. You aren't providing a stack-picture for your words or really giving me any information at all.

You are assuming that I am a "user" who is going to use this code without understanding how it works. You are also assuming that I am a "client" who is okay with vendor lock-in to VFX.

If you aren't going to provide the BNF.fth file and/or if it is not written in ANS-Forth (I'll bet it is not!), then I have no interest in using your BNF stuff --- I already said that my string-stack.4th can do everything that is needed --- so I recommend that everybody just ignore your BNF stuff and use string-stack.4th to get "commercial level string handling."

Earlier, we had this exchange:

On Saturday, March 5, 2016 at 4:22:03 AM UTC-7, Stephen Pelc wrote:
> On Fri, 4 Mar 2016 15:07:29 -0800 (PST), hughag...@gmail.com
> wrote:
>
> > Stephen Pelc says: "For practical applications, all Forth standards to
> > date have been inadequate for commercial level string handling."
>
> Yes, I said that
>
> > Oh nooooooo! A Forth-200x committee member says
> > that Forth can't be used for "practical applications" involving
> > "commercial level string handling"
>
> No, I did not say that. The solution in Forth, as you well know, is to
> expand the base Forth system to have a commercial-level string
> handling package. That's what several of our clients have done.

Your use of the word "clients" implies that you believe only your clients ($$$) deserve to get "commercial level string handling" --- that is not true --- Forthers can avoid paying you any money at all by simply ignoring your closed-source BNF stuff and using my open-source string-stack.4th.

hughag...@gmail.com

Apr 11, 2016, 11:20:13 PM
On Monday, April 11, 2016 at 3:50:57 AM UTC-7, Andrew Haley wrote:
> Mark Wills <markwi...@gmail.com> wrote:
> > Not related, but this looks really cool, and I'm going to read it later:
> > Vol XI no. 1
> > FORTH NEEDS THREE MORE STACKS - AYMAN ABU-MOSTAFA
> > Standard Forth uses the return stack for a number of disparate
> > tasks: that is bad programming at best, confusing and error-prone at
> > worst. This article suggests an aux stack for loop parameters and
> > temporary storage. And a method of handling conditionals without
> > branching allows use of conditionals outside colon definitions; this
> > is done with a condition stack and a case stack.
>
> IMO this is misguided. Forth design in embedded systems (its
> traditional field of use) has tended to use small tasks, each one
> dedicated to doing a single job. This has turned out to be a useful
> design principle as long as it is applied consistently. And by
> small, I mean really small: maybe each task is only a few words long.
> The super-lightweight multi-tasker originally from Moore himself, then
> from others at Forth, Inc. is designed to make this work well even on
> very small systems.

What I think is misguided is using the same Forth for both micro-controllers and desktop-computers.

I have said many times that my FloatForth is intended to be a standard for desktop-computers. One application program that can be written in FloatForth is a cross-compiler for a micro-controller --- but the Forth used on the micro-controller isn't compatible with FloatForth --- for one thing, FloatForth requires a 64-bit cell size (and, realistically, a 64-bit processor), whereas a micro-controller would be either 16-bit or 32-bit.

There are a lot of design decisions that are going to be different between a language used on a desktop-computer with a 64-bit x86 and a language used on an MSP-430. Right now I want FloatForth to be the standard for desktop-computers. I'm dubious that it is a good idea to have a standard for micro-controllers at all --- programs aren't going to be portable anyway, because they are hardware-specific --- the micro-controllers also vary a lot in their capabilities, from the 16-bit MSP-430 to the 32-bit ARM Cortex.

One of the many failures of ANS-Forth was that it specified one language for both desktop-computers and micro-controllers. This made sense in the 1970s and early 1980s when the 6502 and Z80 were used for both desktop-computers and micro-controllers. By 1994 however, it no longer made sense because most desktop-computers used the 32-bit Pentium, and micro-controllers continued to use 8-bit or 16-bit processors (mostly the Dallas 80c320, but also such things as the MC6808 and MC6812).

I hate ANS-Forth! This is really 1970s technology in every way --- it was a joke in 1994, and the joke is getting pretty stale in 2016.

hughag...@gmail.com

Apr 11, 2016, 11:34:28 PM
I'm familiar with Raimond's LFTforth (for the record: I thought up the name LFTforth). I think this is a very good design!

I'm learning more about Forth all the time. For example, I had originally thought that I would have a double-stack that contained both double-precision integers and also strings (the address and count are the two cells). Now that I have written this string-stack.4th I realize that the strings need their own stack. The reason is that FIX-DERIVATIVES needs to search the string-stack and fix any derivatives (turn them into uniques) if the unique that they are derived from is getting mutated or is getting dropped. FIX-DERIVATIVES can't distinguish between strings and double-integers though, so it needs to have only strings on the stack.

I now think that I need both a string-stack and a double-precision-integer stack --- these are both in addition to the single-precision-integer stack (also contains pointers, or any other single-cell data) and the return-stack (also contains locals).

For the most part, I don't see a problem with having multiple stacks in regard to memory-efficiency. These are very small stacks --- maybe 32 elements each --- memory-efficiency isn't going to be a problem except on an extremely memory-constrained micro-controller, but even small micro-controllers such as the MSP-430 typically have 32KB of RAM which is quite a lot (this isn't the 1980s anymore). Also, something like the string-stack is going to be an extension to the language anyway --- most micro-controllers aren't going to need this just because they don't work with strings.

WJ

Apr 22, 2016, 5:59:14 AM
Coos Haak wrote:

> Using strings on a heap, not a stack:
>
> want split-text ok
> "foo:=bar" ok
> ":=" ok
> split-text ok
> type foo ok
> type bar ok

#load "str.cma";;

", foo := bar, and more, the end"
|> Str.split (Str.regexp ", +\\| *:= *") ;;

===>
["foo"; "bar"; "and more"; "the end"]


WJ

Apr 22, 2016, 6:03:40 AM
Coos Haak wrote:

> Using strings on a heap, not a stack:
>
> want split-text ok
> "foo:=bar" ok
> ":=" ok
> split-text ok
> type foo ok
> type bar ok

OCaml:

hughag...@gmail.com

unread,
Jun 3, 2016, 8:53:06 AM6/3/16
to
On Sunday, April 10, 2016 at 8:23:29 PM UTC-7, hughag...@gmail.com wrote:
> I was reading about Stephen Pelc's BNF interpreter that is in VFX. It looks pretty interesting. I notice that his code is not ANS-Forth however.
>
> : || IF R> DROP 1 THEN ;
>
> This is obviously illegal in ANS-Forth. Section 3.2.3.3. says:
> A program shall not access values on the return stack (using R@, R>, 2R@ or 2R>) that it did not place there using >R or 2>R;
>
> Pelc should have written this:
>
> macro: || if 1 exit then ;
>
> Or, for improved readability:
>
> macro: || ( flag -- true | ) \ exits the parent function on a true flag
> dup if exit then \ if the flag is true, exit and return the flag
> drop ; \ if the flag is false, drop the flag and keep going
>
> MACRO: is in the novice-package and, like everything in the novice-package, it is ANS-Forth.
>
> Also, I don't know why he is using 1 rather than -1 for true, but that doesn't matter very much.

Not only is Pelc's code not ANS-Forth, but it also has a bug in it. If the colon word that calls || uses locals, it will crash when || is called because the EXIT in || is trying to exit out of the calling function but doesn't know that it has locals on top of the return-address.

By comparison, my version of || works fine whether the calling function has locals or not.

hughag...@gmail.com

unread,
Apr 15, 2017, 7:54:24 PM4/15/17
to
Anton Ertl's post came from here:
https://groups.google.com/forum/#!topic/comp.lang.forth/EkjxmjenL5s

I'm replying to it in this thread because this is where the answer is.

On Saturday, April 15, 2017 at 1:08:54 AM UTC-7, Anton Ertl wrote:
> hughag...@gmail.com writes:
> >1.) I read that Forth-200x description of S\" and found this bizarre restriction:
> >"A program shall not alter the returned string."
> >I don't know why they have that restriction, and I certainly don't.
>
> : foo s\" bla" ;
> 'x foo drop c!
> foo type
>
> If programs were allowed to alter the returned string, what should the
> output of TYPE be? Is the returned string required to be in RAM?
> Should it make a new copy every time it runs? How long is that copy
> guaranteed to live?

This is really the whole point of having a string-stack --- so you have somewhere to put your strings.

S" and S\" would create a string that can't be altered, but when they execute that push a string onto the string-stack. They don't have to copy the string because the string pushed onto the string-stack is a derivative. After the string is pushed onto the string-stack it can be worked with just like any string. If it is modified, it internally gets converted into a unique string first (a unique copy is made). If a string is consumed (such as by .$ etc.) then it gets dropped (if it is a derivative it just gets dropped, but if it is unique then it gets a DEALLOC to remove it from the heap before it gets dropped).

A string on the string-stack is guaranteed to live forever --- until it gets consumed. When it gets consumed (such as by .$ or DROP$ etc.) it gets freed from the heap if it is a unique, so the programmer doesn't have to manually use FREE or DEALLOC on it. Also, if a unique string gets consumed, before it is freed we internally search the entire string-stack for derivatives of it and convert them into uniques.

Here is an example (using .S$ to show what is on the string-stack):

s" hello world" >$ ok
dup$ ok
.s$
STRING STACK:
derivative: |hello world|
derivative: |hello world| ok

reverse$ ok
.s$
STRING STACK:
unique: |dlrow olleh|
derivative: |hello world| ok

.$ dlrow olleh ok
.$ hello world ok
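
(A rough Python model of the session above, for readers who want to poke at the idea outside Forth. It is only a sketch under assumptions: the class and method names are invented for illustration and are not taken from STRING-STACK.4TH, an immutable `bytes` object stands in for an S" string, and a `bytearray` stands in for an ALLOCATEd copy. SWAP$ and ROT$ are left out, so the case of a unique being modified while derivatives of it still exist cannot arise here.)

```python
class StringStack:
    """Toy copy-on-write string stack; the end of the list is the top."""

    def __init__(self):
        self.items = []

    def push(self, data):
        # Like >$ : store a reference only, no copy is made.
        self.items.append({"data": data, "unique": False})

    def dup(self):
        # Like DUP$ : a second reference to the same characters.
        top = self.items[-1]
        self.items.append({"data": top["data"], "unique": False})

    def reverse(self):
        # Like REVERSE$ : a modifier, so a derivative is copied first.
        top = self.items[-1]
        if not top["unique"]:
            top["data"] = bytearray(top["data"])  # the copy-on-write step
            top["unique"] = True
        top["data"].reverse()

    def pop_text(self):
        # Like .$ : consume the top entry and hand back its text.
        return bytes(self.items.pop()["data"]).decode("ascii")

    def dump(self):
        # Like .S$ : list the entries, top of stack first.
        return [("unique" if e["unique"] else "derivative",
                 bytes(e["data"]).decode("ascii"))
                for e in reversed(self.items)]


stack = StringStack()
stack.push(b"hello world")
stack.dup()
stack.reverse()
print(stack.dump())
```

Only the modifier allocates anything; the push and the dup just hand references around.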

A major failing of ANS-Forth is that you pass data between functions in global variables. This is extremely bad programming! For example, your S" stores the string in a global variable when it is used interpretively. Then you do another S" and you overwrite it. Another example is that <# #> stores the string in a global variable. When you do another <# #> it overwrites it. All of these problems get solved by the use of a string-stack.

hughag...@gmail.com

unread,
Apr 15, 2017, 10:55:50 PM4/15/17
to
The important point here is that we never have an address and count on the data-stack. You can't just modify strings willy-nilly. You don't have to worry about keeping track of whether a particular string can be modified or if this is forbidden (it is forbidden for S" and S\" strings, and maybe some others). Strings are only modified when they are on the string-stack, and the words that modify them follow the rules:
1.) Turn a derivative into a unique first
2.) If it is already a unique then turn any derivatives it has into uniques.

Also, storing strings in the heap is not a good solution.

1.) This involves copying the string every time. By comparison, the string-stack just makes a derivative --- the derivative doesn't get turned into a unique until this is necessary --- in many cases, the derivative never gets turned into a unique.

2.) The user has to manually FREE the string. This bloats out the source-code, and it is confusing to the user to remember when to do this. If the user forgets to do this, he gets a memory leak. By comparison, the string-stack consumers take care of this internally when necessary.

What ANS-Forth does is just pass strings between functions in global variables ( PAD and the S" pad and the <# #> pad). This is terrible programming! This was done in the early 1970s when computers had such limited RAM that there was no room for multiple strings. Charles Moore came up with these global pads as an expedient measure to get some application program finished --- apparently it was an application program that didn't need to work with more than one string at a time.

Julian Fondren

unread,
Apr 15, 2017, 11:08:01 PM4/15/17
to
On Saturday, April 15, 2017 at 9:55:50 PM UTC-5, hughag...@gmail.com wrote:
> You can't just modify strings willy-nilly.

https://www.youtube.com/watch?v=764T6P7cnKM

I recommend that you learn Rust.

hughag...@gmail.com

unread,
Apr 16, 2017, 12:15:40 AM4/16/17
to
On Saturday, April 15, 2017 at 4:54:24 PM UTC-7, hughag...@gmail.com wrote:
> Anton Ertl's post came from here:
> https://groups.google.com/forum/#!topic/comp.lang.forth/EkjxmjenL5s
>
> I'm replying to it in this thread because this is where the answer is.
>
> On Saturday, April 15, 2017 at 1:08:54 AM UTC-7, Anton Ertl wrote:
> > hughag...@gmail.com writes:
> > >1.) I read that Forth-200x description of S\" and found this bizarre restriction:
> > >"A program shall not alter the returned string."
> > >I don't know why they have that restriction, and I certainly don't.
> >
> > : foo s\" bla" ;
> > 'x foo drop c!
> > foo type
> >
> > If programs were allowed to alter the returned string, what should the
> > output of TYPE be? Is the returned string required to be in RAM?
> > Should it make a new copy every time it runs? How long is that copy
> > guaranteed to live?
>
> This is really the whole point of having a string-stack --- so you have somewhere to put your strings.
>
> S" and S\" would create a string that can't be altered, but when they execute that push a string onto the string-stack.

This was a typo. I meant:
S" and S\" would create a string that can't be altered, but when they execute they push a string onto the string-stack.

I don't know what 'X does, so I don't understand your example.

Here is an example with the string-stack:

: foo s" bla" >$ ;

Ideally, the >$ would be done internally by S" so there would never be an address/count pair on the data-stack.

foo ok
.s$
STRING STACK:
derivative: |bla| ok
reverse$ ok
.s$
STRING STACK:
unique: |alb| ok
ok
.$ alb ok

Notice that when the S" string is pushed to the string-stack, it is a derivative string. This means that only the reference gets pushed. There is no need to call ALLOCATE and there is no need to copy the string. This is very fast. When the string gets modified however, then it first gets converted into a unique (which involves calling ALLOCATE and copying the string). When the string gets consumed by .$ it gets freed from the heap.

Here is another example:

foo dup$ reverse$ ok
.s$
STRING STACK:
unique: |alb|
derivative: |bla| ok
dup$ ok
.s$
STRING STACK:
derivative: |alb|
unique: |alb|
derivative: |bla| ok
.$ alb ok
.s$
STRING STACK:
unique: |alb|
derivative: |bla| ok

Consuming the string with .$ is easy because the string is a derivative.

Here is another example:

foo dup$ reverse$ ok
.s$
STRING STACK:
unique: |alb|
derivative: |bla| ok
dup$ ok
.s$
STRING STACK:
derivative: |alb|
unique: |alb|
derivative: |bla| ok
swap$ ok
.s$
STRING STACK:
unique: |alb|
derivative: |alb|
derivative: |bla| ok
.$ alb ok
.s$
STRING STACK:
unique: |alb|
derivative: |bla| ok

This is the same as the last example except that there was a SWAP$ prior to the .$ so that the unique rather than the derivative was consumed. When a unique is consumed, we search the entire string-stack for any derivatives of that unique and change them into uniques. This is why we now have an |alb| string on the string-stack that is unique --- it is the string that had been a derivative prior to .$ being called --- it couldn't continue being a derivative because the unique that it is a derivative of is getting consumed.
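
(For anyone who wants to see the bookkeeping spelled out, here is a small Python model of that last session. It is a sketch under assumptions, not the code in STRING-STACK.4TH: `Entry`, `StringStack` and `_fix_derivatives` are names invented for illustration, a `bytes` object plays the role of the S" string, a `bytearray` plays the role of an ALLOCATEd copy, and "is a derivative of" is modelled as "points at the same bytearray".)

```python
class Entry:
    def __init__(self, data, unique):
        self.data = data      # bytes (shared literal) or bytearray (own copy)
        self.unique = unique  # True when this entry owns its bytearray


class StringStack:
    """Toy model; the end of the list is the top of the stack."""

    def __init__(self):
        self.items = []

    def push(self, data):          # like >$
        self.items.append(Entry(data, False))

    def dup(self):                 # like DUP$ : reference only
        self.items.append(Entry(self.items[-1].data, False))

    def swap(self):                # like SWAP$ : juggles references
        self.items[-1], self.items[-2] = self.items[-2], self.items[-1]

    def _fix_derivatives(self, owner):
        # Give every derivative of `owner` its own copy before the
        # owner's characters change or disappear.
        for e in self.items:
            if e is not owner and not e.unique and e.data is owner.data:
                e.data = bytearray(e.data)
                e.unique = True

    def reverse(self):             # like REVERSE$ : a modifier
        top = self.items[-1]
        if top.unique:
            self._fix_derivatives(top)
        else:
            top.data = bytearray(top.data)
            top.unique = True
        top.data.reverse()

    def consume(self):             # like .$ : returns the text
        top = self.items.pop()
        if top.unique:
            self._fix_derivatives(top)
        return bytes(top.data).decode("ascii")

    def dump(self):                # like .S$ , top of stack first
        return [("unique" if e.unique else "derivative",
                 bytes(e.data).decode("ascii"))
                for e in reversed(self.items)]


s = StringStack()
s.push(b"bla")
s.dup()
s.reverse()
s.dup()
s.swap()
print(s.consume())
print(s.dump())
```

The search through the whole stack only happens when a unique is modified or consumed, which matches the description above.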

Most of the time, there is no need to convert derivatives into uniques. This is because we typically have derivatives on top of uniques on the string-stack. SWAP$ and ROT$ mess this up though, so we end up with uniques on top of derivatives. In this case, it becomes necessary to convert derivatives into uniques. This is pretty rare, because stack-juggling of the string-stack is pretty rare --- it only causes a minor slowdown though --- all of this is taken care of internally, so the user doesn't have to worry about it (the user just assumes that the behavior is going to be the same as if everything on the string-stack were a unique).

Note that none of this can be put in Forth-200x because Forth-200x is mandated to be 100% compatible with ANS-Forth. ANS-Forth has a lot of legacy code that has address/count pairs being held on the data-stack and strings being modified willy-nilly, and ANS-Forth has strings (generated by S" and S\" and maybe some other words) that supposedly can't be modified, although there is nothing to prevent a program from modifying them, so there may be programs that modify these strings and these programs sometimes work and sometimes don't work.

ANS-Forth is just no good. ANS-Forth is the work of an idiot marketing-genius who has never written a Forth program in her life. If Forth is going to succeed, we need a new standard.

Rudy Velthuis

unread,
Apr 16, 2017, 9:50:10 AM4/16/17
to
hughag...@gmail.com wrote:

> Also, storing strings in the heap is not a good solution.
>
> 1.) This involves copying the string every time.

Ok, not speaking from Forth experience, but from a lot of experience
with using and implementing string code in several languages: the above
is nonsense.

If possible, use COW (copy on write), if possible paired with some kind
of (automatic) reference counting (or similar) and you won't have to
copy each time. Only if the string is modified and the refcount is > 1.
Note that the refcount and length should be part of the string data on
the heap (the so-called "payload"), not part of the "string pointer".
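
(To make the refcounting variant concrete, a minimal Python sketch follows. It is an illustration under assumptions, not anybody's actual implementation: `Payload`, `CowString`, `share`, `release` and `set_char` are made-up names, and Python objects stand in for heap blocks. The count lives in the payload next to the characters, as described above, and a copy is made only when a shared payload is about to be written.)

```python
class Payload:
    """Heap block: reference count and characters kept together."""

    def __init__(self, data):
        self.refs = 1
        self.data = bytearray(data)   # the length is len(self.data)


class CowString:
    """A handle (the 'string pointer') onto a possibly shared payload."""

    def __init__(self, data=b"", _payload=None):
        self._p = _payload if _payload is not None else Payload(data)

    def share(self):
        # A cheap "copy": bump the count, hand out a second handle.
        self._p.refs += 1
        return CowString(_payload=self._p)

    def release(self):
        # Drop this handle; a real allocator would free at zero.
        self._p.refs -= 1

    def _make_private(self):
        # Copy on write, but only if somebody else still shares it.
        if self._p.refs > 1:
            self._p.refs -= 1
            self._p = Payload(self._p.data)

    def set_char(self, index, ch):
        self._make_private()
        self._p.data[index] = ord(ch)

    def text(self):
        return self._p.data.decode("ascii")


a = CowString(b"bla")
b = a.share()          # no characters copied yet
b.set_char(0, "x")     # b gets its own payload here
print(a.text(), b.text())
```

Nothing here needs to find "all strings": each writer only looks at the count of the payload it is about to change.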

You could also make ALL strings immutable. That would only require a
copy if the string is modified, otherwise you can simply return (a
pointer to) the original. Also easier to handle concurrent access that
way.

I don't really understand why currently, in Forth, an address/length
combination is used so often. The length should, when possible, be
stored as part of the string data, just like old-fashioned (byte)
counted strings. The address-length pair only make sense if the string
data is read directly from some kind of buffer and the buffer should
not/cannot be modified. This only makes sense for transient data, e.g.
when parsing a word or string of which the string data that can easily
be discarded after use (e.g. when interpreting or compiling and you
only parse to, er, find out what to do next).

--
Rudy Velthuis http://www.rvelthuis.de

"I still live." -- Daniel Webster, dying words

lehs

unread,
Apr 16, 2017, 11:23:17 AM4/16/17
to
Hugh, I tested your string stack against mine in Win32Forth 6.15.03, a simple test:

\ : test 0 do over$ over$ drop$ drop$ loop ;
\ : test 0 do sover sover sdrop sdrop loop ;

and found that mine ran the test 6 times faster than yours.

My string stack is a shitty hack without any kind of sophistication: no heaps, no error checks, no address references, just copying all strings when juggling...

I'm not sure how to interpret this, because I'm convinced you're a far better coder than me.

lehs

unread,
Apr 16, 2017, 11:33:10 AM4/16/17
to
I THINK it's about the heap in Windows. I had the same experience with my big integers: the code was about 5 times faster with unsophisticated stack juggling than with intensive use of allocate, free etc.

https://github.com/Lehs/ANS-Forth-libraries

lehs

unread,
Apr 16, 2017, 12:02:40 PM4/16/17
to
On Sunday, April 16, 2017 at 17:23:17 UTC+2, lehs wrote:
With this test my code ran 4 times faster.

\ : test1 0 do over$ over$ swap$ drop$ drop$ loop ;
\ : test1 0 do sover sover sswap sdrop sdrop loop ;

The method with heaps and references is faster for SWAP.

https://forthmath.blogspot.se
https://github.com/Lehs/ANS-Forth-libraries

Anton Ertl

unread,
Apr 16, 2017, 12:06:10 PM4/16/17
to
"Rudy Velthuis" <newsg...@rvelthuis.de> writes:
>I don't really understand why currently, in Forth, an address/length
>combination is used so often. The length should, when possible, be
>stored as part of the string data, just like old-fashioned (byte)
>counted strings.

Why?

Concerning reasons: From
<http://www.complang.tuwien.ac.at/anton/euroforth/ef13/papers/ertl-strings.pdf>:

|The favoured string representation in standard
|Forth is c-addr u. It allows representing strings
|of any length with any content, and you can pro-
|duce arbitrary substrings without needing to copy
|the string to a new buffer. The disadvantage of
|this representation is that it takes two cells on the
|stack, and dealing with several strings at once can
|therefore be cumbersome.
|
| The other common string representation in stan-
|dard Forth is the counted string: The on-stack rep-
|resentation is the address of the count byte; the
|count byte is followed by the characters of the
|string. The advantage of this representation is that
|it needs only one cell on the stack. But it can only
|represent strings with up to 255 chars, and any
|substring operation needs to create a new string
|buffer. Converting from counted to c-addr u is
|easy (count), but the other direction is cumber-
|some. Some people have suggested using cell counts
|instead of byte counts to get rid of the length limi-
|tation.
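
(A small Python illustration of the trade-off described in the excerpt above. It is only an analogy under assumptions: a `memoryview` slice plays the role of a c-addr u pair, a length-prefixed `bytes` object plays the role of a counted string, and the helper names `substring`, `to_counted` and `count` are invented for this sketch, apart from `count` echoing the Forth word COUNT.)

```python
def substring(view, start, length):
    # c-addr u style: a new (start, length) window onto the same buffer.
    return view[start:start + length]


def to_counted(view):
    # Counted-string style: one count byte followed by the characters.
    # This direction has to build a new buffer, and the count must fit
    # in a single byte.
    if len(view) > 255:
        raise ValueError("a byte count cannot exceed 255")
    return bytes([len(view)]) + bytes(view)


def count(counted):
    # Like COUNT: from a counted string back to a window, without copying.
    return memoryview(counted)[1:1 + counted[0]]


buf = bytearray(b"foo:=bar")
whole = memoryview(buf)
left = substring(whole, 0, 3)    # no characters are copied
right = substring(whole, 5, 3)

buf[0] = ord("F")                # the window sees the change
print(bytes(left), bytes(right))
print(to_counted(right))
```

Taking a substring of the window costs nothing, while going to the counted form always means filling a new buffer.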

> The address-length pair only make sense if the string
>data is read directly from some kind of buffer and the buffer should
>not/cannot be modified. This only makes sense for transient data, e.g.
>when parsing a word or string of which the string data that can easily
>be discarded after use (e.g. when interpreting or compiling and you
>only parse to, er, find out what to do next).

Much of my strings usage is like that. A good reason to use c-addr u.
Do you really want to copy a substring to a separate buffer (and where
do you allocate that?) before you can, e.g., write it to a file? I
don't.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2016: http://www.euroforth.org/ef16/

Rudy Velthuis

unread,
Apr 16, 2017, 6:16:36 PM4/16/17
to
Anton Ertl wrote:

> > The address-length pair only make sense if the string
> > data is read directly from some kind of buffer and the buffer should
> > not/cannot be modified. This only makes sense for transient data,
> > e.g. when parsing a word or string of which the string data that
> > can easily be discarded after use (e.g. when interpreting or
> > compiling and you only parse to, er, find out what to do next).
>
> Much of my strings usage is like that. A good reason to use c-addr u.
> Do you really want to copy a substring to a separate buffer (and where
> do you allocate that?) before you can, e.g., write it to a file? I
> don't.

No, of course not. Then you use c-addr u, like in other languages. But
if you have to manipulate or keep around more than one string, and
these are referenced in more than one spot, it often makes sense to
have semi-permanent or permanent storage. So, yes, often such strings
are simply stored in the word, or in an allocated buffer, or, well, in
a separate heap, but then a double-cell "pointer" is useless. Then the
length can be stored with (before) the string data.
--
Rudy Velthuis http://www.rvelthuis.de

"Dying is a very dull, dreary affair. And my advice to you is to
have nothing whatever to do with it." -- W. Somerset Maugham.

lehs

unread,
Apr 16, 2017, 6:29:09 PM4/16/17
to
The strings on the stack were
S" Hello world" >str ( >$ )
s" This is a very long string about nothing only created to test two different string-stack systems" >str ( >$ )

If your code had been faster I would have reorganized mine to get faster code, which would have been nice...

Rudy Velthuis

unread,
Apr 16, 2017, 6:34:32 PM4/16/17
to
In many language runtimes, allocation and freeing can be pretty slow
and hard to do. Ok, some have very sophisticated memory managers that
can do this faster, but it remains something that is not cheap.

My toy Forth is written in assembly, and that is why I haven't
implemented ALLOCATE and friends yet: I would have to write my own
memory manager.

--
Rudy Velthuis http://www.rvelthuis.de

"His ignorance is encyclopedic"
-- Abba Eban (1915-2002)

lehs

unread,
Apr 16, 2017, 8:11:18 PM4/16/17
to
I don't know how ALLOCATE & Co are implemented in Win32Forth, but SP-Forth uses the WinAPI. I will test SP-Forth and GForth for Android and see what I get.

Here is another Abba Eban quote:
"History teaches us that men and nations behave wisely once they have exhausted all other alternatives."
I hope his wisdom influenced his actions, but I doubt it.

hughag...@gmail.com

unread,
Apr 16, 2017, 10:26:46 PM4/16/17
to
My STRING-STACK.4TH does use COW (copy on write). Weirdly enough, you snipped my entire description of how I use COW and showed one sentence out of context that seemed to imply that I didn't use COW. When I said that storing strings in the heap is a bad idea, I meant a dumb implementation in which every copy requires an ALLOCATE and a MOVE.

COW only works when you know that a pointer is a string, and you know how to find all strings. It is necessary to be able to find all strings because, when a unique is modified, all of the derivatives of that unique have to be converted into uniques. So, you have to be able to find all strings and test them to determine if they are derivatives, and if so, whether they are derivatives of the unique that is about to be modified. This is easy in STRING-STACK.4TH because all the strings are on the string-stack. This is also possible in other languages because they have typed data and so strings have a tag that indicates that they are a string, and there is an internal data-structure that keeps track of all the strings in existence.

hughag...@gmail.com

unread,
Apr 16, 2017, 10:31:39 PM4/16/17
to
I don't know how to interpret this either. I don't believe that it is true.

empty$ ok
s" foo" >$ s" bar" >$ .s$
STRING STACK:
derivative: |bar|
derivative: |foo| ok
over$ over$ .s$
STRING STACK:
derivative: |bar|
derivative: |foo|
derivative: |bar|
derivative: |foo| ok
drop$ drop$ .s$
STRING STACK:
derivative: |bar|
derivative: |foo| ok

As you can see, this code works only with derivatives. It never calls ALLOCATE and it never copies a string.

You are telling me that copying entire strings, rather than copying a reference to the string, is 6 times faster? I don't believe that.

hughag...@gmail.com

unread,
Apr 16, 2017, 10:43:49 PM4/16/17
to
I don't think this is true. None of those strings ever get put in the heap. ALLOCATE never gets called. No string is ever copied.

empty$ ok
s" foo" >$ s" bar" >$ .s$
STRING STACK:
derivative: |bar|
derivative: |foo| ok
over$ over$ .s$
STRING STACK:
derivative: |bar|
derivative: |foo|
derivative: |bar|
derivative: |foo| ok
swap$ .s$
STRING STACK:

hughag...@gmail.com

unread,
Apr 17, 2017, 1:42:48 AM4/17/17
to
On Sunday, April 16, 2017 at 7:43:49 PM UTC-7, hughag...@gmail.com wrote:
> On Sunday, April 16, 2017 at 9:02:40 AM UTC-7, lehs wrote:
> > With this test my code ran 4 times faster.
> >
> > \ : test1 0 do over$ over$ swap$ drop$ drop$ loop ;
> > \ : test1 0 do sover sover sswap sdrop sdrop loop ;
> >
> > The method with heaps and references is faster for SWAP.
>
> I don't think this is true. None of those strings ever get put in the heap. ALLOCATE never gets called. No string is ever copied.

It is possible that you have an old version of STRING-STACK.4TH --- for a while, I had >$ making unique strings, which was not necessary most of the time (because most strings come from S" that is non-mutable) --- now I have >$ assume that the string is non-mutable and make a derivative on the string-stack, and I have MUT>$ that is for mutable strings (such as <CSTR ... CSTR> strings) and these get converted into unique strings. I also have HEAP>$ that is for strings that are already on the heap (typically because they came from $> and got held in a data-structure for a while).

hughag...@gmail.com

unread,
Apr 17, 2017, 2:09:17 AM4/17/17
to
Actually, that wouldn't have much effect on the speed because >$ isn't inside of the loop. All you have inside of the loop are stack-juggling words, and these don't make unique strings (they don't call ALLOCATE and they don't copy the string); they just juggle references.

I think that you are just making up a bullshit story. There is no way that copying strings is 6 times (or 4 times) faster than copying a reference. STRING-STACK.4TH uses COW (copy-on-write) for the purpose of making it fast.

lehs

unread,
Apr 17, 2017, 2:34:29 AM4/17/17
to
No, you might be right! I just did the same test on GForth Android and got these results:

0.230113 seconds for 1 million loops, your string-stack.4th
0.459716 seconds for 1 million loops, my stringstack.4th

There was some problem loading your novice.4th plus string-stack.4th (pasting) in Win32Forth, with 3 numbers left on the stack after loading.

This is interesting so I will go on testing. If I understand how your system works, I might be able to speed up my bigintegers.4th with that idea of yours.

Lars-Erik Svahn
https://forthmath.blogspot.se
https://github.com/Lehs/ANS-Forth-libraries

lehs

unread,
Apr 17, 2017, 3:15:40 AM4/17/17
to
Tests on GForth 64 bits Windows:

s" Hello world" >$ ok
s" This is a very long string about nothing created to test two different systems of string stack" >$ ok
: test 0 do over$ over$ drop$ drop$ loop ; ok
utime 1000000 test utime d- d. -170776 ok

s" Hello world" >str ok
s" This is a very long string about nothing created to test two different systems of string stack" >str ok
: test 0 do sover sover sdrop sdrop loop ; ok
utime 1000000 test utime d- d. -377323 ok

I can't load novice.4th in SP-Forth. It stops, reporting an underflow.
Also loading thru Win32Forth IDE stops with an error at your first macro definition.

It seems that your implementation is more than twice as fast as my very simple implementation. Perhaps that doesn't matter so much in string handling, but if big integers could use the same idea it would mean a lot.

lehs

unread,
Apr 17, 2017, 3:46:28 AM4/17/17
to
On Monday, April 17, 2017 at 08:09:17 UTC+2, hughag...@gmail.com wrote:
I don't blame you for the suspicion, but it was the result that I got. After I repaired an error caused by a line wrap, the code paste-loaded on Win32Forth without leaving numbers on the stack, and the test was much faster (about half a second).

Maybe I will try to separate string-stack from the novice package and then eventually be able to understand how it works.

Paul Rubin

unread,
Apr 17, 2017, 4:19:52 AM4/17/17
to
lehs <skydda...@gmail.com> writes:
> I might be able to speed up my bigintegers.4th

I wonder if you could run a test for me: let p = 17**250 + 2704
(this number happens to be prime) and time how long it takes for
your implementation to do a Fermat test: calculate (3**p) mod p
which should be equal to 3.

My feeling is that if this takes less than a few seconds, your package
is reasonably usable for public-key cryptography.
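
(For reference, the same check sketched in Python, whose integers are arbitrary precision. `mod_pow` and `fermat_check` are names made up for this sketch; the square-and-multiply loop is a plain textbook method and is not meant to describe how any Forth bignum package does it. Passing the test with one base does not by itself prove that p is prime.)

```python
def mod_pow(base, exponent, modulus):
    """Right-to-left square-and-multiply: base**exponent mod modulus."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                    # this bit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus      # square for the next bit
        exponent >>= 1
    return result


def fermat_check(p, base=3):
    # Fermat: if p is prime, then base**p is congruent to base mod p.
    return mod_pow(base, p, p) == base % p


p = 17**250 + 2704
print(fermat_check(p))
```

Reducing after every multiplication keeps the intermediate values below the square of the modulus, which is what makes the calculation feasible at this size.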

lehs

unread,
Apr 17, 2017, 5:06:14 AM4/17/17
to
utime b 17 b 250 b** b 2704 b+ bthree bswap bdup b**mod utime d- d.
-4797000 ok
utime b 17 b 250 b** b 2704 b+ bthree bswap bdup b**mod~ utime d- d.
-547000 ok
.b

3
3 ok

With SP-Forth: 5 sec without Barrett reduction and 0.5 seconds with Barrett reduction.

With GForth Android it takes about 18 sec without and 2 seconds with.

lehs

unread,
Apr 17, 2017, 5:50:54 AM4/17/17
to
With GForth 64 bit Windows:
utime b 17 b 250 b** b 2704 b+ bthree bswap bdup b**mod utime d- d. -4264661 ok
utime b 17 b 250 b** b 2704 b+ bthree bswap bdup b**mod~ utime d- d. -560484 ok
.b
3
3 ok

Win32Forth (stopwatch):
48 sec without and 5 sec with Barrett reduction.

Anton Ertl

unread,
Apr 17, 2017, 8:13:27 AM4/17/17
to
"Rudy Velthuis" <newsg...@rvelthuis.de> writes:
>Anton Ertl wrote:
>
>> > The address-length pair only make sense if the string
>> > data is read directly from some kind of buffer and the buffer should
>> > not/cannot be modified. This only makes sense for transient data,
>> > e.g. when parsing a word or string of which the string data that
>> > can easily be discarded after use (e.g. when interpreting or
>> > compiling and you only parse to, er, find out what to do next).
>>
>> Much of my strings usage is like that. A good reason to use c-addr u.
>> Do you really want to copy a substring to a separate buffer (and where
>> do you allocate that?) before you can, e.g., write it to a file? I
>> don't.
>
>No, of course not. Then you use c-addr u, like in other languages.

So that's why we have c-addr u. Now adding another string
representation complicates things. I need to decide which string
representation is appropriate where, have to keep track of which
string representation is used where, have to insert conversion words
in various places, have to write some words in two variants, etc. And
what benefit do we get for putting up with these complications?

> But
>if you have to manipulate or keep around more than one string, and
>these are referenced in more than one spot, it often makes sense to
>have semi-permanent or permanent storage. So, yes, often such strings
>are simply stored in the word, or in an allocated buffer, or, well, in
>a separate heap, but then a double-cell "pointer" is useless.

The c-addr u representation is just as useful as ever: It tells you
where the string starts and how long it is.

hughag...@gmail.com

unread,
Apr 17, 2017, 4:11:22 PM4/17/17
to
On Monday, April 17, 2017 at 5:13:27 AM UTC-7, Anton Ertl wrote:
> "Rudy Velthuis" <newsg...@rvelthuis.de> writes:
> >Anton Ertl wrote:
> >
> >> > The address-length pair only make sense if the string
> >> > data is read directly from some kind of buffer and the buffer should
> >> > not/cannot be modified. This only makes sense for transient data,
> >> > e.g. when parsing a word or string of which the string data that
> >> > can easily be discarded after use (e.g. when interpreting or
> >> > compiling and you only parse to, er, find out what to do next).
> >>
> >> Much of my strings usage is like that. A good reason to use c-addr u.
> >> Do you really want to copy a substring to a separate buffer (and where
> >> do you allocate that?) before you can, e.g., write it to a file? I
> >> don't.
> >
> >No, of course not. Then you use c-addr u, like in other languages.
>
> So that's why we have c-addr u.

Rudy Velthuis doesn't know what he is talking about. Consumers work fine with derivative strings; they don't need to convert the derivative into a unique (which involves calling ALLOCATE and copying the string over). You can write a string to a file or whatever.

Only modifiers need to convert the derivative into a unique (or, if it is a unique already, need to convert all of its derivatives into uniques). Modifiers (such as REVERSE$ etc.) aren't used very much, so this doesn't need to be done very much.

Anton Ertl doesn't know what he is talking about either, because he cheerfully agreed with Rudy Velthuis in this bogus criticism of my STRING-STACK.4TH package. It is possible that Anton Ertl doesn't know how COW (copy-on-write) works. It is also possible that Anton Ertl does know how COW (copy-on-write) works and he is just being dishonest --- pretending that any criticism of my STRING-STACK.4TH is valid --- brown-nosing Elizabeth Rather as usual (because she appointed him to his job as the Forth-200x committee chairperson).

> Now adding another string
> representation complicates things. I need to decide which string
> representation is appropriate where, have to keep track of which
> string representation is used where, have to insert conversion words
> in various places, have to write some words in two variants, etc. And
> what benefit do we get for putting up with these complications?

LOL! No, you don't have to decide which string representation is appropriate where, or keep track of which string representation is used where, etc.
You just press the flush lever on Elizabeth Rather's stupid string representation! Why is this so difficult for you???

It is utterly stupid to pass strings in global variables (the <# #> pad, the S" pad, and PAD).
It is utterly stupid to have address/count pairs on the data-stack.
It is utterly stupid to have some strings that can't be modified (because they came from S" or S\"), although there is nothing to actually stop you from modifying them, and have to remember this.
It is utterly stupid to have some strings that are on the heap and will eventually need a FREE, and have to remember this.
It is utterly stupid to have some strings that are in global variables and will get overwritten, and have to remember this.
It is utterly stupid to modify strings willy-nilly without any kind of system in place.

It is never appropriate to use Elizabeth Rather's shit-head string scheme! Way back in 1984 I was well aware of the problem with passing data between functions in global variables. This was the major problem in line-number BASIC. I really expected Forth to do better, but Forth-83 had the same problem. Forth-83 was introduced one year after Elizabeth Rather kicked Charles Moore out of Forth Inc. and it was Elizabeth Rather's shit-head scheme. At that time, everybody expected a new standard pretty soon that would fix these obvious blunders. A lot of time went by! Eventually ANS-Forth came out in 1994 and this continued to feature Elizabeth Rather's shit-head scheme.

The whole point of introducing a new standard is to kick Elizabeth Rather out of the Forth community. She is a shit-head! She has to go!

Paul Rubin

unread,
Apr 17, 2017, 4:14:14 PM4/17/17
to
lehs <skydda...@gmail.com> writes:
> With SP-Forth: 5 sec without Barrett reduction and 0.5 seconds with
> Barrett reduction.
> With GForth Android it takes about 18 sec without and 2 seconds with.

Wow, that's impressive for pure Forth. Is SP-Forth on an x86?
I mostly wanted the time for just the b**mod but I guess 17**250+2704
didn't use much of the time.

Thanks!

lehs

Apr 17, 2017, 6:58:21 PM
GForth Android runs on 1.6 GHz Rockchip RK3066
The others on 1.6 GHz Celeron
both dual core.

I've no idea how to code public-key cryptography or why I would need it.

Rudy Velthuis

Apr 17, 2017, 7:07:52 PM
Anton Ertl wrote:

> >> Much of my strings usage is like that. A good reason to use c-addr u.
> >> Do you really want to copy a substring to a separate buffer (and where
> >> do you allocate that?) before you can, e.g., write it to a file? I
> >> don't.
> >
> > No, of course not. Then you use c-addr u, like in other languages.
>
> So that's why we have c-addr u. Now adding another string
> representation complicates things.

I am not talking about another string *representation*. If I were to
design this, I would store strings (when there is a need to store them)
as counted strings, but with a 32 bit length. But like in other
languages, if you copy or search or do other operations on strings, you
must often also supply a length. Then you supply one or more u's.
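
As a rough illustration of the storage format described above (a sketch only, not an existing Forth facility; the helper names `pack_counted` and `unpack_counted` are made up for this example, and little-endian byte order is an assumption), a counted string with a 32-bit length cell could look like this in Python:

```python
import struct

def pack_counted(text: bytes) -> bytes:
    """Store a string as a 32-bit little-endian length followed by its bytes."""
    return struct.pack("<I", len(text)) + text

def unpack_counted(buf: bytes, offset: int = 0) -> bytes:
    """Read the length cell at `offset`, then return that many bytes."""
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    return buf[start:start + length]

stored = pack_counted(b"Here is some text")
print(len(stored))             # 4-byte length cell plus the characters
print(unpack_counted(stored))
```

The point of the layout is that a stored string needs only one address, while operations on parts of it still take an explicit length.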

--
Rudy Velthuis http://www.rvelthuis.de

"It is a miracle that curiosity survives formal education."
-- Albert Einstein

Rudy Velthuis

Apr 17, 2017, 7:13:59 PM
hughag...@gmail.com wrote:

> My STRING-STACK.4TH does use COW (copy on write). Weirdly enough, you
> snipped my entire description of how I use COW and showed one
> sentence out of context that seemed to imply that I didn't use COW.

You seemed to think that storing on the heap requires a lot of copying.
No, it doesn't necessarily do that.

> COW only works when you know that a pointer is a string, and you know
> how to find all strings.

COW does not rely on that at all. That is where refcounting would come
into play.
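
To make the refcounting idea concrete, here is a minimal Python sketch (the class `CowString` and its methods are hypothetical, and it only covers whole-buffer sharing, not substrings). Each handle shares a buffer and a count, and a write copies the buffer only when someone else still holds it:

```python
class CowString:
    """Handle to a shared buffer; copies only when a shared buffer is written."""

    def __init__(self, data: bytes):
        self._buf = bytearray(data)
        self._refs = [1]          # one-element list so all handles share the count

    def share(self) -> "CowString":
        """Return a new handle to the same buffer and bump the count."""
        other = CowString.__new__(CowString)
        other._buf = self._buf
        other._refs = self._refs
        self._refs[0] += 1
        return other

    def set_char(self, index: int, ch: int) -> None:
        """Write one byte, copying first if anyone else holds the buffer."""
        if self._refs[0] > 1:
            self._refs[0] -= 1            # leave the shared buffer behind
            self._buf = bytearray(self._buf)
            self._refs = [1]
        self._buf[index] = ch

    def value(self) -> bytes:
        return bytes(self._buf)

a = CowString(b"abcdefgh")
b = a.share()
b.set_char(0, ord("X"))
print(a.value(), b.value())
```

With a count attached to the buffer, the writer does not need to find the other holders; it only needs to know whether any exist.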

--
Rudy Velthuis http://www.rvelthuis.de

"Cancel the kitchen scraps for lepers and orphans! No more
merciful beheadings! And call off Christmas!"
-- The Sheriff of Nottingham

Rudy Velthuis

Apr 17, 2017, 7:18:43 PM
hughag...@gmail.com wrote:

> > > No, of course not. Then you use c-addr u, like in other languages.
> >
> > So that's why we have c-addr u.

I understand the need for c-addr u in certain situations, but it is IMO
not a good idea to always have to supply the u. Only supply it where
it is needed.

> Rudy Velthuis doesn't know what he is talking about. Consumers work
> fine with derivative strings;

Rudy Velthuis does know what he is talking about. He is just not so
familiar with current affairs in Forth yet and wonders why certain
things are done the way they are currently done, in Forth.

I don't quite know what you mean by "derivative strings" though.

And where I come from, strings are often modified, e.g. concatenated,
replaced, searched, etc. They often have a life that is a little bit
longer than what could be found in the current input buffer or other
volatile storage.

I assume this is not the case in most Forth code? Or are such strings
generally compiled into a definition or stored in a variable, meaning
they are copied after all?

--
Rudy Velthuis http://www.rvelthuis.de

"Alas, to wear the mantle of Galileo it is not enough that you
be persecuted by an unkind establishment, you must also be
right." -- Bob Park

Rudy Velthuis

Apr 17, 2017, 7:22:48 PM
lehs wrote:

> I don't know how ALLOCATE & Co are implemented in Win32Forth, but
> SP-Forth use winapi. I will test SP-Forth and GForth for Androids and
> see what I get.

But I assume that SP-Forth only allocates large chunks from the WinAPI
and then sub-allocates those (using its own memory manager) into
smaller chunks when they are requested? Otherwise it could be pretty
slow, indeed.
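
For readers unfamiliar with the idea, sub-allocation can be sketched as follows. This is a generic illustration in Python, not a description of how SP-Forth works; the `Arena` class is hypothetical, and the `bytearray` merely stands in for one large chunk obtained from the operating system:

```python
class Arena:
    """Carve small blocks out of one large chunk obtained up front."""

    def __init__(self, chunk_size: int):
        self.chunk = bytearray(chunk_size)   # stands in for one OS-level allocation
        self.next_free = 0

    def allocate(self, size: int) -> memoryview:
        """Hand out the next `size` bytes; no call to the OS is needed."""
        if self.next_free + size > len(self.chunk):
            raise MemoryError("arena exhausted")
        block = memoryview(self.chunk)[self.next_free:self.next_free + size]
        self.next_free += size
        return block

arena = Arena(1024)
first = arena.allocate(16)
second = arena.allocate(32)
first[:3] = b"abc"            # writes land inside the big chunk
print(arena.next_free)
```

A real memory manager would also track freed blocks for reuse; this sketch only shows why small requests need not each cost a system call.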

--
Rudy Velthuis (dentist)

"For there was never yet philosopher
That could endure the toothache patiently."
-- William Shakespeare, Much Ado About Nothing

Albert van der Horst

Apr 17, 2017, 7:48:41 PM
In article <xn0kov39w...@nntp.aioe.org>,
Rudy Velthuis <newsg...@rvelthuis.de> wrote:
>hughag...@gmail.com wrote:
>
>> > > No, of course not. Then you use c-addr u, like in other languages.
>> >
>> > So that's why we have c-addr u.
>
>I understand the need for c-addr u in certain situations, but it is IMO
>not a good idea to always having to supply the u. Only supply it where
>it is needed.
>
>> Rudy Velthuis doesn't know what he is talking about. Consumers work
>> fine with derivative strings;
>
>Rudy Velthuis does know what he is talking about. He is just not so
>familiar with current affairs in Forth yet and wonders why certain
>things are done the way they are currently done, in Forth.

Both string constants (addr n) and string variables (storage with
the first cell containing the length) are useful.
The ease with which forth can pass around double items on the stack
means that we have no need to allocate or store strings
in many cases. In C that would require a struct instead of '' char *pc ''

This script copies a file in argument 1 to standard output
and removes all empty lines:

-------------------------
#!/usr/bin/lina -s

1 ARG[] GET-FILE
BEGIN DUP WHILE ^J $/ -TRAILING DUP IF TYPE CR ELSE 2DROP THEN REPEAT
2DROP
----------------------------------------------

All strings here are (addr n) , the only buffer is implicitly allocated
by GET-FILE. Pieces of the input are copied to the output.
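
For comparison, the same approach can be sketched in Python. This assumes that `$/` splits at the first linefeed and that `-TRAILING` strips trailing spaces; the function names are hypothetical. Each (start, stop) pair plays the role of an (addr n) string pointing into the single buffer that holds the file:

```python
def nonblank_lines(data: bytes):
    """Yield (start, stop) index pairs for each non-empty line in `data`.

    Nothing is copied: each pair just points into the one input buffer.
    """
    pos = 0
    while pos < len(data):
        newline = data.find(b"\n", pos)
        end = len(data) if newline < 0 else newline
        stop = end
        while stop > pos and data[stop - 1] == 0x20:   # like -TRAILING
            stop -= 1
        if stop > pos:
            yield pos, stop
        pos = end + 1

def copy_without_blank_lines(data: bytes, out) -> None:
    """Write every non-empty line of `data` to the binary stream `out`."""
    view = memoryview(data)
    for start, stop in nonblank_lines(data):
        out.write(view[start:stop])    # a slice of the view, not a copy
        out.write(b"\n")
```

It could be called as `copy_without_blank_lines(open(path, "rb").read(), sys.stdout.buffer)`, where `path` is the file named on the command line.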

>
>--
>Rudy Velthuis http://www.rvelthuis.de

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Paul Rubin

Apr 17, 2017, 7:57:45 PM
lehs <skydda...@gmail.com> writes:
> GForth Android runs on 1.6 GHz Rockchip RK3066
> The others on 1.6 GHz Celeron both dual core.

Pretty good. On the original IBM PC, with a C implementation, this
calculation apparently would have taken minutes.

> I've no idea how to code public-key cryptography or why I would need it.

Basically say I want to send you a secret message. You pick two random
primes p and q and compute the product N=pq and send me N. Other people
(including me) can't find p and q from N, under the assumption that
factoring numbers of this size is computationally intractable. Now you
figure out the totient phi(N)=(p-1)*(q-1), which is the size of Z*(N),
the group of numbers coprime to N. Since you can find p and q from
phi(N), that means finding phi(N) is as hard as factoring, so N won't
leak info about phi(N) more than it does about the factors.

Now use the extended Euclidean algorithm to find a number d such that
d*3 == 1 (mod phi(N)) and keep d secret. I take my message M and send
you M**3 mod N. You use d to find M**(3*d) mod N, but since 3*d-1 is
a multiple of phi(N), that means that M**(3*d) mod N is just M.

So without telling me anything secret, on a channel where anyone can be
listening, you've sent me info that lets me send you an encrypted
message. This is public-key cryptography and the above algorithm is
called "RSA" after its inventors Rivest, Shamir, and Adleman.

Obviously there are a lot of practical details glossed over above, but
basically it's all about how fast you can do modular exponentials in a
group of size around 2**1000. That's why I wanted that timing.
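
The steps above can be tried with toy numbers. The following Python sketch uses tiny primes picked for illustration (real keys are hundreds of digits long), and it adds one assumption the description leaves implicit: 3 must be coprime to phi(N) for d to exist.

```python
def egcd(a: int, b: int):
    """Extended Euclid: return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

# Toy primes, chosen so that 3 is coprime to (p-1)*(q-1).
p, q = 11, 17
N = p * q                      # public modulus
phi = (p - 1) * (q - 1)        # totient, kept secret
g, d, _ = egcd(3, phi)
assert g == 1                  # otherwise no inverse of 3 exists
d %= phi                       # secret exponent with 3*d == 1 (mod phi)

M = 42                         # the message, a number below N
C = pow(M, 3, N)               # sender computes M**3 mod N
recovered = pow(C, d, N)       # receiver computes C**d mod N
print(N, d, recovered)
```

Python's three-argument `pow` performs the modular exponentiation, which is the operation whose speed the timings in this thread are about.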

Elizabeth D. Rather

Apr 17, 2017, 9:07:13 PM
Versions were implemented by FORTH, Inc. and MPE for the Europay project
in the 90's, for processing smart card transactions. On an 8051 (which
was used in some of the terminals we programmed) it took some seconds
(maybe 20-30, but it feels like an eternity). Nowadays smart card
terminals are proliferating in the US, and I marvel at how slow some of
them are: that algorithm is exactly what is taking all the time. The
terminals we programmed in Forth in the 90's were faster than most of
the ones we see in stores.

Cheers,
Elizabeth

--
Elizabeth D. Rather
FORTH, Inc.
6080 Center Drive, Suite 600
Los Angeles, CA 90045
USA

David N. Williams

Apr 17, 2017, 9:11:53 PM
Out of curiosity, I tried this with the integer part of my forth-gmpfr
package, which has a garbage collected bignum stack in ANS Forth, with
bindings to the GNU Multiple Precision library:

http://www.umich.edu/~williams/archive/forth/gmpfr/

I used pfe for the timings. In the file names 3^p_mod_p_xxx.fs below,
"xxx" indicates the number of trials. The trials were run on a MacBook
Pro 2.2 GHz Intel Core i7 (4 cores) under macOS Sierra 10.12.4. The
gmpfr package and GNU libraries used haven't been changed since January,
2012.

[haag:~/gmpfr] dnwills% /usr/bin/time -p pfe -q -y 3^p_mod_p_100.fs
real 0.32
user 0.31
sys 0.00
[haag:~/gmpfr] dnwills% /usr/bin/time -p pfe -q -y 3^p_mod_p_200.fs
real 0.70
user 0.69
sys 0.00
[haag:~/gmpfr] dnwills% /usr/bin/time -p pfe -q -y 3^p_mod_p_1000.fs
real 2.95
user 2.94
sys 0.00

So the user time for one calculation is less than 3 milliseconds. I
don't know how much of that is bignum stack overhead, compared to the
GNU library calls.

Here's the 1000-trial code:

3^p_mod_p_1000.fs
-----------------
\ Title: GMPFR Benchmark for (3**p) mod p with a large prime p
\ File: 3^p_mod_p_1000.fs
\ Author: David N. Williams
\ Date: April 17, 2017
\ License: public domain

\ To run with pfe:
\ /usr/bin/time -p pfe -q -y 3^p_mod_p_1000.fs

true constant MULTI-INTEGERS
s" gmpfr.fs" INCLUDED
17 u>b 250 b^u 2704 bu+ bconstant p

: tries ( n -- )
  0 DO
    3 u>b p p b-pow-mod  \ 3^p mod p
    bdrop
    \ b.
  LOOP
;

1000 tries
bye
-----------------

Here are some checks (long result broken into lines):

p b. 40947778423540104959067988292528857405972603447195632195037
8594747292053767136408255971273136413908600752651227962494772762
1804563995511385846512375859465262648854834054706258499178583491
6459459029262407818318245524277482110372262457010816382213629216
030118638993632313978231084140883786669390389605436763953 ok
3 u>b p p b-pow-mod b. 3 ok

I haven't tested whether the value of p above is correct, but at least
it's big, odd, and produces 3 in the calculation.
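
As an independent cross-check, the same quantities can be computed with Python's built-in integers, assuming p is meant to be 17**250 + 2704 as in the code above:

```python
p = 17**250 + 2704

print(len(str(p)))     # number of decimal digits
print(p % 2)           # 1 means odd
print(pow(3, p, p))    # three-argument pow does the modular exponentiation
```

This says nothing about the speed of the Forth code; it only gives a second opinion on the values.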

-- David W.

lehs

Apr 17, 2017, 9:14:32 PM
I don't know if SP-Forth has a memory manager, but ALLOCATE and FREE seem to be implemented by kernel32.dll. I wrongly supposed that Hugh had used ALLOCATE for his String-stack. His 2k lines of code depend on the 3k lines of his novice package, and it isn't readily accessible to me even though I have the source.

I would like faster stacks than mine, but I don't know anything about this "memory management" and what it can do for me.

Paul Rubin

Apr 17, 2017, 9:15:26 PM
"David N. Williams" <will...@umich.edu> writes:
> So the user time for one calculation is less than 3 milliseconds.

Cool. Yes, GMP is very highly optimized, both in terms of using fancy
specialized algorithms for the modexp (e.g. Montgomery reduction), and
in using carefully tuned assembly code to get max performance from the
machine arithmetic. I didn't even mention GMP in my post because
there's no way to expect a pure Forth implementation to come anywhere
near it. I just wanted to know whether the Forth version was fast
enough to be useful at all, which it is.
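
For contrast with the specialized methods mentioned above, the textbook baseline is plain binary square-and-multiply. The Python sketch below shows that baseline only; it is not what GMP does, and it is not taken from any of the Forth code in this thread:

```python
def modexp(base: int, exponent: int, modulus: int) -> int:
    """Right-to-left binary exponentiation: one squaring per exponent bit."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                       # this bit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus         # square for the next bit
        exponent >>= 1
    return result

print(modexp(3, 200, 13) == pow(3, 200, 13))
```

Each loop step costs one or two multiplications followed by a reduction, so the speed of the reduction (where Barrett or Montgomery methods come in) dominates the total time.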

hughag...@gmail.com

Apr 17, 2017, 9:44:28 PM
I do use ALLOCATE for unique strings.

lehs

Apr 17, 2017, 10:13:13 PM
That was fast. My hobby project started with trying to do arithmetic on ASCII strings and has become several hundred times faster over time. But obviously there is much left to do.

The purpose with my ANS-Forth library is to have something to refer to when writing Forth code at OEIS, Euler, Rosetta etc that is easy to load.

I have loaded your gmpfr-0.9.0 file and will examine it. I knew there was fast C code like that, but I didn't know there was a stack-based big integer package for ANS Forth.

Paul Rubin

Apr 17, 2017, 10:21:58 PM
lehs <skydda...@gmail.com> writes:
> I have loaded your gmpfr-0.9.0-file and will examine it. I knew there
> where that fast c-code,

GMP is extremely complicated and impressive and includes a lot of
assembly code, not just C. It would be a huge amount of work to
replicate all the stuff that it does. For example, it uses FFT-based
multiplication when the numbers involved are large enough. It also
has a highly optimized version of modexp for use in crypto.

> but I didn't know there was a stack based big integer for ANS Forth.

That does sound helpful, that there's a wrapper ready to use.

hughag...@gmail.com

Apr 17, 2017, 10:49:53 PM
On Monday, April 17, 2017 at 4:18:43 PM UTC-7, Rudy Velthuis wrote:
> hughag...@gmail.com wrote:
>
> > > > No, of course not. Then you use c-addr u, like in other languages.
> > >
> > > So that's why we have c-addr u.
>
> I understand the need for c-addr u in certain situations, but it is IMO
> not a good idea to always having to supply the u. Only supply it where
> it is needed.
>
> > Rudy Velthuis doesn't know what he is talking about. Consumers work
> > fine with derivative strings;
>
> Rudy Velthuis does know what he is talking about. He is just not so
> familiar with current affairs in Forth yet and wonders why certain
> things are done the way they are currently done, in Forth.
>
> I don't quite know what you mean with "derivative strings" though.

You don't know what you are talking about when you criticize my STRING-STACK.4TH for not using COW --- it does use COW --- that is the whole point.

A "unique" string is on the heap. A "derivative" string is a reference to another string, either a unique or a non-mutable (such as came from S").

The consumers (such as .$ DROP$ etc.) will FREE the string afterward if it is unique, but will first convert any derivatives of that string into uniques (involves calling ALLOCATE and copying the string). Whether it is unique or derivative, it gets dropped from the string-stack. I have a lot of functions for searching strings, and these are all consumers.

The modifiers (such as REVERSE$ etc.) change a string. If the string is unique, they first search the string-stack for any derivatives of this string and convert them into uniques (involves calling ALLOCATE and copying the string). If the string is derivative, they first convert the string into a unique (involves calling ALLOCATE and copying the string).

My string-concatenation function +$ was one of the most complicated functions I've written in quite some time. It was complicated because there are two parameters, and each parameter can be either unique or derivative, so there are 4 permutations of kinds of strings being concatenated.

I have a lot of functions for finding substrings. These are not modifiers. They produce a derivative string whose address is inside of the unique string. If that unique later gets modified or consumed, they are found on the string-stack and determined to be derivatives of that string, so they get converted into uniques at that time.

The stack jugglers (such as SWAP$ ROT$ etc.) just move around the items on the string-stack without concern as to them being unique or derivative. The duplicators (such as DUP$ OVER$ etc.) make a derivative string (it doesn't matter whether the string being duplicated was a unique or a derivative).
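
The unique/derivative rules described above can be modelled in a few lines of Python. This is a simplified sketch for readers, not a port of STRING-STACK.4TH: the classes `Str` and `StringStack` are hypothetical, `push` always makes a unique (unlike `>$`, which can keep a derivative of an S" string), and only one modifier is shown.

```python
class Str:
    """One string-stack entry: a slice of some buffer, owned or borrowed."""

    def __init__(self, buf: bytearray, start: int, length: int, unique: bool):
        self.buf, self.start, self.length, self.unique = buf, start, length, unique

    def text(self) -> bytes:
        return bytes(self.buf[self.start:self.start + self.length])

    def make_unique(self) -> None:
        """Copy the characters into a fresh buffer owned by this entry."""
        self.buf = bytearray(self.text())
        self.start, self.unique = 0, True


class StringStack:
    def __init__(self):
        self.items = []

    def push(self, data: bytes) -> None:
        """Push a fresh unique string (its own heap buffer)."""
        self.items.append(Str(bytearray(data), 0, len(data), True))

    def dup(self) -> None:
        """Duplicate the top entry as a derivative: no characters are copied."""
        top = self.items[-1]
        self.items.append(Str(top.buf, top.start, top.length, False))

    def _detach_derivatives(self, owner: Str) -> None:
        """Turn every derivative of `owner` on the stack into a unique."""
        for item in self.items:
            if item is not owner and item.buf is owner.buf:
                item.make_unique()

    def reverse(self) -> None:
        """A modifier: copy on write, then change the top string in place."""
        top = self.items[-1]
        if top.unique:
            self._detach_derivatives(top)
        else:
            top.make_unique()
        top.buf[:] = top.buf[::-1]


s = StringStack()
s.push(b"abcdefgh")
s.dup()          # derivative on top of the unique
s.reverse()      # top is a derivative, so it is copied before being changed
print([(item.unique, item.text()) for item in s.items])
```

The copy happens only at the moment of modification, and only because the stack can be searched for entries that share the buffer.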

Notice that when DUP$ OVER$ etc. are used, you end up with a derivative on the stack and the unique that it is derived from underneath. This is a good thing! If you then consume that string, it is very efficient, because consuming a derivative is efficient (no need to mess with the heap). Here is an example:

empty$ ok
s" abcdefgh" >$ dup$ .$ abcdefgh ok
.s$
STRING STACK:
derivative: |abcdefgh| ok

The DUP$ just made a derivative, and then .$ consumed it. The original string (which was a derivative of the non-mutable S" string given to >$) is still on the stack.

Here is another example:

empty$ ok
s" abcdefgh" >$ s" FGH" >$ isuffix$ ok-1
. -1 ok
.s$
STRING STACK:
derivative: |fgh|
derivative: |abcdefgh| ok

The top value of the string-stack (the |fgh| string) is a derivative of the second value of the string-stack (the |abcdefgh| string) --- it is inside of that string that it is derived from.

Here is another example (this one useful enough that I made it into a function and put it in the package):

: maybe-extract$ ( -- extracted? ) \ string: a b -- str \ extract B from A if possible
2dup$ swap$ find$ dup -1 = if drop drop$ false exit then \ drop the -1, DROP$ B and return A with a false flag
len$ anti-mid$
true ;

\ MAYBE-EXTRACT$ returns TRUE if it could extract B from A, and it returns STR which is A with B extracted from it.
\ If it couldn't, then it returns FALSE and it returns STR as A unchanged.

Here is a test:

empty$ ok
s" abcdefgh" >$ s" DEF" >$ maybe-extract$ ok-1
. -1 ok
.s$
STRING STACK:
unique: |abcgh| ok
empty$ ok
s" abcdefgh" >$ s" XYZ" >$ maybe-extract$ ok-1
. 0 ok
.s$
STRING STACK:
derivative: |abcdefgh| ok

Note that here, when we find the string and extract it, the new string is a unique because it has been modified (it had the substring extracted from it).
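
For readers who do not have the package loaded, the observable behavior of the two tests can be mimicked in Python. This is only a behavioral sketch: the function name is hypothetical, the case-insensitive search is inferred from the "DEF" test above, and Python strings are immutable, so the unique/derivative bookkeeping is not modelled.

```python
def maybe_extract(a: str, b: str):
    """Return (True, a with b removed) if b occurs in a, else (False, a)."""
    index = a.lower().find(b.lower())     # case-insensitive, as the test suggests
    if index < 0:
        return False, a
    return True, a[:index] + a[index + len(b):]

print(maybe_extract("abcdefgh", "DEF"))
print(maybe_extract("abcdefgh", "XYZ"))
```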

> And where I come from, strings are often modified, e.g. concatenated,
> replaced, searched, etc. They often have a life that is a little bit
> longer than what could be found in the current input buffer or other
> volatile storate.
>
> I assume this is not the case in most Forth code? Or are such strings
> generally compiled into a definition or stored in a variable, meaning
> they are copied after all?

Most Forth code is crap.

Every time that I write any ANS-Forth code, I help to kill ANS-Forth. This STRING-STACK.4TH is a typical example. By writing it I induced Anton Ertl to step forward and defend Elizabeth Rather's idiotic scheme of holding strings in global variables (the <# #> pad, the S" pad, and PAD) --- he makes himself look like an idiot by doing this --- it is obvious that he does this because Elizabeth Rather appointed him to the Forth-200x committee and he is obliged to defend all of her design decisions in ANS-Forth, not matter how idiotic.

hughag...@gmail.com

Apr 18, 2017, 3:10:13 AM
On Monday, April 17, 2017 at 1:11:22 PM UTC-7, hughag...@gmail.com wrote:
> On Monday, April 17, 2017 at 5:13:27 AM UTC-7, Anton Ertl wrote:
> > Now adding another string
> > representation complicates things. I need to decide which string
> > representation is appropriate where, have to keep track of which
> > string representation is used where, have to insert conversion words
> > in various places, have to write some words in two variants, etc. And
> > what benefit do we get for putting up with these complications?
>
> LOL! No, you don't have to decide which string representation is appropriate where, or keep track of which string representation is used where, etc..
> You just press the flush lever on Elizabeth Rather's stupid string representation! Why is this so difficult for you???
>
> It is utterly stupid to pass strings in global variables (the <# #> pad, the S" pad, and PAD).
> It is utterly stupid to have address/count pairs on the data-stack.
> It is utterly stupid to have some strings that can't be modified (because they came from S" or S\"), although there is nothing to actually stop you from modifying them, and have to remember that this.
> It is utterly stupid to have some strings that are on the heap and will eventually need a FREE, and have to remember this.
> It is utterly stupid to have some strings that are in global variables and will get overwritten, and have to remember this.
> It is utterly stupid to modify strings willy-nilly without any kind of system in place.

I'm trying to kill ANS-Forth. I think STRING-STACK.4TH just by itself is capable of killing ANS-Forth. The supporters of ANS-Forth are put in the awkward position of defending the use of global variables for holding strings, which is utterly stupid. I knew that PAD was a bad idea in 1984 at the age of 18. It doesn't take a programming expert to understand this. Yet, the ANS-Forth supporters have to defend PAD because Elizabeth Rather put PAD in the ANS-Forth standard, and she continues to be highly enthusiastic about PAD over a quarter of a century later.

Although killing ANS-Forth is a worthy goal, it is also a pretty easy goal, as ANS-Forth is almost dead. There are fewer than 20 ANS-Forth programmers in the world today, and all of them are old.

Another goal I had with writing STRING-STACK.4TH is to provide a useful tool for my program that parses Ido text. There are about 200 people in the world who speak Ido, so that is an order of magnitude more popular than ANS-Forth, which means that somebody might actually use my program. With this goal in mind, I provided some functions for extracting prefixes and suffixes from words:
---------------------------------------------------------------------------

Section 1.4.) prefixes and suffixes and infixes, oh my!

PREFIX$ ( -- found? ) \ string: a b -- a | c
This determines if B is a prefix of A. If PREFIX$ returns true, then it returns C which is the prefix inside of A (it is a derivative).

IPREFIX$ ( -- found? ) \ string: a b -- a | c
This is like PREFIX$ except case-insensitive.

SUFFIX$ ( -- found? ) \ string: a b -- a | c
This determines if B is a suffix of A. If SUFFIX$ returns true, then it returns C which is the suffix inside of A (it is a derivative).

ISUFFIX$ ( -- found? ) \ string: a b -- a | c
This is like SUFFIX$ except case-insensitive.

INFIX$ ( -- found? ) \ string: a b -- a | c
This determines if B is an infix of A. If INFIX$ returns true, then it returns C which is the infix inside of A (it is a derivative).

IINFIX$ ( -- found? ) \ string: a b -- a | c
This is like INFIX$ except case-insensitive.

EXTRACT$ ( -- ) \ string: a b -- c d
This requires B to be a derivative inside of A (it is also okay for B to be an empty unique). EXTRACT$ removes the B string from the A string and returns the prefix (the C string) and the suffix (the D string). This should only be used on the results returned by: INFIX$ PREFIX$ SUFFIX$ IINFIX$ IPREFIX$ or ISUFFIX$

Note that EXTRACT$ is the only function we have that requires the parameters to be derived one from the other. All of our other functions work on either unique or derivative strings. EXTRACT$ is context-sensitive, in that it is supposed to be used after certain other functions. See also ANTIMID$ that uses EXTRACT$ internally.
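
The relationship between these words can be sketched in Python by representing a derivative as a (start, end) span inside the original string. The function names below are hypothetical stand-ins, not the package's API, and the case-insensitive variants assume ASCII-like text where lowercasing keeps the length unchanged.

```python
def infix(a: str, b: str, ignore_case: bool = False):
    """Return the (start, end) span of b inside a, or None if absent."""
    hay, needle = (a.lower(), b.lower()) if ignore_case else (a, b)
    start = hay.find(needle)
    return None if start < 0 else (start, start + len(b))

def prefix(a: str, b: str, ignore_case: bool = False):
    """Return the span of b at the front of a, or None."""
    hay, needle = (a.lower(), b.lower()) if ignore_case else (a, b)
    return (0, len(b)) if hay.startswith(needle) else None

def suffix(a: str, b: str, ignore_case: bool = False):
    """Return the span of b at the end of a, or None."""
    hay, needle = (a.lower(), b.lower()) if ignore_case else (a, b)
    return (len(a) - len(b), len(a)) if hay.endswith(needle) else None

def extract(a: str, span):
    """Remove the span from a; return the text before it and the text after it."""
    start, end = span
    return a[:start], a[end:]

word = "abcdefgh"
print(suffix(word, "FGH", ignore_case=True))
print(extract(word, infix(word, "def")))
```

As with EXTRACT$, `extract` only makes sense when the span really came from a search on the same string, which is why it takes the span rather than a second independent string.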

lehs

Apr 18, 2017, 4:25:51 AM
I'm not going to replicate it, but if I can understand how to make it work I could use it as it is. But even that seems very difficult.

Alex

Apr 18, 2017, 8:48:15 AM
On 4/18/2017 08:10, hughag...@gmail.com wrote:
> I'm trying to kill ANS-Forth. I think STRING-STACK.4TH just by itself is capable of killing ANS-Forth.

Snort.

--
Alex

hughag...@gmail.com

Apr 18, 2017, 12:09:36 PM
On Monday, April 17, 2017 at 5:13:27 AM UTC-7, Anton Ertl wrote:
ANS-Forth is a cult. Being an ANS-Forth "programmer" is all about loyalty. Elizabeth Rather, the creator of ANS-Forth, has declared herself to be the "leading expert" of Forth, and she has surrounded herself with sycophants who accept this. The only interpretation of the title "leading expert" is that all Forth programmers are inferior to her in knowledge of Forth.

Like all cults, ANS-Forth is founded upon willful ignorance; knowledge is what the cult members hate and fear the most. Above we see Anton Ertl speaking out against my STRING-STACK.4TH --- most likely, he actually knows what COW (copy on write) is, but he is pretending that he is ignorant of this for the sake of the cult. Elizabeth Rather provided ANS-Forth with static buffers to hold strings (the <# #> and the S" buffer and PAD which is limited to 84 chars) because she is too dumb to know better. She says:
-----------------------------------------------------------------------
People are much too phobic about [global] variables. The cry, "They aren't
re-entrant" simply means you have to look at the whole application and how
it is organized.
-----------------------------------------------------------------------
This works well when the "whole application" is small enough to fit on the chalkboard that she uses in her novice class at Forth Inc. (also taught by Andrew Haley, whom she appointed to the Forth-200x committee as a reward for his loyalty). This doesn't work well when the "whole application" is non-trivial. Also, this assumes that the programmer has the source-code for the "whole application," which isn't true if code-libraries are being used.

Elizabeth Rather is an ignoramus. By this, I mean that she is not just ignorant, but she is proud of her ignorance and she refuses to learn. She is really dumber than a box of rocks! The cult members who accept Elizabeth Rather as their "leading expert" are required to accept an abysmal level of ignorance for themselves so they can stay safely below their cult member. This is why I say that STRING-STACK.4TH will kill ANS-Forth --- code like this shines a spotlight on the stupidity of ANS-Forth --- the cult members are put in the awkward position of defending PAD as being superior --- this makes them look stupid, even by the standards of cults, which are pretty low. If I just said that ANS-Forth is stupid, but didn't write any ANS-Forth code, the cult members would respond by saying that I am incapable of writing ANS-Forth code and hence have nothing to say on the subject of ANS-Forth. Of course, the cult members say this anyway (Alex McDonald has spent almost 4 years saying that my array SORT proves that I have a "serious misunderstanding of how pointers work"), but their accusations are so ridiculous that they just make themselves look foolish --- every time that I write ANS-Forth code, this makes the ANS-Forth enthusiasts look foolish, which is why there is a dearth of volunteers to be ANS-Forth enthusiasts (none under the age of 40).

hughag...@gmail.com

Apr 18, 2017, 12:15:09 PM
Please tell us more about how your leading expert's PAD is superior to my STRING-STACK.4TH package as a:
"transient region that can be used to hold data for intermediate processing."

Try to be more articulate --- rather than just saying "snort" you could say: "oink" --- that would be almost like using words!

Hee Haw!

David N. Williams

Apr 18, 2017, 12:27:08 PM
On 4/17/17 9:21 PM, Paul Rubin wrote:
> lehs <skydda...@gmail.com> writes:
>> I have loaded your gmpfr-0.9.0-file and will examine it. I knew there
>> where that fast c-code,

Unfortunately I am no longer able to run it with the gforth installed on
my system. Sometime in the next few weeks I'll try to update the GMP
and MPFR libraries and C-call interfaces. Presumably the version that
Marcel built into iForth works fine. I haven't been keeping up for
quite a while now.

> GMP is extremely complicated and impressive and includes a lot of
> assembly code, not just C. It would be a huge amount of work to
> replicate all the stuff that it does. For example, it uses FFT-based
> multiplication when the numbers involved are large enough. It also
> has a highly optimized version of modexp for use in crypto.
>
>> but I didn't know there was a stack based big integer for ANS Forth.

Forth-gmpfr is an elaborate package, with garbage-collected stacks for
multiple precision integers, rationals, floats, and reliable floats. My
attitude was that GMP and MPFR, as deep, deep implementations of
scientifically and mathematically significant problem domains, justify a
comprehensive treatment in interactive Forth. Of course one doesn't
want to be more elaborate than necessary, and some simplification of the
package is surely possible. But I did put quite a bit of effort into
keeping as much elaboration as possible under the hood, as far as the
user is concerned.

Interfacing with C libraries is itself a serious complication,
especially since we don't have an agreed upon, stable interface that
works with all the major Forths and is respected by updates to those
Forths (AFAIK). Aside from the interface, one has to deal with updates
to the GMP and MPFR libraries. Maybe mostly ignoring them is feasible.

> That does sound helpful, that there's a wrapper ready to use.

It was at one point fairly ready to use, but it will take some work to
make that true now.

I certainly don't want to discourage the development of all-in-Forth
bignums. Forth-gmpfr could be used for testing them. I assume people
are aware of Len Zettel's version in the Forth Scientific Library:

https://www.taygeta.com/fsl/library/big.fth

-- David W.

Alex

Apr 18, 2017, 1:23:57 PM
On 4/18/2017 17:15, hughag...@gmail.com wrote:
> On Tuesday, April 18, 2017 at 5:48:15 AM UTC-7, Alex wrote:
>> On 4/18/2017 08:10, hughag...@gmail.com wrote:
>>> I'm trying to kill ANS-Forth. I think STRING-STACK.4TH just by
>>> itself is capable of killing ANS-Forth.
>>
>> Snort.
>>
>> -- Alex
>
> Please tell us more about how your leading expert's PAD is superior
> to my STRING-STACK.4TH package as a: "transient region that can be
> used to hold data for intermediate processing."
>

Why? I'm not making any stupid or preposterous claims, you are.

--
Alex

lehs

Apr 18, 2017, 1:27:32 PM
ANS-Forth isn't a cult, it's a standard. ANS-Forth programming isn't about loyalty, but about portability. As you very well know. Your Novice package and String-stack have nothing to do with cults or loyalty.

Against my will I have become interested in the Novice package and String-stack. Surely, the Novice package isn't at all for novices. It's a collection of rather smart words, most of them yours. The Novice package is a transformation from ANS-Forth to a preliminary Hugh-Forth. I wouldn't advise novices to learn Hugh-Forth if they had become interested in Forth.

m...@iae.nl

Apr 18, 2017, 2:53:15 PM
Out of curiosity, I tried this with one of the five
bignum packages in iForth, the GINT:

NEEDS -factor

s" 3" s2 >giant
#17 #250 s1 Gm^n s1 #2704 GS+ ( s1 = 17^250 + 2704 )
s1 .giant
\ 4094777842354010495906798829252885740597260344719563219503785947
\ 4729205376713640825597127313641390860075265122796249477276218045
\ 6399551138584651237585946526264885483405470625849917858349164594
\ 5902926240781831824552427748211037226245701081638221362921603011
\ 8638993632313978231084140883786669390389605436763953

s2 s1 s1 GG^MOD
s2 .giant ( should print 3)

: tries ( n -- )
  CR DUP DEC. ." tries: "
  TIMER-RESET
  0 DO
    s" 3" s2 >giant
    s2 s1 s1 GG^MOD
  LOOP .ELAPSED ;

#1000 tries
1000 tries: 16.914 seconds elapsed. ok

It gives the same result as GMP for "p",
but its powermod takes about 17 ms / op, roughly
5 to 6 times slower than the GMP timing above.

(BTW, GMP does not use assembly unless you
force it, as that would not work everywhere
without gas).

Unfortunately, the Forth bignum package doesn't
have a GG^MOD look-a-like, so I can't compare
that GMP/GINT (maybe lina has one?)

-marcel

Paul Rubin

Apr 18, 2017, 10:27:37 PM
lehs <skydda...@gmail.com> writes:
> I'm not going to replicate it, but if I can understand how to make it
> work I could use it as it is. But even that seems very difficult.

It's not too bad from a C program, quite nice from C++, and from some
other languages that use it, you don't even notice it. No idea about
Forth bindings.

The Beez

Apr 19, 2017, 3:44:26 AM
On Saturday, March 5, 2016 at 11:52:48 AM UTC+1, Mark Wills wrote:
> Good job. Pleased to see someone else working with string stacks.
4tH features at least three different implementations from single stack bare bones to one that allows multiple stacks and "buffers" old strings so that they can survive a

s> s> 2swap >s >s

Note 4tH's preprocessor also features a string stack.

Hans Bezemer

Rudy Velthuis

Apr 19, 2017, 7:03:06 PM
Albert van der Horst wrote:

> In article <xn0kov39w...@nntp.aioe.org>,
> Rudy Velthuis <newsg...@rvelthuis.de> wrote:
> > hughag...@gmail.com wrote:
> >
> >> > > No, of course not. Then you use c-addr u, like in other languages.
> >> >
> >> > So that's why we have c-addr u.
> >
> > I understand the need for c-addr u in certain situations, but it is
> > IMO not a good idea to always have to supply the u. Only supply
> > it where it is needed.
> >
> >> Rudy Velthuis doesn't know what he is talking about. Consumers work
> >> fine with derivative strings;
> >
> > Rudy Velthuis does know what he is talking about. He is just not so
> > familiar with current affairs in Forth yet and wonders why certain
> > things are done the way they are currently done, in Forth.
>
> Both stringconstants (addr n) and stringvariables (storage with
> first cell containing the length) are useful.

Ok, fine. But are all c-addr u combinations really supposed to be
constants, or is that just what you call them?

> The ease with which forth can pass around double items on the stack
> means that we have no need to allocate or store strings
> in many cases. In C that would require a struct instead of '' char
> *pc ''
>
> This script copies a file in argument 1 to standard output
> and removes all empty lines:
>
> -------------------------
> #!/usr/bin/lina -s
>
> 1 ARG[] GET-FILE
> BEGIN DUP WHILE ^J $/ -TRAILING DUP IF TYPE CR ELSE 2DROP THEN REPEAT
> 2DROP
> ----------------------------------------------


What does GET-FILE do?

What does $/ do? Split the string at the first occurrence of a given
char?

And I assume that ^J is a syntax for Ctrl-J or linefeed.
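
If $/ does split at the first occurrence of a character, the loop in the script can be approximated in standard Forth as below. This is only a sketch of that guess, not lina's definition: split and .pieces are hypothetical names, and /STRING and -TRAILING come from the optional String word set.

```forth
\ Split at the first occurrence of char. The head (text before char)
\ ends up on top, the remainder (text after char) underneath.
: split ( c-addr u char -- rest-addr rest-u head-addr head-u )
  >R 2DUP                            \ keep the original pair for the head
  BEGIN
    DUP IF OVER C@ R@ <> ELSE 0 THEN \ stop at char or at end of string
  WHILE
    1 /STRING
  REPEAT
  R> DROP                            ( a u a' u' )
  2SWAP >R OVER R> SWAP -            ( a' u' a head-u )
  2SWAP DUP IF 1 /STRING THEN        \ skip the delimiter itself
  2SWAP ;

\ Print every non-empty piece on its own line.
: .pieces ( c-addr u char -- )
  >R
  BEGIN DUP WHILE
    R@ split -TRAILING DUP IF TYPE CR ELSE 2DROP THEN
  REPEAT
  2DROP R> DROP ;

: demo ( -- )  S" one,,two" [CHAR] , .pieces ;
demo                                 \ prints "one" and "two" on separate lines
```

With 10 (linefeed) as the delimiter, .pieces behaves like the quoted BEGIN ... REPEAT loop. No string is copied; each piece is just a new c-addr u pair pointing into the original buffer.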


--
Rudy Velthuis http://www.rvelthuis.de

"Everyone likes to say Hitler did this and Hitler did that. But
the truth is Hitler did very little. He was a world class
asshole, but the evil actually done, from the death camps to
World War Two, was all done by citizens who were afraid to
question if what they were told by their government was the
truth or not, and who because they did not want to admit to
themselves that they were afraid to question the government,
refused to see the truth behind the Reichstag Fire, refused to
see the invasion by Poland was a staged fake, and followed
Hitler into national disaster."
-- Michael Rivero

Rudy Velthuis

unread,
Apr 19, 2017, 7:08:35 PM4/19/17
to
hughag...@gmail.com wrote:

> > > Rudy Velthuis doesn't know what he is talking about. Consumers
> > > work fine with derivative strings;
> >
> > Rudy Velthuis does know what he is talking about. He is just not so
> > familiar with current affairs in Forth yet and wonders why certain
> > things are done the way they are currently done, in Forth.
> >
> > I don't quite know what you mean by "derivative strings" though.
>
> You don't know what you are talking about when you criticize my
> STRING-STACK.4TH for now using COW

I haven't criticized your file, because I haven't even seen it (nor do
I care a lot about it, at the moment).

I merely reacted to your claim that:

> Also, storing strings in the heap is not a good solution.
>
> 1.) This involves copying the string every time.

That is indeed nonsense. If you do it right, you only copy it once.

--
Rudy Velthuis http://www.rvelthuis.de

"Contraction of theological influence has at once been the best
measure, and the essential condition of intellectual advance."
-- William H. Lecky

Rudy Velthuis

unread,
Apr 19, 2017, 7:09:51 PM4/19/17
to
lehs wrote:

> I don't know if SP-Forth has a memory manager but ALLOCATE and FREE
> seem to be implemented by kernel32.dll.

Hmmm... That could be slow indeed.
--
Rudy Velthuis http://www.rvelthuis.de

"Benny Goodman plays the clarinet. I play music."
-- Artie Shaw

Elizabeth D. Rather

unread,
Apr 19, 2017, 7:19:10 PM4/19/17
to
On 4/19/17 1:03 PM, Rudy Velthuis wrote:
> Albert van der Horst wrote:
>
>> In article <xn0kov39w...@nntp.aioe.org>,
>> Rudy Velthuis <newsg...@rvelthuis.de> wrote:
>>> hughag...@gmail.com wrote:
>>>
>>>>>> No, of course not. Then you use c-addr u, like in other languages.
>>>>>
>>>>> So that's why we have c-addr u.
>>>
>>> I understand the need for c-addr u in certain situations, but it is
>>> IMO not a good idea to always have to supply the u. Only supply
>>> it where it is needed.
>>>
>>>> Rudy Velthuis doesn't know what he is talking about. Consumers work
>>>> fine with derivative strings;
>>>
>>> Rudy Velthuis does know what he is talking about. He is just not so
>>> familiar with current affairs in Forth yet and wonders why certain
>>> things are done the way they are currently done, in Forth.
>>
>> Both stringconstants (addr n) and stringvariables (storage with
>> first cell containing the length) are useful.
>
> Ok, fine. But are all c-addr u combinations really supposed to be
> constants, or is that just what you call them?

The arguments c-addr u just denote a region of memory u address units
long. It may be in a scratch area, an input buffer, RAM, ROM, or
anywhere. Nothing is implied about its contents or location. Any
implications as to the nature of this region are embodied in what the
program does with it, that is, in the word that processes the c-addr u
stack arguments, such as MOVE etc.

>> The ease with which forth can pass around double items on the stack
>> means that we have no need to allocate or store strings
>> in many cases. In C that would require a struct instead of '' char
>> *pc ''

Yes, the "2---" words (2DUP, 2OVER, 2SWAP, etc.) are particularly useful
with these pairs of cells.
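
A small sketch of both points, assuming a standard Forth with /STRING from the String word set. The words .first, tail and demo are hypothetical names chosen for illustration, and .first assumes a non-empty string.

```forth
\ c-addr u is just an address and a count; each cell can be used alone
\ or the pair can be handled as a unit with the 2--- words.
: .first ( c-addr u -- )     \ show the first character, then the string
  2DUP                       \ copy the pair as a unit
  DROP C@ EMIT               \ use only the address
  SPACE TYPE ;               \ then use address and count together

: tail ( c-addr u -- c-addr' u' )  \ drop the first character, no copying
  DUP IF 1 /STRING THEN ;

: demo ( -- )
  S" Forth" 2DUP .first CR   \ prints: F Forth
  tail TYPE CR ;             \ prints: orth
demo
```

Nothing is allocated or stored here: tail returns a new pair that points into the same region of memory.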

Rudy Velthuis

unread,
Apr 19, 2017, 8:15:51 PM4/19/17
to
Elizabeth D. Rather wrote:

> > > The ease with which forth can pass around double items on the
> > > stack means that we have no need to allocate or store strings
> > > in many cases. In C that would require a struct instead of '' char
> > > *pc ''
>
> Yes, the "2---" words (2DUP, 2OVER, 2SWAP, etc.) are particularly
> useful with these pairs of cells.

Sure, and they are easy to implement. But the single cell equivalents
are probably still more efficient, so manipulating a stack with
single-cell values is probably still better.

Of course, if there is no need to store strings, then it makes no sense
to do that just to save having to pass around a single value.

--
Rudy Velthuis http://www.rvelthuis.de

"He'd noticed that sex bore some resemblance to cookery: it
fascinated people, they sometimes bought books full of
complicated recipes and interesting pictures, and sometimes when
they were really hungry they created vast banquets in their
imagination - but at the end of the day they'd settle quite
happily for egg and chips. If it was well done and maybe had a
slice of tomato."
-- Terry Pratchett (The Fifth Elephant)

Rudy Velthuis

unread,
Apr 19, 2017, 8:16:13 PM4/19/17
to
I understand that.

But the current topic was "c-addr u" for (single byte) strings. I see
that Albert van der Horst likes to call these "c-addr u" combinations
"string constants" ("sc" in his docs for e.g. wina or lina), even if
they are actually not necessarily constants at all.
--
Rudy Velthuis http://www.rvelthuis.de

"In general the art of government consists in taking as much
money as possible from one class of citizens to give to the
other."
-- Voltaire