On Wednesday, October 19, 2016 at 12:59:33 AM UTC-7, Anton Ertl wrote:
> foxaudio...@gmail.com writes:
> >I noticed a lot of code uses the phrase 1 /STRING so I made a word I call SNIP.
> >
> >: snip ( addr len -- addr' len' ) 1- swap 1+ swap ;
>
> 63 occurrences "1 /string" in 62393 lines of code. Too little IMO for
> having to remember yet another word.
If Anton Ertl is opposed to your SNIP because you have added another word for the user to remember, then he must absolutely hate my string-stack.4th package! I add about two dozen words for the user to remember! OMG! How would any human being ever learn all these words? It would take Anton Ertl a lifetime to learn how string-stack.4th works, even if he had Bernd Paysan helping him --- learning string-stack.4th is like learning rocket-science; it requires far more intelligence and perseverance than any ANS-Forth programmer could be expected to have! Even learning your SNIP is beyond the ability of most ANS-Forth programmers, and my string-stack.4th is at least an order of magnitude more complicated. Arg! ;-)
In this case, my DISCARD-LEFT$ would do the trick. I don't much like your SNIP because it is too specific in what it does. It only works with 1 char (by comparison, DISCARD-LEFT$ allows the programmer to specify any number of chars). Most importantly, however, you don't have any way to manage your strings. You might have the original string still available, and also have the SNIP substring (I use the term "derivative" in my string-stack.4th) available, at the same time. Both share the same memory, so if you modify one of these strings, you will modify the other one also. You are forcing the programmer to keep track of which strings are derivative of which other strings. By comparison, in my string-stack.4th I have "unique" and "derivative" strings, but the system keeps track of which are which, so the programmer can write his code as if every string were unique and not worry about the optimization going on under the hood.
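To make the aliasing problem concrete, here is a minimal sketch in plain ANS-Forth. It does not use string-stack.4th at all; BUF is a hypothetical buffer name that I made up for the illustration, and SNIP is copied from the quoted post.

```forth
: snip ( addr len -- addr' len' ) 1- swap 1+ swap ;

create buf 5 chars allot     \ hypothetical 5-char buffer
s" hello" buf swap move      \ buf now holds "hello"

buf 5 snip                   \ ( addr' len' ) is "ello", stored inside buf
over char J swap c!          \ overwrite the first char of the substring
2drop
buf 5 type                   \ the original string changed too: hJllo
```

Nothing in the ( addr len ) pair records that the substring lives inside BUF, so the programmer has to remember that himself.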
My string-stack.4th package is written entirely in ANS-Forth and should run on any ANS-Forth system. I'm not an ANS-Forth programmer however. Elizabeth Rather says:
-----------------------------------------------------------------------------
...in Forth it's so easy to build data structures that are exactly right for
the particular application at hand that worrying about what pre-built
structures you have and how to use them is just not worth the bother.
-----------------------------------------------------------------------------
She demands that every ANS-Forth program be written using only the raw ANS-Forth words, without any extensions at all. She demands that you use
1 /STRING
in your application program. Anton Ertl (appointed by Elizabeth Rather to be the chair-person of the Forth-200x committee) is parroting Elizabeth Rather here. He is opposed to your SNIP because this causes a programmer of your system to have to worry about what pre-built words are available, which is just not worth the bother. Anton is worried that the reader of your code would encounter SNIP and think:
"What the hell is that??? SNIP is not listed in the ANS-Forth document! Arg! Looking up this SNIP word in the programmer's own documentation is just not
worth the bother --- I will forget about this program --- learning how it
works is like rocket-science!"
In my string-stack.4th I'm treating strings as a "pre-built structure" and providing code that will work in any program that does text processing. In ANS-Forth, we have /STRING (and you have provided SNIP which is a slight upgrade), all of which treat the string as a kind of array, and require the user to directly access the chars in this array. By comparison, in my string-stack.4th, the programmer would never directly access the chars in the strings, and I don't actually provide any way to do so. I could easily upgrade my string-stack.4th to use UTF-8 or UTF-32 or whatever, rather than ASCII, and all the code written that uses string-stack.4th would continue to work without modification because it doesn't access chars directly and hence doesn't assume that the chars are any particular format or size.
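As an illustration of what "directly access the chars" means, here is a rough sketch of the kind of code that the array view leads to. COUNT-BLANKS is a hypothetical word written only for this example; it is plain ANS-Forth and has nothing to do with string-stack.4th.

```forth
\ Walks the string one address unit at a time with C@ .
\ This assumes every character occupies exactly one address unit.
: count-blanks ( addr len -- n )
    0 rot rot                \ n addr len
    over + swap              \ n limit start
    ?do  i c@ bl = if 1+ then  loop ;

s" a b c" count-blanks .     \ prints 2
```

The C@ and the step of one address unit are baked into the word, so code like this would have to be revisited if the character encoding changed to a wider or variable-width format.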
I'm all about writing general-purpose Forth code. This is why Elizabeth Rather and her committee appointees Anton Ertl and Bernd Paysan, and all of their sycophants, say that I'm not an ANS-Forth programmer.
The following is an excerpt from my documentation. These are just the words used for extracting sub-strings from strings. I also have a lot more words used for searching in strings. Oftentimes, you will search first to find your substring, then extract a substring or substrings based on that information.
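For comparison, the search-then-extract pattern can be sketched in raw ANS-Forth with SEARCH and /STRING. AFTER-COLON is a hypothetical word made up for this example, and the result is a derivative ( addr len ) pair on the data stack, not a string-stack string.

```forth
\ Return the part of the string after the first colon.
\ If there is no colon, SEARCH leaves the string unchanged.
: after-colon ( addr len -- addr' len' )
    s" :" search  if  1 /string  then ;

s" key:value" after-colon type   \ prints value
```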
Anyway, here is the excerpt:
-----------------------------------------------------------------------------
LEN$ ( -- length ) \ string: a --
This returns the length of the string on the data-stack. This consumes the
string on the string-stack (Forth functions traditionally consume their
arguments), so if this is used and you still need the string, then DUP$ or
OVER$ or whatever should be used to keep a copy on the string-stack.
MID$ ( start-index length -- ) \ string: a -- b
The B string is a substring in the middle of the A string.
ANTI-MID$ ( start-index length -- ) \ string: a -- b
Returns the string with the middle part extracted (what MID$ would have
returned is not returned, but instead the edge parts concatenated together are
returned).
INNER$ ( start-index limit-index -- ) \ string: a -- b
This is like MID$ except that it uses a LIMIT-INDEX rather than a LENGTH (this
is somewhat like Mark Wills' MID$ and, to the best of my recollection, like
the QBASIC MID$). Note that the LIMIT-INDEX is 1 beyond the middle-part that
is kept (LIMIT-INDEX minus START-INDEX equals length).
ANTI-INNER$ ( start-index limit-index -- ) \ string: a -- b
Returns the string with the middle part extracted (what INNER$ would have
returned is not returned, but instead the edge parts concatenated together are
returned). Note that the LIMIT-INDEX is 1 beyond the middle-part that is
extracted (LIMIT-INDEX minus START-INDEX equals length).
LEFT$ ( length -- ) \ string: a -- b
This provides a substring of length LENGTH from the left side of the string.
RIGHT$ ( length -- ) \ string: a -- b
This provides a substring of length LENGTH from the right side of the string.
DISCARD-LEFT$ ( length -- ) \ string: a -- b
This discards a substring of length LENGTH from the left side of the string.
DISCARD-RIGHT$ ( length -- ) \ string: a -- b
This discards a substring of length LENGTH from the right side of the string.
FILL$ ( length char -- ) \ string: -- a
This produces a string filled with CHAR of length LENGTH.
BLANK$ ( length -- ) \ string: -- a
This produces a string filled with blanks of length LENGTH.
LPAD$ ( length -- ) \ string: a -- b
This pads the string with blanks on the left side so the total length is
LENGTH --- if the length of A is already LENGTH or more, nothing is done.
RPAD$ ( length -- ) \ string: a -- b
This pads the string with blanks on the right side so the total length is
LENGTH --- if the length of A is already LENGTH or more, nothing is done.
LTRIM$ ( -- ) \ string: a -- b
This trims the whitespace from the left side of the string.
RTRIM$ ( -- ) \ string: a -- b
This trims the whitespace from the right side of the string.
TRIM$ ( -- ) \ string: a -- b
This trims the whitespace from the left and right sides of the string.
BLACKEN$ ( -- ) \ string: a -- b
This removes all the whitespace from the entire string.
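The difference between a LENGTH argument and a LIMIT-INDEX argument may be easier to see in code. The following is only a sketch of the index arithmetic on plain ( addr len ) pairs in ANS-Forth; it is not the string-stack.4th implementation. The names LEFT, RIGHT, MID and INNER are hypothetical, and 0-based indices that stay within the string are an assumption.

```forth
: left  ( addr len n -- addr n' )             min ;
: right ( addr len n -- addr' n' )            over min  dup >r  -  +  r> ;
: mid   ( addr len start n -- addr' n' )      >r /string r> min ;
: inner ( addr len start limit -- addr' n' )  over - mid ;

s" abcdef" 2 3 mid type      \ start 2, length 3: prints cde
```

With these definitions, `2 5 inner` selects the same three chars as `2 3 mid`, because LIMIT-INDEX minus START-INDEX equals the length. Discarding from the left corresponds roughly to /STRING itself.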