Isn't it time to give PLACE and +PLACE their proper place in the ANS
Forth standard?
Hans Bezemer
PLACE is easy enough to write in Forth without much worry about speed,
so it's not the kind of word that needs carnal knowledge to justify
its standardization. But is it sufficiently well-standardized in its
stack arguments that it's de facto portable already?
I'm not at all sure about +PLACE, though. I presume it appends one
counted string to another. It doesn't seem to be in SwiftForth.
Andrew.
Sure. Just write the Forth200x proposal.
Stephen
--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
Is the inverse of COUNT to store the top of the stack at one char
behind the address beneath it, and return the address where the count
was stored?
Or is it to store a byte at an address given and increment the address
by a byte?
The rationale for standardizing it would be so that you could define
it if undefined and not have to worry about whether a compliant system
would have the same word appearing with a different meaning.
Or is it simply to drop the count from the top of the stack and
replace the address on the stack by the address minus the size
of the count field, under the assumption that the Forth string
( c-addr len) has a counted memory representation?
That's an operation I've actually used a fair number of times. I
use MCOUNT to name the cell version of COUNT, and define
: -MCOUNT ( addr len -- addr-cell ) drop cell- ;
The minus is because I think of it as the inverse of MCOUNT.
-- David
Actually they're
PLACE ( c-addr1 len1 c-addr2 -- )
Place string c-addr1,len1 at c-addr2 as a counted string.
+PLACE ( c-addr1 len1 c-addr2 -- )
Append string addr1,len1 to the counted string at addr2.
In some systems +PLACE is known as APPEND, so I think we'll
have to decide which name is preferred.
George Hubert
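A minimal sketch of the two words as described by the stack diagrams above, assuming a Standard Forth system with MOVE and a char-sized count. The names MY-PLACE and MY+PLACE and the buffer BUF are hypothetical, chosen so as not to clash with a system's own PLACE and +PLACE. Neither word checks that the destination has room or that the new count fits in one char.

```forth
\ Sketch only: hypothetical names, not taken from any particular system.
: my-place ( c-addr1 u c-addr2 -- )
   2dup 2>r                 \ remember u and c-addr2 for the count
   char+ swap chars move    \ copy the text just past the count field
   2r> c! ;                 \ store the count

: my+place ( c-addr1 u c-addr2 -- )
   2dup 2>r                 \ remember u and c-addr2 for the count update
   count chars +            \ address just past the existing text
   swap chars move          \ append the new text there
   2r> dup c@ rot + swap c! ;   \ new count = old count + u

create buf 32 chars allot
s" foo" buf my-place
s" bar" buf my+place
buf count type              \ types foobar
```

With these definitions COUNT recovers the ( c-addr u ) pair that was stored, which is the sense in which PLACE is paired with COUNT.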
So, you're saying that PLACE is *not* the inverse of COUNT?
Are you sure you mean "counted string", which has a count field in the
first byte? There is already a commonly used word, PACK, for this:
: pack ( a u a2 -- | copy string to counted string at a2)
2dup c! 1+ swap cmove ;
Krishna
Oh, I see that PLACE comes from Wil Baden's toolbelt, in which the
third argument is a counted string address. PACK and STRPCK have also
been used for the same function. Since counted strings are generally
considered to be deprecated in modern Forth, there may not be much
support for standardizing words which operate on counted strings.
Krishna
It's ``PLACE'' in Toolbelt 2002:
\ PLACE ( str len addr -- )
\ Copy the string at _str_, whose length is _len_, to _addr_,
\ formatting it as a counted string, i.e., the length is in the
\ first byte. Does not check whether space is allocated for the
\ final string.
If those are two existing names for the process, PLACE is better than
PACK, since the operation does not actually do any packing.
That's a good argument for documenting it in the standard.
>- It is the inverse of the well established COUNT word;
That's a good argument against it. Counted strings are bad practice
that's slowly going out of use; there is no need to introduce new
words for them.
But if you want to see them standardized, go ahead and write an RfD.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2010: http://www.euroforth.org/ef10/
Ok. But, see my note above about why it may not be a good idea to
standardize PLACE.
Krishna
And PACK as I remember from iForth left the storage address on the stack.
( c-addr1 u1 c-addr2 -- c-addr2 )
--
Coos
CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
I don't see any substantial benefit in standardizing it ~ if I did not
think it enough to warn that the source assumes you respect
Toolbelt2002 to the extent of not defining other words under those
names, I'd define a load constant, Tool2002, to mean that and use
[has?]:
[has?] Tool2002 [DEFINED] PLACE AND 0= [IF]
: PLACE ( ca1 u ca2 -- ) 2DUP C! CHAR+ SWAP CHARS MOVE ;
[THEN]
This is better than Krishna's version, but it's still buggy. It has
to be:
: place ( addr len c-addr -- )
2DUP 2>R
CHAR+ SWAP CHARS MOVE
2R> C!
;
--
No, no, you can't e-mail me with the nono.
My alternatives $! and $+! cannot possibly be misremembered.
$! ( sc adr -- )
$+! ( sc adr -- )
If anything they are the reciprocals of $@ which is OUNT
not COUNT.
P.S.
sc is a string constant like the outcome of #> or such thing
as S" aap" or "aap" .
After 17 years of practice they don't seem to catch on so
maybe I should stop promoting them :-(
>
>-- David
Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Why? What's the difference between moving first and storing the length
after and storing the length first and moving after?
Because if the place to place is inside the original string, at
least one of its characters will be clobbered by the length stored
there, and the wrong contents then moved. MOVE itself is OK, since it
takes care to not screw up when moving overlapping ranges. Therefore,
MOVE first, store length later. Should be obvious.
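The difference between the two orderings can be sketched as follows, assuming a Standard Forth system whose MOVE handles overlapping ranges. The word names and the buffer OBUF are hypothetical and exist only for this illustration. The source string starts at the same address as the destination, as in the argument above.

```forth
\ Hypothetical names; both follow the ( c-addr1 u c-addr2 -- ) diagram.
: place-count-first ( c-addr1 u c-addr2 -- )
   2dup c!                    \ count goes in first and may clobber the source
   char+ swap chars move ;
: place-move-first ( c-addr1 u c-addr2 -- )
   2dup 2>r                   \ remember u and c-addr2
   char+ swap chars move      \ MOVE copes with overlapping ranges
   2r> c! ;                   \ count goes in last

create obuf 16 chars allot
: fill-obuf ( -- )  s" ABCDEF" obuf swap chars move ;

fill-obuf  obuf 6 obuf place-count-first
obuf char+ c@ .    \ 6: the count has replaced the first character
fill-obuf  obuf 6 obuf place-move-first
obuf char+ c@ .    \ 65: the code for A survives
```

Only the order of the count store and the copy differs between the two definitions.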
Are you saying that what you posted is from Wil Baden's toolbelt?
He was usually very careful about correctness.
Should be obvious but ...
: TEST
S" ABCDEF" >R PAD R@ CMOVE PAD R@ PAD PLACE
PAD R> DUMP ;
1) VFX:
test
0112:DEFC 06 41 42 43 44 45 46 00 00 00 00 00 00 00 00 00 .ABCDEF.........
ok
2) Swiftforth:
test
46FF9F 06 00 42 43 44 45 ..BCDE ok
3) Win32Forth:
test
7E7E98 | 06 06 06 06 06 06 |......| ok
5) Gforth
test
18BF4498: 06 41 42 43 44 45 - .ABCDE
ok
6) CHForth:
TEST
cseg:5FE0 06 06 06 06 06 06 ......
7) MinForth:
0 0 - test
Address: 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
9AD0 .. .. .. .. .. .. .. .. .. .. .. .. 06 41 42 43 .ABC
9AE0 44 45 .. .. .. .. .. .. .. .. .. .. .. .. .. .. DE ok
--
Wil Baden speaking about the origins of PLACE etc. in 2001:
Subject: Re: APPEND vs. +PLACE
From: Neil Bawd <neilb...@earthlink.net>
Newsgroups: comp.lang.forth
Date: Sun, 12 Aug 2001 19:28:28 GMT
in 3B75D6DF.53F3C...@gmx.de, Guido Draheim at guidod-200...@gmx.de wrote:
> Now, the poll booth is open, how'd your vote be (other than
> that you might have already used APPENDZ).
In 1983 I introduced PLACE into F83 as the inverse of COUNT:
PLACE is to COUNT as ! is to @. From this pattern came +PLACE
and C+PLACE: +PLACE is to PLACE as +! is to !; C+PLACE is to
+PLACE as C+! is to +!.
So these are the historical names.
APPEND is a natural language form of +PLACE.
I am gathering my Forth writings of 22 years into a library of
source files.
Since I did originate it, I'll use +PLACE, with APPEND as an
alias.
I also put out OUNT is to COUNT as @ is to C@. As expected
this was not accepted.
It is normal for languages to adopt features from other languages.
So in this clan of words, I will have: @++, @ with postincrement;
!++, ! with postincrement; C@++, C@ with postincrement; C!++, C!
with postincrement. (The ! words SWAP before use.)
Note that C@++ is equivalent to COUNT. So COUNT can be used to
fetch a string from memory, and C@++ to walk through it. For
counted strings with cell counts, @++ can be used to fetch from
memory and C@++ to walk through.
What other patterns have you used? (Don't tell me you don't like
it; tell me what you've done.)
--
Wil Baden Costa Mesa, California Per neilb...@earthlink.net
Wow! It depends on the versions, though. SwiftForth (which
version?) - Fail; CHForth - Epic Fail (MOVE wrong too?); Win32Forth -
Epic Fail - but that's the new version; version 4.2 (last Zimmer) is
correct. Seems that the rest are OK. You have a bug in the code too,
of course - should be PAD R> 1+ DUMP.
> 1) VFX:
>
> test
> 0112:DEFC 06 41 42 43 44 45 46 00 00 00 00 00 00 00 00 00 .ABCDEF.........
> ok
>
> 2) Swiftforth:
>
> test
> 46FF9F 06 00 42 43 44 45 ..BCDE ok
>
> 3) Win32Forth:
>
> test
> 7E7E98 | 06 06 06 06 06 06 |......| ok
>
> 5) Gforth
>
> test
> 18BF4498: 06 41 42 43 44 45 - .ABCDE
> ok
>
> 6) CHForth:
>
> TEST
> cseg:5FE0 06 06 06 06 06 06 ......
>
> 7) MinForth:
>
> 0 0 - test
> Address: 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
> 9AD0 .. .. .. .. .. .. .. .. .. .. .. .. 06 41 42 43 .ABC
> 9AE0 44 45 .. .. .. .. .. .. .. .. .. .. .. .. .. .. DE ok
--
Aha. I'd never place a counted string where it could overlap the
source. I was puzzled why Wil Baden did it that way, it seemed
inefficient.
Sigh...
If PLACE is a string move, what's the reverse of PICK?
Rod Pemberton
A few more ... (spaces reduced)
8) BB4Wforth v47
Parse error: : TEST S" ABCDEF" >R PAD
It seems PAD is not available.
9) bigFORTH 386-Win32 rev. 2.3.1
00 01 02 03 04 05 06 07 \/ 09 0A 0B 0C 0D 0E 0F 01234567VABCDEF
10037750 37 37 33 33 52 77 03 10 06 41 42 43 44 45 46 00 7700Rw...ABCDEF.
ok
Is this a fail or success?
In addition to the output above, Win32Forth 6.14.00 Build: 2 also gives this
error message:
Warning(-4101): DUMP is a system word in an application word
Rod Pemberton
<Perk>... Really? What a completely bonkers idea? Is this because
Forth is built on C these days?
By not standardising it, you are de-facto standardising Toolbelt2002.
That's fine if that's what you want...
Mark
Ahhh ha ha ha! Hilarious.
7 "Standard" systems and only VFX gets it right?
Oh dear oh dear oh dear...
Mark
I think "counted string" means an address pointing to a length byte
followed by data bytes, which limits the size of the string to 255.
Current practice seems to be to pass the address of the raw data
in one cell, and the length in a separate cell. In gforth,
: foo ( addr n ) . . ;
s" hi there" foo
prints
s" hi there" foo 8 31561312 ok
Of course that means 4 or more stack slots if you're passing more
than one string. So much for using 3 or fewer.
He's right, of course. Because of the 255-character wraparound
behaviour, almost any use of counted strings is a bug. They aren't
even long enough for filenames.
> What a completely bonkers idea? Is this because Forth is built on C
> these days?
No.
Andrew.
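The wraparound mentioned above can be seen directly, assuming a system with 8-bit chars and cells wider than a char. C! keeps only the low-order bits that fit in a char, so a length of 300 stored in a count byte reads back as 300 mod 256. CBUF is a hypothetical scratch buffer.

```forth
create cbuf 2 chars allot
300 cbuf c!      \ try to record a length of 300 in a one-char count field
cbuf c@ .        \ 44 on a system with 8-bit chars
```

A PLACE that stores its count with C! would record 44 characters for a 300-character string in the same way.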
> In addition to the output above, Win32Forth 6.14.00 Build: 2 also gives this
> error message:
> Warning(-4101): DUMP is a system word in an application word
That's only a warning. If it was an error it would be of the form:
Error(-4): . stack underflow
and an ABORT would have been performed (Warnings don't ABORT they just
print their message and continue).
>
> Rod Pemberton
George Hubert
I don't consider Win32Forth's behaviour a fail, nor any of the others
for that matter, as the words PLACE and +PLACE (or APPEND) have no
agreed standard behaviour where the arguments overlap. Should it be
proposed as a standard, then we can argue one way or the other, and I
would contend that the result of using the source buffer as a target
string should be undefined.
PLACE is not in any standard I have seen.
POKE
Understood. Why not introduce a new word then, for big strings? S" can
stay as it is and compile an '8 bit' string. A new word could be
introduced, maybe BS" (big string) which compiles a cell-wide length
followed by the data, instead of a byte?
Mark
> Elko T wrote:
>> ...
>> Because if the place to place is inside the original string, at
>> least one of its characters will be clobbered by the length stored
>> there, and the wrong contents then moved. MOVE itself is OK, since it
>> takes care to not screw up when moving overlapping ranges. Therefore,
>> MOVE first, store length later. Should be obvious.
>> ...
>
> Should be obvious but ...
>
>: TEST
> S" ABCDEF" >R PAD R@ CMOVE PAD R@ PAD PLACE
> PAD R> DUMP ;
>
<snip>
> 6) CHForth:
>
> TEST
> cseg:5FE0 06 06 06 06 06 06 ......
>
So, do I have to worry? The source and destination overlap here. PLACE should
have been implemented with MOVE, but I don't care in the least, because the
source and destination have never overlapped in the decades I have used PLACE.
Why do you consider VFX "right" and the others not?
S" works fine for long strings (if the system actually supports long
input buffers), no need for a BS". If you are thinking of a variant of
counted strings that uses a cell for the count, that would be better
than char-counted strings, but still has disadvantages compared to
"addr u" type strings, and little existing practice.
>S" works fine for long strings (if the system actually supports long
>input buffers), no need for a BS". If you are thinking of a variant of
>counted strings that uses a cell for the count, that would be better
>than char-counted strings, but still has disadvantages compared to
>"addr u" type strings, and little existing practice.
Oh, really?
Try virtually any commercial desktop app. There may not be a lot of
them, but they have a lot of code and a lot of users.
However, much of this discussion confuses three things
1) String representation on the stack, e.g. -- caddr len
2) String representation in memory
3) Strings as memory objects (more than address/len)
Stephen
Alex! I'm really disappointed by your answer. It is not a question
of standard vs. non-standard, but a question of correctness.
- It *is* a fail of the new Win32Forth, because it *was* correct on
the 4.x version. Whoever did the rewrite in code (you? Rainbow Sally?)
broke the old behavior - and that should never have happened.
- There is no reason to require that overlapping PLACE should be
undefined. WTF? As I said, I didn't expect it from you.
- It has always been expected to behave in the *correct* way, even if
it never has been standard. I looked at my files, and see that in my
last embedded system, in a Forth-83, there is the correct version too
(and I probably only corrected MOVE, judging by the comments, and left
PLACE the way it was):
\ MOVE PLACE EBC 02:13 03/15/94
\ Changed the ridiculous word by word MOVE with the byte-wise
\ smart MOVE that does not overlap
: MOVE ( from to len -- )
-ROT 2DUP U<
IF ROT CMOVE>
ELSE ROT CMOVE
THEN ;
: PLACE ( str-addr len to -- )
3DUP 1+ SWAP MOVE C! DROP ;
- So all who say it does not need to work that way, are ignoring
established common practice. It doesn't matter if you will never PLACE
into the original buffer - it is still wrong to do it incorrectly, if
you *can* do it correctly.
Me, probably. But I rarely feel shame or guilt, so I'm afraid this
argument left me unmoved...
>
> - There is no reason to require that overlapping PLACE should be
> undefined. WTF? As I said, I didn't expect it from you.
But now I'm flattered. I regularly suffer from the sin of pride.
>
> - It has always been expected to behave in the *correct* way, even if
> it never has been standard. I looked at my files, and see that in my
> last embedded system, in a Forth-83, there is the correct version too
> (and I probably only corrected MOVE, judging by the comments, and left
> PLACE the way it was):
>
> \ MOVE PLACE EBC 02:13 03/15/94
> \ Changed the ridiculous word by word MOVE with the byte-wise
> \ smart MOVE that does not overlap
>
> : MOVE ( from to len -- )
> -ROT 2DUP U<
> IF ROT CMOVE>
> ELSE ROT CMOVE
> THEN ;
>
> : PLACE ( str-addr len to -- )
> 3DUP 1+ SWAP MOVE C! DROP ;
>
> - So all who say it does not need to work that way, are ignoring
> established common practice. It doesn't matter if you will never PLACE
> into the original buffer - it is still wrong to do it incorrectly, if
> you *can* do it correctly.
But it has, by definition, been right until now.
On the other hand, since you have taken the opportunity to flatter me,
I shall retain my broken version and post up a replacement on the
Win32Forth Yahoo group.
On a more serious note, I would still argue that the result should be
undefined for those arguments, if (and it's a big if) it gets to be
proposed for Forth20xx inclusion. MOVE is significantly more expensive
than CMOVE, and to suffer a continuing performance hit for a use case
that will not cause programmers any inconvenience at all -- and will
almost certainly never be coded -- I find difficult to support.
No, it's because Forth often runs on systems where limiting individual
strings to 255 chars makes little sense, and because counted strings
require a copy into a buffer when the source text is a substring of a
longer string.
Char-counted strings are not the kind of thing that *can* be
deprecated ~ they make sense when they make sense, they don't when
they don't, which is a heavily application dependent issue ~ rather
what is deprecated is the practice of using an address of a char-
counted string as the normal stack input to general purpose string
operators.
Operators that take ( ca u ) work equally well with a single cell
addressing a byte counted string, a single cell addressing a cell-
counted string and a single cell addressing a 2variable holding a ( ca
u ) reference, allowing the string operators to be separated from the
storage format decision.
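A small sketch of that separation, assuming a Standard Forth system. The names BSTR, CSTR, REF, CCOUNT and SHOW are hypothetical. One consumer taking ( c-addr u ) is fed from a byte-counted string, a cell-counted string and a 2VARIABLE holding a reference; only the word that fetches the pair differs.

```forth
create bstr  3 c,  char a c,  char b c,  char c c,   \ byte-counted "abc"
create cstr  3 ,   char a c,  char b c,  char c c,   \ cell-counted "abc"
2variable ref                                        \ holds a ( c-addr u ) pair
bstr count ref 2!                                    \ point it at the text in BSTR

: ccount ( addr -- c-addr u )  dup cell+ swap @ ;    \ COUNT for a cell-sized count
: show   ( c-addr u -- )  type cr ;                  \ the format-neutral consumer

bstr count   show    \ abc
cstr ccount  show    \ abc
ref 2@       show    \ abc
```

SHOW never learns how the string was stored, so the storage format can change without touching it.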
Not if you mean at the language level. Every commonly available re-
usable library establishes a de facto library standard, so
Toolbelt2002 has de facto "library standardized" itself.
When the question is not, "carnal knowledge is required to provide
this facility at all", like a library directory load facility, but
rather, "if I find this word, will it mean what I expect it to mean",
a load constant that returns TRUE if that specific library terminology
is respected is a lot more useful than trying to argue through each
and every useful factor provided by the library as to whether it
should be added to the language standard.
PLACE is something you'd want to do. Why would you want to reverse
PICK? Bad enough that you've buried the data too deep and need it on
the top ~ if you have it on top, why bury it?
Yes. My test failed to dump the last character in the stored string.
Properly it should have been:
... PAD R> 1+ CHARS DUMP ...
I used the latest systems for the test where possible. I knew
several would fail having looked into it sometime ago.
As Wil Baden is the originator of PLACE his implementation
deserves consideration as the "standard" behaviour unless
someone finds it to be wanting. A lesser spec would create
compatibility issues.
Regarding the naming of +PLACE vs. APPEND Wil Baden
states his preference (in 2001) was for +PLACE. By the time
he compiled his Toolkit, however, he appears to have changed
his mind and switched to APPEND.
--
I use a string concatenate primitive from which I can define
any type of append e.g.
\ String concatenate primitive from OTA. Should be coded
\ for speed.
: +STRING ( c-addr1 u1 c-addr2 u2 -- c-addr2 u3 )
2SWAP SWAP 2OVER CHARS + 2 PICK CMOVE + ;
\ counted-string append
: APPEND ( c-addr u adr -- )
COUNT +STRING SWAP 1 CHARS - C! ;
\ z-string append
: ZAPPEND ( c-addr u adr -- )
ZCOUNT +STRING CHARS + 0 SWAP C! ;
Yes, given that it is an overlapping-proof definition, if I was going
to define a non-overlapping operation, it should have a distinct name:
[has?] Tool2002 [DEFINED] PLACE AND [IF]
: STORE-CS ( ca1 u1 ca2 -- ) PLACE ;
[ELSE]
: STORE-CS ( ca1 u1 ca2 -- ) 2DUP C! CHAR+ SWAP CMOVE ;
[THEN]
(with STORE-CCS for cell-counted strings)
Symmetry ...
Porting code ...
RP
That's a decent name choice. But, it doesn't seem to be in use anywhere for
that operation.
Rod Pemberton
I agree with him.
Of the seven posted by Ed, VFX is the only one with a count of six, count
sized as a byte, displays six characters, and displays the correct six
characters. If you include the other few I posted, bigFORTH does too but
also displays a bunch of other stuff.
In detail:
VFX
-correct count value
-correct count size (byte)
-total characters 6
-correct characters
Swiftforth
-correct count value
-possibly incorrect count size (word)
-total characters 5 instead of 6
-missing or overwritten character
Win32Forth
-correct count value
-correct count size
-all characters wrong
-total character placeholders 5 instead of 6
Gforth
-correct count value
-correct count size
-total characters 5 instead of 6
-missing character
CHForth
-correct count value
-correct count size
-all characters wrong
-total character placeholders 5 instead of 6
BB4Wforth
-cannot parse PAD
bigFORTH
-correct count value
-correct count size
-total characters 6
-correct characters
-emits a bunch of other junk
Rod Pemberton
But PICK is very rarely used in good Forth code, as Bruce says, so
you're talking about an opposite that isn't needed to a word that's
rarely needed. Absolutely no need for such a thing. I think it was a
joke, anyway.
Cheers,
Elizabeth
--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com
"Forth-based products and Services for real-time
applications since 1973."
==================================================
Nice!
I agree with Bruce and others who think PLACE, APPEND, etc.,
belong in libraries, not the language standard.
And also that new words operating on the address of a
char-counted memory format ought not to be added to the
standard.
-- David
You all seem to take the same view: Place it in a library and the
standardisation problem goes away. May I respectfully remind everyone
that the ANS94 standardises standard libraries, in addition to the
Core wordset. I assume the 20xx takes the same approach?
Therefore saying "Well, stick it in a library and do what you want" isn't
a solution! The libraries are still open to standardisation discussion
just as with Core!
:-) <--- smiley, so as not to cause offense!
Mark
Exactly. Thank you.
> - There is no reason to require that overlapping PLACE should be
> undefined. WTF? [ ... ]
>
> - It has always been expected to behave in the *correct* way, even if
> it never has been standard.
It has always been expected by whom?
> - So all who say it does not need to work that way, are ignoring
> established common practice.
Despite the evidence just a couple of posts ago that most well-known
Forths don't do it your way, you still claim that your way is
"established common practice". That's a major logic failure.
Andrew.
The test was incorrect; it should have read (as Elko noted upstream);
: TEST
S" ABCDEF" >R PAD R@ CMOVE PAD R@ PAD PLACE
PAD R> 1+ DUMP ;
Your corrected table now looks like this.
VFX -- pass
-correct count value
-total characters 6
-correct characters (DUMP overextends itself)
Swiftforth
-correct count value
-missing or overwritten character
Win32Forth
-correct count value
-all characters overwritten by count
Gforth -- pass
-correct count value
-correct characters
CHForth
-correct count value
-all characters overwritten by count
BB4Wforth
-cannot parse PAD
bigFORTH -- pass
-correct count value
-correct characters (DUMP overextends itself)
It was in jest. The operation is one to be avoided at all costs. Some
Forths have UNDER ( a b -- b a b ). I never code PICK either, although
I have been tempted. Getting a value so deep off the stack that the
usual stack juggling words can't reach it is normally a poor factoring
issue.
> > I agree with Bruce and others who think PLACE, APPEND, etc.,
> > belong in libraries, not the language standard.
> > And also that new words operating on the address of a
> > char-counted memory format ought not to be added to the
> > standard.
> You all seem to take the same view: Place it in a library and the
> standardisation problem goes away. May I respectfully remind everyone
> that the ANS94 standardises standard libraries, in addition to the
> Core wordset. I assume the 20xx takes the same approach?
You can remind everyone of that as respectfully as you wish, it
doesn't make it so. The fact that wordsets other than CORE are
optional does not make them libraries.
> But PICK is very rarely used in good Forth code, as Bruce says, so
> you're talking about an opposite that isn't needed to a word that's
> rarely needed. Absolutely no need for such a thing. I think it was a
> joke, anyway.
>
> Cheers,
> Elizabeth
Is "good Forth" practice more important than "common usage" in determining
Forth standards inclusion?
Just wondering ... 8-)
> > > If PLACE is a string move, what's the reverse of PICK?
> > PLACE is something you'd want to do. Why would you want to reverse
> > PICK? Bad enough that you've buried the data too deep and need it on
> > the top ~ if you have it on top, why bury it?
> Symmetry ...
The occasions I've had to use PICK are to define THIRD, normally for
an independent use of the length part of a second ( ca u ) string on
stack. (eg, THIRD OVER to get the pair of counts). And if PICK does
not exist, it's possible to define it directly:
[UNDEFINED] THIRD [IF]
[DEFINED] PICK [IF]
: THIRD ( x1 x2 x3 -- x1 x2 x3 x1 ) 2 PICK ;
[ELSE]
: THIRD ( x1 x2 x3 -- x1 x2 x3 x1 ) 2>R DUP 2R> ROT ;
[THEN] [THEN]
It's not an operation with a symmetric inverse, as if you want to get
the length value *to modify it*, that's ``ROT ... -ROT''.
> Porting code ...
If it's observed in the wild, and there's no time to rewrite the code
to eliminate the operation, use the name they use, in your porting
toolkit vocabulary for that app, so it's not lying around to confuse
things when the compilation of the app is finished.
So for porting, the name to use is not a problem to be solved.
2OVER NIP doesn't work?
No offense taken. But I think libraries need to be more
relaxed, and *not* open to standardization, unless they're about
to become part of the language specification.
Otherwise, the naming problem for words in libraries becomes
intractable, IMO.
The author should decide. Of course he shouldn't use the same
name for something different from well-established practice
without thinking about it, and he should listen to advice, but
he shouldn't feel proscribed from doing it when he thinks it's
the right thing, keeping in mind that people will vote with
their feet if they think it's an egregious abuse.
-- David
No. And you are a wicked person for even thinking such a thing. ;-)
Andrew.
This is the stack diagram of TUCK ;-)
UNDER or SWAPDROP were names for what is NIP nowadays ( a b -- b )
--
Coos
VFX displays 16 bytes/characters, not 6.
>Swiftforth
>-correct count value
>-possibly incorrect count size (word)
>-total characters 5 instead of 6
SwiftForth displays 6 bytes/characters.
>Win32Forth
>-correct count value
>-correct count size
>-all characters wrong
>-total character placeholders 5 instead of 6
I count 6.
>Gforth
>-correct count value
>-correct count size
>-total characters 5 instead of 6
>-missing character
Gforth displays 6 bytes/characters, just like the program asked it to.
>CHForth
>-correct count value
>-correct count size
>-all characters wrong
>-total character placeholders 5 instead of 6
Also shows 6 bytes/characters.
>bigFORTH
>-correct count value
>-correct count size
>-total characters 6
>-correct characters
>-emits a bunch of other junk
Why do you consider the extra junk that VFX displays correct and the
junk that bigForth displays incorrect?
It seems that you consider most Forth systems incorrect because they
don't display extra junk, but just the 6 bytes that the program asked
for. I guess the program should have asked to display one more byte
to display all 6 characters in addition to the count byte, but it
didn't.
- anton
What is replacing counted strings? Do you mean byte-counted strings
are going out of use? A string in memory needs its count somewhere,
unless it's a zstring.
>
> But if you want to see them standardized, go ahead and write an RfD.
Maybe it's time for a LSTRING wordset. I think the intent of the ANS
committee was to allow for the creation of new standard wordsets over
time. That hasn't really happened, but it wouldn't hurt to go through
the process of establishing a new wordset. Memory is so much cheaper
these days, and the 8-bit count is no longer sufficient in many apps,
that it makes sense to have a LSTRING wordset. Maybe a ZSTRING wordset
too.
That PLACE is not a standard word is a good thing. Now it can have a
32-bit count.
-Brad
Strings without a preceding count byte.
> Do you mean byte-counted strings
>are going out of use? A string in memory needs its count somewhere,
Yes: On the stack (or, more generally, close to the address).
- anton.
PICK is in the standard because it's common usage. It's in CORE EXT
because no one wanted it to be required. Good Forth programmers rarely
use it.
I think it a considerable overstatement to say "Counted strings are bad
practice that's slowly going out of use." They remain an extremely
useful internal storage format, much more run-time efficient than
null-terminated strings (as discussed in a relatively recent thread).
However, other formats (e.g. cell-sized count fields) are increasingly
useful, and the real point is to get away from passing *addresses of
counted strings* on the stack, because that constrains the
implementation. If standard practice becomes passing caddr-u pairs,
that will handle all formats, and save having to say COUNT to get the
arguments for conventional counted strings.
>> But if you want to see them standardized, go ahead and write an RfD.
>
> Maybe it's time for a LSTRING wordset. I think the intent of the ANS
> committee was to allow for the creation of new standard wordsets over
> time. That hasn't really happened, but it wouldn't hurt to go through
> the process of establishing a new wordset. Memory is so much cheaper
> these days, and the 8-bit count is no longer sufficient in many apps,
> that it makes sense to have a LSTRING wordset. Maybe a ZSTRING wordset
> too.
>
> That PLACE is not a standard word is a good thing. Now it can have a
> 32-bit count.
I think, in fact, that's a big reason why PLACE hasn't been
standardized. Since it stores a string, its definition would have to
specify the format it stores, and I think no one wants to do that.
Didn't there used to be a "Common usage" (COMUS) repository somewhere?
What became of it? That's the appropriate place for PLACE, IMO.
I find the word "performance hit" ridiculous in this context.
Suppose we have a path .wine/mount/woody_1.iso/usr/bin/gforth
We want (for some reason) to follow those directories with
`` do-something ''
".wine/mount/woody_1.iso/usr/bin/gforth" PAD $!
`` PAD $@ CHAR / $/ do-something '' leaves the remaining path.
It would be quite an embarrassment if `` PAD $! '' would not leave
the remaining tree in PAD. Fortunately in lina it does:
PAD $@ TYPE
mount/woody_1.iso/usr/bin/gforth OK
Phoo. (Wipes forehead.) Forgot to test that.
Of course there is not really a need to store it back, once you
have a string-constant on the stack, but anyway.
--
Groetjes Albert
Which shows you how often I use them!
IMO that is a string constant. You still have a need to store string
constants in memory buffers, that could be called string variables.
Now what is a reasonable place to store the length?
Answer: up front, in a cell.
>
>- anton
>--
Groetjes Albert
--
This has been in Dutch Forths (including iForth) for two decades now.
>
>Mark
That'd work too.
Lessee:
-TRAILING ~ already "long string"
/STRING ~ already "long string"
BLANK ~ already "long string"
CMOVE ~ already "long string"
CMOVE> ~ already "long string"
COMPARE ~ already "long string"
SEARCH ~ already "long string"
SLITERAL ~ already "long string" if implementation permits
Seems like STRING is already LSTRING.
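A quick way to see this: the STRING words take their length as a full cell on the stack, so nothing stops them working on a buffer longer than 255 characters. A small sketch, with the buffer name BIG and the sizes chosen arbitrarily:

```forth
CREATE BIG 300 CHARS ALLOT
BIG 300 BLANK                 \ fill all 300 chars with spaces
CHAR X BIG 259 CHARS + C!     \ last non-blank char at offset 259
BIG 300 -TRAILING NIP .       \ 260
BIG 300 256 /STRING NIP .     \ 44
```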
Well, it would break the stack abstraction (cause underflow) if there
are only 3 items on the stack, but since the out-of-range item is
discarded, it could still be safe in implementations where there is no
underflow check and the word past the stack is still addressable. I
think the clf term for that is that it "depends on carnal knowledge".
(I still mostly think of the circular stack of the GA processors.)
Clearly the Dutch are decades ahead! Those wily Dutch!
What word name do you use for this behaviour?
Mark
It could be for that reason that I had it down as 2>R DUP 2R> ROT, but
more likely that 2OVER NIP never occurred to me ... FOURTH as 2OVER
DROP does not run into that objection, of course.
It could be implicit ~ a packed string stack growing down, with the
pointers to the heads of the strings as a cell stack growing down,
could retrieve its length as:
TSTACK @ 2@ TUCK - 1CHAR/ ( ca u )
: +string ( n -- n' ) chars char+ ; \ counted string (classic)
: count ( c-addr -- c-addr' u ) dup char+ swap c@ ;
: +string ( n -- n' ) chars 1 cells + ; \ counted string (cell count)
: count ( a-addr -- c-addr u ) dup cell+ swap @ ;
create mystring 10 +string allot
Advantages: Abstraction of string datatype, few source code changes
(only those that require carnal knowledge). You could even vector it.
Hans Bezemer
Hmm. >R OVER R> SWAP might have a little less overhead than
2>R DUP 2R> ROT.
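For checking, here are those two sequences side by side, plus the 2 PICK spelling. THIRD-A, THIRD-B and THIRD-C are illustrative names only; which one is cheapest depends entirely on the implementation, so take this as a sketch for verifying stack effects rather than a benchmark.

```forth
: THIRD-A ( x1 x2 x3 -- x1 x2 x3 x1 )  2>R DUP 2R> ROT ;
: THIRD-B ( x1 x2 x3 -- x1 x2 x3 x1 )  >R OVER R> SWAP ;
: THIRD-C ( x1 x2 x3 -- x1 x2 x3 x1 )  2 PICK ;

1 2 3 THIRD-A . . . .   \ 1 3 2 1
```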
I'd expect it would be S"
There is nothing in the specification of S" that requires it to be a
char-counted string ~ it's allowed to be a cell-counted string, since
it returns an address and a cell-wide count. An implementation with
8-bit chars that supports S" strings longer than 255 characters would
of course have the option of using a cell-counted string as an
internal storage format if it so desired.
Maybe you are thinking of C" but there is a reason C" is in CORE EXT
while (compiling) S" is in CORE.
Should COUNT be deprecated?
Disadvantage: changing the specs for a word already standardized and in
common usage. If you want a "big COUNT" you need a new name.
Yeah, probably.
> Should COUNT be deprecated?
It's as useful as a GETCHAR in loops as it is to convert a char-counted
string address to a str (stack text representation) ... if there was a
designation of former char-counted operations as obsolescent, I can
only think of C" that would be an obvious candidate, and since it's in
CORE EXT ~ you can just not implement it if you don't want it.
GET-CELL ( a1 -- a2 x )
Of course, that does not abstract the storage format, but I'm not sure
that I *want* to abstract the storage format at that level, I want to
abstract the storage format at the level of:
str-foo ( -- ca u ) ~ aka "str-foo ( -- str )"
... and if that has a char-counted string and COUNT in one version and
a cell-counted string and GET-CELL in another version, none of the str-
consumers need to be concerned with it.
If it's really critical in an application to abstract, then:
DEFER GET-LEN
' COUNT IS GET-LEN
...
' GET-CELL IS GET-LEN
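Filled in, that might look like the following. GET-CELL is defined here by analogy with COUNT; DEFER and IS are assumed to be available (they are Forth 200x words, not ANS 94), and FOO-DATA and STR-FOO are made-up examples.

```forth
: GET-CELL ( a-addr1 -- a-addr2 x )  DUP CELL+ SWAP @ ;

DEFER GET-LEN
' COUNT IS GET-LEN            \ this build uses char-counted storage

CREATE FOO-DATA 3 C, CHAR f C, CHAR o C, CHAR o C,
: STR-FOO ( -- c-addr u )  FOO-DATA GET-LEN ;
STR-FOO TYPE                  \ foo
```

A cell-counted build would lay FOO-DATA out with `3 ,` in front and use `' GET-CELL IS GET-LEN`; consumers of STR-FOO would not change.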
There's FLAG. I reckon Toolbelt 2002 ought to be available at FLAG.
Peter Sovietov's "Forth Wizard" is an interesting program:
http://forpost.sourceforge.net/forthwiz.html
I have modified Peter Sovietov's "Forth Wizard" for personal use. I added
many additional Forth stack words, such as NIP, TUCK, ROLL, PICK, Rob
Chapman's Quarks, and functionality: the return stacks, temporary registers,
etc. Unfortunately, the program has some errors I introduced which I
haven't corrected. My modified version gives the results below for the
stack operation: ( x1 x2 x3 -- x1 x2 x3 x1 ). Unfortunately, due to those
possible errors, you'll need to actually check that the output is correct
before using a sequence.
There are two words whose use you may be unfamiliar with. FYI, I think R!
and PLACE, as used here, might've been names I made up ... "My" PLACE is
not the string operation by Wil Baden. R! is like R@ but from the parameter
stack. Its stack operation is ( a -- a R:a ). UNDER is not used. AIUI,
it's the same as NUP ( a b -- a a b ).
ANS defines R@ this way:
R@ ( -- x ) ( R: x -- x )
So, R! would be defined this way by ANS:
R! ( x -- x ) ( R: -- x )
As used here, "my" PLACE is the reverse of PICK. I.e.,
1 PLACE is ( a b -- b b ) ,
2 PLACE is ( a b c -- c b c) ,
etc.
There are two lists of sequences, one with and one without PICK,
ROLL, -ROLL, or "my" PLACE. Both lists have "my" R! .
( x1 x2 x3 -- x1 x2 x3 x1 )
( without PICK, ROLL, -ROLL, or "my" PLACE )
-rot nup 2swap
>r over r> swap
>r over >r 2r>
>r nup r> rot
dup 2over drop nip
dup 2over rot 2drop
dup 2over -rot 2nip
dup 2over take drop
over 2over drop nip
over 2over rot 2drop
over 2over take drop
rot r! -rot r>
rot dup 2swap rot
rot rot nup 2swap
-rot over swap 2swap
-rot over -rot 2swap
-rot 2dup take 2swap
-rot nup 2tuck 2drop
2>r dup 2r> rot
nup 2over -rot 2nip
r! r> 2over drop nip
r! r> 2over rot 2drop
r! r> 2over -rot 2nip
r! r> 2over take drop
r! >r over 2r> take
r! drop over r> swap
r! drop over >r 2r>
r! drop nup r> rot
r! take r! swap 2r>
r! take tuck r> swap
r! take tuck >r 2r>
>r r! over 2r> take
>r swap r! swap 2r>
>r swap tuck r> swap
>r swap tuck >r 2r>
>r over r@ r> take
>r over r! drop 2r>
>r over r! 2r> take
Etc.
( x1 x2 x3 -- x1 x2 x3 x1 )
( with PICK, ROLL, -ROLL, or "my" PLACE )
2 pick
dup 3 pick nip
over 3 pick nip
rot dup 3 -roll
rot 0 pick 3 -roll
-rot nup 2swap
tuck 3 roll 3 place
2dup 4 pick 2nip
nup 3 roll 3 place
0 pick 3 pick nip
1 pick 3 pick nip
2 roll dup 3 -roll
2 roll 0 pick 3 -roll
2 -roll nup 2swap
r! r> 3 pick nip
r! 2 pick r> drop
>r over r> swap
>r over r> 1 roll
>r over r> 1 -roll
>r over >r 2r>
>r nup r> rot
>r nup r> 2 roll
>r nup r> 3 roll
>r 1 pick r> swap
>r 1 pick r> 1 roll
>r 1 pick r> 1 -roll
>r 1 pick >r 2r>
dup dup 4 pick 2nip
dup over 4 pick 2nip
dup -rot 3 roll 3 place
dup tuck 4 pick 2nip
HTH, at least it's a brain teaser for a few of you ...
Rod Pemberton
Hans
COUNT ( c-addr1 -- c-addr2 u )
Return the character string specification for the string stored at
c-addr1. c-addr2 is the address of the first character of the
contiguous text portion of the string. u is the length in characters
of the string at c-addr2.
Shoot me!
Hans
It's the equivalent of DUP >R, and the STC version of Win32Forth
provides DUP>R since it is a single instruction: PUSH EAX. I'm not
sure the ! notation is appropriate, since ! has the signature
( v a -- ); i.e. nothing is left on the stack after execution.
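Since a colon definition cannot portably leave something on the return stack for its caller, one way to get this behaviour in standard Forth is a compile-time macro. A sketch, keeping the name R! from the post only for continuity (it is compile-only here), with THIRD-R as a made-up test word:

```forth
: R!  ( x -- x ) ( R: -- x )  POSTPONE DUP POSTPONE >R ; IMMEDIATE

\ One of the listed sequences, "rot r! -rot r>", with -ROT spelled ROT ROT:
: THIRD-R ( x1 x2 x3 -- x1 x2 x3 x1 )  ROT R! ROT ROT R> ;
1 2 3 THIRD-R . . . .   \ 1 3 2 1
```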
Anton Ertl has a good paper on RAFTS, an experimental optimiser. Given
an addressable stack, sequences of stack manipulation words can be
eliminated by executing them at compile time. For instance, the
example you give of the sequences for ( x1 x2 x3 -- x1 x2 x3 x1 ) can
be replaced by a single push to the stack (the equivalent of 2 PICK,
but without having to write such a beast). The papers are here;
Damning with faint praise? Zero-terminated strings are an even worse
practice, that's fortunately not supported by standard Forth at all
(and for good reason).
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2010: http://www.euroforth.org/ef10/
Where do you store the address? Store the length there, too. E.g.,
with "2!".
Exactly.
It should certainly move from CORE elsewhere (CORE EXT? STRING EXT?).
COUNT is not a real problem, though. FIND, however, is a problem, not
just because it expects a counted string, but more importantly,
because it is not well defined. Once we get rid of FIND, there is no
need for WORD and C", and without them, no need for COUNT.
That mostly makes sense, but I'm not sure that FIND ever implied a
need for C" .
Andrew.
> It should certainly move from CORE elsewhere (CORE EXT? STRING EXT?).
CORE EXT would be fine ~ it's a useful operation even without char
counted strings, even if a misnomer.
> COUNT is not a real problem, though. FIND, however, is a problem, not
> just because it expects a counted string, but more importantly,
> because it is not well defined. Once we get rid of FIND, there is no
> need for WORD and C", and without them, no need for COUNT.
Would relegating FIND to CORE EXT entail putting PARSE and PARSE-NAME
and some ( ca u ) version of FIND into CORE? eg, ``SEARCH-NAME'' in
parallel to ``SEARCH-WORDLIST''?
C" DUP" (c-addr) FIND works (in a definition)
S" DUP" (c-addr u) FIND does not work
So FIND needs a character-counted string somewhere.
That's really a bad design.
SEARCH-WORDLIST wants a (c-addr u) pair; u does not need to be somewhere
in front of c-addr but has its place on the stack. Much more flexible.
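The difference shows up as soon as the name arrives as ( c-addr u ). A small sketch, assuming the SEARCH-ORDER word set is present and looking only in FORTH-WORDLIST rather than through the whole search order; VIA-FIND and VIA-SEARCH are illustrative names.

```forth
\ FIND needs a counted string sitting in memory:
: VIA-FIND   ( -- xt n )  C" DUP" FIND ;
\ SEARCH-WORDLIST takes the c-addr u pair directly:
: VIA-SEARCH ( -- xt n )  S" DUP" FORTH-WORDLIST SEARCH-WORDLIST ;

VIA-FIND   NIP 0<> .   \ true flag if DUP was found
VIA-SEARCH NIP 0<> .   \ true flag if DUP was found
```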
--
Coos
CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
I REALLY agree with you. I've been trying to get both WORD and FIND
obsoleted for some time now. All I get is fuzzy arguments that really
make little sense, apart from "It works for me and I'd have a lot of
work editing old programs if this ever gets through".
Hans Bezemer
Oh, I see.
>> S" DUP" (c-addr u) FIND does not work
>> So FIND needs a character-counted string somewhere.
>> That's really a bad design.
>> SEARCH-WORDLIST wants a (c-addr u) pair; u does not need to be somewhere
>> in front of c-addr but has its place on the stack. Much more flexible.
> Coos,
>
> I REALLY agree with you. I've been trying to get both WORD and FIND
> obsoleted for some time now. All I get is fuzzy arguments that really
> make little sense, apart from "It works for me and I'd have a lot of
> work editing old programs if this ever gets through".
Well, PARSE isn't really a replacement for WORD: for that you need
PARSE-WORD , which isn't standard. But PARSE-WORD only works with a
space delimiter; maybe that's sufficient. So, the plan would be to
declare WORD and FIND obsolescent, but there's probably no chance of
them being removed altogether because too much code would break.
Andrew.
You're quite right there: COUNT is *NOT* an alias for C@+, e.g.
BEGIN COUNT DUP WHILE EMIT REPEAT 2DROP
is *NOT* a good way at all to print a zero-terminated string (sorry,
Anton). Why? Read the description: I'm not returning an unsigned
number (or string length for that matter), I'm returning a character.
I'm not dealing with counted strings, I'm dealing with zero-terminated
strings. The description doesn't fit, the stack diagram doesn't fit,
why the *** do we need a "standard" if we ignore just about anything
that's written there?
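If the intent is to walk a zero-terminated string, a separately named word at least says so. A sketch, with C@+ and ZTYPE as assumed names (neither is a standard word) and ZHELLO as a made-up test string:

```forth
: C@+   ( c-addr1 -- c-addr2 char )  DUP CHAR+ SWAP C@ ;
: ZTYPE ( c-addr -- )
  BEGIN C@+ DUP WHILE EMIT REPEAT   \ emit chars until the zero
  2DROP ;                           \ drop the zero and the address

CREATE ZHELLO CHAR h C, CHAR i C, 0 C,
ZHELLO ZTYPE
```

On a system with 8-bit chars C@+ does exactly what COUNT does; the point is only that the name and stack comment describe a character fetch, not a string length.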
Hans
No, it really does not function as usual ...
... ( ca u ) 0 ?DO COUNT process EMIT LOOP ...
... is really messed up if COUNT does not work as specified.