Hi, everybody, I want to play a bit with multi-threading in Forth. I need to create several OS threads, preferably under Linux (amd64), and I need a synchronization primitive. And I don't want to implement that stuff myself. A code example would be nice as well. (And if the synchronization primitive were a message/runnable queue...) If the threads could run on different cores, that would be great.
Any out-of-box solutions?
> Any out-of-box solutions?
Why not try iForth? It has multi-thread/core stuff that works even when using
it to write SSE/SSE2 floating-point code for Jack (audio-manager) callbacks.
iForth runs on Windows, Linux and OS X. You can have either a 32 or a 64 bit
version on these OSs.
-marcel
Sorry, all the multi-threading we've worked with has been either fully
native (e.g. SwiftX, polyFORTH) or Windows-based, which I think has a
very different set of rules/constraints from Linux. I haven't looked at
Linux SwiftForth; it may have what you need.
Cheers,
Elizabeth
--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com
"Forth-based products and Services for real-time
applications since 1973."
==================================================
> Why not try iForth? It has multi-thread/core stuff that works even when using
> it to write SSE/SSE2 floating-point code for Jack (audio-manager) callbacks.
On the page http://home.iae.nl/users/mhx/i4faq.html there is a remark:
"Tcl/Tk 8.0 works spectacularly well together with iForth under Windows XP,
as shown above".
...and there's a screenshot. To be more specific: did you indeed mean
"Tcl/Tk" - or just Tk alone, as the graphics kit used by iForth (without
Tcl involved in any way)?
--
...but I'm ready to learn ('bout) of the power of Forth
> Sorry, all the multi-threading we've worked with has been either fully
> native (e.g. SwiftX, polyFORTH) or Windows-based, which I think has a
> very different set of rules/constraints from Linux. I haven't looked at
> Linux SwiftForth, it may have what you need.
Did you mean by the above that multi-threading in Linux SwiftForth is
"deactivated" somehow, and that - for example - one won't be able to use
more than one CPU core?
I mean that I haven't used it or read the manual, so I don't know what's
there or how it works. Download it and try for yourself!
It works just fine. Linux SwiftForth tasks are libc threads, and the
result is a very clean fit. It scales to multiple cores with little
overhead.
Andrew.
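Since Linux SwiftForth tasks are said above to be libc threads, the shape of
the task-plus-message-queue setup the original poster asked for can be
sketched outside Forth too. A sketch in Python, whose threading module wraps
the same libc/pthread threads on Linux (the names worker, jobs, results are
mine, and note that CPython's interpreter lock limits parallel compute,
unlike a native Forth task):

```python
import threading, queue

jobs = queue.Queue()      # the "runnable queue": (function, args) pairs
results = queue.Queue()

def worker():
    # Each task is a real OS thread blocking on the queue.
    while True:
        fn, args = jobs.get()
        if fn is None:            # sentinel: shut this task down
            break
        results.put(fn(*args))

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

for n in range(8):                # post some work
    jobs.put((lambda x: x * x, (n,)))
for _ in threads:                 # one sentinel per task
    jobs.put((None, ()))
for t in threads:
    t.join()

squares = sorted(results.get() for _ in range(8))
print(squares)                    # [0, 1, 4, 9, 16, 25, 36, 49]
```

The queue doubles as the synchronization primitive: no explicit locks appear
in user code because the queue serializes the handoff between tasks.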
:-(
iForth:
FORTH> : foo r> r> swap >r >r ; ok
FORTH> : bar 11 . foo 111 . ; ok
FORTH> : baz 22 . bar 222 . ; ok
FORTH> baz 22 11 111 222 ok
which means I cannot implement control structures in Forth.
For comparison, Gforth:
: foo r> r> swap >r >r ; ok
: bar 11 . foo 111 . ; ok
: baz 22 . bar 222 . ; ok
baz 22 11 222 111 ok
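The two transcripts differ in whether FOO's return-stack juggling really
reaches the return addresses. A minimal sketch of an indirect-threaded inner
interpreter in Python (hypothetical, not how any particular Forth is
implemented) keeps real continuations on the return stack and reproduces the
Gforth ordering:

```python
# Colon definitions are lists; the return stack holds (word, ip)
# continuations.  CO swaps the top two return addresses, as
# r> r> swap >r >r  does on systems that keep real return
# addresses on the return stack.
out = []

def run(start, defs):
    word, ip, rs = defs[start], 0, []
    while True:
        if ip == len(word):          # implicit EXIT at end of definition
            if not rs:
                return
            word, ip = rs.pop()
            continue
        instr, ip = word[ip], ip + 1
        if isinstance(instr, int):   # models  n .
            out.append(instr)
        elif instr == 'co':          # r> r> swap >r >r
            rs[-1], rs[-2] = rs[-2], rs[-1]
        else:                        # nest: push continuation, enter callee
            rs.append((word, ip))
            word, ip = defs[instr], 0

defs = {'foo': ['co'],
        'bar': [11, 'foo', 111],
        'baz': [22, 'bar', 222]}
run('baz', defs)
print(out)                           # [22, 11, 222, 111], as in Gforth
```

With inlining, or with return addresses kept somewhere else, the same source
need not behave this way - which is the point of the discussion below.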
> Marcel Hendrix wrote:
>> Michael L Gassanenko <m_l...@yahoo.com> writes Re: multi-threading in Forth?
[..]
> :-(
> iForth:
> FORTH> : foo r> r> swap >r >r ; ok
> FORTH> : bar 11 . foo 111 . ; ok
> FORTH> : baz 22 . bar 222 . ; ok
> FORTH> baz 22 11 111 222 ok
> which means I cannot implement control structures in Forth.
> For comparison, Gforth:
> : foo r> r> swap >r >r ; ok
> : bar 11 . foo 111 . ; ok
> : baz 22 . bar 222 . ; ok
> baz 22 11 222 111 ok
To implement this non-standard code, please use this non-standard trick
(as shown earlier) to prevent in-lining:
: foo r> r> swap >r >r [ -opt ] ;
: bar 11 . foo 111 . ;
: baz 22 . bar 222 . ;
FORTH> baz 22 11 222 111 ok
-marcel
What's non-standard about that code?
The placement of return addresses on the return stack is not demanded by
the current standard. E.g. FigForth and F83 placed them on the return
stack, by tradition and for simplicity. Currently an implementation may
place them anywhere, even not on the return stack.
So manipulation of items on the return stack by Foo is allowed, but there
is no guarantee that you manipulate return addresses.
I too have an implementation where the inlining in FOO (called CO here,
courtesy of Albert) must be switched off.
--
Coos
CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
> I too have an implementation where the inlining in FOO (called CO here,
> courtesy of Albert) must be switched off.
So what does the "return stack" serve for in such an implementation, which
isn't using it for keeping the return addresses? Just as a "spare parameter
stack"?
Actually, _why not_ place return addresses exactly on the return stack?
What's the point - what can we gain that way?
Some processors manage subroutine calls with internal registers or
stacks and the implementation can get much better performance by using
them. But the Standard gives an entitlement for using the Return Stack
with >R R@ and R>, so on those implementations a cosmetic Return Stack
has to be provided to support these words.
> Some processors manage subroutine calls with internal registers or
> stacks and the implementation can get much better performance by using
> them.
1. From this thread I understood that it has been done on Intel CPUs. Is the
above also valid for Intel/AMD ones (the most popular among PC users)?
2. Even in the case of some "exotic" CPUs, where it could perhaps even be a
requirement: isn't it even in such a case possible to "map" those internal
registers somehow onto a return stack available from within Forth?
Actually, I would expect to find return address on the return stack,
according to its name.
I don't have a list, although I can think of some 8051 variants on which
return addresses are managed differently. But the point of a Standard
is to define programmer entitlements that are independent of CPU while
allowing implementers to optimize performance underneath.
> 2. Even in the case of some "exotic" CPUs, where it could perhaps even be a
> requirement: isn't it even in such a case possible to "map" those internal
> registers somehow onto a return stack available from within Forth?
No, because on some (at least that I am familiar with) there isn't a
single cell-sized register that you can treat that way, and there may be
other limitations.
> Actually, I would expect to find return address on the return stack,
> according to its name.
Well, yes, that was certainly its original use, and probably how it
works on the vast majority of implementations.
>So what does the "return stack" serve for in such an implementation, which
>isn't using it for keeping the return addresses? Just as a "spare parameter
>stack"?
In terms of the ANS and Forth200x standards, yes.
>Actually, _why not_ place return addresses exactly on the return stack?
>What's the point - what can we gain that way?
In practice, embedded Forths using 8051s may keep return addresses on
a third stack. However, what you see more often is that a return
address may be larger than a cell size (DSP, PIC ...) or may not be
formatted as you expect (e.g. Cortex-M has bit 0 always set).
Hence, there are occasions when you *need* to isolate the return
address words, because they are not just R@ and friends. Yes, this
sort of problem is seen regularly.
Stephen
--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
Do you have any idea just how ridiculous that sounds? :-/ I realise you
are just quoting the standard, but jeesh... "The placement of return
addresses on the return stack is not demanded by the current
standard"... FFS... Is it possible to produce a more left-leaning woolly
standard than the current one? I think not!
Exactly! So we're saying the old "Breakfast, lunch, dinner" example in
Brodie wouldn't work?
What, like a return stack, you mean ;-)
Okay, now it's making more sense! Perhaps the return stack should be
obsoleted?
> In practice, embedded Forths using 8051s may keep return addresses on
> a third stack. However, what you see more often is that a return
> address may be larger than a cell size (DSP, PIC ...) or may not be
> formatted as you expect (e.g. Cortex-M has bit 0 always set).
>
> Hence, there are occasions when you *need* to isolate the return
> address words, because they are not just R@ and friends. Yes, this
> sort of problem is seen regularly.
Of course, it's better to create a Forth compiler with a "strange return
stack" than not to create it at all. But I understand the described situation
as a rationale for why the ANS standard doesn't impose strict rules about the
stack - "...because sometimes it may not be possible to fulfil such a
requirement" - rather than as a statement that "actually, the return stack
can be implemented in any way". Yes: formally it can - it isn't forbidden by
law, so no Forth creator will be punished - but the question is: should it
really be treated so blithely?
Nevertheless I'm somewhat amazed that it is done even in situations where a
"proper" return stack could be implemented. That can cause more problems than
e.g. changing a word name from "<MOVE" to "MOVE>", because it violates a
quite basic rule. What the return stack is, is described in every Forth
primer. And then we see that the rule works only "sometimes".
I think that "rule" doesn't necessarily mean a formal requirement of some
standards committee - it can be just a "traditional solution", present and
known for 30 years, if it still does its work with no problems. But of course
it can be changed when it brings some obvious profit for both compiler
creator and user. For example, Andy Valencia, the ForthOS creator, proposed
to use larger blocks - 4096 instead of the "traditional" 1024 bytes - and it
does make sense, because meanwhile the HDD vendors have changed the basic
sector size to... 4096 bytes (from 512). So this is an example of how a
change of standard can follow "the life". Using blocks of the same size as
the HDD sector means better performance (if one is willing to use blocks at
all).
So if I'm trying to use a compiler written for some "exotic" hardware, yes:
I can expect the presence of some non-standard solution. But why in the case
of a compiler written for the most popular hardware? There surely is some
rationale behind it, although IMHO if such a change doesn't bring some
tremendous profit (much greater efficiency?), perhaps it's better _not_ to
introduce it, because most programmers will expect to find things working
the "old way".
Well, of course I don't have as much Forth-programming experience as
everyone else here (if my exercises can be counted as "experience" at all),
but this is how I see it.
>Okay, now it's making more sense! Perhaps the return stack should be
>obsoleted?
No. The wording of the standard is that it is only standard to remove
stuff that you put on the return stack yourself, which excludes
return addresses. On the majority of desktop systems, i386, x86_64
or ARM, using R@ and friends to access a return address will not
surprise you.
> Exactly! So we're saying the old "Breakfast, lunch, dinner" example in
> Brodie wouldn't work?
I'm afraid that without the necessary modifications it won't work on iForth.
But I haven't tried iForth yet.
There's no need to beat on ANS Forth; it correctly tells you that this
technique will have portability issues. Just deal with them. It's not
like a much-less-followed standard would be free of issues like this.
If you search for "recognizing tail recursion" on Google Groups (still
useful for Usenet, just not for following it), you'll find some
interesting comments about this and the standard's position on it.
So, just declare an environmental dependency on... whatever this is
called, exactly, and use system optimizer hooks on those systems that
are otherwise OK with the practice. You're now aware of the iForth
hook; the SwiftForth hook is NO-TAIL-RECURSION , used like IMMEDIATE
(i.e. placed right after the definition it applies to).
\ : call >r ;
: concat ( c-addr1 u1 c-addr2 u2 -- c-addr3 u1+u2 )
third 2dup + dup >r allocate throw >r
r@ + swap move r@ swap move r> r> ;
: free-after ( a u -- a u )
over r> swap >r ['] call catch r> free throw throw ;
\ Frees after the string is typed
: x s" hi" s" there" concat free-after cr type ;
\ Frees after the string is typed in Z
: y s" hi" s" there" concat free-after ;
: z y cr type ;
If you want the X behavior but not the Z behavior in this instance, use
NO-TAIL-RECURSION on FREE-AFTER and Y will reliably do the wrong thing
of freeing the string before returning it.
From another time that this has been discussed:
http://groups.google.com/group/comp.lang.forth/msg/7e37f3b5f43cddf9
SP> The chances of understanding are much improved by extensive
SP> commenting or documentation in the source code. They are
SP> also improved by suitable choice of word names, even if these
SP> are simply renamings.
SP> In the case of LATER, the assumption that return addresses are
SP> on the return stack was unstated. Michael has proposed using
SP> RA as an indicator for this. It is an unfortunate side-effect
SP> of ANS Forth that such techniques are sometimes disparaged.
SP>
SP> : later \ -- : resume in CALLER'S CALLER
SP> \ Document assumptions here
SP> ra> ra> swap >ra >ra
SP> ;
SP>
SP> Some code just *is* difficult to understand, and Forth seems
SP> to enable use of difficult (rare?) techniques more than most
SP> other languages.
SP>
SP> This particular example should not be discouraged, it should
SP> instead be documented and placed in a system layer rather
SP> than an application layer. I would rather call this word
SP> some thing like
SP> ;LATER:
SP> [LATER]
SP> to indicate that something startling will happen.
8051 was just fine; IIRC it was AVR that had a small circular buffer
for return addresses. OTOH, if one really cares, it is possible
to have a real return stack even on such hardware, you just don't
have to use that poor excuse for a stack.
>
>> 2. Even in the case of some "exotic" CPUs, where it could perhaps even be a
>> requirement: isn't it even in such a case possible to "map" those internal
>> registers somehow onto a return stack available from within Forth?
>
> No, because on some (at least that I am familiar with) there isn't a
> single cell-sized register that you can treat that way, and there may be
> other limitations.
I suspect the systems for such hardware are not 100% standard
for different reasons, one of which is that the users of
exotic hardware value access to their device more than
standard-compliance.
>
>> Actually, I would expect to find return address on the return stack,
>> according to its name.
>
> Well, yes, that was certainly its original use, and probably how it
> works on the vast majority of implementations.
>
> Cheers,
> Elizabeth
>
In 1994 it was unclear whether the return address manipulations are
a dirty hack or an important technique. The TC just chose the easier
option.
Strictly speaking, a Forth function has the following inputs:
1) the data stack
2) the interpretation stack (the return stack plus the interpretation
pointer IP)
3) the memory state
The "ordinary" words like DUP and ! create a new state of
the data stack and/or memory and the interpretation stack
(because they at least advance IP) and (!) invoke the
function that was pointed to by IP passing to it the
newly created inputs.
But this is not the only way to use the argument IP: for
example, BRANCH fetches a destination address from IP,
and passes control to the function pointed to by the
destination address, with the IP in interpretation
stack state replaced by a new address.
More complicated actions are possible too, for example,
it is possible to call the code at IP multiple times.
(You won't be able to EXIT there after that, you will
have to RDROP EXIT instead.)
But again, although all this is theoretically important,
the TC wanted an _industrial_ standard rather than an
academic one, and there had been no common practice
of industrial use of such techniques. (Yes, there used
to be a significant industrial use.)
So the return address manipulations were left beyond
the standard. Of course, the standard was misread as
discouraging the use of r.a. manipulations (as well
as prohibiting a lot of other techniques).
At the moment, ability to define my own control
structures is a must for me, so I checked whether
this stuff works.
See 3.2.3.3 Return stack
And the comments of several in this thread.
I think so. There is the matter of whether, say, return stack
manipulation is more useful than tail call optimization. I'd choose
tail call optimization every time, but IME code that (ab)uses the
return stack for control flow is hard to understand and maintain.
Whatever the problem, there's almost always a better way to do it than
fiddling with the R-stack.
> Andy Valencia, ForthOS creator, proposed to use larger blocks - 4096
> instead of "traditional" 1024 bytes - and it does make sense,
> because meanwhile the HDD vendors changed the size of basic sector
> unit to... 4096 bytes (from 512).
Yes, it makes very good sense to do this. However, given a 4096-byte
block buffer, you can easily get a compatible BLOCK with
: block ( n - a) 4 /mod big-block swap 1024 * + ;
So, there is no need for application code to be affected by the use of
big blocks.
Andrew.
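The remapping above can be spot-checked with a little arithmetic; a Python
sketch, where big_block is a stand-in returning the address of the 4096-byte
buffer for big block q (a flat address space is assumed):

```python
BIG = 4096
SMALL = 1024

def big_block(q):        # stand-in: address of the 4096-byte buffer q
    return q * BIG       # flat address space assumed

def block(n):            # models:  4 /mod big-block swap 1024 * +
    rem, quot = n % 4, n // 4
    return big_block(quot) + rem * SMALL

# Blocks 0..3 share big block 0, blocks 4..7 share big block 1:
print([block(n) // BIG for n in range(8)])   # [0, 0, 0, 0, 1, 1, 1, 1]
```

So BLOCK n lands 1024*(n mod 4) bytes into big block n/4, and application
code sees the traditional 1024-byte blocks unchanged.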
It depends on your Forth. If you compile to native code you're going
to have to use the internal stack for return addresses, and this is
limited to less than 256 bytes. (Actually, much less than 256 bytes
if you want to use internal memory for anything else.) I don't think
you can use this stack for >R etc as well as return addresses:
internal RAM is just too precious.
> OTOH, if one really cares, it is possible to have a real return
> stack even on such hardware, you just don't have to use that poor
> excuse for a stack.
At some considerable cost in performance, yes. On every entry to a
word you could push the return address into external RAM. It's always
a trade-off.
Andrew.
[..]
> I think, that "rule" doesn't necessarily mean formal requirement of any
> standard committee - it can be just "traditional solution", present and known
> since 30 years, if it still does its work with no problems.
But it generally doesn't (work without problems), and never did. This means
that it is not a portable technique, nor one that can be made portable, and
that is the reason the Standard does not mandate it (the Standard's goal is
not to define better Forths).
> So if I'm trying to use a compiler written for some "exotic" hardware, yes:
> I can expect the presence of some non-standard solution. But why in the case
> of compiler written for most popular hardware? There surely is some
> rationale behind it, although IMHO if such change won't give any tremendous
> profits (much greater efficiency?), perhaps it's better to _not_ introduce
> it, because most programmers will expect to find the things working "old way".
There might be some misunderstanding here. iForth DOES use a return stack, there
are real return addresses on it, and you can manipulate these if you know what
you are doing (and know about [ -opt ]).
Unmanaged access to return stack items that weren't put there by the
program itself will lead to problems. The reason for this is that R> >R R@ etc.
are first inlined and then optimized away, as are other short definitions. For
the purposes and target audience of iForth this is many times more important than
the benefits of return stack manipulation (which is still possible by either
using [ -opt ] or by redefining R> >R etc.)
-marcel
BTW: Yes, iForth does/did interface to both Tcl and Tk. However, the ActiveState
implementation I just now tried it with on Windows 7 seems to have a broken pipe
interface (older implementations worked).
It's a matter of tradeoffs. If you stick around here, you'll see that
nothing regarding a Standard is done "blithely". In this case, it is
possible that handling return addresses differently can produce a very
significant improvement in performance. That benefits every application
that runs on the platform. Manipulating return addresses, on the other
hand, is a technique that is fraught with problems and negative impact
on readability and maintainability, and a strategy that is rarely (if
ever) actually required. So the decision seems pretty logical, to me.
> ...What the return stack is, is described in every
> Forth primer. And then we can see that the rule works "sometimes".
It works often. If it's important to you to do this stuff, as others
have said, just declare a dependency on being able to do it, bear in
mind that what you're doing won't run everywhere, and live with that.
There are many things in Brodie's books that won't work on many modern
Forths, for very good reasons. That is why folks here keep pointing out
that Starting Forth, though readable and fun, is seriously dated and
can't be relied upon technically.
> It's a matter of tradeoffs. If you stick around here, you'll see that
> nothing regarding a Standard is done "blithely".
I'm pretty sure about that - but I meant exactly the opposite: the
things/rules which _aren't formally_ standardized, but which - despite this -
have been created/used in some specific way over the years.
> In this case, it is
> possible that handling return addresses differently can produce a very
> significant improvement in performance. That benefits every application
> that runs on the platform. Manipulating return addresses, on the other
> hand, is a technique that is fraught with problems and negative impact
> on readability and maintainability, and a strategy that is rarely (if
> ever) actually required. So the decision seems pretty logical, to me.
Well, in "Starting Forth" it was introduced as a quite normal procedure,
just one of the possible ways to do things.
But, actually, my point was that I would expect the return addresses on...
yes, on the return stack.
> There might be some misunderstanding here. iForth DOES use a return stack,
> there are real return addresses on it, and you can manipulate these if you
> know what you are doing (and know about [ -opt ]).
Then I've got to take a closer look at iForth, and its docs.
> BTW: Yes, iForth does/did interface to both Tcl and Tk. However, the
> ActiveState implementation I just now tried it with on Windows 7 seems to
> have a broken pipe interface (older implementations worked).
Maybe it would be worthwhile to let them (the TCT) know, so they could
fix it in 8.6.0 "final".
Fascinating. A person who apparently has no problems with
the use of the data stack discourages the use of the return stack,
although both lead to equally untraditional code (from an average
specialist's viewpoint), and both are backed by
a solid mathematical background.
Probably, the word "fiddling" attempts to characterize it
as something illogical. It's not. But to demonstrate that,
I will introduce a new notation.
== means functional equivalence (*)
[footnote]
(*) that is, we assume that the code is not accessed as data (**).
(**) In fact, some elements of code, such as branch
destinations, _are_ accessed as data, but they are not
executed as code.
[/footnote]
For FOO defined as
: foo bar ;
we write:
foo == nest[ bar exit ]
analogously, for
: baz foo ;
we have
baz == nest[ foo exit ] == nest[ nest[ bar exit ] exit ]
The next thing to realize is that a call is just
placement of an address onto the return stack.
r[ qux ] stands for a code fragment that places the address of the code
fragment qux onto the return stack.
CREATE quux ] qux exit [
quux >R == r[ qux exit ]
And now, the most important property:
nest[ x ] y ... == r[ y ... ] x
The ellipsis above shows what happens with the code that follows y. The
ellipsis at the end of an expression may be omitted, for example, we may
rewrite the above as:
nest[ x ] y == r[ y ... ] x
or even
nest[ x ] == r[ ... ] x
Now,
foo == nest[ bar exit ] == r[ ... ] bar exit
(
Note that
exit x == exit y == exit
because exit does not pass control to the code that follows it.
Looking at
nest[ x ] == r[ ... ] x
you might wonder what about the code that follows x. The answer
is: in practice x usually ends with exit, so whatever follows
that exit is irrelevant.
)
Analogously, for baz defined above:
baz == nest[ foo exit ] == nest[ nest[ bar exit ] exit ]
== r[ ... ] r[ exit ] bar exit
Let us consider a more complicated example.
[example1]
: foo ." foo " ;
: qux ." qux " ;
CREATE quux ] qux exit [
: test quux >R foo ;
test == nest[ quux >R foo exit ] == nest[ r[ qux exit ] foo exit ]
It is evident that for our definition of foo
r[ anything ] foo == foo r[ anything ]
because foo defined above does not depend on and does not
affect the return stack state. This is a very important
property, it is written as foo:R.
(You may notice that I resort to common sense in the middle
of formal transformations. I do not try to prove
what we all already know is true, because it is not
important how we prove that. This knowledge is one level
of abstraction below our maths.)
So, continuing the chain,
nest[ r[ qux exit ] foo exit ]
== nest[ foo r[ qux exit ] exit ]
== nest[ foo qux exit ]
because
r[ x ] exit == x
(This is an axiom, and it means that exit transfers control
to the code whose address is taken off the return stack.)
So, test will print "foo qux ". Let's try it in Gforth:
: foo ." foo " ; ok
: qux ." qux " ; ok
CREATE quux ] qux exit [ ok
: test quux >r foo ; ok
ok
test foo qux ok
[/example1]
[example2]
And what about
: co r> r> swap >r >r ;
?
: x a co b ;
: y c x d ;
y == nest[ c nest[ a nest[ r> r> swap >r >r exit ] b exit] d exit ]
== r[ ... ] c r[ d exit ] a r[ b exit ] r> r> swap >r >r exit
== r[ ... ] c r[ b exit ] a r[ d exit ] exit
== r[ ... ] c r[ b exit ] a d exit
it would be tempting to continue the transformation, replacing
[ r[ b exit ] a d exit ] with [ a d b exit ] but to do that, we
must know that [a d]:R (the code fragment [a d] does not depend
on or affect the return stack).
(You likely have already figured out that we use brackets
to group two code fragments into one code fragment.)
[/example2]
By the way, the notation for
r[ something ] R>
is
'[ something ]
We could use it above, but did not. Of course,
'[ x ] >R == r[ x ]
And now, some important theorem.
But first, a definition.
Definition
S1( x ) is such a code fragment that
ForAll N: x N == N S1( x )
In other words, it does what x does, but leaves the stack top intact.
Examples:
S1( NOOP ) == NOOP
S1( DROP ) == NIP
S1( DUP ) == OVER SWAP
S1( EXIT ) == DROP EXIT
Indeed, N S1( EXIT ) == EXIT N, and N DROP EXIT == EXIT
Analogously,
Definition
R1( x ) is such a code fragment that
ForAll N: x N >R == N >R R1( x )
that is, R1( x ) acts like x but leaves the return stack top intact.
Examples:
R1( DUP ) == DUP
R1( DROP ) == DROP
R1( RDROP ) == R> RDROP >R
R1( EXIT ) == RDROP EXIT
Indeed, N >R R1( EXIT ) == EXIT N >R, and N >R RDROP EXIT == EXIT
Definition
x:R if x == R1( x )
(x:R means that x does not affect or depend on the return stack)
Examples:
DUP:R
DROP:R
And now, the theorem of call equivalence.
Theorem
x == nest[ R1( x ) exit ]
Proof
nest[ R1( x ) exit ] == r[ ... ] R1( x ) exit
== x r[ ... ] exit == x ... == x
Consequence
For x:R,
x == nest[ x exit ]
(that is, if words in a colon definition do not
read or modify the return stack, a call of the definition
is equivalent to inline substitution of the definition's body).
Are you still with me? ;-)
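The S1 identities above can be spot-checked mechanically. A Python sketch,
modelling the data stack as a list (top of stack at the end) and checking
x N == N S1( x ) for the examples given:

```python
# Model the data stack as a Python list (top of stack at the end) and
# check  x N == N S1( x )  for the S1 examples given above.
def drop(s): s.pop()
def nip(s):  del s[-2]
def dup(s):  s.append(s[-1])
def over(s): s.append(s[-2])
def swap(s): s[-1], s[-2] = s[-2], s[-1]

def seq(*words):                     # compose code fragments
    def go(s):
        for w in words:
            w(s)
    return go

def equivalent(x, s1x, stack, N):
    a = list(stack); x(a); a.append(N)       # run  x N
    b = list(stack); b.append(N); s1x(b)     # run  N S1( x )
    return a == b

assert equivalent(drop, nip,             [7, 8, 9], 42)  # S1(DROP) == NIP
assert equivalent(dup,  seq(over, swap), [7, 8, 9], 42)  # S1(DUP) == OVER SWAP
print('ok')
```

The R1 identities could be checked the same way by threading a second list
through each word as the modelled return stack.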
> I think so. There is the matter of whether, say, return stack
> manipulation is more useful than tail call optimization. I'd choose
> tail call optimization every time, but IME code that (ab)uses the
> return stack for control flow is hard to understand and maintain.
> Whatever the problem, there's almost always a better way to do it than
> fiddling with the R-stack.
Maybe you're right - I understand that the examples in "Starting Forth"
weren't recommendations for "how this should be done", but rather
illustrations of the return stack's functionality and of the structure of
words. But my point was rather: if possible, I would expect the return
addresses still on the "traditional" return stack - the "standard place" in
most Forth compilers - instead of having to ask "where is it this time?".
>> Andy Valencia, ForthOS creator, proposed to use larger blocks - 4096
>> instead of "traditional" 1024 bytes - and it does make sense,
>> because meanwhile the HDD vendors changed the size of basic sector
>> unit to... 4096 bytes (from 512).
>
> Yes, it makes very good sense to do this. However, given a 4096-byte
> block buffer, you can easily get a compatible BLOCK with
>
>: block ( n - a) 4 /mod big-block swap 1024 * + ;
>
> So, there is no need for application code to be affected by the use of
> big blocks.
Right, but this is a "workaround for today"; and since in the coming
years we simply won't be able to buy an HDD with 512-byte sectors, the
mentioned change seems a logical consequence.
Of course it won't have any real meaning for compilers (most Forth compilers
of today) that aren't using "native" blocks, but "block files" instead.
Can you elaborate on your IME? I'm not being flippant; I'm privately
becoming more and more confident that fiddling with the return stack is
a huge readability and therefore maintainability win, so I'd appreciate
a course correction.
I recall that you prefer immediate words for a good subset of what
return-stack words can do. E.g.,
: with-base ( u -- )
r> BASE @ >r swap BASE !
['] call catch r> BASE ! throw ;
: base[ ( u -- )
postpone BASE postpone @ postpone >r
postpone BASE postpone ! ; immediate
: ]base ( -- )
postpone r> postpone BASE postpone ! ; immediate
: x1 ( ... "to end of line" -- ... )
16 with-base 0 parse evaluate ;
: x2 ( ... "to end of line" -- ... )
16 base[ 0 parse evaluate ]base ;
Return-stack access must be balanced after WITH-BASE and between BASE[
]BASE ; only WITH-BASE will reset the BASE if there's an exception
(maybe not desirable, but when it isn't WITH-BASE can be easily defined
to not do this). In the presence of tail-call optimization, WITH-BASE
_can_ be used to do perverse things that only sometimes work
: in-hex 16 with-base ;
: x3 ( c-addr u -- )
in-hex evaluate ; \ only works if WITH-BASE was JMP'd to.
, but I oppose that. You can take any feature too far. You can have a
Forth word that accepts twenty arguments on the data stack.
But, these are so similar; how is only one of them hard to understand
and maintain?
Another class of return-stack words are words that validate some input:
: -too-small ( c-addr u -- c-addr u | *returns from caller )
dup 2 > ?exit 2drop r>drop ;
: -not-forth ( c-addr u -- c-addr u | *returns from caller )
2dup 2 - + 2 s" .f" compare(nc) 0= ?exit 2drop r>drop ;
: .forth-file ( dirent-addr -- )
USING dirent filename zcount
-too-small -not-forth type space ;
"If it's not too small, and if it's not not forth, TYPE SPACE it."
Maybe S" .f" HAS-EXT; would be a better name, with the semicolon to
suggest the potential exit.
With immediate words:
: -too-small ( c-addr u -- c-addr u | *exit )
postpone dup 2 postpone literal postpone <=
postpone if postpone 2drop postpone exit postpone then ; immediate
And so on. It's used in exactly the same way, it has exactly the same
issue of "to test this, I need to use it within a temporary definition",
and it could be neater with a macro to replace the POSTPONEs.
So these are _exactly the same_. How is only one of them hard to
understand and maintain?
I could have also "not done that" - have written
: .forth-file ( dirent-addr -- )
USING dirent filename zcount
dup 2 <= if 2drop exit then
2dup 2 - + 2 s" .f" compare(nc) if 2drop exit then
type space ;
Or:
: .forth-file ( dirent-addr -- )
USING dirent filename zcount ( c-addr u )
dup 2 > if 2dup 2 - + 2 s" .f" compare(nc) 0= if
type space exit
then then 2drop ;
Or:
: .forth-file ( dirent-addr -- )
USING dirent filename zcount
2dup long-enough? if 2dup is-forth? if
type space exit
then then 2drop ;
The last is nice and conventional, but is the return-stack version
really any harder to read?
Which classes of return-stack-fiddling word do you object to?
<snip>
> Right, but this is a "workaround for today"; and since during following
> years we simply won't be able to buy an HDD with 512 byte sector, the
> mentioned change seems to be logical consequence.
All disks I have used till now had 512 byte *physical* sectors, an OS might
use them in chunks of 8 or something, it still does not matter for a
program. The 1024 size of a Forth block is a *logical* size.
Yes, it's possible to access physical sectors, e.g. on a floppy disk, but I
think it would not be sensible to do that on a terabyte disk.
As a user, you don't need to search, because real return addresses are
not intended to be user-accessible. They are a system facility, as are
others (such as code space and name fields). It is inappropriate to think
that you can manipulate return addresses in any way that is predictably
the same from one implementation to another.
FORTH83 mandated implementation details such as return addresses, stack
widths (only 16 bits!), layout of definitions, etc. In the Forth94
process, we realized that this was a terrible mistake, and took the firm
step towards specifying behavior, not implementation. The result has
been significantly faster and more powerful Forth systems. The cost, of
course, is that users can't write code that depends on implementation
details. IMO it's a win.
Forth offers plenty of ways to control program flow. You don't need to
play cute tricks with return addresses.
If the return stack is obsoleted, how do you intend to ...
a) implement a Forth for a threaded interpreter without it? E.g., a
threaded interpreter saves and restores interpreter addresses using the
return stack via ENTER/DOCOL and EXIT/NEXT. Historically, some Forth words
start with R> due to threaded interpreters, e.g.:
LIT BRANCH COMPILE DOES RP!
b) implement a temporary register without it? An 'A' register ala
MachineForth? E.g., a temporary may be needed for SWAP or for words using
>R and R> .
c) resolve the fact that parameter stack items must be reorganized and extra
space is required to do that?
d) keep control flow addresses off of the parameter stack? I.e., a single
stack is commonly used in C, but not required. This leads to many of the
problems C has with various exploits. This would be the same as merging the
return stack and parameter/data stack of Forth into a single stack.
Rod Pemberton
I don't discourage the use of return stack. It's very useful for
saving items from the data stack. It's not useful for changing
control flow, IME; that's the Forth equivalent of GOTO.
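For example, the balanced save/restore pattern: >R and R> pair up within
one definition, so control flow is untouched:

```forth
\ Hold the top item on the return stack while rearranging the rest.
: third ( a b c -- a b c a )  >r over r> swap ;
1 2 3 third  \ leaves 1 2 3 1 on the data stack
```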
> Probably, the word "fiddling" attempts to characterize it
> as something illogical. It's not. But to demonstrate that,
> I will introduce a new notation.
<formalism elided>
What is the point of this? It can't be in response to what I wrote,
since I never claimed that fiddling with the return stack was
"illogical". I did claim that it was, ah, fiddly. :-)
> (that is, if words in a colon definition do not read or modify the
> return stack, a call of the definition is equivalent to inline
> substitution of the definition's body).
Quite so: this also means that fiddling with the return stack may
destroy this property. For example,
: bletch foo bar baz ;
may execute FOO and BAR, but not BAZ, because BAR has the side-effect
of exiting from BLETCH . Throwing an exception does that too, but
exceptions are constrained in a way that R-stack fiddling is not. You
can, at least, see the site to which an exception returns.
> Are you still with me? ;-)
Andrew.
It's not a workaround at all. In the past, Forths used to have
different sized blocks, and there was a constant B/BUF that was used.
Changing BLOCKs to a standard size across all implementations, no
matter what their physical block size, was an improvement. To go back
to the old days, where you had to cope with variable block sizes, is a
regression.
Andrew.
What is CALL ?
> : base[ ( u -- )
> postpone BASE postpone @ postpone >r
> postpone BASE postpone ! ; immediate
> : ]base ( -- )
> postpone r> postpone BASE postpone ! ; immediate
>
> : x1 ( ... "to end of line" -- ... )
> 16 with-base 0 parse evaluate ;
>
> : x2 ( ... "to end of line" -- ... )
> 16 base[ 0 parse evaluate ]base ;
This doesn't do anything nonstandard with the R-stack, as far as I can
see. It seems very elaborate for what it does, but I suppose if you
have a very great deal of BASE-changing it might be worthwhile. On
the other hand, it might be clearer simply to use WITH-BASE without all
the syntactic sugar.
> Return-stack access must be balanced after WITH-BASE and between BASE[
> ]BASE ; only WITH-BASE will reset the BASE if there's an exception
> (maybe not desirable, but when it isn't WITH-BASE can be easily defined
> to not do this). In the presence of tail-call optimization, WITH-BASE
> _can_ be used to do perverse things that only sometimes work
>
> : in-hex 16 with-base ;
> : x3 ( c-addr u -- )
> in-hex evaluate ; \ only works if WITH-BASE was JMP'd to.
>
> , but I oppose that. You can take any feature too far. You can have a
> Forth word that accepts twenty arguments on the data stack.
>
> But, these are so similar; how is only one of them hard to understand
> and maintain?
I didn't say that. Fiddling with the R-stack is just one way to
confuse the reader. There are many more. :-)
> Another class of return-stack words are words that validate some input:
>
> : -too-small ( c-addr u -- c-addr u | *returns from caller )
> dup 2 > ?exit 2drop r>drop ;
> : -not-forth ( c-addr u -- c-addr u | *returns from caller )
> 2dup 2 - + 2 s" .f" compare(nc) 0= ?exit 2drop r>drop ;
>
> : .forth-file ( dirent-addr -- )
> USING dirent filename zcount
> -too-small -not-forth type space ;
>
> "If it's not too small, and if it's not not forth, TYPE SPACE it."
> Maybe S" .f" HAS-EXT; would be a better name, with the semicolon to
> suggest the potential exit.
>
> With immediate words:
>
> : -too-small ( c-addr u -- c-addr u | *exit )
> postpone dup 2 postpone literal postpone <=
> postpone if postpone 2drop postpone exit postpone then ; immediate
>
> And so on. It's used in exactly the same way, it has exactly the same
> issue of "to test this, I need to use it within a temporary definition",
> and it could be neater with a macro to replace the POSTPONEs.
>
> So these are _exactly the same_. How is only one of them hard to
> understand and maintain?
They're both hard to understand and maintain. What's wrong with
-too-small if -not-forth if type space then then
? Golly, even a Forth beginner could understand what that does. We
can't have that! :-)
> I could have also "not done that" - have written
>
> : .forth-file ( dirent-addr -- )
> USING dirent filename zcount
> dup 2 <= if 2drop exit then
> 2dup 2 - + 2 s" .f" compare(nc) if 2drop exit then
> type space ;
>
> Or:
>
> : .forth-file ( dirent-addr -- )
> USING dirent filename zcount ( c-addr u )
> dup 2 > if 2dup 2 - + 2 s" .f" compare(nc) 0= if
> type space exit
> then then 2drop ;
>
> Or:
>
> : .forth-file ( dirent-addr -- )
> USING dirent filename zcount
> 2dup long-enough? if 2dup is-forth? if
> type space exit
> then then 2drop ;
>
> The last is nice and conventional, but is the return-stack version
> really any harder to read?
Yes. Conventional is good. The maintenance programmer's time is
valuable. Think about it for a moment: on first sight, how long would
it take the reader to figure out what these programs do? Remember,
you're writing code to be read.
There's an old saying that appears in many forms, one of which is
"Don't use a ten dollar word when a nickel one will do." This is good
advice when writing English, and terrific advice when programming.
People learning English often use heavyweight synonyms and complex
structures they've learned. That's good while learning, but it's also
a sure sign of someone not very fluent with the language.
A maintenance programmer has a reasonable expectation that a
heavyweight technique will only be used when demanded by a heavyweight
problem. When they find otherwise there's a kind of cognitive
dissonance (talk about 10-dollar words!) that forces a break in
concentration. It also forces them to step away from the problem they
were trying to solve in order to learn how your custom control
structures work.
This is not general advice against custom control structures, which
are one of Forth's strengths. But it's important to realize that,
while they can simplify code and make it more readable, there is a
cost. A wise programmer balances costs and benefits, and uses
powerful techniques sparingly.
Andrew.
Take the example of `` :; '' of colorforth ( CO in my book).
Some considered it an example of beneficial use of the return stack
to implement coroutines:
: CO R> R> SWAP >R >R ;
It is much better though to consider CO as an abstraction in
its own right. The explicit return stack manipulation is
a possible implementation technique.
In ciforth CO is a code word with just one instruction.
In colorforth :; is an instruction at the machine level.
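A small usage sketch (this depends on CO being an ordinary colon
definition or code word as above, and on the compiler not inlining it):

```forth
: CO  r> r> swap >r >r ;  \ exchange the top two return addresses
: greet  ." before " CO ." after " ;
: demo   greet ." middle " ;
demo  \ prints: before middle after
```

GREET runs up to CO, control passes back to DEMO, and when DEMO
finishes, the remainder of GREET runs.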
>
>Cheers,
>Elizabeth
Groetjes Albert
--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
> All disks I have used till now had 512 byte *physical* sectors,
AFAIK current HDDs (all? or just "most"?) still have a built-in "translation
chip", so that older hardware sees them as having 512 bytes per sector.
I can't examine this myself, as the disks I'm currently using are quite small.
> an OS might
> use them in chunks of 8 or something, it still does not matter for a
> program. The 1024 size of a Forth block is a *logical* size.
Why do you see exactly this block size as the "logical" one?
> Yes, it's possible to access physical sectors, e.g. on a floppy disk, but
> I think it would not be sensible to do that on a terabyte disk.
Why not, actually? If we mean e.g. using blocks as a kind of "virtual
memory", direct use of physical sectors means faster access.
Forget physical sectors. 4K will become the norm, and not all disks are
512 byte sector size; some are 520 bytes. Flash memory will also become
the predominant technology, and that needs a completely new way of thinking
about writing to "disk", since the block erase overhead is considerable.
Yes, in the 1980's when SF was written, Forth83 mandated implementation
details that you could use in this way. You could also depend on being
able to access up to 65535 bytes of memory, 16-bit values on the data
stack, and indirect-threaded implementation. As I have explained
elsewhere in this thread, removing implementation details from the
standard 20 years ago has enabled modern Forths to be much faster and
more powerful.
Moral: do not rely on any text written in the mid-1980's to be
accurately descriptive of 2011 technology.
> But, actually, my point was, that I would expect the return addresses on...
> yes, on return stack.
That was historically accurate, and still works in some implementations.
It is not universally true.
Elizabeth correctly refers to the implicit contract which is the standard
behaviour of Forth. There are things upon which application developers
can rely, but implementation details are not one of those. That frees
implementers from being stuck forever with the fig model, and allows Forth
to evolve. The balance is a healthy one.
Rob Sciuk
> Date: Fri, 05 Aug 2011 19:02:15 -0500
> From: Julian Fondren <ayr...@gmail.com>
> Newsgroups: comp.lang.forth
> Subject: Re: multi-threading in Forth?
>
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>> I think so. There is the matter of whether, say, return stack
>> manipulation is more useful than tail call optimization. I'd choose
>> tail call optimization every time, but IME code that (ab)uses the
>> return stack for control flow is hard to understand and maintain.
>> Whatever the problem, there's almost always a better way to do it than
>> fiddling with the R-stack.
>
> Can you elaborate on your IME? I'm not being flippant; I'm privately
> becoming more and more confident that fiddling with the return stack is
> a huge readability and therefore maintainability win, so I'd appreciate
> a course correction.
I would suggest that you implement a generic stack, and re-write >R R> and
R@ in terms of that code. You could then use the faux "return stack" in
the manner you expect, and with impunity ... or simply drop the return
stack use and replace it with your own stack implementation ... and that
would be completely portable.
Cheers,
Rob Sciuk
But WITH-BASE is the word that uses return address manipulation.
Another alternative is to write
: eval-rest ( ... -- ... )
0 parse evaluate ;
: x3 ( ... "to end of line" -- ... )
['] eval-rest 16 base-execute ;
And of course, if we had nested colon defs (called quotations, blocks,
or lambda expressions in other languages), we could write this as follows:
: x4 ( ... "to end of line" -- ... )
[: 0 parse evaluate :] 16 base-execute ;
One disadvantage I see for the WITH-BASE variant here is that you have
to wrap it in another colon definition if you want to do anything
after restoring BASE.
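Concretely, the wrapping looks like this (using WITH-BASE as quoted
earlier in the thread; EVAL-HEX is just a hypothetical name):

```forth
\ WITH-BASE restores BASE only when the word that invoked it exits,
\ so code that must run with the old BASE needs its own caller.
: eval-hex ( c-addr u -- )  16 with-base evaluate ;
: x5 ( c-addr u -- )  eval-hex ." base is restored here" ;
```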
>> : .forth-file ( dirent-addr -- )
>> USING dirent filename zcount
>> -too-small -not-forth type space ;
[...]
>They're both hard to understand and maintain. What's wrong with
>
> -too-small if -not-forth if type space then then
It's longer.
>> : .forth-file ( dirent-addr -- )
>> USING dirent filename zcount
>> 2dup long-enough? if 2dup is-forth? if
>> type space exit
>> then then 2drop ;
>>
>> The last is nice and conventional, but is the return-stack version
>> really any harder to read?
>
>Yes. Conventional is good. The maintenance programmer's time is
>valuable. Think about it for a moment: on first sight, how long would
>it take the reader to figure out what these programs do?
That's a good question. If the maintenance programmer is familiar
with the idiom, does it take the maintenance programmer longer to
understand the shorter code?
>A maintenance programmer has a reasonable expectation that a
>heavyweight technique will only be used when demanded by a heavyweight
>problem. When they find otherwise there's a kind of cognitive
>dissonance (talk about 10-dollar words!) that forces a break in
>concentration. It also forces them to step away from the problem they
>were trying to solve in order to learn how your custom control
>structures work.
You are assuming that the maintenance programmer is not familiar with
the idiom. The same argument could be used against any programming
technique and any programming language: Just assume that the
programmer has to learn the technique/language first, and your
argument will eliminate the technique/language from consideration. I
don't think that this argument is useful.
I am not convinced that return address manipulation offers enough over
"conventional" to make it worthwhile to teach and learn the idioms
that use it, or whether we should better concentrate on adding
features like nested colon defs. But in any case, I would like to see
more useful arguments in the discussion.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2011: http://www.euroforth.org/ef11/
> I would suggest that you implement a generic stack, and re-write >R R>
> and R@ in terms of that code. You could then use the faux "return
> stack" in the manner you expect, and with impunity ... or simply drop
> the return stack use and replace it with your own stack implementation
> ... and that would be completely portable.
If this were possible, then I could use >R R> R@ with impunity _right
now_, without going to the trouble of reimplementing them, as I'd know that
I could write a straightforward compatibility layer for a future system.
I would have the attitude towards return-address manipulation that I
have toward PLACE APPEND ++ SKIP SCAN and others.
But this isn't the case. I can't roll my own return addresses just by
rolling my own stack.
\ documentation synonyms.
: >ra ( -- r-addr ) postpone >r ; immediate
: ra> ( r-addr -- ) postpone r> ; immediate
: ra@ ( -- r-addr ) postpone r@ ; immediate
Look at these words:
: call ( r-addr -- ) >ra ;
: cont ( -- r-addr ) ra@ magic-number + ;
: x ( -- r-addr ) cont ahead ." It's like an XT." then ;
x call
>
> What is CALL ?
>
: call >r ;
m_l_g3 calls it ENTER
> They're both hard to understand and maintain. What's wrong with
>
> -too-small if -not-forth if type space then then
>
> ? Golly, even a Forth beginner could understand what that does . We
> can't have that! :-)
>
: -too-small ( c-addr u -- c-addr u 0 | -1 )
... ;
: -too-small ( c-addr u -- c-addr u f )
... ;
Neither word follows the "consume all inputs" norm, but compared to what
I went with...
> Yes. Conventional is good. The maintenance programmer's time is
> valuable. Think about it for a moment: on first sight, how long would
> it take the reader to figure out what these programs do? Remember,
> you're writing code to be read.
>
> There's an old saying that appears in many forms, one of which is
> "Don't use a ten dollar word when a nickel one will do." This is good
> advice when writing English, and terrific advice when programming.
> People learning English often use heavyweight synonyms and complex
> structures they've learned. That's good while learning, but it's also
> a sure sign of someone not very fluent with the language.
>
> A maintenance programmer has a reasonable expectation that a
> heavyweight technique will only be used when demanded by a heavyweight
> problem. When they find otherwise there's a kind of cognitive
> dissonance (talk about 10-dollar words!) that forces a break in
> concentration. It also forces them to step away from the problem they
> were trying to solve in order to learn how your custom control
> structures work.
>
> This is not general advice against custom control structures, which
> are one of Forth's strengths. But it's important to realize that,
> while they can simplify code and make it more readable, there is a
> cost. A wise programmer balances costs and benefits, and uses
> powerful techniques sparingly.
Thanks for this :-) The middle paragraphs are persuasive enough for me.
> Andrew.
Newsgroups: comp.lang.forth
Subject: Re: multi-threading in Forth?
References: <j1cd95$2md$1...@dont-email.me> <b5SdnSwRyaUTbqbT...@supernews.com> <8662mb5...@gmail.com> <alpine.BSF.2.00.1...@yoko.controlq.com>
Organization: Dutch Forth Workshop
In article <alpine.BSF.2.00.1...@yoko.controlq.com>,
It is so easy, it hardly deserves its own screen:
----------------------------
SCR # 94
0 ( STACK PUSH POP -- a_free_stack ) \ AvdH A1sep26
1 : STACK CREATE HERE CELL+ , CELLS ALLOT DOES> ;
2 100 STACK DEBUG-STACK
3 : PUSH DEBUG-STACK @ SWAP OVER ! 1 CELLS + DEBUG-STACK ! ;
4 : POP DEBUG-STACK @ 1 CELLS - DUP @ SWAP DEBUG-STACK ! ;
5
6
7
8
9
10
11
12
13
14
--------------------------
WANT STACK
: >R PUSH ; : R> POP ; : R@ R> DUP >R ;
94 LOAD \ Ignore redefinition warnings
: >S PUSH ; : S> POP ; : S@ S> DUP >S ;
94 LOAD \ Ignore redefinition warnings
: >T PUSH ; : T> POP ; : T@ T> DUP >T ;
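A quick interactive check, assuming SCR # 94 has just been loaded so
that PUSH and POP refer to the current DEBUG-STACK:

```forth
5 PUSH  7 PUSH   \ push two items onto the auxiliary stack
POP . POP .      \ prints 7 5 (last in, first out)
```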
>
>Cheers,
>Rob Sciuk
[...]
> I am not convinced that return address manipulation offers enough over
> "conventional" to make it worthwhile to teach and learn the idioms
> that use it, or whether we should better concentrate on adding
> features like nested colon defs. But in any case, I would like to see
> more useful arguments in the discussion.
>
A few months ago I made an effort to understand some return stack
manipulation techniques, specifically those described in the writings of
mlg3 at:
https://groups.google.com/group/comp.lang.forth/browse_frm/thread/b8a35c3014a40d40/8854a1b225d5bace?hl=en&lnk=gst&q=foreachopcode#8854a1b225d5bace
and http://www.forth.org.ru/~mlg/ef94/ef94-2-paper.txt
To apply the technique I used the Spykerman Sudoku solver ( available
at:
http://www-personal.umich.edu/~williams/archive/forth/utilities/sudoku.fs)
as a testbed where I recoded a few words to see if it was easy, more
readable etc.
To give an example, I recoded the word that displays the Sudoku grid.
The original code was:
: .sudokugrid
CR CR
sudokugrid
81 0 do
dup i + c@ . ." "
i 1+
dup 3 mod 0= if
dup 9 mod 0= if
CR
dup 27 mod 0= if
dup 81 < if ." ------+-------+------" CR then
then
else
." ! "
then
then
drop
loop
drop
CR
;
which, to me at least, is not a good example of well-written Forth as it is
difficult to read, but could probably be rewritten to be more readable
in conventional Forth (perhaps someone would like to try?). My rewrite was:
: for-each-square pro 81 0 do i cont loop ;
: .square ( grid sq -- grid sq ) 2dup + c@ . ;
: boxright ( sq -- sq | ) dup 3 mod if drop fail then ;
: gridright ( sq -- sq | ) dup 9 mod if drop ." ! " fail then cr ;
: boxbase ( sq -- ) 80 min 27 mod 0= if ." ------+-------+------" cr
then ;
: .sudokugrid ( -- )
cr cr sudokugrid ( -- grid )
start for-each-square ( -- grid sq )
.square 1+
boxright gridright boxbase ( -- grid )
emerge drop cr ( -- )
;
where the words PRO CONT FAIL START and EMERGE are all described in the
above references. If anyone is interested I can provide definitions for
these that work with GForth.
This could be improved e.g. by renaming START as FOR-EACH and EMERGE as
END-FOR and better word names for the factors.
As a generator of square numbers, FOR-EACH-SQUARE was re-used in the
recoding of the other words that I completed.
How easy was it? Of the four words I recoded, three were easy, the
fourth, called SOLVER, took several attempts to get right. That may have
been a candidate for mlg3's AMONG ... EACH ... ITERATE ... control
structure (but I never got that far - I'm not sure that structure can be
implemented in Forth alone).
--
Gerry
return address manipulation
> : with-base ( u -- )
> r> BASE @ >r swap BASE !
> ['] call catch r> BASE ! throw ;
>
> : x1 ( ... "to end of line" -- ... )
> 16 with-base 0 parse evaluate ;
vs. immediate words
> : base[ ( u -- )
> postpone BASE postpone @ postpone >r
> postpone BASE postpone ! ; immediate
> : ]base ( -- )
> postpone r> postpone BASE postpone ! ; immediate
>
> : x2 ( ... "to end of line" -- ... )
> 16 base[ 0 parse evaluate ]base ;
The other thing you can do is:
: with-base ( xt u -- )
base @ >r base ! catch r> base ! throw ;
: x3 ( ... "to end of line" -- ... )
0 parse ['] evaluate 16 with-base ;
It forces you to name your inner functionality, but that's usually not
*too* inconvenient. As Anton mentioned elsewhere, nested anonymous code
blocks would help with that. It has a bunch of caveats, but you might
take a look at Gerry Jackson's lambda expressions in ANS Forth, if
you're into that sort of thing.
http://www.qlikz.org/forth/library/lambda.html
If you are willing to write system-dependent code, you can do it
much more simply with something like `AHEAD [ :NONAME ] ... THEN`.
:NONAME often does things (like set up for locals) which don't nest, so
you may have to modify it, but I think it's possible to do roughly this
on many systems, and it would be interesting to see how many and publish
a portability layer. But I just don't have much time for programming
any more...
--Josh
> To apply the technique I used the Spykerman Sudoku solver as a testbed
: | ." ! " ;
: ---- ." ------+-------+------" CR ;
: ... 3 0 do dup char+ swap c@ . loop ;
: .gridline ... | ... | ... CR ;
: .row 3 0 do .gridline loop ;
: .sudokugrid
CR CR sudokugrid
.row
----
.row
----
.row
drop CR ;
--Josh
I don't know why the author of the original used the '!' character;
better would be
: | ." | " ;
> : ---- ." ------+-------+------" CR ;
> : ... 3 0 do dup char+ swap c@ . loop ;
> : .gridline ... | ... | ... CR ;
> : .row 3 0 do .gridline loop ;
> : .sudokugrid
> CR CR sudokugrid
> .row
> ----
> .row
> ----
> .row
> drop CR ;
>
> --Josh
Nice one! Very readable. Had the original been written like that I
wouldn't have bothered recoding that particular word. However my
offering still serves as an example of using return stack manipulation.
--
Gerry
>> : | ." ! " ;
>
> I don't know why the author of the original used the '!' character,
> better would be
>
>: | ." | " ;
I thought at first that maybe the vertical bar wasn't in the usual set
of graphic characters, but it is. So I don't know why either.
> My offering still serves as an example of using return stack
> manipulation.
Yes, and it was interesting reading; thanks for that. I haven't ever
gotten around to really looking carefully at his backtracking stuff
before.
--Josh
it's done a bit differently.
: enter >r ; ok
: generator 0 begin dup 11 < while dup r@ enter 1+ repeat drop rdrop ; ok
: filter dup 1 and if drop rdrop then ; ok
: test1 generator . ; ok
test1 0 1 2 3 4 5 6 7 8 9 10 ok
: test2 generator filter . ; ok
test2 0 2 4 6 8 10 ok
.s <0> ok
To nest such generators and filters we need the words PRO and CONT that
need an extra stack, but the exercise to define an extra stack in Forth
is just boring.
: pro r> r> >l enter ldrop ;
: cont l> >r r@ enter r> >l ;
; == exit -- this is a convenient notational convention
Statement:
x r[ y ; ] == nest[ r> S1( x ) enter y ; ]
It is evident, but if you'd like a proof,
nest[ r> S1( x ) enter y ; ] ==
r[ ... ] r> S1( x ) r[ y ; ] >r ; ==
x '[ ... ] r[ y ; ] >r ; ==
x r[ y ; ] r[ ... ] ; ==
x r[ y ; ] ... == x r[ y ; ]
So,
pro == nest[ r> r> >l nest[ >r ; ] ldrop ; ] ==
== r[ ... ] r> r> >l r[ ldrop ; ] >r ; ==
== nest[ r> r> >l r[ ldrop ; ] >r ; ] ==
== r> >l r[ ldrop ; ]
A useful factor:
>l/ldrop == >l r[ ldrop ; ]
Using it,
pro == r> >l/ldrop
cont == nest[ l> >r r@ nest[ >r ; ] r> >l ; ] ==
== nest[ l> >r r@ r[ r> >l ; ] >r ; ]
A useful factor:
l>/lrest == l> >r r@ r[ r> >l ; ]
Using it,
cont == nest[ l>/lrest >r ; ]
But we don't need a second stack to implement >l/ldrop and l>/lrest.
If we need only these two words, we may implement the L stack
as a list allocated on the return stack.
(I am not optimistic about the reader's emotions here.
Anton, can at least you continue to read?)
0 value lp@
: lp! to lp@ ;
: enter >r ;
: l@ lp@ @ ;
: >l/ldrop r> swap lp@ >r >r rp@ lp! enter rdrop r> lp! ;
: l>/lrest l@ lp@ dup cell+ @ lp! r> swap >r enter r> lp! ;
: pro postpone r> postpone >l/ldrop ; immediate
: cont l>/lrest >r ;
\ a simple generator-filter-consumer example
: 0-10 pro 11 0 do i cont loop ; ( --> i -- --> )
: //2 pro dup 1 and 0= if cont else drop then ; ( i --> i -- --> )
: test1 0-10 . ;
: test2 0-10 //2 . ;
test1 0 1 2 3 4 5 6 7 8 9 10 ok
test2 0 2 4 6 8 10 ok
\ it is compatible with exceptions:
: catch lp@ >r catch r> lp! ;
: 0-10x pro 0-10 ['] cont catch if ." x " then ;
: test3 0-10x dup . //2 1 throw ;
test3 0 x 1 2 x 3 4 x 5 6 x 7 8 x 9 10 x ok
PS the stack diagram for such words includes four states,
( before --> to-continuation -- from-continuation --> on-return )
or
( before --> to-continuation ) ( on-return <-- from-continuation )
Among these four states, at most two may be non-empty and different.
e.g. (style: only passed data on the stack)
filter ( x --> x -- --> )
generator ( x --> x -- --> )
consumer ( x --> -- --> )
or (style: state on the stack)
generator ( --> x -- x --> )
filter ( x --> x -- x --> x )
consumer ( x --> -- --> x )
And if you write a filter, please remember that in addition to the
control paths
( before --> to-continuation ) and
( from-continuation --> on-return )
there is also
( before --> on-return ).
: //2 pro dup 1 and 0= if cont ELSE DROP then ; ( i --> i -- --> )
: //2 pro
( i ) dup 1 and 0= if ( i ) cont ( ) else ( i ) drop ( ) then
;
(The same consideration applies to some generators.)
PPS
Yes it _is_ exotic. Forth is exotic too, but this is more exotic than Forth.
It can.
http://www.forth.org.ru/~mlg/BacFORTH-90/pgasmil1-96.html
[Search for ": (AMONG)" or ": ITERATE"]
Probably Google can translate it (but do not trust it too much:
sometimes it tells you that "hay eats horse").
Prelude
=======
As I wrote in another post,
0 value lp@
: lp! to lp@ ;
: enter >r ;
: l@ lp@ @ ;
: >l/ldrop r> swap lp@ >r >r rp@ lp! enter rdrop r> lp! ;
: l>/lrest l@ lp@ dup cell+ @ lp! r> swap >r enter r> lp! ;
: pro postpone r> postpone >l/ldrop ; immediate
: cont l>/lrest >r ;
data execution (V.A.Tuzov's technique)
======================================
defer num
defer str
defer ls
\ bracket-backtick (_not_ tick!!)
: [`] ' >body postpone literal ; immediate
: save*  r> [`] num @ >r  [`] str @ >r  [`] ls @ >r
  enter  r> is ls  r> is str  r> is num ;
: .str count type space ;
: list 10 num 20 num C" abc" str ;
: print  [ reveal ] save*  ['] . is num  ['] .str is str
  ['] print is ls  ." ( " execute ." ) " ;
' list print
\ --> ( 10 20 abc ) ok
: list2 1 num 2 num ['] list ls c" end" str ;
' list2 print
\ --> ( 1 2 ( 10 20 abc ) end ) ok
: num, postpone literal postpone num ;
: str, count postpone cliteral postpone str ;
: ls, postpone literal postpone ls ;
: concat save* ['] num, is num ['] ls, is ls ['] str, is str
2>r :noname 2r> swap execute execute postpone ; ;
' list ' list2 concat dup print cr xt-see
\ ==> ( 10 20 abc 1 2 ( 10 20 abc ) end )
\ noname :
\ 10 num 20 num c" abc" str 1 num 2 num 139981497499032 ls
\ c" end" str ; ok
To test expressive power of this method, Tuzov implemented
all Lisp functions in this style. (He used dynamic code
generation and a garbage-collected heap,
and the GC was also written using data execution.)
The conclusion was that data execution can do everything
that the lambda calculus can do.
data execution + backtracking
=============================
Problem: print all subsets of a set.
: stack pro depth 1+ 1 ?do depth i - pick cont loop ;
\ : .depth ." [" depth 1 .r ." ] " ;
\ : .s .depth stack . ;
\ : u.s .depth stack u. ;
: .braces ." { " r> enter ." } " ;
: .{} .braces stack count type space ;
: el r@ enter drop ;
\ when we will execute this description of a set, it will
\ generate the subsets on the stack [1]
: subsets pro c" foo" el c" bar" el c" baz" el cont ;
: .subsets subsets cr .{} ;
.subsets
{ foo bar baz }
{ foo bar }
{ foo baz }
{ foo }
{ bar baz }
{ bar }
{ baz }
{ } ok
\ [1] I could add syntax sugar by defining defer { defer , defer }
\ and assigning them as in the previous section; the definition
\ would read:
\ : subsets { c" foo" , c" bar" , c" baz" , } ;
> <formalism elided>
:-(
>
>> (that is, if words in a colon definition do not read or modify the
>> return stack, a call of the definition is equivalent to inline
>> substitution of the definition's body).
>
> Quite so: this also means that fiddling with the return stack may
> destroy this property. For example,
[sigh]
The context was important.
>> Theorem
>>
>> x == nest[ R1( x ) exit ]
It shows that in the general case there's a transformation.
For example, if we have R> and DROP and want to define RDROP,
the body of the colon definition will contain R1( R> DROP )
: RDROP R> R> DROP >R ;
If we want to extract body from a colon definition (that is,
optimize out the call), it is not possible in some cases.
For example,
: lit& R@ @ AND R> CELL+ >R ;
is better left called because it reads and changes the
interpretation stack (return stack+ip).
And in one particular but important case we get the
consequence, which (consequence) is true but not new.
>> Consequence
>>
>> For x:R,
>> x == nest[ x exit ]
>>
>> (that is, if words in a colon definition do not
>> read or modify the return stack, a call of the definition
>> is equivalent to inline substitution of the definition's body).
Evidently, in some particular case we had to get what the
standard says.
> ... For example,
>
> : bletch foo bar baz ;
>
> may execute FOO and BAR, but not BAZ, because BAR has the side-effect
> of exiting from BLETCH .
If BAR causes an exit from BLETCH, BAR/:R (it's not true that BAR:R)
-- it needs to read the return stack to exit, so the consequence
does not apply.
In fact,
: MAX 2DUP > IF DROP EXIT THEN NIP ;
is a standard definition that may not be inlined because EXIT/:R
(it is not true that EXIT:R)
> Throwing an exception does that too, but
> exceptions are constrained in a way that R-stack fiddling is not. You
> can, at least, see the site to which an exception returns.
Let us not mix the flies with the cutlets.
Exceptions, of course, require a special set of axioms, but they
are not relevant now.
*** The result of return address manipulations may be predicted
at compile-time ***
And this is why the formalism is possible.
Sorry, I glitched.
This should read "because it reads and changes the
_return_ stack". (Any word that does not hang/halt
the system reads and/or modifies IP).
Yes. That is true, and trivially so: the maintenance programmer must
have, at one time, been unaware of this idiom.
> The same argument could be used against any programming technique
> and any programming language: Just assume that the programmer has to
> learn the technique/language first, and your argument will eliminate
> the technique/language from consideration. I don't think that this
> argument is useful.
Every technique has some learning cost associated with it. Of
course, if it is a really useful technique, its cost can be amortized
over the period in which it is used. It's a cost/benefit analysis, as
usual.
There's a lot to be said for convention.
Andrew.
Thanks for the link. Google Translate wasn't much good; it failed to
translate whole paragraphs. Babelfish did much better but generated
strange English that is difficult to read - but I expect I'll manage;
after all, Forth is Forth and I can probably figure things out. Much of
it seems to be similar to your EuroForth paper.
I see that the papers on backtracking are dated from the mid-1990s; have
the ideas stood the test of time? Are they still in use, or has something
better been devised? What is the current status of backtracking in Forth?
--
Gerry
Yes, but your argument implied that the technique has to amortise
itself immediately, not over time. Because if the maintenance
programmer has already learned the technique, there is no need "to
step away from the problem they were trying to solve in order to learn
how your custom control structures work".
- anton
You might have inferred that, but I did not intend to imply it. Such
an argument would not make very much sense.
Andrew.
On Sat, 06 Aug 2011 01:49:07 +0400, mlg3 wrote:
> although both lead to equally untraditional, from an average
> specialist's viewpoint, code, and both are backed by solid mathematical
> background.
And that convinced me that 'return stack manipulation of addresses' is
not just a technique but knowledge. I think that knowledgeable use of
return addresses should not be underestimated.
> Are you still with me? ;-)
Yes. Thanks for the post; the theory presented is backed by examples
and is more understandable than the EuroForth papers, which are
sometimes a bit too academic.
Have a nice day,
humptydumpty
As far as I could see the formalism told us in a formal way what we
already knew. Perhaps I missed something, though. Were there some
uncommon insights?
> *** The result of return address manipulations may be predicted
> at compile-time ***
>
> And this is why the formalism is possible.
But what is it for? I don't understand where this is going.
Andrew.
ok,
>>> Theorem
>>>
>>> x == nest[ R1( x ) exit ]
(again that theorem)
Reminder.
ForAll N:
N >R R1( x ) == x N >R
Consequence 2: the correct algorithm of inlining
It requires that you track the return stack values
inside the colon definition's body.
The colon definition looks like
: foo z ;
The following cases are possible:
1. The return stack is left in the original state,
the return address and the values beneath it are not
read or written. It is ok to move the definition's
body into the calling definition as is.
It is the case that z == R1( z )
2. The return stack top is moved off the return stack and
later placed back, after something is done with the return
stack items beneath it. (Finally, the return stack top is
consumed by EXIT compiled by ; (semi-colon).)
It is easy to generate code for such y that
R1( y ) == z
(This y is the code that must be placed in the calling
definition's body in place of foo.
Indeed,
y == nest[ R1( y ) exit ] == nest[ z exit ] == foo
)
For r-stack primitives that access (that is, move)
the return address (the top return stack item),
no code is generated.
For r-stack primitives that access other return stack
items, the code is generated as if the top return stack
item did not exist.
For example,
: rswap r> r> r> swap >r >r >r ;
rswap == r> r> swap >r >r
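A quick check with a hypothetical test word T (RSWAP exchanges the two
return-stack cells below its own return address):

: t  1 >r  2 >r  rswap  r> r> ;
\ t leaves 1 2 (2 on top); without RSWAP it would leave 2 1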
One more example.
( x -- y ) ( r: y -- x )
: >r> r> swap r> swap >r swap >r ;
The generated code will be:
( r> ) ( swap ) r> swap >r ( swap ) ( >r )
The removed words are shown in parens; they are removed
because they had to manipulate the return address that
is eliminated. The effect of R> was ( x -- x ra ) but
without ra it is just ( x -- x ), that is, NOOP.
The same applies to SWAP ( x ra -- ra x ), without
ra it becomes NOOP.
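Again a quick check, with a hypothetical word T: >R> exchanges the
data-stack top with the caller's return-stack top.

: t  1 >r  2 >r>  r> ;
\ t leaves 1 2: the 2 went under T's return address, the 1 came back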
Alternative definition of the same word:
( x -- y ) ( r: y -- x )
: >r> r> r> swap rot >r >r ;
The generated code will be:
( r> ) r> ( swap ) swap >r ( >r )
The first SWAP disappears because it has to swap
the return address with another value, and the
return address disappears, so SWAP becomes a NOOP.
ROT becomes SWAP because it originally did
( x y ra -- y ra x ) and after ra disappears,
what must be done becomes ( x y -- y x ).
I think you have already mentioned that
>r> == nest[ r> r> swap rot >r >r ; ] ==
r[ ... ] r> r> swap rot >r >r ; ==
r> r[ ... ] rot >r >r ; ==
r> swap >r r[ ... ] >r ; == r> swap >r
3. The return stack is left in the original state
up to the moment that the return address is passed
to EXIT inside z. You may either not inline such
definition or transform it into an equivalent
single-exit form. (And the transformed code will
fall into category 1.)
Example.
: MAX 2dup > if drop exit then nip ;
To inline MAX, you first need to transform it into
an equivalent of
: max 2dup > if drop else nip then ;
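The two forms can be checked against each other (MAX1 and MAX2 are
hypothetical names, used to avoid clashing with a system MAX):

: max1  2dup > if drop exit then nip ;
: max2  2dup > if drop else nip then ;
\ 3 5 max1 .  and  3 5 max2 .  both print 5
\ 5 3 max1 .  and  5 3 max2 .  both print 5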
4. All other cases. That is, the return address
is not what is passed to EXIT. Or it is passed,
but not to EXIT (and EXIT may never receive control).
There is a control transfer. You may generate optimized
code even for this case, but you will have to do code
transformations. Note that in many cases it is not
possible to completely eliminate the call (for example,
you may eliminate control transfer, but will still
have to push an address onto the return stack).
In this text, I recommend not to inline such
definitions. (Because else I would have to
give an exhaustive explanation of how to
generate control transfers for such stuff.)
Example.
: branch r> @ >r ;
You may either leave this definition as it is (that
is, call it rather than inline), or transform it into
a control transfer.
For subroutine-threaded code, the result might be
something like
jmp [address]
address: dw destination
Example.
: give> r> ;
I think now you see why I am not willing to give
recommendations for the general case.
Better leave this definition as it is.
The generated native code might look as:
\ give> bar baz qux
spush address
exit
address:
bar
baz
qux
Example.
: enter >r ;
The easiest thing is to leave it as it is.
If you nevertheless want to generate optimized code,
you will have to generate an equivalent of a call
(the 'nest' action), and then transfer control to the
address at the stack top.
\ foo enter bar
foo
rpush address
spop R1
jump R1
address:
bar
Note that in the last two examples you can generate
inline code, but you cannot completely eliminate
the call: the RPUSH ADDRESS command is a piece
of that call.
Are definitions that use return address manipulations
worth generating optimized code for? Such words likely
implement control structures, and control structures
tend to be executed in loops. On the other hand, in
many cases custom control structures are not needed.
(At least, until we begin to implement custom control
structures for parallel execution on multi-core
processors.)
Anyway, whether or not you choose to generate optimized
code, the return address manipulations _do_ deserve
generation of _correct_ code.
I do not use Forth at work anymore.
SP-Forth has a backtracking package.
The MPE VFX compiler correctly supports return address
manipulations, thus proving in practice that it is possible
to correctly support them in optimizing compilers.
In general, backtracking uses the principle
"call the continuation to transfer control forwards,
a return from the continuation will transfer the control back".
After using this principle once, we have backtrackable code,
the L stack plays the role of the R stack, and CONT EXIT is
used instead of EXIT . Instead of NEST we have NEST then PRO.
So, for backtrackable code there is a different call and
a different exit.
And the number of states at the stack diagram doubles:
( before --> after-success )
( after-failure <-- on-return )
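For readers without the library at hand, the convention might be
sketched like this (hedged: the word names and the exact PRO / CONT
semantics are assumed to follow the bac4th package, where PRO captures
the continuation, CONT calls it, and a plain return from the
continuation means "backtrack, try the next alternative"):

: 1-2-3 ( -- n )  pro  1 cont drop  2 cont drop  3 cont drop ;
: .all  1-2-3 . ;
\ each success runs . and then backtracks into 1-2-3 ,
\ so .all would print 1 2 3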
What happens if a backtrackable word calls _its_
continuation using the convention for backtrackable code?
We get backtracking^2, and introduce one more stack
for continuations^2. And each backtrackable^2 word
is described with 8 stack states.
Surprisingly, programming with backtrackable^2 words
described with 8 stack states is no more difficult
than programming with backtrackable^1 words of 4 stack
states. This is because you follow the "at most two different
non-empty stack states" principle.
Can we convert success^1 to success^2 and vice versa?
I had control structures that did that. They used copying
of return stack areas and required -NOCUT to store relative
addresses on the return stack.
start pro raise: generator -raise b_consumer cont emerge
was equivalent to
among generator every b_consumer iterate
(in fact, raise was implemented via among), and
reduce: b^2_word -reduce
had no such analog.
Hypothesis 1
backtracking^3 will not give us more expressive power
than backtracking^2
Hypothesis 1a
(same as above), provided that we use the "at most
two non-empty states" principle.
Hypothesis 2
raise and reduce allow us to do everything that
may be done with raise^3 and reduce^3.
I do not know how to prove or refute any of these,
and even what "same expressive power" would
formally mean in that context.
It is easy to define the analogs of PRO and CONT for
backtracking^2 or backtracking^3. But I did not
implement AMONG^2.
In general, I have no formalism for AMONG (it CMOVEs
areas of the return stack, and the formalism is based
on functional equivalence, and there is no functional
equivalent for AMONG).
Sounds like your maintenance programmer is alone rather than
within a team, and was hired long after the last guy familiar
with the code left the company.
In such a situation, any project will be a problem.
By the way, if your guys wrote a multi-threading application,
and you fired them, and then hired a maintenance programmer
(even if it's one who claimed to know multi-threading),
you will have a similar problem.
In fact, you will have that problem even if your guys
used a 3-rd party library and you fired them. The maintenance
programmer comes and sees that the code tree does not compile
because some component is missing...
Tell me why your maintenance programmer would need to modify
the custom control structures in particular. The code works; if it did
not work, the project would not have been released in the
first place. Return address manipulations can be looked up on
Google or asked about in comp.lang.forth.
Your maintenance programmer has a lot of components he
knows nothing about, and will never know. He nevertheless
will manage to maintain your code. He does not need to
know how everything works to do it. This is called
professionalism.
In fact Google is helpless at finding Forth code.
A simpler search engine works better.
For example, for the request
": among" rdrop
yandex.ru finds the SP-Forth library
http://www.forth.org.ru/~profit/lib/bac4th.f
and SPIIRAS proceedings
http://www.proceedings.spiiras.nw.ru/data/src/2002/01/01/spyproc-2002-01-01-15.pdf
and, if asked for "more", reports 3 more results from forth.org.ru
It doesn't sound like that to me. Whether you're in a team or not,
you still have to learn new idioms.
Andrew.
All successful projects are maintained: features are added, bugs are
fixed.
> Your maintenance programmer has a lot of components he knows nothing
> about, and will never know. He nevertheless will manage to maintain
> your code. He does not need to know how everything works to do
> it. This is called professionalism.
Of course. As I said, it's a cost/benefit analysis. Obfuscatory
techniques increase the cost of maintenance, and should usually be
avoided. Professional programmers seek to reduce maintenance costs.
Andrew.