Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

(long) A denotation for Forth lambda's ("quotations")

175 views
Skip to first unread message

Albert van der Horst

unread,
Jan 15, 2018, 12:50:18 PM1/15/18
to
This article is available also at
http://home.hccnet.nl/a.w.m.van.der.horst/forthlecture1.html
in a possibly improved form.

DENOTATIONS
In Forth we add new data types. They must be accommodated by notations
that result in a constant object. This can always be accomplished by a
compiling word that inspect the next word in the source:

HEX: 4ABD12
S" /tmp/q.txt"

The basic idea is to add constant objects, sometimes called literals
or literal data by defining prefix words that parse the remainder of
the data, like so

0x4ABD12
"/tmp/q.txt"

Now 0x is a prefix and 0x4ABD12 is a denotation. Likewise for the string.
Note that the only difference with the previous solution is to leave
out the blank space following HEX: and ". This is easy to implement,
see my Forth implementation. Instead of wanting an exact match with
the word in the dictionary, a literal is matched if its first part
matches a prefix.
To examine an example (strings) in ciforth

WANT SEE
SEE "

Once this invention is made, regular numbers can be parsed using this
mechanism by defining 0..9 in the dictionary for a net simplification
of the Forth as a whole. Strings starting with a blank was a constant
source of confusion with the Forth word S". Now it is just as you
expect, and no different from other computer languages.

QUOTATIONS
Every sequence of symbols that results in a compile time constant can
be considered a denotation. The Forth standard introduces a ":NONAME"
definition. It is a sequence of instructions, starting with :NONAME
and ending with a semicolon ; . It leaves behind an execution token,
which is indeed a compile time constant. Initially in ciforth a noname
definition had a full header, a dea, and there was no separate concept
of an execution token. This is the, somewhat appalling, definition of
:NONAME in ciforth (leaving out technicalities).

: :NONAME ": NONAME" EVALUATE LATEST HIDDEN LATEST ;

With an example

:NONAME 2 3 * ;

In order to execute a colon definition in an indirect threaded
implementation like ciforth, two properties of the word are needed,
the executable code (the machine address to jump to) and the address
of the high level code to be interpreted. So they cannot be a single
cell. The executable code is the same for all colon definitions and is
called docol . An execution token hence is a pointer to a structure
with at least 2 fields.
The name token hints that the header must be maximally
thinned out, to leave only those two fields. Creating these fields
"manually" we get for the above example, using all carnal knowledge of
ciforth that is needed:

HERE docol , HERE CELL+ , ] 2 3 * [ '(;) ,

Note that we use the denotation for dea's here and do not bother to
discriminate between , and COMPILE, . It is possible to make this into
a denotation, but the ; can no longer be used to end it. Choosing {
and } , which is quite a common notation for code sequences, we get

: { HERE docol , HERE CELL+ , ] ;
: } '(;) , POSTPONE [ ; IMMEDIATE

This results in a fresh, gay notation for the denotation. 1)

{ 2 3 * } EXECUTE .
6 OK

TWO STATES
Forth can be in a compile state and in an interpretation state. In
interpretation state the constant generated by the denotation resides
on the stack. Where is it in compilation state? We only have to look
at the traditional solution for numbers for an indirect threaded code,
to answer that. It is build in into the code. This is possible by
virtue of it being constant. The sequence of instructions is broken
and the data is put in the middle of the code. Remember that the high
level code is in fact data, a sequence of execution tokens. The
execution token in front of the data takes care that the interpreter
pointer is moved past the data and that the data ends up on the stack,
such that the compiled behaviour is the same than the execution
behaviour. There may be slight complications for native code, but they
are not important for the insight we're after. For an integer this
done by LIT . Handling strings need not be much different from
numbers. There is a difference in that the data has a variable length.
There is however also a difference in tradition. Strings were not
considered as data to put on the stack, they were just used on the go.
For example, ." would get a character from the input stream, put it on
the screen, then get another character until a double quote is
encountered. The defining word : would get a word ready to incorporate
it into the dictionary moving it to PAD . It wasn't until files came
into fashion with names that could be anything that the word S" can be
considered a string denotation. Its implementation triggers a lot of
debate about so called STATE-smartness.
A decent method to handle strings is by a denotation prefix " that
puts a string on the stack and in compilation mode takes care of
having the string in line within a definition. As long as the input
buffer is valid, a string constant (address length pair) can represent
the string. In interpretation mode this string is used immediately. A
somewhat involved equivalent of LIT is needed in compilation mode. A
possible implementation involves a branch over the area where the
string resides, followed by two LIT 's. An added advantage is that if
we always put the string in the dictionary we no longer have to worry
about the life time of strings in interpretation mode
The whole STATE-smartness debate is cut short by using denotations for
strings instead of special manipulations and simply forbidding any use
of constant/literals outside of the denotation mechanisms.
(In particular they must not be POSTPONE -ed. I hesitate to mention it
because the word POSTPONE should never have come into existence. It is
there such that someone who doesn't know whether a word is IMMEDIATE
or not can extend the compiler. However such a person has no business
trying to extend the compiler. )

THE BOTTOM LINE
By adding the technique for inlining a string to the noname
definition, we get the "quotations" , anonymous pieces of code
identified by a compile time constant. Seen the way we have developed
quotations, questions like "can a quotation that happens to be in the
middle of a definition use local values of that definition" have a
clear answer. Of course not. That would be in violation of the
constantness of denotations. (This is separate from the question
whether it is desirable to have local values. The TO mechanism parses
a string, and that is something we want to get away from, as explained
above.)

OBJECTIONS
I admit that locals can be handy. The price however is high. It blocks
the Forth language from evolving in a sensible direction. So local
adherents will object to my approach, but we'll see. There will be a
sequel.

IMPLEMENTATION NOTES
The main problem with nested compilation is that two definitions are
competing to be added to the dictionary. If one of them is
half-finished and the other is started, there is a conflict. Since
long ciforth has the word NESTED-COMPILE that handles this, and it is
not terribly involved. The situation with quotations is much easier,
because quotations are not linked into the dictionary. Here follows
the full implementation of above quotations in ciforth , which is --
it must be admitted -- a simplistic Forth.

' TASK @ CONSTANT docol

: { 'SKIP , (FORWARD HERE STATE @ docol , HERE CELL+ , ] ; IMMEDIATE
: } '(;) , STATE ! >R FORWARD) R> POSTPONE LITERAL ; IMMEDIATE

The addition compared to :NONAME is jumping over the code, and LITERAL
that make it a denotation. The word SKIP is a simple forward jump, the
words (FORWARD and FORWARD) handle the calculation and filling in of
jump distances. The compilation STATE is fetched and restored, not
manipulated via [ and ] . The addition of a SKIP could be conditional
upon STATE saving futile amounts of memory. That would destroy the
rule that a denotation builds a constant on the stack and leaves it to
LITERAL to decide whether to compile the constant, or leave it alone.
Now the fresh, gay notation works within a definition.

: test { 2 3 * } ;
test EXECUTE .
6 OK

Premature optimisation makes everything much more difficult. A
COMPILE, that tries to be clever messes everything up. There are other
and better ways to speed a program up, as I hope to demonstrate one
day.

1) In Dutch it sounds better
een frisse, vrolijke, schrijfwijze.

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

minf...@arcor.de

unread,
Jan 15, 2018, 4:47:31 PM1/15/18
to
Am Montag, 15. Januar 2018 18:50:18 UTC+1 schrieb Albert van der Horst:
> A COMPILE, that tries to be clever messes everything up. There are other
> and better ways to speed a program up, as I hope to demonstrate one
> day.

In another Forth I read the following definition:

( VARIABLE (POSTPONE) 0 (POSTPONE) ! )

: POSTPONE 1 (POSTPONE) ! ;

So POSTPONE is resolved later by a clever COMPILE, that evaluates the
state of the (POSTPONE) variable and resets it.

What is wrong with this simple method?

minf...@arcor.de

unread,
Jan 16, 2018, 1:51:07 AM1/16/18
to
Of course it is:
: POSTPONE 1 (POSTPONE) ! ; IMMEDIATE

JennyB

unread,
Jan 16, 2018, 3:28:10 AM1/16/18
to
In a classic interpreter loop, only non-immediate words ever see COMPILE, so IMMEDIATE words would never be POSTPONEd. It gets even messier from then on.

minf...@arcor.de

unread,
Jan 16, 2018, 4:31:05 AM1/16/18
to
No, it works also with immediate works. The idea is to reduce complexity in
the interpreter loop for interpretation and compilation by
- giving tokens more information than classic xts
- Albert's proposed denotations
- making COMPILE, smart to cope with it (a simple comma wouldn't do it anymore)

POSTPONE is a very handy word and should work as today.

Albert van der Horst

unread,
Jan 16, 2018, 7:06:28 AM1/16/18
to
In article <71b435d0-1c19-41b3...@googlegroups.com>,
There is nothing wrong, per se.

The better way to speed things up is a compiler that understands
the last bit of the code generated and uses that knowledge.
Every special case it has to know about gets in the way, while the end
results is the same.

minf...@arcor.de

unread,
Jan 16, 2018, 7:52:58 AM1/16/18
to
Am Dienstag, 16. Januar 2018 13:06:28 UTC+1 schrieb Albert van der Horst:
> In article <71b435d0-1c19-41b3...@googlegroups.com>,
> <minf...@arcor.de> wrote:
> >Am Montag, 15. Januar 2018 18:50:18 UTC+1 schrieb Albert van der Horst:
> >> A COMPILE, that tries to be clever messes everything up. There are other
> >> and better ways to speed a program up, as I hope to demonstrate one
> >> day.
> >
> >In another Forth I read the following definition:
> >
> >( VARIABLE (POSTPONE) 0 (POSTPONE) ! )
> >
> >: POSTPONE 1 (POSTPONE) ! ;
> >
> >So POSTPONE is resolved later by a clever COMPILE, that evaluates the
> >state of the (POSTPONE) variable and resets it.
> >
> >What is wrong with this simple method?
>
> There is nothing wrong, per se.
>
> The better way to speed things up is a compiler that understands
> the last bit of the code generated and uses that knowledge.
> Every special case it has to know about gets in the way, while the end
> results is the same.

That would leave the 1-pass compiler in the end.

Anton Ertl

unread,
Jan 16, 2018, 10:08:54 AM1/16/18
to
minf...@arcor.de writes:
>Am Dienstag, 16. Januar 2018 09:28:10 UTC+1 schrieb JennyB:
>> On Tuesday, 16 January 2018 06:51:07 UTC, minf...@arcor.de wrote:
>> > Am Montag, 15. Januar 2018 22:47:31 UTC+1 schrieb minf...@arcor.de:
>> > > Am Montag, 15. Januar 2018 18:50:18 UTC+1 schrieb Albert van der Horst:
>> > > > A COMPILE, that tries to be clever messes everything up. There are other
>> > > > and better ways to speed a program up, as I hope to demonstrate one
>> > > > day.
>> > >
>> > > In another Forth I read the following definition:
>> > >
>> > > ( VARIABLE (POSTPONE) 0 (POSTPONE) ! )
>> > >
>> > > : POSTPONE 1 (POSTPONE) ! ;
>> > >
>> > > So POSTPONE is resolved later by a clever COMPILE, that evaluates the
>> > > state of the (POSTPONE) variable and resets it.
>> > >
>> > > What is wrong with this simple method?

Lots. E.g.:

: my-( [char] ) parse 2drop ; immediate
: another-( 1 postpone my-( drop ; immediate
: foo 2 another-( bla bla ) . ;
foo \ prints 2

\ next example
: twice dup compile, compile, ;
: my-2dup ['] over twice ; immediate
: foo postpone my-2dup ; immediate
: bar foo ;
see bar

\ next example
: postpone-twice
postpone postpone
postpone postpone ; immediate
: foo postpone-twice over over ; immediate
: bar foo ;
see bar

Like STATE-smartness, (POSTPONE)-smartness is an idea that works in
some simple cases and fails in others; and for the same reason.

>> > Of course it is:
>> > : POSTPONE 1 (POSTPONE) ! ; IMMEDIATE
>>
>> In a classic interpreter loop, only non-immediate words ever see COMPILE, so IMMEDIATE words would never be POSTPONEd. It gets even messier from then on.
>
>No, it works also with immediate works.

How?

>The idea is to reduce complexity in
>the interpreter loop for interpretation and compilation by

Any data to support this claim that this reduces the complexity?

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2017: http://euro.theforth.net/

minf...@arcor.de

unread,
Jan 16, 2018, 10:36:05 AM1/16/18
to
Thanks for looking into this. But such examples in a standard Forth will
not fit because they use a standard COMPILE, implementation.

A "novel" COMPILE, would have to be (POSTPONE)-smart of course.


>
> >> > Of course it is:
> >> > : POSTPONE 1 (POSTPONE) ! ; IMMEDIATE
> >>
> >> In a classic interpreter loop, only non-immediate words ever see COMPILE, so IMMEDIATE words would never be POSTPONEd. It gets even messier from then on.
> >
> >No, it works also with immediate words.
>
> How?

Here COMPILE, needs an extended xt that includes the compilation semantics.
(I use the word compilation token ct for it)

Immediacy can be detected either by flags or equality of xt and ct.

>
> >The idea is to reduce complexity in
> >the interpreter loop for interpretation and compilation by
>
> Any data to support this claim that this reduces the complexity?

I can't speak for the other Forth where I found this POSTPONE or for Albert,
but I'd assume that overall complexity is not reduced, only shifted to other words.


Anton Ertl

unread,
Jan 16, 2018, 11:31:11 AM1/16/18
to
You mean an implementation of the standard "COMPILE,".

>A "novel" COMPILE, would have to be (POSTPONE)-smart of course.

If it's not the standard "COMPILE,", why call it "COMPILE,"?

Also note that the last example does not contain any "COMPILE,", so
it's not just "COMPILE," that is broken (aka "novel") by this "simple
method".

>> >> > Of course it is:
>> >> > : POSTPONE 1 (POSTPONE) ! ; IMMEDIATE
>> >>
>> >> In a classic interpreter loop, only non-immediate words ever see COMPILE, so IMMEDIATE words would never be POSTPONEd. It gets even messier from then on.
>> >
>> >No, it works also with immediate words.
>>
>> How?
>
>Here COMPILE, needs an extended xt that includes the compilation semantics.
>(I use the word compilation token ct for it)
>
>Immediacy can be detected either by flags or equality of xt and ct.

If you have compilation semantics, you don't need to detect immediacy.
POSTPONE simply becomes

: postpone find-name name>compile swap postpone literal compile, ;

And no need for (POSTPONE)-smartness, "novel" COMPILE, or other
atrocities.

>> >The idea is to reduce complexity in
>> >the interpreter loop for interpretation and compilation by
>>
>> Any data to support this claim that this reduces the complexity?
>
>I can't speak for the other Forth where I found this POSTPONE or for Albert,
>but I'd assume that overall complexity is not reduced, only shifted to other words.

So we get broken COMPILE, and other breakage without any benefit.
That should be enough of an answer to the "What's wrong" question.

minf...@arcor.de

unread,
Jan 16, 2018, 2:59:02 PM1/16/18
to
You seem to simmer too much in your own pot. ;-)
Neither FIND-NAME nor NAME>COMPILE are standard words or in common usage afaik.
So your definition of POSTPONE could mean anything, or nothing.

Anton Ertl

unread,
Jan 17, 2018, 6:22:20 AM1/17/18
to
NAME>COMPILE is a standard word. FIND-NAME is the obvious missing
link, present in several systems, but not essential for the point I
was making. But if you have trouble understanding what it means, here
is its documentation in Gforth:

'find-name' c-addr u - nt | 0 gforth "find-name"
Find the name c-addr u in the current search order. Return its nt,
if found, otherwise 0.

>So your definition of POSTPONE could mean anything, or nothing.

It's just an implementation of the standard word POSTPONE. No need to
play dumb.
0 new messages