A lisp compiler in Forth, some questions

Albert van der Horst

unread,

Oct 24, 2016, 11:07:06 AM10/24/16

to

For list please read expression or whatever the correct wording
would be.

There is the famous READ-EVALUATE-PRINT loop.
In implementing a lisp, I've thought to leave EVALUATE
out at first.
Is that feasible?
That means : does READ leave a Lisp list that can be
PRINTed as a test?

Or does EVALUATE in fact just transform a list into an other list
in the same internal representation?

Based on McCarthy's original microlisp document, I've implemented
the elementary commands up till lambda. It took about a day.
It seems that document is much better than many books that
are written about lisp.
https://cse.sc.edu/~mgv/csce330f13/micromanualLISP.pdf
My Forth interpreter can interpret
( A B C )
and leaves a list represented by a handle than can be printed
by .lisp (a Forth word) giving
(A B C ) OK
(I do not screen off Forth commands just yet).
(Under the hood it creates new atoms, but uses those that already
exist. ).

I've discovered something. I can have a PLUS function and
a PLUS symbol that is bound to a number at the same time.
(DEFUN PLUS ( A B ) (+ A B ) )
(SETQ PLUS 117)
So LISP has (at least) two namespaces.
Why is that kept in the dark?

As a sequel I tried to do the following
(defun PLUS ( A B ) (+ A B ) )
(PLUS 12 3 )
15
That obviously works.
Now I tried to find an expression <e> such that
(<e> 12 3)
where <e> evaluates to PLUS, and then the total evaluates to 15.
Nothing worked.

Is there a fundamental reason that can't be done, or am
I just ignorant?

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Kaz Kylheku

unread,

Oct 24, 2016, 11:36:16 AM10/24/16

to

On 2016-10-24, Albert van der Horst <alb...@cherry.spenarnc.xs4all.nl> wrote:
> For list please read expression or whatever the correct wording
> would be.
>
> There is the famous READ-EVALUATE-PRINT loop.
> In implementing a lisp, I've thought to leave EVALUATE
> out at first.
> Is that feasible?
> That means : does READ leave a Lisp list that can be
> PRINTed as a test?

Of course; and you can always get at it by wrapping it in QUOTE
to suppress evaluation.

> Or does EVALUATE in fact just transform a list into an other list

It's called eval in pretty much every mainstream dialect of Lisp.

> in the same internal representation?

No; some compound expressions produce non-lists, and non-lists
are also expressions that can be input to eval.

In anything that can be called a real Lisp, the input to eval,
if it is a compound expression, is a list: the same stuff
that is produced by eval, if the expression is of a kind
that computes a list.

> Based on McCarthy's original microlisp document, I've implemented
> the elementary commands up till lambda. It took about a day.
> It seems that document is much better than many books that
> are written about lisp.
>

> I've discovered something. I can have a PLUS function and
> a PLUS symbol that is bound to a number at the same time.
> (DEFUN PLUS ( A B ) (+ A B ) )
> (SETQ PLUS 117)
> So LISP has (at least) two namespaces.
> Why is that kept in the dark?

Could be due to a prerence for tiny, six-decade-old documents over
books.

> As a sequel I tried to do the following
> (defun PLUS ( A B ) (+ A B ) )
> (PLUS 12 3 )
> 15
> That obviously works.
> Now I tried to find an expression <e> such that
> (<e> 12 3)
> where <e> evaluates to PLUS, and then the total evaluates to 15.
> Nothing worked.

It's your code!

Does your interpreter evaluate an expression occurring
in the position indicated by <e>?

A traditional two-namespace Lisp (a "Lisp-2" dialect) won't evaluate
<e>. If yours works like that, you're wasting time trying various
expressions there; none will be evaluated.

> Is there a fundamental reason that can't be done, or am
> I just ignorant?

You must implement a function traditionally called FUNCALL, so
you can then do:

(FUNCALL <e> 12 3)

FUNCALL should typically accept, as its first argument, either a
function object (to which it just applies the arguments directly),
or else a symbol (which it resolves to a function object by a lookup
in the global function binding environment, and then applies the
arguments).

Notable "Lisp-1" dialects in which funcall isn't necessary are Scheme
and EuLisp.

ANSI Common Lisp is a Lisp-2 with funcall.

In Common Lisp, by the way, you can still do this:

((lambda (...) ...) arg1 arg2 ...)

A lambda expression in the first position of a compound form is
recognized as denoting a function. This is because it serves
as a function name.

It works in other argument positions also, but only thanks
to a macro called lambda, which transforms:

(lambda (...) ...) -> (function (lambda (...) ...))

The special operator FUNCTION recognizes a lambda expression
as a function name and produces the object that it denote.

In Lisp-1 dialects things are a bit simpler: lambda is usually just an
operator which is evaluated, and calculates a function as it result.

Paul Rubin

unread,

Oct 24, 2016, 12:50:01 PM10/24/16

to

alb...@cherry.spenarnc.xs4all.nl (Albert van der Horst) writes:
> My Forth interpreter can interpret
> ( A B C )
> and leaves a list represented by a handle than can be printed
> by .lisp (a Forth word) giving
> (A B C ) OK

Kaz answered your other questions pretty well, but one further comment:
when possible (which is not always), PRINT should emit stuff that READ
can read back in, so you can paste the output back to the input. That
means if your reader treats ( as a Forth word that must be followed by a
space, then .lisp should print the space as well.

paul wallich

unread,

Oct 24, 2016, 1:42:50 PM10/24/16

to

On 10/24/16 11:16 AM, Albert van der Horst wrote:
> For list please read expression or whatever the correct wording
> would be.
>
> There is the famous READ-EVALUATE-PRINT loop.
> In implementing a lisp, I've thought to leave EVALUATE
> out at first.
> Is that feasible?
> That means : does READ leave a Lisp list that can be
> PRINTed as a test?
>
> Or does EVALUATE in fact just transform a list into an other list
> in the same internal representation?

EVAL does (approximately) two things. One of them is to take an
s-expression (not necessarily a list -- atoms are the most obvious
exception) and return another s-expression that PRINT can print out. The
other is to implement side-effects, if any, that are part of the
s-expression.

That "if any" is important to remember, because in some languages side
effects are implicit. For example, if the value of a is set to 1,
evaluating (+ a 1) will return 2, but the value of a will still be 1.
(Evaluation of (incf a) would return 2, and the value of a would then be
2, because incf is defined to have a side-effect.) And no, whatever is
returned by evaluating an s-expression has no necessary relation to what
happens to anything touched by the s-expression's side effects.

paul

ps I know I'm playing fast and loose with "value" and "return", but I
think that disquisition would make things unnecessarily complex here.

Spiros Bousbouras

unread,

Oct 24, 2016, 1:58:31 PM10/24/16

to

On Mon, 24 Oct 2016 17:16:17 +0200 (CEST)
alb...@cherry.spenarnc.xs4all.nl (Albert van der Horst) wrote:
[...]

> My Forth interpreter can interpret
> ( A B C )
> and leaves a list represented by a handle than can be printed
> by .lisp (a Forth word) giving
> (A B C ) OK
> (I do not screen off Forth commands just yet).
> (Under the hood it creates new atoms, but uses those that already
> exist. ).
>
> I've discovered something. I can have a PLUS function and
> a PLUS symbol that is bound to a number at the same time.
> (DEFUN PLUS ( A B ) (+ A B ) )
> (SETQ PLUS 117)
> So LISP has (at least) two namespaces.
> Why is that kept in the dark?

Depends on the dialect of Lisp ; Common Lisp has two namespaces (in the
context you mention ; overall it has a lot more) but Scheme has one. And it's
not kept in the dark at all , everyone who spends enough time in the Lisp
world learns eventually of this distinction and that it's one of the eternal
flamewars between Common Lisp fans and Scheme fans where each side feel that
the choice of their preferred dialect is the better one.

> As a sequel I tried to do the following
> (defun PLUS ( A B ) (+ A B ) )
> (PLUS 12 3 )
> 15
> That obviously works.
> Now I tried to find an expression <e> such that
> (<e> 12 3)
> where <e> evaluates to PLUS, and then the total evaluates to 15.
> Nothing worked.
>
> Is there a fundamental reason that can't be done, or am
> I just ignorant?

It's a matter of design. The fact that the most straightforward approach ,
namely
(let ((plus-var (function PLUS)))
(plus-var 12 3))

won't work , is a consequence of the two namespaces (or Lisp-2 as it is
usually called) design choice. For more specific information on how
Common Lisp handles things see
http://franz.com/support/documentation/8.2/ansicl/subsubsu/consesas.htm .
So you need to add some construct to act as a substitute. For the constructs
Common Lisp supplies see the post by Kaz Kylheku and also
http://franz.com/support/documentation/8.2/ansicl/dictentr/apply.htm .

By the way , if you're going to be consulting the Common Lisp Hyperspecification
(CLHS) a lot , you might find it more convenient to download a copy. You can get
one from http://www.lispworks.com/documentation/common-lisp.html .

--
Everything ends up a hack. Poor code starts off as a hack. Good code
attracts so much demand for new features that it ends up a hack.
Mike Shepherd
http://forums.theregister.co.uk/forum/1/2016/09/26/openssl_patches_last_weeks_patch/

Kaz Kylheku

unread,

Oct 24, 2016, 2:11:50 PM10/24/16

to

This principle is called "read-print consistency" and is important
in Lisp: it means that an object is printed in a way such that
a similar object is produced when the notation is read back.

If the print function produces something which doesn't satisfy
read-print consistency, the notation should be such that an error
is produced if it is read, rather than some ambiguity (a
misinterpretation of the notation as a different kind of object,
not similar to the original).

The traditional way to do this is to have unreadable notations
begin with hash angle-bracket: #<. For instance
#<compiled-function #x49CDBEF0> might be such a representation.

read-print consistency is a tricky topic. In ANSI Lisp, you don't
have full read-print consistency by default even if all objects
you're printing are printable.

This is because shared substructure and circular structure is not
rendered unless you enable the circle notation with *print-circle*.
Without *print-circle*, printing a cyclic object will put the
implementation into a loop or runaway recursion (which counts as a
failure to produce a readable notation) and shared substructures (other
than via interned symbols) are not captured in the notation.

So for instance if we do this:

> (let ((g (gensym))) (list g g g))
(#:G3210 #:G3210 #:G3210)

what we have is a list of three elements, in which each of the
elements is a repetition of the same object. Now the notation
which is output is certainly readable, but when we fed it to the reader,
we get a list which contains three different objects. Each instance of
the #:G3210 printed syntax produces a new symbol object.

If we turn on circular printing, we get the full-blown read-print
consistency:

> (setf *print-circle* t)
T
> (let ((g (gensym))) (list g g g))
(#1=#:G3210 #1# #1#)

The first object in the list is labeled as 1 using #1=. Then the
other two occurrences just repeat that object by referring to its
label by #1#.

This notation is, of course, readable; it's not output-only.

As you can see, Lisp takes object printing seriously; and it's a fairly
nuanced topic.

Albert van der Horst

unread,

Oct 24, 2016, 2:34:30 PM10/24/16

to

In article <87twc1n...@nightsong.com>,

Thanks for the remark. In the business of looking at Pascal and Basic,
I already have a pretty good tokenizer for Pascal. The implementation
of lisp in Pascal of Lispkit served as a testcase.
Lispkit's secpd.pas can be tokenized fully, there are no places where
it misinterprets a Pasal token. This involves separating A<B , A=B
A<=B , bracketed expressions etc. Also it distinguises normal identifiers
from Pascal keywords.
The mechanism is to replace PARSE-NAME by TOKEN, that doesn't stop at a
blank but at any of a set of delimiters. The PREFIX mechanism does
the rest.

If we would impose the Forth requirement of blank separation,
a Lisper wouldn't probably even care to look at the result.
This may be the next thing to add, because at least I know how
to do that. Also it may make testing easier.