How do the backquote and the comma really work?

Marcin Borkowski

Jun 25, 2015, 1:09:31 PM

to Help Gnu Emacs mailing list

Hi all,

I decided that the time has come that I finally approach the scary
backquote-comma duo. (While I understand it superficially, I’d like to
get it right and thoroughly this time.) So my question is whether my
mental model (see below) is correct.

So, I assume that when the Emacs Lisp interpreter encounters a backquote, it
looks at the expression after it. If it is anything but a list, it just
works like the usual quote, and the backquoted expression evaluates to
what was backquoted.

If it is a list, its elements are read and scanned. If any part of the
list (probably a nested one) begins with a comma, the whole thing after
the comma (be it a symbol, a list or whatever) is evaluated as usual,
and the result is put into the resulting list.

Whew. Is that (more or less) right? (I am aware that I didn’t take
into account the splicing operator, but it doesn’t introduce a lot of
additional complexity.) Of course, when writing it, I realized that my
natural-language description is not extremely precise, so a bonus
question is: can I find an Emacs Lisp metacircular evaluator (taking
into account the quoting mechanisms) anywhere?

And I know that I risk starting another thread lasting for dozens of
messages;-) – but I /do/ want to understand this stuff... In fact, in
the spirit of another recent discussion, I want to write a simple code
analyzer, finding one-legged ‘if’s and suggesting replacing them with
‘when’s or ‘unless’es. This is trivial unless (pun intended) you want
to take (back)quotes into consideration.

Best regards,

--
Marcin Borkowski This email was proudly sent
http://mbork.pl from my Emacs.

Michael Heerdegen

Jun 25, 2015, 1:33:39 PM

to help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> So, I assume that when Emacs Lisp interpreter encounters a backquote

It's even less mystical: backquote is just a normal macro:

C-h f ` RET

It's also a reader macro so that you can write

`thing

as an abbreviation of

(` thing)

but that's just a detail.

> If it is a list, its element are read and scanned. If any part of the
> list (probably a nested one) begins with a comma, the whole thing after
> the comma (be it a symbol, a list or whatever) is evaluated as usual,
> and the result is put into the resulting list.
>
> Whew. Is that (more or less) right?

Seems to be a reasonable mental model. Of course, the elements have
already been read by the reader. Whether these are evaluated or not
depends on whether the macro finds the `backquote-unquote-symbol' in
front of them, so to say.

> so a bonus question is: can I find an Emacs Lisp metacircular
> evaluator (taking into account the quoting mechanisms) anywhere?

You don't need a meta thing, since backquote is completely implemented
in Elisp, just read the source code ;-)

Regards,

Michael.

Marcin Borkowski

Jun 25, 2015, 2:07:02 PM

to help-gn...@gnu.org

On 2015-06-25, at 19:33, Michael Heerdegen <michael_...@web.de> wrote:

> Marcin Borkowski <mb...@mbork.pl> writes:
>
>> So, I assume that when Emacs Lisp interpreter encounters a backquote
>
> It's even less mystical: backquote is just a normal macro:
>
> C-h f ` RET

Thanks. OTOH, backquote.el has more than 200 lines of code, and it is
a bit complicated (I would guess that it might contain some
optimizations/error checking/whatever). Seeing a simplistic (though
working in typical/correct cases) version might be rather illuminating,
no?

> It's also a reader macro so that you can write
>
> `thing
>
> as an abbreviation of
>
> (` thing)
>
> but that's just a detail.

Interesting. Where is that defined?

>> If it is a list, its element are read and scanned. If any part of the
>> list (probably a nested one) begins with a comma, the whole thing after
>> the comma (be it a symbol, a list or whatever) is evaluated as usual,
>> and the result is put into the resulting list.
>>
>> Whew. Is that (more or less) right?
>
> Seems to be a reasonable mental model. Of course, the elements have
> already been read by the reader. Whether these are evaluated or not
> depends on whether the macro finds the `backquote-unquote-symbol' in
> front of them, so to say.
>
>> so a bonus question is: can I find an Emacs Lisp metacircular
>> evaluator (taking into account the quoting mechanisms) anywhere?
>
> You don't need a meta thing, since backquote is completely implemented
> in Elisp, just read the source code ;-)

See above: it is pretty much complicated compared to entry-level stuff
in SICP. (Though I guess that I will just put my shoulder to the wheel
and study the code. That might lead to more questions;-).)

> Regards,
>
> Michael.

Thanks a lot,

--
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University

Drew Adams

Jun 25, 2015, 2:11:03 PM

to Marcin Borkowski, Help Gnu Emacs mailing list

> So, I assume that when Emacs Lisp interpreter encounters a

> backquote, it looks at the expression after it. If it is anything
> but a list, it just works like the usual quote, and the backquoted
> expression evaluates to what was backquoted.

Not really. A comma (& additional backquotes & additional commas...)
still tells the backquote preceding it to evaluate whatever sexp the
comma precedes.

So `,foo evaluates variable foo, and `',foo evaluates foo and quotes
the result.

(setq foo 'bar) ; => bar
(setq toto `,foo) ; => bar
(setq titi `',foo) ; => 'bar

(setq titi `',foo) is equivalent to (setq titi (list 'quote foo))

Michael H's advice about following the macroexpansion is good.
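
The three assignments above can be checked in one go. This is only a minimal sketch: it binds foo with let rather than setq so that no global state is left behind, and the function name sketch/quote-demo is made up for the illustration.

```elisp
;; Sketch: `,foo yields the value of foo; `',foo yields (quote VALUE).
(defun sketch/quote-demo ()
  "Return the values of `,foo, `',foo and (list 'quote foo) with foo = bar."
  (let ((foo 'bar))
    (list `,foo `',foo (list 'quote foo))))

;; A three-element list: bar, (quote bar), (quote bar).
(sketch/quote-demo)
```

The last two elements are equal, which is the equivalence stated above.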

Michael Heerdegen

Jun 25, 2015, 2:23:15 PM

to help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> Seeing a simplistic (though working in typical/correct cases) version
> might be rather illuminating, no?

Yes. Want to give it a try?

> > It's also a reader macro so that you can write
> >
> > `thing

> Interesting. Where is that defined?

Since you can't define reader macros via Emacs Lisp, it's hardcoded in
the C sources, "lread.c" AFAICT.

Michael.

Marcin Borkowski

Jun 25, 2015, 2:39:25 PM

to help-gn...@gnu.org

On 2015-06-25, at 20:22, Michael Heerdegen <michael_...@web.de> wrote:

> Marcin Borkowski <mb...@mbork.pl> writes:
>
>> Seeing a simplistic (though working in typical/correct cases) version
>> might be rather illuminating, no?
>
> Yes. Want to give it a try?

Sure. I'll get back here with some code (notice: it might take some
time, from a week to a few months - it's not the only thing I have to
do;-)) to discuss.

>> > It's also a reader macro so that you can write
>> >
>> > `thing
>
>> Interesting. Where is that defined?
>
> Since you can't define reader macros via Emacs Lisp, it's hardcoded in
> the C sources, "lread.c" AFAICT.

OK. I'll look into the manual on reader macros, I'm not fluent enough
in C to read that code.

> Michael.

Best,

Michael Heerdegen

Jun 25, 2015, 2:40:51 PM

to help-gn...@gnu.org

Drew Adams <drew....@oracle.com> writes:

> So `,foo evaluates variable foo, and `',foo evaluates foo and quotes
> the result.

Yes, that's an aspect that was missing in the mental model: backquote
works recursively, i.e. unquoting at deeper list levels is handled as
well.

> (setq titi `',foo) ; => 'bar

Whereby we don't want to conceal that the "'" is also a reader macro:
'thing -> (quote thing).

Here is the expression the reader generates for "`',foo":

(read "`',foo") ==> (\` (quote (\, foo)))

(be sure to eval with print-quoted nil, the default).
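
For anyone who would rather inspect the reader's output programmatically than look at its printed form, here is a small sketch. The function name sketch/read-backquote-demo is hypothetical; read, symbol-name and macrop are standard, and the sketch assumes an Emacs recent enough to have macrop.

```elisp
;; Sketch: look at what the reader hands over for "`',foo".
(defun sketch/read-backquote-demo ()
  "Return (FORM HEAD-NAME MACRO-P) for the form read from \"`',foo\"."
  (let ((form (read "`',foo")))
    (list form
          ;; The head of the form is an ordinary symbol whose name is "`".
          (symbol-name (car form))
          ;; That symbol is fbound to a macro, like any other macro name.
          (and (macrop (car form)) t))))

(sketch/read-backquote-demo)
```

Comparing the first element with equal avoids any dependence on how the result happens to be printed.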

Michael.

Marcin Borkowski

Jun 25, 2015, 2:44:31 PM

to help-gn...@gnu.org

>>> > It's also a reader macro so that you can write
>>> >
>>> > `thing
>>
>>> Interesting. Where is that defined?
>>
>> Since you can't define reader macros via Emacs Lisp, it's hardcoded in
>> the C sources, "lread.c" AFAICT.
>
> OK. I'll look into the manual on reader macros, I'm not fluent enough
> in C to read that code.

Wow. There seems to be nothing in either manual (the Emacs one and the
Elisp one) about reader macros. Shouldn't this be considered a bug in
the docs?

>> Michael.
>
> Best,

Marcin Borkowski

Jun 25, 2015, 2:46:37 PM

to Help Gnu Emacs mailing list

On 2015-06-25, at 20:10, Drew Adams <drew....@oracle.com> wrote:

>> So, I assume that when Emacs Lisp interpreter encounters a
>> backquote, it looks at the expression after it. If it is anything
>> but a list, it just works like the usual quote, and the backquoted
>> expression evaluates to what was backquoted.
>
> Not really. A comma (& additional backquotes & additional commas...)
> still tells the backquote preceding it to evaluate whatever sexp the
> comma precedes.
>

> So `,foo evaluates variable foo, and `',foo evaluates foo and quotes
> the result.
>

> (setq foo 'bar) ; => bar
> (setq toto `,foo) ; => bar

> (setq titi `',foo) ; => 'bar
>

> (setq titi `',foo) is equivalent to (setq titi (list 'quote foo))
>
> Michael H's advice about following the macroexpansion is good.

Thanks, I stand corrected!

Marcin Borkowski

Jun 25, 2015, 2:53:38 PM

to help-gn...@gnu.org

On 2015-06-25, at 20:40, Michael Heerdegen <michael_...@web.de> wrote:

> Drew Adams <drew....@oracle.com> writes:
>
>> So `,foo evaluates variable foo, and `',foo evaluates foo and quotes
>> the result.
>

> Yes, that's an aspect that was missing in the mental model: backquote
> works recursively, i.e. unquoting at deeper list levels is handled as
> well.
>

>> (setq titi `',foo) ; => 'bar
>

> Whereby we don't want to conceal that the "'" is also a reader macro:
> 'thing -> (quote thing).
>
> Here is the expression the reader generates for "`',foo":
>
> (read "`',foo") ==> (\` (quote (\, foo)))
>
> (be sure to eval with print-quoted nil, the default).

Interesting. I have print-quoted set to nil, however, M-:
(eval-expression, or in my case - icicle-pp-eval-expression) does not
show what you have here. Does eval-expression (or its Icicles
counterpart) mess with print-quoted?

> Michael.

Best,

Michael Heerdegen

Jun 25, 2015, 3:06:18 PM

to help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> Wow. There seems to be nothing in either manual (the Emacs one and the
> Elisp one) about reader macros.

Yip, because this can't be defined or controlled from Lisp at all, so
there is not much to describe. The few hardcoded reader macros that
exist are explained individually in the manual (though the term "reader
macro" may not even be used). Reader macros are common in Common Lisp;
in Emacs Lisp, the few that exist are more a special aspect of syntax,
not even worth naming specially.

Michael.

Michael Heerdegen

Jun 25, 2015, 3:40:18 PM

to help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> Interesting. I have print-quoted set to nil, however, M-:
> (eval-expression, or in my case - icicle-pp-eval-expression) does not
> show what you have here. Does eval-expression (or its Icicles
> counterpart) mess with print-quoted?

AFAICT the Icicles version uses the pp ("pretty print") library. pp
binds `print-quoted' unconditionally to t when printing.

Michael.

Drew Adams

Jun 25, 2015, 4:06:01 PM

to Michael Heerdegen, help-gn...@gnu.org

> > Interesting. I have print-quoted set to nil, however, M-:
> > (eval-expression, or in my case - icicle-pp-eval-expression) does
> > not show what you have here. Does eval-expression (or its Icicles
> > counterpart) mess with print-quoted?
>
> AFAICT the Icicles version uses the pp ("pretty print") library. pp
> binds `print-quoted' unconditionally to t when printing.

Right, on both counts.

Marcin, you can use `eval-expression' instead of `M-:' (i.e., even
with Icicles):

M-x eval-expression RET

Eval: (let ((print-quoted nil)) (read "`',foo")) RET
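
The effect of print-quoted can also be seen without any interactive command. A minimal sketch, with a hypothetical helper name (sketch/print-with-quoted), that prints the same form under both settings:

```elisp
;; Sketch: the same form, printed with print-quoted bound to nil or t.
(defun sketch/print-with-quoted (form flag)
  "Return the printed representation of FORM with `print-quoted' set to FLAG."
  (let ((print-quoted flag))
    (prin1-to-string form)))

(sketch/print-with-quoted ''foo nil)  ; => "(quote foo)"
(sketch/print-with-quoted ''foo t)    ; => "'foo"
```

Only the printed representation changes; the form itself is the same list in both cases.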

Marcin Borkowski

Jun 25, 2015, 4:19:03 PM

to Michael Heerdegen, help-gn...@gnu.org

On 2015-06-25, at 22:05, Drew Adams <drew....@oracle.com> wrote:

>> > Interesting. I have print-quoted set to nil, however, M-:
>> > (eval-expression, or in my case - icicle-pp-eval-expression) does
>> > not show what you have here. Does eval-expression (or its Icicles
>> > counterpart) mess with print-quoted?
>>
>> AFAICT the Icicles version uses the pp ("pretty print") library. pp
>> binds `print-quoted' unconditionally to t when printing.

That's a pity. Not Emacs-y way of doing things, I guess; IMHO, the
Emacs-y way would be to bind print-quoted to pp-default-print-quoted,
set to t by default;-).

> Marcin, you can use `eval-expression' instead of `M-:' (i.e., even
> with Icicles):
>
> M-x eval-expression RET
>
> Eval: (let ((print-quoted nil)) (read "`',foo")) RET
>
> (\` (quote (\, foo)))

Thanks. And sorry for being too lazy to check it myself.

Drew Adams

Jun 25, 2015, 4:37:46 PM

to Marcin Borkowski, Michael Heerdegen, help-gn...@gnu.org

> >> pp binds `print-quoted' unconditionally to t when printing.
>
> That's a pity. Not Emacs-y way of doing things, I guess; IMHO, the
> Emacs-y way would be to bind print-quoted to pp-default-print-
> quoted, set to t by default;-).

Just `pp-print-quoted' - and yes, agreed; such a variable could
have been provided and used.

This is, for example, why library `pp+.el' offers the following
options, as distinct from the ones that lack prefix `pp-':

`pp-eval-expression-print-length'
`pp-eval-expression-print-level'

And it is why it uses `pp-read-expression-map' instead of
`read-expression-map' (similar, but `pp-*' uses some Emacs-Lisp
key bindings). Evaluating with pretty printing is generally a
different use case from `eval-expression'.

But I didn't think to provide a variable `pp-print-quoted' (or
`pp-eval-expression-print-quoted'). It's not a common use case,
but anyway, now you know. `eval-expression' and
`pp-eval-expression' are just commands - nothing special.

(The real gotcha comes when people mistakenly think that using
them is tantamount to evaluating normally in all cases. These
commands do more than just evaluate, including reading the sexp
to evaluate and printing the result.)

Robert Thorpe

Jun 25, 2015, 7:55:35 PM

to Michael Heerdegen, help-gn...@gnu.org

Michael Heerdegen <michael_...@web.de> writes:

> Here is the expression the reader generates for "`',foo":
>
> (read "`',foo") ==> (\` (quote (\, foo)))
>
> (be sure to eval with print-quoted nil, the default).

In some cases the commands "macroexpand" and "macroexpand-all" can be
useful for finding out what complicated types of quoting do.

BR,
Robert Thorpe

Rusi

Jun 25, 2015, 9:41:50 PM

On Friday, June 26, 2015 at 5:25:35 AM UTC+5:30, Robert Thorpe wrote:

> Michael Heerdegen writes:
>
> > Here is the expression the reader generates for "`',foo":
> >
> > (read "`',foo") ==> (\` (quote (\, foo)))
> >
> > (be sure to eval with print-quoted nil, the default).
>
> In some cases the commands "macroexpand" and "macroexpand-all" can be
> useful for finding out what complicated types of quoting do.

I was poking around in the new (and very promising) use-package of
John Wiegley to figure out some (my) bugs in understanding use-package.
And so using macroexpand to see what exactly it is up to.
And seeing a lot of '...'s for deep nested structures

What's the recommended way for looking inside these?

to...@tuxteam.de

Jun 26, 2015, 3:31:55 AM

to Marcin Borkowski, Help Gnu Emacs mailing list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Jun 25, 2015 at 07:09:11PM +0200, Marcin Borkowski wrote:
> Hi all,
>
> I decided that the time has come that I finally approach the scary
> backquote-comma duo. (While I understand it superficially, I’d like to
> get it right and thoroughly this time.) So my question is whether my
> mental model (see below) is correct.
>

> So, I assume that when Emacs Lisp interpreter encounters a backquote, it
> looks at the expression after it. If it is anything but a list, it just
> works like the usual quote, and the backquoted expression evaluates to
> what was backquoted.
>

> If it is a list, its element are read and scanned. If any part of the
> list (probably a nested one) begins with a comma, the whole thing after
> the comma (be it a symbol, a list or whatever) is evaluated as usual,
> and the result is put into the resulting list.

It's *always* read as an S-expression (i.e. either a symbol, a string,
a couple more things, or a pair (thus, a list too)).

Thus something like `(bla bli would be an unfinished expression.

To put a slightly different slant than the other very good answers
on it, backquote is Lisp's take on the shell's, Perl's, Python's "variable
interpolation". In those languages it operates on strings, in Lisp it
operates on S-expressions. Where in Perl you might say:

my $amount=200;
my $currency="dollars";
print("You owe me $amount $currency\n");

=> You owe me 200 dollars

in Lisp you think in S-expressions. Somewhat equivalent would be

(setq amount 200)
(setq currency 'dollars) ; use a symbol, just for kicks
(print `(You owe me ,amount ,currency))

=> (You owe me 200 dollars)

Of course, print shows the surrounding parentheses because the result
is a list in this case (an S-expression in general).

The whole magic of ` and , comes because you can "unquote" whole
sub-expressions: think

`(you owe me ,(* amount 1.1) ,currency)

and because you can nest the whole thing (unquote within quote within
unquote ...).

It is just a minimalistic, but complete template language, in classical
Lisp tradition.
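
A small sketch of the template idea, including the splicing operator ,@ that has not been shown in the thread so far. The function name sketch/invoice and its arguments are invented for the illustration, and an integer factor is used instead of 1.1 to keep the arithmetic exact.

```elisp
;; Sketch: a list template with a computed slot, a plain slot,
;; a spliced list and a nested sub-list.
(defun sketch/invoice (amount currency extras)
  "Build a list from AMOUNT, CURRENCY and the list EXTRAS."
  `(you owe me ,(* 2 amount) ,currency ,@extras (net ,amount)))

(sketch/invoice 200 'dollars '(plus interest))
;; => (you owe me 400 dollars plus interest (net 200))
```

Note that ,@extras contributes its items one by one, while (net ,amount) stays a nested list with only the unquoted slot filled in.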

(and if you have a canonical transformation of e.g. S-expressions
to HTML, it's much more fun to write HTML templates in than the
usual template languages).

Now how this can be used to transform source code (i.e. "write macros")
is left as an exercise to the reader ;-)

Regards
- -- t
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlWM/9cACgkQBcgs9XrR2kYOFACeMkeLRxMLJHBrJ/n71ErVkZy1
QFIAmwcUEneXbHY+zcbmCDsTGRl97TM8
=AVmD
-----END PGP SIGNATURE-----

Drew Adams

Jun 26, 2015, 9:49:11 AM

to to...@tuxteam.de, Marcin Borkowski, Help Gnu Emacs mailing list

> To put a slightly different slant than the other very good answers
> on it, backquote is Lisp's take on the shell's, Perl's, Python's
> "variable interpolation". In those languages it operates on strings,
> in Lisp it operates on S-expressions. Where in Perl you might say:
> my $amount=200;
> my $currency="dollars";
> print("You owe me $amount $currency\n");
> => You owe me 200 dollars
>
> in Lisp you think in S-expressions. Somewhat equivalent would be
> (setq amount 200)
> (setq currency 'dollars) ; use a symbol, just for kicks
> (print `(You owe me ,amount ,currency))
> => (You owe me 200 dollars)

Good explanation.

to...@tuxteam.de

Jun 26, 2015, 10:06:40 AM

to Drew Adams, Help Gnu Emacs mailing list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, Jun 26, 2015 at 06:48:56AM -0700, Drew Adams wrote:
> > To put a slightly different slant than the other very good answers

[...]

> Good explanation.

Thanks :-)

I learnt quite a bit from the other explanations too. The view
angle I exposed is due to my fascination with these very special
traits of The Lisps -- having one "clay" from which everything is
made (including programs!), the S-expressions, and offering
whatever the machine has under the hood as building blocks (as
far as possible).

In this case, not offer a "macro machinery", but a template
expander and a hook in the evaluator where (surprise!) this
very template expander fits in. But you can use this template
expander for your other mischievous ideas (like, for example,
writing templated HTML if you're so inclined).

Gotta love that.

- -- t
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlWNXF0ACgkQBcgs9XrR2kbmTACfeJ+zyQiQ4uRkKoIw53cmMXPS
LZkAn2utpc1Cong5LQmthfx74tUgcKY9
=QmU0
-----END PGP SIGNATURE-----

Michael Heerdegen

Jun 26, 2015, 10:25:43 AM

to help-gn...@gnu.org

Rusi <rusto...@gmail.com> writes:

> And seeing a lot of '...'s for deep nested structures
> What's the recommended way for looking inside these?

Does C-h k M-: or C-h k C-x C-e help?

Michael.

Rusi

Jun 26, 2015, 10:35:27 AM

On Friday, June 26, 2015 at 7:55:43 PM UTC+5:30, Michael Heerdegen wrote:

> Rusi writes:
>
> > And seeing a lot of '...'s for deep nested structures
> > What's the recommended way for looking inside these?
>
> Does C-h k M-: or C-h k C-x C-e help?
>

Neither works (the ...s remain)
But ielm seems to work

Michael Heerdegen

Jun 26, 2015, 10:52:04 AM

to help-gn...@gnu.org

Rusi <rusto...@gmail.com> writes:

> > Does C-h k M-: or C-h k C-x C-e help?

> Neither works (the ...s remain)

Dunno what you are using. For the default bindings of these keys, a
prefix arg 0 does the job. AFAIK pp doesn't elide anything at all by
default.

Anyway, at the end this is always controlled by `print-level' and
`print-length'.
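
To see where the "..." comes from, the two variables can be bound around an explicit print call. A minimal sketch; the helper name sketch/print-limited is hypothetical. (Commands like eval-expression have their own options, eval-expression-print-length and eval-expression-print-level, which end up controlling the same two variables.)

```elisp
;; Sketch: print FORM with explicit limits; nil means "no limit".
(defun sketch/print-limited (form length level)
  "Return FORM printed with `print-length' LENGTH and `print-level' LEVEL."
  (let ((print-length length)
        (print-level level))
    (prin1-to-string form)))

(sketch/print-limited '(1 2 3 4 5) 3 nil)    ; => "(1 2 3 ...)"
(sketch/print-limited '(1 2 3 4 5) nil nil)  ; => "(1 2 3 4 5)"
;; With a level limit, lists nested deeper than LEVEL are elided as well.
(sketch/print-limited '(1 (2 (3 (4)))) nil 2)
```

Binding both variables to nil around the printing of a macroexpansion is therefore one way to get the full structure.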

Michael.

Emanuel Berg

Jun 26, 2015, 11:10:13 AM

to help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> I decided that the time has come that I finally
> approach the scary backquote-comma duo. (While
> I understand it superficially, I’d like to get it
> right and thoroughly this time.) So my question is
> whether my mental model (see below) is correct.

The backtick isn't complicated at all. Besides you got
it right.

Here is how it works:

'(this is not a five)

`(this is still not a five) ; no reason for backtick here

(let ((five 5))
  `(now it is a ,five))

--
underground experts united
http://user.it.uu.se/~embe8573

sokoba...@gmail.com

Jun 30, 2015, 12:27:46 PM

Le jeudi 25 juin 2015 19:09:31 UTC+2, Marcin Borkowski a écrit :
> Hi all,
>
> I decided that the time has come that I finally approach the scary
> backquote-comma duo. (While I understand it superficially, I'd like to
> get it right and thoroughly this time.) So my question is whether my
> mental model (see below) is correct.

I made the same decision... long time ago...
Before backquote, I first needed to be clear with "quote" and "eval".
When I fully understood the "'foo" "foo" "(eval 'foo)" "(eval foo)",
I could go for the backquote thing.

I would suggest to read some documentation like:
http://www.gnu.org/software/emacs/manual/html_node/elisp/Backquote.html
http://www.cs.cmu.edu/cgi-bin/info2www?(elisp)Backquote

> So, I assume that when Emacs Lisp interpreter encounters a backquote, it
> looks at the expression after it. If it is anything but a list, it just
> works like the usual quote, and the backquoted expression evaluates to
> what was backquoted.
>
> If it is a list, its element are read

as already mentioned, they were read by the reader a long time
before being parsed by the backquote macro itself!

> Michael Heerdegen wrote:
>> Of course, the elements have already been read by the reader.

> and scanned. If any part of the
> list (probably a nested one) begins with a comma, the whole thing after
> the comma (be it a symbol, a list or whatever) is evaluated as usual,
> and the result is put into the resulting list.

Well... more or less...

The backquote process itself does NOT evaluate anything.
It is better to think of it in terms of expansion.
The evaluation comes AFTER the backquote has expanded its stuff.

When I need to evaluate more than 1 or 2 expressions, I use "M-x ielm"
so that I can see all the results, copy/paste them, etc.

Here, I'll use "print" so that you see both the real thing first
and the way it's pretty-printed in a more human-readable way.

ELISP> (print (read "`',foo"))

(\` (quote (\, foo)))

`',foo

ELISP> (print (macroexpand ''foo))
(quote foo)
'foo

ELISP> (print (macroexpand '`',foo))
(list (quote quote) foo)
(list 'quote foo)

ELISP> (print (macroexpand '`(a ,b c)))
(cons (quote a) (cons b (quote (c))))
(cons 'a
      (cons b
            '(c)))

In this last example, you can see that `(a ,b c)
is very close to (list 'a b 'c)
where the function "list" evaluates its arguments:
- 'a evaluates to the symbol "a"
- b evaluates to the symbol-value of the symbol "b"
- 'c evaluates to the symbol "c"
and "list" returns a new list with these values.
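
The "very close to" claim can be checked by building the list both ways and comparing. A minimal sketch with a hypothetical function name (sketch/compare-expansion); the exact code that macroexpand returns may differ between Emacs versions, so only the resulting values are compared.

```elisp
;; Sketch: `(a ,b c) and (list 'a b 'c) build equal lists.
(defun sketch/compare-expansion (b)
  "Return non-nil if `(a ,b c) and (list 'a b 'c) are equal for value B."
  (equal `(a ,b c) (list 'a b 'c)))

(sketch/compare-expansion 42)        ; => t
(sketch/compare-expansion '(x y))    ; => t
```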

> Whew. Is that (more or less) right?

Yes! (more or less)

> can I find an Emacs Lisp metacircular evaluator (taking
> into account the quoting mechanisms) anywhere?

You can have a look at https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node367.html

(defun bq-process (x)
  (cond ((atom x)
         (list *bq-quote* x))
        ((eq (car x) 'backquote)
         (bq-process (bq-completely-process (cadr x))))
        ((eq (car x) *comma*) (cadr x)) ;; <---
        ((eq (car x) *comma-atsign*)
         (error ",@~S after `" (cadr x)))
        ((eq (car x) *comma-dot*)
         (error ",.~S after `" (cadr x)))
        (t (do ((p x (cdr p))
                (q '() (cons (bracket (car p)) q)))
               ((atom p)
                (cons *bq-append*
                      (nreconc q (list (list *bq-quote* p)))))
             (when (eq (car p) *comma*)
               (unless (null (cddr p)) (error "Malformed ,~S" p))
               (return (cons *bq-append*
                             (nreconc q (list (cadr p))))))
             (when (eq (car p) *comma-atsign*)
               (error "Dotted ,@~S" p))
             (when (eq (car p) *comma-dot*)
               (error "Dotted ,.~S" p))))))

(remember that ",xx" has already been expanded into "(, xx)",
which is a list with 2 items, the symbol "," and "xx")
On the 6th line, you can see that if "x" is a list beginning with a comma,
the backquote-process just returns whatever is in the 2nd position of this list.
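
The same idea can be sketched in a few lines of Emacs Lisp. This is NOT how backquote.el does it; it is a deliberately naive expander under several assumptions: the names sketch/bq-expand and sketch/bq-bracket are made up, nested backquotes and vectors are ignored, and nothing is optimized. It works on what the reader delivers, i.e. lists headed by the symbols named "," and ",@".

```elisp
;; Sketch only: no nested backquotes, no vectors, no optimization.
(defun sketch/bq-bracket (element)
  "Return code producing the list of items that ELEMENT contributes."
  (if (and (consp element) (eq (car element) '\,@))
      ;; ,@X contributes every item of the value of X.
      (cadr element)
    ;; Anything else contributes exactly one item.
    (list 'list (sketch/bq-expand element))))

(defun sketch/bq-expand (form)
  "Return code that rebuilds FORM, evaluating only its unquoted parts.
FORM is what follows the backquote, as delivered by the reader."
  (cond ((not (consp form))
         (list 'quote form))            ; atoms are simply quoted
        ((eq (car form) '\,)
         (cadr form))                   ; (\, X) expands to X itself
        (t
         (list 'append
               (sketch/bq-bracket (car form))
               (sketch/bq-expand (cdr form))))))

;; The expansion is just code; nothing has been evaluated yet.
(sketch/bq-expand (cadr (read "`(a ,b ,@c d)")))
;; reads as:
;; (append (list 'a) (append (list b) (append c (append (list 'd) 'nil))))
```

Evaluating that code with b bound to 2 and c bound to (3 4) gives (a 2 3 4 d), which is the "expansion first, evaluation afterwards" order described above.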

> And I know that I risk starting another thread lasting for dozens of

> messages;-) - but I /do/ want to understand this stuff... In fact, in

> the spirit of another recent discussion, I want to write a simple code
> analyzer, finding one-legged 'if's and suggesting replacing them with
> 'when's or 'unless'es.

Good idea!

> This is trivial unless (pun intended) you want
> to take (back)quotes into consideration.

You probably don't want to rewrite a reader!
I would suggest that you "just" write a parser, something like:
(until (eof)
(parse (read)))

So, you must be very clear with what you'll get from the reader.
Remember the "read" process skips comments and
"recognizes/creates" a lot of stuff,
like strings, symbols, numbers, lists, etc.

During this process, the macro characters have already been expanded:
"`xx" -> "(` xx)"
but the macro function "`" has NOT been expanded yet.

Maybe your code could force the expansion:
(until (eof)
(parse (macroexpand (read))))

HTH
)jack(

Marcin Borkowski

Jul 10, 2015, 7:36:27 AM

to help-gn...@gnu.org

On 2015-06-25, at 20:39, Marcin Borkowski <mb...@mbork.pl> wrote:

> On 2015-06-25, at 20:22, Michael Heerdegen <michael_...@web.de> wrote:
>
>> Marcin Borkowski <mb...@mbork.pl> writes:
>>
>>> Seeing a simplistic (though working in typical/correct cases) version
>>> might be rather illuminating, no?
>>
>> Yes. Want to give it a try?
>
> Sure. I'll get back here with some code (notice: it might take some
> time, from a week to a few months - it's not the only thing I have to
> do;-)) to discuss.

OK, so -- as I said -- I'm back. I don't have my metacircular
interpreter (yet), and I want to make it rather a simplistic one (no
assignments, for instance -- just evaluating functions, conditionals and
(maybe) while loops), but I concentrated on the reader to start with.
So here's my humble attempt at the reader itself. It does nothing with
ticks, backticks and commas -- AFAIUC, it shouldn't be done at this
level anyway -- it just translates them to special forms (quote ...),
(quasi-quote ...) and (unquote ...). Do I get it correctly that it's
the eval function which should handle these?

--8<---------------cut here---------------start------------->8---
;; A simple metacircular interpreter for (a subset of) Emacs Lisp

(require 'anaphora) ; we'll use acase

(defun mci/next-token ()
  "Get the next token from the current buffer at point position.
A token can be: an integer, a symbol, a parenthesis, a comma,
a backquote or a quote. Return a number (in case of an integer),
a symbol (in case of a symbol), or one of the symbols: :open-paren,
:close-paren, :quote, :quasi-quote, :unquote, :eob."
  (skip-chars-forward " \t\n")
  (cond ((eq (char-after) ?\()
         (forward-char)
         :open-paren)
        ((eq (char-after) ?\))
         (forward-char)
         :close-paren)
        ((eq (char-after) ?\')
         (forward-char)
         :quote)
        ((eq (char-after) ?\`)
         (forward-char)
         :quasi-quote)
        ((eq (char-after) ?\,)
         (forward-char)
         :unquote)
        ((looking-at "\\([-+]?[[:digit:]]+\\)[ \t\n)]")
         (skip-chars-forward "[:digit:]")
         (string-to-number (match-string 1)))
        ((looking-at "[^ \t\n)]+")
         (goto-char (match-end 0))
         (intern (match-string-no-properties 0)))
        ((eobp)
         :eob)))

(defun mci/read ()
  "Read one Elisp expression from the buffer at point."
  (acase (mci/next-token)
    (:open-paren (mci/read-list-contents))
    (:close-paren
     (error "Unexpected closing paren at line %d encountered -- mci/read"
            (line-number-at-pos)))
    (:quote (list 'quote (mci/read)))
    (:quasi-quote (list 'quasi-quote (mci/read)))
    (:unquote (list 'unquote (mci/read)))
    (:eob nil)
    (t it)))

(defun mci/read-list-contents ()
  "Read list contents (until the closing paren), gobble the
closing paren."
  (let ((next (mci/next-token))
        list)
    (while (not (eq next :close-paren))
      (if (eq next :eob)
          (error "Unexpected EOB while reading a list -- mci/read-list-contents")
        (push next list)
        (setq next (mci/next-token))))
    (nreverse list)))
--8<---------------cut here---------------end--------------->8---

I'd be thankful for any input, either on correctness of the above code,
or on its elegance and `lispy-ness', or on ways to make it better for
novices to understand.

TIA for your help! (And I'm feeling a bit guilty that I ask a fair
share of simple questions -- but my mission here is to try to understand
this stuff as well as I can, and then write about it, so that it will be
easier for others to `get it' -- so that makes my conscience easier;-).)

Michael Heerdegen

Jul 12, 2015, 11:54:52 AM

to Marcin Borkowski, help-gn...@gnu.org

Hi Marcin,

sorry for the late reply.

> So here's my humble attempt at the reader itself. It does nothing
> with ticks, backticks and commas -- AFAIUC, it shouldn't be done at
> this level anyway -- it just translates them to special forms (quote
> ...), (quasi-quote ...) and (unquote ...).

Yes, that's the right approach. You could of course translate into the
symbols named "'", "`" and "," instead, like the Lisp reader does, but
that's a detail. In Elisp, these aren't special forms. They could be
in your interpreter, of course.

> Do I get it correctly that it's the eval function which should handle
> these?

In Elisp, it's not directly handled by eval, since handling the
backquote mechanism is not hardcoded. Instead, backquote is
a macro written in Lisp.

Dunno if your interpreter will support macros. If not, you could handle
backquote directly in your interpreter.

> (require 'anaphora) ; we'll use acase

It would be good if you could drop this dependency. This would spare
people trying your code from installing additional stuff.

> (defun mci/next-token () ...

> (defun mci/read () ...

> (defun mci/read-list-contents () ...

That looks already very promising!

I never tried to write a Lisp reader in Elisp, but the general approach
seems to be appropriate (others might be able to give more and better
comments -- Drew, Stefan, Lars, ... - anyone?).

There is a problem though when the read expression is nested. I tried
to `mci/read' this string for example:

"(defun fac (x) (if (< 2 x) 1 (* x (fac (1- x)))))"

and got

(defun fac :open-paren x)

as result. If you Edebug your functions, you can see what goes wrong.
Please tell me if you need more hints...
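
For readers who do want a hint: the list reader stores raw tokens, so an :open-paren token ends up in the list instead of starting a nested one. One possible way around it is to turn each token into a full expression before storing it. The following is a self-contained sketch under assumptions, not a patch to the posted code: all sketch/ names are hypothetical, anaphora is not used, integers are recognized after the whole atom has been matched, and hitting the end of the buffer always signals an error.

```elisp
;; Sketch of a recursive reader; sketch/ names are hypothetical.
(defun sketch/next-token ()
  "Return the next token at point and move point past it."
  (skip-chars-forward " \t\n")
  (cond ((eobp) :eob)
        ((eq (char-after) ?\() (forward-char) :open-paren)
        ((eq (char-after) ?\)) (forward-char) :close-paren)
        ((eq (char-after) ?\') (forward-char) :quote)
        ((eq (char-after) ?\`) (forward-char) :quasi-quote)
        ((eq (char-after) ?\,) (forward-char) :unquote)
        ((looking-at "[^ \t\n()'`,]+")
         (goto-char (match-end 0))
         (let ((text (match-string-no-properties 0)))
           ;; An atom made only of an optional sign and digits is a number.
           (if (string-match-p "\\`[-+]?[0-9]+\\'" text)
               (string-to-number text)
             (intern text))))))

(defun sketch/read-from-token (token)
  "Turn TOKEN, plus whatever must follow it, into one expression."
  (cond ((eq token :open-paren) (sketch/read-list-contents))
        ((eq token :close-paren) (error "Unexpected closing paren"))
        ((eq token :eob) (error "Unexpected end of buffer"))
        ((eq token :quote) (list 'quote (sketch/read)))
        ((eq token :quasi-quote) (list 'quasi-quote (sketch/read)))
        ((eq token :unquote) (list 'unquote (sketch/read)))
        (t token)))

(defun sketch/read ()
  "Read one expression from the current buffer at point."
  (sketch/read-from-token (sketch/next-token)))

(defun sketch/read-list-contents ()
  "Read expressions up to the matching closing paren; return them as a list."
  (let ((token (sketch/next-token))
        (items '()))
    (while (not (eq token :close-paren))
      ;; The key step: every element is read as a whole expression, so an
      ;; :open-paren token starts a nested list instead of being stored.
      (push (sketch/read-from-token token) items)
      (setq token (sketch/next-token)))
    (nreverse items)))

(defun sketch/read-from-string (string)
  "Read one expression from STRING using the sketch reader."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (sketch/read)))
```

With this, (sketch/read-from-string "(defun fac (x) (if (< 2 x) 1 (* x (fac (1- x)))))") returns the nested list one would expect. A known limitation of the sketch: a symbol spelled like one of the token keywords (e.g. :eob) in the input is indistinguishable from the token itself.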

I guess you already know that you have not chosen the easiest way to
understand backquote. Anyway, you learn a lot of stuff with your
approach. Looking forward to the next version!

Regards,

Michael.

Vaidheeswaran C

Jul 12, 2015, 1:38:04 PM

to help-gn...@gnu.org

On Thursday 25 June 2015 10:39 PM, Marcin Borkowski wrote:
> I decided that the time has come that I finally approach the scary
> backquote-comma duo. (While I understand it superficially, I’d like to
> get it right and thoroughly this time.) So my question is whether my
> mental model (see below) is correct.

Page 12 here http://www.lisperati.com/casting_spels.pdf is a good
enough mental model for me.

Marcin Borkowski

Jul 12, 2015, 3:56:14 PM

to help-gn...@gnu.org

On 2015-07-12, at 17:54, Michael Heerdegen <michael_...@web.de> wrote:

> Hi Marcin,
>
> sorry for the late reply.

It's fine, I'm working slowly on this anyway.

>> So here's my humble attempt at the reader itself. It does nothing
>> with ticks, backticks and commas -- AFAIUC, it shouldn't be done at
>> this level anyway -- it just translates them to special forms (quote
>> ...), (quasi-quote ...) and (unquote ...).
>
> Yes, that's the right approach. You could of course translate into the
> symbols named "'", "`" and "," instead, like the Lisp reader does, but
> that's a detail. In Elisp, these aren't special forms. They could be
> in your interpreter, of course.

And they will be;-).

>> Do I get it correctly that it's the eval function which should handle
>> these?
>
> In Elisp, it's not directly handled by eval, since handling the
> backquote mechanism is not hardcoded. Instead, backquote is
> a macro written in Lisp.

I see. I’ll try to study its code.

> Dunno if your interpreter will support macros. If not, you could handle
> backquote directly in your interpreter.

No, I don't want to support macros – too much work and little benefit,
I guess. Maybe some day. So all the \(quasi-\)?quoting will be handled
by eval, as special forms (as I said above).

>> (require 'anaphora) ; we'll use acase
>
> It would be good if you could drop this dependency. That would spare
> people who want to try your code from installing additional stuff.

Well, aif is also useful for me. Maybe I’ll add their definitions
directly, or drop them altogether. I’ll see. I’m probably not going to
publish this code on Melpa or anything like that; it’ll quite probably
appear on my blog, however. And if someone wants to dive deep enough in
Elisp to study an MCI, I guess installing one package is not too high
a hurdle.

>> (defun mci/next-token () ...
>
>> (defun mci/read () ...
>
>> (defun mci/read-list-contents () ...
>
> That looks already very promising!

Thanks! I already have a rudimentary mci/eval (not supporting variable
binding yet, however – I’m working on lexical binding, and hopefully
I’ll get closures then almost for free when implementing lambdas), and
mci/apply will come next. (One of the nice things when writing a Lisp
MCI is that you can use the standard eval or apply before you write your
own, and everything can work even before it’s finished;-).)

> I never tried to write a Lisp reader in Elisp, but the general approach
> seems to be appropriate (others might be able to give more and better
> comments -- Drew, Stefan, Lars, ... - anyone?).
>
> There is a problem though when the read expression is nested. I tried
> to `mci/read' this string for example:
>
> "(defun fac (x) (if (< 2 x) 1 (* x (fac (1- x)))))"
>
> and got
>
> (defun fac :open-paren x)
>
> as result. If you Edebug your functions, you can see what goes wrong.
> Please tell me if you need more hints...

Good catch, thanks. No need for edebug, I guess – I haven’t looked at
my code, but I guess I know the problem already. Stupid me.

> I guess you already know that you have not chosen the easiest way to
> understand backquote. Anyway, you learn a lot of stuff with your
> approach. Looking forward to the next version!

Not the easiest, but more thorough. I’m not satisfied with the “I kind
of understand this... I guess” situation – if I can’t implement it,
I don’t understand it. (Unfortunately, the converse need not be true;
I might be lucky and implement it without a full understanding, too...)
And I’m really determined to understand this stuff well enough to be
able to teach it to others.

> Regards,
>
> Michael.

Thanks for the kind words. Stay tuned!

Marcin Borkowski

Jul 12, 2015, 4:33:56 PM

to help-gn...@gnu.org

On 2015-07-12, at 21:55, Marcin Borkowski <mb...@mbork.pl> wrote:

> On 2015-07-12, at 17:54, Michael Heerdegen <michael_...@web.de> wrote:
>
>> There is a problem though when the read expression is nested. I tried
>> to `mci/read' this string for example:
>>
>> "(defun fac (x) (if (< 2 x) 1 (* x (fac (1- x)))))"
>>
>> and got
>>
>> (defun fac :open-paren x)
>>
>> as result. If you Edebug your functions, you can see what goes wrong.
>> Please tell me if you need more hints...
>
> Good catch, thanks. No need for edebug, I guess – I haven’t looked at
> my code, but I guess I know the problem already. Stupid me.

So, what about this? It seems to work. OTOH, I think it's not the most
elegant thing possible, since there is some code duplication: mci/read
has this: (:open-paren (mci/read-list-contents)) in a (a)case statement,
and mci/read-list-contents has this: (:open-paren (setq next
(mci/read-list-contents))). Something tells my mathematical mind that
there probably exists a cleaner approach.

--8<---------------cut here---------------start------------->8---
(require 'anaphora) ; we'll need acase

;; Reader

(defun mci/next-token ()
"Get the next token from the current buffer at point position.
A token can be: an integer, a symbol, a parenthesis, a comma,
a backquote or a quote. Return a number (in case of an integer),
a symbol (in case of a symbol), or one of the symbols: :open-paren,
:close-paren, :quote, :quasi-quote, :unquote, :eob. (Of course, if
someone is devious enough to include one of these symbols in the
expression being read, he'll get what he deserves: chaos.)"

(case next
(:open-paren (setq next (mci/read-list-contents)))
(:eob (error "Unexpected EOB while reading a list -- mci/read-list-contents"))
(t (push next list)
(setq next (mci/next-token)))))

(nreverse list)))
--8<---------------cut here---------------end--------------->8---

Best,

Marcin Borkowski

Jul 14, 2015, 2:17:42 PM

to help-gn...@gnu.org

On 2015-07-12, at 22:33, Marcin Borkowski <mb...@mbork.pl> wrote:

> So, what about this? It seems to work. OTOH, I think it's not the most
> elegant thing possible, since there is some code duplication: mci/read
> has this: (:open-paren (mci/read-list-contents)) in a (a)case statement,
> and mci/read-list-contents has this: (:open-paren (setq next
> (mci/read-list-contents))). Something tells my mathematical mind that
> there probably exists a cleaner approach.

Stupid me – again;-). No wonder ‘mci/read-list-contents’ appears twice,
once in ‘mci/read’ and once in ‘mci/read-list-contents’ – it seems
there’s no other way (though I can’t prove it formally).

But now my problem is something different, and on a different level –
a “metaproblem” in a sense. I’m still working on ‘mci/eval’; it now
supports ‘progn’ forms and ‘setq’, and I have ‘cons’, ‘car’ and ‘cdr’ as
symbols bound to their Elisp counterparts in the global environment of
my MCI, and when I have lambdas (and write ‘mci/apply’, which should be
relatively easy now), I’ll have a more or less complete (though tiny)
Lisp. I guess that adding backquote should be really straightforward
then.

So where’s the problem? Well, it’s quite a lot of fun to put it all
together, and I’m learning a few things along the way, so it’s difficult
to resist the temptation to add more stuff. Macros? ‘cond’ forms?
‘while’ forms? OTOH, my goal is *not* to recreate all of Elisp (unlike
Scheme, Elisp is far from minimalistic; for instance, it has *a lot*
of special forms which could, in principle, be macros - ‘if’, for
example, or ‘let’, or ‘let*’). I definitely do not want to spend too
much time on this – adding lots of special forms would soon cease to be
fun, and once (and if!) I have macros, there’s really no use in adding
them; also, I want to move on to other things.

So now my question is: does it make sense to play around with it more?
Would a more complete Elisp interpreter written in Elisp be useful for
anyone? If yes, I might consider publishing all my code sooner rather
than later. And: if it’s interesting and/or useful for anybody, is
there anything besides lambdas, a proper ‘mci/apply’ function and macros
that definitely *should* be added? (One thing that comes to mind would
be special (= dynamic) variables. I’m not sure whether I would like to
add them – it might be too much work. OTOH, I’d learn to implement
dynamic binding then...)

Emanuel Berg

Jul 14, 2015, 6:10:04 PM

to help-gn...@gnu.org

Lisp is based on lists, so naturally, there are
several ways to set them up based on your needs at the
moment, as they are such a cornerstone to everything
that goes on.

The ways that immediately come to mind are:

(list <eval_1> ... <eval_n>)

(cons <eval to car> <eval to cdr>)

'(<don't eval_1> ... <don't eval_n>)

`(<don't eval> ,<eval> ,@<eval list and insert elements>)

If the syntax of the quote and backquote confuses
you, just replace them with exactly that, `quote'
and `backquote': (quote (<don't eval_1> ... )) and all
that, just the same.
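
As a concrete illustration of the four forms above, here is a small sketch (the variable names `x' and `xs' are made up for the example):

```elisp
(let ((x 1)
      (xs '(2 3)))
  (list (list x xs)      ; every argument is evaluated    ⇒ (1 (2 3))
        (cons x xs)      ; car and cdr are both evaluated ⇒ (1 2 3)
        '(x xs)          ; nothing is evaluated           ⇒ (x xs)
        `(x ,x ,@xs)))   ; only the unquoted parts are    ⇒ (x 1 2 3)
;; ⇒ ((1 (2 3)) (1 2 3) (x xs) (x 1 2 3))

;; The punctuation and the spelled-out names build the same thing:
(let ((x 1))
  (list (equal '(x xs) (quote (x xs)))
        (equal `(x ,x) (backquote (x ,x)))))
;; ⇒ (t t)
```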

Michael Heerdegen

Jul 21, 2015, 5:50:31 PM

to Marcin Borkowski, help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> > sorry for the late reply.
>
> It's fine, I'm working slowly on this anyway.

Sorry again. Was on vacation this time.

> > Yes, that's the right approach. You could of course translate into the
> > symbols named "'", "`" and "," instead, like the Lisp reader does, but
> > that's a detail. In Elisp, these aren't special forms. They could be
> > in your interpreter, of course.
>
> And they will be;-).

FWIW, "," and ",@" are not defined in Elisp at all. They're just
symbols used as tags that are recognized by `backquote'.

> > Dunno if your interpreter will support macros. If not, you could
> > handle backquote directly in your interpreter.
>
> No, I don't want to support macros – too much work and little benefit,
> I guess.

Mmh, I don't think so. Macros are an essential part of the Lisp
language. And I don't think it would be hard, it should be quite simple
to do. `defmacro' kind of defines a function. The difference from real
functions is that macros receive their arguments as unevaluated
expressions (code); the expansion itself is just like a function call.
The result is a new (expanded) expression to be evaluated.

So if you have `funcall' and `apply', you already have everything you
need to support macros. Your `eval' just has to use these to expand the
macro calls before it starts with the conventional evaluation.
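
To make that concrete, here is a minimal sketch of the idea for a toy interpreter. The names `my/macros', `my/expand' and `my-unless' are hypothetical, and the sketch only expands the outermost form; a real evaluator would also expand subforms as it reaches them:

```elisp
(defvar my/macros nil
  "Alist of (NAME . EXPANDER) pairs for a toy interpreter (hypothetical).")

(defun my/expand (form)
  "Expand FORM for as long as its head names a macro in `my/macros'."
  (let (expander)
    (while (setq expander (and (consp form)
                               (cdr (assq (car form) my/macros))))
      ;; The expander is an ordinary function; the only special thing
      ;; is that it is applied to the *unevaluated* argument forms.
      (setq form (apply expander (cdr form))))
    form))

;; Define a toy macro: (my-unless TEST BODY...) → (if TEST nil (progn BODY...))
(push (cons 'my-unless
            (lambda (test &rest body)
              (list 'if test nil (cons 'progn body))))
      my/macros)

(my/expand '(my-unless (> x 1) (message "small")))
;; ⇒ (if (> x 1) nil (progn (message "small")))

;; An evaluator would expand first and then evaluate as usual; the
;; built-in `eval' stands in here for the interpreter's own.
(eval (my/expand '(my-unless nil 42)))
;; ⇒ 42
```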

> >> (require 'anaphora) ; we'll use acase
> >
> > It would be good if you could drop this dependency. That would spare
> > people who want to try your code from installing additional stuff.
>
> Well, aif is also useful for me.

FWIW, there are now good replacements for such stuff in Emacs Lisp. You
can use `if-let' (new in Emacs 25.1 coming soon) instead of `aif', and
`pcase' (already part of Emacs) instead of `acase'. Learning
`pcase' will take some time, but it is worth it.
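
For illustration, a sketch of what the two replacements might look like in a reader like this one. The functions `my/describe-token' and `my/lookup' are invented for the example; the token keywords are the ones from the docstring of `mci/next-token':

```elisp
(require 'subr-x)   ; `if-let' lives here in Emacs 25

(defun my/describe-token (token)
  "Return a short description of TOKEN (hypothetical helper)."
  ;; `pcase' instead of `acase': keywords match themselves literally.
  (pcase token
    (:open-paren "start of a list")
    (:close-paren "end of a list")
    ((or :quote :quasi-quote :unquote) "quoting prefix")
    ((pred integerp) "integer")
    (_ "symbol")))

(defun my/lookup (name alist)
  "Return the value of NAME in ALIST, or the symbol `unbound'."
  ;; `if-let' instead of `aif': the binding gets an explicit name
  ;; (`cell') rather than the implicit `it'.
  (if-let ((cell (assq name alist)))
      (cdr cell)
    'unbound))

(my/describe-token :quasi-quote)   ; ⇒ "quoting prefix"
(my/describe-token 42)             ; ⇒ "integer"
(my/lookup 'x '((x . 1)))          ; ⇒ 1
(my/lookup 'y '((x . 1)))          ; ⇒ unbound
```

Later Emacs versions also provide `if-let*', which may be the better choice there.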

Lots of Lispers seem to hate these anaphoric macros and say they are
unlispy. Lots of others love them.

> > I guess you already know that you have not chosen the easiest way to
> > understand backquote. Anyway, you learn a lot of stuff with your
> > approach. Looking forward to the next version!
>
> Not the easiest, but more thorough. I’m not satisfied with the “I kind
> of understand this... I guess” situation – if I can’t implement it,
> I don’t understand it. (Unfortunately, the converse need not be true;
> I might be lucky and implement it without a full understanding, too...)

Actually, I think it is not unusual for Lisp programming that you invent
something, implement it, and understand it later. The same is probably
true in some sense for Lisp itself.

Regards,

Michael.

Michael Heerdegen

Jul 21, 2015, 5:54:24 PM

to Marcin Borkowski, help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> So, what about this? It seems to work. OTOH, I think it's not the most
> elegant thing possible, since there is some code duplication: mci/read
> has this: (:open-paren (mci/read-list-contents)) in a (a)case statement,
> and mci/read-list-contents has this: (:open-paren (setq next
> (mci/read-list-contents))). Something tells my mathematical mind that
> there probably exists a cleaner approach.

Yes, that code duplication is no coincidence.

There's a bug in your new version, btw: "`" is handled differently at
top level and inside nested lists. If you fix that, you will probably have
even more duplicated code.

Michael.

Michael Heerdegen

Jul 21, 2015, 6:08:42 PM

to Marcin Borkowski, help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> Stupid me – again;-). No wonder ‘mci/read-list-contents’ appears twice,
> once in ‘mci/read’ and once in ‘mci/read-list-contents’ – it seems
> there’s no other way (though I can’t prove it formally).

I don't think every implementation needs to have it in two different
defuns.

> But now my problem is something different, and on a different level –
> a “metaproblem” in a sense. I’m still working on ‘mci/eval’; it now
> supports ‘progn’ forms and ‘setq’, and I have ‘cons’, ‘car’ and ‘cdr’ as
> symbols bound to their Elisp counterparts in the global environment of
> my MCI, and when I have lambdas (and write ‘mci/apply’, which should be
> relatively easy now), I’ll have more or less complete (though tiny)
> Lisp. I guess that adding backquote should be really straightforward
> then.
>
> So where’s the problem? Well, it’s quite a lot of fun to put it all
> together, and I’m learning a few things along the way, so it’s difficult
> to resist the temptation to add more stuff. Macros? ‘cond’ forms?
> ‘while’ forms? OTOH, my goal is *not* to recreate all Elisp (contrary
> to Scheme, Elisp is far from minimalistic, for instance, it has *a lot*
> of special forms which could, in principle, be macros - ‘if’, for
> example, or ‘let’, or ‘let*’). I definitely do not want to spend too
> much time on this – adding lots of special forms would soon cease to be
> fun, and once (and if!) I have macros, there’s really no use in adding
> them; also, I want to move on to other things.

Once you have macros, implementing the stuff you mentioned should not be
hard if you don't care about efficiency too much.

> So now my question is: does it make sense to play around with it more?
> Would a more complete Elisp interpreter written in Elisp be useful for
> anyone?

For learning purposes, it would be useful.

> If yes, I might consider publishing all my code sooner rather than
> later. And: if it’s interesting and/or useful for anybody, is there
> anything besides lambdas, a proper ‘mci/apply’ function and macros
> that definitely *should* be added? (One thing that comes to mind
> would be special (= dynamic) variables. I’m not sure whether I would
> like to add them – it might be too much work. OTOH, I’d learn to
> implement dynamic binding then...)

If you think you have learned what you wanted to, I would stop. Maybe
you will feel like working on it again at some later point.

But hey, since you asked: implementing nonlocal exits comes to mind as
a goal. And continuations would be cool. Implementing these is
probably a harder lesson.

Regards,

Michael.

Michael Heerdegen

Jul 24, 2015, 9:01:35 AM

to Marcin Borkowski, help-gn...@gnu.org

Michael Heerdegen <michael_...@web.de> writes:

> > Stupid me – again;-). No wonder ‘mci/read-list-contents’ appears
> > twice, once in ‘mci/read’ and once in ‘mci/read-list-contents’
> > – it seems there’s no other way (though I can’t prove it formally).
>
> I don't think every implementation needs to have it in two different
> defuns.

For the record: Better make your `mci/read' read lists recursively. If
`mci/read' finds something that isn't a list, read that. If it finds a
list, `mci/read' all its members recursively and put the read objects
into a list. No need for a `mci/read-list-contents'.
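
A minimal sketch of such a recursive reader follows. It is not Marcin's code: all `my/' names are made up, the tokenizer only knows integers, symbols, parens and the three quoting characters, and the token keywords follow the docstring of `mci/next-token'. The small helper `my/read-form' exists only so that a token which has already been consumed can be handed on:

```elisp
(defun my/next-token ()
  "Return the next token after point and move point past it."
  (skip-chars-forward " \t\n")
  (cond ((eobp) :eob)
        ((looking-at "(") (forward-char) :open-paren)
        ((looking-at ")") (forward-char) :close-paren)
        ((looking-at "'") (forward-char) :quote)
        ((looking-at "`") (forward-char) :quasi-quote)
        ((looking-at ",") (forward-char) :unquote)
        (t
         ;; Anything else: a run of characters up to the next delimiter.
         (looking-at "[^ \t\n()'`,]+")
         (goto-char (match-end 0))
         (let ((text (match-string 0)))
           (if (string-match-p "\\`-?[0-9]+\\'" text)
               (string-to-number text)
             (intern text))))))

(defun my/read-form (token)
  "Build the expression starting with TOKEN, reading more tokens as needed."
  (pcase token
    (:eob (error "Unexpected end of buffer"))
    (:close-paren (error "Unexpected closing paren"))
    (:open-paren
     (let ((items '())
           (next (my/next-token)))
       (while (not (eq next :close-paren))
         (push (my/read-form next) items)   ; nested lists recurse here
         (setq next (my/next-token)))
       (nreverse items)))
    ;; The quoting characters are handled the same way at every depth.
    (:quote (list 'quote (my/read)))
    (:quasi-quote (list 'quasi-quote (my/read)))
    (:unquote (list 'unquote (my/read)))
    (_ token)))

(defun my/read ()
  "Read one expression from the current buffer, starting at point."
  (my/read-form (my/next-token)))

(defun my/read-string (string)
  "Read one expression from STRING (convenience wrapper for testing)."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (my/read)))

(my/read-string "(defun fac (x) (if (< x 2) 1 (* x (fac (1- x)))))")
;; ⇒ (defun fac (x) (if (< x 2) 1 (* x (fac (1- x)))))
(my/read-string "`(a ,b 'c)")
;; ⇒ (quasi-quote (a (unquote b) (quote c)))
```

Because a list element is read by the same function as a top-level form, nesting and quoting inside lists need no separate code path.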

Regards,

Michael.

Marcin Borkowski

Aug 11, 2015, 6:15:40 AM

to help-gn...@gnu.org

Thanks, that's my obvious mistake, though it seems that the fix is trivial.

> Michael.

Marcin Borkowski

Aug 11, 2015, 7:41:39 AM

to help-gn...@gnu.org

As you can see, I came back to this project, and I have further
questions...

Interestingly, there's a lot of buzz about Lisp /interpreter/ written in
Lisp, but not so much about Lisp /reader/ written in Lisp. In fact,
I didn't find one on the Internet.

What I found was Peter Norvig's tiny Lisp written in Python
(http://norvig.com/lispy.html). His reader is quite simple, but there
is an important difference: he reads all the tokens into a (Python)
list, and then he can "peek" at the next token without "consuming" it.
In my approach, this is not possible (well, it is of course possible,
but moving the point back so that the same token will be read again is
ugly).

Now I'm wondering: is my approach (read one token at a time, but never
go back, so that I can't really "peek" at the next one) reasonable?
Maybe I should just read all tokens in a list? I do not like this
approach very much. I could also set up a buffer, which would contain
zero or one tokens to read, and put the already read token in that
buffer in some cases (pretty much what TeX's \futurelet does. Now
I appreciate why it's there...).
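
The one-token buffer idea could be sketched roughly like this. All names are hypothetical, and the scanner is deliberately simplistic (it returns parens and runs of other characters as strings, and nil at the end of the buffer):

```elisp
(defvar my/lookahead nil
  "List holding zero or one tokens that were peeked at but not consumed.")

(defun my/scan-token ()
  "Scan one token after point and move past it; return nil at end of buffer."
  (skip-chars-forward " \t\n")
  (cond ((eobp) nil)
        ((looking-at "[()]")
         (forward-char)
         (match-string 0))
        (t
         (looking-at "[^ \t\n()]+")
         (goto-char (match-end 0))
         (match-string 0))))

(defun my/peek-token ()
  "Return the next token without consuming it."
  ;; A list is used so that a pending nil token (end of buffer) can be
  ;; told apart from "nothing pending".
  (unless my/lookahead
    (setq my/lookahead (list (my/scan-token))))
  (car my/lookahead))

(defun my/pop-token ()
  "Return the next token and consume it."
  (if my/lookahead
      (pop my/lookahead)
    (my/scan-token)))

(with-temp-buffer
  (insert "(fac 1)")
  (goto-char (point-min))
  (setq my/lookahead nil)
  (list (my/peek-token) (my/peek-token) (my/pop-token) (my/pop-token)))
;; ⇒ ("(" "(" "(" "fac")
```

Point only ever moves forward here; the price is one global variable that has to be reset whenever reading starts somewhere else.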

Yet another approach would be not to signal an `error' in (mci/read)
when the closing paren is encountered, but use `throw' and `catch'. Not
the most elegant way, probably.

So, does anyone know of a Lisp reader written in Lisp, so that I could
learn how smarter people solved this problem?

Anyway, it seems that the main purpose of my project turned out really
well: I'm learning a lot. I'd love to grab some real book on language
design/implementation, but I'd have to schedule considerable time for
that...

> Regards,

Thorsten Jolitz

Aug 11, 2015, 1:20:32 PM

to help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

Hi,

,----
| How the backquote and the comma really work?
`----

funny enough, this was discussed recently on the PicoLisp mailing list
too, and I was surprised to find out that in Emacs Lisp 'Read Macros'
like backquote/comma rather seem to work at run time, while the PicoLisp
equivalents quote/backquote work at read time.

> PicoLisp:
>
> ,----
> | $ pil +
> | : (let X (+ 2 3) '(3 4 `X))
> | -> (3 4 NIL)
> | : '(3 4 `(+ 2 3))
> | -> (3 4 5)
> `----
>
> vs Emacs Lisp:
>
> ,----
> | (let ((X (+ 2 3))) `(3 4 ,X))
> | -> (3 4 5)
> |
> | `(3 4 ,(+ 2 3))
> | -> (3 4 5)
> `----

How does this work in Emacs Lisp? Is the backquote/comma actually a
function call at runtime?

[I did not follow the whole discussion, I hope I don't ask something
already answered]

--
cheers,
Thorsten

Michael Heerdegen

Aug 12, 2015, 11:02:17 AM

to help-gn...@gnu.org

Thorsten Jolitz <tjo...@gmail.com> writes:

> funny enough, this was discussed recently on the PicoLisp mailing list
> too, and I was surprised to find out that in Emacs Lisp 'Read Macros'
> like backquote/comma rather seem to work at run time, while the
> PicoLisp equivalents quote/backquote work at read time.

I don't know PicoLisp, but looking at
http://software-lab.de/doc/ref.html, it indeed seems that backquote in
PicoLisp is something very different than in Common Lisp or Emacs Lisp
or Scheme. AFAICT a backquoted expression really seems to be evaluated
by the reader there.
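
For the Emacs Lisp side, the difference between read time and evaluation time can be seen with a small experiment. This is only a sketch, and `my/triple' is a made-up name:

```elisp
(defun my/triple (x)
  "Return the list (3 4 X); the unquoted part is computed on every call."
  `(3 4 ,x))

(my/triple (+ 2 3))   ; ⇒ (3 4 5)
(my/triple 'foo)      ; ⇒ (3 4 foo)

;; The reader alone evaluates nothing: the (+ 2 3) is still there as data,
;; wrapped in a list whose head is the comma symbol.
(cadr (nth 2 (cadr (read "`(3 4 ,(+ 2 3))"))))   ; ⇒ (+ 2 3)

;; Only evaluating the form (which expands the macro first) does the sum.
(eval (read "`(3 4 ,(+ 2 3))") t)                ; ⇒ (3 4 5)
```

`(macroexpand '`(3 4 ,x))' shows the ordinary list-building code the macro produces; its exact shape is an implementation detail and may differ between Emacs versions.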

> How does this work in Emacs Lisp? Is the backquote/comma actually a
> function call at runtime?
>
> [I did not follow the whole discussion, I hope I don't ask something
> already answered]

Yes, just read the thread ;-)

Michael.

Michael Heerdegen

Aug 12, 2015, 11:29:40 AM

to help-gn...@gnu.org

Marcin Borkowski <mb...@mbork.pl> writes:

> Interestingly, there's a lot of buzz about Lisp /interpreter/ written
> in Lisp, but not so much about Lisp /reader/ written in Lisp. In
> fact, I didn't find one on the Internet.

Good question. Maybe it's because doing such things is mainly for
educational reasons, and when you want to learn how a language works,
studying the interpreter is more beneficial.

> What I found was Peter Norvig's tiny Lisp written in Python
> (http://norvig.com/lispy.html). His reader is quite simple, but there
> is an important difference: he reads all the tokens into a (Python)
> list, and then he can "peek" at the next token without "consuming" it.
> In my approach, this is not possible (well, it is of course possible,
> but moving the point back so that the same token will be read again is
> ugly).

What disadvantages do you fear your version could have?

On the page you cited, the flat list is only used as an intermediate
step to produce the syntax tree. There is not much more you could do
with it.
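
For comparison, the two-step approach could be sketched in Elisp roughly as follows. This is an adaptation of the idea only, not a translation of the code on that page; `my/tokenize' and `my/parse' are made-up names, and quoting characters are left out:

```elisp
(defun my/tokenize (string)
  "Split STRING into a flat list of token strings."
  ;; Pad every paren with spaces, then split on whitespace.
  (split-string (replace-regexp-in-string "[()]" " \\& " string)))

(defun my/parse (tokens)
  "Parse one expression from TOKENS.  Return (EXPR . REMAINING-TOKENS)."
  (let ((token (car tokens))
        (rest (cdr tokens)))
    (cond ((null tokens) (error "Unexpected end of input"))
          ((equal token ")") (error "Unexpected closing paren"))
          ((equal token "(")
           (let ((items '()))
             ;; "Peeking" is just looking at the car of the token list.
             (while (not (equal (car rest) ")"))
               (unless rest (error "Unexpected end of input"))
               (let ((result (my/parse rest)))
                 (push (car result) items)
                 (setq rest (cdr result))))
             (cons (nreverse items) (cdr rest))))
          ((string-match-p "\\`-?[0-9]+\\'" token)
           (cons (string-to-number token) rest))
          (t (cons (intern token) rest)))))

(my/tokenize "(* x (fac 1))")
;; ⇒ ("(" "*" "x" "(" "fac" "1" ")" ")")
(car (my/parse (my/tokenize "(* x (fac 1))")))
;; ⇒ (* x (fac 1))
```

The flat list exists only between the two calls, which is the intermediate structure a one-pass reader avoids building.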

And if you want to re-read any form in some buffer, putting point back
to its beginning is a fast operation.

> Now I'm wondering: is my approach (read one token at a time, but never
> go back, so that I can't really "peek" at the next one) reasonable?
> Maybe I should just read all tokens in a list? I do not like this
> approach very much. I could also set up a buffer, which would contain
> zero or one tokens to read, and put the already read token in that
> buffer in some cases (pretty much what TeX's \futurelet does. Now
> I appreciate why it's there...).

I really don't see in which way the Python example would have
advantages over yours. The only difference is that your version
combines the two steps that are separate in the Python example. Your
version is more efficient, since it avoids building a very long list
that is not really needed and will cause a lot of garbage collection to
be done afterwards.

Regards,

Michael.

Pascal J. Bourguignon

Aug 12, 2015, 12:30:54 PM

to

Michael Heerdegen <michael_...@web.de> writes:

> Marcin Borkowski <mb...@mbork.pl> writes:
>
>> Interestingly, there's a lot of buzz about Lisp /interpreter/ written
>> in Lisp, but not so much about Lisp /reader/ written in Lisp. In
>> fact, I didn't find one on the Internet.

Not looking good enough.

https://gitlab.com/com-informatimago/com-informatimago/tree/master/common-lisp/lisp-reader

and of course, there's one in each lisp implementation.

> Good question. Maybe it's because doing such things is mainly for
> educational reasons, and when you want to learn how a language works,
> studying the interpreter is more beneficial.

But also, it's assumed that by teaching the most complex subjects,
people will be able to deal with the less complex subjects by
themselves.

Sometimes indeed it looks like not.

>> Now I'm wondering: is my approach (read one token at a time, but never
>> go back, so that I can't really "peek" at the next one) reasonable?
>> Maybe I should just read all tokens in a list? I do not like this
>> approach very much. I could also set up a buffer, which would contain
>> zero or one tokens to read, and put the already read token in that
>> buffer in some cases (pretty much what TeX's \futurelet does. Now
>> I appreciate why it's there...).

Most languages are designed to be (= to have a grammar that is) LL(1);
there are also LR(0), SLR(1), LALR(1) languages, but as you can see, the
parameter is at most 1 in general. What this means is that the parser
can work by looking ahead at most 1 token. That is, it reads the
current token, and it may look at the next token, before deciding what
grammar rule to apply. Theoretically, we could design languages that
require a bigger look-ahead, but in practice it's not useful; in the
case where the grammar would require longer look-ahead, we often can
easily add some syntax (a prefix keyword) to make it back into LL(1) (or
LALR(1) if you're into that kind of grammar).

Why is it useful? Because it allows reading, scanning and parsing the
source code while leaving it in a file and loading only one or two tokens
into memory at once: it is basically an optimization dating from the 60s,
when parsers were being invented on computers without a lot of memory.

And then! Even the first FORTRAN compiler, the one in 63 passes,
actually kept the program source in memory (4 Kw), and instead loaded
the passes of the compiler in turn to process the data structures
of the program that remained in memory!

So indeed, there's very little reason to use short look-ahead, other than
that we have a well-developed body of theory for generating parsers
automatically from grammars of these forms.

So, reading the whole source file in memory (or actually, already having
it in memory, eg. in editor/compiler IDEs), is also a natural solution.

Also for some languages, the processing of the source is defined in
phases such that you easily end up having the whole sequence of tokens in
memory. For example, the C preprocessor (but that's another story).

Finally, packrat parsers, which can process grammars with unlimited
lookahead, can benefit from pre-loading the whole source in memory.

In any case, it's rather an immaterial question: on the one hand, you
have abstractions such as lazy streams that let you process sequences
(finite or infinite) as an I/O stream where you get each element in
turn, and on the other hand you can of course copy a finite stream back
into a sequence. Both abstractions can be useful and used to write
elegant algorithms. So it doesn't matter. Just have a pair of functions
to convert buffers into streams and streams into buffers and use
whichever you need for the current algorithm!

> I really don't get the point in which way the Python example would have
> advantages over yours. The only difference is that your version
> combines the two steps that are separate in the Python example. Your
> version is more efficient, since it avoids building a very long list
> that is not really needed and will cause a lot of garbage collection to
> be done afterwards.

Nowadays, sources, even those of a complete OS such as Android, are much
smaller than the available RAM. Therefore loading the whole file into RAM
and building an index of tokens into it will be more efficient than
performing O(n) I/O syscalls.

--
__Pascal Bourguignon__ http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk

Marcin Borkowski

Aug 23, 2015, 4:30:53 AM

to help-gn...@gnu.org

On 2015-08-12, at 18:30, Pascal J. Bourguignon <p...@informatimago.com> wrote:

> Michael Heerdegen <michael_...@web.de> writes:
>
>> Marcin Borkowski <mb...@mbork.pl> writes:
>>
>>> Interestingly, there's a lot of buzz about Lisp /interpreter/ written
>>> in Lisp, but not so much about Lisp /reader/ written in Lisp. In
>>> fact, I didn't find one on the Internet.
>
> Not looking good enough.
>
> https://gitlab.com/com-informatimago/com-informatimago/tree/master/common-lisp/lisp-reader

Thanks!

> and of course, there's one in each lisp implementation.

But often in C or something, not in Lisp.

>> Good question. Maybe it's because doing such things is mainly for
>> educational reasons, and when you want to learn how a language works,
>> studying the interpreter is more beneficial.
>
> But also, it's assumed that by teaching the most complex subjects,
> people will be able to deal with the less complex subjects by
> themselves.
>
> Sometimes indeed it looks like not.

Especially if one doesn't have a CS background, and is mostly
self-taught.

Also, it's not that I'm unable to deal with that; after a few
iterations, I usually succeed. My problem was not that I can't do it,
my problem was that I felt I was doing it suboptimally, and wanted to
see how smarter/more knowledgeable people deal with that.

>>> Now I'm wondering: is my approach (read one token at a time, but never
>>> go back, so that I can't really "peek" at the next one) reasonable?
>>> Maybe I should just read all tokens in a list? I do not like this
>>> approach very much. I could also set up a buffer, which would contain
>>> zero or one tokens to read, and put the already read token in that
>>> buffer in some cases (pretty much what TeX's \futurelet does. Now
>>> I appreciate why it's there...).
>
> Most languages are designed to be (= to have a grammar that is) LL(1);
> there are also LR(0), SLR(1), LALR(1) languages, but as you can see, the
> parameter is at most 1 in general. What this means is that the parser
> can work by looking ahead at most 1 token. That is, it reads the
> current token, and it may look at the next token, before deciding what
> grammar rule to apply. Theoretically, we could design languages that
> require a bigger look-ahead, but in practice it's not useful; in the
> case where the grammar would require longer look ahead, we often can
> easily add some syntax (a prefix keyword) to make it back into LL(1) (or
> LALR(1) if you're into that kind of grammar).

Now my lack of education is easily seen. I only heard about formal
grammars (well, I had one class about them - I mean, /one class/, 90
minutes, some 15 years ago).

> Why is it useful? Because it allows to read, scan and parse the source
> code by leaving it in a file and loading only one or two tokens in
> memory at once: it is basically an optimization for when you're
> inventing parsers on computers that don't have a lot of memory in the 60s.

And basically, this confirms my intuition that reading one token at
a time is not necessarily a stupid thing to do.

> And then! Even the first FORTRAN compiler, the one in 63 passes,
> actually kept the program source in memory (4 Kw), and instead loaded
> alternatively the passes of the compiler to process the data structures
> of the program that remained in memory!

Interesting!

> So indeed, there's very little reason to use short look-ahead, only that
> we have a theoretical body well developed to generate parsers
> automatically from grammars of these forms.

I see.

> So, reading the whole source file in memory (or actually, already having
> it in memory, eg. in editor/compiler IDEs), is also a natural solution.
>
> Also for some languages, the processing of the source is defined in
> phases such as you end up easily having the whole sequence of tokens in
> memory. For example, the C preprocessor (but that's another story).
>
> Finally, parser generators such as PACKRAT being able to process
> grammars with unlimited lookahead, can benefit from pre-loading the
> whole source in memory.

Thanks for sharing - as hinted above, I have a lot to learn!

> In any case, it's rather an immaterial question, since on one side, you
> have abstractions such as lazy streams that let you process sequences
> (finite or infinite) as an I/O stream where you get each element in
> sequence and of course, you can copy a finite stream back into a
> sequence. Both abstractions can be useful and used to write elegant
> algorithms. So it doesn't matter. Just have a pair of functions to
> convert buffers into streams and streams into buffer and use whichever
> you need for the current algorithm!

And most probably I'll end up coding an abstraction like this, with
a function for looking at the next token without “consuming” it, and
a function for “popping” the next token. Converting between buffers and
streams wouldn’t be very useful for me, since I would either lose the
whole text structure (line-breaks, comments), or have to do a lot of
work to actually preserve it.

>> I really don't get the point in which way the Python example would have
>> advantages over yours. The only difference is that your version
>> combines the two steps that are separate in the Python example. Your
>> version is more efficient, since it avoids building a very long list
>> that is not really needed and will cause a lot of garbage collection to
>> be done afterwards.
>
> Nowadays, sources, even of a complete OS such as Android, are much smaller
> than the available RAM. Therefore loading the whole file in RAM and
> building an index of tokens into it will be more efficient than
> performing O(n) I/O syscalls.

OTOH, here I walk an Emacs buffer and not an external file. Moreover,
as I said, I don’t want to lose info on where I am in the source.

Thanks!

Pascal J. Bourguignon

Aug 23, 2015, 12:46:10 PM8/23/15

Marcin Borkowski <mb...@mbork.pl> writes:

> On 2015-08-12, at 18:30, Pascal J. Bourguignon <p...@informatimago.com> wrote:
>
>> Michael Heerdegen <michael_...@web.de> writes:
>>
>>> Marcin Borkowski <mb...@mbork.pl> writes:
>>>
>>>> Interestingly, there's a lot of buzz about Lisp /interpreter/ written
>>>> in Lisp, but not so much about Lisp /reader/ written in Lisp. In
>>>> fact, I didn't find one on the Internet.
>>
>> Not looking hard enough.
>>
>> https://gitlab.com/com-informatimago/com-informatimago/tree/master/common-lisp/lisp-reader
>
> Thanks!
>
>> and of course, there's one in each lisp implementation.
>
> But often in C or something, not in Lisp.

Nope. Only clisp and ecl have a lisp reader written in C. All the
other implementations have it in lisp (or perhaps java).

> And most probably I'll end up coding an abstraction like this, with
> a function for looking at the next token without “consuming” it, and
> a function for “popping” the next token. Converting between buffers and
> streams wouldn’t be very useful for me, since I would either lose the
> whole text structure (line-breaks, comments), or have to do a lot of
> work to actually preserve it.

Not necessarily. Just add the required information to your token
structure. You can also intersperse pseudo tokens for line-breaks
and comments, but this makes the grammar hairier, since you have
to allow for them between any two other tokens; instead, you can just
filter them out before the actual parsing.
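
As a rough sketch of that idea: each token records its kind, text and
buffer positions, comments and line breaks are kept as pseudo tokens,
and a separate filter drops them before parsing. All names here
(my-token, my-tokenize-buffer, my-significant-tokens) and the four
token kinds are assumptions for illustration; strings and character
literals are not handled:

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)

;; Hypothetical token record.  KIND is one of the symbols
;; `newline', `comment', `punct' or `atom'.
(cl-defstruct my-token kind text start end)

(defun my-tokenize-buffer ()
  "Return the tokens of the current buffer as a list of `my-token' records.
Comments and line breaks are kept as pseudo tokens."
  (save-excursion
    (goto-char (point-min))
    (let (tokens)
      (while (progn (skip-chars-forward " \t") (not (eobp)))
        ;; Exactly one branch matches, and it consumes at least one character.
        (let ((kind (cond ((looking-at "\n") 'newline)
                          ((looking-at ";.*") 'comment)
                          ((looking-at ",@\\|[()'`,]") 'punct)
                          ((looking-at "[^ \t\n;()'`,]+") 'atom))))
          (push (make-my-token :kind kind
                               :text (match-string-no-properties 0)
                               :start (match-beginning 0)
                               :end (match-end 0))
                tokens)
          (goto-char (match-end 0))))
      (nreverse tokens))))

(defun my-significant-tokens (tokens)
  "Return TOKENS without the `comment' and `newline' pseudo tokens."
  (cl-remove-if (lambda (tok)
                  (memq (my-token-kind tok) '(comment newline)))
                tokens))
```

The parser would then consume the output of my-significant-tokens,
while the full list stays around for anything that needs the comments
and the layout.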

> OTOH, here I walk an Emacs buffer and not an external file. Moreover,
> as I said, I don’t want to lose info on where I am in the source.

In this case, you already have the whole source in memory…