SMIE examples or guides

Björn Lindqvist

Jun 18, 2016, 10:57:53 PM

to help-gn...@gnu.org

Hello emacs,

I'm trying to implement SMIE support for a language's major mode. So
I've been reading the documentation and looking at how SMIE is used in
octave-mode, but it is not easy to understand how it works. Do any
friendlier sources for learning how to use SMIE exist? Like smaller
examples, how-tos or SMIE guides?

--
mvh/best regards Björn Lindqvist

Stefan Monnier

Jun 19, 2016, 11:22:05 PM

to help-gn...@gnu.org

> I'm trying to implement SMIE support for a language's major mode. So
> I've been reading the documentation and looking at how SMIE is used in
> octave-mode, but it is not easy to understand how it works. Do any
> friendlier sources for learning how to use SMIE exist? Like smaller
> examples, how-tos or SMIE guides?

There are several modes using SMIE besides octave-mode. They don't tend
to be very small, sadly, because indentation is pretty much always
tricky business, so even if it starts small, it quickly grows.

I suggest you post what you've tried and the problems you encountered,
and someone (most likely me) will help you out. In return, hopefully
you can provide some suggestions for how to improve the doc.

Stefan

Björn Lindqvist

Jun 26, 2016, 8:23:36 AM

to Stefan Monnier, help-gn...@gnu.org

(Sorry I haven't had time to play with emacs for a while)

What I have is the following:

(smie-setup factor-smie-grammar #'factor-smie-rules)
(setq-local smie-indent-basic factor-block-offset)

I have pieced together a factor-smie-rules func by looking at
elixir-mode and octave-mode:

(defun factor-smie-rules (kind token)
  (pcase (cons kind token)
    (`(:elem . basic) 4)
    (`(:after . ,(or `"HELLO")) 4)))

The intent is that if a line contains the token HELLO, indent is
increased for subsequent lines by four. I'm also planning to add a BYE
token, pairing HELLO and decreasing indent by four. Something happens
when I type HELLO<tab> and emacs says (error "Please avoid it"). So my
rules function is triggered, which is good but the result isn't what I
want.

Stefan Monnier

Jun 27, 2016, 4:31:52 AM

to Björn Lindqvist, help-gn...@gnu.org

> when I type HELLO<tab> and emacs says (error "Please avoid it"). So my
> rules function is triggered, which is good but the result isn't what I
> want.

No, I think the issue is in the grammar. What's your `factor-smie-grammar'?

Stefan

Björn Lindqvist

Jun 27, 2016, 8:33:12 AM

to Stefan Monnier, help-gn...@gnu.org

Just an empty list:

(defconst factor-smie-grammar (list))

Stefan Monnier

Jun 27, 2016, 6:03:15 PM

to Björn Lindqvist, help-gn...@gnu.org

> Just an empty list:
> (defconst factor-smie-grammar (list))

No good. You need to define your grammar, otherwise SMIE doesn't have
anything to work with.

Stefan

Björn Lindqvist

Jun 28, 2016, 7:34:43 AM

to Stefan Monnier, help-gn...@gnu.org

Yes, I understand. But I'm completely at a loss as to how to do that, or
even how to get started. I have seen what rules octave-mode has, but I
don't know how to adapt that to my scenario (HELLO token begins
indentation, BYE token ends it).

Stefan Monnier

Jun 28, 2016, 5:42:24 PM

to Björn Lindqvist, help-gn...@gnu.org

> Yes, I understand. But I'm completely at a loss as to how to do that, or
> even how to get started. I have seen what rules octave-mode has, but I

Have you looked at its grammar? That would probably be a better start
for your grammar.

> don't know how to adapt that to my scenario (HELLO token begins
> indentation, BYE token ends it).

So it sounds like

(defconst factor-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((exp ("HELLO" exp "BYE"))))))

might be a good start.

Stefan

Björn Lindqvist

Jun 28, 2016, 6:30:52 PM

to Stefan Monnier, help-gn...@gnu.org

Thank you. That makes some indentation happen, but what I get is
(hoping gmail preserves leading whitespace):

first
HELLO
HELLO foo BYE
HELLO
bla
net
neat
BYE
text more
more
BYE

What I would like instead is:

first
HELLO
HELLO foo BYE
HELLO
bla
net
neat
BYE
text more
more
BYE
text

Stefan Monnier

Jun 29, 2016, 3:37:46 AM

to help-gn...@gnu.org

> HELLO
> HELLO foo BYE
> HELLO
> bla
> net
> neat
> BYE
> text more
> more
> BYE

That's expected: the third HELLO...BYE is indented as an argument of
the first. Similarly to:

(table->method)
(arg1,
arg2
)

[ Read the above, thinking that "(" is like HELLO and ")" is like
"BYE". ]

Similarly, "net" and "neat" are treated as arguments to "bla".

A quick fix to that part would be to add an indentation rule along the
lines of

(`(:elem . arg) 0)

or alternatively

(`(:list-intro . ,_) t)

Tho, if your language makes newlines significant (i.e. "bla\nnet" is not
equivalent to "bla net"), then you might be better off changing the
tokenizer (by providing appropriate :forward-token and :backward-token
arguments to `smie-setup') so as to return an actual token for every
newline encountered, after which you can add corresponding rules to
the grammar.

> HELLO
> bla
> net
> neat
> BYE

To get the BYE indented this way, a quick-fix could be to add a rule
like

(`(:before . "BYE") 4)
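Put together, the rules suggested so far might look like the sketch below. It is only an illustration: it assumes the toy HELLO/BYE grammar from earlier in the thread and a fixed offset of 4, and it has not been tuned against a real buffer.

```elisp
;; Sketch only: collects the indentation rules suggested in this thread.
;; The offset 4 and the tokens "HELLO"/"BYE" come from the toy example.
(defun factor-smie-rules (kind token)
  "Toy SMIE rules function for HELLO ... BYE blocks."
  (pcase (cons kind token)
    ;; Basic indentation step.
    (`(:elem . basic) 4)
    ;; Do not indent following words as arguments of the first one.
    (`(:elem . arg) 0)
    ;; Lines after HELLO are indented by one step.
    (`(:after . "HELLO") 4)
    ;; BYE itself stays at the inner indentation level.
    (`(:before . "BYE") 4)))
```

Any other (KIND . TOKEN) pair falls through and returns nil, which tells SMIE to use its default behavior.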

-- Stefan

Björn Lindqvist

Jun 29, 2016, 12:48:40 PM

to Stefan Monnier, help-gn...@gnu.org

I almost got it to work by experimenting with your suggestions. But it
appears to do some kind of automatic aligning I don't want:

HELLO one
two
three
BYE

That should instead have been:

HELLO one
two
three
BYE

My language is not newline-significant.

Stefan Monnier

Jun 30, 2016, 3:23:22 AM

to Björn Lindqvist, help-gn...@gnu.org

> I got it to work almost by experimenting with your suggestions. But it
> appears to do some kind of automatic aligning I don't want:
>
> HELLO one
> two
> three
> BYE

If one, two, and three are an arbitrary sequence of <...> bracketed by
HELLO and BYE, then the above looks like the proper indentation to me.

> That should instead have been:
>
> HELLO one
> two
> three
> BYE

The indentation here looks wrong to me unless "one" is special in the
sense that it's some kind of argument to HELLO and "two" and "three" are
an arbitrary sequence of <...> bracketed by "HELLO <something>" and "BYE".

Which is it? And if it's the latter, how do you distinguish the
separation between "one" and "two" (e.g. can the first line be something
like "HELLO one two" where the "one two" is the argument to HELLO, and
if so, how can you distinguish this case from the case where we have
"HELLO one" and the subsequent "two" is part of the inner sequence)?

Stefan

Björn Lindqvist

Jun 30, 2016, 7:27:47 AM

to Stefan Monnier, help-gn...@gnu.org

I'm not sure I understand. There are just two indentation-dependent
tokens in my language; HELLO which increases indentation by four for
subsequent lines and BYE which decreases indentation by four for
subsequent lines. So yes, one, two, three is an arbitrary sequence and
it could have been:

HELLO three one
four ten BYE
eleven HELLO
twelve
BYE

instead. This is what is required even if it doesn't look like proper
indentation. :) In the real mode I'm working on (HELLO/BYE is a toy
example) it makes more sense.

Stefan Monnier

Jun 30, 2016, 2:50:08 PM

to Björn Lindqvist, help-gn...@gnu.org

> HELLO three one
> four ten BYE
> eleven HELLO
> twelve
> BYE

This goes a bit against SMIE's default indentation principles, so it's
going to take more effort (IOW adding support for this kind of
indentation is still on the todo list).

Maybe something along the following lines would work:

(defun my-indent-foo ()
  (unless (looking-at "BYE\\_>")
    (save-excursion
      (let ((x nil))
        (while (progn (setq x (smie-backward-sexp))
                      (null (car-safe x))))
        (when (equal "HELLO" (nth 2 x))
          (goto-char (nth 1 x))
          (+ 4 (smie-indent-virtual)))))))

and add

(add-hook 'smie-indent-functions #'my-indent-foo nil t)

in your major mode function.
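As a rough illustration of where that call goes, a major mode function could be wired up as follows. The mode name `factor-toy-mode` is hypothetical, and the sketch assumes `factor-smie-grammar`, `factor-smie-rules` and `my-indent-foo` are already defined as shown in this thread.

```elisp
(require 'smie)

;; Hypothetical mode name. Assumes `factor-smie-grammar', `factor-smie-rules'
;; and `my-indent-foo' are already defined as shown earlier in the thread.
(define-derived-mode factor-toy-mode prog-mode "Factor-Toy"
  "Toy major mode for experimenting with HELLO ... BYE indentation."
  ;; Install the grammar and the rules function for this buffer.
  (smie-setup factor-smie-grammar #'factor-smie-rules)
  ;; The trailing t makes the hook addition buffer-local, so other
  ;; SMIE-based modes are not affected.
  (add-hook 'smie-indent-functions #'my-indent-foo nil t))
```

Since `add-hook` prepends by default, `my-indent-foo` should be tried before SMIE's built-in indentation functions, and those take over whenever it returns nil.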

Stefan

Björn Lindqvist

Jun 30, 2016, 8:49:41 PM

to Stefan Monnier, help-gn...@gnu.org

Many thanks! That appears to work. But there is one problem. In a
buffer with just this content:

[
\
]

I get the error: (error "Bumped into unknown token")

And another problem I'm having is trying to support multiple
indentation start tokens:

(defconst factor-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((exp (("HELLO" "HALLO" "CIAO" "SALUT") exp "BYE"))))))

My intent is to allow either of those four words to be used to start
the indented block. I only need one ending token so there will be no
alternative to BYE. Seems like it should be trivial to extend the BNF
grammar in this way, but I can't figure out what syntax
smie-bnf->prec2 expects for it.

Stefan Monnier

Jul 1, 2016, 3:13:32 AM

to Björn Lindqvist, help-gn...@gnu.org

> Many thanks! That appears to work. But there is one problem. In a
> buffer with just this content:

> [
> \
> ]

> I get the error: (error "Bumped into unknown token")

You should either give a different syntax to the \ char in the
syntax-table, or change the :forward-token and :backward-token functions
so they do something with \
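For the syntax-table route, a minimal sketch could look like this. The table name `factor-toy-syntax-table` is made up for illustration, and symbol-constituent syntax is just one plausible choice; the right class depends on what \ means in the actual language.

```elisp
;; Hypothetical table name; the real mode's syntax table may differ.
(defvar factor-toy-syntax-table
  (let ((table (make-syntax-table)))
    ;; Treat backslash as a symbol constituent instead of an escape
    ;; character, so the default tokenizer reads it as an ordinary token.
    (modify-syntax-entry ?\\ "_" table)
    table)
  "Syntax table sketch for the toy language.")
```

The table would then be installed in the major mode, for example with the `:syntax-table` keyword of `define-derived-mode`.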

> (smie-bnf->prec2
> '((exp (("HELLO" "HALLO" "CIAO" "SALUT") exp "BYE"))))))

One way to do that is:

(defconst factor-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((exp ("HELLO" exp "BYE")
           ("HALLO" exp "BYE")
           ("CIAO" exp "BYE")
           ("SALUT" exp "BYE"))))))

Another is to change the :forward-token and :backward-token functions so
they return the same string (e.g. "HELLO") when tokenizing any one of those.

Stefan

Björn Lindqvist

Jul 1, 2016, 3:28:11 PM

to Stefan Monnier, help-gn...@gnu.org

I managed to write the forward and backward tokenization functions:

(defun factor-smie-token (dir)
  (pcase dir
    ('forward (forward-comment (point-max)))
    ('backward (forward-comment (- (point)))))
  (let ((tok (buffer-substring-no-properties
              (point)
              (let ((syntax "w_\\\""))
                (pcase dir
                  ('forward (skip-syntax-forward syntax))
                  ('backward (skip-syntax-backward syntax)))
                (point)))))
    ;; Normalizes different indent starters.
    (cond ((string-match factor-smie-indents-regex tok) ":")
          (t tok))))

(defun factor-smie-forward-token ()
  (factor-smie-token 'forward))

(defun factor-smie-backward-token ()
  (factor-smie-token 'backward))

It works in 99% of the cases. But since almost any character can be
part of a token, it doesn't work perfectly. E.g., te]st(ab3 would be a
perfectly valid variable name. I tried changing the syntax to
"w_()\\\"" and that fixes the tokenization, but then I lose the useful
automatic indentation SMIE adds for opening and closing bracket
characters. E.g., I'm happy that SMIE indents:

[
neat
{
nice
}
(
good
)
]

It's very nice. But the following two lines are not right:

hi[there
two

They should be:

hi[there
two

Stefan Monnier

Jul 1, 2016, 7:09:36 PM

to Björn Lindqvist, help-gn...@gnu.org

> [
> neat
> {
> nice
> }
> (
> good
> )
> ]

You want to add rules like ("[" exp "]") to your grammar, then.
Or alternatively, refine your tokenizer so it returns "" when it finds
a lone "[" or "]", i.e. only accept syntax "()" when it's part of
a token, but not when it's the whole token.
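For the first option, the grammar might be extended roughly as below. This is a sketch under assumptions: it keeps "HELLO" as the block opener from the toy example (with a tokenizer that normalizes the openers to ":", that string would be used instead), and it assumes the tokenizer returns "[", "{" and "(" as tokens of their own.

```elisp
(require 'smie)

;; Sketch: the toy grammar plus explicit rules for the three bracket pairs.
;; Assumes the tokenizer hands the brackets to SMIE as separate tokens.
(defconst factor-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((exp ("HELLO" exp "BYE")
           ("[" exp "]")
           ("{" exp "}")
           ("(" exp ")"))))))
```

With such rules SMIE treats each bracket pair as an opener/closer just like HELLO ... BYE, so the nested indentation no longer depends on the characters having paren syntax in the syntax table.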

Stefan