False beginner: need for docs and advice


Jean-Baptiste Briaud

Jul 13, 2020, 4:33:37 PM
to SableCC
Hi all,

Intro.
I'm really pleased to get back to coding with SableCC.
I'm continuing my language project (stay tuned) and the lexer/parser is once again built with SableCC.
I think my first encounter was with SableCC 2.x, but I'm not sure.

Q1: SableCC 3 or 4?
Based on a 2019 answer found here, I plan to start with SableCC 3.7 rather than SableCC 4, which the community describes as not yet stable.
Is that the right choice, or should I start with SableCC 4?

I fully understand how hard it can be to give deadlines, but ... what is the status of SableCC 4 since 2019?

Q2. Doc.
Where can I find the grammar doc for SableCC 3?
Same question for documentation on states and how to use them.
I'm struggling to get back to what I was able to do 10 years ago.
I'm using the examples provided with the SableCC 3 distribution, but I find this approach quite hard.
On SableCC.org the thesis is stated to be the definitive guide; is that true for SableCC 3?
I remember reading it for the SableCC version I was using 10 years ago (not sure which one).

Of course, if the advice given in Q1 is to start with SableCC 4, this Q2 applies to SableCC 4 :-)

Thanks for the answer and big thanks for keeping SableCC alive.

JB.

Etienne Gagnon

Jul 13, 2020, 5:00:36 PM
to sab...@googlegroups.com

Hi Jean-Baptiste,


    
> Q1: SableCC 3 or 4?
> From a 2019 answer found here, I plan to start with SableCC 3.7 rather than SableCC 4 which is qualified as non stable by the community.
> Is it the right choice or should I start with SableCC 4 ?

I recommend using SableCC 3. It is stable and complete.

I've let students use SableCC 4 beta. In that context, its many limitations (no lexer states, no *+? operators, no tree transformations, etc.) weren't a problem. Their feedback and my own experience with SableCC 4 were quite useful to identify the good and the bad of various proposals. I really like the simplified syntax, but the benefits of the more complex lexer and parser engines weren't convincing given their collateral impacts. So, the project is going through a redesign to better achieve my objectives without adding unnecessary complexity.

> Q2. Doc.
> Where can I find the grammar doc for SableCC 3?
> Same question with a doc on states and how to use it.
> I'm struggling to get back with what I was able to do 10 years ago.
> I'm using examples provided with SableCC 3 distribution, but I find this approach quite hard.
> On SableCC.org the thesis is stated to be the definitive guide, is it true for SableCC 3 ?

Yes, my old thesis remains the definitive guide for SableCC 3.

SableCC 3 is completely syntax-compatible with the previous SableCC 2. Its main feature is the addition of tree transformations. These tree transformations are internally used by SableCC 3 to automatically resolve grammar conflicts. Many users benefit from this without even being aware of it. SableCC 3 also supports explicit tree transformation directives, but few users take advantage of that. There's a link on the documentation page to a CST->AST tutorial.

Cheers,

Etienne

Etienne Gagnon, Ph.D.
http://sablecc.org

Jean-Baptiste Briaud

Jul 13, 2020, 6:16:22 PM
to SableCC
Thank you Etienne for your complete answer.
I found your work on SableCC 3 and Java: a Java 1.7 grammar defined in SableCC 3.
This real-life example was really rich and got me unstuck after I reread the thesis.
By the way, it took nearly 2h to compile on my machine :-o

I really like SableCC's design, with the visitor and the fact that the sablecc file contains no actions, only grammar.
That really rocks!

I'm not sure I fully understand states, and I didn't find any states in that Java 1.7 SableCC grammar.
My understanding is that states can disambiguate parts of the Productions section, based on tokens that only make sense in some contexts (in some productions only).
Not sure whether this understanding is merely academic :-)
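
To make the question concrete, here is a rough toy in Java of what I take a lexer state to be. This is hand-written illustration code, not what SableCC generates: the class name, the two states and the token names are all made up. The point is only that the same characters come out as different tokens depending on the current state, and that some tokens switch the state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy lexer, NOT SableCC-generated code.
final class StateLexerSketch {
    // Two made-up lexer states: ordinary code, and inside a "..." string.
    enum State { NORMAL, STRING }

    // Returns token descriptions such as "ID(abc)" or "TEXT(abc def)".
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        State state = State.NORMAL;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (state == State.NORMAL) {
                if (c == '"') {
                    tokens.add("QUOTE");
                    state = State.STRING;      // opening quote: NORMAL -> STRING
                    i++;
                } else if (Character.isLetter(c)) {
                    int start = i;
                    while (i < input.length() && Character.isLetter(input.charAt(i))) {
                        i++;
                    }
                    tokens.add("ID(" + input.substring(start, i) + ")");
                } else {
                    i++;                       // this toy skips everything else
                }
            } else {
                if (c == '"') {
                    tokens.add("QUOTE");
                    state = State.NORMAL;      // closing quote: STRING -> NORMAL
                    i++;
                } else {
                    int start = i;
                    while (i < input.length() && input.charAt(i) != '"') {
                        i++;
                    }
                    // Letters and blanks are plain text here, not identifiers.
                    tokens.add("TEXT(" + input.substring(start, i) + ")");
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("abc \"abc def\" xyz"));
    }
}
```

In this toy, `abc` outside the quotes becomes an identifier, while `abc def` inside the quotes becomes a single text token. If that is the idea, then states act in the lexer, before any production is involved.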

So far, I haven't needed states for my grammar.


Something I'd find useful in a future iteration (SableCC 4?) is better handling of ignored tokens.
I know it sounds like a contradiction, but ...
Ignored tokens are ignored in the sense that they are not parsed. This is fine.
On the other hand, the ignored tokens do exist in the source file, and it could be useful to visit them even though the parser ignores them.
Ignored tokens are defined in the SableCC grammar so the lexer knows which tokens not to pass to the parser.

Take JavaDoc as an example. Yes, the content of a JavaDoc comment is not used in the generated .class file, but javac could use it to produce documentation alongside the .class file.
In the JDK, the JavaDoc generator and javac are separate programs, but that is not the case in my language.

In my language, there are several types of comments, and some are used by the compiler.
The language developer can write anything in such a comment, so having the parser ignore it is fine, but it could still be lexed so that the visitor is aware of it.
Methods inXXX, outXXX and caseXXX (caseMeaningfulComment, for example) could be called with the unparsed raw String as a parameter.
With this, the compiler could do something with the raw string of an ignored token. If a language has no interest in it, it simply doesn't override the method.
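
Here is a sketch of what such a hook might look like. None of this exists in SableCC today: `CommentAwareVisitor`, `caseMeaningfulComment`, `DocCollector` and `walk` are hypothetical names, and `walk` only stands in for whatever a generated tree walker would do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposal; not an existing SableCC API.
final class IgnoredTokenHookSketch {
    // Hypothetical visitor base class. The default is a no-op, so a language
    // with no interest in these comments simply does not override it.
    static class CommentAwareVisitor {
        void caseMeaningfulComment(String rawText) {
            // intentionally empty
        }
    }

    // A compiler pass that does care: it collects the raw comment text.
    static final class DocCollector extends CommentAwareVisitor {
        final List<String> docs = new ArrayList<>();

        @Override
        void caseMeaningfulComment(String rawText) {
            docs.add(rawText);
        }
    }

    // Stand-in for a generated walker: passes the raw text of each
    // ignored token to the visitor, in source order.
    static void walk(List<String> ignoredTokenTexts, CommentAwareVisitor visitor) {
        for (String raw : ignoredTokenTexts) {
            visitor.caseMeaningfulComment(raw);
        }
    }
}
```

A pass that extends `CommentAwareVisitor` without overriding the method sees nothing, which matches the "no override" case above.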

It may open the door to composite languages. Imagine such a comment has more structure than what is defined in the first SableCC grammar.
Tagged as an ignored token in the first grammar, this raw string could be lexed, parsed and visited using a second SableCC grammar.
Example: a JavaScript string inside a Java program, handled with 2 sablecc files instead of a very complex global grammar that would include both Java and JavaScript.


What do you think ?

JB.

Etienne Gagnon

Jul 14, 2020, 10:11:22 AM
to sab...@googlegroups.com
Hi Jean-Baptiste,

Thanks for your suggestions.

Note that the parser instance generated by SableCC 3 stores ignored
tokens which precede a useful token into a list which can be retrieved
using "((List<Node>) parser.ignoredTokens.getIn(token))". It isn't
beautiful, but it works if you need it.
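
A sketch of how that lookup might be wrapped in a helper. To keep it self-contained, `Token` and `InTable` below are small stand-ins for the generated token classes and for the getIn/setIn pair on parser.ignoredTokens, and `docCommentsBefore` is a made-up helper name. With a real generated parser you would pass parser.ignoredTokens and a token taken from the tree instead. The sketch assumes getIn returns null when no ignored token preceded the given token, so it checks for that.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Self-contained sketch; Token and InTable are stand-ins, not generated code.
final class IgnoredTokensSketch {
    // Stand-in for a generated token class.
    static final class Token {
        final String text;

        Token(String text) {
            this.text = text;
        }
    }

    // Stand-in for the getIn/setIn table behind parser.ignoredTokens.
    static final class InTable {
        private final Map<Token, Object> in = new IdentityHashMap<>();

        void setIn(Token token, Object value) {
            in.put(token, value);
        }

        Object getIn(Token token) {
            return in.get(token); // null if nothing was stored for this token
        }
    }

    // Returns the text of every "/**" comment ignored just before the token.
    static List<String> docCommentsBefore(InTable ignoredTokens, Token token) {
        @SuppressWarnings("unchecked")
        List<Token> ignored = (List<Token>) ignoredTokens.getIn(token);
        List<String> result = new ArrayList<>();
        if (ignored == null) {
            return result; // no ignored tokens preceded this token
        }
        for (Token t : ignored) {
            if (t.text.startsWith("/**")) {
                result.add(t.text);
            }
        }
        return result;
    }
}
```

The null check matters because most tokens are not preceded by a comment, and blanks usually share the same list, hence the filter on the token text.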

Have fun!