Using Antlr for developing a template engine


Burak Emre Kabakcı

May 7, 2018, 9:50:30 PM
to antlr-discussion
I'm trying to write a simple template engine, but I can't figure out how ANTLR can be used to parse arbitrary strings. ANTLR seems like a good choice if you have a syntax, but our template engine doesn't really have one. It has the following rule set:

The template engine will have only two constructs: {variableName} and [expression](fallback). {variableName} is replaced with the value of the variable. For [expression](fallback), if any variable in the expression is null, the whole block is replaced by the fallback expression.
Here are some examples:
1. select date_trunc({segment}, _time), count(*) from table where _time between {date.start} and {date.end}
  * The variables segment and date will be replaced; if any of their values is null, we won't render and will throw an exception instead.
2. select count(*) from table where [product_id = {product}](true)
  * If the value is not set for the variable `product` the output will be:
       * select count(*) from table where true
  * If the value is set:
       * select count(*) from table where product_id = VALUE
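To make the two rules concrete, here is a minimal Python sketch of the rendering semantics described above. It does not use ANTLR; it relies on regular expressions and assumes that blocks are not nested and that neither the expression nor the fallback contains brackets or parentheses. The names render, fill, lookup, and MissingVariable are hypothetical, and values are inserted verbatim with no SQL quoting or escaping.

```python
import re


class MissingVariable(Exception):
    """Raised when a {variable} has no value (hypothetical name)."""


# {name} or {dotted.name}
VAR = re.compile(r"\{([A-Za-z_][\w.]*)\}")
# Either [expression](fallback) or a bare {name}; groups 1 and 2 are only
# set for the block form. Assumes no nested brackets or parentheses.
TAG = re.compile(r"\[([^\[\]]*)\]\(([^()]*)\)|\{[A-Za-z_][\w.]*\}")


def lookup(name, values):
    """Resolve a dotted name such as date.start in nested dicts."""
    current = values
    for part in name.split("."):
        if not isinstance(current, dict) or current.get(part) is None:
            return None
        current = current[part]
    return current


def fill(text, values):
    """Replace every {name} in text; raise if any value is missing."""
    def repl(match):
        value = lookup(match.group(1), values)
        if value is None:
            raise MissingVariable(match.group(1))
        return str(value)
    return VAR.sub(repl, text)


def render(template, values):
    """Render the template in one pass; text outside tags is untouched."""
    def repl(match):
        if match.group(1) is None:
            # Bare {variable}: a missing value is an error.
            return fill(match.group(0), values)
        try:
            return fill(match.group(1), values)
        except MissingVariable:
            # Any null variable in the expression selects the fallback.
            return fill(match.group(2), values)
    return TAG.sub(repl, template)
```

With this sketch, render("select count(*) from table where [product_id = {product}](true)", {}) gives "select count(*) from table where true", and passing {"product": 42} gives "select count(*) from table where product_id = 42". A bare variable with no value raises MissingVariable.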

Please note that we won't really try to parse the whole SQL syntax; we just replace the values in our script. Is ANTLR a good choice for writing such a template engine? I took a look at the example grammars but couldn't find a good one to start with, so I'm a bit stuck here. A few keywords for me to search for would be really helpful.

Azaad

May 7, 2018, 11:14:39 PM
to antlr-di...@googlegroups.com
Oh, I see what you're trying to do. It's kind of like HTML templating on an HTTP server with tools like Jinja or even PHP. I'm assuming that you want to output an SQL string, and I'd say that ANTLR would make something like this very easy to do. Even if you aren't doing much SQL parsing, you're going to have to specify SQL syntax for your lexer; otherwise, you're going to get some errors. Ultimately, you want your grammar to be able to recognize SQL tokens. Take this grammar I wrote as an example:

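grammar Example;

// Parser rules. EOF forces the parser to consume the whole input.
input : (sqlToken | variable)+ EOF ;

sqlToken : SQL_SELECT_TOKEN | SQL_WHERE_TOKEN | SQL_FROM_TOKEN | SQL_STAR_TOKEN | EQUALS | SEMICOLON | ID ;
variable : '{' ID '}' ;

// Lexer rules. Keywords come before ID: when two rules match the same
// text, ANTLR picks the one defined first.
SQL_SELECT_TOKEN : 'select' ;
SQL_WHERE_TOKEN : 'where' ;
SQL_FROM_TOKEN : 'from' ;
SQL_STAR_TOKEN : '*' ;
EQUALS : '=' ;
SEMICOLON : ';' ;
ID : [a-zA-Z][a-zA-Z0-9]* ;   // '*' rather than '+' so single-letter names match
WS : [ \t\r\n]+ -> skip ;     // skip all whitespace, not only newlines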

Let's feed this grammar the following string:

select * from someTable where id={someVariable};

It will produce a parse tree with one sqlToken node per SQL token and a single variable node for {someVariable}.

Now, using the visitor/listener pattern, we can replace the variable node with whatever we want and output the appropriate SQL. If you want to support more complicated SQL, you will have to specify more SQL lexer rules so that your parser can recognize the tokens. Oh, and remind me again how your template engine didn't have a syntax? ;)

Burak Emre Kabakcı

May 8, 2018, 7:47:33 AM
to antlr-discussion
Hey Azaad,

Thanks for the detailed answer. Unfortunately, the SQL syntax is much more complex than that. Please see this Presto SQL parser here: https://github.com/prestodb/presto/blob/master/presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4

The problem is that there are multiple SQL dialects and I want to be able to support them (PostgreSQL, Presto, BigQuery, etc.). We can't even write a single parser that covers all of them because I'm sure there would be some conflicts. That's why I need a template engine.

What I mean by not having a syntax is that it will just compile, render, and replace the tags {} and []() and won't do anything with a token that is not inside our tags. The template engine should therefore ignore everything outside our tags.
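As a rough illustration of "ignore everything that is not inside our tags", the Python sketch below splits a template into opaque text segments and tag segments without looking at the SQL at all. It is regex-based rather than ANTLR-based, the function name segments is hypothetical, and it assumes blocks are not nested. In ANTLR terms, the equivalent idea is usually called an island grammar: a catch-all text token plus tokens for the tags.

```python
import re

# Either [expression](fallback) or {name}; everything else is plain text.
# Assumes no nested brackets or parentheses inside a block.
TAG = re.compile(r"\[([^\[\]]*)\]\(([^()]*)\)|\{([A-Za-z_][\w.]*)\}")


def segments(template):
    """Split a template into ("text", ...), ("var", ...), ("block", ...)."""
    out, pos = [], 0
    for match in TAG.finditer(template):
        if match.start() > pos:
            # Dialect-specific SQL between tags is kept verbatim.
            out.append(("text", template[pos:match.start()]))
        expression, fallback, name = match.groups()
        if name is not None:
            out.append(("var", name))
        else:
            out.append(("block", expression, fallback))
        pos = match.end()
    if pos < len(template):
        out.append(("text", template[pos:]))
    return out
```

Because the SQL between tags is never tokenized, dialect-specific syntax such as backticks or :: casts passes through as a single text segment.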

Best,
Emre

Mike Lischke

May 8, 2018, 11:46:03 AM
to antlr-discussion
>
> The problem is that there are multiple SQL dialects and I want to be able to support them (PostgreSQL, Presto, BigQuery, etc.). We can't even write a single parser that covers all of them because I'm sure there would be some conflicts. That's why I need a template engine.

Or you just create multiple parsers, one for each language. That gives you the most flexibility with the least effort.

Mike
--
www.soft-gems.net

Burak Emre Kabakcı

May 8, 2018, 11:56:07 AM
to antlr-di...@googlegroups.com
That’s not actually that easy. :) They change the dialect over time, and maintaining multiple ANTLR grammars of at least a few hundred lines each is not something we can afford, since we don’t have a dedicated team member for it.

Susan Jolly

May 8, 2018, 12:13:28 PM
to antlr-di...@googlegroups.com
Have you considered using different lexer modes? Each mode uses different token names, so the parser won't get mixed up as long as it has rules to handle all the possible tokens. Secs. 4.5 and 12.4 in the ANTLR 4 reference give some good examples. Sec. 11.1 has an example of how to deal with different dialects.

Mike Lischke

May 8, 2018, 12:24:55 PM
to antlr-di...@googlegroups.com

> That’s not actually that easy. :) They change the dialect over time, and maintaining multiple ANTLR grammars of at least a few hundred lines each is not something we can afford, since we don’t have a dedicated team member for it.

You can solve language variations using semantic predicates. Look at this MySQL grammar: https://github.com/mysql/mysql-workbench/blob/8.0/library/parsers/grammars/MySQLParser.g4#L128, where I use predicates to enable/disable language parts depending on a server version. Works very well.

