Any complete example for Javascript out there?


Sonny Rajagopalan

Apr 24, 2018, 1:14:43 PM
to antlr-discussion
I am trying to learn how to use ANTLR4 with JavaScript, and I am comfortable with lexer/parser generation etc.

I am, however, stumbling on how to use the generated code for my grammar. Specifically, I would like to just walk the parse tree and pick out specific tokens. Here's an example to clarify.

I have the following grammar:

~/Simple $ more Simple.g4

grammar Simple ;

r             : left_paren function_expr op variable right_paren ;
function_expr : function_name left_paren paramList right_paren ;
paramList     : (param (',' param)*)* ;
param         : param_name | function_expr ;
param_name    : TOKEN_NAME ;
token_name    : TOKEN_NAME ;
function_name : TOKEN_NAME ;
left_paren    : LEFT_PAREN ;
right_paren   : RIGHT_PAREN ;
variable      : TOKEN_NAME ;
op            : OP ;

LEFT_PAREN : [(] ;
RIGHT_PAREN : [)] ;
TOKEN_NAME : [A-Za-z]+[A-Za-z0-9_]* ;
VARIABLE : [A-Za-z]+ ;
OP : '==' | '>=' | '<=' | '!=' | '&&' | '||' | '~' ;
WS  : [ \t\r\n]+ -> skip ;

I generate the lexer and parser for it:

~/Simple $ antlr4 -o . -listener -visitor Simple.g4

I compile the lexer and parser:

~/Simple $ javac *.java

And then I test it:

~/Simple $ grun Simple r -tree
(simple () != simpler)
(r (left_paren () (function_expr (function_name simple) (left_paren () paramList (right_paren ))) (op !=) (variable simpler) (right_paren )))

What I would like to do is process the stream of tokens/lexemes from left to right in JavaScript. I think this is more of a JavaScript question, but I'd appreciate your help with this!

Here is what I have so far:

var antlr4 = require('antlr4/index');
var SimpleLexer = require ('./SimpleLexer');
var SimpleParser = require ('./SimpleParser');
var SimpleListener = require('./SimpleListener');

var SimplePrinter = function () {
    SimpleListener.SimpleListener.call (this);
    return this;
}
SimplePrinter.prototype = Object.create (SimpleListener.SimpleListener.prototype);
SimplePrinter.prototype.constructor = SimplePrinter;
SimplePrinter.prototype.exitR = function (ctx) {
    console.log ("A Simple RULE. AWESOME!!");
    console.log (ctx);
};

var input = process.argv[2];

var chars = new antlr4.InputStream(input);
var lexer = new SimpleLexer.SimpleLexer(chars);
var tokens  = new antlr4.CommonTokenStream(lexer);
var parser = new SimpleParser.SimpleParser(tokens);
parser.buildParseTrees = true;
var tree = parser.r (); // the start rule in Simple.g4 is the lowercase "r"
var printer = new SimplePrinter ();
antlr4.tree.ParseTreeWalker.DEFAULT.walk (printer, tree); // Produces a giant object which I am not sure how to unwrap


How do I actually get the lexer's output as a list, and how do I get at the output of the parser?

Any help is appreciated.

Thanks,
Sonny.

Sonny Rajagopalan

Apr 24, 2018, 8:47:13 PM
to antlr-discussion
Well, I figured this out thanks to an obscure link on the web, and I *really* wish this were better documented in https://github.com/antlr/antlr4/blob/4.6/doc/javascript-target.md where I expected to find it.

Specifically, two salient points:

(a) First, in your main program (the application that uses the parser), you have to override the exit function on the listener prototype for each rule you care about. The original no-op exit functions all appear in the generated *Listener.js, so grep for them to see which ones you can override, and
(b) In the overridden function, ctx.getText () gives you the text that was parsed. If you need to know what other methods are available on the ctx object, you have to look them up in the Java docs [sic], which is where the JavaScript method names are to be found.

Here's an example (note the various exit methods for Simple.g4, this one is exitParam):

SimplePrinter.prototype.exitParam = function (ctx) {
    console.log ("Param name is " + ctx.getText ());
};

Of course, instead of printing them, you can do various other things with them now that they are parsed. Now, when the line 

antlr4.tree.ParseTreeWalker.DEFAULT.walk (printer, tree);

is executed, each lexeme is printed as it is parsed (or processed in whatever way your overridden exit functions specify).
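
For anyone who wants to see the mechanism without generating a parser first, here is a minimal sketch of the same pattern that collects values into lists instead of printing them. It does not use the antlr4 runtime: RuleNode, TokenNode, BaseListener, walk and Collector are all hypothetical stand-ins written for this illustration, and the tree is built by hand to resemble what the function_expr rule might produce for the input f(a, b). The walker here is a simplified imitation of what a depth-first tree walk does, not the runtime's actual code.

```javascript
// Hypothetical stand-ins for parse tree nodes; NOT the antlr4 runtime.
function RuleNode(ruleName, children) {
    this.ruleName = ruleName;
    this.children = children;
}
// Imitates the idea behind ctx.getText(): join the text of all leaves below.
RuleNode.prototype.getText = function () {
    return this.children.map(function (c) { return c.getText(); }).join("");
};

function TokenNode(text) { this.text = text; }
TokenNode.prototype.getText = function () { return this.text; };

// Stand-in for a generated listener: every callback is a no-op by default.
function BaseListener() {}
BaseListener.prototype.visitTerminal = function (node) {};
BaseListener.prototype.exitParam = function (ctx) {};

// Simplified depth-first walk: leaves trigger visitTerminal, and a rule's
// exit callback fires only after all of its children have been walked.
function walk(listener, node) {
    if (node instanceof TokenNode) {
        listener.visitTerminal(node);
        return;
    }
    node.children.forEach(function (child) { walk(listener, child); });
    var hook = "exit" + node.ruleName.charAt(0).toUpperCase() + node.ruleName.slice(1);
    if (typeof listener[hook] === "function") {
        listener[hook](node);
    }
}

// Same prototype-override pattern as SimplePrinter, but storing results.
function Collector() {
    BaseListener.call(this);
    this.tokens = [];  // every leaf, left to right
    this.params = [];  // text of every param rule
}
Collector.prototype = Object.create(BaseListener.prototype);
Collector.prototype.constructor = Collector;
Collector.prototype.visitTerminal = function (node) {
    this.tokens.push(node.getText());
};
Collector.prototype.exitParam = function (ctx) {
    this.params.push(ctx.getText());
};

// Hand-built tree, assumed to resemble function_expr for the input f(a, b).
var tree = new RuleNode("function_expr", [
    new RuleNode("function_name", [new TokenNode("f")]),
    new RuleNode("left_paren", [new TokenNode("(")]),
    new RuleNode("paramList", [
        new RuleNode("param", [new RuleNode("param_name", [new TokenNode("a")])]),
        new TokenNode(","),
        new RuleNode("param", [new RuleNode("param_name", [new TokenNode("b")])])
    ]),
    new RuleNode("right_paren", [new TokenNode(")")])
]);

var collector = new Collector();
walk(collector, tree);
console.log(collector.tokens);
console.log(collector.params);
```

With the real generated code, the equivalent would presumably be a listener derived from SimpleListener.SimpleListener, passed to antlr4.tree.ParseTreeWalker.DEFAULT.walk as above, with the lists read off the listener object once the walk returns.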

I might either propose a pull request for antlr4 on GitHub or write a blog post, as this is likely to be useful for others.