Tips for using Rhino as JavaScript Parser?

Glenn Boysko

Mar 13, 2008, 2:17:52 PM

Hello:

I've been using Rhino for some time as a parser for JavaScript. The
lack of a complete JavaScript grammar for ANTLR makes Rhino a good
option for me. I'm using it for things like specialized documentation
tools (JavaDoc-style docs for JavaScript classes) and other code
analysis.

I'm sure that you get a lot of these types of questions. What do you
recommend?

For example, I've been using 1.5 and 1.6 in the past and found no node
visitor class. Does one exist in 1.7? There doesn't seem to be a way
to search for a node (by type) in 1.6 either.

If you have any suggestions for writing JavaScript parsers based on
Rhino, I'd really appreciate it.

Thanks,
Glenn

Norris Boyd

Mar 13, 2008, 3:53:01 PM

Rhino 1.7 doesn't have any more support for what you want than 1.5 or
1.6. It is possible to write code that traverses over the intermediate
representation. See NodeTransformer.java for an example.
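
As a rough illustration of that kind of traversal, here is a minimal sketch that collects every node of a given type. It is not Rhino code: IrNode, IrSearch and the token constants are hypothetical stand-ins so that the example compiles on its own. Rhino's own Node class exposes similar first-child/next-sibling accessors (getFirstChild(), getNext(), getType()), so the same recursion should carry over, but treat that mapping as an assumption to check against your Rhino version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a first-child/next-sibling IR node; NOT Rhino's Node class.
final class IrNode {
    final int type;     // token type of this node
    IrNode firstChild;  // leftmost child, or null
    IrNode next;        // next sibling, or null

    IrNode(int type, IrNode... children) {
        this.type = type;
        IrNode prev = null;
        for (IrNode child : children) {
            if (prev == null) {
                firstChild = child;
            } else {
                prev.next = child;
            }
            prev = child;
        }
    }
}

final class IrSearch {
    // Illustrative token types only; real values would come from Rhino's Token class.
    static final int SCRIPT = 1, CALL = 2, NEW = 3, NAME = 4;

    // Returns all nodes of the requested type, in pre-order (parent before children).
    static List<IrNode> findByType(IrNode root, int type) {
        List<IrNode> hits = new ArrayList<IrNode>();
        collect(root, type, hits);
        return hits;
    }

    private static void collect(IrNode node, int type, List<IrNode> hits) {
        if (node == null) {
            return;
        }
        if (node.type == type) {
            hits.add(node);
        }
        // Walk the sibling chain of children, recursing into each subtree.
        for (IrNode child = node.firstChild; child != null; child = child.next) {
            collect(child, type, hits);
        }
    }
}
```

Calling IrSearch.findByType(root, IrSearch.NEW) on a tree built from these stand-in nodes returns every NEW node, which is the "search for a node by type" operation asked about at the top of the thread.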

This is a commonly-requested feature, however, and we're looking hard
at implementing more rational and easy-to-use abstract syntax trees
for 1.7R2.

--N

Glenn Boysko

Mar 13, 2008, 4:31:22 PM

Norris:

Thanks so much for the reply. I was not aware of NodeTransformer. I'm
scanning code for all instantiations so this works perfectly. Just
overriding visitNew is what I needed.

Glad to hear that you are keeping this in mind.

Regards,
Glenn

jbar...@gmail.com

Mar 14, 2008, 2:40:16 PM

On Mar 13, 2:17 pm, Glenn Boysko <gboy...@gmail.com> wrote:

Hello Glenn,

Just yesterday I realized I need to parse JS in a project -- to
generate GraphViz files, for example. Since you are doing it already,
could you point to any howto or documentation that the
compiler-challenged among us can use? Anything to avoid? Any of your
experience would be really useful.

Thanks!
Jaime

Glenn Boysko

Mar 14, 2008, 4:23:58 PM

Here's what I've done (and members of the group may add more insights/
better ways if they know of them).

First, here's a method I use to parse a JavaScript file into a
ScriptOrFnNode (which represents the root of the file contents):

    private ScriptOrFnNode parseJavascript(File file) throws IOException {
        // Try to open a reader to the file supplied...
        Reader reader = new FileReader(file);
        try {
            // Set up the compiler environment, error reporter...
            CompilerEnvirons compilerEnv = new CompilerEnvirons();
            ErrorReporter errorReporter = compilerEnv.getErrorReporter();

            // Create an instance of the parser...
            Parser parser = new Parser(compilerEnv, errorReporter);

            // Prefer the canonical path as the source name...
            String sourceURI;
            try {
                sourceURI = file.getCanonicalPath();
            } catch (IOException e) {
                sourceURI = file.toString();
            }

            // Try to parse the reader (line numbers start at 1)...
            ScriptOrFnNode scriptOrFnNode = parser.parse(reader, sourceURI, 1);
            return scriptOrFnNode;
        } finally {
            // Release the file handle even if parsing throws.
            reader.close();
        }
    }

This top-level object represents the root "node" in this file. It
extends the org.mozilla.javascript.Node class.

From there, you have a rough syntax tree that you can use for your
analysis of the JavaScript.
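
For the GraphViz use case raised earlier in the thread, one possible analysis is to walk such a tree and emit DOT text. The following is only a sketch under stated assumptions: TreeNode and DotWriter are hypothetical names, and TreeNode is a self-contained stand-in rather than Rhino's Node class, so you would need to adapt the traversal to the real node accessors.

```java
// Hypothetical stand-in for an IR node with a printable label; NOT Rhino's Node class.
final class TreeNode {
    final String label;
    TreeNode firstChild;  // leftmost child, or null
    TreeNode next;        // next sibling, or null

    TreeNode(String label, TreeNode... children) {
        this.label = label;
        TreeNode prev = null;
        for (TreeNode child : children) {
            if (prev == null) {
                firstChild = child;
            } else {
                prev.next = child;
            }
            prev = child;
        }
    }
}

final class DotWriter {
    // Renders the tree rooted at a non-null node as a GraphViz digraph.
    static String toDot(TreeNode root) {
        StringBuilder out = new StringBuilder("digraph ir {\n");
        emit(root, new int[] {0}, out);
        return out.append("}\n").toString();
    }

    // Writes one line for this node and one edge per child; returns this node's id.
    private static int emit(TreeNode node, int[] counter, StringBuilder out) {
        int id = counter[0]++;
        String safeLabel = node.label.replace("\"", "\\\"");  // escape embedded quotes
        out.append("  n").append(id)
           .append(" [label=\"").append(safeLabel).append("\"];\n");
        for (TreeNode child = node.firstChild; child != null; child = child.next) {
            int childId = emit(child, counter, out);
            out.append("  n").append(id).append(" -> n").append(childId).append(";\n");
        }
        return id;
    }
}
```

Nodes are numbered in pre-order, so a SCRIPT node with a single NEW child becomes n0 and n1 joined by the edge "n0 -> n1;". The resulting string can be saved to a file and rendered with the dot tool.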

To get a textual representation of this syntax tree, you would call
the toStringTree method (and pass in the scriptOrFnNode as its
parameter).

Note that in the standard distribution, this returns null. To change
that, I had to change a constant in Token.java:

    public static final boolean printTrees = true; // was false

Not sure if there are easier ways to do what I have done.

Regards,
Glenn

Norris Boyd

Mar 14, 2008, 8:31:08 PM

"printTrees" is intended only for debugging, but is a good way to see
what the IR is like.

--N

zarat...@gmail.com

May 13, 2008, 11:54:32 AM

Hi guys,

I am rather a newbie with Rhino (as well as with Java ;) ), and when
trying to use this nice piece of code above I always get
scriptOrFnNode == null returned (I switched 'printTrees' to true).

The code I use:

    File file = new File("/user/public_html/test.js");

    if (file.exists()) {
        try {
            script = parseJavascript(file);
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        System.out.println(script.toStringTree(script));
    }
    else
        System.out.println("File does not exist ");

It looks as if the parser did nothing; it generates no warning/error
output on stderr either. Should I play with the context for
CompilerEnvirons? How do I set the context so that it neither compiles
nor executes the script (all I need is syntax analysis of the code)?

Thanks in advance & regards,

Szafran

Attila Szegedi

May 14, 2008, 3:58:16 AM

to zarat...@gmail.com, dev-tech-js-...@lists.mozilla.org

There seems to be an initFromContext() call missing...

Here's what I do:

    String s = ...; // "s" contains script source code

    Context ctx = contextFactory.enterContext();
    try
    {
        CompilerEnvirons compilerEnv = new CompilerEnvirons();
        compilerEnv.initFromContext(ctx);
        ErrorReporter compilationErrorReporter =
            compilerEnv.getErrorReporter();
        Parser p = new Parser(compilerEnv, compilationErrorReporter);
        ScriptOrFnNode tree = p.parse(s, "", 1);
        ...
    }
    finally
    {
        Context.exit();
    }

I definitely get non-null for "tree".

Attila.

Glenn Boysko

Jul 7, 2008, 1:29:47 PM

Norris:

At one point, you wrote:

> This is a commonly-requested feature, however, and we're looking hard
> at implementing more rational and easy-to-use abstract syntax trees
> for 1.7R2.

Do you have any updates on this? Are there any more rational or
easy-to-use ASTs for 1.7R2?

Thanks,
Glenn

Norris Boyd

Jul 8, 2008, 5:15:33 PM

The new AST work is in progress and I'm expecting to get a code drop
in a week or two, and then it's just up to me to review and commit to
CVS. Then we have the release process for 1.7R2. So I'd say a month at
the earliest, although once it's in CVS you can use it unreleased.

--N

Glenn Boysko

Jul 9, 2008, 9:32:35 AM

Norris:

> The new AST work is in progress and I'm expecting to get a code drop
> in a week or two, and then it's just up to me to review and commit to
> CVS. Then we have the release process for 1.7R2. So I'd say a month at
> the earliest, although once it's in CVS you can use it unreleased.

Can you elaborate on any of the changes specifically? Does it make it
easier for tools to parse JS without compiling it to Java? Are you
using the same Tokens, but structured in trees differently? Newer
tokens?

Any insight you can provide would be helpful.

Thanks,
Glenn

Norris Boyd

Jul 11, 2008, 10:36:28 AM

Yes, there will be an API to parse JavaScript and get back an Abstract
Syntax Tree (AST) without compiling to Java or executing. There will
be a new set of classes that describe the JavaScript source. I haven't
seen any of the code yet, so I can't provide any more detail...

--N