Using AST representations to extract a function


Tauren Mills

Jul 5, 2011, 6:57:08 PM
to UglifyJS
I'm curious whether I could use UglifyJS's AST representation of some
JavaScript code to extract a function from a source file.

Basically, I want to do the following:

1. Load a javascript file into a string buffer

2. Locate a named function within the buffer. This function could be
located within multiple levels of closures, or it could be at the top
level.

3. Extract this function from the buffer, completely removing it from
the buffer.

4. Retain the javascript code of the specific function that was found.

I don't care if I lose comments, formatting, and so forth. I can also
make some assumptions that the named function I'm looking for will
only be defined once in the file, but it may be called many times.

I believe that I can create an AST like this:

var jsp = require("uglify-js").parser;
var ast = jsp.parse(stringBuffer);

But I'm not sure what to do next. How can I walk the AST to find the
function I'm after? How can I remove it from the AST? And how can I
get just the contents of that function?

Is there some documentation on how to use the AST somewhere? I'm
starting to go through the code, but docs would help.

Thanks!
Tauren

Mihai Călin Bazon

Jul 6, 2011, 8:06:13 AM
to ugli...@googlegroups.com
Hi Tauren,

The API should be quite simple to use, though there is indeed no documentation for now. Take a look at tmp/instrument.js [1] for an example that takes a piece of code and inserts trace() calls, passing the line number, before various statements.

The idea is:

- you'll create an AST walker — w = ast_walker()
- you'll define custom walkers using w.with_walkers.  For your case you'll want to catch "defun" or "function" (or both).
- from your walker you can return null or undefined to keep the original AST, or you can return a new AST instead.
- w.walk returns the new AST.
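
To make the pattern above concrete, here is a small dependency-free sketch of how such a walker behaves. It is not UglifyJS's own code: makeWalker, the handful of node types it understands, and the hand-written AST are all assumptions made for illustration. Nodes are assumed to be arrays whose first element is the type, with "defun" nodes laid out as [ "defun", NAME, ARGS, BODY ].

```javascript
// Simplified stand-in for the walker pattern described above.
// NOT UglifyJS's implementation; it only understands a few node types.
function makeWalker(handlers) {
  function walkList(list) {
    return list.map(function (node) { return walk(node); });
  }
  function walk(node) {
    var type = node[0];
    var handler = handlers[type];
    if (handler) {
      // Custom walkers get the node as `this` and its fields as arguments.
      var replacement = handler.apply(node, node.slice(1));
      // null/undefined means "keep the original AST".
      if (replacement != null) return replacement;
    }
    switch (type) {
      case "toplevel":
      case "block":
        return [type, walkList(node[1] || [])];
      case "defun":
      case "function":
        return [type, node[1], node[2].slice(), walkList(node[3])];
      default:
        return node; // leaf or unsupported type: returned unchanged
    }
  }
  return { walk: walk };
}

// Hand-written AST for: function outer() { function target(a) {} }
var ast = ["toplevel", [
  ["defun", "outer", [], [
    ["defun", "target", ["a"], []]
  ]]
]];

var extracted = null;
var w = makeWalker({
  defun: function (name, args, body) {
    if (name === "target") {
      extracted = this;      // retain the function's AST
      return ["block", []];  // replace it with an empty block
    }
    // Returning undefined keeps the node and lets the walk continue.
  }
});
var newAst = w.walk(ast);
```

After the walk, extracted holds the AST of the nested function and newAst is a copy of the tree with that function replaced by an empty block. With the real library, the same handler object would presumably be passed to w.with_walkers instead.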

A "function" or "defun" AST is an array that looks like this:

    [ "function" / "defun", NAME, ARGS, BODY ]

NAME is null for an anonymous function; otherwise it is the function name. ARGS is an array of strings (the argument names) and BODY is an array of statements (which are ASTs themselves). To replace the function with its content, for example, you'd return [ "block", BODY ] from that walker, where BODY is the same component from the function's AST.
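
As a rough illustration of that layout, the sketch below picks a hand-written "defun" node apart and builds the [ "block", BODY ] replacement. The helper names describe and unwrap are hypothetical, and the "stat", "call" and "name" nodes inside the body are only assumed shapes used as filler.

```javascript
// Hand-written node for: function greet(name) { hello(name); }
// The statement inside BODY is an assumed shape, used only as filler.
var fn = ["defun", "greet", ["name"], [
  ["stat", ["call", ["name", "hello"], [["name", "name"]]]]
]];

// Hypothetical helper: name the four components of a function node.
function describe(node) {
  return { type: node[0], name: node[1], args: node[2], body: node[3] };
}

// Hypothetical helper: replace a function with its contents
// by wrapping the same BODY array in a "block" node.
function unwrap(node) {
  return ["block", node[3]];
}

var info = describe(fn);
var block = unwrap(fn);
```

Here info.name is "greet", info.args is ["name"], and block reuses the very same BODY array that the function node carried.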

Finally, call gen_code(ast) to render an AST into JS code.

Hope this helps.

-Mihai

[1] https://github.com/mishoo/UglifyJS/blob/master/tmp/instrument.js
--
Mihai Bazon,
http://mihai.bazon.net/blog

Tauren Mills

Jul 6, 2011, 6:17:50 PM
to ugli...@googlegroups.com
Perfect! That will certainly get me started.

Thanks,
Tauren

