Simplify binary parsing with stack-based shift reduce

49 views
Skip to first unread message

Ariya Hidayat

unread,
Oct 5, 2012, 10:26:04 AM10/5/12
to esp...@googlegroups.com
Since this question has been asked before (including by Yusuke), let
me share my thought about it.

Originally when I started Esprima some time ago, the expressions were
parsed using a shift-reduce approach (while the rest is still
recursive-descent). The primary reason was because it was inspired by
JavaScriptCore and V8. However, I realized that making the expression
parsing fully recursive-descent (like in SpiderMonkey) also improves
debugging since it is trivial to see what happens by looking at the
call stack. Furthermore, it is easier to understand since you only
need to know one type of parser (recursive descent) instead of two
different ones. In addition to that, the code structure matches nicely
with the grammar and terminologies from ECMAScript specification.

Still, I think there is a benefit of using stack-based shift-reduce,
mainly for two reasons. First, we don't use the native stack (from the
JavaScript engine) anymore and thus we can parse a deeper expression
without exhausting the stack. Second, it improves the performance due
to the elimination of long recursive calls. Applying the (unoptimized)
patch I attached in issue #352 gives a noticeable improvement in our
benchmarks suite. V8 (which is usually very good) still demonstrates
9% speed-up while Safari 6 shows an important 18% win. This is still
with binary expressions, I'm sure there will be some further bonus
from using the same technique for the unary expressions.

Now, the question is rather simple. Should we do it or not?


Thanks!

Regards,

--
Ariya Hidayat, http://ariya.ofilabs.com
http://twitter.com/ariyahidayat

Yusuke Suzuki

unread,
Oct 5, 2012, 11:38:02 AM10/5/12
to esp...@googlegroups.com
Personally I think operator precedence parser in binary expression is better because of its performance improvements.
And I think Esprima core code is mature enough to introduce "tricky but optimized" way.
--
Regards,
Yusuke Suzuki

Claus Reinke

unread,
Oct 5, 2012, 4:47:17 PM10/5/12
to esp...@googlegroups.com
Also, code size improves (see acorn [1] for a recent competitor
with a slightly different implementation;-).

If necessary, you might be able to recover some of the debugability
by tracking the current state in a variable, or by having a stack-
rendering utility function that could be used in watch expressions
during debugging sessions. But, in theory, having less duplicated
code should reduce the number of bugs, and make the remaining
ones easier to find and fix.

Actually, I'd prefer if the duplication would be removed at the root,
in the ES grammar. Hardcoding operator precedence there closes
the mind against moving the language towards user-defined
operators, precedence, and associativity, something I very much
miss in Javascript. It might be useful to leave some room in the
precedence levels, but that can come later.

That is a 'yes';-)
Claus

[1] http://marijnhaverbeke.nl/blog/acorn.html

Ariya Hidayat

unread,
Oct 8, 2012, 11:54:27 PM10/8/12
to esp...@googlegroups.com
Since I don't hear any objection so far, I will go ahead and land my
patch for this.

In case we have problems with debugging, we can always revert back to
the pure recursive descent parsing.


Thanks!
Reply all
Reply to author
Forward
0 new messages