Will replacing classes with closures allow me to avoid megamorphism?

27 views
Skip to first unread message

GregRos

unread,
Apr 14, 2025, 12:06:30 PMApr 14
to v8-users
I'm developing a new version of my parser library (https://parjs.org/), so I'm writing kind of low-level CPU-bound code.

It's a parser combinator library, so parsers are composed of building blocks and combinators that work like a tree. Each combinator "class" has different logic and properties.

However, their external interface is the same, with a single `parse` method that applies whatever logic the class has.

A combinator like "many" can potentially apply any parser repeatedly, and so different instances will have different parser inputs. For example, you can have:
  • new Many(new Digit()) parses multiple digits
  • new Many(new String("abc")) parses multiple literals "abc"
  • new Many(new Letter()) parses multiple letters
Inside the code of `Many`, you'd have a call like `child.parse(input)`

In the previous version, profiling and diagnostics revealed that performance was dominated by function call overhead, and almost all the method calls turned out to be megamorphic because when `child.parse` was called in the same location, `child` was a different type of object.

In this new version, I'm trying to figure out if it's possible to avoid that. My idea is to pass around functions instead of objects with methods. By doing this, I'm hoping that method resolution can be avoided, and that the code won't need to be megamorphic because all the objects are sufficiently similar.

These functions still need internal parameters, which includes input parser objects. 
  • The nicest idea is to use closures.
  • But another solution could be doing something like `parser.parse.bind(parser)` and passing that around.
Will something like this allow me to avoid megamorphism? Is there something else I can do?

Ben Noordhuis

unread,
Apr 14, 2025, 3:27:26 PMApr 14
to v8-u...@googlegroups.com
Yes, closures will work for that. Other approaches that can work (but
are maybe less robust, more cumbersome, etc.) are:

1. using arrays instead of objects (indexed vs. named properties) so
you don't get the "exploding hidden classes" thing; caveat: not 100%
foolproof, V8 has different array representations

2. guarding property accesses behind instanceof checks to keep inline
caches monomorphic: if (obj instanceof Foo) obj.x = 42; else if (obj
instanceof Bar) obj.x = 42; - the instanceof checks can become
expensive(ish) though

life....@gmail.com

unread,
Apr 14, 2025, 4:39:50 PMApr 14
to v8-users
I see, thank you! The array approach is also interesting, though hopefully I can avoid it, since it will probably make the code harder to maintain.
Reply all
Reply to author
Forward
0 new messages