How did the Erjang project effectively understand/target BEAM?

Jonathan Coveney

May 10, 2012, 3:35:28 PM
to Erjang
I posed this question to erlang-questions:
http://erlang.org/pipermail/erlang-questions/2012-May/066519.html

The question is basically: how does a motivated engineer make sense of
the BEAM opcodes? Erjang has been held up as an example multiple times
as a reason why the BEAM opcodes don't need to be documented, because
if you guys could figure it out, it can't be THAT hard... I think that
that is a misguided opinion, but I thought I would ask the source!

I'm curious what your experience interpreting the BEAM opcodes has
been like? How did you make heads or tails of the opcodes? Did you
just go through the BEAM source and try to figure out what was going
on? Do you think that documentation would have helped your effort a
lot?

I appreciate your time, and I agree with the people in erlang-
questions: Erjang is an impressive piece of work.
Jon

Erik Søe Sørensen

May 10, 2012, 4:47:32 PM
to erj...@googlegroups.com
As for my half of the answer, the major ingredients were:
- Look at compiler output
- Make educated guesses (using knowledge of instruction sets in general and the known part of BEAM in particular)
- For the instruction parameters which you can't easily explain, look at compiler output for variations of the source
- For the instruction parameters which you still can't explain, look at ops.tab, and possibly the C implementation of the instruction
- Implement and try booting OTP / running small programs using the feature
- Patience and determination (finding the process fun helps nicely).
 
(A variation of the first step: when your problem is that the booting process has become stuck at some point, or some program fails, then you often know the instruction in question and need to look at the source which produced it.)
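
To make the "look at compiler output for variations of the source" step concrete: `erlc -S` writes a `.S` assembly listing for a module, and comparing the listings for two near-identical functions isolates what a given instruction does. The sketch below is only an illustration in Python. The two listings are hand-written and abbreviated (real output also carries line annotations and module headers), the module name `demo` and function name `pick` are made up, and `changed_instructions` is a hypothetical helper, not part of Erjang or OTP. The first listing stands for `pick(X, _Y) -> X.` and the second for `pick(_X, Y) -> Y.`

```python
import difflib

# Hand-written, abbreviated stand-ins for `erlc -S` listings (assumption:
# real compiler output contains additional lines such as line annotations).
LISTING_FIRST = """\
{function, pick, 2, 2}.
  {label,1}.
    {func_info,{atom,demo},{atom,pick},2}.
  {label,2}.
    return.
"""

LISTING_SECOND = """\
{function, pick, 2, 2}.
  {label,1}.
    {func_info,{atom,demo},{atom,pick},2}.
  {label,2}.
    {move,{x,1},{x,0}}.
    return.
"""


def changed_instructions(before: str, after: str) -> dict:
    """Return the instructions removed from `before` and added in `after`."""
    removed, added = [], []
    for line in difflib.ndiff(before.splitlines(), after.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:].strip())
        elif line.startswith("+ "):
            added.append(line[2:].strip())
    return {"removed": removed, "added": added}


if __name__ == "__main__":
    # prints {'removed': [], 'added': ['{move,{x,1},{x,0}}.']}
    print(changed_instructions(LISTING_FIRST, LISTING_SECOND))
```

The only difference is the `move` from `{x,1}` to `{x,0}`, which is the kind of evidence that lets you guess that arguments arrive in the x registers and the return value leaves in `{x,0}`.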
 
Some things, like nested exception handlers and call_fun TCO, were more tricky than others; bitstring operations too, if only by volume.
The rest of the runtime is of course the larger part, or at least the one with more opportunities for lurking bugs.
 
As for whether documentation would have helped: for the easy parts, probably not. For the hard or surprising parts, certainly yes. For knowing when you have guessed all there is to know, with no surprises or corner cases left, absolutely.
 
 

Wolfgang Schell

May 10, 2012, 8:12:49 PM
to erj...@googlegroups.com


On 10.05.12 21:35, Jonathan Coveney wrote:
> I posed this question to erlang-questions:
> http://erlang.org/pipermail/erlang-questions/2012-May/066519.html

Interesting discussion!


> The question is basically: how does a motivated engineer make sense of
> the BEAM opcodes? Erjang has been held up as an example multiple times
> as a reason why the BEAM opcodes don't need to be documented, because
> if you guys could figure it out, it can't be THAT hard... I think that
> that is a misguided opinion, but I thought I would ask the source!

I tried a couple of times to understand Erik's interpreter in Erjang
(classes erjang.beam.interpreter.AbstractInterpreter and
erjang.beam.interpreter.Interpreter; the latter is generated from
Interpreter.template and ops.spec using a Perl script), which is a
one-stop shop for all instructions. It's pretty complex, especially if
you don't know anything about BEAM already... :-)

Kresten's BEAM-to-bytecode compiler (classes erjang.beam.Compiler and
erjang.beam.CompilerVisitor) is even more complex, but has all the dirty
details.

Good luck with getting some docs. And thanks for starting and pointing
out this discussion!

Wolfgang

Kresten Krab Thorup

unread,
May 11, 2012, 1:13:47 AM5/11/12
to erj...@googlegroups.com, Jonathan Coveney
There is a machine-readable spec here

https://github.com/trifork/erjang/blob/master/src/main/java/erjang/beam/interpreter/ops.spec

The same directory holds some Perl modules that read the spec and generate Erjang's interpreter.
(Erjang also has a JIT; the interpreter is a late arrival to the VM.)

Kresten
Mobile: + 45 2343 4626 | Skype: krestenkrabthorup | Twitter: @drkrab
Trifork A/S | Margrethepladsen 4 | DK-8000 Aarhus C | Phone: +45 8732 8787 | www.trifork.com



