Understanding V8 source


Vasant Tendulkar

Mar 7, 2014, 11:18:47 AM
to v8-u...@googlegroups.com
Hi,

I have just started going through the V8 source to understand how it works internally. To get a good start, I watched the videos on the Google Code project page and read the documentation page. However, they only explain the basic concepts: hidden classes, inline caching and GC. I want to know what process V8 goes through when it receives JavaScript code and how it converts it into machine code. I am reading the source now, but I do not know where to start. Could you point me in the right direction to understand how V8 works?

Any help is appreciated. 

Thank you.

Regards, 
Vasant.

Jay Man

Mar 7, 2014, 12:28:00 PM
to v8-u...@googlegroups.com
I was a bit skeptical when you mentioned V8 converts JavaScript to machine language. Then I read these articles:

http://www.2ality.com/2011/01/what-is-javascript-equivalent-of-java.html
quote: "JavaScript: Firefox, Safari and Internet Explorer each use different bytecode, Google’s V8 compiles directly to machine code."

quote: "V8 compiles JavaScript to native machine code (IA-32, x86-64, ARM, or MIPS ISAs) before executing it, instead of more traditional techniques such as executing bytecode or interpreting it."

That's quite enlightening! That would remove the bytecode translation step, thereby increasing speed. This sort of blurs the line between JavaScript and C/C++, although C can freely access memory through pointers, unlike JavaScript. Converting to machine code seems tricky, though, given JavaScript's lack of static types.



Ben Noordhuis

Mar 7, 2014, 3:36:19 PM
to v8-u...@googlegroups.com
Andy Wingo covers V8 internals in considerable detail on his blog.
You will probably want to start with [1]. I'll try my hand at a
high-level overview.

V8 has a baseline compiler that compiles quickly but does not generate
very good code. It does some constant folding and other "cheap"
optimizations but that's it.
You can find it in src/full-codegen.{cc,h} with arch-specific code in
e.g. src/x64/full-codegen-x64.cc. There is a lot of supporting
infrastructure as well, just follow the #includes.
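
To make "constant folding" concrete, here is a minimal sketch on a toy expression tree. It is not V8's code: the Num, Var and BinOp classes and the fold function are hypothetical names invented for this illustration, and the real full-codegen works on V8's own AST in C++.

```python
# Toy illustration only; these classes are NOT V8's data structures.
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, BinOp]

# Operators this sketch knows how to evaluate at compile time.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def fold(expr: Expr) -> Expr:
    """Replace sub-expressions whose operands are all literals with one literal."""
    if isinstance(expr, BinOp):
        left = fold(expr.left)
        right = fold(expr.right)
        if isinstance(left, Num) and isinstance(right, Num) and expr.op in OPS:
            return Num(OPS[expr.op](left.value, right.value))
        return BinOp(expr.op, left, right)
    return expr  # literals and variables are left untouched
```

For example, folding the tree for x + 2 * 3 leaves the variable alone and collapses the multiplication into the literal 6.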

The code generated by the baseline compiler is continuously profiled
when it's running. The optimizing compiler kicks in when the profiler
determines that a function is "hot" (the unit of optimization in V8
and most JS engines is the function). This compiler consists of two
parts, hydrogen and lithium.
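
The "hot function" idea can be sketched roughly as below. This is a simplified model under stated assumptions: a plain call counter and the HOT_THRESHOLD value stand in for V8's real profiler and heuristics, and TieredFunction is a hypothetical name, not something found in the V8 source.

```python
# Simplified model of tiering; V8's real heuristics are more involved.
HOT_THRESHOLD = 3  # hypothetical value chosen for the example


class TieredFunction:
    """Runs baseline code until the function looks hot, then switches tiers."""

    def __init__(self, baseline, optimize):
        self.baseline = baseline    # stand-in for baseline-compiled code
        self.optimize = optimize    # stand-in for the optimizing compiler
        self.optimized = None       # filled in once the function is hot
        self.calls = 0

    def __call__(self, *args):
        if self.optimized is not None:
            return self.optimized(*args)
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # The function is "hot": compile an optimized version for later calls.
            self.optimized = self.optimize(self.baseline)
        return self.baseline(*args)
```

Note that the unit being counted and recompiled here is a whole function, matching the point above about the function being the unit of optimization.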

Hydrogen is the machine-independent optimizer; your starting points
are src/hydrogen.{cc,h} but be warned that those files are just the
tip of the iceberg.

Lithium is, as you may have guessed, the low-level, machine-dependent
optimizer. The meat of the code is in src/lithium.{cc,h} and
src/lithium-codegen.{cc,h} with arch-specific code in e.g.
src/x64/lithium-x64.{cc,h} and src/x64/lithium-codegen-x64.{cc,h}.
Lithium essentially lowers the high-level hydrogen IR into a low-level
lithium IR which in turn is lowered to machine code.
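
As a rough picture of the two lowering steps, the sketch below takes a tiny high-level IR, lowers it to a register-based low-level IR, and then prints assembly-like text. Everything here is an assumption made for illustration: the tuple formats, the function names lower_to_lir and emit, and the naive register assignment bear no relation to the actual hydrogen and lithium classes.

```python
# Toy lowering pipeline; the IR shapes and names are hypothetical.
REGISTERS = ["rax", "rbx", "rcx", "rdx"]


def lower_to_lir(hir):
    """Lower (op, dest, lhs, rhs) tuples over named values to two-address form.

    Each value gets its own register in order of first use; this sketch
    simply fails if more values appear than REGISTERS can hold.
    """
    assigned = {}

    def reg(value):
        if value not in assigned:
            assigned[value] = REGISTERS[len(assigned)]
        return assigned[value]

    lir = []
    for op, dest, lhs, rhs in hir:
        dest_reg = reg(dest)
        lir.append(("mov", dest_reg, reg(lhs)))  # dest = lhs
        lir.append((op, dest_reg, reg(rhs)))     # dest = dest op rhs
    return lir


def emit(lir):
    """Turn low-level IR tuples into assembly-like text lines."""
    return [f"{op} {dst}, {src}" for op, dst, src in lir]
```

The high-level form knows nothing about registers, while the low-level form is already shaped like the target's two-operand instructions, which is the kind of separation described above.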

To understand the garbage collector, take a look at src/heap.{cc,h},
src/incremental-marking.{cc,h} and src/mark-compact.{cc,h} (also the
*-inl.h files). There is not much to tell about it: it's a fairly
standard compacting, generational mark-and-sweep collector. The young
and the old spaces are divided into sub-spaces for code, objects, etc.
and there's an additional space for large objects.
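
The generational idea can be illustrated with a very small model: new objects start in a young space, and whatever is still reachable at a minor collection is moved to the old space. This is a deliberately naive sketch, not how src/heap.cc works. The Obj and Heap classes are hypothetical, survivors are promoted after a single collection, and marking walks everything reachable from the roots instead of using the bookkeeping a real collector relies on.

```python
# Naive generational model; not V8's heap layout or algorithm.
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []  # outgoing references to other Obj instances


def mark(roots):
    """Return the ids of all objects reachable from the roots."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        stack.extend(obj.refs)
    return seen


class Heap:
    def __init__(self):
        self.young = []
        self.old = []

    def allocate(self, name):
        obj = Obj(name)
        self.young.append(obj)  # every new object starts out young
        return obj

    def minor_gc(self, roots):
        """Promote reachable young objects to the old space; drop the rest."""
        live = mark(roots)
        self.old.extend(o for o in self.young if id(o) in live)
        self.young = []
```

If a and b are allocated with a pointing at b, and an unreferenced c is allocated after them, a minor collection rooted at a moves a and b to the old space and discards c.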

Hope that points you in the right direction. Good luck and don't
hesitate to ask if you have questions. V8's code base is massive and
can be daunting, but with some perseverance you can get a pretty good
mental picture. exuberant-ctags and cscope help too. ;-)

[1] http://wingolog.org/archives/2011/07/05/v8-a-tale-of-two-compilers