Clarify snapshots when embedding

Augusto Roman

Apr 8, 2015, 6:18:54 AM
to v8-u...@googlegroups.com
I've dug through the post history, but I'm still having trouble understanding exactly what capability snapshots provide.

My question is:
  Within a single execution of a single binary, given a single isolate and context that has executed some JS, can I snapshot the state and later create a new context from that snapshot?

For example, imagine a server that should execute one of N JS functions with per-request parameters.  I'd like to snapshot each of the loaded JS functions and (for each request selecting a JS func and specifying parameters) create a new, clean context from the appropriate snapshot and execute the func with the parameters.  

How do snapshots relate to StartupData?  (What is StartupData?)  What's the difference between SetNativesDataBlob and SetSnapshotDataBlob?  mksnapshot appears to enable compiling in a fixed, hard-coded snapshot, which is not what I want.

I understand that V8 is not an AOT JS compiler, but some posts seem to indicate that snapshots may be appropriate for my use case, and I find the inconsistency confusing.

- Augusto

Ben Noordhuis

Apr 8, 2015, 8:16:37 AM
to v8-u...@googlegroups.com
On Wed, Apr 8, 2015 at 12:18 PM, Augusto Roman <aro...@flux.io> wrote:
> I've dug through the post history, but I'm still having trouble
> understanding exactly what capability snapshots provide.
>
> My question is:
> Within a single execution of a single binary, given a single isolate and
> context that has executed some JS, can I snapshot the state and later create
> a new context from that snapshot?

Yes, with restrictions. Native functions and objects currently cannot
be serialized. Large typed arrays exist outside of the JS heap and
probably cannot be serialized. No doubt there are more caveats.

> For example, imagine a server that should execute one of N JS functions with
> per-request parameters. I'd like to snapshot each of the loaded JS
> functions and (for each request selecting a JS func and specifying
> parameters) create a new, clean context from the appropriate snapshot and
> execute the func with the parameters.

I don't think that's going to work. The snapshot is a per-isolate
property, not per-context.

> How do snapshots relate to StartupData? (What is StartupData?) What's the
> difference between SetNativesDataBlob and SetSnapshotDataBlob? mksnapshot
> appears to enable compiling in a fixed, hard-coded snapshot, which is not
> what I want.

StartupData is a serialized snapshot, basically a VM core dump.

The natives blob contains the built-in objects and functions, like
Array, Date, Object, etc. I don't think you would normally touch
this.

The snapshot blob is your custom start state and you can set it with
V8::SetSnapshotDataBlob() or pass it in the Isolate::CreateParams
argument to Isolate::New().

You need to build V8 with v8_use_snapshot == 'true' and
v8_use_external_startup_data == 1 for that to work, or run
`make snapshot=external` when using the Makefile.

> I understand that V8 is not an AOT JS compiler, but some posts seem to
> indicate that snapshots may be appropriate for my use case which is
> confusingly inconsistent.

It's similar to emacs' dump mechanism if you are familiar with that,
only with more restrictions. :-)

Yang Guo

Apr 8, 2015, 9:28:13 AM
to v8-u...@googlegroups.com
Ben, thanks for answering that question, but there are some inaccuracies :)


On Wednesday, April 8, 2015 at 2:16:37 PM UTC+2, Ben Noordhuis wrote:
> On Wed, Apr 8, 2015 at 12:18 PM, Augusto Roman <aro...@flux.io> wrote:
>> I've dug through the post history, but I'm still having trouble
>> understanding exactly what capability snapshots provide.
>>
>> My question is:
>> Within a single execution of a single binary, given a single isolate and
>> context that has executed some JS, can I snapshot the state and later create
>> a new context from that snapshot?
>
> Yes, with restrictions. Native functions and objects currently cannot
> be serialized. Large typed arrays exist outside of the JS heap and
> probably cannot be serialized. No doubt there are more caveats.

Correct. You can play around with this by, for example, building with

make x64.release embedscript=somescript.js

somescript.js will be run before creating the snapshot, so that it will be part of every context you create from that snapshot.

>> For example, imagine a server that should execute one of N JS functions with
>> per-request parameters. I'd like to snapshot each of the loaded JS
>> functions and (for each request selecting a JS func and specifying
>> parameters) create a new, clean context from the appropriate snapshot and
>> execute the func with the parameters.
>
> I don't think that's going to work. The snapshot is a per-isolate
> property, not per-context.

This is not entirely correct. The snapshot consists of two parts. The first part is used to create the isolate. The second part is used to initialize each context. By providing a snapshot, both creating an isolate and creating a context can be sped up.

>> How do snapshots relate to StartupData? (What is StartupData?) What's the
>> difference between SetNativesDataBlob and SetSnapshotDataBlob? mksnapshot
>> appears to enable compiling in a fixed, hard-coded snapshot, which is not
>> what I want.

The natives data blob is simply the natives JS sources, minified and in a binary format. V8 requires those sources, for example, when a function like Array.prototype.push is called, so that it can compile the function before the call.

The snapshot data blob is the dump of the isolate and a fresh context, as explained above.

> StartupData is a serialized snapshot, basically a VM core dump.
>
> The natives blob contains the built-in objects and functions, like
> Array, Date, Object, etc. I don't think you would normally touch
> this.
>
> The snapshot blob is your custom start state and you can set it with
> V8::SetSnapshotDataBlob() or pass it in the Isolate::CreateParams
> argument to Isolate::New().
>
> You need to build V8 with v8_use_snapshot == 'true' and
> v8_use_external_startup_data == 1 for that to work, or `make
> snapshot=external` when using the Makefile.
>
>> I understand that V8 is not an AOT JS compiler, but some posts seem to
>> indicate that snapshots may be appropriate for my use case which is
>> confusingly inconsistent.

The snapshot is mainly used so that initialization scripts (including V8's internal ones) don't have to be run when creating a new context. For example, if your program requires a pre-calculated lookup table, you would usually calculate it upfront at startup. With a snapshot, you could calculate it before creating the snapshot, and have it loaded as part of the context directly.
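
As a rough illustration of the lookup-table case described above, the sketch below models the pattern in plain C++ without calling V8: the table is computed once, flattened into a byte blob, and every new "context" restores it from the blob instead of recomputing it. `Blob`, `BuildTable`, `Serialize`, and `Restore` are hypothetical stand-ins and not V8 APIs; with V8, the blob would be produced by the snapshot machinery rather than by hand.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a snapshot blob: just raw bytes (not a V8 type).
using Blob = std::vector<unsigned char>;

// The "expensive" initialization that should run only once.
std::vector<std::uint32_t> BuildTable(std::size_t n) {
  std::vector<std::uint32_t> table(n);
  for (std::size_t i = 0; i < n; ++i) {
    table[i] = static_cast<std::uint32_t>(i * i);
  }
  return table;
}

// Flatten the table into bytes (done once, before "snapshotting").
Blob Serialize(const std::vector<std::uint32_t>& table) {
  Blob blob(table.size() * sizeof(std::uint32_t));
  if (!blob.empty()) {
    std::memcpy(blob.data(), table.data(), blob.size());
  }
  return blob;
}

// Rebuild the table from bytes (done for every new "context").
std::vector<std::uint32_t> Restore(const Blob& blob) {
  std::vector<std::uint32_t> table(blob.size() / sizeof(std::uint32_t));
  if (!blob.empty()) {
    std::memcpy(table.data(), blob.data(),
                table.size() * sizeof(std::uint32_t));
  }
  return table;
}
```

The point of the pattern is that `BuildTable` runs once while `Restore` runs per context; a real snapshot covers the whole JS heap state, not a single array.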

Ben Noordhuis

Apr 8, 2015, 10:50:09 AM
to v8-u...@googlegroups.com
On Wed, Apr 8, 2015 at 3:28 PM, Yang Guo <yan...@chromium.org> wrote:
> On Wednesday, April 8, 2015 at 2:16:37 PM UTC+2, Ben Noordhuis wrote:
>> On Wed, Apr 8, 2015 at 12:18 PM, Augusto Roman <aro...@flux.io> wrote:
>> > For example, imagine a server that should execute one of N JS functions
>> > with
>> > per-request parameters. I'd like to snapshot each of the loaded JS
>> > functions and (for each request selecting a JS func and specifying
>> > parameters) create a new, clean context from the appropriate snapshot
>> > and
>> > execute the func with the parameters.
>>
>> I don't think that's going to work. The snapshot is a per-isolate
>> property, not per-context.
>
> This is not entirely correct. The snapshot consists of two parts. The first
> one is to create the isolate. The second part is to initialize each context.
> By providing a snapshot, both creating an isolate and creating a context can
> be sped up

I interpreted Augusto's question as: Can I have an isolate with
contexts C1 and C2, where C1 and C2 are created from different
snapshots? That doesn't seem possible from reading the source code.
But if it is possible, that's very interesting and I would like to
know more. :-)

Augusto Roman

Apr 8, 2015, 4:37:34 PM
to v8-u...@googlegroups.com
Thanks Ben and Yang:


On Wednesday, April 8, 2015 at 6:28:13 AM UTC-7, Yang Guo wrote:
> Ben, thanks for answering that question, but there are some inaccuracies :)
>
> On Wednesday, April 8, 2015 at 2:16:37 PM UTC+2, Ben Noordhuis wrote:
>> On Wed, Apr 8, 2015 at 12:18 PM, Augusto Roman <aro...@flux.io> wrote:
>>> I've dug through the post history, but I'm still having trouble
>>> understanding exactly what capability snapshots provide.
>>>
>>> My question is:
>>> Within a single execution of a single binary, given a single isolate and
>>> context that has executed some JS, can I snapshot the state and later create
>>> a new context from that snapshot?
>>
>> Yes, with restrictions. Native functions and objects currently cannot
>> be serialized. Large typed arrays exist outside of the JS heap and
>> probably cannot be serialized. No doubt there are more caveats.
>
> Correct. You can play around with this by, for example, building with
>
> make x64.release embedscript=somescript.js
>
> somescript.js will be run before creating the snapshot, so that it will be part of every context you create from that snapshot.

Awesome, that's exactly what I'm interested in. Can snapshots be created at runtime? For example, can I have my running server create a new isolate, run somescript2.js, save the snapshot, and then rapidly process many new isolates+contexts with somescript2 preloaded? That appears to be possible using CreateSnapshotDataBlob, and the resulting blob should NOT be used for the natives data but should be passed to SetSnapshotDataBlob, right?

>>> For example, imagine a server that should execute one of N JS functions with
>>> per-request parameters. I'd like to snapshot each of the loaded JS
>>> functions and (for each request selecting a JS func and specifying
>>> parameters) create a new, clean context from the appropriate snapshot and
>>> execute the func with the parameters.
>>
>> I don't think that's going to work. The snapshot is a per-isolate
>> property, not per-context.
>
> This is not entirely correct. The snapshot consists of two parts. The first part is used to create the isolate. The second part is used to initialize each context. By providing a snapshot, both creating an isolate and creating a context can be sped up.

So, given a particular snapshot, I can apply it to an isolate and all contexts created within that isolate will have that snapshot applied. However, if I want a different snapshot, I need a new isolate... right?

Yang Guo

Apr 9, 2015, 2:16:43 AM
to v8-u...@googlegroups.com


On Wednesday, April 8, 2015 at 10:37:34 PM UTC+2, Augusto Roman wrote:
> Thanks Ben and Yang:
>
> On Wednesday, April 8, 2015 at 6:28:13 AM UTC-7, Yang Guo wrote:
>> Ben, thanks for answering that question, but there are some inaccuracies :)
>>
>> On Wednesday, April 8, 2015 at 2:16:37 PM UTC+2, Ben Noordhuis wrote:
>>> On Wed, Apr 8, 2015 at 12:18 PM, Augusto Roman <aro...@flux.io> wrote:
>>>> I've dug through the post history, but I'm still having trouble
>>>> understanding exactly what capability snapshots provide.
>>>>
>>>> My question is:
>>>> Within a single execution of a single binary, given a single isolate and
>>>> context that has executed some JS, can I snapshot the state and later create
>>>> a new context from that snapshot?
>>>
>>> Yes, with restrictions. Native functions and objects currently cannot
>>> be serialized. Large typed arrays exist outside of the JS heap and
>>> probably cannot be serialized. No doubt there are more caveats.
>>
>> Correct. You can play around with this by, for example, building with
>>
>> make x64.release embedscript=somescript.js
>>
>> somescript.js will be run before creating the snapshot, so that it will be part of every context you create from that snapshot.
>
> Awesome, that's exactly what I'm interested in. Can snapshots be created at runtime? For example, can I have my running server create a new isolate, run somescript2.js, save the snapshot, and then rapidly process many new isolates+contexts with somescript2 preloaded? That appears to be possible using CreateSnapshotDataBlob, and the resulting blob should NOT be used for the natives data but should be passed to SetSnapshotDataBlob, right?

Yes. CreateSnapshotDataBlob is precisely the API to use to create snapshots at runtime. Please take a look at test/cctest/test-serialize.cc for its usage. The natives data is just the internal JS sources in binary form and should not change. There is also no way at this point to create it at runtime (and no point in doing so).

>>>> For example, imagine a server that should execute one of N JS functions with
>>>> per-request parameters. I'd like to snapshot each of the loaded JS
>>>> functions and (for each request selecting a JS func and specifying
>>>> parameters) create a new, clean context from the appropriate snapshot and
>>>> execute the func with the parameters.
>>>
>>> I don't think that's going to work. The snapshot is a per-isolate
>>> property, not per-context.
>>
>> This is not entirely correct. The snapshot consists of two parts. The first part is used to create the isolate. The second part is used to initialize each context. By providing a snapshot, both creating an isolate and creating a context can be sped up.
>
> So, given a particular snapshot, I can apply it to an isolate and all contexts created within that isolate will have that snapshot applied. However, if I want a different snapshot, I need a new isolate... right?

Yes. As Ben mentioned, each snapshot is good for exactly one isolate and one context. I did consider offering the option for different contexts in the same snapshot, on the same isolate, but there were no use cases for it.
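
To make the conclusion of the thread concrete, here is a minimal sketch of the "one of N JS functions per request" server under the constraint stated above: one blob per function, and a fresh isolate whenever a different snapshot is needed. Everything in it (`SnapshotBlob`, `FakeIsolate`, `SnapshotRegistry`) is a hypothetical stand-in written in plain C++, not a V8 class; the comments only mark where the calls named in this thread would go.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these types are V8 classes.
struct SnapshotBlob {
  std::string preloaded_script;  // Stands in for the serialized heap state.
};

struct FakeIsolate {
  explicit FakeIsolate(const SnapshotBlob& blob)
      : script(blob.preloaded_script) {}
  std::string script;  // State every context in this isolate starts with.
};

class SnapshotRegistry {
 public:
  // Done once per JS function, e.g. at server startup. With V8, this is
  // where CreateSnapshotDataBlob would produce the blob.
  void Register(const std::string& name, SnapshotBlob blob) {
    blobs_[name] = std::move(blob);
  }

  // Done per request: a different snapshot means a different isolate.
  // With V8, the blob would be passed in the Isolate::CreateParams
  // argument to Isolate::New().
  std::unique_ptr<FakeIsolate> NewIsolateFor(const std::string& name) const {
    auto it = blobs_.find(name);
    if (it == blobs_.end()) {
      throw std::out_of_range("no snapshot registered for " + name);
    }
    return std::unique_ptr<FakeIsolate>(new FakeIsolate(it->second));
  }

 private:
  std::map<std::string, SnapshotBlob> blobs_;
};
```

A request handler would look up the blob by function name, create the isolate, create a context inside it, and run the function with the request parameters. Whether creating an isolate per request is fast enough is something to measure rather than assume.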