Making V8 value serialization faster


Kentaro Hara

Aug 5, 2016, 1:26:57 AM
to platform-architecture-dev, Jeremy Roman, cbr...@chromium.org, Jochen Eisinger, Elliott Sprehn
Hi

Background: Currently postMessage() is 10x slower in Chrome than in Firefox/Safari because V8 value serialization is slow. Jeremy started working on optimizing the serialization (design doc).

On Fri, Aug 5, 2016 at 2:54 AM, Jeremy Roman <jbr...@google.com> wrote:
I've now gotten far enough in prototyping this locally that I can serialize numbers, strings, objects, arrays, etc. (DOM objects, array buffers, and a few other things are still missing.)

Early perf results for the large JSON object from bug 148757: serialization time went from 1521 ms to 404 ms on my Z620 (I haven't done profiling yet to determine whether this can be further reduced, nor have I yet measured on other devices, though I expect similar results).

I've responded to some comments on the document linked earlier, but no conclusions have been reached. I'm reluctant to prototype too far ahead of landing (especially without a thumbs-up on the approach).

Are there additional concerns I should address, investigation I should do, or things I should explain/document?

As far as I understand the comments in the document, I think we need to figure out how to deal with the following points:

a) Whether the callback-style V8 API (SerializeHostObject() in the document) is good or not.

b) Whether we should invent a new wire format or keep the existing wire format. If we invent a new wire format, how (or whether) to deprecate the existing format from the web.

Regarding a), the key is what the spec says. IIUC, the serialization order matters because the spec requires a DataCloneError to be thrown at the point where the serializer encounters something unexpected. If that is the case, we'll need the callback-style API.
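
(For illustration only, a minimal hypothetical snippet of the ordering issue; per spec the clone walks properties in order, so it should throw on the uncloneable value before any later getter runs:)

const obj = {
  bad: document.createElement('div'),  // DOM elements are not cloneable
  get later() { console.log('getter ran'); return 1; },
};
try {
  window.postMessage(obj, '*');
} catch (e) {
  console.log(e.name);  // "DataCloneError"; "getter ran" should never be logged
}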

Regarding b), Elliott seems to be fine with inventing a new wire format and worrying about how to deprecate the existing format later. That sounds reasonable to me.



--
Kentaro Hara, Tokyo, Japan

Kentaro Hara

Aug 5, 2016, 1:27:43 AM
to platform-architecture-dev, Jeremy Roman, cbr...@chromium.org, Jochen Eisinger, Elliott Sprehn, Joshua Bell
+jsbell

Jeremy Roman

Aug 5, 2016, 1:51:14 AM
to Kentaro Hara, platform-architecture-dev, cbr...@chromium.org, Jochen Eisinger, Elliott Sprehn, Joshua Bell
On Fri, Aug 5, 2016 at 2:26 PM, Kentaro Hara <har...@chromium.org> wrote:

Regarding a), the key is what the spec says. IIUC, the serialization order matters because the spec requires a DataCloneError to be thrown at the point where the serializer encounters something unexpected. If that is the case, we'll need the callback-style API.

Not only in the case of an exception, but also in the case where a getter modifies some object that has yet to be reached in the structured clone process.

e.g. Consider the (silly, but nonetheless…) object
{ imageData: new ImageData(1, 1), get x() { this.imageData.data[3] = 255; } }

If the data from DOM (Blink) objects isn't serialized (or at least copied) at the correct time, then the x getter will make the pixel opaque black, whereas per spec (and our current implementation) it should be transparent black.
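
(A quick way to observe this, assuming a browser context; illustration only:)

const { port1, port2 } = new MessageChannel();
port2.onmessage = (e) => console.log(e.data.imageData.data[3]);  // 0 (transparent black) per spec
port1.postMessage({ imageData: new ImageData(1, 1), get x() { this.imageData.data[3] = 255; } });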

Jeremy Roman

Aug 5, 2016, 11:51:46 AM
to Kentaro Hara, platform-architecture-dev, cbr...@chromium.org, Jochen Eisinger, Elliott Sprehn, Joshua Bell
(Apologies; long email follows.)

On a related note, looking more closely at ImageData leads me to discover that it is a DOM object whose [[Clone]] operation recursively clones another JavaScript object (the Uint8ClampedArray it wraps).
Another example is FileList, whose [[Clone]] recursively clones its contained File objects.

Interestingly, neither Chrome nor Firefox correctly handles these per spec. In particular, the spec demands that they be put into the common "memory" (object map) of the cloning operation, so that multiple references to the same object correspond to the same object on the other side. However, cases like these:

{ imageData: new ImageData(1, 1), get rawData() { return this.imageData.data; } }
{ fileList: someFileList, firstFile: someFileList[0] }

produce objects which contain multiple copies of the wrapped objects (i.e. they don't compare equal with ===).

Since it apparently doesn't work today, it's clearly not a big issue, but from an author's perspective I think it's reasonable to expect such references to be maintained correctly, just as ones between ordinary JS objects are.
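
(A rough way to check both behaviours, assuming a browser context; the structure mirrors the cases above:)

const { port1, port2 } = new MessageChannel();
port2.onmessage = (e) => {
  console.log(e.data.a === e.data.b);                     // true: plain JS references keep their identity
  console.log(e.data.rawData === e.data.imageData.data);  // true per spec, false in Chrome/Firefox today
};
const shared = { n: 1 };
const imageData = new ImageData(1, 1);
port1.postMessage({ a: shared, b: shared, imageData, rawData: imageData.data });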

I haven't yet found a DOM object whose [[Clone]] operation requires cloning a non-DOM JS object, so the API proposed in the doc would work if we're happy giving up this property. On the other hand, we could:

1. Allow the SerializeHostObject to reinvoke the v8::ValueSerializer as part of its serialization work.

This means that care must be taken to make the serializer re-entrant, so it somewhat complicates the interface.

2. Don't have Blink serialize directly into the buffer, but instead convert the DOM object into some value V8 can serialize.

roughly: v8::MaybeLocal<v8::Value> SerializeHostObject(v8::Local<v8::Object> object);

This gives a lot of nice properties: Blink doesn't have to deal with raw byte buffers or have its own logic for writing/reading integers etc. in a portable way, it gets reference deduplication "for free", and dealing with structured objects makes it more natural to add additional properties in the future without having to do much explicit work. For example, an ImageData might get mapped to {width: w, height: h, data: d}.

However, it would almost certainly be slower than having the DOM object's data serialized directly (because of the need to copy it into the V8 heap, and then from there into the serialized buffer, and similarly for deserialization), and larger (due to having to write string keys or equivalent).

The possible saving grace is that I expect #(DOM objects) << #(plain JS objects) in practice, so while there would be an overhead here, it might be minor. Clever but cautious use of an external ArrayBuffer could be used to avoid an extra copy of objects which contain a large block of data internally.

3. Some combination of the above (let Blink write its own data, _and_ optionally provide another V8 value to serialize).

This would give some advantages of each, at the cost of a larger and more awkward API.

Kentaro Hara

Aug 7, 2016, 10:06:45 PM
to Jeremy Roman, platform-architecture-dev, cbr...@chromium.org, Jochen Eisinger, Elliott Sprehn, Joshua Bell
If our plan is to make a fundamental performance improvement by inventing a new wire format / APIs (instead of micro-optimizing the existing serializer), I don't think we should rush into a short-term solution; i.e., we should seriously consider how to make the existing broken implementation more correct when designing the new wire format & APIs.



I haven't yet found a DOM object whose [[Clone]] operation requires cloning a non-DOM JS object, so the API proposed in the doc would work if we're happy giving up this property. On the other hand, we could:

1. Allow the SerializeHostObject to reinvoke the v8::ValueSerializer as part of its serialization work.

This means that care must be taken to make the serializer re-entrant, so it somewhat complicates the interface.

The serializer must be re-entrant anyway, right? If some getter is overridden by a user's script, arbitrary code may run during serialization (which may call the serializer again).
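
(A rough illustration of that point; everything here is hypothetical example code: a getter reached during one postMessage() can kick off another, independent serialization while the first is still running:)

const { port1 } = new MessageChannel();
const tricky = {
  get x() {
    port1.postMessage({ nested: true });  // a second serialization starts mid-way through the first
    return 1;
  },
};
port1.postMessage(tricky);                // the outer serialization invokes the getter above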
 
2. Don't have Blink serialize directly into the buffer, but instead convert the DOM object into some value V8 can serialize.

roughly: v8::MaybeLocal<v8::Value> SerializeHostObject(v8::Local<v8::Object> object);

This gives a lot of nice properties: Blink doesn't have to deal with raw byte buffers or have its own logic for writing/reading integers etc. in a portable way, it gets reference deduplication "for free", and dealing with structured objects makes it more natural to add additional properties in the future without having to do much explicit work. For example, an ImageData might get mapped to {width: w, height: h, data: d}.

I like the idea :) It will also prevent us from having to implement wire-format-dependent logic in both V8 and Blink.

However, it would almost certainly be slower than having the DOM object's data serialized directly (because of the need to copy it into the V8 heap, and then from there into the serialized buffer, and similarly for deserialization), and larger (due to having to write string keys or equivalent).

The possible saving grace is that I expect #(DOM objects) << #(plain JS objects) in practice, so while there would be an overhead here, it might be minor. Clever but cautious use of an external ArrayBuffer could be used to avoid an extra copy of objects which contain a large block of data internally.

jsbell@: Do you have any idea on this?


--
Kentaro Hara, Tokyo, Japan


Joshua Bell

Aug 8, 2016, 1:12:30 PM
to Kentaro Hara, Jeremy Roman, platform-architecture-dev, cbr...@chromium.org, Jochen Eisinger, Elliott Sprehn
On Sun, Aug 7, 2016 at 7:06 PM, Kentaro Hara <har...@chromium.org> wrote:
On Sat, Aug 6, 2016 at 12:51 AM, Jeremy Roman <jbr...@chromium.org> wrote:
(Apologies; long email follows.)


On a related note, looking more closely at ImageData leads me to discover that it is a DOM object whose [[Clone]] operation recursively clones another JavaScript object (the Uint8ClampedArray it wraps).
Another example is FileList, whose [[Clone]] recursively clones its contained File objects.


As an aside, we're hoping FileList can be removed from the platform and replaced by FrozenArray<File>. But I suspect that'll take longer than we'd want to wait on getting a new serialization mechanism, and since it's not an isolated case (ImageData) I don't think this wish helps in a practical way. But thanks for noting this pattern, which we may want to avoid in future additions to the platform!
 
 
2. Don't have Blink serialize directly into the buffer, but instead convert the DOM object into some value V8 can serialize.

roughly: v8::MaybeLocal<v8::Value> SerializeHostObject(v8::Local<v8::Object> object);

This gives a lot of nice properties: Blink doesn't have to deal with raw byte buffers or have its own logic for writing/reading integers etc. in a portable way, it gets reference deduplication "for free", and dealing with structured objects makes it more natural to add additional properties in the future without having to do much explicit work. For example, an ImageData might get mapped to {width: w, height: h, data: d}.

I like the idea :) It will also prevent us from having to implement wire-format-dependent logic in both V8 and Blink.

However, it would almost certainly be slower than having the DOM object's data serialized directly (because of the need to copy it into the V8 heap, and then from there into the serialized buffer, and similarly for deserialization), and larger (due to having to write string keys or equivalent).

The possible saving grace is that I expect #(DOM objects) << #(plain JS objects) in practice, so while there would be an overhead here, it might be minor. Clever but cautious use of an external ArrayBuffer could be used to avoid an extra copy of objects which contain a large block of data internally.

jsbell@: Do you have any idea on this?

As noted in a comment in the doc: I agree with this intuition but don't have actual data. (We don't appear to have histograms on # records written vs. # files/blobs written for Indexed DB, but could collect that easily since we have to stash files/blobs on the side.) 

Anecdotally, high-performance apps using IDB use JSON.stringify to "work around" Chrome's slow serialization, and one can therefore assume they're mostly working with plain JS data. Also, at least for files/blobs (which I suspect are the most common DOM objects stored in IDB), the serialization time will be << the actual file I/O. (So #2 doesn't seem problematic.)

I don't have a sense of the use pattern for non-IDB usage, e.g. how common it is to postMessage ImageData (or other cloneable types) to/from a Worker for processing.

Jeremy Roman

Aug 8, 2016, 1:49:55 PM
to Joshua Bell, Kentaro Hara, platform-architecture-dev, Camillo Bruni, Jochen Eisinger, Elliott Sprehn
I meant that the same instance of the serializer must be re-entrant, i.e. it must be possible to continue serializing and adding to the same object identity map. (This isn't hard, necessarily, but it does restrict the implementation.)

User script should only be able to start other serialization processes which don't have to share an object identity map and other state with the surrounding serialization process.
 

Kentaro Hara

Aug 8, 2016, 8:10:12 PM
to Joshua Bell, Jeremy Roman, platform-architecture-dev, cbr...@chromium.org, Jochen Eisinger, Elliott Sprehn
Anecdotally, high-performance apps using IDB use JSON.stringify to "work around" Chrome's slow serialization, and one can therefore assume they're mostly working with plain JS data. Also, at least for files/blobs (which I suspect are the most common DOM objects stored in IDB), the serialization time will be << the actual file I/O. (So #2 doesn't seem problematic.)
 
I don't have a sense of the use pattern for non-IDB usage, e.g. how common it is to postMessage ImageData (or other cloneable types) to/from a Worker for processing.

It would be important to clarify this point. What benchmarks (use cases) are you trying to optimize?

As I mentioned in the other thread, postMessage() is 10x slower in Chrome than in Safari, but I don't know how problematic the slowness is in real-world scenarios.

Kentaro Hara

Aug 9, 2016, 1:28:22 AM
to Joshua Bell, Jeremy Roman, platform-architecture-dev, cbr...@chromium.org, Jochen Eisinger, Elliott Sprehn

I raised this topic in an arch-leads meeting and Elliott made a good point: Even if postMessage() is not a bottleneck of IndexedDB operations, it's worth optimizing because the serialization is causing a big jank on the main thread.

But I still think that it's important to clarify what benchmarks we should optimize.

Elliott Sprehn

Aug 9, 2016, 2:12:30 AM
to Kentaro Hara, Joshua Bell, Jeremy Roman, platform-architecture-dev, cbr...@chromium.org, Jochen Eisinger
On Mon, Aug 8, 2016 at 10:27 PM, Kentaro Hara <har...@chromium.org> wrote:

I raised this topic in an arch-leads meeting and Elliott made a good point: Even if postMessage() is not a bottleneck of IndexedDB operations, it's worth optimizing because the serialization is causing a big jank on the main thread.

But I still think that it's important to clarify what benchmarks we should optimize.

I'd say we should just build some simple benchmarks, for example:

If we assume that an app like Gmail was saving and loading the inbox through IndexedDB, or inside a Worker and postMessaging the data to the main thread, that'd be N rows of emails, each of which is (star, date, array of recipients, array of labels, subject, reply count, ...). My Gmail has 60 rows visible on page load, so N=60 here. How fast can we do that? What about on a phone?
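
(A rough sketch of such a benchmark; the row shape, field names, and numbers below are invented for illustration:)

function makeRow(i) {
  return {
    starred: i % 3 === 0,
    date: Date.now() - i * 60000,
    recipients: ['a@example.com', 'b@example.com'],
    labels: ['inbox', 'work'],
    subject: 'Message subject ' + i,
    replyCount: i % 5,
  };
}
const rows = Array.from({ length: 60 }, (_, i) => makeRow(i));  // N = 60 visible rows

const { port1, port2 } = new MessageChannel();
port2.onmessage = () => {
  console.log('post + receive (serialize/deserialize):', (performance.now() - start).toFixed(2), 'ms');
};
const start = performance.now();
port1.postMessage(rows);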

Remember that the main thread realistically has about 6ms to do all of the work to avoid jank during an A period in RAIL, so if that took 2ms and this patch makes it take 1ms, that's giving you 16% more time on the main thread to do other things which is pretty huge. :)

fwiw there's a bunch of code health improvements here too, and the 3.5x is a very large improvement. I think the only question to really understand here is if we should make this backwards compatible with the old format or try to create the new one. Not having to reserve a bunch of enum values in the v8 API isn't very compelling. Is the new format faster? How much less space does it use? Using a new format is added risk so it'd be nice to understand what we get in return (though jsbell@ was happy about it, and his team is the primary one affected, so this is more about diligence than go/no-go :).

Joshua Bell

Aug 9, 2016, 1:04:36 PM
to Elliott Sprehn, Kentaro Hara, Jeremy Roman, platform-architecture-dev, cbr...@chromium.org, Jochen Eisinger
We have the existing Indexed DB perf tests in src/chrome/test/data/indexeddb/ - some of the subtests showed an improvement on the bots with peria@'s recent changes.


The real-world scenario where this matters is database initialization; not Gmail loading 60 records, but when it gets 10000 or 100000 records from the server and dumps them all into the DB within one turn of the event loop, stalling the DOM. There's a mitigation - breaking up the insertion work to span multiple turns - but you still get stutter.




Jeremy Roman

Aug 9, 2016, 7:54:34 PM
to Joshua Bell, Elliott Sprehn, Kentaro Hara, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
On Mon, Aug 8, 2016 at 11:11 PM, Elliott Sprehn <esp...@chromium.org> wrote:

I'd say we should just build some simple benchmarks, for example:

If we assume that an app like Gmail was saving and loading the inbox through IndexedDB, or inside a Worker and postMessaging the data to the main thread, that'd be N rows of emails, each of which is (star, date, array of recipients, array of labels, subject, reply count, ...). My Gmail has 60 rows visible on page load, so N=60 here. How fast can we do that? What about on a phone?

For what it's worth, my current prototype shows considerable gains on both postMessage and IndexedDB put, especially on large values. I can construct 
 
Remember that the main thread realistically has about 6ms to do all of the work to avoid jank during an A period in RAIL, so if that took 2ms and this patch makes it take 1ms, that's giving you 16% more time on the main thread to do other things which is pretty huge. :)

fwiw there's a bunch of code health improvements here too, and the 3.5x is a very large improvement. I think the only question to really understand here is if we should make this backwards compatible with the old format or try to create the new one. Not having to reserve a bunch of enum values in the v8 API isn't very compelling. Is the new format faster? How much less space does it use? Using a new format is added risk so it'd be nice to understand what we get in return (though jsbell@ was happy about it, and his team is the primary one affected, so this is more about diligence than go/no-go :).

I'd expect it to be faster (to be sure I'd really have to write and optimize a V8-based implementation of both formats), and it is somewhat smaller (10-30%-ish, depending on what you're writing -- though for large values I'd expect LevelDB's compression to wipe away the difference for IndexedDB). I'd argue that the rationality of the new format is itself noteworthy; in particular, the current format does some rather strange things with endianness that make it non-portable. For instance, a double which is represented with big-endian bytes:

aa bb cc dd ee ff gg hh

will be written out in native endianness, except with pairs of bytes swapped if the machine is little-endian, so:

little-endian machine: gg hh ee ff cc dd aa bb
big-endian machine: aa bb cc dd ee ff gg hh

If the existing output was two-byte-aligned, that is. Otherwise, on LE machines it's

hh XX ff gg dd ee bb cc XX aa

where the XX are from the stuff serialized next to it (or padding).

This is an artifact of the current serialization logic writing various kinds of bytes into a Vector<UChar>, and then trying to fix up the endianness of those characters. There are a few of these quirks that I think would make precisely replicating the existing format require great attention to detail, if it were even wanted in V8.
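
(A small illustration of the pairwise swap described above; this is not the serializer's code, just the transformation applied to a double's bytes on a little-endian machine:)

const buf = new ArrayBuffer(8);
new DataView(buf).setFloat64(0, Math.PI, true);  // little-endian bytes: hh gg ff ee dd cc bb aa
const native = [...new Uint8Array(buf)];
const written = [];
for (let i = 0; i < 8; i += 2) {
  written.push(native[i + 1], native[i]);        // each byte pair comes out swapped: gg hh ee ff cc dd aa bb
}
console.log(written.map((b) => b.toString(16).padStart(2, '0')).join(' '));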


Kentaro Hara

Aug 10, 2016, 12:55:54 AM
to Jeremy Roman, Joshua Bell, Elliott Sprehn, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
I think that Elliott's point is just that if there is a way to make postMessage() XX% faster without changing the existing wire format, it would be better to try that first. Then we can consider making it even faster by inventing a new, nicer format.

Jeremy Roman

Aug 10, 2016, 3:43:21 PM
to Kentaro Hara, Joshua Bell, Elliott Sprehn, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
I just hacked up approximately the current wire format (excluding some of the byte swapping for now, though I'd have to add that back to do so explicitly), and its speed is within measurement error of my experiment with a new wire format. The wins of a new wire format are, roughly:
- minor win on size
- moderate simplicity win
- improved portability (though a new version of the existing one with endianness fixes would be possible) -- the current format cannot be read on a CPU with different endianness, unlike most of the Chrome profile
- built-in separation between V8 and Blink types (else, we'll basically have to define some way to separate the tag namespace, reserving the ones already in use and separating out the ones available for other uses)

If we do want a new format, I'd prefer not to go through the work of implementing both of them in V8, and then saddling not only Blink but also V8 with maintaining the old format. But it is viable to maintain the existing wire format if we're willing to accept the aforementioned quirks. (The pairwise byte swapping, at least, is a postprocessing step that could thankfully be left in Blink.)

Kentaro Hara

Aug 10, 2016, 10:43:07 PM
to Jeremy Roman, Joshua Bell, Elliott Sprehn, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
That's good news, Jeremy! Then I'd suggest separating this optimization project into two phases:

1) We make postMessage() faster while keeping the existing wire format. This wouldn't be controversial.
2) Once we get there, we can consider how to make the wire format better and make it even faster (if needed).

Elliott Sprehn

Aug 10, 2016, 10:49:03 PM
to Kentaro Hara, Jeremy Roman, Joshua Bell, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
One suggestion would be to put the old format in v8 under some option like kWeirdAncientWebKitSerializationFormatProbablyNotWhatYouWant (literally something awful like that). That'll discourage anyone else using the v8 API from using it. Then I'd start implementing the new serialization format under something like kStandardFormat. Once it works we can try switching blink over to it and deleting the old mode.

That separates the risk of rewriting the serializer/deserializer from the risk of creating a new format.

I do still think we should try doing both, it doesn't seem like too much work here. I don't think we should be worried about some churn in the v8 API, we've done all kinds of things there before.

Kentaro Hara

Aug 10, 2016, 10:59:44 PM
to Elliott Sprehn, Jeremy Roman, Joshua Bell, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
On Thu, Aug 11, 2016 at 11:48 AM, Elliott Sprehn <esp...@chromium.org> wrote:
One suggestion would be to put the old format in v8 under some option like kWeirdAncientWebKitSerializationFormatProbablyNotWhatYouWant (literally something awful like that). That'll discourage anyone else using the v8 API from using it. Then I'd start implementing the new serialization format under something like kStandardFormat. Once it works we can try switching blink over to it and deleting the old mode.

That separates the risk of rewriting the serializer/deserializer from the risk of creating a new format.

I do still think we should try doing both, it doesn't seem like too much work here. I don't think we should be worried about some churn in the v8 API, we've done all kinds of things there before.

I totally agree we should do both in the end, but I think we should have a more careful discussion before inventing a new format, including a couple of points jsbell@ pointed out in the document (e.g., partial decode of a serialized value). It's not easy to change the format once we ship it into the wild.

Joshua Bell

Aug 11, 2016, 12:40:57 PM
to Kentaro Hara, Elliott Sprehn, Jeremy Roman, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
I'm liking this plan!

One worry comes to mind: once the old serialization format appears in v8 (initially as the only format), are there likely to be non-Chrome users of v8 (e.g. some node.js binding to a database) that will make it more difficult to slim down/remove in the future as we wean Chrome off of it?

We might mitigate that somewhat by surrounding the API with big scary warnings and ugly names (like Elliott suggests) even while it is the only supported format, until we have a format that we really are committing to maintaining indefinitely.

Another idea to throw out: at the point where we have a new wire format, we can immediately switch non-persistent uses in Blink over exclusively. Only the persistent uses (i.e. IndexedDB) need access to the legacy decode. We could move all of the legacy decode logic back into Blink; it would be slower (i.e. same perf as today) for values persisted before the switch but still fast for fresh data and it keeps v8 cleaner. Blink would need to maintain knowledge of the version signature.

It probably doesn't matter where the legacy decode logic lives, just thinking of ways not to pollute v8 too badly.

Jeremy Roman

Aug 11, 2016, 2:28:21 PM
to Joshua Bell, Kentaro Hara, Elliott Sprehn, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
On Thu, Aug 11, 2016 at 12:40 PM, Joshua Bell <jsb...@chromium.org> wrote:
I'm liking this plan!

One worry comes to mind: once the old serialization format appears in v8 (initially as the only format), are there likely to be non-Chrome users of v8 (e.g. some node.js binding to a database) that will make it more difficult to slim down/remove in the future as we wean Chrome off of it?

We might mitigate that somewhat by surrounding the API with big scary warnings and ugly names (like Elliott suggests) even while it is the only supported format, until we have a format that we really are committing to maintaining indefinitely.

That's my worry as well. It would be great to have feedback from the V8 team about how this looks from their end. (jochen/cbruni?)
 
Another idea to throw out: at the point where we have a new wire format, we can immediately switch non-persistent uses in Blink over exclusively. Only the persistent uses (i.e. IndexedDB) need access to the legacy decode. We could move all of the legacy decode logic back into Blink; it would be slower (i.e. same perf as today) for values persisted before the switch but still fast for fresh data and it keeps v8 cleaner. Blink would need to maintain knowledge of the version signature.

This was the plan I originally proposed, except that I didn't intend to put the new format in V8.

Kentaro Hara

Aug 11, 2016, 9:13:01 PM
to Jeremy Roman, Joshua Bell, Elliott Sprehn, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
The plan sounds reasonable to me.

Camillo Bruni

Aug 12, 2016, 5:01:08 AM
to Kentaro Hara, Jeremy Roman, Joshua Bell, Elliott Sprehn, platform-architecture-dev, Camillo Bruni, Jochen Eisinger
There has been an ongoing discussion on the Node side about faster object serialization other than JSON: https://github.com/nodejs/node/issues/6300

So there is a chance that, if we do this in an unstructured way, the old format will get picked up too soon.
I think we have to come up with a decent road map and communicate it to the Node people.

--
Camillo Bruni | Software Engineer, V8 | Google Germany GmbH | Erika-Mann Str. 33, 80636 München 
Registergericht und -nummer: Hamburg, HRB 86891 | Sitz der Gesellschaft: Hamburg | Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle