Serialization Side Effects

Adrienne Walker

Jul 30, 2020, 2:21:30 PM
to v8-u...@googlegroups.com
Hi-

Is there any way to know from a v8::Value whether serializing it will have side effects (at all or on particular properties)?

Context: I was investigating IndexedDB write performance in Chrome and discovered that, when inline keys are used, each value is serialized and then immediately deserialized again to get the key value(s) out. This is because serialization might have side effects, so it's not safe to read the key from the original value. The round trip takes quite a bit of time, especially on larger or more complicated values.

I was wondering if there is anything that already exists (that I just can't see from reading the code) or that we could add to make the calling code smarter about this.

Ben Noordhuis

Jul 31, 2020, 12:57:08 AM
to v8-users
On Thu, Jul 30, 2020 at 8:21 PM Adrienne Walker <en...@chromium.org> wrote:
> Is there any way to know from a v8::Value whether serializing it will have side effects (at all or on particular properties)?

Apart from checking whether it's primitive (v->IsNullOrUndefined() ||
v->IsBoolean() || ...), I believe the answer is 'no'. Non-primitive
values can have getters, and getters execute arbitrary code.
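
A minimal TypeScript sketch of why getters matter here. It is a script-level illustration, not V8 embedder code, and it assumes JSON.stringify as a stand-in for the real serializer (both have to read the object's own enumerable properties). The names getterCalls and record are made up for the example.

```typescript
// Counts how many times the accessor below has run.
let getterCalls = 0;

const record = {
  id: 1,
  // Accessor property: every read runs this function.
  get key(): string {
    getterCalls += 1;
    return `key-${getterCalls}`;
  },
};

// JSON.stringify is only a stand-in serializer here; like any serializer that
// walks properties, it has to read `key`, which invokes the getter.
const firstPass = JSON.stringify(record);
const secondPass = JSON.stringify(record);

// The property descriptor reveals the accessor without invoking it.
const descriptor = Object.getOwnPropertyDescriptor(record, "key");
const keyIsAccessor =
  descriptor !== undefined && typeof descriptor.get === "function";
```

The two passes see different values for `key`, so a key read from the original object after serialization would not necessarily match what was actually serialized.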

Checking for only simple properties recursively is an option but
probably not faster and you'll need to handle cycles and a ton of edge
cases (what if the property is a pending promise? what if it's a
WeakMap? etc.)

The debugger has a "side-effect-free evaluate" mode but that operates
on functions, not values. You could use it to check getters for side
effects (and promises, and...) but the algorithm is conservative (can
report side effects when there are none) and runs in O(n) time
relative to the function's bytecode size.

(It's actually worse than that. It does bytecode analysis + runtime
evaluation in a throwaway context. I suspect it hangs on a busy loop.
That makes it... O(Infinity)?)

The relevant methods are DebugEvaluate::Global() and
Debug::PerformSideEffectCheck(). Neither is currently exposed by the
public API except indirectly, through the Runtime.evaluate and
Debugger.evaluateOnCallFrame inspector protocol commands (via their
throwOnSideEffect option).

Adrienne Walker

Aug 4, 2020, 4:32:36 PM
to v8-u...@googlegroups.com
On Thu, Jul 30, 2020 at 9:57 PM Ben Noordhuis <in...@bnoordhuis.nl> wrote:
> On Thu, Jul 30, 2020 at 8:21 PM Adrienne Walker <en...@chromium.org> wrote:
>> Is there any way to know from a v8::Value whether serializing it will have side effects (at all or on particular properties)?
>
> Apart from checking whether it's primitive (v->IsNullOrUndefined() ||
> v->IsBoolean() || ...), I believe the answer is 'no'. Non-primitive
> values can have getters and getters execute arbitrary code.
>
> Checking for only simple properties recursively is an option but
> probably not faster and you'll need to handle cycles and a ton of edge
> cases (what if the property is a pending promise? what if it's a
> WeakMap? etc.)

Can I use HasRealNamedProperty/GetRealNamedProperty to see if I can access those properties without side effects and then check if those values are primitive from there? I suspect that most IndexedDB keys being provided here are primitive string values inside a single simple object, so I am trying to figure out how to fast-path this case. Is there a way to tell if a property is one of these complicated edge cases that you mention?
 
> The debugger has a "side-effect-free evaluate" mode but that operates
> on functions, not values. You could use it to check getters for side
> effects (and promises, and...) but the algorithm is conservative (can
> report side effects when there are none) and runs in O(n) time
> relative to the function's bytecode size.

Given the potential performance issues there, this doesn't sound like a plausible approach.

The only other thing we thought of was some sort of observer that runs as part of serialization and records the key values, so that we would not have to deserialize again to access them. I worry that this might be too invasive to V8's serialization, though.

Ben Noordhuis

Aug 4, 2020, 5:04:10 PM
to v8-users
On Tue, Aug 4, 2020 at 10:32 PM Adrienne Walker <en...@chromium.org> wrote:
>
> On Thu, Jul 30, 2020 at 9:57 PM Ben Noordhuis <in...@bnoordhuis.nl> wrote:
>>
>> On Thu, Jul 30, 2020 at 8:21 PM Adrienne Walker <en...@chromium.org> wrote:
>> > Is there any way to know from a v8::Value whether serializing it will have side effects (at all or on particular properties)?
>>
>> Apart from checking whether it's primitive (v->IsNullOrUndefined() ||
>> v->IsBoolean() || ...), I believe the answer is 'no'. Non-primitive
>> values can have getters and getters execute arbitrary code.
>>
>> Checking for only simple properties recursively is an option but
>> probably not faster and you'll need to handle cycles and a ton of edge
>> cases (what if the property is a pending promise? what if it's a
>> WeakMap? etc.)
>
>
> Can I use HasRealNamedProperty/GetRealNamedProperty to see if I can access those properties without side effects and then check if those values are primitive from there? I suspect that most IndexedDB keys being provided here are primitive string values inside a single simple object, so I am trying to figure out how to fast-path this case.

Yes, that could work, but you'll have to recurse into non-primitive
property values. GetRealNamedProperty() and co are fairly slow
(although probably not much slower than Get() - the whole C++ API is
fairly slow compared to native JS property access) so you'll probably
have to benchmark whether it's an improvement over the naive approach.

> Is there a way to tell if a property is one of these complicated edge cases that you mention?

Exhaustive v->IsPromise() || v->IsWeakMap() || ... checks. :-)
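
A rough TypeScript sketch of the recursive check discussed above, written at script level rather than against the C++ API. It assumes a conservative policy: only primitives, plain objects and arrays made of own data properties count as simple; everything else (promises, WeakMaps, class instances, accessors) is rejected. The name isSimpleValue is hypothetical. One caveat: script cannot tell a Proxy apart from its target, and a Proxy's traps would run code during the check itself, so an embedder would need a separate test such as v->IsProxy().

```typescript
// Returns true only if `value` is a primitive, or a plain object/array whose
// own properties are all data properties holding values that pass the same
// check. Accessors are detected through descriptors and are never invoked.
function isSimpleValue(value: unknown, seen: Set<object> = new Set()): boolean {
  if (value === null) return true;
  if (typeof value === "function") return false;
  if (typeof value !== "object") return true; // remaining primitives

  const obj: object = value;
  // Already visited: either a cycle or a shared reference. Any failure in it
  // is (or will be) reported by the visit that is already responsible for it.
  if (seen.has(obj)) return true;
  seen.add(obj);

  // Reject anything that is not a plain object or array, which covers
  // Promise, WeakMap, Map, Date, class instances and so on.
  const proto = Object.getPrototypeOf(obj);
  if (proto !== Object.prototype && proto !== Array.prototype && proto !== null) {
    return false;
  }

  for (const key of Reflect.ownKeys(obj)) {
    const desc = Object.getOwnPropertyDescriptor(obj, key);
    if (desc === undefined) return false;
    // Accessor property: reading it could execute arbitrary code.
    if (desc.get !== undefined || desc.set !== undefined) return false;
    if (!isSimpleValue(desc.value, seen)) return false;
  }
  return true;
}
```

Whether a walk like this beats serializing and deserializing would still need benchmarking, as noted above, since it touches every property once before the serializer touches them again.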

Adam Klein

Aug 5, 2020, 8:20:12 PM
to v8-users, jbr...@chromium.org
To the question of whether something might be "invasive to v8's serialization", I think the history here is important: serialization used to live in Blink, but was moved into V8 (by jbroman@, CCed) for performance reasons. Given that the problem you're investigating is another performance issue in serialization, making such an invasive change should at least be on the table, I'd argue.

- Adam
 

kosit la-orngsri

Aug 6, 2020, 12:42:56 AM
to v8-users
Hello

On Thursday, August 6, 2020 at 7:20:12 AM UTC+7, ad...@chromium.org wrote:

Jeremy Roman

Aug 6, 2020, 10:19:02 AM
to Adam Klein, v8-users
[resending because v8-users rejects email from non-subscribers]

Right, you cannot tell in general whether something can be serialized without side effects (though you can tell that certain things definitely won't have side effects). And if the spec says the value should be serialized (or content expects it), not doing so is a web-visible behavior change at present (albeit reasonably likely to be web-compatible).

I would worry about the performance of such an observer, and it seems like the observer would also need to be fairly complicated anyway in general. For example, you might want key path "b.c.d", but b.c turns out to just be a back-reference to a.x earlier (which the serializer basically knows as an object index at that point), so you might need to reconstruct a substantial part of the object graph anyway. I don't think it's off the table, but it seems quite non-trivial if you want this to be a general solution.
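
To make the back-reference problem concrete, here is a toy TypeScript sketch. It is not V8's wire format: the record shapes, the id numbering and the names toRecords and findInRecords are all assumptions for illustration. The serializer emits each object once and a `ref` record on every later encounter, and an observer that follows a key path through the emitted records gets stuck at the reference.

```typescript
type Rec =
  | { kind: "primitive"; value: string | number | boolean | null }
  | { kind: "object"; id: number; entries: Array<[string, Rec]> }
  | { kind: "ref"; id: number };

type Plain = string | number | boolean | null | { [key: string]: Plain };

// Toy serializer: an object seen before is emitted as a back-reference.
function toRecords(value: Plain, ids: Map<object, number> = new Map()): Rec {
  if (value === null || typeof value !== "object") {
    return { kind: "primitive", value };
  }
  const known = ids.get(value);
  if (known !== undefined) {
    return { kind: "ref", id: known }; // no properties are re-emitted
  }
  const id = ids.size;
  ids.set(value, id);
  const entries: Array<[string, Rec]> = [];
  for (const key of Object.keys(value)) {
    entries.push([key, toRecords(value[key], ids)]);
  }
  return { kind: "object", id, entries };
}

// Naive observer: follows a key path through the emitted records only.
function findInRecords(rec: Rec, path: string[]): Rec | undefined {
  let current: Rec | undefined = rec;
  for (const segment of path) {
    if (current === undefined || current.kind !== "object") return undefined;
    const entry = current.entries.find(([k]) => k === segment);
    current = entry ? entry[1] : undefined;
  }
  return current;
}

const shared = { d: "key-42" };
const graph: Plain = { a: { x: shared }, b: { c: shared } };
const records = toRecords(graph);

const atBC = findInRecords(records, ["b", "c"]); // a ref to the object first seen at a.x
const atBCD = findInRecords(records, ["b", "c", "d"]); // undefined: a ref has no entries
```

Resolving "b.c.d" would require the observer to keep its own table from id to emitted object, which is the partial reconstruction of the object graph described above.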

If the typical cases are fairly simple, then it seems like we could just be pretty naive: only proceed if all we see are simple objects with no accessor properties, interceptors, etc., and fall back to whatever we currently do if that isn't the case. That seems somewhat duplicative, sure, but it seems like it would give you 90%+ coverage with much less complexity.
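
The naive approach might look roughly like the following TypeScript sketch. It assumes the common case named earlier in the thread, a single flat object whose own properties are all primitive data properties, and takes the existing behaviour as a slowPath callback. The names isFlatSimpleObject, extractInlineKey and slowPath are hypothetical, and the real check would live in C++ and also have to consider interceptors and proxies, which script cannot observe.

```typescript
// True only for a plain object whose own properties are all data properties
// holding primitives. The whole object is checked, not just the key, because
// a getter on any other property could change the key during serialization.
function isFlatSimpleObject(value: unknown): value is Record<string, unknown> {
  if (typeof value !== "object" || value === null) return false;
  const obj: object = value;
  if (Object.getPrototypeOf(obj) !== Object.prototype) return false;
  return Reflect.ownKeys(obj).every((key) => {
    const desc = Object.getOwnPropertyDescriptor(obj, key);
    if (desc === undefined) return false;
    if (desc.get !== undefined || desc.set !== undefined) return false;
    const v: unknown = desc.value;
    return v === null || (typeof v !== "object" && typeof v !== "function");
  });
}

// Reads the inline key directly when the value is provably simple, and
// otherwise falls back to reading it from a copy produced by `slowPath`
// (standing in for "serialize, then deserialize").
function extractInlineKey(
  value: unknown,
  keyName: string,
  slowPath: (original: unknown) => unknown,
): { key: unknown; usedFastPath: boolean } {
  if (isFlatSimpleObject(value)) {
    return { key: value[keyName], usedFastPath: true };
  }
  const copy = slowPath(value);
  const key =
    typeof copy === "object" && copy !== null
      ? (copy as Record<string, unknown>)[keyName]
      : undefined;
  return { key, usedFastPath: false };
}
```

A caller would pass the current round trip as slowPath, so behaviour stays unchanged for anything the check does not recognize.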