Serializing objects; quickly!


hapal...@googlemail.com

Jun 2, 2011, 4:44:25 PM
to v8-users
I need to serialize moderately complex objects, each with anywhere from
one to hundreds of mixed-type properties.

I originally used JSON, then switched to BSON, which is marginally
faster.

Encoding 10,000 sample objects (V8 3.3.10):

JSON: 1807 ms
BSON: 1687 ms

I want an order-of-magnitude speedup; serialization is having a
ridiculously bad impact on the rest of the system.

Part of the motivation to move to BSON is the requirement to encode
binary data (BinaryF), so JSON is (now) unsuitable.

Profiled BSON performance hot-spots

- (unavoidable?) conversion of V8's UTF-16 JS strings to UTF-8
- malloc and string operations inside the BSON library
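To make the two hot-spots above concrete, here is a minimal sketch of a UTF-16 to UTF-8 transcoder that appends into a caller-owned, reusable buffer, so no allocation happens per string once the buffer has reached its working size. The name `AppendUtf8` is hypothetical; this is not a V8 or Mongo BSON API, and it assumes the UTF-16 code units have already been obtained from V8 by some other means.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of V8 or the Mongo BSON library).
// Appends the UTF-8 encoding of `len` UTF-16 code units to `out`.
// Unpaired surrogates are replaced with U+FFFD.
inline void AppendUtf8(const char16_t* src, std::size_t len,
                       std::vector<std::uint8_t>& out) {
    assert(src != nullptr || len == 0);
    // Worst case is 3 UTF-8 bytes per UTF-16 code unit, so one reserve
    // call covers the whole string.
    out.reserve(out.size() + len * 3);
    for (std::size_t i = 0; i < len; ++i) {
        std::uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with a following low surrogate.
            if (i + 1 < len && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // Low surrogate with no preceding high surrogate.
        }
        if (cp < 0x80) {
            out.push_back(static_cast<std::uint8_t>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<std::uint8_t>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<std::uint8_t>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<std::uint8_t>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
        }
    }
}
```

Calling `out.clear()` between messages keeps the vector's capacity, so the same buffer can be reused for every object being encoded.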

The BSON encoder is based on the Mongo BSON library.

A native V8 binary serializer might be wonderful, yet as JSON is
native and quick to serialize I fear even that might not provide the
answer. Perhaps my best bet is to optimize the heck out of the BSON
library or write my own, plus figure out a far more efficient way to
pull strings out of V8. One tactic might be to add UTF-16 support to
BSON to skip that step.
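One possible shape for the UTF-16 idea, as a sketch only: BSON string elements are defined as UTF-8, so the raw UTF-16 code units could instead be carried in a binary element (type 0x05) with the user-defined subtype 0x80. The function name `AppendUtf16AsBinary` and the convention that subtype 0x80 means "UTF-16LE text" are assumptions made for illustration; any reader of the data would have to agree on that convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: appends ONE BSON element (not a whole document)
// that stores `len` UTF-16 code units as little-endian bytes.
// Layout: 0x05, element name as cstring, int32 byte length, subtype, bytes.
inline void AppendUtf16AsBinary(const std::string& name,
                                const char16_t* value, std::size_t len,
                                std::vector<std::uint8_t>& out) {
    // BSON element names are C strings and must not contain NUL bytes.
    assert(name.find('\0') == std::string::npos);
    assert(value != nullptr || len == 0);

    out.push_back(0x05);                          // element type: binary
    out.insert(out.end(), name.begin(), name.end());
    out.push_back(0x00);                          // cstring terminator

    const std::uint32_t byte_len = static_cast<std::uint32_t>(len * 2);
    for (int shift = 0; shift < 32; shift += 8) { // int32, little-endian
        out.push_back(static_cast<std::uint8_t>((byte_len >> shift) & 0xFF));
    }
    out.push_back(0x80);                          // subtype: user-defined

    // Emit each code unit low byte first, independent of host byte order.
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(static_cast<std::uint8_t>(value[i] & 0xFF));
        out.push_back(static_cast<std::uint8_t>(value[i] >> 8));
    }
}
```

The trade-off is that the payload is no longer a standard BSON string: mostly-ASCII text doubles in size, and generic BSON tools will show it as opaque binary.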

Is there any way to use the snapshot serializer for this purpose? Would
it even approach the performance I'm looking for?

Is it feasible to cache string conversion results using the object
hash?
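A rough sketch of what such a cache might look like, under stated assumptions: the caller supplies some per-string hash (how to get one from V8 is not covered here), and because hashes can collide, each entry also keeps the original UTF-16 text and checks it on lookup. `Utf8Cache` and its members are hypothetical names, and the converter is passed in so the sketch does not depend on any particular transcoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache of UTF-8 conversions, keyed by a caller-supplied hash.
// There is no eviction, so a real system would need to bound its size.
class Utf8Cache {
public:
    using Converter = std::function<std::string(const std::u16string&)>;

    explicit Utf8Cache(Converter convert) : convert_(std::move(convert)) {
        assert(convert_);
    }

    // Returns the cached UTF-8 for `source`, converting on a miss.
    // The reference is only guaranteed valid until the next call to Get.
    const std::string& Get(std::uint32_t hash, const std::u16string& source) {
        auto it = entries_.find(hash);
        // Comparing the stored source guards against hash collisions.
        if (it != entries_.end() && it->second.source == source) {
            ++hits_;
            return it->second.utf8;
        }
        Entry& entry = entries_[hash];  // Insert, or overwrite on collision.
        entry.source = source;
        entry.utf8 = convert_(source);
        return entry.utf8;
    }

    std::size_t hits() const { return hits_; }

private:
    struct Entry {
        std::u16string source;
        std::string utf8;
    };

    Converter convert_;
    std::unordered_map<std::uint32_t, Entry> entries_;
    std::size_t hits_ = 0;
};
```

Whether this pays off depends on how often the same strings recur (property names seem a likelier win than message bodies), since every hit still costs a hash lookup and a comparison of the source text.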

Any other ideas?

Thanks,

Stuart.

bradley.meck

Jun 2, 2011, 5:18:27 PM
to v8-u...@googlegroups.com
Rather than focusing on scaling serialization, changing the architecture might be more appropriate: using a stream-based serializer, removing redundant data by using references of some kind (even if in userland), and so on. Often, when we want to share data between things in V8 (I'm assuming out of process or across Isolates), the wall we hit is not serialization itself but the raw cost of computation, for example dictionary construction time. It may not help you serialize things faster, but a leaner architecture might be a solution for you, especially if you can isolate data that can be processed in parallel with no interdependencies. Without knowing more about the problem, I cannot comment beyond that.

hapal...@googlemail.com

Jun 2, 2011, 9:36:22 PM
to v8-users
Roughly speaking:

It's an Erlang-like message-parsing platform, and the serialization is
required to monitor the messages as they flow through the system. There
are thousands of processes and messages (objects) sent per second, and
all of them, along with other metadata, are serialized. It's
exceptionally efficient until it hits serialization. The messages are
schema-less and user-defined.

If I could speed up the v8::String conversion, it would already help
tremendously. I'll run some tests that bypass the UTF-16 to UTF-8
conversion and see what impact that has.

Stuart.