Javascript object as hash table with 7 million plus entries slows V8

433 views
Skip to first unread message

Joran Greef

unread,
May 11, 2012, 3:50:00 AM5/11/12
to v8-u...@googlegroups.com
I am using V8 as part of Node and have written a Javascript implementation of Bitcask, using a Javascript object as a hash to keep file offsets in memory.

This object has 7 million entries and I'm noticing that while the JS code is resting, doing nothing, V8 is hitting 100% CPU every few seconds and doing this continually.

Attached is the full result of running V8 with --prof.

And of particular interest:

[C++]:
   ticks  total  nonlib   name
  73615   43.1%   43.1%  v8::internal::StaticMarkingVisitor::VisitUnmarkedObjects
  68436   40.1%   40.1%  _accept$NOCANCEL
   4796    2.8%    2.8%  v8::internal::FlexibleBodyVisitor<v8::internal::StaticMarkingVisitor, v8::internal::JSObject::BodyDescriptor, void>::VisitSpecialized<40>

Should I be using many smaller hashes to keep this overhead down? i.e. some sort of sparse hash implementation? Or using key mod 1000 to determine the hash it should be in?

Does V8 have limits on hash table sizes?

Thanks.
profile.txt

Florian Schneider

unread,
May 20, 2012, 4:52:12 PM5/20/12
to v8-u...@googlegroups.com
You're probably almost hitting the maximum heap size. The profile indicates that the program spends almost 100% time in the garbage collector. You could try running with a larger heap, but in general JS objects in V8 are not optimized to be used as large hash tables. Most likely, it would be best to implement your own hash table data structure using an array as a backing store instead of using a JS object as a hash table.

2012/5/11 Joran Greef <jo...@ronomon.com>

Joran Greef

unread,
May 21, 2012, 2:12:20 AM5/21/12
to v8-u...@googlegroups.com
I was already using --max_old_space_size=32768, was it the GC iterating through every key in the hash? Am running the process now with --nouse_idle_notification and that has brought the CPU right down. I also implemented my own hash as described here: https://groups.google.com/d/msg/nodejs/yLbuS7YONTI/p_CXyj-XIwwJ

A benefit with a binary hash is that you can allocate enough buckets in advance to avoid resizes.
Reply all
Reply to author
Forward
0 new messages