I created a 16MB buffer for every client request, as a local
variable, which means it will be GCed soon. When I ran a benchmark, the QPS
was much lower than expected, only hundreds per second. According to the CPU
profiler, GC consumed 77% of total CPU time. With the --trace_gc option, there
are many lines like this:
785292 ms: Mark-sweep 9.5 (46.0) -> 9.4 (46.0) MB, 14 ms [external memory
allocation limit reached] [GC in old space requested].
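For context, the allocation pattern looks roughly like this (a minimal sketch; the handler name is hypothetical, and it uses the modern Buffer.alloc API where older node would have used `new Buffer(size)`):

```javascript
// Minimal sketch of the problematic pattern: a fresh 16MB Buffer per request.
const SIXTEEN_MB = 16 * 1024 * 1024;

function handleRequest(req) {
  // Local Buffer: it becomes garbage as soon as the handler returns,
  // but its 16MB still counts against v8's external allocation limit.
  const buf = Buffer.alloc(SIXTEEN_MB);
  // ... fill and use buf ...
  return buf.length;
}
```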
I looked in the node source: this message is output by
Heap::AdjustAmountOfExternalAllocatedMemory(), which is called in the
constructor and destructor of class Buffer. When
amount_since_last_global_gc exceeds external_allocation_limit_ (16MB by
default), CollectAllGarbage() is executed. That means after every
(16MB / buf_size) new Buffer allocations, CollectAllGarbage() runs once,
and each run takes about 14 ms. Since my buf_size is 16MB, every single
Buffer allocation triggers a full GC.
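The arithmetic above can be checked directly. This is only an illustration of the bound that GC alone imposes on a single core; the 14 ms pause comes from the trace output, and real throughput also depends on the rest of the request work:

```javascript
// How often CollectAllGarbage() fires, per the logic described above.
const LIMIT = 16 * 1024 * 1024;      // external_allocation_limit_ default
const BUF_SIZE = 16 * 1024 * 1024;   // one 16MB buffer per request
const GC_PAUSE_MS = 14;              // from the --trace_gc output

// Buffers allocated between full GCs (1 here: every allocation triggers GC):
const buffersPerGC = LIMIT / BUF_SIZE;
// GC time charged to each request:
const gcMsPerRequest = GC_PAUSE_MS / buffersPerGC;
// Upper bound on single-core throughput from GC pauses alone:
const maxQps = 1000 / gcMsPerRequest; // ~71 qps
```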
external_allocation_limit_ is derived from the v8 option
--max_new_space_size, but that option has no effect when the v8
snapshot is used, which is the default. I rebuilt node with "./configure
--without-snapshot", ran the server with "--max_new_space_size=819200", and
performance improved about 3x. GC is still expensive, though, at about
30% of total CPU time.
So, you should be careful with large buffers.
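One way to be careful, sketched here as an assumption rather than something the post benchmarked: reuse large buffers instead of allocating a fresh one per request, so the external allocation counter stops churning. A tiny free-list is enough to show the idea:

```javascript
// Hypothetical mitigation: a free-list so large Buffers are reused
// instead of being allocated (and collected) once per request.
const POOL_BUF_SIZE = 16 * 1024 * 1024;
const freeList = [];

function acquire() {
  // Reuse a buffer if one is available; allocate only on a cold start.
  return freeList.pop() || Buffer.alloc(POOL_BUF_SIZE);
}

function release(buf) {
  freeList.push(buf); // hand the buffer back for the next request
}
```

In steady state this performs no large allocations at all, so AdjustAmountOfExternalAllocatedMemory() is no longer driven past its limit on every request.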