Strange Scavenge allocation failure cause dramatical performance dropdown

862 views
Skip to first unread message

Huabin Ling

unread,
Dec 5, 2017, 5:24:46 AM12/5/17
to v8-users
Hi, 

I'm recently developing a rendering engine for WebGL, but met a serious performance issue caused by Scavenge allocation failure issue.

I have a test case which add more and more sprites on the screen, when it's arriving at about 7000 (and stop adding more), the cost of engine is about 6ms per frame, but it grows to 20 ms in seconds. When I add --trace_gc to Chrome, I see the behavior is totally related to a huge amount of Scavenge allocation failure log, like the following


[33525:0x7fa78280fc00]   955497 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955580 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955663 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955747 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955830 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955913 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955998 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956084 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956170 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956254 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956335 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956418 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956502 ms: Scavenge 37.0 (46.5) -> 35.2 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956584 ms: Scavenge 37.0 (46.5) -> 35.2 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956667 ms: Scavenge 37.0 (46.5) -> 35.2 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956752 ms: Scavenge 37.0 (46.5) -> 35.2 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

... (it keeps printing very fast)


However, when I do a allocation profiling, it shows that the JS heap size is quite low (36mb), and I generate nearly nothing in heap



Some additional information, we tried very hard to reduce memory allocation in the engine core, so we used a lot of object pools and pre-generated objects, including some large array buffers, largest one is 20000 * 6 floats, and we keep recycling / updating these objects.


I also tried to check jit status, no hot functions is being deoptimized, the scavenge log is the only thing related to the strange performance drop down. 


Does anyone have a clue on my problem ? Thanks a lot for helping


Huabin


Huabin Ling

unread,
Dec 5, 2017, 5:34:04 AM12/5/17
to v8-users
By the way, the allocation failure log doesn't occur before the performance drop down.

If a switch page and wait a moment, then return to the test case page, there is chance that chrome stops printing "allocation failure" log and the frame time returns to normal, but it will drop down again in a short time.

Michael Lippautz

unread,
Dec 5, 2017, 12:56:34 PM12/5/17
to v8-users
At first sight, the Scavenge pause times seem unrelated to your problem.

Can you take a trace using Chrome's tracing infrastructure? See [1]. This one will help dig down into what is causing the slowdown.

Rest inlined.


On Tue, Dec 5, 2017 at 11:24 AM Huabin Ling <huabi...@cocos2d-x.org> wrote:
Hi, 

I'm recently developing a rendering engine for WebGL, but met a serious performance issue caused by Scavenge allocation failure issue.


"allocation failure" means that V8 needs to do a garbage collection (Scavenge in this case) as the young generation is out of allocatable memory. This usually happens when JavaScript allocates objects.
 
I have a test case which add more and more sprites on the screen, when it's arriving at about 7000 (and stop adding more), the cost of engine is about 6ms per frame, but it grows to 20 ms in seconds. When I add --trace_gc to Chrome, I see the behavior is totally related to a huge amount of Scavenge allocation failure log, like the following


[33525:0x7fa78280fc00]   955497 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955580 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955663 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955747 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 


Taking those samples I see that a Scavenge happens roughly every ~83ms. The time the Scavenge takes is 0.3ms. From a garbage collection perspective the utilization of anything else than GC is 99.6% which is pretty good.
 
All Scavenges happen regularly indicating that JavaScript is allocating at a constant rate. Also, they all take up 0.3ms which is probably as fast as it can get.

[33525:0x7fa78280fc00]   955830 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955913 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   955998 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956084 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956170 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956254 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956335 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956418 ms: Scavenge 37.0 (46.5) -> 35.1 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956502 ms: Scavenge 37.0 (46.5) -> 35.2 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956584 ms: Scavenge 37.0 (46.5) -> 35.2 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956667 ms: Scavenge 37.0 (46.5) -> 35.2 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

[33525:0x7fa78280fc00]   956752 ms: Scavenge 37.0 (46.5) -> 35.2 (46.5) MB, 0.3 / 0.0 ms  allocation failure 

... (it keeps printing very fast)


However, when I do a allocation profiling, it shows that the JS heap size is quite low (36mb), and I generate nearly nothing in heap



Some additional information, we tried very hard to reduce memory allocation in the engine core, so we used a lot of object pools and pre-generated objects, including some large array buffers, largest one is 20000 * 6 floats, and we keep recycling / updating these objects.


I also tried to check jit status, no hot functions is being deoptimized, the scavenge log is the only thing related to the strange performance drop down. 


Does anyone have a clue on my problem ? Thanks a lot for helping


Huabin


--
--
v8-users mailing list
v8-u...@googlegroups.com
http://groups.google.com/group/v8-users
---
You received this message because you are subscribed to the Google Groups "v8-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-users+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Huabin Ling

unread,
Dec 5, 2017, 9:55:57 PM12/5/17
to v8-users
Thanks for the informations Michael

I attached a saved trace record here, and I notice that in each frame there is a relatively long "CommandBufferProxyImpl::WaitForToken", which doesn't happen when the sprite number is low. You are probably right that scavenge process may be not related.


But I'm a little confused, trace tool doesn't seem to capture any CPU level usage by my program, so I'm not sure how the renderer waitForToken is related to my logic code.
trace_cocoscreator_new_renderer.json.gz

Ben Noordhuis

unread,
Dec 5, 2017, 10:05:39 PM12/5/17
to v8-users
On Wed, Dec 6, 2017 at 3:55 AM, Huabin Ling <huabi...@cocos2d-x.org> wrote:
> Thanks for the informations Michael
>
> I attached a saved trace record here, and I notice that in each frame there is a relatively long "CommandBufferProxyImpl::WaitForToken", which doesn't happen when the sprite number is low. You are probably right that scavenge process may be not related.
>
>
> But I'm a little confused, trace tool doesn't seem to capture any CPU level usage by my program, so I'm not sure how the renderer waitForToken is related to my logic code.

https://cs.chromium.org/chromium/src/gpu/ipc/client/command_buffer_proxy_impl.cc?type=cs&q=CommandBufferProxyImpl::WaitForToken&l=319

Basically, it looks like your code is waiting for the GPU.

Huabin Ling

unread,
Dec 8, 2017, 1:19:55 AM12/8/17
to v8-users
Thanks guys, appreciate your help.

So to conclude the problem, I finally found the stupid reason, actually it was because somebody changed the webgl context configuration by adding "antialias" as true, which caused the pixel fill rate too high and hang up the GPU rendering. As we are also enabling retina screen adaptation on Mac, the original canvas is kind of scaled up to have 4 * 4 times pixels, which is huge.

Still an odd behavior during the investigation, I upgraded Chrome from 62.0.3202.94 to 63.0.3239.84, and the performance is even worse, seems like Chrome changed the antialias behavior.

Now the problem is solved, just want to conclude that I was miss leading by two informations,

1. Scavenge allocation failure log
2. GPU ClientWait in Chrome tracing, this makes me thought the GPU was actually free and waiting for the previous frame

Again, thanks for all the help
Huabin
Reply all
Reply to author
Forward
0 new messages