--
v8-dev mailing list
v8-...@googlegroups.com
http://groups.google.com/group/v8-dev
---
You received this message because you are subscribed to the Google Groups "v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-dev+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/2944463a-a964-40a9-91f4-fba7a74debe8n%40googlegroups.com.
Thanks for the explanation. The second screenshot is actually 1520ms~1560ms. The snapshots were taken from the first major GC while running webtooling, because I believe the first major GCs are relatively predictable and consistent (maybe I am wrong). I will check your suggestion later. Does the GC tracer already record marked bytes/ms, or do I need to try to implement that myself?
Test on an AWS server: Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz (16 cores), running webtooling with d8.

Here is the v8.gc tracing data. Raw tracing log files: https://drive.google.com/drive/folders/1NRQCjNQ5x9RDHA0AVvTOr8neuARyheq-?usp=sharing

For the concurrent marking speed, I simply sum up the KB and ms in the trace log and divide them (KB/ms). Per worker:
baseline: 640.5, 669.86, 684.72
15 workers: 508.91, 515.51, 503.51
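For reference, the summing I describe can be done with a small script over the trace lines. This is just a sketch: the "marked X KB in Y ms" pattern below is an assumed line shape, not the exact v8.gc trace format, so the regex would need adjusting for the real log files.

```python
import re

# Toy example lines in an assumed format; the real v8.gc trace output differs.
lines = [
    "ConcurrentMarking task marked 1200 KB in 2.0 ms",
    "ConcurrentMarking task marked 800 KB in 1.2 ms",
]

total_kb = 0.0
total_ms = 0.0
for line in lines:
    m = re.search(r"([\d.]+) KB in ([\d.]+) ms", line)
    if m:
        total_kb += float(m.group(1))
        total_ms += float(m.group(2))

# Aggregate throughput = sum(KB) / sum(ms), as in the numbers above.
print(f"{total_kb / total_ms:.2f} KB/ms")  # → 625.00 KB/ms
```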
Maybe it would be better to replace the fixed kMaxTasks with the core number, just like parallel compaction does?
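To illustrate the idea (this is my own sketch, not V8 code): derive the task count from the available cores instead of a fixed cap, leaving one core for the main thread. The value 7 for the fixed limit is an assumption here; check the V8 source for the actual kMaxTasks.

```python
import os

K_MAX_TASKS = 7  # assumed fixed limit for illustration; see V8 source for the real value

def concurrent_marking_tasks(scale_with_cores: bool) -> int:
    """Pick the number of concurrent marking workers."""
    cores = os.cpu_count() or 1
    if scale_with_cores:
        # Like parallel compaction: scale with cores, reserve one for the main thread.
        return max(1, cores - 1)
    return K_MAX_TASKS

print(concurrent_marking_tasks(True))
```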
> Can you relate the time spent more in concurrent marking vs the time saved on the main thread (incremental steps above)?

I am not 100% sure about that, but here is my understanding: when more workers are activated, the speed of each worker may decrease, but the total speed will increase. Rough calculation:
508.91 KB/ms/worker * 15 workers = 7633.65 KB/ms
669.86 KB/ms/worker * 7 workers = 4689.02 KB/ms
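The rough calculation above, in script form:

```python
per_worker_15 = 508.91  # KB/ms per worker with 15 workers
per_worker_7 = 669.86   # KB/ms per worker (baseline, 7 workers)

total_15 = per_worker_15 * 15
total_7 = per_worker_7 * 7

print(f"15 workers: {total_15:.2f} KB/ms")  # 7633.65
print(f" 7 workers: {total_7:.2f} KB/ms")   # 4689.02
```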
The concurrent marking tasks are scheduled when incremental marking starts. In each incremental marking step, the main thread needs to mark "bytes_to_process" heap objects. When the local worklist is empty, incremental marking completes and the major GC is invoked. That means the main thread's local worklist can only become empty once concurrent marking has finished most of the marking work; otherwise the main thread can always steal a segment from the global worklist. If concurrent marking has more workers and marks heap objects faster, incremental marking can also complete earlier. This may explain why both the occurrence and the duration of V8.GC_MC_INCREMENTAL decreased. Execution also benefits, since the write barrier introduced by incremental marking is active for a shorter time.

> What we can always do is exposing a flag though for users that don't care about this tradeoff.

I suggest exposing such a flag.
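A toy model of the mechanism (my own simplification, not V8 internals): the main thread takes fixed-size steps while concurrent workers drain the worklist between steps, so higher concurrent throughput means fewer incremental steps before the final pause. All numbers and the per-step interpretation of the KB/ms rates are assumptions for illustration.

```python
def incremental_steps(total_kb: int, concurrent_kb_per_interval: int, step_kb: int = 512) -> int:
    """Count main-thread incremental steps until the worklist drains.

    Between steps, concurrent workers mark `concurrent_kb_per_interval` KB;
    in each step, the main thread itself marks up to `step_kb` KB.
    """
    remaining = total_kb
    steps = 0
    while remaining > 0:
        steps += 1
        remaining -= step_kb + concurrent_kb_per_interval
    return steps

# Higher concurrent throughput -> fewer incremental steps on the main thread.
print(incremental_steps(100_000, 4689))  # baseline-like rate -> 20
print(incremental_steps(100_000, 7633))  # 15-worker-like rate -> 13
```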
I am not in a hurry about this; just sharing some findings and open questions with the community. Thanks!