Hello v8 folks,

I filed this Chromium issue (https://code.google.com/p/chromium/issues/detail?id=557466) a couple of days ago, but I attended the Chrome Dev Summit in Mountain View yesterday and was directed to this list for V8-related internal issues. So here I am!

I've been poking around at the background script streaming thread logic for the past week or so to see if I could increase the amount of async script parsing that we do off the main thread when loading facebook.com (disclaimer: I work for Facebook). Facebook sends a pretty sizable pile of JS down to the browser on page load, most of which is not very hot (e.g. in Safari, ~87% of the ~6500 executed functions never make it past the bytecode interpreter). When running Instruments on a trunk release build of Chromium, I see ~20% of overall execution time spent parsing JS and generating code on the main thread, while I see < 1% CPU time spent on the script streaming thread. According to some console-fu, ~70% of the script tags in the document on home.php are marked with the async attribute, so given how much time we spend parsing JS on the main thread, I figured there should also be plenty of work for the script streaming thread.

I think I've improved utilization of the script streaming thread slightly with a few tweaks, including:

(1) reducing the small script threshold from 30 KB, which was preventing the background thread from even attempting to parse many scripts.
(2) forcing the script streaming thread to perform eager rather than lazy parsing in hopes of reducing the total amount of parsing happening on the main thread.
(3) allowing the main thread to retry posting parsing tasks to the background thread at a later time if the background thread is currently busy.
These tweaks may have moved the needle slightly (I'm not 100% convinced since I haven't done any rigorous measurements), but I'm still seeing a suspiciously high amount of CPU time spent on the main thread parsing JS, along with a suspiciously low amount of parsing activity on the background thread, when looking at both Instruments profiles and Chrome traces recorded with chrome://tracing. In a simplified test page that tries loading multiple async scripts concurrently, I noticed that even when the script streaming thread parses a script, it then sends a task back to the main thread, which proceeds to do a bunch more lazy parsing! I realize that the background thread isn't allowed to generate code, so it must hand the script back to the main thread, but this additional parsing time surprised me.
I tried adding some more tracing events to get a better picture of what's going on at various times. For example, it seems like we receive JS scripts on the main thread in clusters rather than evenly spaced, and because the background thread synchronously parses a single JS file at a time, the other scripts have to wait their turn. A pool of worker threads might help some, but probably won't be sufficient on devices with low core counts. It might be better if we could incrementally parse JS and multiplex multiple JS files on one or two parsing threads. Incremental parsing sounds like a big, potentially invasive effort, though.
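To make the multiplexing idea concrete, here's a toy model of interleaving several incremental parse jobs on a single worker. This is purely illustrative: the generator-based parseJob helper, the chunk sizes, and the file names are all made up, and the real parsing would of course happen in C++ inside V8.

```javascript
// Toy model of multiplexed incremental parsing: each job parses a
// fixed-size chunk, then yields, so a newly arrived script doesn't
// have to wait for another script's full parse to complete.
function* parseJob(name, size, chunk) {
  for (let done = 0; done < size; done += chunk) {
    yield `${name}: parsed ${Math.min(done + chunk, size)}/${size}`;
  }
}

const order = [];
const jobs = [parseJob('a.js', 3, 1), parseJob('b.js', 2, 1)];
while (jobs.length) {
  const job = jobs.shift();  // round-robin over pending jobs
  const step = job.next();
  if (!step.done) {
    order.push(step.value);
    jobs.push(job);          // re-queue until the job is finished
  }
}
console.log(order.join('\n'));
```

With a single blocking parse per script, b.js would only start after all of a.js; here the two scripts make progress in alternation.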
I realize that forcing more parsing onto the background thread could cause adverse side effects related to page interactivity, but I'd like to be able to experiment with different amounts of parsing on either the main thread or the background thread. I was wondering, given the steps I've taken, if I've missed or misunderstood anything. I think off-main-thread parsing is a great idea, so any help debugging/understanding/improving the script streaming thread would be very much appreciated :-) And while the script streaming stuff was just the first weird thing I stumbled across, given the importance of parsing in general on complex, JS-heavy sites like Facebook, any additional info related to how V8 does parsing and how sites might be able to take advantage of that would be very helpful.
> (1) reducing the small script threshold from 30 KB, which was preventing the background thread from even attempting to parse many scripts.

We should try this. Which threshold did you choose?
> (2) forcing the script streaming thread to perform eager rather than lazy parsing in hopes of reducing the total amount of parsing happening on the main thread.

I expect this to be _very_ bad for memory consumption. The V8 AST - the result of a parse - is unfortunately rather large, so not parsing everything upfront conserves a lot of memory. I'm skeptical this change would be beneficial in general, and I think it would require some rather careful benchmarking across different websites and device types.

That said, I've recently run across several situations - usually large, framework-y web apps - where excessive lazy- and then re-parsing was a problem. I'm quite certain our lazy-parse heuristic could be improved; I don't really have a plan for how. If you have any suggestions, I'd be eager to listen.

Generally speaking, the heuristic tries to "guess" whether a given function will be called during the initial script evaluation. It gets called a lot and hence needs to be fast, and since it needs to decide before the function is even parsed, it only has the function header to go on.
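For what it's worth, one commonly cited input to this kind of header-based heuristic - treat the exact rules as an assumption on my part, since they vary by V8 version - is whether the function is preceded by a parenthesis, i.e. looks like an IIFE:

```javascript
// Hedged example: a "(" immediately before "function" is the classic
// IIFE pattern, so the engine can take it as a hint that the function
// will run right away and parse it eagerly.

// Likely lazy-parsed: the body is skipped on the first pass and only
// fully parsed when the function is first called.
function probablyLazy(x) {
  return x * 2;
}

// Likely eager-parsed: the leading parenthesis marks it as
// immediately invoked.
var result = (function (x) {
  return x * 2;
})(21);

console.log(probablyLazy(21), result); // 42 42
```

If the heuristic guesses wrong in either direction, you pay: eager-parsing a never-called function wastes time and AST memory, while lazy-parsing a function that runs during initial evaluation forces a synchronous re-parse on the main thread.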
> (3) allowing the main thread to retry posting parsing tasks to the background thread at a later time if the background thread is currently busy.

This might make sense, but I guess it's difficult to do, since there isn't really a good point at which that should happen. See below on task scheduling, though.

> These tweaks may have moved the needle slightly (I'm not 100% convinced since I haven't done any rigorous measurements), but I'm still seeing a suspiciously high amount of CPU time spent on the main thread parsing JS along with a suspiciously low amount of parsing activity on the background thread when looking at both Instruments profiles and Chrome traces recorded with chrome://tracing. In a simplified test page that tries loading multiple async scripts concurrently, I noticed that even when the script streaming thread parses a script, it then sends a task back to the main thread which then proceeds to do a bunch more lazy parsing! I realize that the background thread isn't allowed to generate code so it must give it back to the main thread, but this additional parsing time surprised me.

Not sure what you're describing here. There are two things:

- There's a 'finishing' step to background parsing that needs to happen on the main thread. Basically, background parsing can't modify any global state while the main thread might mutate it, so the background parser builds up a separate data structure, which the main thread then needs to patch into the heap. This happens for every background parse, and I don't see a chance to avoid it without a major rewrite.
- The background parse thread uses the same lazy/eager parse heuristic as the main thread. If it guesses wrongly, the main thread will have to do a lot of synchronous parsing while it executes.

You should be able to distinguish these two in chrome://tracing: the first happens before any execute (for that script), while the second happens during an execute.
> I tried adding some more tracing events to get a better picture of what's going on at various times. For example, it seems like we receive JS scripts on the main thread in clusters rather than evenly spaced, and because the background thread synchronously parses a single JS file at a time, the other scripts have to wait their turn. A pool of worker threads might help some, but probably won't be sufficient on devices with low core counts. It might be better if we could incrementally parse JS and multiplex multiple JS files on one or two parsing threads. Incremental parsing sounds like a big, potentially invasive effort, though.

Hmm. There are some changes to Chromium's task scheduling in progress that might help us here, and with (3) above. That work is still ongoing, though, and I'm not terribly familiar with it, so I'd need to find the right people to talk to.
> I realize that forcing more parsing onto the background thread could cause adverse side effects related to page interactivity, but I'd like to be able to experiment with different amounts of parsing on either the main thread or the background thread. I was wondering, given the steps I've taken, if I've missed or misunderstood anything. I think off-main-thread parsing is a great idea, so any help debugging/understanding/improving the script streaming thread would be very much appreciated :-) And while the script streaming stuff was just the first weird thing I stumbled across, given the importance of parsing in general on complex, JS-heavy sites like Facebook, any additional info related to how V8 does parsing and how sites might be able to take advantage of that would be very helpful.

Generally, we might look at (code) caching. The code cache - if successful - skips the parse entirely. How effective that is of course depends on how often you update how much of your code, and possibly on exactly how you do it.
Another, more complex option (where I'm not sure about the details) might be Service Workers, since they allow a site to be explicit about what it wants cached, etc. This is not merely an implementation detail, though, and may require some greater rework on your side. Honestly, I'm not sure what the current status of Service Workers is.
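For intuition, the cache-first flow a Service Worker enables might look like the toy model below. This is illustrative only: the Map, install(), and handleFetch() names stand in for the real 'install'/'fetch' events and the Cache Storage API, and the script URLs are placeholders.

```javascript
// Model of a Service Worker's cache-first strategy for script requests.
// The Map stands in for the Cache Storage API; install()/handleFetch()
// stand in for the 'install' and 'fetch' event handlers. All names and
// URLs here are illustrative, not real APIs.
const scriptCache = new Map(); // URL -> cached response body

// "install" phase: precache the script bundles the site declares up front.
function install(fetchFromNetwork) {
  for (const url of ['/js/vendor.js', '/js/app.js']) {
    scriptCache.set(url, fetchFromNetwork(url));
  }
}

// "fetch" phase: serve from the cache when possible, else go to the network.
function handleFetch(url, fetchFromNetwork) {
  return scriptCache.has(url) ? scriptCache.get(url) : fetchFromNetwork(url);
}

// Simulated network that counts round trips.
let networkHits = 0;
const network = (url) => { networkHits += 1; return `bytes of ${url}`; };

install(network);
handleFetch('/js/app.js', network); // cache hit: no extra round trip
console.log(networkHits); // 2 (only the install-time fetches)
```

A real worker would be registered with navigator.serviceWorker.register() and would use caches.open() and cache.addAll() in its install handler; the sketch only shows the control flow.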
A silly question: How do we reproduce this? https://facebook.com, and I suspect I need to be logged in? Anything else?