Breakpoints trigger repeated full-parse of the debugged script

Gunes Acar

Oct 5, 2020, 9:15:53 PM

to v8-dev

Hi all,

For an academic research project, we use the Chrome DevTools Protocol to set breakpoints at function returns and collect function call metadata (stack frames, arguments, timestamps) using a condition script(*).

Although the condition script always returns `false` to avoid pausing the debugger, we still observe a significant overhead per "hit" breakpoint.
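For readers unfamiliar with the setup, here is a minimal sketch of what such a request could look like on the wire. `Debugger.setBreakpoint` and its `location`/`condition` parameters are part of the Chrome DevTools Protocol; the `recordCall` helper, the script id, and the positions are assumptions made up for illustration, not the condition script used in the project.

```python
import json


def conditional_breakpoint_message(msg_id, script_id, line_number,
                                   column_number=0, collector="recordCall"):
    """Build a CDP Debugger.setBreakpoint request whose condition never pauses."""
    # `recordCall` is a hypothetical page-side helper that would collect the
    # metadata. The comma operator makes the whole condition evaluate to
    # false, so the debugger does not pause.
    condition = f"{collector}(), false"
    return {
        "id": msg_id,
        "method": "Debugger.setBreakpoint",
        "params": {
            "location": {
                "scriptId": script_id,
                "lineNumber": line_number,
                "columnNumber": column_number,
            },
            "condition": condition,
        },
    }


# Script id and position are placeholder values.
message = conditional_breakpoint_message(1, "42", 10, 4)
print(json.dumps(message, indent=2))
```

The resulting dictionary would be serialized and sent over the DevTools WebSocket (or passed to a client library's send call) once per return position.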

To identify the root cause, we set up tracing and found that V8.ParseProgram is called for each "hit" breakpoint. We then logged function events using "--js-flags=--log-function-events", which showed that all functions in the debugged script are fully parsed (i.e., a `function,full-parse,...` event is logged) for each "hit" breakpoint.
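Counting these events can be sketched roughly as follows. Only the `function,full-parse,` prefix comes from the logs described above; the remaining fields in the sample lines and the second event name are placeholders, and the real log has more columns.

```python
from collections import Counter


def count_function_events(log_lines):
    """Count log lines of the form `function,<event>,...` per event name."""
    counts = Counter()
    for line in log_lines:
        fields = line.strip().split(",")
        if len(fields) >= 2 and fields[0] == "function":
            counts[fields[1]] += 1
    return counts


# Made-up sample: the trailing fields are placeholders, not real log output.
sample_log = [
    "function,full-parse,<remaining fields>",
    "function,some-other-event,<remaining fields>",
    "function,full-parse,<remaining fields>",
    "unrelated,line",
]

counts = count_function_events(sample_log)
print(counts["full-parse"])
```

To run this against a real log, pass an open file object instead of `sample_log`, since the function only iterates over lines.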

For example, assume the debugged script contains 10K function definitions, of which only 100 are called. We observe ~10K*100=~1M function (full-)parse events in the logs, which slows down the process and makes it infeasible on heavy websites.

1) Is it possible to prevent repeated parsing of the debugged script for each evaluated breakpoint?

2) Could there be an easier and faster way of logging all function calls with the associated stack frames and arguments? Pointers to the relevant code locations would be much appreciated if modifying Chrome/V8 source code is the only way to do that.

The trace and test files can be found in this Gist.

Thanks so much and stay safe,
Gunes

PS: None of our breakpoints were actually hit, since the condition script always returns false. Also, the debugged scripts were not modified during the execution through `Debugger.setScriptSource` or by any other means.

*: Idea from DuckDuckGo’s Tracker Radar Collector

Gunes Acar

Oct 5, 2020, 9:23:08 PM

to v8-dev

The missing screenshot of the trace can be seen on GitHub: https://gist.github.com/gunesacar/b8f82266e93fbfda2bf06c573af3caea#gistcomment-3478635

Leszek Swirski

Oct 6, 2020, 4:08:15 AM

to v8-dev

This might be lazy source positions forcing a reparse. Have you tried running with --no-enable-lazy-source-positions?

--
--
v8-dev mailing list
v8-...@googlegroups.com
http://groups.google.com/group/v8-dev
---
You received this message because you are subscribed to the Google Groups "v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-dev+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/f3242ed8-a082-4b68-868b-1a18f9d98323n%40googlegroups.com.

Dan Elphick

Oct 6, 2020, 4:23:00 AM

to v8-...@googlegroups.com

If it is down to lazy source positions, then what should happen is that it reparses each function at the point where you set the breakpoint, as otherwise it wouldn't know where to set it. In that case you would get 1000 function reparses. It shouldn't need to reparse every function each time, because a) it only needs the source positions for the function it's setting the breakpoint on and b) it doesn't need to parse the entire program to do that.

As Leszek says, disabling the feature via the flag would tell you if it's responsible.

The code in your gist is just the code to be debugged, not the code that sets the breakpoints. A repro with just 10 functions should be sufficient to show that it's doing 100 reparses, and we can go from there.

Thanks,

Dan

Gunes Acar

Oct 6, 2020, 7:27:25 AM

to v8-...@googlegroups.com

Thank you, Leszek and Dan for the suggestions.

I tried passing --no-enable-lazy-source-positions, but I still get full script parses for each hit breakpoint.

I put together the small repro that Dan suggested. It uses Puppeteer, although passing `headless=false` didn't change things either.

The script will print instructions to count the number of full parses in the logs. Let me know if I can make it easier for you to reproduce the issue.

Thanks again,

Gunes

Seth Brenith

Oct 6, 2020, 8:13:36 AM

to Gunes Acar, v8-...@googlegroups.com

I haven't looked into the details, but it sort of makes sense that evaluating a breakpoint condition would require reparsing. Conditional breakpoints are basically a local eval(), which can bind to variables from any enclosing scope by name. But if those scopes didn't contain the word "eval" during the original parse, then they had no reason to keep full variable data around. There's no way to reparse every enclosing scope without at least lazy-parsing through the rest of the code in the file. And lazy-parsing the rest of the file could cause this exact problem, because presumably we don't save full local-eval data for the other functions which might also have conditional breakpoints.

Yang Guo

Oct 6, 2020, 8:20:49 AM

to v8-...@googlegroups.com, Simon Zünd, Gunes Acar

Yes, indeed. When evaluating a conditional breakpoint, we reconstruct the entire scope chain for the break position in order to evaluate the condition expression as if it was inserted at the break location. Some of this scope information can no longer be reconstructed without parsing, so we indeed need to parse. We enter the parser from here.

It is possible to avoid repeated parsing if we cache the scope information for debugging. We have not implemented that because usually, hitting breakpoints and resuming is an interactive process so that reasonably slow performance is not a deal breaker.

+Simon Zünd probably can point to a tracking issue for this.

Cheers,

Yang

Jakob Kummerow

Oct 6, 2020, 9:32:41 AM

to v8-dev, Simon Zünd, Gunes Acar

You can get some function call tracing data by using the --trace flag. No breakpoints or parsing required -- but you also don't get to choose/configure the data that's produced.
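As a rough sketch, V8 flags such as `--trace` reach Chrome through its `--js-flags` switch, as with the `--log-function-events` flag earlier in the thread. The helper below only assembles that argument; it assumes that several V8 flags are joined with spaces inside a single `--js-flags` value, and the helper name is made up for illustration.

```python
def js_flags_argument(v8_flags):
    """Join V8 flags into a single Chrome --js-flags command-line argument."""
    # Assumption: multiple V8 flags are space-separated within one value.
    return "--js-flags=" + " ".join(v8_flags)


trace_arg = js_flags_argument(["--trace"])
combined_arg = js_flags_argument(["--trace", "--log-function-events"])
print(trace_arg)
print(combined_arg)
```

The resulting string would be passed as one entry of the browser's argument list (for example, the launch arguments of an automation tool), so no extra shell quoting is needed.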

Simon Zünd

Oct 7, 2020, 3:04:16 AM

to Jakob Kummerow, v8-dev, Gunes Acar

We actually discussed this on this bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1072939

In essence, for local debug-evaluate there is not really an "anchor" where we can put a scope information cache. For conditional breakpoints it's easier, we use the breakpoint itself. So ideally we only re-parse the first time a conditional breakpoint is hit, cache the parsed scope information on the breakpoint and then re-use it on future breakpoint hits to evaluate the break condition. Given that this is not an issue for most users of conditional breakpoints, implementing the fix has (unfortunately) lower priority.
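The effect of caching on the breakpoint can be illustrated with a toy model. This is not V8 code: the class names, the stand-in "scope info" value, and the assumption that every re-parse touches every function in the script are simplifications for illustration only.

```python
class ToyScript:
    """Stand-in for a script; counts how many functions get re-parsed."""

    def __init__(self, function_count):
        self.function_count = function_count
        self.functions_parsed = 0

    def reparse(self):
        # Simplification: one re-parse touches every function in the script.
        self.functions_parsed += self.function_count
        return object()  # placeholder for the reconstructed scope information


class ToyBreakpoint:
    """Conditional breakpoint that may keep scope information between hits."""

    def __init__(self, script, use_cache):
        self.script = script
        self.use_cache = use_cache
        self.scope_info = None

    def hit(self):
        # Re-parse on every hit, or only on the first hit when caching.
        if not self.use_cache or self.scope_info is None:
            self.scope_info = self.script.reparse()
        return False  # the condition always evaluates to false


def simulate(function_count, breakpoints, hits_per_breakpoint, use_cache):
    script = ToyScript(function_count)
    for bp in [ToyBreakpoint(script, use_cache) for _ in range(breakpoints)]:
        for _ in range(hits_per_breakpoint):
            bp.hit()
    return script.functions_parsed


uncached = simulate(10, 10, 5, use_cache=False)
cached = simulate(10, 10, 5, use_cache=True)
print(uncached, cached)
```

In this model the cache removes the cost of repeated hits but not the first hit of each breakpoint: `simulate(10_000, 100, 1, use_cache=True)` still counts 1,000,000 parsed functions, which matches the ~1M figure from the original post.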

Gunes Acar

Oct 7, 2020, 1:27:21 PM

to Simon Zünd, Jakob Kummerow, v8-dev

Thanks, Simon and others for the helpful comments.

Even caching the result of the scope analysis (#1072939) will likely be slow for our use case, if the first hit of each breakpoint still requires re-parsing the whole script. I think we'll be looking for alternative methods at this point (e.g., adding our TRACE calls).

In the meantime, I measured the time reduction if Chrome were to re-parse functions (not whole scripts) in TryParseAndRetrieveScopes: it reduced the time for each DebugBreakOnBytecode from 12-13 ms to 0.25 ms for our test case. (Admittedly, these figures depend on script and function length, and not reparsing the whole script may yield incorrect evaluations.)
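To put those per-hit figures in perspective, here is a small worked calculation. The per-hit times are the ones measured above; the number of hits (100,000) is a hypothetical workload chosen only for illustration.

```python
def per_hit_speedup(before_ms, after_ms):
    """Ratio between the per-hit cost before and after the change."""
    return before_ms / after_ms


def total_seconds(hits, per_hit_ms):
    """Total time spent in breakpoint evaluation for a number of hits."""
    return hits * per_hit_ms / 1000


speedup_low = per_hit_speedup(12.0, 0.25)
speedup_high = per_hit_speedup(13.0, 0.25)

hypothetical_hits = 100_000  # assumed workload, not a measured value
before_s = total_seconds(hypothetical_hits, 12.5)  # midpoint of 12-13 ms
after_s = total_seconds(hypothetical_hits, 0.25)

print(speedup_low, speedup_high)  # 48.0 52.0
print(before_s, after_s)          # 1250.0 25.0
```

So the per-hit cost drops by roughly 48x to 52x, and for the assumed workload the total falls from about 21 minutes to 25 seconds.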