Yao, PTAL for a high-level pass. I will be out for a bit and even though this CL isn't perfect yet it'd be good to have you take a pass. Thanks!
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
probe::AsyncTaskContext* async_task_context() { return &async_task_context_; }Blink Style Guide: Naming - Use 'CamelCase' for all function names. Please rename `async_task_context` to `AsyncTaskContext`.
To keep this interaction as brief and non-intrusive as possible, please consider responding with one of following options:
**Done** | **OK But Won't Fix**: reason | **Later**: b/<bug_id> | **Invalid:** reason
_This comment was generated by [Experimental Blink C++ Code Review Agent](http://go/blink-c++-code-review-agent)._
_AI reviews can sometimes be inaccurate; We appreciate your 🙏 feedback 🙏 to help us improve._
_[File a bug](http://go/blink-c++-code-review-agent-feedback) | [Provide feedback on chat](https://chat.google.com/room/AAQA0zhQHe0?cls=4) | [Opt-out](https://ganpati2.corp.google.com/group/peep-genai-blink-agent-optout.prod)_
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
The overall approach looks good. Added some comments.
class JSEventHandlerForContentAttribute final : public JSEventHandler {`JSEventHandlerForContentAttribute` is a GarbageCollected object, but `probe::AsyncTaskContext` has a non-trivial destructor. Should you use a pre-finalizer to explicitly cancel the task before the object is swept? e.g.,
```
class JSEventHandlerForContentAttribute : public JSEventHandler {
USING_PRE_FINALIZER(JSEventHandlerForContentAttribute, Dispose);
public:
void Dispose() { async_task_context_.reset(); }
private:
std::optional<probe::AsyncTaskContext> async_task_context_;
}
```
if (LocalDOMWindow* dom_window =
DynamicTo<LocalDOMWindow>(execution_context_of_event_target)) {
if (LocalFrame* frame = dom_window->GetFrame()) {
if (ScriptInitiationMonitor* monitor =
frame->GetScriptInitiationMonitor()) {
monitor->DidRegisterDynamicScript(
v8_context_of_event_target,
V8ScriptId(compiled_function->ScriptId()));
}
}
}You can use the new static helper:
```
if (auto* monitor = ScriptInitiationMonitor::FromExecutionContext(execution_context_of_event_target)) {
monitor->DidRegisterDynamicScript(v8_context_of_event_target, V8ScriptId(compiled_function->ScriptId()));
}
```
void AdTracker::DidRegisterDynamicScript(v8::Local<v8::Context> v8_context,Because `JSEventHandlerForContentAttribute` now checks ad-status via `DidRegisterDynamicScript` at execution time rather than creation time, this introduces a subtle regression: If a vanilla script synchronously triggers the ad-created event handler (e.g., document.getElementById('btn').click()), `IsAdScriptInStack` evaluates the synchronous V8 stack, sees the vanilla script, ignores the `async_script_stack_`, and the original ad-provenance is lost.
I've attached an example regression test below that demonstrates this issue. It passes previously but fails with your patch.
To restore the previous behavior, I think you can check `async_script_stack_` before checking `IsAdScriptInStack`. Or, if you decide this new behavior is more desirable (I'm not entirely sure yet), I suggest splitting it into a separate CL so this current CL remains a strict refactoring.
Either way, adding the regression test seems valuable.
Example regression test:
```
TEST_F(AdTrackerSimTest, AdEventHandlerTriggeredByNonAdScript) {
SimRequest ad_script_resource("https://example.com/ad_script.js", "application/javascript");
SimRequest vanilla_script_resource("https://example.com/script.js", "application/javascript");
SimRequest image_resource("https://example.com/foo.png", "image/png");
main_resource_->Complete(R"HTML(
<body>
<script src="https://example.com/ad_script.js"></script>
<script src="https://example.com/script.js"></script>
</body>
)HTML");
// Ad script sets an inline event handler on a dynamically created element.
ad_script_resource.Complete(R"JS(
let btn = document.createElement("button");
btn.id = "target";
btn.setAttribute("onclick", "let img = document.createElement('img'); img.src = 'foo.png';");
document.body.appendChild(btn);
)JS");
// Non-ad script triggers the event handler synchronously.
vanilla_script_resource.Complete(R"JS(
document.getElementById('target').click();
)JS");
base::RunLoop().RunUntilIdle();
image_resource.Complete("data");// Verify that the subsequent image request is correctly tagged as an ad.
EXPECT_TRUE(ad_tracker_->RequestWithUrlTaggedAsAd("https://example.com/foo.png"));
}
```
Vector<v8::StackTrace::ScriptData> cached_stack_;Add inline capacity to avoid heap allocations on V8 hot paths? e.g.,
```
Vector<v8::StackTrace::ScriptData, 5> cached_stack_;
```
if (cached_stack_.size() >= limit) {
return base::span(cached_stack_).first(limit);
}If the real stack is shallower than the requested limit, then you may still do unnecessary V8 stack walks. Suggest adding a `bool fully_captured_ = false;` member (and move it in the move constructor). The logic here can then be:
```
if (cached_stack_.size() >= limit || fully_captured_) {
return base::span(cached_stack_).first(std::min(static_cast<wtf_size_t>(limit), cached_stack_.size()));
}
// ... walk stack ...
if (stack.size() < limit) {
cached_stack_.Shrink(static_cast<wtf_size_t>(stack.size()));
fully_captured_ = true; // Prevents future redundant walks
}
```lazy_stack_trace_test.cc should also be updated to expect `EXPECT_EQ(stack3.data(), stack1.data());`.
virtual void DidRegisterDynamicScript(v8::Local<v8::Context> v8_context,Should this take a `LazyStackTrace& stack_trace` parameter as well?
KURL url(probe.script_url);Here you construct a `KURL` from a `WTF::String`, but inside `AdTracker::WillExecuteScript`, you turn it back into a `WTF::String`.
Can we change the `Observer::WillExecuteScript` signature to take just `const String& script_url`?
v8::HandleScope handle_scope(isolate);This shouldn't be needed. Since `probe.function` is a `v8::Local<v8::Function>`, a handle scope must already be active on the call stack.
v8::HandleScope handle_scope(isolate);Same as above: this shouldn't be needed.
std::optional<AdProvenance> FrameFetchContext::CalculateIfAdSubresource(Should we consider using more generic names here, such as `CalculateResourceAnnotations`, `scan_javascript_stack`, and a struct for the return type:
```
struct ResourceAnnotations {
std::optional<AdProvenance> ad_provenance;
// Future mutators can add their own return states here.
};
```
?
(Though keeping the return type as `std::optional<AdProvenance>` also seems fine for now).
This can be done in a future CL to keep the current scope small.
WDYT?
ScriptInitiationMonitor* monitor = GetFrame()->GetScriptInitiationMonitor();
if (monitor) {nit: one-liner `if` initialization
```
if (ScriptInitiationMonitor* monitor = GetFrame()->GetScriptInitiationMonitor())
```
return GetFrame()->GetAdTracker()->CalculateIfAdSubresource(Relying on `PrepareRequest` and `CalculateIfAdSubresource` being called back-to-back seems fragile.
How about making this stateless by instantiating the `LazyStackTrace` directly in `FrameFetchContext`, and pass it to both `monitor->PrepareRequest` and `ad_tracker->CalculateIfAdSubresource`? With this approach, `AdTracker::WillPrepareRequest` can just be a no-op.
Even though this exposes `LazyStackTrace` slightly higher up the stack, it seems more robust.
ScanForAds scan_for_ads) {Should we generalize this struct name as well (in a future CL)?
Persistent<ScriptInitiationMonitor> script_initiation_tracker_;nit: `script_initiation_monitor_` to match the class name.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |