Re: Calling Stop() from WebContentsObserver

8 views
Skip to first unread message

Alexander Timin

unread,
Apr 8, 2020, 12:55:20 PM4/8/20
to Scott Violet, navigat...@chromium.org, John Abd-El-Malek, content-team, Alex Moshchuk, fer...@chromium.org
+naviga...@chromium.org 

If I had to guess, I would say that WebView might have the same problem as well.
For example, we have been chasing some mysterious WebView-specific navigation bugs and one of the theories is that something strange happens in one of the "process has crashed" callbacks. 
It might be a confirmation bias, but given that we synchronously dispatch callback to WebView, I guess that reentrancy might be a problem worth solving in content/ layer.


I like the idea behind this patch, but I think we would be able to make this cleaner by having an explicit scope:

void WebContentsImpl::DidStartNavigation() {
  DispatchingCallbacksScope scope; // runs stored callbacks when destroyed.
  observers->DidFinishNavigation(); 
}

void WebContentsImpl::Stop() {
  if (DispatchingCallbacksScope::Get()) {
     DispatchingCallbacksScope::Defer(Bind(&Stop, GetWeakPtr()); // or PostTask
     return;
  }
  // ...
}
 
WDYT?


On Wed, 8 Apr 2020 at 05:11, Scott Violet <s...@google.com> wrote:


On Tue, Apr 7, 2020 at 4:58 PM John Abd-El-Malek <j...@google.com> wrote:


On Tue, Apr 7, 2020 at 4:43 PM Scott Violet <s...@google.com> wrote:
Hello,

I'm looking for some advice as to whether we want to fix a bug in the content side, or work around it in the embedder. The specific case is calling WebContents::Stop() from WebContentsObserver::DidStartNavigation/DidRedirectNavigation. Doing that triggers crashes. Do we think this is a bug worth a fix, or should it be explicitly disallowed via DCHECKs and what not?

Getting a clean fix for this is mildly challenging and/or comes with a cost. My first attempt was adding a number of early outs. In particular, WebContents would early out during calling to observers if the navigation was deleted. Alex was concerned about the state of the stack when this happens,

To make it clear to everyone: the issue is that DidFinishNavigation would be fired while DidStartNavigation is below in the stack. And it's probably against the current contract that some observers would then only see DidFinishNavigation without a DidStartNavigation.
 
and suggested we defer processing the Stop() until start/redirect has completed. Micro-tasks would be ideal in this scenario, but we lack that in the browser side. Patchset 9 allows processing Stop() after notifying observers. This is quite invasive.

My plan is to work around this in weblayer, but I'm happy to fix this in the content side if we can decide on the best approach.

We could use a NavigationThrottle for that in WebLayer, e.g. in the WCO::DiddStartNavigation if Stop() is called we can record this and then when the NavigationThrottle's start method is called later it can cancel the navigation. However for redirects Scott fond it's a different order: the throttle is called before the corresponding WCO method. So we'd have to either defer all redirects until we see what the embedder says when we call them in the WCO method, or we'd have to call the embedder in the throttle's redirect callback.

Here's the patch doing what John outlines: https://chromium-review.googlesource.com/c/chromium/src/+/2141232 . It's mildly tricky, but ensures Stop() is called during notifying observers.

  -Scott
 

Thanks,

  -Scott

dan...@chromium.org

unread,
Apr 8, 2020, 1:41:38 PM4/8/20
to Alexander Timin, Scott Violet, navigat...@chromium.org, John Abd-El-Malek, content-team, Alex Moshchuk, fer...@chromium.org
Do we only need to defer Stop()? In that case setting a stop_after_done_starting_ when inside_start_navigation_ could be sufficient?

DidStartNavigation() {
  {
    AutoReset<bool> r(&inside_start_navigation_, true)
    // Do start stuff.
  }
  if (stop_after_done_starting_)
    Stop();
}

Stop() {
  if (inside_start_navigation_) {
    stop_after_done_starting_ = true;
    return;
  }
  // Do stop stuff.
}


--
You received this message because you are subscribed to the Google Groups "navigation-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to navigation-de...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/navigation-dev/CALHg4nnDsU14jr0OwJdyPfj-Pab6fBkYKkt%2BVtO%2BmXbmB0i9Nw%40mail.gmail.com.

Nasko Oskov

unread,
Apr 8, 2020, 3:19:58 PM4/8/20
to Dana Jansens, Alexander Timin, Scott Violet, navigation-dev, John Abd-El-Malek, content-team, Alex Moshchuk, Fergal Daly
In general, WebContentsObserver is an interface for observing events and observers should not be making changes directly as part of the observation. This is especially true for navigations, since the navigation code is not reentrancy safe. While the proposed patches seem fine in pure code perspective, I don't think that Stop() is the only method that will need to be accounted for and making the navigation system reentrant safe will be non-trivial change.

Could we design the API to make these sorts of calling patterns harder to do or discouraged?

John Abd-El-Malek

unread,
Apr 8, 2020, 5:01:42 PM4/8/20
to Nasko Oskov, Dana Jansens, Alexander Timin, Scott Violet, navigation-dev, content-team, Alex Moshchuk, Fergal Daly
On Wed, Apr 8, 2020 at 12:19 PM Nasko Oskov <na...@chromium.org> wrote:
In general, WebContentsObserver is an interface for observing events and observers should not be making changes directly as part of the observation. This is especially true for navigations, since the navigation code is not reentrancy safe. While the proposed patches seem fine in pure code perspective, I don't think that Stop() is the only method that will need to be accounted for and making the navigation system reentrant safe will be non-trivial change.

Could we design the API to make these sorts of calling patterns harder to do or discouraged?

It's kind of impossible to do with the current API structure, short of rewriting all of the public API no? e.g. given that there are a large amount of methods on WebContents/RenderFrameHost that modify WebContent's state, and there are a lot of WCO callbacks.

The code snippet below is also what sky@ and I had discussed; agreed at this point we should just look at Stop. I think trying to expand this to all possibilities of bad calls is quite outside the scope, and is a large project on its own.

Scott Violet

unread,
Apr 8, 2020, 5:39:56 PM4/8/20
to Alexander Timin, navigat...@chromium.org, John Abd-El-Malek, content-team, Alex Moshchuk, fer...@chromium.org
On Wed, Apr 8, 2020 at 9:55 AM Alexander Timin <alt...@chromium.org> wrote:
+naviga...@chromium.org 

If I had to guess, I would say that WebView might have the same problem as well.
For example, we have been chasing some mysterious WebView-specific navigation bugs and one of the theories is that something strange happens in one of the "process has crashed" callbacks. 
It might be a confirmation bias, but given that we synchronously dispatch callback to WebView, I guess that reentrancy might be a problem worth solving in content/ layer.


I like the idea behind this patch, but I think we would be able to make this cleaner by having an explicit scope:

void WebContentsImpl::DidStartNavigation() {
  DispatchingCallbacksScope scope; // runs stored callbacks when destroyed.
  observers->DidFinishNavigation(); 
}

void WebContentsImpl::Stop() {
  if (DispatchingCallbacksScope::Get()) {
     DispatchingCallbacksScope::Defer(Bind(&Stop, GetWeakPtr()); // or PostTask
     return;
  }
  // ...
}
 
WDYT?

Good suggestion! I agree this is likely to make the patch less invasive.

  -Scott

Scott Violet

unread,
Apr 8, 2020, 5:45:33 PM4/8/20
to Nasko Oskov, Dana Jansens, Alexander Timin, navigation-dev, John Abd-El-Malek, content-team, Alex Moshchuk, Fergal Daly
On Wed, Apr 8, 2020 at 12:19 PM Nasko Oskov <na...@chromium.org> wrote:
In general, WebContentsObserver is an interface for observing events and observers should not be making changes directly as part of the observation.

I believe you're arguing that we should explicitly disallow this sort of thing with DCHECKs. Do I have that right?
 
This is especially true for navigations, since the navigation code is not reentrancy safe. While the proposed patches seem fine in pure code perspective, I don't think that Stop() is the only method that will need to be accounted for and making the navigation system reentrant safe will be non-trivial change.

You are likely right there are other cases that have similar problems.
 

Could we design the API to make these sorts of calling patterns harder to do or discouraged?

It's funny you should mention this. I've been discussing with John that we should strive to provide APIs that can't be used wrongly, or at least you have to go out of your way to use wrongly. I think you're arguing that here. That said, such an API is likely to be very difficult to retrofit at this time, and I'm honestly not even sure how that would be done here.

  -Scott

Scott Violet

unread,
Apr 9, 2020, 1:04:49 PM4/9/20
to Nasko Oskov, Dana Jansens, Alexander Timin, navigation-dev, John Abd-El-Malek, content-team, Alex Moshchuk, Fergal Daly
I could be wrong, but it seems Nasko (and maybe John) are of the opinion that embedders should prevent this scenario from happening. Where as Alex is inclined to fix it. Not sure if there are other opinions. Can we reach consensus on what we want to have happen here? I'm happy to work out a cleaner patch (Alex's suggestion is likely to improve things) if we decide we want to allow this in content.

  -Scott

Alexander Timin

unread,
Apr 9, 2020, 1:35:43 PM4/9/20
to Scott Violet, Nasko Oskov, Dana Jansens, navigation-dev, John Abd-El-Malek, content-team, Alex Moshchuk, Fergal Daly
I would be very happy if we can prohibit embedders from doing complex stuff synchronously inside content/ callbacks (if we do this, I think some checks inside content/ would come handy).
However, it does feel like quite a bit of work (especially given that it might impact Android WebView) and fixing it might be more realistic.

I don't have any additional insights, so I'm happy to agree with any decision more opinionated folks reach.  

John Abd-El-Malek

unread,
Apr 9, 2020, 1:42:05 PM4/9/20
to Alexander Timin, Scott Violet, Nasko Oskov, Dana Jansens, navigation-dev, content-team, Alex Moshchuk, Fergal Daly
Yeah I mostly agree. Trying to solve this one case in content might not be so worthwhile if there are dozens of other ones: e.g. if we really want to fix this we'd have to do it properly. Fixing it in WebLayer for now sounds good. This really would mirror Chrome not calling WC to cause nested notifications.

Nasko Oskov

unread,
Apr 9, 2020, 2:02:02 PM4/9/20
to John Abd-El-Malek, Alexander Timin, Scott Violet, Dana Jansens, navigation-dev, content-team, Alex Moshchuk, Fergal Daly
If we can make the WebLayer API more robust to this, hiding it in its implementation would be great. I'm not opposed to the proposed solution inside content, but I'm afraid of folks looking at the code and that reentrancy works for Stop and then expecting all other APIs to work that way.

Scott Violet

unread,
Apr 9, 2020, 4:02:10 PM4/9/20
to Nasko Oskov, John Abd-El-Malek, Alexander Timin, Dana Jansens, navigation-dev, content-team, Alex Moshchuk, Fergal Daly
I'll go with the fix in weblayer then. If no one is opposed to it, I can also add a DCHECK for the Stop() case. We can extend the DCHECK to other places if necessary.

  -Scott

John Abd-El-Malek

unread,
Apr 9, 2020, 4:38:50 PM4/9/20
to Scott Violet, Nasko Oskov, Alexander Timin, Dana Jansens, navigation-dev, content-team, Alex Moshchuk, Fergal Daly
Sounds good, thanks!

On Thu, Apr 9, 2020 at 1:02 PM Scott Violet <s...@google.com> wrote:
I'll go with the fix in weblayer then. If no one is opposed to it, I can also add a DCHECK for the Stop() case. We can extend the DCHECK to other places if necessary.

  -Scott

On Thu, Apr 9, 2020 at 11:02 AM Nasko Oskov <na...@chromium.org> wrote:
If we can make the WebLayer API more robust to this, hiding it in its implementation would be great. I'm not opposed to the proposed solution inside content, but I'm afraid of folks looking at the code and that reentrancy works for Stop and then expecting all other APIs to work that way.
+1 

Scott Violet

unread,
Apr 10, 2020, 5:05:43 PM4/10/20
to Nasko Oskov, John Abd-El-Malek, Alexander Timin, Dana Jansens, navigation-dev, content-team, Alex Moshchuk, Fergal Daly
Reply all
Reply to author
Forward
0 new messages