Cancelling a pending navigation?

559 views
Skip to first unread message

Chris Thompson

unread,
Sep 14, 2021, 6:03:20 PM9/14/21
to navigat...@chromium.org, trusty-transport
As an improvement to Chrome's HTTPS-First Mode, we're wanting to add a "fast timeout" timer to cancel the upgraded navigation if it doesn't complete quickly (faster than the net stack timeout). Currently, there are APIs for stopping a navigation (i.e., WebContents::Stop()) but they fully stop the navigation and reset the NavigationRequest. This means that the navigation basically ceases to exist --- most notably in our case this means that none of the "navigation failed" machinery triggers, specifically any relevant NavigationThrottles.

To that end, I've been thinking about whether we could add two APIs:
  1. A (content-internal) NavigationRequest::CancelNavigation() which triggers the OnRequestFailed() logic with a caller-specified error code.
  2. A new public API WebContents::Cancel(), mirroring the Stop() method, which in the WebContentsImpl implementation forwards through to the new NavigationRequest::CancelNavigation() method.
I've made a proof-of-concept implementation that works as expected (we can trigger a fast timeout and the NavigationThrottle logic will correctly trigger and let us load the interstitial page), but I wanted to raise this proposed API here to see if there any concerns or issues I'm missing from my minimal experience with navigation code.

Notably (to me), NavigationRequest::OnRequestFailed() is not publicly exposed currently, but my understanding of how timeouts from the net stack are handled is that triggering this directly should be mostly equivalent. However, I imagine that there might be some potential edge case concerns if a consumer of this new API calls it at a point where the navigation state machine doesn't expect to be canceled -- would there need to be specific checks (i.e., in the new NavigationRequest::CancelNavigation() code) to prevent issues?

Any advice or opinions are appreciated!
- Chris

Kentaro Hara

unread,
Sep 14, 2021, 9:47:33 PM9/14/21
to Chris Thompson, navigat...@chromium.org, trusty-transport
WebContents::Cancel() will cancel *all* pending navigation requests in the WebContents. Is it your intention? For the fast timeout use case, I thought you are interested in cancelling some specific NavigationRequests.


--
You received this message because you are subscribed to the Google Groups "navigation-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to navigation-de...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/navigation-dev/CALMy46TsDeiun_2H7dXLSdj589p3MEmA-WzH5n0fEEcOkfqOcQ%40mail.gmail.com.


--
Kentaro Hara, Tokyo

Chris Thompson

unread,
Sep 14, 2021, 10:10:39 PM9/14/21
to Kentaro Hara, navigat...@chromium.org, trusty-transport
Good question. We’re only concerned about main frame navigations in this case, so they would likely be the same for us. The proposed API could potentially take a NavigationHandle, but there aren’t any existing WebContents APIs that use those as parameters. A possible alternative might be to expose this via NavigationHandle to trigger the cancellation logic for the specific navigation being tracked by the throttle (although I’m not sure whether it is safe to save the handle pointer from the WillStartNavigation() handler for use in a delayed callback).

(We originally investigated using LoadPostCommitErrorPage() for this particular case, which worked for loading the interstitial page but causes navigation issues as there isn’t actually a committed navigation entry to swap out. Unifying our throttle to directly cancel the navigation and trigger the WillFailRequest() logic seemed a lot cleaner than manipulating “fake” navigation state for the timed out navigation.)

dan...@chromium.org

unread,
Sep 15, 2021, 10:39:49 AM9/15/21
to Chris Thompson, Kentaro Hara, navigation-dev, trusty-transport
On Tue, Sep 14, 2021 at 10:10 PM Chris Thompson <cth...@chromium.org> wrote:
Good question. We’re only concerned about main frame navigations in this case, so they would likely be the same for us.

There can be multiple navigations in flight for the main frame. We keep a set of navigation requests in the RenderFrameHostImpl: https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/render_frame_host_impl.h;drc=1336ac64ee83c87ec97b293167f86809d591cd53;l=3465

So keeping your code specific to a NavigationRequest seems preferable if I am understanding.

A high level nit, perhaps naming things "FailNavigation" since you want it to go down the failure path, rather than "Cancel" which is not distinguishable from "Stop" for me.
 

Alex Moshchuk

unread,
Sep 15, 2021, 2:00:18 PM9/15/21
to dan...@chromium.org, Daniel Cheng, Chris Thompson, Kentaro Hara, navigation-dev, trusty-transport
One issue to consider here is whether the navigation is in a state where it can be (safely) canceled.  After we've picked the final RenderFrameHost for a navigation and told the renderer to commit (basically, after ReadyToCommitNavigation), I think canceling a navigation is problematic and has been a longstanding source of crashes and issues (+Daniel Cheng did a bunch of work here on the context of issue 838348).  So, ideally we shouldn't support navigation cancellation between ReadyToCommitNavigation and DidFinishNavigation (after which the NavigationHandle gets destroyed), as that's too late, but the time window between DidStartNavigation and ReadyToCommitNavigation seems ok.

Also, to clarify, I think we can only have multiple requests in the same frame after reaching ready-to-commit (for the above reason - we basically have to maintain a pipeline of navigations that have told the renderer to commit but haven't received DidCommit yet).  Before ready-to-commit, we'll have at most one NavigationRequest stored in FrameTreeNode, and after ready-to-commit that request moves to the set of requests on RFHI that Dana mentioned.  Still, I agree we should make this specific to a NavigationHandle/NavigationRequest, since with MPArch WebContents may now have multiple main frames (e.g., with prerender), and ideally the API would let us cancel any navigation (in any frame), rather than just the current one in the primary main frame (which would be prone to misuse).  Keeping a reference to NavigationHandle is considered ok until you hear DidFinishNavigation for it.

Chris Thompson

unread,
Sep 15, 2021, 2:29:51 PM9/15/21
to Alex Moshchuk, dan...@chromium.org, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
Thanks danakj@ and alexmos@! Nits and safety concerns are very welcome since I always feel a bit out of my depth in Navigation internals :-). I've updated my proof-of-concept to move this onto NavigationHandle/NavigationRequest, changed the method name to FailNavigation(), and added a check that the NavigationRequest state isn't READY_TO_COMMIT or later (if it is, then FailNavigation() would just be a no-op). This does seem a lot cleaner than doing this via WebContents.

And thanks for the confirmation about keeping a reference to a NavigationHandle -- I think that the lifetime of NavigationThrottles (which keep a pointer to the NavigationHandle) should be within the lifetime of the NavigationHandle (i.e., the throttle will be destroyed by the time DidFinishNavigation() is called), and our callback is triggered from a OneShotTimer which will get canceled when our NavigationThrottle is destroyed.

Alexander Timin

unread,
Sep 15, 2021, 2:40:49 PM9/15/21
to Chris Thompson, Alex Moshchuk, dan...@chromium.org, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
Taking a step back — I would generally recommend against trying to add new ways to cancel the navigation and/or add more branches to the navigation flow as it's already very complicated and hard to reason about.

It seems to me that we just want a faster timeout for the network request here – something that seems to belong to the network layer than the navigation layer.
I wonder if we could use URLLoaderRequestInterceptor to wrap the default URLLoader and implement timeout on that level?

Łukasz Anforowicz

unread,
Sep 15, 2021, 3:55:00 PM9/15/21
to Alexander Timin, Chris Thompson, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
RE: The URLLoaderRequestInterceptor idea

FWIW, the //chrome layer is given a chance to inject intermediate/proxying URLLoaderFactories (e.g. for DevTools or for Chrome Extension WebRequest API) via ContentBrowserClient::WillCreateURLLoaderFactory.

OTOH, after a cursory look at network::ResourceRequest I didn't see any properties that might be used to control //net-level timeouts

Nasko Oskov

unread,
Sep 15, 2021, 5:24:52 PM9/15/21
to Łukasz Anforowicz, Alexander Timin, Chris Thompson, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
Just a quick question. If there is a navigation that was started in the main frame and we don't want it to complete, can't we just start a new navigation in the same main frame? If it is browser-initiated, it should cancel the existing navigation, if possible, and that is already a codepath that exists and is well exercised.

Chris Thompson

unread,
Sep 15, 2021, 5:27:48 PM9/15/21
to Nasko Oskov, Łukasz Anforowicz, Alexander Timin, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
Specifically in our case we want to trigger our interstitial page for the failed navigation and wait for the user to make a decision before starting a new navigation. It's possible though that we could start some kind of "placeholder" navigation to trigger the first navigation to fail and cause the interstitial to load.

John Abd-El-Malek

unread,
Sep 20, 2021, 12:00:43 AM9/20/21
to Chris Thompson, Nasko Oskov, Łukasz Anforowicz, Alexander Timin, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
+1 to avoiding new paths.

It seems like what you want is to change the default timeout for some requests (e.g. main frame requests for HTTPS first). Perhaps it can be a new parameter to network::mojom::URLRequest which then plumbs it to net for that one request? Then the rest of the code which observes failed/timed out navigation would work the same.

Chris Thompson

unread,
Sep 20, 2021, 2:42:22 PM9/20/21
to John Abd-El-Malek, net-dev, Nasko Oskov, Łukasz Anforowicz, Alexander Timin, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
+net-dev as I only have superficial knowledge of how timeouts work in the //net stack. If there would be a reasonable way to have a single centralized timeout for network requests I agree that it seems like a good way to solve this.

My understanding is that there are many different layers of a connection that get different timeouts applied -- for example, ConnectJob() has a timeout timer but it's only used for some job types while others apply separate logic (e.g., SSLConnectJob instead has a 30s timeout for the handshake; TransportConnectJob has a 4 minute fallback timer; TCP client sockets have their own TCP-layer timeout logic, etc.). Maybe we could centralize this into ConnectJob as the "overall timeout" that all callers pass through from a parameter on URLRequest (if set) but is separate from any of the other timeout logic? But that is feeling equally complex as adding a FailNavigation() API to navigation code, especially as it adds complexity for all network connections rather than just being constrained to navigation code. Or maybe some net folks will point out how I'm missing an obvious point in the stack to add this :-)

From the other possible options, wrapping the navigation request in a URLLoader that just implements a timeout timer seems doable, but it definitely feels like a lot of complexity just to do this (although it localizes that complexity in our leaf node at least). There is a good precedent example of SimpleURLLoader wrapping URLLoader and providing timeout functionality. I can dig into this more if the other options seem like they're non-starters.

Matt Menke

unread,
Sep 20, 2021, 3:29:33 PM9/20/21
to Chris Thompson, John Abd-El-Malek, net-dev, Nasko Oskov, Łukasz Anforowicz, Alexander Timin, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
ConnectJobs are not associated with particular URLRequests, so they can't be passed a timeout from them.  It's a hard requirement that all ConnectJobs in a group (same origin/NIK/privacy mode/proxy) must be indistinguishable from each other, which includes not giving some subset of them different timeouts.  These jobs must be indistinguishable because we use late binding - when we've established a connection, it goes to the oldest, highest priority pending socket request.  Also worth noting due to preconnect, we often create the ConnectJob before a URLRequest even makes it to the socket layer.

Also, there's no general timeout for URLRequests in net - there are only the connection timeout timers (and possibly TCP/QUIC/H2 pings), since we don't know if a request is supposed to be a hanging GET, or the server has to do a lot of processing, etc.

All that aside, I'm not seeing why this should go deep inside net/, instead of in the NavigationURLLoader layer, which might not know when a connection is successfully established, but does knot when we receive response headers/redirects/etc.  It sounds to be like the main concern around when we stop the timeout timer here is related to commit time, which the NavigationURLLoader knows (And, conveniently, at commit time, I believe we throw away the NavigationURLLoader and pass ownership of its data pipe and URLLoader to the renderer as well).

You received this message because you are subscribed to the Google Groups "net-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to net-dev+u...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/net-dev/CALMy46TzFeUhi8st0fXURM6JsTG_VdRwGO%3DNVWZJjuybeN3vbw%40mail.gmail.com.

John Abd-El-Malek

unread,
Sep 22, 2021, 6:04:52 PM9/22/21
to Matt Menke, Chris Thompson, net-dev, Nasko Oskov, Łukasz Anforowicz, Alexander Timin, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
(oops, sorry missed this thread since it got filtered into a label)

Thanks for the net/ details explanation Matt. So if that doesn't work, putting it in NavigationURLLoader as an argument to the navigation sounds better than trying to add another cancellation path.

Chris Thompson

unread,
Sep 22, 2021, 9:07:32 PM9/22/21
to John Abd-El-Malek, Matt Menke, net-dev, Nasko Oskov, Łukasz Anforowicz, Alexander Timin, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
Thanks Matt! And agreed that NavigationURLLoader seems like a reasonable layer to add navigation-level timeout functionality. However, for our purposes I don't think we can make this timeout an argument passed to the Loader when it's being created, as the navigation (and loader) already exist when the navigation gets upgraded -- maybe there's a reasonable way to "swap out" an existing navigation (loader) I'm not thinking of?

I've updated my proof-of-concept CL to use a quick implementation of this (see PS#5 https://crrev.com/c/3149506/5) where instead we expose the timeout via a method on NavigationHandle/NavigationURLLoader:
  1. Adds NavigationURLLoader::SetNavigationTimeout() which starts a OneShotTimer for the specified duration, which calls OnComplete(ERR_TIMED_OUT) when triggered.
  2. Adds NavigationHandle::SetNavigationTimeout() to give an API which Throttles can call to set the timeout. This gets forwarded through to the Loader for the NavigationRequest.
Currently these methods return a boolean for whether the timeout was started or not (and they don't reset the timer if called multiple times) to make it easier to integrate them into the existing logic in HttpsOnlyModeNavigationThrottle -- it might be possible to avoid this with some more changes to our Throttle code.

Let me know if this seems like a reasonable approach to handling this at this layer.

- Chris

John Abd-El-Malek

unread,
Sep 24, 2021, 2:45:38 PM9/24/21
to Chris Thompson, Matt Menke, net-dev, Nasko Oskov, Łukasz Anforowicz, Alexander Timin, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
On Wed, Sep 22, 2021 at 6:07 PM Chris Thompson <cth...@chromium.org> wrote:
Thanks Matt! And agreed that NavigationURLLoader seems like a reasonable layer to add navigation-level timeout functionality. However, for our purposes I don't think we can make this timeout an argument passed to the Loader when it's being created, as the navigation (and loader) already exist when the navigation gets upgraded -- maybe there's a reasonable way to "swap out" an existing navigation (loader) I'm not thinking of?

I've updated my proof-of-concept CL to use a quick implementation of this (see PS#5 https://crrev.com/c/3149506/5) where instead we expose the timeout via a method on NavigationHandle/NavigationURLLoader:
  1. Adds NavigationURLLoader::SetNavigationTimeout() which starts a OneShotTimer for the specified duration, which calls OnComplete(ERR_TIMED_OUT) when triggered.
  2. Adds NavigationHandle::SetNavigationTimeout() to give an API which Throttles can call to set the timeout. This gets forwarded through to the Loader for the NavigationRequest.
Currently these methods return a boolean for whether the timeout was started or not (and they don't reset the timer if called multiple times) to make it easier to integrate them into the existing logic in HttpsOnlyModeNavigationThrottle -- it might be possible to avoid this with some more changes to our Throttle code.

Let me know if this seems like a reasonable approach to handling this at this layer.

Seems very reasonable and minimal impact on content/. Only nit is to make the documentation clear that this isn't a timeout on the whole request, just until it gets committed.

Chris Thompson

unread,
Sep 24, 2021, 5:33:20 PM9/24/21
to John Abd-El-Malek, Matt Menke, net-dev, Nasko Oskov, Łukasz Anforowicz, Alexander Timin, Alex Moshchuk, Dana Jansens, Daniel Cheng, Kentaro Hara, navigation-dev, trusty-transport
Sounds good. Thanks for the feedback and discussion everyone!

I'll plan to clean up the CL and then send it for review next week, so do let me know if there are any remaining concerns with this last approach.

- Chris
Reply all
Reply to author
Forward
0 new messages