long durations between connectEnd and requestStart

461 views
Skip to first unread message

Ben Maurer

unread,
Jan 11, 2016, 1:42:12 PM1/11/16
to net-dev
Hey,

At Facebook we've noticed that there are often long gaps between connectEnd and requestStart in resource timing entries for static resource downloads. I've attached a screenshot of an internal tool that visualizes this. in the highlighted request:

- fetchStart = connectStart = connectEnd = left hand side of blue box
- connectEnd -> requestStart  = 164 ms
- requestStart -> responseStart = the green box to the right of the blue box 
- responseStart -> responseEnd = the tiny blue box to the right of the green box
- responseEnd -> performance.now() on the first line of the JS file = the yellow box

It seems really weird to us that there would ever be a gap between connectEnd and requestStart. There are no redirects for these resources and there appears to be a HTTP2 connection that is open and ready for reuse.

One hypothesis I have is that these requests might actually not be network requests but might be served from the user's cache. As we saw in the previous discussion about spacer images even on a fast laptop with flash a cached request can easily take 100 ms. Some UMA data was shared with us suggesting that the 90th percentile request time for cached items is often over 100ms.

1) Is there any way that we can distinguish cache hit vs miss with resource timing. Ilya suggested that we could once transferSize is included.
2) Would cache hits ever see a long duration between connectEnd and requestStart
3) The UMA for the 90th percentile of cached requests seems really high. Is this to be expected?

-b
12469488_10100951949617179_3430289650558326638_o.jpg

Matt Menke

unread,
Jan 11, 2016, 2:03:28 PM1/11/16
to Ben Maurer, net-dev
If you can repro locally, an about:net-internals provides a much more complete view of what things look like from the network stack's perspective, and about:tracing provides a lot more details about Chrome's perspective.


I'm not sure how much knowledge there is on this list of the Javascript APIs.  I guess connectEnd is what we call connect_end (Shockingly), and requestStart is what we call send_start (Our request_start is much earlier).  I don't know the cache well enough to know if these timings could reflect cache revalidation or not, but it does sound like a possibility.  The normal (non-cached) order is:  Create stream, generate auth token, init body (Which, for uploads, may involve opening one or more files, and checking their sizes and modification dates), and then we record send start.  Doesn't seem much in there that could account for the delay.

--
You received this message because you are subscribed to the Google Groups "net-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to net-dev+u...@chromium.org.
To post to this group, send email to net...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/net-dev/05956a81-db3c-4b21-b825-9b5a37b55684%40chromium.org.

Ben Maurer

unread,
Jan 11, 2016, 2:33:10 PM1/11/16
to Matt Menke, net-dev
The code seems to agree with your mapping here:


One other insight here -- I found a few samples where the requestStart -> responseEnd time is < 5 milliseconds and the user is in a country in which it would be basically impossible for them to have a 5 ms rtt to our CDN. 

Ryan Sleevi

unread,
Jan 11, 2016, 2:44:19 PM1/11/16
to Matt Menke, Ben Maurer, net-dev, Chromium Loading Performance, Ilya Grigorik
+loading-dev & ilya

Doesn't requestStart also factor in any necessary process instantiations? That is, we may have preconnected or prewarmed a socket (thus covering connectStart -> connectEnd), and then either are waiting for a request (in the case of preconnect) or are waiting for the renderer to spawn to handle the request.

On Mon, Jan 11, 2016 at 11:03 AM, Matt Menke <mme...@chromium.org> wrote:

Ben Maurer

unread,
Jan 11, 2016, 2:48:24 PM1/11/16
to Ryan Sleevi, Matt Menke, net-dev, Chromium Loading Performance, Ilya Grigorik
Yeah, we've noticed that if a request waits for a socket that it will have connectEnd = fetchStart. Since we use HTTP2 this generally seems to happen at the beginning of the page when we are waiting to first open a connection to the server. In the example here, we've already made a number of successful requests to this domain so I don't see why this would happen.

Also, we saw a few cases where there was a gap between requestStart and domainLookupStart -- that would seem to only be explicable by a cache miss causing latency before looking up the DNS to fill the miss

Matt Menke

unread,
Jan 11, 2016, 2:51:04 PM1/11/16
to Ryan Sleevi, Ben Maurer, net-dev, Chromium Loading Performance, Ilya Grigorik
So the events are:

navigationStart (And whatever resource timing equivalent there is)
fetchStart  (our request_start)
connectStart
connectEnd
requestStart (our send_start)

We force those events to be in order, so in the case of socket reuse or preconnect, fetchStart=connectStart=connectEnd.  As long as those times are different, the socket wasn't connected before the request started.  I'm not sure if it's possible for the socket to have been connected before the request made it to the socket layer, in that case, though.

Ilya Grigorik

unread,
Jan 11, 2016, 2:58:17 PM1/11/16
to Matt Menke, Ryan Sleevi, Ben Maurer, net-dev, Chromium Loading Performance
Shooting in the dark.. since the last batch of requests all appear to have the same behavior (long connectEnd -> requestStart), are they fetching resources with different crossorigin policy or some such? Those would require a different socket. Alternatively, socket could have been closed by the server, ...?

Is it possible to reproduce this locally? A chrome trace would go a long way here.

Ben Maurer

unread,
Jan 11, 2016, 3:02:09 PM1/11/16
to Ilya Grigorik, Matt Menke, Ryan Sleevi, net-dev, Chromium Loading Performance
All of our JS and CSS resources have the same (anonymous) policy. Also, that case would show somebody with a connectStart -> connectEnd which I didn't find in the trace.

I haven't had any luck repro-ing samples like this locally despite the high number of them I see in the wild. I might have to try getting a machine without an SSD

Eric Roman

unread,
Jan 11, 2016, 3:26:17 PM1/11/16
to Ben Maurer, Ilya Grigorik, Matt Menke, Ryan Sleevi, net-dev, Chromium Loading Performance
I wonder if the 150ms between connectEnd and requestStart could be Chrome doing a network request to validate a cache entry (as in an if-modified-since GET).

I don't believe that is what the timing api expects, but I wouldn't be surprised if that is what our implementation was measuring (I glanced quickly at that code but couldn't follow the caching case).

Another idea is whether a Chrome extension could be blocking it. The chrome.webRequest API has some hook points at onBeforeRequest and onBeforeSendHeaders. Worth checking if this measurement happens prior to the dispatch for those hooks.

Nat Duca

unread,
Jan 12, 2016, 3:16:29 PM1/12/16
to Eric Roman, Oystein Eftevaag, Ben Maurer, Ilya Grigorik, Matt Menke, Ryan Sleevi, net-dev, Chromium Loading Performance
We have a nascent system that Oystein built called bulk reports that lets us get pii-anonymized traces from real world computers who are inserting into a  speicific histogram bin . E.g., with a bit of poking, we might be able to use bulk reporting to get a trace from real world page loads that have this long delay.

Maybe someone from net/loading side that is poking at this can quickly vc or chat off thread with Oysteine to see if his [awesome] tech can help.

--
You received this message because you are subscribed to the Google Groups "Chromium Loading Performance" group.
To unsubscribe from this group and stop receiving emails from it, send an email to loading-dev...@chromium.org.
To post to this group, send email to loadi...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/loading-dev/CAFswn4mSf8Yan27J5n-SB35_t%3DBE%3DJ2JAV-BZGDNWfpLuQkA%3Dw%40mail.gmail.com.

mattc...@gmail.com

unread,
Jun 15, 2016, 8:00:37 PM6/15/16
to net-dev
Any updates on this? I'm seeing similar issues at NewVoiceMedia. A very small proportion of chrome users show this unexplained gap.

Is it's real remote users with the problem we can't simply debug it locally.

TIA Matt Churcher

Reply all
Reply to author
Forward
0 new messages