[Layout Shift Metrics] feedback on different strategies

Ilya Grigorik

Feb 19, 2021, 11:45:52 AM
to web-vitals-feedback

Hey folks. Excited to see progress on improving CLS and thank you for the thorough write-up on the methodology and evaluation process — really helpful!

Do you find sliding or session windows easier to understand?

Polling folks at Shopify, we found session windows easier to grok, explain, and reason about. For example, in a context where the site can demarcate certain usage patterns (user adding to cart, etc), it’s easier to map the presence (or absence ;)) of session windows within that span.

One of the top strategies summarizes the layout shift windows as an average, and the rest report the maximum window. For pages which are open a very long time, the average will likely report a more representative value, but in general it will likely be easier for developers to act on a single window—they can log when it occurred, the elements that shifted, and so on. We'd love feedback on which is more important to developers.

With a developer hat on, I agree that max is where we’d turn our attention first and then work down the list from there. That said, given the stated goal of CWV as being user-first, it seems that we should be prioritizing the decision based on what we think better captures the user pain: is it the presence of the large and jagged peak (singular) that causes the most frustration, or is it the cumulative damage? 

My personal intuition is that average is a better user proxy (especially for long-lived pages), but developers should and will also gather peaks as a proxy for where to dig first.
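
To make the max-versus-average comparison concrete, here is a minimal sketch of how both summaries could be computed from the same set of session windows. The `Shift` shape, the function names, and the 1-second gap / 5-second cap are assumptions chosen for illustration; this thread does not specify them.

```typescript
// Hypothetical shape: only the two fields of a layout-shift entry used here.
interface Shift {
  startTime: number; // ms since navigation start
  value: number;     // layout shift score of this entry
}

// Group shifts into session windows and return the summed score per window.
// A new window starts when the gap since the previous shift reaches `gapMs`,
// or when the current window has been open for `maxMs`.
// Both thresholds are assumptions, not values taken from this thread.
function sessionWindows(shifts: Shift[], gapMs = 1000, maxMs = 5000): number[] {
  const sorted = [...shifts].sort((a, b) => a.startTime - b.startTime);
  const totals: number[] = [];
  let windowStart = 0;
  let prev = 0;
  for (const s of sorted) {
    const startsNew =
      totals.length === 0 ||
      s.startTime - prev >= gapMs ||
      s.startTime - windowStart >= maxMs;
    if (startsNew) {
      totals.push(s.value);
      windowStart = s.startTime;
    } else {
      // Add this shift to the window that is currently open.
      totals.push((totals.pop() ?? 0) + s.value);
    }
    prev = s.startTime;
  }
  return totals;
}

// "Where do I dig first?" (developer view): the single worst window.
const maxWindow = (w: number[]): number => (w.length ? Math.max(...w) : 0);

// "How bad was the visit overall?" (possible user proxy): the mean window.
const avgWindow = (w: number[]): number =>
  w.length ? w.reduce((a, b) => a + b, 0) / w.length : 0;
```

Reporting both numbers from the same window list keeps the two views consistent: the max points at one window to debug, while the average is less sensitive to a single spike on a long-lived page.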

Separately, I assume all of this analysis is done in context of a single “user session”. What is the current thinking and guidance on sessions where the user switches tabs back-n-forth? E.g. I have Gmail open and I come back to it 15 times without reloading throughout the day. I assume we should treat these as 15 distinct sessions and report on visibility switch?

Finally, one idea that came up in internal brainstorm (credit to Helen Lin) is guidance+tips on how the developer could group/label certain activities. For example, in the context of a shopping experience, it would be nice to have a simple mechanism to mark the beginning of checkout flow and the end, and then group+capture all the CLS sessions (hopefully, none :)), input delay events, etc. This is all ~possible via UT and manual gymnastics, but it would be nice to at least have an agreed convention on how to do this, such that analytics vendors could make use of it automagically, and perhaps that could open optimization opportunities for browser vendors too.

ig

Michal Mocny

Feb 19, 2021, 12:39:11 PM
to Ilya Grigorik, web-vitals-feedback
Thanks for the feedback and for polling folks locally!

On Fri, Feb 19, 2021 at 11:45 AM 'Ilya Grigorik' via web-vitals-feedback <web-vital...@googlegroups.com> wrote:

Hey folks. Excited to see progress on improving CLS and thank you for the thorough write-up on the methodology and evaluation process — really helpful!

Do you find sliding or session windows easier to understand?

Polling folks at Shopify, we found session windows easier to grok, explain, and reason about. For example, in a context where the site can demarcate certain usage patterns (user adding to cart, etc), it’s easier to map the presence (or absence ;)) of session windows within that span.

One of the top strategies summarizes the layout shift windows as an average, and the rest report the maximum window. For pages which are open a very long time, the average will likely report a more representative value, but in general it will likely be easier for developers to act on a single window—they can log when it occurred, the elements that shifted, and so on. We'd love feedback on which is more important to developers.

With a developer hat on, I agree that max is where we’d turn our attention first and then work down the list from there. That said, given the stated goal of CWV as being user-first, it seems that we should be prioritizing the decision based on what we think better captures the user pain: is it the presence of the large and jagged peak (singular) that causes the most frustration, or is it the cumulative damage? 

My personal intuition is that average is a better user proxy (especially for long-lived pages), but developers should and will also gather peaks as a proxy for where to dig first.

Separately, I assume all of this analysis is done in context of a single “user session”. What is the current thinking and guidance on sessions where the user switches tabs back-n-forth? E.g. I have Gmail open and I come back to it 15 times without reloading throughout the day. I assume we should treat these as 15 distinct sessions and report on visibility switch?

This is certainly a reasonable perspective.  Slicing on "sessions" within a single Page Load hasn't been done in practice... yet.  Occasionally sessions are cut short (e.g. on Android when the whole browser is backgrounded), but have not been "reset and continued".

However, this is changing.  First and foremost, for BFCache navigations.  Yoav had several presentations in the last few Web Perf WG meetings about proposals for updating the performance timeline to support multiple same document navigation entries.

That approach (perhaps alongside new history api changes) opens the door to potentially doing something similar for SPA navigations in general.  And finally, we did at least discuss if that wouldn't further support slicing on visibility (or other reasons for idle time).  It's pretty nascent, but things are happening!

Finally, one idea that came up in internal brainstorm (credit to Helen Lin) is guidance+tips on how the developer could group/label certain activities. For example, in the context of a shopping experience, it would be nice to have a simple mechanism to mark the beginning of checkout flow and the end, and then group+capture all the CLS sessions (hopefully, none :)), input delay events, etc. This is all ~possible via UT and manual gymnastics, but it would be nice to at least have an agreed convention on how to do this, such that analytics vendors could make use of it automagically, and perhaps that could open optimization opportunities for browser vendors too.

Hmm.  Indeed, I would probably suggest slicing by startTime, which you could choose to mark with UT or just slice locally before uploading.

As an alternative way to simplify "gather" operation, perhaps registering a new PerformanceObserver(s) w/o buffering at the start of the flow and unregistering at the end?

ig

--
You received this message because you are subscribed to the Google Groups "web-vitals-feedback" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web-vitals-feed...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/web-vitals-feedback/c2d55d8d-28c2-4557-b7c7-d636874a7686n%40googlegroups.com.

Ilya Grigorik

Mar 19, 2021, 12:41:41 PM
to Michal Mocny, web-vitals-feedback
Hey Michal. Apologies, slow reply: filter fail and slipped through the cracks.

On Fri, Feb 19, 2021 at 9:39 AM Michal Mocny <mmo...@google.com> wrote:
Thanks for the feedback and for polling folks locally!

On Fri, Feb 19, 2021 at 11:45 AM 'Ilya Grigorik' via web-vitals-feedback <web-vital...@googlegroups.com> wrote:

Separately, I assume all of this analysis is done in context of a single “user session”. What is the current thinking and guidance on sessions where the user switches tabs back-n-forth? E.g. I have Gmail open and I come back to it 15 times without reloading throughout the day. I assume we should treat these as 15 distinct sessions and report on visibility switch?

This is certainly a reasonable perspective.  Slicing on "sessions" within a single Page Load hasn't been done in practice... yet.  Occasionally sessions are cut short (e.g. on Android when the whole browser is backgrounded), but have not been "reset and continued".

However, this is changing.  First and foremost, for BFCache navigations.  Yoav had several presentations in the last few Web Perf WG meetings about proposals for updating the performance timeline to support multiple same document navigation entries.

That approach (perhaps alongside new history api changes) opens the door to potentially doing something similar for SPA navigations in general.  And finally, we did at least discuss if that wouldn't further support slicing on visibility (or other reasons for idle time).  It's pretty nascent, but things are happening!

Agree on all of the above, but let me be more direct: how will Chrome measure this with CrUX? We need to have the same ground truth in our telemetry. I assume it will be session-based and reset on visibility change or bf-nav; is that correct?

Finally, one idea that came up in internal brainstorm (credit to Helen Lin) is guidance+tips on how the developer could group/label certain activities. For example, in the context of a shopping experience, it would be nice to have a simple mechanism to mark the beginning of checkout flow and the end, and then group+capture all the CLS sessions (hopefully, none :)), input delay events, etc. This is all ~possible via UT and manual gymnastics, but it would be nice to at least have an agreed convention on how to do this, such that analytics vendors could make use of it automagically, and perhaps that could open optimization opportunities for browser vendors too.

Hmm.  Indeed, I would probably suggest slicing by startTime, which you could choose to mark with UT or just slice locally before uploading.
As an alternative way to simplify "gather" operation, perhaps registering a new PerformanceObserver(s) w/o buffering at the start of the flow and unregistering at the end?

True, the latter strategy makes it clean and simple! I guess if we also emit a UT measure at the end, that'll show up in DevTools and can be harvested by RUM tools.
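
A rough sketch of that strategy, for illustration only: register an unbuffered observer at the start of a flow, disconnect it at the end, and emit a User Timing measure with the flow's name. `startFlow`, `observeLayoutShifts`, and the `Subscribe` type are hypothetical names; the only platform calls assumed are `PerformanceObserver` (`observe`, `takeRecords`, `disconnect`) and `performance.mark` / `performance.measure`. Globals are reached through `globalThis` so the sketch also compiles outside a browser.

```typescript
// Hypothetical shape: the fields of a layout-shift entry used here.
interface ShiftEntry {
  startTime: number;
  value: number;
  hadRecentInput?: boolean;
}

// Subscribes to layout shifts and returns an unsubscribe function.
type Subscribe = (onEntry: (entry: ShiftEntry) => void) => () => void;

// Browser wiring: `buffered: false` means only shifts that occur after this
// call are delivered, which is what scopes the data to the flow.
const observeLayoutShifts: Subscribe = (onEntry) => {
  const PO = (globalThis as any).PerformanceObserver;
  const observer = new PO((list: { getEntries(): ShiftEntry[] }) => {
    for (const entry of list.getEntries()) onEntry(entry);
  });
  observer.observe({ type: "layout-shift", buffered: false });
  return () => {
    // Flush entries queued but not yet delivered, then stop observing.
    for (const entry of observer.takeRecords() as ShiftEntry[]) onEntry(entry);
    observer.disconnect();
  };
};

// Start collecting shifts for a named flow; call the returned function to end it.
function startFlow(name: string, subscribe: Subscribe = observeLayoutShifts) {
  const shifts: ShiftEntry[] = [];
  const perf = (globalThis as any).performance;
  perf?.mark?.(`${name}:start`);
  const unsubscribe = subscribe((entry) => {
    // Shifts that follow recent user input are excluded, as they are for CLS.
    if (!entry.hadRecentInput) shifts.push(entry);
  });
  return function endFlow() {
    unsubscribe();
    // The measure spans from the start mark to now, so it shows up in
    // DevTools and can be harvested by RUM tools.
    perf?.measure?.(name, `${name}:start`);
    const total = shifts.reduce((sum, entry) => sum + entry.value, 0);
    return { name, total, shifts };
  };
}
```

Usage would be along the lines of `const end = startFlow("checkout")` when the flow begins and `const { total } = end()` when it completes. Passing `subscribe` as a parameter is only there so the logic can be exercised without a browser.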

ig

Philip Walton

Mar 21, 2021, 8:01:52 PM
to Ilya Grigorik, Michal Mocny, web-vitals-feedback
On Fri, Mar 19, 2021 at 9:41 AM 'Ilya Grigorik' via web-vitals-feedback <web-vital...@googlegroups.com> wrote:
Hey Michal. Apologies, slow reply: filter fail and slipped through the cracks.

On Fri, Feb 19, 2021 at 9:39 AM Michal Mocny <mmo...@google.com> wrote:
Thanks for the feedback and for polling folks locally!

On Fri, Feb 19, 2021 at 11:45 AM 'Ilya Grigorik' via web-vitals-feedback <web-vital...@googlegroups.com> wrote:

Separately, I assume all of this analysis is done in context of a single “user session”. What is the current thinking and guidance on sessions where the user switches tabs back-n-forth? E.g. I have Gmail open and I come back to it 15 times without reloading throughout the day. I assume we should treat these as 15 distinct sessions and report on visibility switch?

This is certainly a reasonable perspective.  Slicing on "sessions" within a single Page Load hasn't been done in practice... yet.  Occasionally sessions are cut short (e.g. on Android when the whole browser is backgrounded), but have not been "reset and continued".

However, this is changing.  First and foremost, for BFCache navigations.  Yoav had several presentations in the last few Web Perf WG meetings about proposals for updating the performance timeline to support multiple same document navigation entries.

That approach (perhaps alongside new history api changes) opens the door to potentially doing something similar for SPA navigations in general.  And finally, we did at least discuss if that wouldn't further support slicing on visibility (or other reasons for idle time).  It's pretty nascent, but things are happening!

Agree on all of the above, but let me be more direct: how will Chrome measure this with CrUX? We need to have the same ground truth in our telemetry. I assume it will be session-based and reset on visibility change or bf-nav; is that correct?

Current thinking is that changes to CLS will continue to be attributed to the initial page load. E.g. if a user backgrounds a tab, revisits it a few hours/days later, and then that later "session" contains more severe layout shifts than before, the new CLS value will be used and the previously reported value will be ignored.

The recommendation for RUM vendors will be the same as it is currently for CLS. Keep track of the current value and always report at visibilitychange:hidden if the value has changed from the previous reported value. Then, when beacons are processed on the back end, these values will have to be grouped on a common page ID (where the page ID is unique to the current load or bfcache restore) and the max (i.e. last one) is the one that should be used.
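
As a sketch of what that recommendation could look like in practice (the `Beacon` shape and the function names are hypothetical, and how the page ID is generated is left open): the client sends a beacon on each transition to hidden only if the value changed, and the back end keeps one value per page ID.

```typescript
// Hypothetical beacon payload; `pageId` is unique per load or bfcache restore.
interface Beacon {
  pageId: string;
  cls: number;
  sentAt: number;
}

// Client side: call the returned function whenever the page becomes hidden
// (e.g. from a `visibilitychange` listener when visibilityState is "hidden").
// A beacon is sent only if the value differs from the last reported one.
function createClsReporter(pageId: string, send: (b: Beacon) => void) {
  let lastReported: number | undefined;
  return function onHidden(currentCls: number, now: number): void {
    if (currentCls === lastReported) return;
    lastReported = currentCls;
    send({ pageId, cls: currentCls, sentAt: now });
  };
}

// Back end: group beacons on the page ID and keep the max per page.
function finalClsByPage(beacons: Beacon[]): Map<string, number> {
  const result = new Map<string, number>();
  for (const b of beacons) {
    const prev = result.get(b.pageId);
    if (prev === undefined || b.cls > prev) result.set(b.pageId, b.cls);
  }
  return result;
}
```

Taking the max here assumes the reported value never decreases within a page ID, in which case the max and the last beacon coincide. If a later definition allowed the value to drop, the back end would need to choose between the two explicitly.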

I'd love (as you know) to see us explore the possibility of considering some of these cases separate sessions (that would certainly help with SPA measurement), but I don't think we have concrete thoughts on how best to do that yet.

Finally, one idea that came up in internal brainstorm (credit to Helen Lin) is guidance+tips on how the developer could group/label certain activities. For example, in the context of a shopping experience, it would be nice to have a simple mechanism to mark the beginning of checkout flow and the end, and then group+capture all the CLS sessions (hopefully, none :)), input delay events, etc. This is all ~possible via UT and manual gymnastics, but it would be nice to at least have an agreed convention on how to do this, such that analytics vendors could make use of it automagically, and perhaps that could open optimization opportunities for browser vendors too.

Hmm.  Indeed, I would probably suggest slicing by startTime, which you could choose to mark with UT or just slice locally before uploading.
As an alternative way to simplify "gather" operation, perhaps registering a new PerformanceObserver(s) w/o buffering at the start of the flow and unregistering at the end?

True, the latter strategy makes it clean and simple! I guess if we also emit a UT measure at the end, that'll show up in DevTools and can be harvested by RUM tools.

ig

Ilya Grigorik

Mar 23, 2021, 11:40:20 AM
to Philip Walton, Michal Mocny, web-vitals-feedback
Hey Philip.

On Sun, Mar 21, 2021 at 5:01 PM Philip Walton <philip...@google.com> wrote:
On Fri, Mar 19, 2021 at 9:41 AM 'Ilya Grigorik' via web-vitals-feedback <web-vital...@googlegroups.com> wrote:
Agree on all of the above, but let me be more direct: how will Chrome measure this with CrUX? We need to have the same ground truth in our telemetry. I assume it will be session-based and reset on visibility change or bf-nav; is that correct?

Current thinking is that changes to CLS will continue to be attributed to the initial page load. E.g. if a user backgrounds a tab, revisits it a few hours/days later, and then that later "session" contains more severe layout shifts than before, the new CLS value will be used and the previously reported value will be ignored.

The recommendation for RUM vendors will be the same as it is currently for CLS. Keep track of the current value and always report at visibilitychange:hidden if the value has changed from the previous reported value. Then, when beacons are processed on the back end, these values will have to be grouped on a common page ID (where the page ID is unique to the current load or bfcache restore) and the max (i.e. last one) is the one that should be used.

Isn't this counter to what we were saying earlier about BF-cache navs? Conceptually, a user hitting back-forward is not much different from switching tabs, and it's confusing — to me, at least — to treat these with different processing logic.  

Building on the above, session stitching is hard and my intuition is that the proposed route wouldn't work for most analytics vendors, at least as implemented today. If I have a tab open for multiple days, and I revisit it once each day, it will most likely be recorded as a distinct session by most analytics vendors... Case in point, Google Analytics has, afaik, default session length of 30m. 
 
I'd love (as you know) to see us explore the possibility of considering some of these cases separate sessions (that would certainly help with SPA measurement), but I don't think we have concrete thoughts on how best to do that yet.

Let's keep SPA out of this for now as that's a separate can of worms. What I'm trying to clarify is the background-foreground transition and the case I'm making is: 
  • we tell everyone to (always) beacon when you make visibility transition -- this we agree on.
  • we treat bf-navs and tab switches as the same -- they trigger same visibility transitions.
  • we need clear guidance on what constitutes a session:
    • simplest: every foregrounded span is a session (simple to understand, no stitching required).
    • medium complexity: foreground sessions that occur within X period of time should be stitched
My claim is that we should either default to simplest, or do an audit of how existing analytics/RUM tools stitch and default to that. If GA defaults to 30m, that's one strong signal that we can't and shouldn't expect folks to stitch beyond that.
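
To illustrate the two options, a minimal sketch that treats each foreground span as its own session by default and stitches spans only when a maximum gap is supplied. The `Span` shape and `stitchSessions` are hypothetical names, and the 30-minute value in the usage note is just the GA default mentioned above.

```typescript
// Hypothetical shape: one foregrounded span, as millisecond timestamps.
interface Span {
  start: number;
  end: number;
}

// With `maxGapMs` omitted (null), every foreground span is its own session.
// With a number, spans whose gap to the previous session is at most
// `maxGapMs` are merged into that session.
function stitchSessions(spans: Span[], maxGapMs: number | null = null): Span[] {
  const sorted = [...spans].sort((a, b) => a.start - b.start);
  const sessions: Span[] = [];
  for (const span of sorted) {
    const last = sessions[sessions.length - 1];
    const canStitch =
      maxGapMs !== null && last !== undefined && span.start - last.end <= maxGapMs;
    if (canStitch && last !== undefined) {
      last.end = Math.max(last.end, span.end);
    } else {
      sessions.push({ ...span }); // copy so the input is not mutated
    }
  }
  return sessions;
}
```

For example, `stitchSessions(spans)` gives the "simplest" behavior, while `stitchSessions(spans, 30 * 60 * 1000)` merges revisits that happen within 30 minutes and leaves a next-day revisit as a separate session.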

ig

Ishan Anand

Mar 26, 2021, 1:41:53 AM
to Ilya Grigorik, Philip Walton, Michal Mocny, web-vitals-feedback
Building on the above, session stitching is hard and my intuition is that the proposed route wouldn't work for most analytics vendors, at least as implemented today. If I have a tab open for multiple days, and I revisit it once each day, it will most likely be recorded as a distinct session by most analytics vendors... Case in point, Google Analytics has, afaik, default session length of 30m. 

Plus one on the above.

--
Ishan Anand
Moovweb
340 Pine Street, Suite 400
San Francisco, CA 94104
Cell: +1-415-335-6094 (sms/texts are welcome)