Measuring scroll latency with our high-res timestamp API


Rick Byers

Nov 13, 2015, 9:06:13 PM
to input-dev, igri...@chromium.org, Tim Dresser, Dave Tapuska, Majid Valipour
With Majid switching Event.timeStamp over to be the system time, and Dave working on passive event listeners, I figured it was time to build a demo of measuring scroll latency given input event timestamps. My hope is that eventually APM folks like New Relic will pick this up (but we don't want to push that without also pushing passive event listeners, since we don't want them to ever make the problem worse).

It seems to be working pretty well with Majid's feature enabled (modulo a couple outstanding timestamp bugs on some platforms), and already works (via a slightly different code path) on Safari 9+.
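
For illustration, a minimal sketch of the kind of measurement described here (not the demo's actual code): it estimates how long an event waited between its timeStamp and the moment its listener ran. The helper name inputDelayMs, the injectable clock parameter, and the 1e12 cutoff used to tell epoch-based timestamps from high-resolution ones are all assumptions made for this example.

```javascript
// Hypothetical helper: estimates the delay between an input event's
// timeStamp and the moment this function runs inside the listener.
// `clock` is injectable so the arithmetic can be checked outside a browser.
function inputDelayMs(
  event,
  clock = { highRes: () => performance.now(), epoch: () => Date.now() }
) {
  // Assumption: a value above 1e12 is milliseconds since 1970 (epoch-based),
  // anything smaller is a high-resolution time relative to the time origin.
  const EPOCH_THRESHOLD_MS = 1e12;
  const now = event.timeStamp > EPOCH_THRESHOLD_MS
    ? clock.epoch()
    : clock.highRes();
  // Clamp at zero in case the two clocks disagree slightly.
  return Math.max(0, now - event.timeStamp);
}

// Browser-only wiring; skipped when there is no DOM.
if (typeof window !== 'undefined') {
  window.addEventListener('touchmove', (event) => {
    console.log('touchmove delay (ms):', inputDelayMs(event));
  }, { passive: true });
}
```

Note that a browser which predates the options object treats { passive: true } as a truthy capture flag, so the listener above would still be an ordinary blocking listener there.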

Of course this isn't end-to-end scroll latency, just the input side.  But I think it's pretty close to what developers should primarily care about (the blocking they can cause with their listeners).  I _think_ we still want to do the PerformanceTimeline API for the following reasons:
  • The values it reports will be very precisely defined - not including the system/hardware-defined input-latency, and so comparable between browsers and platforms.
  • It's simpler - takes the guesswork out of what events matter (blocking scrolling etc)
Feedback?

Rick

Tim Dresser

Nov 13, 2015, 9:54:09 PM
to Rick Byers, input-dev, igri...@chromium.org, Dave Tapuska, Majid Valipour
I agree with your justification of why we still want the PerformanceTimeline API.
In addition, I think there is some benefit to having RUM collected in a consistent place.

Tim

Ilya Grigorik

Nov 16, 2015, 1:51:38 PM
to Tim Dresser, Rick Byers, input-dev, Dave Tapuska, Majid Valipour, Nat Duca
I'm wondering if, as a first step, we should consider doing a perf-violation-style report for this? The super hand-wavy description is:

- Developer registers a reporting endpoint via mikewest.github.io/error-reporting (ignore the name, WIP :))
- UA records perf violation reports (e.g. scroll jank) and emits a report: 
-- Triggering criteria are under our control
-- The format is flexible and something we can iterate on 
-- The schema is ~semi-structured and can be processed by existing RUM vendors

The advantage of the above model is that we don't have to go through the same standardization gauntlet upfront, before we know what we want to surface exactly and under what conditions. With the above we have a lot of flexibility to surface platform-specific insights, modify trigger criteria, etc... And, of course, in the longer term we can also expose a first-class timeline API once we have some hands-on experience with this stuff.

Thoughts?
ig

Tim Dresser

Nov 16, 2015, 1:59:03 PM
to Ilya Grigorik, Rick Byers, input-dev, Dave Tapuska, Majid Valipour, Nat Duca
How far are we from shipping perf-violation-style reports?

What other violations are we planning to report? Is it just csp and hpkp for now?

In general, I like the idea of launching through a format that we can iterate quickly on.

Ilya Grigorik

Nov 16, 2015, 2:19:55 PM
to Tim Dresser, Rick Byers, input-dev, Dave Tapuska, Majid Valipour, Nat Duca
On Mon, Nov 16, 2015 at 10:58 AM, Tim Dresser <tdre...@google.com> wrote:
How far are we from shipping perf-violation-style reports?

We need to flesh out the remaining bits in the reporting API spec, and then implement it. Optimistically, the former is something we could nail down this quarter, and implementation (need to find a dev owner) could be a plausible Q1 target. With that in place it's mostly a question to you guys on (a) trigger criteria, (b) type of reports you'd like to send. In fact, having some sketches of (a) and (b) would help with defining the reporting spec as well.
 
What other violations are we planning to report? Is it just csp and hpkp for now?

Yes, plus NEL, and likely some memory reports (e.g. OOMs, if we can get our hands on those).

Nat Duca

Nov 16, 2015, 2:36:43 PM
to Ilya Grigorik, Tim Dresser, Rick Byers, input-dev, Dave Tapuska, Majid Valipour
Do we restrict these timings to be available only on passive listeners? Could we?

My main concern with us exposing this on non-passive regular input events is an increase in naive RUM attempts that just add more listeners to the body ... because I think that's typically what peeps would do...

Rick Byers

Nov 16, 2015, 3:57:13 PM
to Nat Duca, Ilya Grigorik, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
On Mon, Nov 16, 2015 at 11:36 AM, Nat Duca <nd...@google.com> wrote:
Do we restrict these timings to be available only on passive listeners? Could we?

We don't (we don't have passive listeners fully implemented yet).  I'm not sure we really could - that would be bizarre if timeStamp was 0 or something for active listeners.

My main concern with us exposing this on non-passive regular input events is an increase in naive RUM attempts that just add more listeners to the body ... because I think that's typically what peeps would do...

Yes that was my concern about promoting this before passive listeners are ready.  I left a comment in my sample code warning about this, but it's probably not enough.  If we're going to publish any samples / demos, perhaps we should do so only using our passive event listeners polyfill?  That way (assuming the API doesn't change before shipping) any copy/pasted code will just get passive listeners when available?

If this seems valuable I can update my sample for this now.

Nat Duca

Nov 17, 2015, 1:03:53 PM
to Rick Byers, Ilya Grigorik, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
I guess I'm more concerned now than I was before. I think we need to be cautious here. I think the failure mode here is quite similar to the whole TheVerge thing around syncscroll --- we get one site out there that adds a listener for monitoring and then kills threading in various levels of bad ways.

On Mon, Nov 16, 2015 at 12:56 PM, Rick Byers <rby...@chromium.org> wrote:
On Mon, Nov 16, 2015 at 11:36 AM, Nat Duca <nd...@google.com> wrote:
Do we restrict these timings to be available only on passive listeners? Could we?

We don't (we don't have passive listeners fully implemented yet).  I'm not sure we really could - that would be bizarre if timeStamp was 0 or something for active listeners.

This argues more for not piggybacking on the event timestamp right? Long ago, we discussed having a separate field for latency calculation. And even in this thread, we're talking about a perf observer approach that'd give us the same things, without the footgun.

That is, even with your reply, I'm uneasy about even allowing latency calculation on non-passive events. They're coming, so why the rush? If we're rushed, go the perf observer route...

Ilya Grigorik

Nov 17, 2015, 5:44:11 PM
to Nat Duca, Rick Byers, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
On Tue, Nov 17, 2015 at 10:03 AM, Nat Duca <nd...@google.com> wrote:
That is, even with your reply, I'm uneasy about even allowing latency calculation on non-passive events. They're coming, so why the rush? If we're rushed, go the perf observer route...

Just to make sure I understand the concern.. 

- We go out and tell developers that input latency is a problem they should care about
- To figure out _if_ it's a problem developers have to install new handlers to capture the event and timestamp it at the end
-- they could define these handlers with the {passive: true} flag, but...
--- if UA doesn't support this (yet) then the listener itself creates a perf problem
--- once UA supports it, all is well.
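
One way to hedge against the unsupported case in the list above is to feature-detect before passing an options object. A minimal sketch, assuming the { passive: true } option ships as proposed: the probe object has a getter for passive, so a UA that understands the options object will read it, while a UA that treats the third argument as a boolean capture flag will not. The names supportsPassiveListeners and passiveOptions are hypothetical.

```javascript
// Returns true when `target.addEventListener` reads the `passive` option.
function supportsPassiveListeners(target) {
  let supported = false;
  const probe = {
    // A UA that understands options objects reads this property.
    get passive() {
      supported = true;
      return false;
    },
  };
  const noop = () => {};
  try {
    target.addEventListener('test', noop, probe);
    target.removeEventListener('test', noop, probe);
  } catch (err) {
    // Leave `supported` as false if the UA rejects the probe.
  }
  return supported;
}

// Third argument for addEventListener: an options object where supported,
// otherwise a plain `false` capture flag.
function passiveOptions(target) {
  return supportsPassiveListeners(target) ? { passive: true } : false;
}

// Browser-only usage; skipped when there is no DOM.
if (typeof window !== 'undefined') {
  window.addEventListener('wheel', () => {}, passiveOptions(window));
}
```

On a UA without support, the listener is still registered and is still blocking, so detection only avoids accidentally asking for a capturing listener; it does not remove the cost discussed here.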

Does that capture it? 

Curious, do we have data on how many pages register for such events already? My hunch is that most pages will have at least one - courtesy of own code, analytics code, ads code, etc - and the impact of an additional handler will be near zero with respect to where they're at currently.

On the other hand, this is also another good argument for starting with UA-initiated violation reports.. We can instrument things on our end without asking developers to add any additional handlers. 

Nat Duca

Nov 17, 2015, 6:34:15 PM
to Ilya Grigorik, Rick Byers, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
On Tue, Nov 17, 2015 at 2:43 PM, Ilya Grigorik <igri...@google.com> wrote:
On Tue, Nov 17, 2015 at 10:03 AM, Nat Duca <nd...@google.com> wrote:
That is, even with your reply, I'm uneasy about even allowing latency calculation on non-passive events. They're coming, so why the rush? If we're rushed, go the perf observer route...

Just to make sure I understand the concern.. 

- We go out and tell developers that input latency is a problem they should care about
- To figure out _if_ it's a problem developers have to install new handlers to capture the event and timestamp it at the end
-- they could define these handlers with the {passive: true} flag, but...
--- if UA doesn't support this (yet) then the listener itself creates a perf problem
--- once UA supports it, all is well.
Assuming that they don't mess it up. If they mess it up, then they've made a perf problem.

Does that capture it? 
Partly. The other thing is that we've been on a crusade for *years* to get people to *not* register these listeners. So now we're saying "now you should care about input latency, add a body touch/wheel AND mouse listener"


Curious, do we have data on how many pages register for such events already? My hunch is that most pages will have at least one - courtesy of own code, analytics code, ads code, etc - and the impact of an additional handler will be near zero with respect to where they're at currently
 
I tend to think that's not quite true. As I noted before, we've actively been discouraging people from adding mousewheel listeners, for example. We were literally checking yesterday whether Facebook needed a mousewheel listener because of these issues.


On the other hand, this is also another good argument for starting with UA-initiated violation reports.. We can instrument things on our end without asking developers to add any additional handlers. 
I don't know if we need to go so far as to UA violations.

Why not for instance allow people to subscribe to input timing events via performance observer? Only. Off by default, on when you ask for it. Simple spec, easy spec. No footguns.
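
A rough sketch of that opt-in model, using the PerformanceObserver interface. The entry type name is passed in by the caller because the thread does not define one; observeIfSupported is a hypothetical helper, and the PO parameter exists only so the logic can be exercised without a browser.

```javascript
// Subscribes to one performance entry type only when the UA advertises it.
// Returns the observer, or null when the type (or the interface) is missing.
function observeIfSupported(type, onEntries, PO = globalThis.PerformanceObserver) {
  const supported = PO && PO.supportedEntryTypes;
  if (!supported || !supported.includes(type)) {
    return null; // Off by default: nothing is registered.
  }
  const observer = new PO((list) => onEntries(list.getEntries()));
  observer.observe({ type, buffered: true });
  return observer;
}
```

Nothing is installed on the input path in either branch, which is the property being argued for: pages that never ask pay nothing, and pages that do ask receive entries asynchronously.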

Ilya Grigorik

Nov 17, 2015, 7:27:05 PM
to Nat Duca, Rick Byers, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
On Tue, Nov 17, 2015 at 3:33 PM, Nat Duca <nd...@google.com> wrote:
On Tue, Nov 17, 2015 at 2:43 PM, Ilya Grigorik <igri...@google.com> wrote:
On Tue, Nov 17, 2015 at 10:03 AM, Nat Duca <nd...@google.com> wrote:
That is, even with your reply, I'm uneasy about even allowing latency calculation on non-passive events. They're coming, so why the rush? If we're rushed, go the perf observer route...

Just to make sure I understand the concern.. 

- We go out and tell developers that input latency is a problem they should care about
- To figure out _if_ it's a problem developers have to install new handlers to capture the event and timestamp it at the end
-- they could define these handlers with the {passive: true} flag, but...
--- if UA doesn't support this (yet) then the listener itself creates a perf problem
--- once UA supports it, all is well.
Assuming that they don't mess it up. If they mess it up, then they've made a perf problem.

Does that capture it? 
Partly. The other thing is that we've been on a crusade for *years* to get people to *not* register these listeners. So now we're saying "now you should care about input latency, add a body touch/wheel AND mouse listener"

Well, at the limit I don't think our aim is to eliminate them, right? I'm still getting up to speed here, so I may be talking nonsense (wouldn't be the first time ;)), but my understanding is that the major step function happens when you add the first handler -- with that present we now have to synchronize between threads, etc. As such, any additional handlers are small incremental cost? That's not to say that we should encourage more of them, but at the same time if we accept that we'll have at least one then perhaps we're optimizing for the wrong thing?
 
The other thing to discuss here is what exactly we're telling developers to measure... Scroll latency is a tidy case where all the instrumentation lives in the browser -- we can surface that via Perf Observer with nice properties. However, I don't think that's sufficient. We're telling developers to measure "response" at a much higher level: time from input event to some _meaningful response_ within the application. The only way to do this is for the application to capture the event it cares about (click, scroll, etc), execute own logic, and then emit a custom metric at the end (e.g. as it renders the response, triggers an animation, or some such). So, if I want to do "response RUM" I'd still want to register for these events..?
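
To make the "response RUM" flow concrete, a minimal sketch under two assumptions: the event's timeStamp shares a timebase with performance.now(), and the application's response completes synchronously inside the handler. The names timeResponse, respond and report are hypothetical.

```javascript
// Runs the application's own logic for an input event, then emits a custom
// metric covering the time from the event's timeStamp to the end of that logic.
function timeResponse(event, respond, report, now = () => performance.now()) {
  respond(event); // Application logic: render, start an animation, etc.
  const entry = { type: event.type, durationMs: now() - event.timeStamp };
  report(entry); // Hand the metric to whatever RUM pipeline is in use.
  return entry;
}

// Browser-only usage; skipped when there is no DOM.
if (typeof document !== 'undefined') {
  document.addEventListener('click', (event) => {
    timeResponse(event, () => { /* app logic */ }, (entry) => console.log(entry));
  });
}
```

An application whose response only becomes visible on a later frame would need to call report from that later point instead, for example from a requestAnimationFrame callback.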
 
On the other hand, this is also another good argument for starting with UA-initiated violation reports.. We can instrument things on our end without asking developers to add any additional handlers. 
I don't know if we need to go so far as to UA violations.

Why not for instance allow people to subscribe to input timing events via performance observer? Only. Off by default, on when you ask for it. Simple spec, easy spec. No footguns.

I'm not against it! I'm just trying to wrap my head around the full "response RUM" story.. 

ig

Ojan Vafai

Nov 18, 2015, 12:16:52 PM
to Ilya Grigorik, Nat Duca, Rick Byers, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
On Tue, Nov 17, 2015 at 4:27 PM 'Ilya Grigorik' via input-dev <inpu...@chromium.org> wrote:
On Tue, Nov 17, 2015 at 3:33 PM, Nat Duca <nd...@google.com> wrote:
On Tue, Nov 17, 2015 at 2:43 PM, Ilya Grigorik <igri...@google.com> wrote:
On Tue, Nov 17, 2015 at 10:03 AM, Nat Duca <nd...@google.com> wrote:
That is, even with your reply, I'm uneasy about even allowing latency calculation on non-passive events. They're coming, so why the rush? If we're rushed, go the perf observer route...

Just to make sure I understand the concern.. 

- We go out and tell developers that input latency is a problem they should care about
- To figure out _if_ it's a problem developers have to install new handlers to capture the event and timestamp it at the end
-- they could define these handlers with the {passive: true} flag, but...
--- if UA doesn't support this (yet) then the listener itself creates a perf problem
--- once UA supports it, all is well.
Assuming that they don't mess it up. If they mess it up, then they've made a perf problem.

Does that capture it? 
Partly. The other thing is that we've been on a crusade for *years* to get people to *not* register these listeners. So now we're saying "now you should care about input latency, add a body touch/wheel AND mouse listener"

Well, at the limit I don't think our aim is to eliminate them, right? I'm still getting up to speed here, so I may be talking nonsense (wouldn't be the first time ;)), but my understanding is that the major step function happens when you add the first handler -- with that present we now have to synchronize between threads, etc. As such, any additional handlers are small incremental cost? That's not to say that we should encourage more of them, but at the same time if we accept that we'll have at least one then perhaps we're optimizing for the wrong thing?
 
The other thing to discuss here is what exactly we're telling developers to measure... Scroll latency is a tidy case where all the instrumentation lives in the browser -- we can surface that via Perf Observer with nice properties. However, I don't think that's sufficient. We're telling developers to measure "response" at a much higher level: time from input event to some _meaningful response_ within the application. The only way to do this is for the application to capture the event it cares about (click, scroll, etc), execute own logic, and then emit a custom metric at the end (e.g. as it renders the response, triggers an animation, or some such). So, if I want to do "response RUM" I'd still want to register for these events..?

There is an alternative between us providing a fixed list of tidy things and the author having to listen to specific events. We could provide an event stream API that delivered callbacks asynchronously.

Roughly something like this:
window.addEventObserver(['scroll', 'touchmove'], function(stream) {
    for (const event of stream) {
        // Process each event. These would be in the order in which they fired.
    }
});

This is not that different from passive event listeners except that it's *entirely* out of the critical path and you get a single callback for a stream of events instead of one per event. Those two things make for a much more performant, foot-gun-free API without losing any richness.
  • No relationship with non-passive listeners. They stay async and fast no matter what the rest of the page is doing.
  • Fewer callbacks. This is less raw overhead and probably faster in practice because the base overhead of whatever the listening code can be better amortized.
  • The callbacks can be delivered during idle periods.
  • No one will try to use such an API for making changes to their UI (due to the idle period thing)
The only downside to this approach that I see is that it requires more changes in existing code than passive event listeners. But I don't see the two as mutually exclusive. 
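
The batching half of this idea can be approximated in page script, though only loosely: the sketch below still registers ordinary listeners, so unlike the proposed API it is not entirely out of the critical path. It is not the proposed addEventObserver; observeEvents, the record shape, and the schedule parameter are assumptions for illustration.

```javascript
// Queues lightweight records for each event and delivers them to `callback`
// in one batch, in firing order, when the scheduler runs the flush task.
function observeEvents(target, types, callback, schedule = defaultSchedule) {
  let pending = [];
  let scheduled = false;

  const flush = () => {
    const batch = pending;
    pending = [];
    scheduled = false;
    callback(batch);
  };

  const onEvent = (event) => {
    // Copy only what is needed; the event object itself is not retained.
    pending.push({ type: event.type, timeStamp: event.timeStamp });
    if (!scheduled) {
      scheduled = true;
      schedule(flush);
    }
  };

  for (const type of types) {
    target.addEventListener(type, onEvent, { passive: true });
  }
  // Returns a function that stops observing.
  return () => {
    for (const type of types) {
      target.removeEventListener(type, onEvent, { passive: true });
    }
  };
}

// Prefer idle time where available, otherwise fall back to a timer.
function defaultSchedule(task) {
  if (typeof requestIdleCallback === 'function') requestIdleCallback(task);
  else setTimeout(task, 0);
}
```

Usage would look like observeEvents(window, ['scroll', 'touchmove'], (batch) => { ... }), with the callback receiving an array rather than one call per event.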

About the original question regarding high res timestamps. Is performance monitoring the only thing we want high res timestamps for? I thought you needed them in order to do good touchmove-based gestures (e.g. where you want to measure the velocity/acceleration of a fling). Am I wrong?

Rick Byers

Nov 18, 2015, 12:44:01 PM
to Ilya Grigorik, Nat Duca, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
On Tue, Nov 17, 2015 at 4:26 PM, 'Ilya Grigorik' via input-dev <inpu...@chromium.org> wrote:

On Tue, Nov 17, 2015 at 3:33 PM, Nat Duca <nd...@google.com> wrote:
On Tue, Nov 17, 2015 at 2:43 PM, Ilya Grigorik <igri...@google.com> wrote:
On Tue, Nov 17, 2015 at 10:03 AM, Nat Duca <nd...@google.com> wrote:
That is, even with your reply, I'm uneasy about even allowing latency calculation on non-passive events. They're coming, so why the rush? If we're rushed, go the perf observer route...

One additional data point here: this technique of measuring the timestamp already works today in Safari.  That's the one advantage it has over the perf observer route.

Just to make sure I understand the concern.. 

- We go out and tell developers that input latency is a problem they should care about
- To figure out _if_ it's a problem developers have to install new handlers to capture the event and timestamp it at the end
-- they could define these handlers with the {passive: true} flag, but...
--- if UA doesn't support this (yet) then the listener itself creates a perf problem
--- once UA supports it, all is well.
Assuming that they don't mess it up. If they mess it up, then they've made a perf problem.

Does that capture it? 
Partly. The other thing is that we've been on a crusade for *years* to get people to *not* register these listeners. So now we're saying "now you should care about input latency, add a body touch/wheel AND mouse listener"

Well, at the limit I don't think our aim is to eliminate them, right? I'm still getting up to speed here, so I may be talking nonsense (wouldn't be the first time ;)), but my understanding is that the major step function happens when you add the first handler -- with that present we now have to synchronize between threads, etc. As such, any additional handlers are small incremental cost? That's not to say that we should encourage more of them, but at the same time if we accept that we'll have at least one then perhaps we're optimizing for the wrong thing?

Yeah, we've been trying for years and only partially succeeding (maybe largely failing, judging anecdotally by the sites we measure).

I suggest that the way we couch this is "IF you have such listeners, then you should measure the performance impact they're having via the timeStamp".  I believe NewRelic is already hooking event listener addition, so this is a logical extension to what they're doing.

Rick Byers

Nov 18, 2015, 12:49:35 PM
to Ojan Vafai, Ilya Grigorik, Nat Duca, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
Yep I've debated this design at length, eg. with Elliott.  There are a lot of passive event listener scenarios where this is insufficient.  Eg. a common use of touchstart/touchmove listeners we see is hiding UI like tooltips.  You still want that to happen as soon as possible on the start of a scroll (not some point in the future).  We 

About the original question regarding high res timestamps. Is performance monitoring the only thing we want high res timestamps for? I thought you needed them in order to do good touchmove-based gestures (e.g. where you want to measure the velocity/acceleration of a fling). Am I wrong?

Yes, good point - we do indeed have scenarios where even non-passive listeners want good timestamps.  It's just not clear how important those scenarios are (been pri-2 for years). 

Ilya Grigorik

Nov 19, 2015, 2:14:43 PM
to Rick Byers, Ojan Vafai, Nat Duca, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
Should we schedule a VC to discuss this further? Say, early next week?

Lots of good ideas here, and I'm thinking a higher-bandwidth channel will help us converge faster... 

Nat Duca

Nov 19, 2015, 3:02:40 PM
to Ilya Grigorik, Rick Byers, Ojan Vafai, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
Mmm that sounds great.

Ojan Vafai

Nov 19, 2015, 5:28:16 PM
to Nat Duca, Ilya Grigorik, Rick Byers, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
SGTM. I'm happy to attend if you want my input, but I can also step out if there are too many cooks in the kitchen.

Rick Byers

Nov 19, 2015, 5:29:53 PM
to Ojan Vafai, Nat Duca, Ilya Grigorik, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
+1.

Nat/Ilya, you guys also shouldn't hesitate to brainstorm f2f on this for even higher bandwidth.  As far as I'm concerned, you're the owners here - we'll take whatever you guys decide makes sense for RUM performance monitoring generally and implement our input scenarios within it.  What matters to me is that developers have SOME way to monitor their scroll latency, and that mechanism is consistent with our larger perf monitoring efforts.

Rick 

Ilya Grigorik

Nov 19, 2015, 6:40:59 PM
to Rick Byers, Ojan Vafai, Nat Duca, Tim Dresser, input-dev, Dave Tapuska, Majid Valipour
Great. Sent out an invite for Monday.. we can move it around if that time doesn't work.