INP: Probably a bad measurement metric for web vitals - even simple HTML pages with no CSS can produce bad values.

王康文

Apr 3, 2024, 4:49:39 AM

to web-vitals-feedback

Hi there, I'm redirecting this post from web-vitals issue #455.

I'm currently working with a news publisher that gets approximately 5 million page views per day. Since the new metric replaced FID last month, we have been seeing failing INP scores for approximately 650k URLs. We are therefore studying each performance score carefully by gathering real-time field data: we send each INP score (collected with web-vitals.js) to BigQuery for analysis.
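
For readers who want to try a similar collection setup, here is a minimal sketch of turning one INP report into a row for later analysis. It is not the code used on the site described here. The `InpReport` and `InpRow` shapes, the `toRow` helper, and the `/collect` endpoint are hypothetical names chosen for illustration; only `onINP` from web-vitals and `navigator.sendBeacon` are real APIs, and the browser wiring is left as a comment because it cannot run outside a page.

```typescript
// Subset of the fields web-vitals passes to the onINP callback.
interface InpReport {
  name: string;
  value: number; // INP in milliseconds
  rating: "good" | "needs-improvement" | "poor";
  id: string;
}

// Hypothetical row shape for the analytics table.
interface InpRow {
  metric: string;
  value_ms: number;
  rating: string;
  metric_id: string;
  page: string;
  user_agent: string;
}

// Flatten one report plus page context into a row that can be sent as JSON.
function toRow(report: InpReport, page: string, userAgent: string): InpRow {
  return {
    metric: report.name,
    value_ms: Math.round(report.value),
    rating: report.rating,
    metric_id: report.id,
    page,
    user_agent: userAgent,
  };
}

// Browser wiring (assumes a hypothetical /collect endpoint that forwards
// rows to BigQuery):
//   import { onINP } from "web-vitals";
//   onINP((m) => navigator.sendBeacon("/collect",
//     JSON.stringify(toRow(m, location.pathname, navigator.userAgent))));
```

Keeping the user agent in each row is what makes a later breakdown by device possible.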

At first, we focused on third-party libraries such as GTM and GPT, which we expected to be the main cause of the problem because these libraries consume most of the main-thread work. In our initial test, removing them increased the share of good scores by 3-5%, which was a great start to the optimization journey.

But as we tried to improve our score further by changing our own code, particularly the parts that use setTimeout and event callbacks, there was no significant increase in good values.

When we took a deeper look at the interaction and user_agent fields in the field data collected from our visitors, we found that most of the bad scores come from mid-range devices (like the OPPO Reno 8 or Samsung A53).

According to the interaction logs, these bad values come from interaction targets such as a paragraph or an anchor link click. For paragraphs, we can tell that visitors scroll and stop, or select text, in those articles. For anchor links, the cause might be the GTM click event, which is common from a page analytics standpoint.
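
A breakdown like the one described above can be sketched as follows. This is an illustration only, not the actual BigQuery analysis: the `FieldRow` shape and the `percentile` and `summarize` helpers are hypothetical. It groups collected INP values by an arbitrary key (device model or interaction target) and reports the 75th percentile and the share of values at or below the 200ms "good" threshold.

```typescript
// Hypothetical shape of one collected field-data row.
interface FieldRow {
  device: string; // e.g. a model name parsed from the user agent
  target: string; // interaction target, e.g. a CSS selector
  inpMs: number;  // reported INP value in milliseconds
}

interface GroupSummary {
  count: number;
  p75: number;
  goodShare: number; // fraction of values <= 200ms
}

// Nearest-rank percentile over an unsorted list (p between 0 and 100).
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no values");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Group rows by the given key and summarize each group.
function summarize(
  rows: FieldRow[],
  key: (row: FieldRow) => string
): Record<string, GroupSummary> {
  const groups: Record<string, number[]> = Object.create(null);
  for (const row of rows) {
    const k = key(row);
    if (groups[k] === undefined) groups[k] = [];
    groups[k].push(row.inpMs);
  }
  const out: Record<string, GroupSummary> = Object.create(null);
  for (const k of Object.keys(groups)) {
    const values = groups[k];
    out[k] = {
      count: values.length,
      p75: percentile(values, 75),
      goodShare: values.filter((v) => v <= 200).length / values.length,
    };
  }
  return out;
}
```

Calling `summarize(rows, (r) => r.device)` gives a per-device view, and `summarize(rows, (r) => r.target)` gives a per-target view of the same data.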

To continue our research, we found devices similar to those in the field data and did some lab testing. We consistently ran the interaction test nearly 100 times per day but got inconsistent INP values, which might be greater than 1000ms in some cases and as low as < 50ms in others. This is very frustrating because we can't find a single target or cause that makes the INP value vary so wildly.

After many attempts and discussions, we decided to simplify the testing process. The idea is to remove all known issues such as third-party libraries and use a simple HTML page with no CSS (or, to be exact, just a few simple inline styles), a few paragraphs of lorem ipsum text, and NO interactive JavaScript besides web-vitals.js, which is there so that INP can be measured through the PerformanceEventTiming API. What we found is that on some of the mid-range devices I mentioned above, under some conditions (or rather in most of these tests), simply scrolling and selecting text WILL TRIGGER an INP score > 200ms. In our opinion, this is definitely not a great metric.
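
To make clear how a handful of event timing entries can end up as a single INP value on a page like this, here is a simplified sketch. It is an approximation, not the web-vitals.js implementation: the `EventTimingLike` type and the `estimateInp` helper are hypothetical, and the sketch counts only the interactions it is given, whereas the library also uses the browser's own interaction count. Entries are grouped by `interactionId`, the longest duration per interaction is kept, and one of the worst interactions is skipped for every 50 interactions.

```typescript
// Only the PerformanceEventTiming fields this sketch needs.
interface EventTimingLike {
  name: string;          // event type, e.g. "pointerdown"
  duration: number;      // milliseconds
  interactionId: number; // 0 when the event is not part of an interaction
}

// Estimate INP from a list of entries; returns undefined if no interaction was seen.
function estimateInp(entries: EventTimingLike[]): number | undefined {
  // Longest event duration seen for each interaction.
  const longest: Record<number, number> = {};
  for (const entry of entries) {
    if (entry.interactionId === 0) continue; // e.g. events that are not interactions
    const prev = longest[entry.interactionId];
    if (prev === undefined || entry.duration > prev) {
      longest[entry.interactionId] = entry.duration;
    }
  }
  // Interaction latencies, slowest first.
  const latencies = Object.keys(longest)
    .map((id) => longest[Number(id)])
    .sort((a, b) => b - a);
  if (latencies.length === 0) return undefined;
  // Skip one of the slowest interactions per 50 interactions.
  const skip = Math.min(latencies.length - 1, Math.floor(latencies.length / 50));
  return latencies[skip];
}

// Browser wiring (entries come from the Event Timing API):
//   new PerformanceObserver((list) => {
//     console.log(estimateInp(list.getEntries() as unknown as EventTimingLike[]));
//   }).observe({ type: "event", buffered: true, durationThreshold: 16 });
```

With only a few interactions per visit, nothing is skipped, so one slow text selection is enough to set the reported value.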

To make it clearer, here is what we do in our test:

1. Use an OPPO Reno8 device (one of the devices targeted from our field data).
2. Navigate to the test page using Chrome v123 (https://udn.com/upf/static/common/inp.html).
3. Simulate a visitor scrolling and stopping by selecting some text and moving down a bit.
4. Repeat step 3 two or more times to simulate a visitor reading an article.
5. Return to the device's home screen (this triggers a page visibility event, which causes INP to be calculated).
6. Navigate back to Google Chrome; the INP score will be shown on the page.
Most of our tests show a bad INP score. We also tested on many devices other than OPPO and Samsung, and the result was the same. To sum up our test in a few simple points:

Are mid-range devices contributing most of our bad INP scores?
Is it a good idea for this new metric to rely on the capability of the user's/visitor's device?
Even a simple interaction like selecting text can cause a bad INP score.
Please give us feedback if there are flaws in our test conditions. We would appreciate it, as we are trying our best to meet the Web Vitals criteria because our site relies heavily on a good INP score. Any different insight on this case would also be great.

Sorry for the long thread. We think many of our competitors have been struggling with this metric recently, and we hope this raises an issue about the flaws of this metric.

Thanks.

Michal Mocny

Apr 5, 2024, 7:18:16 AM

to 王康文, web-vitals-feedback

We checked the field data we have for that device (OPPO Reno8) and wanted to share this high level summary with you: we see that over 90% of user interactions measured (across all sites) are faster than the 200ms "good" threshold for INP.

(This isn't dismissing the issue you raised with inconsistently reported durations for the flow you described.)

Michal Mocny

Apr 5, 2024, 7:18:20 AM

to 王康文, web-vitals-feedback

...in the meantime, if you would be able to share traces with us, then I can also take a look to better diagnose the specific reason for the long reported durations. A DevTools performance profile will be sufficient, but even better would be to enable the DevTools experiment "Timeline: Show all events". You can upload the trace to https://trace.cafe/ if it helps with sharing.

Thanks.

Michal Mocny

Apr 5, 2024, 7:18:26 AM

to 王康文, web-vitals-feedback

Thank you for this report! I appreciate the long and detailed thread here.

I will try to reproduce this example on a similar device. We are aware of a couple of issues where text highlighting on lower-end Android devices might be measuring excessive durations, especially on mostly static pages, and I would like to see if these examples are affected by this.

I will see if I can reproduce the use case you mention with the test page you shared, and will report back.

王康文

Apr 8, 2024, 3:10:01 AM

to Michal Mocny, web-vitals-feedback

Hi Michal

Here are traces with and without the experimental feature enabled:

1. Off: https://trace.cafe/t/mrQoE3egbH
2. On: https://trace.cafe/t/MUUO3XrUg3

Tested using an OPPO Reno8 (CPH2359).

In the above test, it seems that turning the "Show all events" feature on causes a higher INP, but I'm not sure about it.

Thanks

王康文

Apr 17, 2024, 6:43:07 AM

to web-vitals-feedback

Hi Michal

Any update on this?

From our tests, INP seems to be higher on mid-range and low-end devices. Are these devices the main contributor to bad INP scores overall?

Thanks

ethan silverman

Jul 1, 2024, 8:19:34 AM

to web-vitals-feedback

It's been almost two months without any reply. @王康文 your theory seems to be true! I also notice that with GTM there are odd values returned for click events etc. The values show low in the extension and then high in the real metric. Lab scores are 100 on mobile, yet a simple interaction on the page like a button click takes 300ms+? Strange indeed.
