INP: Probably a bad measurement metric for Web Vitals - even simple HTML pages with no CSS produce bad values

王康文

Apr 3, 2024, 4:49:39 AM
to web-vitals-feedback
Hi there, I'm reposting this from web-vitals issue #455.

I'm currently working with a news publisher that serves approximately 5 million page views per day. Since INP replaced FID last month, we have been seeing failing INP scores for approximately 650k URLs. So we are carefully studying our performance by gathering real-time field data, sending each INP score (using web-vitals.js) to BigQuery for analysis.
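
A minimal sketch of the collection step described above. The post does not say how the data reaches BigQuery, so the `/collect` endpoint, the `InpSample` and `InpRow` shapes, and `buildInpRow` are all hypothetical; only `onINP` (web-vitals.js) and `navigator.sendBeacon` are real APIs, and that wiring is left in comments so the block also runs outside a browser.

```typescript
// Subset of the fields found on a web-vitals INP metric object.
// (The real object carries more fields; this subset is an assumption.)
interface InpSample {
  name: string;   // "INP"
  value: number;  // interaction latency in milliseconds
  rating: string; // "good" | "needs-improvement" | "poor"
  id: string;     // unique id for this page load
}

// Hypothetical row layout for the analytics table.
interface InpRow {
  metric: string;
  value_ms: number;
  rating: string;
  page_load_id: string;
  user_agent: string;
  url: string;
}

// Pure helper: flatten a metric into a row that is easy to query later.
function buildInpRow(sample: InpSample, userAgent: string, url: string): InpRow {
  return {
    metric: sample.name,
    value_ms: Math.round(sample.value),
    rating: sample.rating,
    page_load_id: sample.id,
    user_agent: userAgent,
    url: url,
  };
}

// Browser wiring (not executed here; "/collect" is a placeholder endpoint):
//   import { onINP } from 'web-vitals';
//   onINP((metric) => {
//     const row = buildInpRow(metric, navigator.userAgent, location.href);
//     navigator.sendBeacon('/collect', JSON.stringify(row));
//   });

// Made-up sample values, for illustration only.
const exampleRow = buildInpRow(
  { name: 'INP', value: 231.6, rating: 'needs-improvement', id: 'v3-123' },
  'ExampleUA/1.0',
  'https://example.com/article',
);
console.log(exampleRow.value_ms, exampleRow.rating);
```

Keeping the row builder separate from the browser callback makes it possible to check the payload shape without a device in hand.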

At first, we focused on third-party libraries like GTM and GPT, which we expected to be the main cause of the problem because these libraries consume most of the main thread's time. In our initial test, removing those libraries increased the share of good scores by 3-5%, which was a great start to the optimization journey.

But as we continued trying to reduce our INP further by changing our own code, particularly the parts using setTimeout and event callbacks, we saw no significant increase in good values.

When we took a deeper look at the interaction and user_agent fields in the field data collected from our visitors, we found that most of the bad scores come from mid-range devices (like the OPPO Reno 8 or Samsung A53).
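
One way to do a per-device breakdown like the one described above, as a sketch. The `FieldSample` shape and `summarizeByDevice` are hypothetical names, and how the device name is derived from the user agent is left out. The 200 ms "good" threshold and the 75th percentile follow how Core Web Vitals are assessed; the percentile uses a simple nearest-rank method, which may differ slightly from what BigQuery or CrUX compute.

```typescript
interface FieldSample {
  device: string; // e.g. a model name derived from the user agent
  inpMs: number;  // INP value reported for one page view
}

interface DeviceSummary {
  device: string;
  samples: number;
  p75Ms: number;     // nearest-rank 75th percentile
  goodShare: number; // fraction of samples at or below 200 ms
}

const GOOD_INP_MS = 200;

// Nearest-rank percentile: smallest value with at least p% of samples at or below it.
function nearestRankPercentile(sortedAsc: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  return sortedAsc[Math.max(rank, 1) - 1];
}

function summarizeByDevice(samples: FieldSample[]): DeviceSummary[] {
  // Collect INP values per device name.
  const byDevice: Record<string, number[]> = {};
  for (const s of samples) {
    if (byDevice[s.device] === undefined) byDevice[s.device] = [];
    byDevice[s.device].push(s.inpMs);
  }
  const summaries = Object.keys(byDevice).map((device) => {
    const sorted = byDevice[device].slice().sort((a, b) => a - b);
    const good = sorted.filter((v) => v <= GOOD_INP_MS).length;
    return {
      device: device,
      samples: sorted.length,
      p75Ms: nearestRankPercentile(sorted, 75),
      goodShare: good / sorted.length,
    };
  });
  // Worst devices (highest p75) first.
  return summaries.sort((a, b) => b.p75Ms - a.p75Ms);
}
```

Sorting by p75 rather than by average keeps a handful of extreme outliers from dominating the ranking.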

From the interaction logs, these bad values actually come from interaction targets such as a paragraph or an anchor link click. For paragraphs, we can infer that visitors scroll and stop, or select text in the article. For anchor links, the cause might be the GTM click event, which is common from a page analytics standpoint.

To continue our research, we found devices similar to those in the field data and did some lab testing. We ran the interaction test nearly 100 times per day but got inconsistent INP values, which were greater than 1000ms in some cases and lower than 50ms in others. This was very frustrating because we couldn't find a single target or cause that would make the INP value vary so wildly.

After many attempts and discussions, we decided to simplify the testing process. The idea was to remove all known issues like third-party libraries and use a simple HTML page with no CSS (or, to be exact, just simple inline styles), a few paragraphs of lorem ipsum text, and NO interactive JavaScript besides web-vitals.js, which we need in order to read INP from the PerformanceEventTiming API. What we found was that on the mid-range devices mentioned above, under some conditions (or rather, in most of our tests), simply scrolling and selecting text WILL TRIGGER an INP score > 200ms. In our opinion, this is definitely not a great metric.
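
For reference, a sketch of how a page-level INP value is picked from individual interaction latencies and then rated. It follows the publicly documented definition (the worst interaction, skipping one highest value per 50 interactions; thresholds of 200 ms and 500 ms), but `estimateInp` and `rateInp` are illustrative names, not functions from web-vitals.js. It shows why a single slow text selection on an otherwise static page is enough to push the whole page view past 200 ms.

```typescript
type InpRating = 'good' | 'needs-improvement' | 'poor';

// Pick the page-level INP from per-interaction latencies (ms).
// One highest value is skipped for every 50 interactions on the page.
function estimateInp(latenciesMs: number[]): number | undefined {
  if (latenciesMs.length === 0) return undefined; // no interactions, no INP
  const sortedDesc = latenciesMs.slice().sort((a, b) => b - a);
  const skip = Math.min(sortedDesc.length - 1, Math.floor(sortedDesc.length / 50));
  return sortedDesc[skip];
}

function rateInp(valueMs: number): InpRating {
  if (valueMs <= 200) return 'good';
  if (valueMs <= 500) return 'needs-improvement';
  return 'poor';
}

// Made-up session: three quick taps and one slow text selection.
const sessionInp = estimateInp([24, 40, 32, 312]);
console.log(sessionInp, sessionInp !== undefined ? rateInp(sessionInp) : 'n/a');
```

With fewer than 50 interactions nothing is skipped, so on a short article visit the single slowest interaction is the reported value.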

To make it clearer, here is what we did in our test:

1. Use an OPPO Reno8 device (a targeted device in our field data).
2. Navigate to the test page using Chrome v123 (https://udn.com/upf/static/common/inp.html).
3. Simulate a visitor scrolling, stopping to select some text, and moving down a bit.
4. Repeat step 3 two or more times to simulate a visitor reading an article.
5. Return to the device's home screen (which triggers a page visibility event that causes INP to be calculated).
6. Navigate back to Google Chrome; the INP score will be shown on the page.
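
The flow above relies on Event Timing entries being collected during the scroll-and-select interactions and reported once the page is hidden. A rough sketch of the grouping step, assuming entries shaped like `PerformanceEventTiming` (`name`, `duration`, `interactionId`); `EventEntryLike` and `latencyPerInteraction` are hypothetical names. Logging per-entry data like this on the test page may help show which event (for example `pointerdown` versus `click`) carries the long duration.

```typescript
// Minimal subset of PerformanceEventTiming used here.
interface EventEntryLike {
  name: string;          // event type, e.g. "pointerdown", "click"
  duration: number;      // ms; browsers round this to the nearest 8 ms
  interactionId: number; // 0 when the event is not part of an interaction
}

// An interaction's latency is the longest duration among its events.
function latencyPerInteraction(entries: EventEntryLike[]): Record<number, number> {
  const worst: Record<number, number> = {};
  for (const e of entries) {
    if (!e.interactionId) continue; // not a tap, click, or key press
    const previous = worst[e.interactionId];
    worst[e.interactionId] =
      previous === undefined ? e.duration : Math.max(previous, e.duration);
  }
  return worst;
}

// Browser wiring (comments only, so the block also runs outside a browser):
//   const seen: EventEntryLike[] = [];
//   new PerformanceObserver((list) => {
//     for (const e of list.getEntries()) seen.push(e as unknown as EventEntryLike);
//   }).observe({ type: 'event', durationThreshold: 16, buffered: true });
//   document.addEventListener('visibilitychange', () => {
//     if (document.visibilityState === 'hidden') {
//       console.log(latencyPerInteraction(seen));
//     }
//   });
```
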

Most of our tests showed a bad INP score. We also tested on many devices other than OPPO and Samsung, and the result was the same. So, to summarize our test:

1. Are mid-range devices contributing most of our bad INP scores?
2. Is it a good idea for this new metric to depend on the capability of the user's/visitor's device?
3. Even a simple interaction like selecting text can cause a bad INP score.

Please give us feedback if there are any flaws in our test conditions. We would appreciate it, as we are trying our best to meet the Web Vitals criteria, because our site relies heavily on a good INP score. Any different insight on this case would also be great.

Sorry for the long thread. We think many of our competitors have been struggling with this metric recently, and we hope this raises the issue of the flaws in it.

Thanks.

Michal Mocny

Apr 5, 2024, 7:18:16 AM
to 王康文, web-vitals-feedback
We checked the field data we have for that device (OPPO Reno8) and wanted to share this high-level summary with you: we see that over 90% of user interactions measured (across all sites) are faster than the 200ms "good" threshold for INP.

(This isn't dismissing the issue you raised with inconsistently reported durations for the flow you described.)


Michal Mocny

Apr 5, 2024, 7:18:20 AM
to 王康文, web-vitals-feedback
...in the meantime, if you would be able to share traces with us, then I can also take a look to better diagnose the specific reason for the long reported durations. A DevTools performance profile will be sufficient, but even better would be to enable the DevTools experiment "Timeline: Show all events". You can upload the trace to https://trace.cafe/ if it helps with sharing.

Thanks.


Michal Mocny

Apr 5, 2024, 7:18:26 AM
to 王康文, web-vitals-feedback
Thank you for this report! I appreciate the long and detailed thread here.

I will try to reproduce this example on a similar device. We are aware of a couple of issues where text highlighting on lower-end Android devices might be measuring excessive durations, especially on mostly static pages, and I would like to see if these examples are affected by this.

I will see if I can reproduce the use case you mention with the test page you shared, and will report back.



王康文

Apr 8, 2024, 3:10:01 AM
to Michal Mocny, web-vitals-feedback
Hi Michal

Here are traces with and without the experimental feature on:

1. Off: https://trace.cafe/t/mrQoE3egbH
2. On: https://trace.cafe/t/MUUO3XrUg3

Tested using OPPO Reno8 (CPH2359)

In the above test, it seems that turning on the "Show all events" feature causes a higher INP, but I'm not sure about that.


Thanks

王康文

Apr 17, 2024, 6:43:07 AM
to web-vitals-feedback
Hi Michal

Any update on this?

From our tests, INP seems to be higher on mid-range and low-end devices. Are these devices the main contributor to bad INP scores overall?

Thanks 

ethan silverman

Jul 1, 2024, 8:19:34 AM
to web-vitals-feedback
It's been almost two months without any reply. @王康文 your theory seems to be true! I've also noticed that with GTM, odd values are returned for click events etc. The values show low in the extension and then high in the real metric. Lab scores are 100 on mobile, yet simple interactions on the page like a button click take 300ms+? Strange indeed.

Michal Mocny

Jul 3, 2024, 8:16:49 AM
to ethan silverman, web-vitals-feedback
By "low on extension" I think you mean that when you personally test on your own device, the page appears fast? Are you using a desktop device which is significantly faster than the target field device (perhaps low-mid tier mobile)?

Also, when you test lab scores (with something like Lighthouse?), this does not typically simulate interactions at all, so there is typically no INP being tested. Even if you do set up interaction automation, some libraries will selectively load content only for real user devices.

In other words, it is very much expected that field data differs significantly (and you can try to reproduce it by carefully reproducing user journeys in the same environments and on similar devices).

王康文

Jul 5, 2024, 5:10:58 AM
to web-vitals-feedback
Hi all,

Here's my update on this: our field data seems to have turned green around the end of May or early June. The timing differs slightly between some of our sites, but they all turned green at roughly the same time. We haven't done anything special; we've simply kept following the INP best practices as before. At most, we've been turning the content-visibility CSS property on and off to test whether it has any impact on the score. Between March and June, our sites' codebase has otherwise remained essentially the same. Our guesses about what happened are:

1. Google may have changed their metric standards around this time, which matches our impacted users.
2. Users of our targeted field devices may have updated their devices or Chrome versions; older OS or app versions could have been contributing to the bad scores.
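
The second guess could be checked against collected field data by splitting samples by Chrome major version, as in the sketch below. `chromeMajorVersion`, `VersionedSample`, and `goodShareByChromeVersion` are hypothetical names, and the regex also matches other Chromium-based browsers whose user agent contains `Chrome/`, so this is only a starting point.

```typescript
// Extract the Chrome major version from a user agent string, if present.
function chromeMajorVersion(userAgent: string): number | undefined {
  const match = /Chrome\/(\d+)\./.exec(userAgent);
  return match ? Number(match[1]) : undefined;
}

interface VersionedSample {
  userAgent: string;
  inpMs: number; // INP value reported for one page view
}

// Share of samples with INP at or below 200 ms, keyed by Chrome major version.
function goodShareByChromeVersion(samples: VersionedSample[]): Record<number, number> {
  const totals: Record<number, { good: number; all: number }> = {};
  for (const s of samples) {
    const major = chromeMajorVersion(s.userAgent);
    if (major === undefined) continue; // no Chrome token in this user agent
    if (totals[major] === undefined) totals[major] = { good: 0, all: 0 };
    totals[major].all += 1;
    if (s.inpMs <= 200) totals[major].good += 1;
  }
  const shares: Record<number, number> = {};
  for (const key of Object.keys(totals)) {
    const major = Number(key);
    shares[major] = totals[major].good / totals[major].all;
  }
  return shares;
}
```

If the good share rises with the version number while the site code stays fixed, that would point toward browser-side changes rather than anything done on the site.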


Michal Mocny

Jul 5, 2024, 8:48:46 AM
to 王康文, web-vitals-feedback
The Chrome team has been optimising Chrome continually to help improve overall web performance on this metric.

The speed metrics team is also making some changes to the metric itself to reduce some measurements that we agree are not essential for representing UX responsiveness, yet where the INP value might have been reported as slow.

We update change logs in Chromium and CrUX release announcements, where you can read about the largest changes. Many other small optimisations are not summarised.

Generally, in the last 6 months we've seen massive shifts in improved performance of the web thanks to a combination of these factors, together with developer improvements.

Aditya Arora

Jul 10, 2024, 10:33:50 AM
to web-vitals-feedback
Guys, we are facing the same problem. Clicking on text boxes where there shouldn't be any feedback to the customer causes high INP values on mid-range devices. Our field data shows better results, but Google Search Console shows worse values. Has anyone solved it? Is Chrome optimizing this metric for better results?