CrUX API Response Stripping Parameters from URI

Ryan Siddle

Dec 10, 2020, 5:07:07 AM

to Chrome UX Report (Discussions)

Hi all,

We're finding that the CrUX API appears to treat URIs that differ only in their parameters as duplicate content, returning a "normalized" URI in the response. The two examples below are prominent sites whose pages still rely on URI parameters and also rank in Google search results. These pages may use different underlying templates depending on a parameter value, so the reading would be misleading if the normalized URI corresponds to only one of many templates. I could be mistaken in my logic, though. Any feedback or comments would be appreciated.

curl "https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=$GOOGLE_CRUX_API_SECRET_KEY" --header 'Content-Type: application/json' --data '{"url": "https://store.hp.com/UKStore/Merch/List.aspx?sel=PRN&ctrl=f&fc_seg_bus=1"}'

curl "https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=$GOOGLE_CRUX_API_SECRET_KEY" --header 'Content-Type: application/json' --data '{"url": "https://www.booking.com/searchresults.html?city=436970;dest_id=436970;dest_type=city;offset=330"}'

Best regards,

Ryan

Rick Viscomi

Dec 11, 2020, 1:46:49 PM

to Chrome UX Report (Discussions), ryan....@merj.com

Hi Ryan,

The normalized URL data returned by the CrUX API includes all user experiences under that base URL, for all parameter values. In other words, URLs that differ only in their parameter values are treated as a single page by CrUX, and all of their UX data is aggregated together. The results are not limited to a single parameterized URL.
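As a rough illustration of the normalization described here, the sketch below drops the query string and fragment from a URL. The function name `normalize_url` is hypothetical, and the assumption that normalization amounts to exactly this is mine; the precise rules CrUX applies are not spelled out in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return the URL without its query string or fragment.

    Assumption: this approximates the "normalized" URL that the
    CrUX API reports; the real rules may differ.
    """
    parts = urlsplit(url)
    # Keep scheme, host, and path; blank out the query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

Under that assumption, both example URLs from the original post collapse to their base pages, for instance `https://store.hp.com/UKStore/Merch/List.aspx`.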

Hope that helps,

Rick

Ryan Siddle

Dec 14, 2020, 11:11:10 AM

to Chrome UX Report (Discussions), Rick Viscomi, Ryan Siddle

Hi Rick,

Thanks for confirming. Is the reason behind stripping parameters for security? There are still some sites that use dynamic URLs instead of static URLs.

On a second note, how is CrUX data handled for stateful parts of a site, such as checkout flows? For example, I imagine Apple's checkout flow is quite busy most days of the year, yet there is no CrUX data for it:

curl -X POST "https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=$GOOGLE_CRUX_API_SECRET_KEY" --header 'Content-Type: application/json' --data '{"url": "https://www.apple.com/shop/bag"}'

{
  "error": {
    "code": 404,
    "message": "chrome ux report data not found",
    "status": "NOT_FOUND"
  }
}
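
For anyone scripting these lookups, here is a minimal Python sketch of the same request using only the standard library. The endpoint and the JSON body come from the curl command above; the helper names `build_crux_request` and `query_crux`, and the choice to map a 404 to `None`, are assumptions made for illustration.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

CRUX_ENDPOINT = "https://chromeuxreport.googleapis.com/v1/records:queryRecord"


def build_crux_request(page_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    # json.dumps takes care of escaping, so the URL is passed as-is.
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        f"{CRUX_ENDPOINT}?key={urllib.parse.quote(api_key)}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_crux(page_url: str, api_key: str):
    """Return the parsed record, or None when the API answers 404."""
    try:
        with urllib.request.urlopen(build_crux_request(page_url, api_key)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Matches the NOT_FOUND response shown above.
            return None
        raise
```

With a valid key, `query_crux("https://www.apple.com/shop/bag", key)` would be expected to return `None`, given the 404 response above.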

Best regards,

Ryan

Rick Viscomi

Dec 15, 2020, 1:30:28 PM

to Chrome UX Report (Discussions), ryan....@merj.com, Rick Viscomi

Hi Ryan,

It's a methodological choice for consistency. For example, URLs can contain lots of inconsequential parameters for marketing or page state. If we keyed on those, there would be multiple fragmented "versions" of the page, each one needing to meet the popularity threshold to be included in the dataset. Normalizing to URLs without parameters solves this general problem at the expense of losing granularity for the edge case of pages that use parameters to distinguish themselves.
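
To make the fragmentation point concrete, here is a small sketch with made-up numbers. The threshold value, the page-load counts, and the function names are all hypothetical; the actual popularity threshold is not given in this thread.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical threshold, chosen only to make the example work.
HYPOTHETICAL_THRESHOLD = 1000


def strip_query(url: str) -> str:
    """Key a URL by scheme, host, and path only."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


def eligible_keys(page_loads, key_fn, threshold=HYPOTHETICAL_THRESHOLD):
    """Sum page loads per key and keep the keys that meet the threshold."""
    totals = Counter()
    for url, loads in page_loads.items():
        totals[key_fn(url)] += loads
    return {key: total for key, total in totals.items() if total >= threshold}


# Made-up traffic: one page reached through three marketing parameters.
loads = {
    "https://example.com/list?utm_source=a": 400,
    "https://example.com/list?utm_source=b": 400,
    "https://example.com/list?utm_source=c": 400,
}

keyed_on_full_url = eligible_keys(loads, lambda url: url)
keyed_on_normalized = eligible_keys(loads, strip_query)
```

Keyed on the full URL, none of the three variants reaches the hypothetical threshold, so `keyed_on_full_url` is empty. Keyed on the normalized URL, the page totals 1200 loads and qualifies.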

Note that this particular checkout page explicitly declares itself to be `noindex`.

Rick

Ryan Siddle

Dec 17, 2020, 6:58:43 AM

to Chrome UX Report (Discussions), Rick Viscomi, Ryan Siddle

Hi Rick,

Thanks for the feedback on the parameter normalization.

I noticed that the page was noindex, but was unsure whether that alone would explain the missing data. Is it safe to assume that noindex pages are excluded from the monthly origin data in BigQuery?

An announcement from the Google Search team mentioned a Google algorithm update, expected in May 2021, that introduces CWV as a weighting factor. I'm not clear whether CWV for noindex pages would be a contributing factor, given that their data is not available through the CrUX API. If the ranking algorithm does factor in origin- and URL-level CWV metrics, developers would find it harder to act on them without being able to collect CWV metrics for noindex pages.

Best regards,

Ryan

Josh Leigh

Mar 2, 2021, 1:11:43 PM

to Chrome UX Report (Discussions), ryan....@merj.com, Rick Viscomi

Hi Rick,

Do you have any clarification on how noindex pages impact CWV? John Mueller, on the Search advocate team, states "Core Web Vitals will count for noindexed pages and things blocked in robots.txt. It’s quite interesting because obviously Search Console is aggregated, you get a group of pages… How do you understand that this is a group if these are noindexed and you’ve not got the context? Or is it just based on URL path" but a lot of your prior comments from late 2018 and early 2019 state that noindex pages won't be included in CWV data.

Thanks,

Josh

Rick Viscomi

Apr 20, 2021, 3:46:45 PM

to Chrome UX Report (Discussions), Josh Leigh, ryan....@merj.com, Rick Viscomi, John Mueller

Hi Josh,

I can elaborate on how noindex pages are represented in the CrUX dataset, without getting into Search specifics.

A page will not be available at the URL level in tools like PSI and CrUX API if it's set to noindex.

An origin will not be available in tools like PSI, CrUX API, and CrUX BigQuery/Dashboard if its root page is set to noindex. However, when an origin's root is indexable, the origin-level aggregation can include user experiences from any page on the origin, regardless of their individual noindex states.
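
The two rules above can be sketched as a pair of small functions. The `Page` type, the function names, and the experience counts are hypothetical, and popularity thresholds are deliberately ignored; this only models the noindex conditions as stated.

```python
from dataclasses import dataclass


@dataclass
class Page:
    path: str
    noindex: bool
    experiences: int  # hypothetical count of user experiences


def url_level_candidates(pages):
    """Paths that are not ruled out of URL-level data by noindex."""
    return [page.path for page in pages if not page.noindex]


def origin_level_experiences(pages, root_noindex: bool):
    """Total experiences in the origin aggregate, or None if the origin is excluded."""
    if root_noindex:
        # A noindex root page excludes the whole origin.
        return None
    # Otherwise every page can contribute, noindex or not.
    return sum(page.experiences for page in pages)


pages = [
    Page("/", noindex=False, experiences=100),
    Page("/shop/bag", noindex=True, experiences=50),
]
```

Here `/shop/bag` is excluded at the URL level, yet its 50 experiences still count toward the origin total of 150 as long as the root page is indexable.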

Rick
