CrUX eligibility - publicly discoverable


psabha...@gmail.com

Apr 25, 2024, 9:37:17 PM
to Chrome UX Report (Discussions)
Hi,
I read https://developer.chrome.com/docs/crux/methodology/ and have a question about how the publicly discoverable eligibility is checked. Is it checked by the Googlebot crawler? In other words, if I remove <meta name="robots" content="noindex"> from one of my pages, would it not meet the eligibility criteria until Googlebot crawls the page again and sees that it can be indexed? (That may happen a long time after the change was made.)

Barry Pollard

Apr 29, 2024, 3:15:28 AM
to Chrome UX Report (Discussions), psabha...@gmail.com
Yes, that's correct. We based the public discoverability eligibility on the same system that Search uses. I'm not sure how long it takes to become discoverable again after removing a noindex meta tag.
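
As a rough illustration of the on-page signals involved, here is a minimal TypeScript sketch (assuming Node 18+ for the global fetch; the function name and URL are made up for this example) that checks the two common noindex signals a crawler would see. It only covers the page's own signals, not whatever the Search-side infrastructure actually evaluates:

```ts
// Minimal sketch: check the two common noindex signals on a page.
// Assumes Node 18+ (global fetch). The URL below is a placeholder.
async function hasNoindexSignal(url: string): Promise<boolean> {
  const res = await fetch(url);

  // Signal 1: an X-Robots-Tag response header, e.g. "noindex, nofollow".
  const xRobots = res.headers.get("x-robots-tag") ?? "";
  if (/\bnoindex\b/i.test(xRobots)) return true;

  // Signal 2: <meta name="robots" content="... noindex ..."> in the HTML.
  // A real check would parse the DOM (and handle attribute order);
  // a regex is enough for a sketch.
  const html = await res.text();
  const meta = /<meta[^>]*name=["']robots["'][^>]*content=["']([^"']*)["']/i.exec(html);
  return meta !== null && /\bnoindex\b/i.test(meta[1]);
}

// Usage (placeholder URL):
// hasNoindexSignal("https://example.com/some-page").then(console.log);
```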

Dave Smart

Apr 29, 2024, 4:32:27 AM
to Chrome UX Report (Discussions), barryp...@google.com, psabha...@gmail.com
This raises an interesting point or two I hadn't considered. I always assumed you just looked for the same indexability criteria, but independently. But does a page have to actually be indexed by Google to appear in CrUX?

Also, one point I am perhaps not too clear on: are pages blocked by robots.txt included in CrUX? A page blocked by robots.txt can be "indexed" even if it has a noindex tag or returns a non-200 status, because Googlebot can't crawl it to see that. Admittedly, for Search that indexing is based only on references to the URL, but the page is technically indexable.

Dave Smart

Apr 29, 2024, 4:34:02 AM
to Chrome UX Report (Discussions), Dave Smart, barryp...@google.com, psabha...@gmail.com
But does a page have to actually be indexed by Google to appear in CrUX?
Sorry, meant to add: as opposed to just being crawled and seen to be indexable, even if Google then chooses not to index that particular URL.

Barry Pollard

Apr 29, 2024, 4:51:38 AM
to Dave Smart, Chrome UX Report (Discussions), psabha...@gmail.com
Well there’s a few things to keep in mind:

For a start, we only surface URL-level data in CrUX if the URL is "publicly discoverable" (as defined by the common infrastructure also used by Search). I'm not sure what happens if the crawler is blocked by robots.txt but the page is indexable. That's definitely getting outside CrUX's area of expertise (and hence my knowledge). Personally, I would advise avoiding ambiguity like that.

Origin-level data is eligible if the root page (i.e. the / page) is publicly discoverable. In that case, as per our docs (https://developer.chrome.com/docs/crux/methodology#origin-eligibility), ALL pages under that origin are included in that origin-level data, including noindexed pages. This is something that often surprises people.

In both cases the page(s) have to be sufficiently popular (https://developer.chrome.com/docs/crux/methodology#popularity-eligibility), as defined by measurable users, to appear in CrUX.

So, other than the "public discoverability" criterion there is no other direct link with whether a page appears in Search. So yes, a page that is indexable but not displayed in Search could have CrUX data if it is sufficiently popular. We see this when a page exists in multiple versions (e.g. with and without a trailing slash and no redirect): CrUX can show data for both, but Search (I presume) picks one based on various signals.
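
As a hedged sketch of how you could see this yourself via the public CrUX API, the following queries an origin-level record and both trailing-slash variants of a page. CRUX_API_KEY and the example URLs are placeholders, and a 404 response is how the API reports that a record doesn't exist:

```ts
// Sketch: query the CrUX API for an origin-level record and for both
// trailing-slash variants of a page. CRUX_API_KEY and the URLs are
// placeholders; assumes Node 18+ for the global fetch.
const ENDPOINT = "https://chromeuxreport.googleapis.com/v1/records:queryRecord";

async function queryCrux(key: { url: string } | { origin: string }) {
  const res = await fetch(`${ENDPOINT}?key=${process.env.CRUX_API_KEY}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(key),
  });
  // The API returns 404 when a record doesn't meet the eligibility bar.
  return res.ok ? await res.json() : null;
}

async function main() {
  // Origin-level: eligible if the root page is publicly discoverable,
  // and it aggregates ALL pages on the origin, noindexed ones included.
  const origin = await queryCrux({ origin: "https://example.com" });

  // Page-level: both variants can have their own records if each is
  // sufficiently popular, even though Search picks one canonical.
  const noSlash = await queryCrux({ url: "https://example.com/page1" });
  const withSlash = await queryCrux({ url: "https://example.com/page1/" });

  console.log({ origin: !!origin, noSlash: !!noSlash, withSlash: !!withSlash });
}

main();
```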

However, I have no idea how that applies to the page grouping used in the Core Web Vitals report in Search Console, or whether indexable pages not chosen for SERPs will appear in that report. You'd need to ask the Search team about that as they own that report.

Thanks,
Barry

Dave Smart

Apr 29, 2024, 5:18:14 AM
to Barry Pollard, Chrome UX Report (Discussions), psabha...@gmail.com
Thanks, Barry, for the clarifications! The multiple-versions example answers my other question about what happens when Google de-dupes, i.e. when one URL is considered canonical to another.

I'll definitely follow up with the Search team too.

Barry Pollard

Apr 29, 2024, 5:28:57 AM
to Dave Smart, Chrome UX Report (Discussions), psabha...@gmail.com
One more difference worth pointing out is that CrUX strips URL params (https://developer.chrome.com/docs/crux/methodology#page-eligibility). This helps pages meet the sufficiently-popular eligibility criterion for CrUX more often. In many cases URL params are not relevant, so the pages can be considered the same, but in some cases (e.g. a product id in the URL) this can result in different content being delivered. CrUX does not try to differentiate these cases and just logs them against the base URL without URL params.

We have talked about whether the trailing slash should be treated the same way and those variants grouped together, but currently we don't. This means you can get different CrUX data for https://example.com/page1 and https://example.com/page1/ if they don't redirect, despite seeming to be the same page.
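
A minimal TypeScript sketch of that normalization, assuming only what's described above (this is illustrative, not CrUX's actual code, and the URLs are placeholders):

```ts
// Sketch of the page-key normalization described above: the query
// string is dropped, but the trailing slash is kept, so /page1 and
// /page1/ remain distinct keys.
function cruxPageKey(raw: string): string {
  const u = new URL(raw);
  u.search = ""; // strip params, e.g. a product id like ?id=42
  u.hash = "";   // fragments never reach the server anyway
  return u.toString();
}

console.log(cruxPageKey("https://example.com/product?id=42"));
// -> "https://example.com/product"
console.log(cruxPageKey("https://example.com/page1/"));
// -> "https://example.com/page1/" (distinct from /page1: no slash grouping)
```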

Dave Smart

Apr 29, 2024, 5:45:07 AM
to Chrome UX Report (Discussions), barryp...@google.com, psabha...@gmail.com, Dave Smart
Thanks too for that follow up!

Yeah, definitely understood re the param stripping.

The blocked-by-robots.txt one is interesting to me. Since robots.txt is the only real method to control crawling, it's neither uncommon nor bad practice for folks to block things like filtered/faceted navigation on ecommerce sites to prevent wasted crawling (noindex etc. still requires the page to be crawled to be seen).

These may or may not be params, but very often are, e.g. example.com/shoes?colour=red&size=10, example.com/shoes?colour=blue&size=12, and so on. So if robots.txt isn't stopping CrUX eligibility, site owners might not know that example.com/shoes includes visits to those filtered variants. Often these aren't as well cached and primed, so they may actually be performing worse; or even the other way around, performing better as there's less product to return.
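
For illustration, a robots.txt rule of the kind described above might look like this (an assumed example, not a recommendation; real faceted-navigation rules vary per site):

```
# Illustrative robots.txt: block crawling of the filtered/faceted
# variants while leaving the base listing page crawlable.
User-agent: *
Disallow: /shoes?
```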

Definitely seems like a good question for the Search team to see if there's some guidance on how they report eligibility to CrUX in those circumstances!

Barry Pollard

Apr 29, 2024, 6:01:16 AM
to Dave Smart, Chrome UX Report (Discussions), psabha...@gmail.com
While I agree it's useful to understand what the CrUX data is showing you, CrUX aims to report performance data as experienced by actual users. So if those variants are slower, but are visited by enough users to influence the overall page performance metrics, then it's right that they should be included, IMHO.

Dave Smart

Apr 29, 2024, 6:29:06 AM
to Chrome UX Report (Discussions), barryp...@google.com, psabha...@gmail.com, Dave Smart
Wholeheartedly agree with that.