Pages are 'noindex'ed, but CrUX is still reporting on them


Ari F

Jun 21, 2023, 2:06:06 PM
to Chrome UX Report (Discussions)
Hi all, we have the following two pages that are meta-tagged with 'noindex', but CrUX is still reporting on them:
  • https://hidrb.com/start
    • 'noindex' seemed to work here for months, until the 6/10 collection period started picking the page up for whatever reason.
  • https://hidrb.com/dashboard
    • 'noindex' never seemed to work here. It has historical CrUX data points as far as I know.
Those two pages have particularly bad performance, but because they're not landing pages, we try to keep them out of the index. Google Search Console acknowledges that they're meta-tagged with 'noindex'. I also double-checked that we aren't blocking these pages in robots.txt, as that was an issue we've run into before.

So I'm wondering:
  1. Why is CrUX reporting on these 'noindex'ed pages? According to the eligibility docs, they shouldn't be eligible with a `noindex` tag.
  2. Is the data from these pages contributing to our origin score?

Any advice would be appreciated.
Thanks

❄ Johannes Henkel

Jun 21, 2023, 3:04:51 PM
to Ari F, Chrome UX Report (Discussions)
Hmm! I loaded https://hidrb.com/start with Chrome on a Linux desktop and used view source; I cannot find the meta tag with noindex.
I do see another one: <meta name="robots" content="max-image-preview:large"/>
I also tried wget -q -S -O - https://hidrb.com/start to see the headers; it doesn't look like noindex is in the headers either.
Can you please make sure on your side that it's working? Maybe the other robots tag got in the way on the server side?

https://developers.google.com/search/docs/crawling-indexing/block-indexing - This has the documentation for the choices and syntax, which you probably already know.

--
johannes :-)
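
The two checks described above (looking for a robots meta tag in the raw HTML, and for a noindex directive in the response headers) can be sketched in Python. This is a minimal illustration, not the logic that CrUX or Google Search actually uses: the helper names are hypothetical, only the `robots` and `googlebot` meta names are inspected, and user-agent-prefixed `X-Robots-Tag` values (such as `googlebot: noindex`) are not handled. Because it reads the raw response, it will not see a tag that is only added later by JavaScript.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class RobotsMetaParser(HTMLParser):
    """Collects the content of each <meta name="robots"> or <meta name="googlebot"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta ... />) are routed here by the base class too.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attr.get("content", ""))


def _has_noindex(value):
    # Directives are comma-separated; "none" is shorthand for "noindex, nofollow".
    tokens = [token.strip().lower() for token in value.split(",")]
    return "noindex" in tokens or "none" in tokens


def noindex_in_html(html):
    """True if the raw (pre-JavaScript) HTML carries a noindex robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return any(_has_noindex(directive) for directive in parser.directives)


def noindex_in_headers(headers):
    """True if an X-Robots-Tag header carries noindex (header name is case-insensitive)."""
    return any(
        name.lower() == "x-robots-tag" and _has_noindex(value)
        for name, value in headers.items()
    )


def check_url(url):
    """Fetches the URL once and reports both checks (hypothetical helper, needs network)."""
    with urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
        return {
            "meta": noindex_in_html(html),
            "header": noindex_in_headers(dict(resp.headers.items())),
        }
```

For the tag quoted above, `noindex_in_html('<meta name="robots" content="max-image-preview:large"/>')` returns `False`, which matches what view source showed at the time.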

Ari F

Jun 21, 2023, 3:33:05 PM
to Chrome UX Report (Discussions), joha...@google.com, Chrome UX Report (Discussions), Ari F
Thanks for the hint! It looks like the 'noindex' only loads upon JS execution, so I assume that's why you're not seeing it on your end. Google Search Console is able to figure this out, but perhaps CrUX can't for whatever reason. We're making a change so that 'noindex' is included in the initial HTML document returned by the web server. As for /dashboard, I'll have to look into that separately, but the issue may be similar.

Thanks again

❄ Johannes Henkel

Jun 21, 2023, 3:49:06 PM
to Ari F, Chrome UX Report (Discussions)
Including it in the initial HTML does seem like a good idea; it may be more robust that way. My guess is that this should be true in general, also for Search Console.

On my desktop here, I actually had JavaScript enabled; it's pretty much a normal Chrome installation with some corporate Chrome extensions. Still, I could not see the tag.
In the search indexing stack, JavaScript usually does get executed. If you're curious about this, I like this talk by Erik Hendriks (I guess it's off-topic for CrUX): https://www.youtube.com/watch?v=FRJ00L3i7VI

--
johannes :-)

Ari F

Jun 26, 2023, 10:52:51 AM
to Chrome UX Report (Discussions), joha...@google.com, Chrome UX Report (Discussions), Ari F
Johannes,
Do you mind checking again to see if you're seeing the 'noindex' on that page load? We deployed a change last week and we see it on initial page load now but want to confirm with you since you were seeing different outcomes.

If you run the following you can see it's included on the original response: 
curl https://hidrb.com/start | grep noindex

However, it looks like the latest collection period still reports on the page, so maybe something else is still preventing the meta tag from loading for some users.
Thanks

Ari F

Jun 27, 2023, 10:55:03 AM
to Chrome UX Report (Discussions), Ari F, joha...@google.com, Chrome UX Report (Discussions)
I know this group is about CrUX and not the various other Google tools, but for what it's worth, PageSpeed Insights confirms that the page is blocked from indexing:
https://pagespeed.web.dev/analysis/https-hidrb-com-start/sjgyvghddm?form_factor=mobile

❄ Johannes Henkel

Jun 27, 2023, 1:53:04 PM
to Ari F, Chrome UX Report (Discussions)
Yes. I think what you did should work great - thank you!

From what I can tell, it hasn't been crawled yet since the change, so the system that CrUX uses doesn't know about the noindex thus far. I don't have a button to push on this (sorry!), so it's probably best to check back in a while (1-2 months, so that it has time to percolate through our 28-day aggregation).

The lab portion of PSI ("Diagnose performance issues" and below) is different because it fetches the page every time it runs an analysis. It needs to do that since it runs Lighthouse (the JavaScript lab tool for performance, which is also in Chrome DevTools). This is not the same as the crawling/indexing system that CrUX gets its info from for its batch processing.
--
johannes :-)

Ari F

Jun 27, 2023, 2:08:13 PM
to Chrome UX Report (Discussions), joha...@google.com, Chrome UX Report (Discussions), Ari F
Got it, thanks for the reply! That makes sense about PSI. We'll wait for the 28-day rolling average to catch up. Thanks!

Ari F

Aug 1, 2023, 1:06:13 AM
to Chrome UX Report (Discussions), Ari F, joha...@google.com, Chrome UX Report (Discussions)
Hi Johannes, 
Circling back here a month later to see if the 'noindex' has percolated through the 28-day rolling average. Unfortunately, it seems like CrUX is still reporting page-level metrics for "https://hidrb.com/start". What's weird is that our root / aggregate CLS is 0.0, so I'm getting the feeling that even though the page-level score for "https://hidrb.com/start" is still being calculated, it isn't impacting our root / aggregate score at all. Do you know what's going on here?
Thanks!
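
One way to check whether CrUX still has page-level data, separately from origin-level data, is to query the CrUX API for each. The sketch below is illustrative only and assumes the public `records:queryRecord` endpoint, which takes either a `url` or an `origin` key in the request body and responds with HTTP 404 when no record exists. The function names and the `api_key` parameter are hypothetical, and a valid API key is needed to run the network call.

```python
import json
from urllib import error, request

CRUX_ENDPOINT = "https://chromeuxreport.googleapis.com/v1/records:queryRecord"


def build_query(target, page_level):
    """Builds the request body: 'url' for page-level data, 'origin' for the aggregate."""
    return {"url" if page_level else "origin": target}


def has_crux_record(target, page_level, api_key):
    """True if the API returns a record, False on 404 (assumed to mean no data)."""
    body = json.dumps(build_query(target, page_level)).encode("utf-8")
    req = request.Request(
        f"{CRUX_ENDPOINT}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with request.urlopen(req) as resp:
            return "record" in json.load(resp)
    except error.HTTPError as exc:
        if exc.code == 404:
            return False
        raise  # Other errors (bad key, quota) should not be read as "no data".
```

Calling `has_crux_record("https://hidrb.com/start", True, api_key)` and `has_crux_record("https://hidrb.com", False, api_key)` would then show whether the page and the origin each still have a record.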

❄ Johannes Henkel

Aug 1, 2023, 1:41:28 AM
to Ari F, Chrome UX Report (Discussions)
Thanks for saying hi again. Looked into it; it appears that at this point https://hidrb.com/start has been crawled (on July 8th) and is considered noindex.
However, some measurements from before the crawl remain in our system, which is why there's still URL-level data.
These remaining measurements also count toward the origin, but there are enough other measurements for the origin that the p75 you mention is no longer affected.
My guess is that on August 10th or so, the last of these measurements will fade out, and this should show up as URL-level data no longer being available in PSI for https://hidrb.com/start.
--
johannes :-)
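
The point that a few leftover measurements can count toward the origin without moving its p75 can be illustrated with a small calculation. The numbers below are made up for illustration (95 page loads with a CLS of 0.0 and 5 leftover loads with a CLS of 0.4), and the nearest-rank percentile is an assumption; the thread does not describe how CrUX computes its percentiles internally.

```python
import math


def p75(values):
    """75th percentile using the nearest-rank method."""
    ordered = sorted(values)
    rank = math.ceil(0.75 * len(ordered))  # 1-based rank of the p75 sample
    return ordered[rank - 1]


# Hypothetical CLS samples: most origin traffic is stable, a few old loads are not.
other_pages = [0.0] * 95
leftover_start_page = [0.4] * 5

print(p75(leftover_start_page))                # 0.4 for the URL on its own
print(p75(other_pages + leftover_start_page))  # 0.0 for the origin as a whole
```

With these assumed numbers, the bad samples make up only 5% of the origin's page loads, so they all sit above the 75th percentile and the origin-level value stays at 0.0 even though the URL-level value is poor.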

Ari F

Aug 1, 2023, 9:30:20 AM
to Chrome UX Report (Discussions), joha...@google.com, Chrome UX Report (Discussions), Ari F
Thanks for confirming! Turns out it's still percolating then :)
We'll keep waiting, but the origin numbers are definitely looking better already. Thanks again!

Ari F

Aug 31, 2023, 1:54:34 PM
to Chrome UX Report (Discussions), Ari F, joha...@google.com, Chrome UX Report (Discussions)
Hi Johannes,
Circling back here because I'm still seeing the /start page being reported on in CrUX. I think we've given it enough time for the rolling average to catch up to the "noindex" so I'm thinking it may be something else.
What do you think? Our "root" / aggregate score is still at 0 so maybe it's not actually counting against us?
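
For reference, the timeline in this thread can be checked with simple date arithmetic. This is only a sketch: it assumes that measurements taken before the July 8 crawl leave a 28-day window 28 days later, and it ignores any extra processing delay (the August 10 estimate earlier in the thread suggests there is some, but the thread does not say how much). The function name is hypothetical.

```python
from datetime import date, timedelta

WINDOW_DAYS = 28  # Length of the CrUX aggregation window mentioned in the thread.


def window_clear_date(noindex_seen, window_days=WINDOW_DAYS):
    """Earliest date by which measurements from before noindex_seen leave the window."""
    return noindex_seen + timedelta(days=window_days)


crawl = date(2023, 7, 8)  # Crawl date reported in the thread.
print(window_clear_date(crawl))               # 2023-08-05
print((date(2023, 8, 31) - crawl).days)       # 54 days between the crawl and this post
```

Under these assumptions, August 31 is well past both the computed date and the August 10 estimate, which supports the suspicion that something other than the rolling window is keeping the URL in CrUX.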
