Looks good, nice video and production work. You may want to have the
link in "We'll see you at the related thread in our Webmaster Help
Group." actually point here rather than the whole group, as it will
get lost and bumped down after a few hours.
On our site we serve different CSS files depending on browser
detection, so we can work around CSS differences between different
browsers. We serve robots a "generic" CSS. It does cause a minor
difference in the HTML (the file name of the CSS includes "generic"
rather than "firefox-2" or "ie-7"). The rest of the HTML is the same.
Hopefully this is legit, but if not, how in general should browser
specific HTML generation be handled? Should we pretend that robots
are IE7 or Firefox, for example?
Well, content is really just text and links. Styling doesn't count -
except in order to determine whether you have text or links which are
hidden from the human visitor.
That said, you can certainly engage in some serious black hat things
that way. If the CSS is handled server-side rather than client-side
(in JS), then robots won't know. Unless, of course, they use tricks like
spoofing their user agent string to check up on such things.
I'd suggest you don't do this kind of thing. With a bit of effort you
can fix your CSS so you don't have to do this at all.
Or at worst you can keep one basic CSS file and supply separate style
sheet directives as needed, with elements to override for some special
browser-based considerations.
There too, I'm quite convinced that with a bit of effort you'd likely
eliminate this need altogether.
> On our site we serve different CSS files depending on browser
> detection, so we can work around CSS differences between different
> browsers. We serve robots a "generic" css. It does cause a minor
> difference in the HTML (the file name of the CSS includes "generic"
> rather than "firefox-2" or "ie-7"). The rest of the HTML is the same.
> Hopefully this is legit, but if not, how in general should browser
> specific HTML generation be handled? Should we pretend that robots
> are IE7 or Firefox, for example?
> Unless of course they use tricks like
> spoofing their user agent string to check up on such things.
Isn't that the implication of this article?
> I'd suggest you don't do this kind of thing. With a bit of effort you
> can fix your css so you don't have to do this at all.
I think the cleanest way to fix browser-bugginess is with server-side
browser detection and a bit of conditional logic to emit special CSS
for horrible old browsers like IE6. It's still code that changes
based on the user-agent. Hopefully this is legit. I'd hate to have
to add CSS hacks in order to be Google-compliant. Although I wish it
weren't so, there simply are differences between today's common
browsers in their CSS compliance that need to be worked around.
> Or at worst you can keep one basic css file and supply separate style
> sheet directives as needed with elements to override for some special
> browser based considerations.
That is exactly what I'm doing, with server-side logic to drive the
emitted CSS.
I've been exploring this subject recently - it's popped up on Matt
Cutts's blog.
I have two questions:
1. How can we develop client-side applications like Google has done
with Reader or Gmail (except the kind whose content should be indexed
by Google), and present our content to Google at the same time?
2. Re: cloaking: What about REST and the idea of different resource
representations that is so prevalent throughout the web? Saying that
one could run an MD5 hash on a resource URL to determine a change in
content violates the principles of REST, which state that resources
can exist at the same URL in different representations, with the
representation returned or received being determined by content
negotiation.
The blog post seems to be pretty strict when it comes to cloaking. On
my site I have a large forum where I use cloaking (by your
definition). Google and other crawlers get almost the same page as
any other user, but without the links that I don't want to have
crawled.
For instance a topic has a URL like:
viewtopic.php?t=1234
which consists of 25 different posts. To enable our users to link to a
specific post this page contains 25 links like:
viewtopic.php?t=1234&p=234567#p234567
As far as I know, I can't exclude those links in robots.txt, so I
just don't include them in pages that are sent to Googlebot. Those
post links are already redirected to the 'normal' topic page at the
moment, but it's a waste of resources to constantly send 301s to
Google. Google also seems to be limited to approx. 2 requests/second
and I'd rather have it retrieve actual content than confirm that those
post pages are redirected to topic pages.
Is what I'm doing really cloaking? I'm only trying to make it easier
for Google to (efficiently) crawl the site which contains over 30
million posts. I'm not in any way trying to manipulate my rankings.
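For reference, the 301 mapping described above (per-post link to the
'normal' topic page) could be sketched roughly like this. It assumes the
only things that distinguish a post link from its topic page are the "p"
query parameter and the fragment; the function name is made up for
illustration and is not taken from any forum software.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_topic_url(url: str) -> str:
    """Return the topic URL for a per-post link.

    Assumption: dropping the "p" parameter and the "#p..." fragment is
    enough to get from a post link to the plain topic page.
    """
    parts = urlsplit(url)
    # Keep every query parameter except the per-post one.
    query = [(key, value) for key, value in parse_qsl(parts.query) if key != "p"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


print(canonical_topic_url("viewtopic.php?t=1234&p=234567#p234567"))
```

Because the mapping depends only on the URL, the same redirect target is
computed for every visitor, crawler or not.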
Erm... maybe it's just me being a little thick (okay... 'very thick',
and it's a common occurrence ;) )
"... The key is to treat Googlebot as you would a typical user from a
similar location, IP range, etc. (i.e. don't treat Googlebot as if it
came from its own separate country - that's cloaking). ..."
"... Googlebot should see the same content a typical user from the
same IP address would see. ..."
So, does that mean content should be provided to match the IP location
that Googlebot is using, or should it be a 'default' version
(in case IP/geo targeting fails), or should it be the same location as I
am in?
Basically, I don't think it's 100% clear, so maybe add an example
(or 2/3/4 examples, just to make sure).
If I can see it as unclear, it's guaranteed others will as well.
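One possible reading of the two quoted sentences, sketched in code. This
is only an interpretation, not an official answer: the variant is chosen
from the client's IP address alone, with a default page when the lookup
fails, and the user agent is never consulted. The variant table, file
names and the lookup_country callable are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical country -> page variant table.
VARIANTS: Dict[str, str] = {"US": "index.en-us.html", "DE": "index.de.html"}
DEFAULT_VARIANT = "index.default.html"


def variant_for(ip: str, lookup_country: Callable[[str], Optional[str]]) -> str:
    """Choose a page variant from the client's IP address only.

    The User-Agent is deliberately not a parameter, so a crawler gets
    whatever a typical user at the same address would get. If the lookup
    fails or the country has no variant, the default page is served.
    """
    country = lookup_country(ip)
    if country is None:
        return DEFAULT_VARIANT
    return VARIANTS.get(country, DEFAULT_VARIANT)


# Stand-in for a real geo-IP database; the addresses are documentation ranges.
fake_lookup = {"192.0.2.1": "US", "198.51.100.7": "DE"}.get
print(variant_for("192.0.2.1", fake_lookup))
print(variant_for("203.0.113.9", fake_lookup))
```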
> So, does that mean content should be provided to match the IP location
> that the Google bot is using.... or should it be a 'default' version
> (in case IP/Geo targeting fail) or should it be the same location as I
> am in ????
Faskinaytin'
(I have a NASTY habit of reading between the lines with these things.)
Is Google perhaps preparing for a future in which it will deliberately
crawl sites from several different geographies and note where it gets
a location-dependent page - then index each location-dependent page
so as to serve the appropriate version to someone searching from that
location?
I read your blog post, and it didn't talk about creating landing pages
for visitors coming from a search engine after typing a specific
query. For example, I run a site called "Vanilla House Designs" which
sells sewing patterns. However, we get visitors coming to our site
looking for "House Designs". Our server config files detect the
"house designs" search phrase and execute a "Temporary Redirect" to a
landing page (at a different URL) on which we present helpful links (and
ads) about "House Designs".
Is this OK? Or is Googlebot going to decide that we are
discriminating against a particular referrer and get googley mad at us?
Hi Maile.
Nice video. Could you allay my fears and advise whether creating a
different URL for, say, German users would have any impact on the SERPs/PR
of the existing original URL? I worry that creating redirects to
duplicates of a URL with, for example, slightly different shipping
charges will dilute the PR and ranking of that original URL.
> On our site we serve different CSS files depending on browser
> detection, so we can work around CSS differences between different
> browsers. We serve robots a "generic" css.
...
> Hopefully this is legit, but if not, how in general should browser
> specific HTML generation be handled? Should we pretend that robots
> are IE7 or Firefox, for example?
Good question. :) If your site serves different CSS based on browser
detection, we recommend serving Googlebot the same CSS as a typical
user (not a crawler-specific CSS, even if it's more generic). It's
best to serve Googlebot the same stylesheet that you serve to a large
chunk of your users -- whether that's IE, Firefox, Safari, etc.
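A minimal sketch of that recommendation, assuming server-side User-Agent
sniffing along the lines kingbyu described. The stylesheet file names and
the choice of IE7 as the "typical user" browser are assumptions made up
for the example; the point is only that there is no crawler-specific
branch.

```python
from typing import Optional

# Hypothetical file names; the thread only mentions "firefox-2" and
# "ie-7" as parts of the real ones.
STYLESHEETS = {
    "ie-7": "site-ie-7.css",
    "firefox-2": "site-firefox-2.css",
}

# Assumption: the largest share of this site's human visitors uses IE7.
TYPICAL_USER_BROWSER = "ie-7"


def detect_browser(user_agent: str) -> Optional[str]:
    """Very rough User-Agent sniffing; returns a key of STYLESHEETS or None."""
    ua = user_agent.lower()
    if "msie 7" in ua:
        return "ie-7"
    if "firefox/2" in ua:
        return "firefox-2"
    return None


def stylesheet_for(user_agent: str) -> str:
    """Pick a stylesheet without any crawler-specific case.

    Unrecognized agents, which includes Googlebot, fall through to the
    stylesheet served to the typical user.
    """
    browser = detect_browser(user_agent) or TYPICAL_USER_BROWSER
    return STYLESHEETS[browser]


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(stylesheet_for(googlebot))
```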
> I read your blog post, and it didn't talk about creating landing pages
> for visitors coming from a search engine after typing a specific
> query.
...
> Our server config files detect the
> "house designs" search phrase and executes a "Temporary Redirect" to a
> landing page (of a different URL) which we present helpful links (and
> ads) about "House Designs".
...
> Is this ok? or is googlebot going to decide that we are
> discriminating against a particular referrer and get googley mad at us?
I believe you're describing post-cloaking, which is serving users who
have selected a search result different content than Googlebot
requesting the same URL. It's often implemented when the webmaster
varies their content based on the "Referer" header showing a search
engine URL.
Post-cloaking, a more specific form of cloaking, is frowned upon at
Google. If you believe your site implements post-cloaking techniques
but you want to remain in compliance with our guidelines, it's best to
change that behavior at your earliest convenience.
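To make the definition concrete, here is a rough self-check a webmaster
could run against their own request handler: it asks whether the
response for a URL changes when only the "Referer" header changes. The
handler signature and both toy handlers are hypothetical, written just
for this sketch; the second one mimics the redirect setup described in
the question.

```python
from typing import Callable, Dict, Iterable, Tuple

# (url, request headers) -> (status code, body or redirect target)
Handler = Callable[[str, Dict[str, str]], Tuple[int, str]]


def varies_by_referer(handler: Handler, url: str, referers: Iterable[str]) -> bool:
    """True if any Referer value changes the response compared with no Referer."""
    baseline = handler(url, {})
    return any(handler(url, {"Referer": ref}) != baseline for ref in referers)


def plain_handler(url: str, headers: Dict[str, str]) -> Tuple[int, str]:
    # Same response for everybody.
    return (200, "sewing patterns")


def post_cloaking_handler(url: str, headers: Dict[str, str]) -> Tuple[int, str]:
    # Temporary redirect when the Referer carries the search phrase.
    if "house+designs" in headers.get("Referer", "").lower():
        return (302, "/house-designs-landing")
    return (200, "sewing patterns")


search_referer = "http://www.google.com/search?q=house+designs"
print(varies_by_referer(plain_handler, "/", [search_referer]))
print(varies_by_referer(post_cloaking_handler, "/", [search_referer]))
```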
"Post-cloaking, a more specific form of cloaking, is frowned upon at
Google. If you believe your site implements post-cloaking techniques
but you want to remain in compliance with our guidelines, it's best to
change that behavior at your earliest convenience. "
I'm honestly not trying to be combative here but considering the
INTENT of a webmaster, how is what kingbyu is doing considered not "in
compliance with our guidelines" when his intent is the same as those
of your AdWords advertisers setting different URLs for the Display and
Destination URL? I know that cloaking is wrong. OK? I know that. But
there has to be a middle ground.
> > I read your blog post, and it didn't talk about creating landing pages
> > for visitors coming from a search engine after typing a specific
> > query.
> ...
> > Our server config files detect the
> > "house designs" search phrase and executes a "Temporary Redirect" to a
> > landing page (of a different URL) which we present helpful links (and
> > ads) about "House Designs".
> ...
> > Is this ok? or is googlebot going to decide that we are
> > discriminating against a particular referrer and get googley mad at us?
> I believe you're describing post-cloaking, which is serving users who
> have selected a search result different content than Googlebot
> requesting the same URL. It's often implemented when the webmaster
> varies their content based on the "Referer" header showing a search
> engine URL.
> Post-cloaking, a more specific form of cloaking, is frowned upon at
> Google. If you believe your site implements post-cloaking techniques
> but you want to remain in compliance with our guidelines, it's best to
> change that behavior at your earliest convenience.
> "Googley mad" is a very cute phrase, btw.
> Thanks for the question and best of luck,
> Maile
> "Post-cloaking, a more specific form of cloaking, is frowned upon at
> Google. If you believe your site implements post-cloaking techniques
> but you want to remain in compliance with our guidelines, it's best to
> change that behavior at your earliest convenience. "
> I'm honestly not trying to be combative here but considering the
> INTENT of a webmaster, how is what kingbyu is doing considered not "in
> compliance with our guidelines" when his intent is the same as those
> of your AdWords advertisers setting different URLs for the Display and
> Destination URL? I know that cloaking is wrong. OK? I know that. But
> there has to be a middle ground.
> On Jun 10, 2:45 pm, Maile Ohye wrote:
> > Hi kingbyu,
> > > I read your blog post, and it didn't talk about creating landing pages
> > > for visitors coming from a search engine after typing a specific
> > > query.
> > ...
> > > Our server config files detect the
> > > "house designs" search phrase and executes a "Temporary Redirect" to a
> > > landing page (of a different URL) which we present helpful links (and
> > > ads) about "House Designs".
> > ...
> > > Is this ok? or is googlebot going to decide that we are
> > > discriminating against a particular referrer and get googley mad at us?
> > I believe you're describing post-cloaking, which is serving users who
> > have selected a search result different content than Googlebot
> > requesting the same URL. It's often implemented when the webmaster
> > varies their content based on the "Referer" header showing a search
> > engine URL.
> > Post-cloaking, a more specific form of cloaking, is frowned upon at
> > Google. If you believe your site implements post-cloaking techniques
> > but you want to remain in compliance with our guidelines, it's best to
> > change that behavior at your earliest convenience.
> > "Googley mad" is a very cute phrase, btw.
> > Thanks for the question and best of luck,
> > Maile
Hi again Maile, my apologies for the rude tone I took in my last post.
To clarify, here's what I'm wondering. If I show Googlebot the same
thing I show visitors coming from Google search results, and do so
with cloaking, would that put me in the high-risk category? Thanks in
advance for any support you can provide on this and please let me know
how I can clarify where needed.
I want to use IP delivery to redirect visitors from the US to a
different site that is targeted to their needs. However, most of my
business is conducted outside the US, mostly in Europe. If we set this
up, my guess is that Googlebot will go to the US site rather than my
site that services Europe. I do not want to redirect the bots to my
European site and risk cloaking penalties. I am worried that this will
negatively affect my European presence.
Can you give me some guidance on how we should set this up so we
remain in compliance with the rules?
It is still unclear whether or not Google has different and separate
bots for each origin so that, for example, Google's cache of google.ca
will be different from Google's cache of google.com (US), in the
same way Canadian and US customers see localized content. We currently
use IP delivery, but Google's cache is the same for both origins.
I have read the Group FAQs and checked out docs in the Help Center. I
can't seem to find an answer to my question so here it is. If I show
Googlebot the same content (text, code, images, etc) that I show
visitors coming from Google search results (from google.com, Google's
Content Network, or otherwise), does it matter to Google if I show
something different to visitors who are not Googlebot or visitors who
are not coming from Google search results (from google.com, Google's
Content Network, or otherwise)?
> Please let us know if we can provide clarification. :)
I understand that changing the content based on the user agent
(specifically the Googlebot user agent) is verboten. No problem.
My problem is that I can't calculate the Last-Modified time (LMT)
correctly for my pages because they're composed of information from
a LARGE number of different places. However, from the point of view of
search crawlers, there are only one or two files that I need to check
in order to provide a sufficiently meaningful LMT (from the point of
view of search, changes in labels and such are hardly meaningful).
I'm already considering making a change to calculate a simplified LMT
(and honor If-Modified-Since) for our internal search crawler and I'm
wondering if doing the same for Googlebot (and the others) will cause
problems. Maybe it won't have any effect either, but it's worth
knowing.
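To make the question concrete, here is roughly what I mean by a
simplified LMT, as a sketch only. It takes the newest modification time
of the one or two files that matter and honors If-Modified-Since. The
function names are made up for the example. Note that nothing in it
looks at the user agent, so as written it would behave the same for
every client; whether to restrict it to crawlers is exactly the open
question.

```python
import os
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime
from typing import Dict, Iterable, Optional, Tuple


def simplified_lmt(paths: Iterable[str]) -> int:
    """Newest modification time (whole seconds since the epoch) among paths."""
    return max(int(os.path.getmtime(path)) for path in paths)


def conditional_response(
    lmt: int, if_modified_since: Optional[str]
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a GET, honoring If-Modified-Since."""
    headers = {"Last-Modified": formatdate(lmt, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            # Unparsable header: ignore it and send the full page.
            return 200, headers
        if since.tzinfo is None:
            # HTTP dates are always GMT; treat a zone-less value as UTC.
            since = since.replace(tzinfo=timezone.utc)
        if lmt <= since.timestamp():
            return 304, headers
    return 200, headers
```

A handler would call simplified_lmt() on the relevant files, pass the
result and the request's If-Modified-Since header (if any) to
conditional_response(), and skip rendering the page body on a 304.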