Looks good, nice video and production work. You may want to have the
link in "We'll see you at the related thread in our Webmaster Help
Group." actually point here rather than the whole group, as it will
get lost and bumped down after a few hours.
On our site we serve different CSS files depending on browser
detection, so we can work around CSS differences between different
browsers. We serve robots a "generic" css. It does cause a minor
difference in the HTML (the file name of the CSS includes "generic"
rather than "firefox-2" or "ie-7"). The rest of the HTML is the same.
Hopefully this is legit, but if not, how in general should browser
specific HTML generation be handled? Should we pretend that robots
are IE7 or Firefox, for example?
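
To make the setup described above concrete, here is a minimal sketch of server-side stylesheet selection. It assumes simple substring checks on the User-Agent header; the function name `pick_stylesheet` and the `site-` file name prefix are hypothetical, and only the "generic", "firefox-2" and "ie-7" variants come from the post.

```python
def pick_stylesheet(user_agent: str) -> str:
    """Return the CSS file name to link for a given User-Agent header.

    Only the <link> href changes; the rest of the HTML stays identical.
    """
    ua = user_agent.lower()
    # Crawlers get the generic sheet (substring checks are an assumption).
    if "bot" in ua or "spider" in ua or "crawler" in ua:
        variant = "generic"
    elif "msie 7" in ua:
        variant = "ie-7"
    elif "firefox/2" in ua:
        variant = "firefox-2"
    else:
        # Unknown browsers fall back to the same sheet robots see.
        variant = "generic"
    return f"site-{variant}.css"
```

With this shape, an unrecognised browser and a robot receive exactly the same page, which keeps the robot's view from being a special case.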
Well, content is really just text and links. Styling doesn't count,
except in order to determine if you have text or links which are
hidden from the human visitor.
That said, you can certainly engage in some serious black hat things
that way. If the CSS is handled server-side rather than client-side
(in JS) then robots won't know. Unless of course they use tricks like
spoofing their user agent string to check up on such things.
I'd suggest you don't do this kind of thing. With a bit of effort you
can fix your CSS so you don't have to do this at all.
Or at worst you can keep one basic css file and supply separate style
sheet directives as needed with elements to override for some special
browser based considerations.
There too I'm quite convinced that with a bit of effort you'd likely
eliminate this need altogether.
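
One possible reading of the "one basic css file plus overrides" suggestion, sketched under assumptions: the server emits the same markup to every visitor, and Internet Explorer conditional comments pull in the override sheets on the client. The function name `head_links` and the file names are hypothetical.

```python
def head_links(base="base.css", ie_overrides=None):
    """Build stylesheet markup that is identical for every visitor.

    The override sheets are wrapped in IE conditional comments, so other
    browsers and robots treat them as ordinary HTML comments.
    """
    if ie_overrides is None:
        # Hypothetical override sheets, keyed by conditional-comment test.
        ie_overrides = {"IE 6": "ie-6-fixes.css", "IE 7": "ie-7-fixes.css"}
    lines = [f'<link rel="stylesheet" href="{base}">']
    for condition, href in ie_overrides.items():
        lines.append(
            f'<!--[if {condition}]>'
            f'<link rel="stylesheet" href="{href}">'
            f'<![endif]-->'
        )
    return "\n".join(lines)
```

Because the function never looks at the user agent, there is no difference between what a robot and a human visitor are sent.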
> On our site we serve different CSS files depending on browser
> detection, so we can work around CSS differences between different
> browsers. We serve robots a "generic" css. It does cause a minor
> difference in the HTML (the file name of the CSS includes "generic"
> rather than "firefox-2" or "ie-7"). The rest of the HTML is the same.
> Hopefully this is legit, but if not, how in general should browser
> specific HTML generation be handled? Should we pretend that robots
> are IE7 or Firefox, for example?
> Unless of course they use tricks like
> spoofing their user agent string to check up on such things.
Isn't that the implication of this article?
> I'd suggest you don't do this kind of thing. With a bit of effort you
> can fix your css so you don't have to do this at all.
I think the cleanest way to fix browser-bugginess is with server-side
browser detection and a bit of conditional logic to emit special CSS
for horrible old browsers like IE6. It's still code that changes
based on the user-agent. Hopefully this is legit. I'd hate to have
to add CSS hacks in order to be Google-compliant. Although I wish it
wasn't so, there just simply are differences between today's common
browsers in their CSS compliance that need to be worked around.
> Or at worst you can keep one basic css file and supply separate style
> sheet directives as needed with elements to override for some special
> browser based considerations.
That is exactly what I'm doing, with server-side logic to drive the
emitted CSS.
I've been exploring this subject recently - it's popped up on Matt
Cutts' blog.
I have two questions:
1. How can we develop client-side applications like Google has done
with Reader or Gmail (except the kind whose content should be indexed
by Google), and present our content to Google at the same time?
2. Re: cloaking: What about REST and the idea of different resource
representations that is so prevalent throughout the web? Saying that
one could run an MD5 hash on a resource URL to determine a change in
content violates the principles of REST, which state that resources
can exist at the same URL in different representations, the nature of
the representation returned or received being determined by content
negotiation.
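
A small sketch of the tension raised in question 2, assuming a toy resource and a naive check of the Accept header (real content negotiation also weighs q-values). The names `RESOURCE`, `represent` and `digest` are hypothetical.

```python
import hashlib
import json

# Hypothetical resource living at a single URL.
RESOURCE = {"id": 1234, "title": "Example topic"}


def represent(accept: str) -> bytes:
    """Return a representation of RESOURCE chosen from the Accept header."""
    if "application/json" in accept:
        return json.dumps(RESOURCE, sort_keys=True).encode("utf-8")
    # Default to an HTML representation.
    return f"<h1>{RESOURCE['title']}</h1>".encode("utf-8")


def digest(body: bytes) -> str:
    """MD5 hex digest of a response body."""
    return hashlib.md5(body).hexdigest()


html_hash = digest(represent("text/html"))
json_hash = digest(represent("application/json"))
# Same URL, same underlying resource, different hashes.
print(html_hash != json_hash)
```

So a hash comparison only says something about cloaking if both requests negotiate the same representation.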
The blog post seems to be pretty strict when it comes to Cloaking. On
my site I have a large forum where I use cloaking (by your
definition). Google and other crawlers get almost the same page as
random other users but without links that I don't want to have
crawled.
For instance a topic has a URL like:
viewtopic.php?t=1234
which consists of 25 different posts. To enable our users to link to a
specific post this page contains 25 links like:
viewtopic.php?t=1234&p=234567#p234567
As far as I know, I can't exclude those links in robots.txt, so I
just don't include them in pages that are sent to Googlebot. Those
post links are already redirected to the 'normal' topic page at the
moment, but it's a waste of resources to constantly send 301s to
Google. Google also seems to be limited to approx. 2 requests/second,
and I'd rather have it retrieve actual content than confirm that those
post pages are redirected to topic pages.
Is what I'm doing really cloaking? I'm only trying to make it easier
for Google to (efficiently) crawl the site which contains over 30
million posts. I'm not in any way trying to manipulate my rankings.
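
For reference, the redirect described above (post link to 'normal' topic page) amounts to something like the following sketch. It assumes the post is identified only by the `p` query parameter; the function name `canonical_topic_url` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_topic_url(url: str) -> str:
    """Map a per-post link to its topic page (the target of the 301)."""
    parts = urlsplit(url)
    # Drop the post id; keep every other parameter in its original order.
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "p"]
    # The fragment (#p234567) is never sent to the server, so drop it too.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), "")
    )


print(canonical_topic_url("viewtopic.php?t=1234&p=234567#p234567"))
# prints viewtopic.php?t=1234
```

Every one of the 25 post links on a topic page collapses to the same URL, which is why each crawl of them only confirms a redirect.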
Erm... maybe it's just me being a little thick (okay... 'very thick',
and it's a common occurrence ;) )
"... The key is to treat Googlebot as you would a typical user from a
similar location, IP range, etc. (i.e. don't treat Googlebot as if it
came from its own separate country—that's cloaking). ..."
"... Googlebot should see the same content a typical user from the
same IP address would see. ..."
So, does that mean content should be provided to match the IP location
that Googlebot is using... or should it be a 'default' version
(in case IP/geo targeting fails), or should it be the same location as
I am in?
Basically, I don't think it's 100% clear, so maybe add an example (or
two, three or four, just to make sure).
If I can see it as unclear, others are guaranteed to as well.
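
One way to read the two quoted sentences, as a sketch and not an official interpretation: pick the page from the visitor's IP-derived location only, with a default when the lookup fails, and never branch on the user agent. The country code is assumed to come from some geo-IP lookup that is not shown; all names and page strings here are hypothetical.

```python
# Hypothetical per-country pages.
PAGES_BY_COUNTRY = {
    "US": "page for US visitors",
    "DE": "page for German visitors",
}
DEFAULT_PAGE = "default page"


def page_for(country_code, user_agent):
    """Choose a page from the IP-derived country only.

    user_agent is accepted but deliberately ignored, so a crawler gets
    whatever a typical user at the same IP address would get.
    country_code is None when the geo-IP lookup fails.
    """
    return PAGES_BY_COUNTRY.get(country_code, DEFAULT_PAGE)
```

Under this reading, a crawler arriving from a US address sees the US page, and any visitor whose location cannot be resolved sees the default.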
> So, does that mean content should be provided to match the IP location
> that Googlebot is using... or should it be a 'default' version
> (in case IP/geo targeting fails), or should it be the same location as
> I am in?
Faskinaytin'
(I have a NASTY habit of reading between the lines with these things.)
Is Google perhaps preparing for a future in which it will deliberately
crawl sites from several different geographies and note where it gets
a location-dependent page, then index each location-dependent page
so as to serve the appropriate version to someone searching from that
location?
I read your blog post, and it didn't talk about creating landing pages
for visitors coming from a search engine after typing a specific
query. For example, I run a site called "Vanilla House Designs" which
sells sewing patterns. However, we get visitors coming to our site
looking for "House Designs". Our server config files detect the
"house designs" search phrase and execute a "Temporary Redirect" to a
landing page (at a different URL) on which we present helpful links
(and ads) about "House Designs".
Is this OK? Or is Googlebot going to decide that we are
discriminating against a particular referrer and get googley mad at us?
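
The redirect logic described above could look roughly like this. It is a sketch that assumes the search phrase arrives in a `q` parameter of the Referer URL; the landing path `/house-designs-landing` and the function name are hypothetical, and the actual setup lives in server config files rather than Python.

```python
from urllib.parse import parse_qs, urlsplit

LANDING_PATH = "/house-designs-landing"  # hypothetical landing page URL


def landing_redirect(referrer: str):
    """Return (302, path) if the referring search was for "house designs".

    Returns None when no redirect applies, so the requested page is
    served as usual.
    """
    query = parse_qs(urlsplit(referrer).query)
    # parse_qs decodes "+" and %20 to spaces; "q" is assumed to hold the query.
    terms = " ".join(query.get("q", [])).lower()
    if "house designs" in terms:
        return (302, LANDING_PATH)  # 302 is a temporary redirect
    return None
```

Note that the decision depends only on the Referer header, so a crawler that sends no referrer always gets the original page.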