Message from discussion 6 months since we mostly dropped out of the search index
JohnMu Google employee  
From: JohnMu
Date: Mon, 23 Jun 2008 01:53:22 -0700 (PDT)
Local: Mon, Jun 23 2008 4:53 am
Subject: Re: 6 months since we mostly dropped out of the search index
Hi kscope and welcome to the groups!

Looking at the pages indexed, I see two issues which Autocrat has
already mentioned (thanks, Autocrat!) that I'd like to expand on:

1. Long & crazy URLs
I ran into this URL while looking at [site:gadgetguy.com.au] :
http://www.gadgetguy.com.au/small-kitchen-appliances-toaster-kettle-c...

Now I'm all for having descriptive URLs, but this seems to be taking
it a bit too far, and I have trouble finding anything in the content
of your page that matches it.

The problem with URLs like this is that they almost appear to be
random and in fact I can get exactly the same page by using something
like: http://www.gadgetguy.com.au/xyzzy-42.html . In general, you
should make sure that you have only one URL that leads to your content
-- all others should either redirect to the proper URL or return HTTP
status code 404 to signal that the URL is invalid. Without that, your
site is leading us (and all other crawlers) on a wild goose chase.
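
To make the "one URL per piece of content" rule concrete, here is a
minimal sketch of the decision a server could make for each request.
It assumes URLs of the form /<any-slug>-<id>.html, which is only a
guess based on the xyzzy-42.html example; the CANONICAL_PATHS table,
the sample path in it, and the resolve() function are hypothetical
names made up for illustration, not anything your CMS actually has.

```python
import re

# Hypothetical catalogue: content ID -> the single canonical path.
# The path below is a made-up example, not a real URL from the site.
CANONICAL_PATHS = {
    42: "/small-kitchen-toaster-42.html",
}

# Assumed URL shape: "/<optional-slug>-<numeric id>.html"
URL_PATTERN = re.compile(r"^/(?:[a-z0-9-]+-)?(\d+)\.html$")


def resolve(path):
    """Return (status_code, redirect_target) for a requested path."""
    match = URL_PATTERN.match(path)
    if match is None:
        # The path does not look like a content URL at all.
        return 404, None

    canonical = CANONICAL_PATHS.get(int(match.group(1)))
    if canonical is None:
        # Well-formed URL, but no content with that ID exists.
        return 404, None

    if path == canonical:
        # The one proper URL: serve the page.
        return 200, None

    # Any other slug pointing at the same ID: redirect to the proper URL.
    return 301, canonical
```

With this, a request for /xyzzy-42.html would be redirected to the
canonical path instead of serving a second copy of the same page, and
an unknown ID would get a 404.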

If your CMS is not able to handle this properly (one URL per piece of
content), I would recommend not using rewritten URLs so that we can
recognize and skip over unimportant parameters in your URL query
string.
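
As a rough illustration of why plain query strings are easier to deal
with, the sketch below drops parameters that do not select content and
keeps the rest in a stable order. This is not a description of how any
crawler actually works; the CONTENT_PARAMS set, the parameter names,
and the example.com URLs are all assumptions chosen for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these parameters decide which content is shown.
CONTENT_PARAMS = {"id"}


def normalize(url):
    """Drop unimportant query parameters and sort the remaining ones."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key in CONTENT_PARAMS
    )
    # Rebuild the URL without the fragment and without the dropped parameters.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
```

Two URLs that differ only in, say, a session or sort parameter then
normalize to the same string, which is much harder to achieve when
everything is folded into a rewritten path.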

2. Broken HTML code
In general, we try to get it right regardless of what webmasters use
on their pages. However, there are limits to what we can guess at.
Although this is definitely not as important as the first point, you
can see this happening when you search for something like:
http://www.google.com/search?q=site:www.gadgetguy.com.au+intitle:shor...

Hope it helps!
John


 