Without any technical reason (as far as I know), Webmaster Tools (Web
crawl errors > Not found) reports 10,379 errors for my site http://lamundial.net
I've checked the server error log but found no clues. These non-existent
URLs have a strange pattern that makes me suspect a glitch in
Googlebot, but everything together is quite odd.
PR has dropped from 6 to 4. Could this be the reason?
Has anyone had a similar experience?
Thanks
It is difficult to form an opinion without seeing some of the
not-found URLs.
Did you look at the Links and
the 'What Googlebot sees' pages in Google Webmaster Tools,
to check that all is OK there?
Check, just in case, whether your site was hacked.
If you can, block the URLs in the robots.txt file,
if many of them start in the same way and
differently from good URLs.
I get HTTP status response 200 (OK)
(after redirection to the not-found page)
for nonexistent URLs from your site;
you should have 404 (Not Found).
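A quick way to see the difference between a real 404 and the 200-after-redirect case described above is to request a URL that should not exist, follow any redirects by hand, and record each status code. The following is only a minimal Python sketch using the standard library; the function names are made up for illustration and nothing here is specific to this site.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECTS = (301, 302, 303, 307, 308)

def fetch_status_chain(url, max_hops=5):
    """Request url, follow redirects manually, return the status codes seen."""
    chain = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)
        conn = conn_cls(parts.netloc, timeout=10)
        try:
            target = parts.path or "/"
            if parts.query:
                target += "?" + parts.query
            conn.request("GET", target)
            response = conn.getresponse()
            status = response.status
            location = response.getheader("Location")
        finally:
            conn.close()
        chain.append(status)
        if status in REDIRECTS and location:
            url = urljoin(url, location)  # the Location header may be relative
        else:
            break
    return chain

def classify_missing_url(chain):
    """Label the status chain observed for a URL that should not exist."""
    final = chain[-1]
    if final in (404, 410):
        return "hard 404" if len(chain) == 1 else "404 after redirect"
    if final == 200:
        return "soft 404"  # the page says "not found" but the status says OK
    return "other"
```

Calling `classify_missing_url(fetch_status_chain(...))` on a deliberately wrong address of your own site shows which case applies; a chain such as 302 followed by 200 is the "soft 404" situation, while a single 404 is what a crawler expects.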
> > On May 9, 8:45 am, Dynamical.Biz wrote:
> > > Without any technical reason (as far as I know), Webmaster Tools (Web
> > > crawl errors > Not found) reports 10,379 errors for my site http://lamundial.net
> > > I've checked the server error log but found no clues. These non-existent
> > > URLs have a strange pattern that makes me suspect a glitch in
> > > Googlebot, but everything together is quite odd.
> > > PR has dropped from 6 to 4. Could this be the reason?
> > > Has anyone had a similar experience?
> > > Thanks
There are two possible reasons I can think of for so many not-found
URLs happening:
1- Your site used to have a different URL structure where those URLs
were valid (as a result of a redesign or a hack), and there are broken
links on your site or other sites to these pages. These might be
worth correcting. (see B and C below)
2- Sometimes authors type URL's wrong, spammers generate buggy URLs,
or any number of other things which produce dead-end links that never
did have any good content. These you can usually safely ignore.
Here are some things to do:
A- It looks like Cristina just beat me to this (and point C,
congrats!), but you should redo your 404's. Currently, when I go to
those "broken" links that "redirect to a 404" I don't actually get a
'404 Not Found'. I am getting a 302 redirect to 404.php, which
returns a '200 OK'. Perhaps one of the .htaccess wizards in the group
can advise you on how to configure your server to return a proper 404
header instead, while serving the body of 404.php as the message
content.
B- Take a good look around your site for broken links to bad pages.
Other folks around here have suggested Xenu as a useful tool for this,
but I cannot personally (or officially, as a Googler) endorse it, as
I've never used it.
C- If you notice that most of the nonexistent URL's can be matched to
a few simple patterns, you may want to add those patterns to your
robots.txt file. That way, Googlebot won't even ask for them--you can
save both your server and our crawler precious time and bandwidth. :-)
As for your PR, I would advise you not to worry much about the green
pixels. They're a rough approximation of one of hundreds of factors
we use when ranking sites. Not to mention that the number is usually
only updated every few weeks or months, so it's not guaranteed to be
up to date. If you've experienced a sudden drop in green-pixel PR, it
may be worth going over the Google Webmaster Guidelines to make sure
you're not in violation. Just remember that PR changes over time just
like the web changes, so all sites can expect some fluctuation.
Let us know how it goes--if you'd like advice or explanation of
anything I've said (or for any other responses that show up here),
keep on asking. We like questions around here. Especially good, hard
ones.
-Bergy
> > they are redirected to a 404 error page, obviously
> > thanks Cristina for your interest
> > On May 9, 13:57, cristina wrote:
> > > It is difficult to have an opinion without some of the
> > > not-found URLs.
> > > Did you look at the Links and
> > > the 'What Googlebot sees' pages in Google Webmaster Tools,
> > > to check that all is OK there.
> > > Cristina.
Hi Bergy, thanks a lot for your help.
I'll reply point by point.
On May 9, 23:58, Berghausen wrote:
> Hi Dynamical-
> There are two possible reasons I can think of for so many not-found
> URLs happening:
> 1- Your site used to have a different URL structure where those URLs
> were valid (as a result of a redesign or a hack), and there are broken
> links on your site or other sites to these pages. These might be
> worth correcting. (see B and C below)
Neither of these two options applies; I checked as soon as they appeared.
> 2- Sometimes authors type URL's wrong, spammers generate buggy URLs,
> or any number of other things which produce dead-end links that never
> did have any good content. These you can usually safely ignore.
Yes, in the error log I can see some wrong URLs, but not the kind that
Webmaster Tools is showing, and that is strange.
> Here are some things to do:
> A- It looks like Cristina just beat me to this (and point C,
> congrats!), but you should redo your 404's. Currently, when I go to
> those "broken" links that "redirect to a 404" I don't actually get a
> '404 Not Found'. I am getting a 302 redirect to 404.php, which
> returns a '200 OK'.
What the .htaccess does now is this: instead of showing the typical Apache
404 Not Found page, visitors are redirected to a custom error page, to try
to keep as much traffic inside the site as possible by giving some other
navigation options.
> Perhaps one of the .htaccess wizards in the group
> can advise you on how to configure your server to return a proper 404
> header instead, while serving the body of 404.php as the message
> content.
> B- Take a good look around your site for broken links to bad pages.
> Other folks around here have suggested Xenu as a useful tool for this,
> but I cannot personally (or officially, as a Googler) endorse it, as
> I've never used it.
The Xenu report says everything is OK; I checked.
> C- If you notice that most of the nonexistent URL's can be matched to
> a few simple patterns, you may want to add those patterns to your
> robots.txt file. That way, Googlebot won't even ask for them--you can
> save both your server and our crawler precious time and bandwidth. :-)
OK, I'll add "Disallow: /home*$" and see how it goes.
> As for your PR, I would advise you not to worry much about the green
> pixels. They're a rough approximation of one of hundreds of factors
> we use when ranking sites. Not to mention that the number is usually
> only updated every few weeks or months, so it's not guaranteed to be
> up to date. If you've experienced a sudden drop in green-pixel PR, it
> may be worth going over the Google Webmaster Guidelines to make sure
> you're not in violation. Just remember that PR changes over time just
> like the web changes, so all sites can expect some fluctuation.
Yes, I know, but it is a strange coincidence, isn't it? The 10,379 broken
URLs appear and PR goes down 2 points.
> Let us know how it goes--if you'd like advice or explanation of
> anything I've said (or for any other responses that show up here),
> keep on asking. We like questions around here. Especially good, hard
> ones.
Thanks, Bergy. This is a personal website, so it should not be considered a
'dead or alive' thing, but problems this strange are very
interesting for a professional SEO.
I'll keep you updated.
Regards
What the above does is, in case of a 404, issue a 302 redirect to
the specified URL (because it is a fully qualified URL).
This should be replaced by:
ErrorDocument 404 /404.php
Since the above is a URL relative to the root, there is no redirection
involved; the content of the error page will be shown with the 404
response code.
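To spot this kind of directive in an existing configuration, a small script can list every ErrorDocument line whose target is a full URL, since those are the ones Apache answers with a redirect. This is only a sketch in Python; the function name and the sample .htaccess content (which uses example.com, not the real site) are made up for illustration.

```python
import re

# Matches "ErrorDocument <code> <target>"; the target may be a local path,
# a full URL, or a quoted message.
_ERROR_DOCUMENT = re.compile(r"^\s*ErrorDocument\s+(\d{3})\s+(\S.*?)\s*$",
                             re.IGNORECASE)

def redirecting_error_documents(htaccess_text):
    """Return (status, target) pairs whose target is a fully qualified URL."""
    flagged = []
    for line in htaccess_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = _ERROR_DOCUMENT.match(line)
        if match and re.match(r"https?://", match.group(2), re.IGNORECASE):
            flagged.append((int(match.group(1)), match.group(2)))
    return flagged

# Hypothetical .htaccess content, for illustration only.
SAMPLE = """\
# custom error pages
ErrorDocument 404 http://example.com/404.php
ErrorDocument 403 /403.php
"""

print(redirecting_error_documents(SAMPLE))
```

In the sample, only the 404 line is reported, because its target starts with http://; the 403 line uses a local path and is left alone.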
> This will 301 redirect all those bad urls to your homepage.
I'll test that one.
In any case, what I want you all to keep in mind is that these
10,379 URLs do not exist anywhere except somewhere in "google
indexing system memory".
I'm beginning to be quite sure this is a Google indexing problem. Why?
The wrong URLs are not broken internal/external links.
- I'm not a PR neurotic, but I try to ensure a good user experience;
this is why I did some .htaccess redirections, just in case.
- 10,379 wrong URLs appearing in Webmaster Tools and PR going down 2
points: not a coincidence?
They do not appear in the server error log.
- If they really were broken links, the server error log would have
shown them long ago, before Google could discover them. None of
these 10,379 are in the server error log!
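One way to double-check this is to search the raw access log, not only the error log, for the paths reported in Webmaster Tools and see which user agents, if any, ever requested them. The sketch below is Python and assumes the Apache "combined" log format; the function name and the sample log lines are invented for illustration.

```python
import re
from collections import defaultdict

# Assumes Apache "combined" format:
# host ident user [time] "METHOD path proto" status bytes "referer" "agent"
_LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def requests_for(reported_paths, log_lines):
    """Map each reported path found in the log to (status, user agent) pairs."""
    wanted = set(reported_paths)
    hits = defaultdict(list)
    for line in log_lines:
        match = _LOG_LINE.search(line)
        if match and match.group("path") in wanted:
            hits[match.group("path")].append(
                (int(match.group("status")), match.group("agent")))
    return dict(hits)

# Made-up log lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [09/May/2008:08:45:00 +0200] "GET /home-abc HTTP/1.1" '
    '302 - "-" "ExampleBot/1.0"',
    '192.0.2.2 - - [09/May/2008:08:46:00 +0200] "GET /index.php HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0"',
]

print(requests_for(["/home-abc", "/home-def"], SAMPLE_LOG))
```

If the reported paths show up only with a single crawler's user agent and never with any other visitor or bot, that would be consistent with the URLs having been generated on the crawler's side rather than linked from anywhere.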
Not long ago, Webmaster Tools was not working well for this account;
now it works better, but maybe there is still some glitch.
As I said, I'll wait several days until the website is reindexed
and see if the robots.txt exclusion has some positive effect.
Otherwise I'll ask for reconsideration for the PR thing and whether there
is any way to delete these URLs from Webmaster Tools on Google's side.
Yup. I first observed this in 2006 and I've seen it several times
since. Search Usenet on "googlebot active imagination".
I warn you - you're ploughing a lone furrow. Everyone seems to
believe Google is infallible but if you've the kind of experience I
have (forty years now) you'll recognise that Google is not very good
at coding and even worse at testing.
Anyway, the basic test is blindingly obvious. If such links existed
ANYWHERE the other bots would come looking for them and they don't.
It usually looks like the Googlebot has concatenated your domain name
with a bunch of relative URLs from some other random site. One
characteristic is that all these URLs appear in the reports in one pass
- another is that they'll never appear again - which, of course, they
would if they were the result of some misconfiguration either on your
site or somewhere else.
You wouldn't BELIEVE how reluctant otherwise rational people are to
accept this is a Googlebug. The infallibility of the Googlebot is
part of their catechism.
> Yup. I first observed this in 2006 and I've seen it several times
> since. Search Usenet on "googlebot active imagination".
> I warn you - you're ploughing a lone furrow. Everyone seems to
> believe Google is infallible but if you've the kind of experience I
> have (forty years now) you'll recognise that Google is not very good
> at coding and even worse at testing.
I don't believe Google is infallible, nothing is, but that is hard to
demonstrate.
> Anyway, the basic test is blindingly obvious. If such links existed
> ANYWHERE the other bots would come looking for them and they don't.
thanks
> It usually looks like the Googlebot has concatenated your domain name
> with a bunch of relative URLs from some other random site. One
> characteristic is that all these URLs appear in the reports in one pass
> - another is that they'll never appear again - which, of course, they
> would if they were the result of some misconfiguration either on your
> site or somewhere else.
Completely right; the pattern looks like some crazy concatenation of
pieces of URLs.
If this were a site misconfiguration I would have noticed it long ago,
but I haven't.
> You wouldn't BELIEVE how reluctant otherwise rational people are to
> accept this is a Googlebug. The infallibility of the Googlebot is
> part of their catechism.
Another way to return the HTTP status response
404 (Not Found) for non-existent URLs of the
site http://lamundial.net is to use the PHP header function
in the 404.php error page
http://lamundial.net/ 404.php
(I added a space in the URL so that it is not followed as a link)
Nonexistent URLs are redirected to 404.php,
which returns HTTP status 200 (OK);
change that so that 404.php returns HTTP status
response 404 (Not Found).
I now get the correct response 404 (Not Found)
for non-existent URLs from your site,
including for the spammy URLs, so it looks OK.
I think you should continue to check whether there was
some hacking, just in case:
look at the 'What Googlebot sees'
page in Google Webmaster Tools, and at
the cached copy as text from search results.
> On May 12, 1:41 pm, Dynamical.Biz wrote:
> > Thanks Cristina, but let's forget the error handling:
> > the first line is a regular personalized 404 error page;
> > the second one is to capture the strange URLs, if any, so I'll delete it as
> > soon as I reach a conclusion
Googlebot came to lamundial.net on 26/05/2008.
Now there are only 8 "404 errors";
10346 URLs are restricted by robots.txt using this not-so-standard but working rule:
"Disallow: /home*$"
Not a single comma has changed in the website.
Maybe I'm going to remove the "Disallow: /home*$" to see if this
Googlebot error comes again.
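The `*` and `$` in that rule are extensions to the original robots.txt convention that Googlebot understands: `*` stands for any sequence of characters and `$` anchors the end of the URL. Below is a rough Python sketch of that matching, written only to illustrate the rule; it is not Google's implementation, and the function names and test paths are made up.

```python
import re

def robots_pattern_to_regex(pattern):
    """Translate a Disallow pattern with * and a trailing $ into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each * match any characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))

def is_disallowed(path, pattern):
    """True if the URL path is matched by the given Disallow pattern."""
    return robots_pattern_to_regex(pattern).match(path) is not None

for path in ["/home-xyz/abc", "/home", "/musica/"]:
    print(path, is_disallowed(path, "/home*$"))
```

Under these rules "/home*$" blocks every path that starts with /home, which is the same set a plain "Disallow: /home" prefix rule would block, so the wildcard and the end anchor add nothing in this particular case.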
PR matters
In the meantime, the PR downgrade from 6 to 4 is making my website lose
organic traffic, and the sites gaining positions are not so relevant: the
first one for "musica copyleft" is just a bunch of copy-and-paste
articles, while we write every single line of our posts. That's why we
are a reference.
On the other hand, we want to get sponsors to finance our next copyleft
musical project, and the loss of visibility is not a good thing right
now.
I removed the "Disallow: /home*$" from the robots.txt to see if these
strange 10,000 wrong URLs could happen again.
Since then Googlebot has come 2 more times to my site and I have not
changed a single "," in the site, so for me the conclusion is clear
right now:
Googlebot got confused by itself and generated these 10,000 wrong URLs
that never existed.
What is real for me is the consequence of it: worse sites than mine
ranking better, but this could be another chapter.
Since then Googlebot has come 2 more times to my site, I have not changed a
single "," in it, and the errors are not appearing again, so for me the
conclusion is clear.