I have quite a few 404 errors listed under Diagnostics > Web crawl >
Not found for one of my sites. Nearly all of them are for relative links
that have been listed incorrectly as having 404 errors.
For example, let's say a full URL is www.mysite.com/abc/xyz/file.html.
If the relative link is ../../abc/xyz/file.html in the abc/xyz
sub-directory, Web crawl sometimes sees the link as
www.mysite.com/abc/abc/xyz/file.html (note the duplicate abc
sub-directory) and lists it incorrectly as a 404 error. (Yes, the
relative link could be simply file.html, but this is how a tool I am
using inserts the links -- and the links are valid.)
Similarly, say a full URL is www.mysite.com/mno/file.html. If the
relative link is ../../mno/file.html in the abc/xyz sub-directory, Web
crawl sometimes sees the link as www.mysite.com/abc/mno/file.html
(note the extra abc sub-directory) and lists it incorrectly as a 404
error.
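To make the two examples above concrete, here is a minimal sketch of how the relative links resolve under the standard rules (RFC 3986), using Python's urljoin. The http:// scheme and the page name page.html are assumptions for illustration; only the directory layout comes from the post. The last two lines show one way the reported "extra abc" URLs could arise, namely if the linking page is fetched at a URL one path segment deeper (for example with a trailing slash). That is only a possibility, not something confirmed here.

```python
from urllib.parse import urljoin

# Hypothetical page in the abc/xyz sub-directory (page name and scheme assumed).
PAGE = "http://www.mysite.com/abc/xyz/page.html"


def resolve(base: str, href: str) -> str:
    """Resolve a relative href against the URL of the page containing it."""
    return urljoin(base, href)


# Standard resolution: both ".." segments are applied.
expected_1 = resolve(PAGE, "../../abc/xyz/file.html")
expected_2 = resolve(PAGE, "../../mno/file.html")

# Same hrefs, but the base URL is one segment deeper (trailing slash added).
DEEPER = PAGE + "/"
shifted_1 = resolve(DEEPER, "../../abc/xyz/file.html")
shifted_2 = resolve(DEEPER, "../../mno/file.html")

if __name__ == "__main__":
    for url in (expected_1, expected_2, shifted_1, shifted_2):
        print(url)
```

If the "Linked From" URLs in the report differ from the real page URLs by a trailing slash or an extra path segment, that would be worth checking first.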
Web crawl also appears to pick up "extra" sub-directories in other
cases, but I have not discovered the pattern.
The site where these diagnostics can be found is www.service-architecture.com.
These "errors" were all detected in October.
These 404 errors can be coming from anywhere
on the web and not just your site. They could be
faulty inbound links gathered by some of the bots
used by other sites.
To make sure these are not links generated
by your site you could use a link checking tool
like Xenu Link Sleuth
home.snafu.de/tilman/xenulink.html
You could also do an inurl: Google search
(or just a regular search) using those incorrect
links to try to determine where they are coming from.
If they are found on other sites, notify the webmaster so
they might correct the links.
If they are internal, then fix them.
If they are external, you can either ignore them or,
if you think it is worthwhile, 301 redirect them to
the most appropriate page.
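As a rough illustration of the internal/external check described above, here is a minimal sketch that pulls the links out of a page's HTML, resolves them against the page URL, and sorts them by host. It is not how Xenu works; the class and function names, the sample HTML, and the page URL are all made up for the example, and it does not fetch anything.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def classify_links(page_url, html):
    """Return (internal, external) lists of absolute URLs found in html."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlsplit(page_url).netloc
    internal, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlsplit(absolute).netloc == host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external


# Hypothetical page content and URL, for illustration only.
SAMPLE_HTML = (
    '<a href="../../mno/file.html">local</a> '
    '<a href="http://other.example/x.html">elsewhere</a>'
)
internal, external = classify_links(
    "http://www.mysite.com/abc/xyz/page.html", SAMPLE_HTML
)
```

A real checker would then request each internal URL and record the HTTP status; that part is left out here.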
If the Diagnostics > Web crawl page is accurate, then it shows that
the false errors come from "Linked From" pages on my site and not
from anywhere else on the Web. I spot-checked the pages listed under
"Linked From" and they have relative links that reference the given
files in the various "erroneous" URLs. So, unless the "Linked From"
column means something else, I think the defect still stands.
> Hope that helps,
> Abracadabra
> On Oct 24, 10:22 am, DougBarry wrote:
Run Xenu - and it will pick out all the broken links. You will be
astonished to find the same ones reported by Google.
If you have links like this: <a href="/../blahblah.html">Blahblah</a>
or <a href="../blahblah.html">Blah</a> these will show up as broken. A
browser can figure them out but a robot cannot.
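For reference, here is a small sketch of how those two link forms resolve under the standard rules (RFC 3986), using Python's urljoin (Python 3.5 or later is assumed, since earlier versions handled leading "/.." differently). The page URL is hypothetical. A crawler that follows the same rules should arrive at the same URLs a browser does, so whether a given link shows up as broken depends on what lives at the resolved address.

```python
from urllib.parse import urljoin

# Hypothetical page URL, one directory below the site root.
BASE = "http://www.example.com/docs/page.html"


def resolve_hrefs(base, hrefs):
    """Map each href to the absolute URL it resolves to from base."""
    return {href: urljoin(base, href) for href in hrefs}


resolved = resolve_hrefs(BASE, ["../blahblah.html", "/../blahblah.html"])

if __name__ == "__main__":
    for href, url in resolved.items():
        print(href, "->", url)
```

Both forms end up at /blahblah.html here: the first climbs one directory, and in the second the ".." above the root is simply dropped.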
> Thanks,
> Doug
> On Oct 24, 9:50 am, Tim Abracadabra wrote:
> > Hi DougBarry and welcome.
> On Oct 24, 12:23 pm, DougBarry wrote:
> > Abracadabra,