A page like this has no business being crawled and indexed:
http://www.rare-cancer.org/dictionary/index.php/viewpage/Feedback+%252F+Suggest+A+Word.xhtml
So you either add a noindex robots meta tag to it or disallow it in
the robots.txt file.
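
If you go the robots.txt route, here is a rough sketch of how you could check a rule before deploying it, using Python's standard urllib.robotparser. The Disallow line and the is_crawlable helper are my own illustration, not something taken from your site, so adjust the path prefix to whatever you actually want blocked.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path prefix comes from the URL above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dictionary/index.php/viewpage/Feedback
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given robots.txt text allows `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

Keep in mind that a disallow rule only stops crawling; the noindex meta tag is the option that asks search engines to drop a page they can still fetch.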
No idea why the css.php file is picked up as a url to crawl and
include, but it needs filtering out.
There's at least one broken link:
URL: http://www.rare-cancer.org/dictionary/mai
Error: HTTP-Error 404 Not Found
Linked from: http://www.rare-cancer.org/dictionary/
You should avoid linking to urls ending in /index.php (i.e. with
no query string after it). Each one is a duplicate of the same url ending
in / without the index.php.
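
To illustrate the rule, a minimal sketch of the rewrite your link-generating code could apply. The strip_index_php name is made up for this example, and it assumes the only duplicates are urls whose path ends in exactly /index.php with no query string.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_index_php(url: str) -> str:
    """Drop a trailing index.php when no query string follows it."""
    parts = urlsplit(url)
    if parts.path.endswith("/index.php") and not parts.query:
        # Keep the trailing slash, remove only the file name.
        parts = parts._replace(path=parts.path[: -len("index.php")])
    return urlunsplit(parts)
```

Urls such as /dictionary/index.php/viewpage/... are left alone, since their path does not end in /index.php.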
The crawling is quite slow. That might be due to a slow server (even if
it's called Lightspeed), a slow database, or slow url rewriting.
Thanks and Take Care, Sharon
Another thing you should do is manage the canonical preference: www vs
non-www urls. Currently they both respond with 200.
One set should respond that way; the other should be redirected to the
preferred form.
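
The logic behind that redirect, sketched in Python just to show the decision being made. It assumes the www form is the preferred one (pick whichever you like), and redirect_target is a hypothetical helper, not part of any server; on a real site this would normally live in the server's rewrite rules.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www form is the preferred one.
PREFERRED_HOST = "www.rare-cancer.org"


def redirect_target(url: str, preferred_host: str = PREFERRED_HOST) -> Optional[str]:
    """Return the url to 301-redirect to, or None if `url` should answer 200."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    bare = preferred_host.removeprefix("www.")  # needs Python 3.9+
    if host == preferred_host:
        return None  # already the preferred form
    if host in (bare, "www." + bare):
        # Same site, other host form: keep path and query, swap the host.
        return urlunsplit(parts._replace(netloc=preferred_host))
    return None  # some other host, not our concern
```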
See here how to do it on an Apache server (includes Lightspeed):
> > Add it there under the Settings > General tab and re-crawl.