I manage a site (http://www.bridge.ids.ac.uk/) which hosts numerous
documents (mainly in *.pdf and *.doc format) linked from flat HTML
pages.
For example: the following document was put online at the beginning of
July:
GENDER and INDICATORS
http://www.bridge.ids.ac.uk/reports/IndicatorsORfinal.pdf
It is linked to from this page:
http://www.bridge.ids.ac.uk/reports_gend_CEP.html#Indicators
If I search for "Gender and Indicators", Google lists numerous results
from other sites which discuss the document and link to it on our
server, but does not list the resource on www.bridge.ids.ac.uk itself.
I recently successfully submitted a partial *.txt sitemap
(http://www.bridge.ids.ac.uk/070709_bridge_sitemap.txt), which includes
the URL of this resource, to see if that would improve the situation.
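In case it helps anyone reproduce the problem, here is a minimal Python sketch of an offline sanity check on a text sitemap like mine. It flags entries that sit on a different host from the sitemap, or that a robots.txt rule would block for Googlebot. The function names are my own, and the robots.txt rules and the second and third URLs in the example are invented for illustration; they are not the real contents of my files.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def parse_text_sitemap(text):
    """Return the URLs in a plain-text sitemap (one URL per line)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def sitemap_problems(urls, sitemap_url, robots_txt, user_agent="Googlebot"):
    """Return (url, reason) pairs for entries a crawler might skip."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())  # parse rules without any network access
    site = urlparse(sitemap_url)
    problems = []
    for url in urls:
        parts = urlparse(url)
        if (parts.scheme, parts.netloc) != (site.scheme, site.netloc):
            problems.append((url, "different host from sitemap"))
        elif not robots.can_fetch(user_agent, url):
            problems.append((url, "blocked by robots.txt"))
    return problems


if __name__ == "__main__":
    # Hypothetical inputs: only the first URL comes from my actual sitemap.
    sitemap_text = (
        "http://www.bridge.ids.ac.uk/reports/IndicatorsORfinal.pdf\n"
        "http://www.bridge.ids.ac.uk/private/draft.pdf\n"
        "http://example.org/other.pdf\n"
    )
    robots_txt = "User-agent: *\nDisallow: /private/\n"  # invented rules
    sitemap_url = "http://www.bridge.ids.ac.uk/070709_bridge_sitemap.txt"
    for url, reason in sitemap_problems(
        parse_text_sitemap(sitemap_text), sitemap_url, robots_txt
    ):
        print(url, "->", reason)
```

This only rules out two simple causes. It does not tell me whether Google has actually crawled or indexed the file.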
However, Google still cannot find either internal or external links to
the page
http://www.bridge.ids.ac.uk/reports/IndicatorsORfinal.pdf
(I checked using the Links tool).
I know from my own server-based webstats that this PDF has been
requested at least 500 times in the last 3 weeks, but I am mystified as
to why it isn't being found by Google. I have many older *.pdf and
*.doc documents on the site which are returned in search results.
However, the more recent ones seem to struggle to be recognised.
Can anyone suggest an explanation for this, and offer thoughts on
anything I can do to get Google to crawl this page?
Thanks