I think you should disallow /cgi-bin/ again in the robots.txt of the
subdomain search.japantimes.co.jp and request removal of the folder
from the index.
It must have been indexed accidentally while it was not disallowed, and
disallowing it later only resulted in there being no cache for it.
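To check what a given robots.txt would actually block before requesting removal, something like the following might help. It is only a rough sketch using Python's standard urllib.robotparser; the robots.txt body and the example URLs are my own assumptions, not the site's real file or real story URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body (an assumption, not the site's real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules disallow `url` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Example URLs are made up for illustration only.
cgi_url = "http://search.japantimes.co.jp/cgi-bin/example.html"
rss_url = "http://search.japantimes.co.jp/rss/example.html"

print(cgi_url, "blocked:", is_blocked(ROBOTS_TXT, cgi_url))
print(rss_url, "blocked:", is_blocked(ROBOTS_TXT, rss_url))
```

Note that this only tells you what the rules say; it does not tell you what is already in the index.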
But also that page has this meta tag:
<meta name="robots" content="noarchive,follow,index"/>
This tells crawlers to index the page but not to keep a cached copy. So
even if you make any changes to the title, they would not be reflected,
because no cached copy is saved for it. At least this is how I see it.
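If you want to see which robots directives a page declares, here is a minimal sketch using Python's standard html.parser. The class name RobotsMetaParser is just something I made up for illustration; the input is the meta tag quoted above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here by default.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<meta name="robots" content="noarchive,follow,index"/>'
found = robots_directives(page)
print(sorted(found))
print("noarchive" in found)
```

For the tag above this should report noarchive alongside index and follow, which matches the missing cached copy.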
On Aug 6, 11:15 pm, katamari wrote:
> Thanks for the reply, Thu.
> The site is japantimes.co.jp
> Here is an example of a malformed search result (Headline: "A-bombings
> couldn't be helped: Kyuma" comes out "a bombings couldn t be helped
> kyuma")
> http://www.google.com/search?hl=en&safe=off&client=firefox-a&rls=org....
> We previously disallowed the cgi-bin directory in our robots.txt but
> it is allowed now.
> Thanks.
> On Aug 7, 8:20 am, Thu Tu wrote:
> > Hi katamari,
> > Would you mind sharing the urls of your news site and of the search
> > results page(s)? It might help us to see what is going on. When you
> > asked about robots.txt, did you mean that you have already disallowed
> > bots from cgi-bin?
> > Regards,
> > Thu
> > On Aug 5, 11:45 pm, katamari wrote:
> > > In the past month or so, certain story headlines from our news site
> > > have been showing up in the Google index all lower case, minus
> > > punctuation and minus snippets.
> > > This seems to happen only with stories that have /cgi-bin/ in the URL.
> > > We have mail and RSS versions of the same stories (with /rss/ and
> > > /mail/ in the URL) and these are showing up in the search results
> > > fine.
> > > Would disallowing the cgi-bin in the robots.txt cause this problem?