
robots.txt


gor...@gmail.com

Feb 12, 2006, 7:51:46 PM
Is the robots.txt file blocking all group and user sites under the OCF
from being crawled? Is there any way to change the robots.txt file so
that it prevents crawling only on the www.ocf.berkeley.edu site itself?

gor...@gmail.com

Feb 12, 2006, 7:55:28 PM
Here's what I was talking about:

At http://www.ocf.berkeley.edu/robots.txt, the file looks like this:

# /robots.txt for http://www.ocf.berkeley.edu/
# See <http://www.robotstxt.org/wc/norobots.html> for full details.

User-agent: *
Disallow: /cgi-bin/
Disallow: /OCF/contact.html
Disallow: /upgrade/
Disallow: /new/
Disallow: /old/
Disallow: /v3/


I've noticed that 'Disallow' is now limited to a few select directories.
Hopefully that will keep the robots.txt from applying a blanket
disallow policy to every single website hosted under the OCF.
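
One way to check this is to feed the rules quoted above to Python's
standard urllib.robotparser and ask it about specific paths. This is
only a rough sketch: the helper name is_allowed and the user-site path
/~someuser/ are hypothetical and used purely for illustration, and a
real crawler may interpret the rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt quoted in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /OCF/contact.html
Disallow: /upgrade/
Disallow: /new/
Disallow: /old/
Disallow: /v3/
"""

BASE_URL = "http://www.ocf.berkeley.edu"


def is_allowed(path, agent="*"):
    """Return True if the quoted rules let `agent` fetch `path`.

    Hypothetical helper for illustration; it parses the rules from the
    string above instead of fetching the live file.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, BASE_URL + path)


if __name__ == "__main__":
    # "/~someuser/index.html" is an assumed example of a user site path.
    for path in ("/~someuser/index.html", "/cgi-bin/test", "/OCF/contact.html"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Under these rules, only paths that begin with one of the listed
prefixes are refused, so a path outside those directories would still
be open to crawlers.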
