At http://www.ocf.berkeley.edu/robots.txt, the file looks like this:
# /robots.txt for http://www.ocf.berkeley.edu/
# See <http://www.robotstxt.org/wc/norobots.html> for full details.
User-agent: *
Disallow: /cgi-bin/
Disallow: /OCF/contact.html
Disallow: /upgrade/
Disallow: /new/
Disallow: /old/
Disallow: /v3/
I've noticed that the Disallow rules are now limited to specific
directories. Hopefully that will keep robots.txt from applying a
blanket disallow policy to every website hosted under the OCF.
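One way to check how a crawler would read these rules is to feed them to
Python's standard urllib.robotparser. This is only a rough sketch: the rules
are copied from the file above, but the sample paths (including the
/~someuser/ page) are made up for illustration, and real crawlers may
interpret robots.txt slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt quoted above.
ROBOTS_TXT = """\
# /robots.txt for http://www.ocf.berkeley.edu/
User-agent: *
Disallow: /cgi-bin/
Disallow: /OCF/contact.html
Disallow: /upgrade/
Disallow: /new/
Disallow: /old/
Disallow: /v3/
"""

BASE = "http://www.ocf.berkeley.edu"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="*"):
    """Return True if `agent` may fetch BASE + path under the rules above."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for path in ["/cgi-bin/test.cgi", "/OCF/contact.html",
                 "/old/index.html", "/~someuser/index.html"]:
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

Since each Disallow line is matched as a path prefix, a page outside the
listed directories (such as the made-up /~someuser/ one) should come back
as allowed.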