
unconscientious bots (networks)


Ivan Shmakov

Oct 11, 2011, 5:11:32 AM
>>>>> D Stussy <spam+ne...@bde-arc.ampr.org> writes:

[Cross-posting to news:comp.infosystems.www.servers.misc, for my
questions aren't specific to Apache.]

[…]

> Baiduspider does not respect "/robots.txt"

I've just checked that it occasionally tries to GET /robots.txt
from one of my HTTP servers. I'm yet to check whether it
respects it or not (I don't have one just now.)
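
One way to check this after the fact is to replay logged requests against the rules in a robots.txt. A minimal sketch, assuming a hypothetical robots.txt that disallows everything for Baiduspider and a hand-made list of (bot name, path) pairs in place of a real access log. Note that `can_fetch` matches on the short bot name, so the name has to be extracted from the full User-Agent header first.

```python
from urllib.robotparser import RobotFileParser


def violations(robots_txt, requests):
    """Return the (bot_name, path) pairs that the given robots.txt disallows.

    robots_txt: the text of a robots.txt file.
    requests:   iterable of (bot_name, path) pairs, e.g. taken from a log.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [(bot, path) for bot, path in requests
            if not parser.can_fetch(bot, path)]


# Hypothetical rules and log entries, for illustration only.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /
"""

LOGGED = [
    ("Baiduspider", "/index.html"),
    ("OtherBot", "/index.html"),
]

if __name__ == "__main__":
    for bot, path in violations(ROBOTS_TXT, LOGGED):
        print(f"{bot} fetched disallowed path {path}")
```

Any entry this reports is a request the bot made despite being told not to, which is the evidence needed before calling it unconscientious.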

> nor repeated 403's. I block its entire set of IP ranges in my
> firewall.

BTW, is there a kind of black list of such unconscientious bots
(networks)? Or a kind of DNSBL?

TIA.

--
FSF associate member #7257

D. Stussy

Oct 11, 2011, 2:49:33 PM
"Ivan Shmakov" <iv...@gray.siamics.net> wrote in message
news:86lisr3n...@gray.siamics.net...

> >>>>> D Stussy <spam+ne...@bde-arc.ampr.org> writes:
>
> [Cross-posting to news:comp.infosystems.www.servers.misc, for my
> questions aren't specific to Apache.]
>
> […]
>
> > Baiduspider does not respect "/robots.txt"
>
> I've just checked that it occasionally tries to GET /robots.txt
> from one of my HTTP servers. I'm yet to check whether it
> respects it or not (I don't have one just now.)
>
> > nor repeated 403's. I block its entire set of IP ranges in my
> > firewall.
>
> BTW, is there a kind of black list of such unconscientious bots
> (networks)? Or a kind of DNSBL?

No DNSBL that I'm aware of. There are a handful of web sites dedicated to
user-agent identification, but no list of malicious bots per se.
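
For reference, if such a list did exist as a DNSBL, a lookup would follow the usual convention: reverse the IPv4 octets, append the list's zone, and treat any A record as "listed". A minimal sketch; the zone name `dnsbl.example` is hypothetical and does not refer to any real list.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the query name resolves, i.e. the address is listed.

    Needs working DNS; a failed lookup is treated as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address; the zone is a placeholder.
    print(dnsbl_query_name("192.0.2.1", "dnsbl.example"))
```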

