unconscientious bots (networks)

Ivan Shmakov

Oct 11, 2011, 5:11:32 AM
>>>>> D Stussy <spam+ne...@bde-arc.ampr.org> writes:

[Cross-posting to news:comp.infosystems.www.servers.misc, for my
questions aren't specific to Apache.]

[…]

> Baiduspider does not respect "/robots.txt"

I've just checked that it occasionally tries to GET /robots.txt
from one of my HTTP servers. I'm yet to check whether it
respects it or not (I don't have one just now.)
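
One way to run that check, as a minimal sketch: scan the access log for requests from the bot and compare them against the rules in robots.txt. The Apache "combined" log format, the helper name audit_bot, and the sample rules are assumptions for illustration, not anything stated in this thread.

```python
import re
from urllib.robotparser import RobotFileParser

# Request path and User-Agent from an Apache "combined" log line
# (assumed format; adjust the pattern for other log layouts).
LOG_RE = re.compile(
    r'"(?:GET|HEAD|POST) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)

def audit_bot(log_lines, robots_lines, bot="Baiduspider"):
    """Return (fetched_robots_txt, disallowed paths the bot requested).

    audit_bot is a hypothetical helper name used only for this sketch.
    """
    rules = RobotFileParser()
    rules.parse(robots_lines)  # parse rules from memory, no network access
    fetched, violations = False, []
    for line in log_lines:
        m = LOG_RE.search(line)
        # Skip unparseable lines and requests from other user agents.
        if not m or bot.lower() not in m.group(2).lower():
            continue
        path = m.group(1)
        if path == "/robots.txt":
            fetched = True
        elif not rules.can_fetch(bot, path):
            violations.append(path)
    return fetched, violations
```

A non-empty violations list only means something much after the bot has had a chance to fetch the current robots.txt, so the timestamps are worth comparing as well.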

> nor repeated 403's. I block its entire set of IP ranges in my
> firewall.

BTW, is there a kind of black list of such unconscientious bots
(networks)? Or a kind of DNSBL?

TIA.

--
FSF associate member #7257

D. Stussy

Oct 11, 2011, 2:49:33 PM
"Ivan Shmakov" <iv...@gray.siamics.net> wrote in message
news:86lisr3n...@gray.siamics.net...

> >>>>> D Stussy <spam+ne...@bde-arc.ampr.org> writes:
>
> [Cross-posting to news:comp.infosystems.www.servers.misc, for my
> questions aren't specific to Apache.]
>
> […]
>
> > Baiduspider does not respect "/robots.txt"
>
> I've just checked that it occasionally tries to GET /robots.txt
> from one of my HTTP servers. I'm yet to check whether it
> respects it or not (I don't have one just now.)
>
> > nor repeated 403's. I block its entire set of IP ranges in my
> > firewall.
>
> BTW, is there a kind of black list of such unconscientious bots
> (networks)? Or a kind of DNSBL?

No DNSBL that I'm aware of. There are a handful of web sites dedicated to
user-agent identification, but no list of malicious bots per se.
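
For reference, if such a list did exist, a lookup would follow the usual DNSBL convention: reverse the octets of the IPv4 address, append the list's zone, and treat any A record in the answer as "listed". The sketch below assumes that convention; the zone bots.dnsbl.example is hypothetical and does not name a real service.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name a DNSBL client would look up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone):
    """True if the zone answers with an A record for the address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name (or the lookup failed): treat as not listed.
        return False

# Hypothetical zone, for illustration only.
print(dnsbl_query_name("192.0.2.99", "bots.dnsbl.example"))
```

Note that is_listed treats a failed lookup the same as "not listed", which is the lenient choice; a stricter setup might want to tell the two apart.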

