
BUG URLLIB2


bart

Nov 14, 2002, 7:20:47 PM
I'm an Italian student and I have a problem with urllib2: the urlopen
function sometimes gives me an old (and incorrect) URL because the site
has moved to a new location, but urlopen doesn't understand this and
returns a 403 FORBIDDEN error.

EXAMPLE:

Python 2.2.1 (#1, Aug 30 2002, 12:15:30)
[GCC 3.2 20020822 (Red Hat Linux Rawhide 3.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> con=urllib2.urlopen('http://www.google.it/search')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.2/urllib2.py", line 138, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.2/urllib2.py", line 322, in open
    '_open', req)
  File "/usr/lib/python2.2/urllib2.py", line 301, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.2/urllib2.py", line 785, in http_open
    return self.do_open(httplib.HTTP, req)
  File "/usr/lib/python2.2/urllib2.py", line 779, in do_open
    return self.parent.error('http', req, fp, code, msg, hdrs)
  File "/usr/lib/python2.2/urllib2.py", line 348, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.2/urllib2.py", line 301, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.2/urllib2.py", line 400, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
>>>


What can I do to fix this?

Help me please!!!

Thanks...

-Ennio-

Martin v. Löwis

Nov 15, 2002, 12:07:45 PM
"bart" <e_v...@libero.it> wrote in message
news:3DD43DA7...@libero.it...

> I'm an Italian student and I have a problem with urllib2: the urlopen
> function sometimes gives me an old (and incorrect) URL because the site
> has moved to a new location, but urlopen doesn't understand this and
> returns a 403 FORBIDDEN error.

That's not a bug in urllib2; google.it is *really* returning 403 FORBIDDEN.
It appears that this Google behaviour is triggered by the header

User-agent: Python-urllib/2.0a1

that urllib2 sends, which, in turn, suggests that Google explicitly bans
urllib2. Complain to them.

Regards,
Martin
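
The header Martin mentions can be inspected without making any network request, by looking at the header list of a freshly built opener. This is only a minimal sketch: it assumes the `addheaders` attribute of the standard library opener, the helper name `default_user_agent` is made up for illustration, and the import fallback is there because urllib2 was renamed `urllib.request` in Python 3. The exact version string depends on the interpreter in use.

```python
try:
    import urllib2  # Python 2, as in this thread
except ImportError:
    import urllib.request as urllib2  # Python 3 name for the same module


def default_user_agent():
    """Return the User-agent value a freshly built opener would send."""
    opener = urllib2.build_opener()
    # addheaders is a list of (name, value) pairs added to every request.
    for name, value in opener.addheaders:
        if name.lower() == "user-agent":
            return value
    return None


print(default_user_agent())  # e.g. "Python-urllib/<version>"
```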

bart

Nov 15, 2002, 2:29:06 PM
> That's not a bug in urllib2; google.it is *really* returning 403
> FORBIDDEN.
> It appears that this google behaviour is triggered by the header
>
> User-agent: Python-urllib/2.0a1
>
> that urllib2 sends, which, in turn, suggests that Google explicitly bans
> urllib2.
> Complain to them.
>
> Regards,
> Martin
>

Thanks to everyone who helped me so quickly!

How can I change the User-Agent field that urllib2 sends?

I found two variables that (I think) define the user agent in the
urllib2 library: "__name__" and "__version__".

I tried setting them as follows:

__name__="Mozzilla"
__version__="5.0"

but it still failed!

Any suggestion is welcome!

- Ennio Viola -
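
If the two assignments above were made in the calling script or interactive session, they only rebind names in that namespace, so urllib2 never sees them. One possible way to send a different User-Agent is to set the header explicitly, either on a single `Request` or on an opener. The sketch below assumes the standard library's `Request(url, headers=...)` constructor and the opener's `addheaders` list; the `USER_AGENT` value and the helper names `make_request` and `make_opener` are illustrative, not from the thread. It builds the objects without contacting the server, and whether a given site accepts the request is up to that site.

```python
try:
    import urllib2  # Python 2, as in this thread
except ImportError:
    import urllib.request as urllib2  # Python 3 name for the same module

USER_AGENT = "Mozilla/5.0"  # illustrative value, an assumption


def make_request(url):
    """Build a request that carries its own User-Agent header."""
    return urllib2.Request(url, headers={"User-Agent": USER_AGENT})


def make_opener():
    """Build an opener that sends USER_AGENT with every request."""
    opener = urllib2.build_opener()
    # Replace the default ("User-agent", "Python-urllib/...") pair.
    opener.addheaders = [("User-agent", USER_AGENT)]
    return opener


req = make_request("http://www.google.it/search")
opener = make_opener()
# Nothing has been sent yet; opener.open(req) would perform the request.
```

Setting the header on the `Request` affects only that request, while changing `addheaders` affects everything opened through that opener.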

