urllib.urlopen and proxies


Paul Moore

Mar 4, 2001, 8:44:20 AM
I've just looked at urllib for the first time. I was pleased to find
that it automatically picks up my proxy configuration from the system
(this is ActiveState's Python 2.0 on Windows 2000). However, I have a
problem - my proxy requires a user name and password, and there is no
place that I can see to enter them. So all I get back from urlopen()
is the polite "you need authorisation" message that our proxy gives
:-(

Can anyone tell me how to pass a proxy userid and password to
urllib.urlopen()?

Thanks,
Paul Moore

Chris Gonnerman

Mar 5, 2001, 9:03:35 AM
to Paul Moore, pytho...@python.org
It *appears* (from reading the source) that username and password
information for the proxy server must be passed as part of the URL:

http://username:pass...@my.host.com/index.html

for instance. I haven't tested this yet. The source for urllib is a bit
dense IMHO but the answers are in there.



Paul Moore

Mar 5, 2001, 10:26:18 AM
On Mon, 5 Mar 2001 15:03:35 +0100 , "Chris Gonnerman"
<chris.g...@usa.net> wrote:

>It *appears* (from reading the source) that username and password
>information for
>the proxy server must be passed as part of the URL:
>
> http://username:pass...@my.host.com/index.html
>
>for instance. I haven't tested this yet. The source for urllib is a bit
>dense IMHO but the answers are in there.

Doesn't work for me... I attach an example, in case there are any
pointers in the result.

Paul

--- Example session ---

>>> from urllib import urlopen
>>> print urlopen("http://UUUU:PP...@www.python.org/").read()
<HTML><HEAD>
<TITLE>ERROR: Proxy Access Denied, authentication failed</TITLE>
<!-- $Id: ERR_CACHE_ACCESS_DENIED,v 1.1 2000/12/20 12:11:12 devet Exp
$ -->
</HEAD>
<BODY BGCOLOR="#ffffff">
<H1>ERROR</H1>
<H2>Proxy Access Denied, authentication failed</H2>
<HR>
<P>
While trying to retrieve the URL:
<A HREF="http://www.python.org/">http://www.python.org/</A>
<P>
The following error was encountered:
<UL>
<LI>
<STRONG>
Proxy Access Denied, authentication failed.
</STRONG>
</UL>
</P>

<P>Sorry, you are currently not allowed to request:
<PRE> http://www.python.org/</PRE>
from this proxy until you have authenticated yourself.
</P>

<P>Please use your NT userid <B>without the "domain\" part</B> and
your
NT password. <B>Example</B>: "cc01234" (not "europe\cc01234"!) and
"zekret!". </P>

<P>

See the <A
HREF="http://channels.origin-it.com/newsbin/center/fulltext.asp?msgId=NE20001207
110741C111083">
announcement</A> on Channels for further information.

<P>
You need to use Netscape version 2.0 or greater, or Microsoft Internet
Explorer 3.0, or an HTTP/1.1 compliant browser for this to work.
Please
contact the the helpdesk if you have difficulties authenticating
yourself.
</P>

<br clear="all">
<hr noshade size=1>
Generated Mon, 05 Mar 2001 15:20:35 GMT by www-proxy2.nl.origin-it.com
(<a href=
"http://squid.nlanr.net/Squid/">Squid/2.2.STABLE4</a>)
</BODY></HTML>

Siggy Brentrup

Mar 5, 2001, 10:07:42 AM
to pytho...@python.org
"Chris Gonnerman" <chris.g...@usa.net> writes:

> It *appears* (from reading the source) that username and password
> information for
> the proxy server must be passed as part of the URL:
>
> http://username:pass...@my.host.com/index.html
>

I suggest using urllib2, changing the proxy setup described in the
module's docstring to

proxy_support = urllib2.ProxyHandler({"http" :
"http://user:pw@proxy:port"})

Untested too, since my proxy doesn't require authorization.

HTWorks
Siggy


Paul Moore

Mar 6, 2001, 9:47:57 AM
On Mon, 5 Mar 2001 16:07:42 +0100 , Siggy Brentrup <b...@winnegan.de>
wrote:

I had a look at urllib2. The module docstring even includes an
example!

----------
import urllib2

# set up authentication info
authinfo = urllib2.HTTPBasicAuthHandler()
authinfo.add_password('realm', 'host', 'username', 'password')

proxy_support = urllib2.ProxyHandler({"http" :
                                      "http://ahad-haam:3128"})

# build a new opener that adds authentication and caching FTP handlers
opener = urllib2.build_opener(proxy_support, authinfo,
                              urllib2.CacheFTPHandler)

# install it
urllib2.install_opener(opener)

f = urllib2.urlopen('http://www.python.org/')
----------

The only problem is that I don't know what the 'realm' and 'host'
parameters to authinfo.add_password should be. I need to supply a
username and password for everything...

Paul.
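
For readers hitting the same 'realm' and 'host' question today, here is a minimal sketch in modern Python 3 (urllib.request), which is an assumption: it is not the urllib2 of Python 2.0 used in this thread. It relies on HTTPPasswordMgrWithDefaultRealm, where a realm of None acts as a catch-all, and on ProxyBasicAuthHandler, which answers the proxy's challenge rather than the remote site's. The proxy address, the credentials and the helper name make_proxy_auth_opener are hypothetical, and the sketch only helps if the proxy accepts Basic authentication.

```python
import urllib.request

# Hypothetical proxy address, for illustration only.
PROXY_URL = "http://proxy.example.com:3128/"


def make_proxy_auth_opener(proxy_url, user, password):
    """Return (opener, password_manager) for a proxy that uses Basic auth."""
    # A realm of None is the catch-all entry: these credentials are offered
    # whatever realm string the proxy announces in its 407 response.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, proxy_url, user, password)

    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    proxy_auth = urllib.request.ProxyBasicAuthHandler(password_mgr)
    opener = urllib.request.build_opener(proxy_handler, proxy_auth)
    return opener, password_mgr


# Placeholder credentials; nothing is sent over the network here.
opener, password_mgr = make_proxy_auth_opener(PROXY_URL, "username", "password")
# opener.open("http://www.python.org/") would then go through the proxy.
```

The second argument to add_password is the URI the credentials apply to, which here is the proxy itself and not the site being fetched.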

Siggy Brentrup

Mar 6, 2001, 1:05:35 PM
to pytho...@python.org
Paul Moore <paul....@uk.origin-it.com> writes:

> On Mon, 5 Mar 2001 16:07:42 +0100 , Siggy Brentrup <b...@winnegan.de>
> wrote:
>

[...]

> >I suggest using urllib2, changing the proxy setup described in the
> >module's docstring to
> >
> > proxy_support = urllib2.ProxyHandler({"http" :
> > "http://user:pw@proxy:port"})
> >
> >Untested too, since my proxy doesn't require authorization.
>
> I had a look at urllib2. The module docstring even includes an
> example!
>

[...]

>
> The only problem is that I don't know what the 'realm' and 'host'
> parameters to authinfo.add_password should be. I need to supply a
> username and password for everything...

The authinfo stuff is for remote sites requiring authentication, as
long as you visit unprotected sites only you can ignore it.

Modify the docstring example as follows:

--------
import urllib2

proxy_info = {
    'user' : 'your userid',
    'pass' : 'your password',
    'host' : 'proxy.doma.in',
    'port' : 3128 # or 8080 or whatever
}

# build a new opener that uses a proxy requiring authorization
proxy_support = urllib2.ProxyHandler({"http" :
    "http://%(user)s:%(pass)s@%(host)s:%(port)d" % proxy_info})
opener = urllib2.build_opener(proxy_support)


# install it
urllib2.install_opener(opener)

f = urllib2.urlopen('http://www.python.org/')
--------

Proxy authorization still untested, please report if it works.

Siggy
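
A sketch of the same idea in modern Python 3 (urllib.request), which is an assumption and not the urllib2 used in this thread. It builds the user:password@host:port proxy URL and percent-encodes the credentials first, so that characters such as "@" or ":" in a password cannot be mistaken for URL delimiters. The helper name build_proxy_url, the host proxy.example.com and the sample credentials are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def build_proxy_url(user, password, host, port):
    """Return an http proxy URL with percent-encoded credentials."""
    # safe="" makes quote() encode "/", ":" and "@" as well, since those
    # would otherwise be read as delimiters inside the URL.
    return "http://%s:%s@%s:%d" % (
        quote(user, safe=""),
        quote(password, safe=""),
        host,
        port,
    )


# Hypothetical values, for illustration only.
proxy_url = build_proxy_url("your userid", "p@ss:word", "proxy.example.com", 3128)

# Build an opener that routes http requests through that proxy.
# Nothing is sent over the network until opener.open(...) is called.
proxy_support = urllib.request.ProxyHandler({"http": proxy_url})
opener = urllib.request.build_opener(proxy_support)
```

Keeping the URL construction in its own function makes it easy to check the result without contacting a proxy.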


Paul Moore

Mar 7, 2001, 6:10:16 AM
On Tue, 6 Mar 2001 19:05:35 +0100 , Siggy Brentrup <b...@winnegan.de>
wrote:

>The authinfo stuff is for remote sites requiring authentication, as
>long as you visit unprotected sites only you can ignore it.
>
>Modify the docstring example as follows:
>
>--------

[...]


>--------
>
>Proxy authorization still untested, please report if it works.

Unfortunately, it still fails, with a traceback

>>> f = urllib2.urlopen('http://www.python.org/')

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\applications\python\lib\urllib2.py", line 137, in urlopen
    return _opener.open(url, data)
  File "c:\applications\python\lib\urllib2.py", line 325, in open
    '_open', req)
  File "c:\applications\python\lib\urllib2.py", line 304, in _call_chain
    result = func(*args)
  File "c:\applications\python\lib\urllib2.py", line 747, in http_open
    raise URLError(err)
urllib2.URLError: <urlopen error host not found>

I'll see if I can understand what is going on when I get a chance, but
I don't know how far I'll get...

Paul.

Paul Moore

Mar 8, 2001, 5:46:23 AM
On Wed, 7 Mar 2001 13:44:11 +0100 , Siggy Brentrup <b...@winnegan.de> wrote:

>Paul Moore <paul....@uk.origin-it.com> writes:
>Well, obviously I should have tested before posting :-( There's one
>omission in my code as well as three typos in urllib2.py (sample and
>patch attached)

Hmm. I was using Python 2.0. I switched to Python 2.1, and applied your patch
(did you send it to SourceForge as a bug report? If not, would you like me
to?) Running your script (with my proxy info) I still get an error...

C:\Data>python21 proxy_auth.py
Traceback (most recent call last):
  File "proxy_auth.py", line 18, in ?
    f = urllib2.urlopen('http://www.python.org/')
  File "c:\applications\python21\lib\urllib2.py", line 135, in urlopen
    return _opener.open(url, data)
  File "c:\applications\python21\lib\urllib2.py", line 318, in open
    '_open', req)
  File "c:\applications\python21\lib\urllib2.py", line 297, in _call_chain
    result = func(*args)
  File "c:\applications\python21\lib\urllib2.py", line 822, in http_open
    return self.do_open(httplib.HTTP, req)
  File "c:\applications\python21\lib\urllib2.py", line 816, in do_open
    return self.parent.error('http', req, fp, code, msg, hdrs)
  File "c:\applications\python21\lib\urllib2.py", line 344, in error
    return self._call_chain(*args)
  File "c:\applications\python21\lib\urllib2.py", line 297, in _call_chain
    result = func(*args)
  File "c:\applications\python21\lib\urllib2.py", line 425, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 407: Proxy Authentication Required

I dumped out a few of the objects, and it looks like the request is sending a
Proxy-Authorization header to the proxy, with the base64-encoded user ID and
password in it. Is there any chance my proxy doesn't work that way, and
requires a different form of authorisation? I know nothing much about proxies,
but I seem to recall seeing "Squid" in some of the proxy messages - is Squid a
type of proxy????

If this is some sort of bizarre peculiarity of my proxy setup, then feel free
to drop the issue - I can live without getting this to work, and I don't want
to take up too much of your time with this.

Thanks for all the help you've provided so far - I feel that I at least
understand the problem much better now...!

Paul.
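
To make the header Paul describes concrete, here is a small sketch of how a Basic scheme Proxy-Authorization value is formed: the string "user:password" is base64-encoded and prefixed with "Basic ". It is written in Python 3 as an assumption (the thread itself uses Python 2.0 and 2.1), the function name basic_proxy_authorization is hypothetical, and the sample credentials are the well-known ones from the HTTP authentication RFCs.

```python
import base64


def basic_proxy_authorization(user, password):
    """Return the value of a Basic scheme Proxy-Authorization header."""
    # Join the credentials with a colon, then base64-encode the bytes.
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Sample credentials taken from the RFC example, not from this thread.
header_value = basic_proxy_authorization("Aladdin", "open sesame")
print("Proxy-Authorization: " + header_value)
```

Base64 is an encoding, not encryption, so anyone who can read the request can recover the password.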

Siggy Brentrup

Mar 8, 2001, 7:56:42 AM
to pytho...@python.org
Paul Moore <paul....@uk.origin-it.com> writes:

> On Wed, 7 Mar 2001 13:44:11 +0100 , Siggy Brentrup <b...@winnegan.de> wrote:
>
> >Paul Moore <paul....@uk.origin-it.com> writes:
> >Well, obviously I should have tested before posting :-( There's one
> >omission in my code as well as three typos in urllib2.py (sample and
> >patch attached)
>
> Hmm. I was using Python 2.0. I switched to Python 2.1, and applied your patch
> (did you send it to SourceForge as a bug report?

Sure, cf Bug #406683 "typos in urllib2"

Yes, I'm running Squid as my proxy too and _I'm_in_control_of_it_.
Your proxy setup might use some other form of authentication; what
urllib2.ProxyHandler uses looks like "Basic HTTP Authentication" to
me - no time to investigate further now.

> If this is some sort of bizarre peculiarity of my proxy setup, then
> feel free to drop the issue - I can live without getting this to
> work, and I don't want to take up too much of your time with this.

Don't worry, I'll take this as an opportunity to learn about proxy
authentication; I'll look into it this weekend.

> Thanks for all the help you've provided so far - I feel that I at least
> understand the problem much better now...!

You're welcome

Siggy
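
One way to check which form of authentication a proxy expects is to look at the Proxy-Authenticate header that accompanies the 407 response. The sketch below does that in modern Python 3 (urllib.request), which is an assumption and not the urllib2 of this thread. The function names challenge_schemes and proxy_auth_schemes are hypothetical, and the parsing is deliberately simple: it takes the first word of each header value and does not split several challenges packed into one value.

```python
import urllib.error
import urllib.request


def challenge_schemes(header_values):
    """Return the leading scheme name of each Proxy-Authenticate value."""
    schemes = []
    for value in header_values:
        parts = value.split(None, 1)  # scheme, then optional parameters
        if parts:
            schemes.append(parts[0])
    return schemes


def proxy_auth_schemes(opener, url):
    """Open url; on a 407, report the schemes the proxy asks for."""
    try:
        opener.open(url).close()
    except urllib.error.HTTPError as err:
        if err.code == 407:
            values = err.headers.get_all("Proxy-Authenticate") or []
            return challenge_schemes(values)
        raise
    return []  # no challenge: the request went through
```

If the list contains only schemes other than Basic, the credentials-in-the-proxy-URL approach discussed above would not be expected to work.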

