I need to get a file via http, but must login first, so I am doing two
requests. The login request succeeds and I get the correct HTML
response. But the second request to get the actual file I wanted (after
logging in) is throwing an httplib.ResponseNotReady exception.
I've replaced >>> with --- to avoid confusion with quoted text.
---h = httplib.HTTPConnection('123.45.67.89')
---h.connect()
---h.putrequest('GET','/default.asp?loginname=myuser&password=mypass&action=login')
---h.putheader('Accept','text/html')
---h.putheader('Accept','text/plain')
---h.endheaders()
---h.send('')
---r = h.getresponse()
---r.status
200
---r.getheader('Set-Cookie')
'ASPSESSIONIDGQGGHQKY=OFIPJJMABHNODAIFKFPDPMCN; path=/'
---cookie = r.getheader('Set-Cookie')
---h.putrequest('GET','/secure.asp')
---h.putheader('Cookie',cookie)
---h.putheader('Set-Cookie',cookie)
---h.putheader('Accept','text/html')
---h.putheader('Accept','text/plain')
---h.endheaders()
---h.send('')
---r2 = h.getresponse()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python22\lib\httplib.py", line 752, in getresponse
    raise ResponseNotReady()
httplib.ResponseNotReady
>Can someone tell me what I am doing wrong here?
>
>I need to get a file via http, but must login first, so I am doing two
>requests. The login request succeeds and I get the correct HTML
>response. But the second request to get the actual file I wanted (after
>logging in) is throwing an httplib.ResponseNotReady exception.
>
The second request needs to have its own HTTPConnection object.
I was just a bit confused about using this module, it seems. The
documentation I have from ActiveState's Windows distribution seems fairly
lacking in most aspects. Is there something better?
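A note on the ResponseNotReady above: a separate HTTPConnection per request works, but as far as I can tell the exception comes from asking for a second response while the first one (r) has not been read yet. Reading the login response before sending the next request should let one connection serve both. Below is a minimal sketch, written against Python 3, where httplib is named http.client. The names cookie_pair and fetch_after_login are made up for illustration. Note that the cookie goes back in a Cookie header only; Set-Cookie is a response header and does not belong in a request.

```python
import http.client  # this module is called httplib in Python 2


def cookie_pair(set_cookie):
    """Return the 'name=value' part of a Set-Cookie header value."""
    # Attributes such as "path=/" are meant for the client, not the server.
    return set_cookie.split(";", 1)[0].strip()


def fetch_after_login(host, login_path, secure_path):
    """Log in and fetch a protected page over a single connection.

    Hypothetical helper: the host and paths are whatever the site uses.
    """
    accept = {"Accept": "text/html, text/plain"}
    conn = http.client.HTTPConnection(host)
    try:
        # request() sends the request line, the headers and the blank line,
        # so no separate endheaders() or send('') call is needed.
        conn.request("GET", login_path, headers=accept)
        login = conn.getresponse()
        cookie = login.getheader("Set-Cookie", "")
        # Read the body to the end. Until that happens the connection
        # treats the first response as still pending.
        login.read()

        headers = dict(accept, Cookie=cookie_pair(cookie))
        conn.request("GET", secure_path, headers=headers)
        page = conn.getresponse()
        return page.status, page.read()
    finally:
        conn.close()


# Usage, with the host and paths from the session above (needs the server):
# status, body = fetch_after_login(
#     "123.45.67.89",
#     "/default.asp?loginname=myuser&password=mypass&action=login",
#     "/secure.asp")
```

If the server closes the connection after each response, http.client reconnects on the next request by itself, so the same code should still apply.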
Why don't you use urllib, which takes care of a lot of this stuff for you?
Doesn't urllib use httplib? Doesn't urllib hide the more advanced, and
in this case necessary, features of httplib? Specifically headers...?
If not, how would my ~12 lines of code look if they used urllib?
You are not using ClientCookie ;-)
http://wwwsearch.sourceforge.net/ClientCookie/
With it you will be able to write code like the following, which makes
the process pretty painless (untested):
import urllib, urllib2, ClientCookie

class SiteBrowser:

    def __init__(self, url, **form_data):
        # form fields, including hidden ones
        data = urllib.urlencode(form_data)
        # login page
        request = urllib2.Request(url, data)
        result = ClientCookie.urlopen(request)
        self.content = ''

    def browse(self, url):
        # Fetch a password protected page
        req = urllib2.Request(url)
        res = ClientCookie.urlopen(req)
        self.content = res.read()
        res.close()

if __name__=='__main__':
    usr = 'user'
    pwd = 'secret'
    form_name = 'login_form'  # hidden field!
    button = 'ok'
    url = 'http://www.somewhere.com/loginform.asp'
    browser = SiteBrowser(
        url, USERNAME=usr, PASSWORD=pwd, FORM_NAME=form_name, ok='')
    browser.browse('http://www.somewhere.com/protected_page.asp')
    print browser.content
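
For later readers: ClientCookie's cookie handling eventually went into the standard library (cookielib in Python 2.4, http.cookiejar in Python 3), so the same idea needs no extra package there. Here is a rough sketch in Python 3, not tested against a real site. The helper names make_opener, login_request and fetch_protected are made up, and the form fields are whatever the login page expects. It also suggests that the urllib route does not hide headers, since Request accepts a headers dict.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_opener():
    """Build an opener that stores cookies and sends them back."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login_request(url, **form_data):
    """Build the POST request for the login form, with custom headers."""
    # Passing data makes this a POST; form fields include hidden ones.
    data = urllib.parse.urlencode(form_data).encode("ascii")
    return urllib.request.Request(
        url, data=data, headers={"Accept": "text/html, text/plain"})


def fetch_protected(login_url, page_url, **form_data):
    """Log in, then fetch a page using the cookies the login set."""
    opener, _jar = make_opener()
    with opener.open(login_request(login_url, **form_data)) as res:
        res.read()
    with opener.open(page_url) as res:
        return res.read()


# Usage (needs a real site):
# html = fetch_protected('http://www.somewhere.com/loginform.asp',
#                        'http://www.somewhere.com/protected_page.asp',
#                        USERNAME='user', PASSWORD='secret')
```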
--
hilsen/regards Max M
http://www.futureport.dk/
The future, science, skepticism and transhumanism