urlfetch headache - please help


EuGeNe

Apr 8, 2008, 7:41:56 PM
to Google App Engine
Hi,

I am struggling with urlfetch (like others, it seems) and would greatly
appreciate a little help figuring things out. In test.py below I am
trying to post a search for an ISBN on a given site.

The site redirects properly: the Location in the initial response headers is
known (in the case presented below it gives Location: /C-5/I-0596514557/).

I realize that urlfetch doesn't follow the redirect and that I would
have to handle it "manually" by using the result.headers['location']
but as the output of http://localhost:8080 shows,

status: 302
headers: {'content-length': '0', 'x-powered-by': 'PHP/5.2.0-8+etch7',
'server': 'Apache/2.2.3 (Debian) PHP/5.2.0-8+etch7', 'location': '/',
'date': 'Tue, 08 Apr 2008 23:18:35 GMT', 'content-type': 'text/html;
charset=UTF-8'}

result.headers['location'] is '/' and not the expected
'/C-5/I-0596514557/'. What am I doing wrong, given that the Python
standard urllib.urlopen(url, data) works perfectly well in a standard
Python environment?

---- test.py ----

import wsgiref.handlers
from urllib import urlencode

from google.appengine.api import urlfetch
from google.appengine.ext import webapp

class MainPage(webapp.RequestHandler):
    def get(self):
        url = "http://www.isbn.pl/search.php"
        isbn = "0596514557"
        data = urlencode({'isbn': isbn, 'lang[]': ['4', '2', '1', '8', '16']})
        result = urlfetch.fetch(url, payload=data, method=urlfetch.POST)
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.out.write("status: %s\nheaders: %s" % (
            result.status_code, result.headers))

def main():
    application = webapp.WSGIApplication(
        [('/', MainPage)],
        debug=True)
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()

-----
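
One thing worth checking in the payload above: `urlencode` only expands a list value into repeated `key=value` pairs when `doseq=True` is passed; otherwise the list is converted with `str()` and encoded as a single value. The sketch below shows the difference on a shortened field list. Whether www.isbn.pl actually needs the repeated form is an assumption that this thread does not establish, and the try/except import is only there so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, as in test.py

# Shortened version of the 'lang[]' field from test.py.
fields = {'lang[]': ['4', '2']}

# Default behaviour: the whole list is stringified and encoded as one value.
flat = urlencode(fields)

# doseq=True: one key=value pair per list element.
repeated = urlencode(fields, doseq=True)

print(flat)      # lang%5B%5D=%5B%274%27%2C+%272%27%5D
print(repeated)  # lang%5B%5D=4&lang%5B%5D=2
```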

Thanks for your assistance,

EuGeNe -- http://www.3kwa.com

Shalabh Chaturvedi

Apr 9, 2008, 2:36:52 AM
to google-a...@googlegroups.com
The standard library urllib automatically adds the following headers
for POST requests:

    h.putheader('Content-type', 'application/x-www-form-urlencoded')
    h.putheader('Content-length', '%d' % len(data))

I'm not sure if urlfetch adds these - you may want to add these
yourself and see if it works.
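
A minimal sketch of that suggestion, assuming the form body is built with `urlencode` as in test.py. `build_form_post` is a hypothetical helper name, not part of urlfetch, and Content-Length is left out on the assumption that the fetch layer computes it from the payload. The try/except import only makes the snippet run on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, as in test.py

def build_form_post(fields):
    """Return (payload, headers) for a form-encoded POST request."""
    payload = urlencode(fields)
    # The header urllib sets automatically for POST data.
    headers = {'Content-Type': 'application/x-www-form-urlencoded'}
    return payload, headers

payload, headers = build_form_post({'isbn': '0596514557'})
print(payload)  # isbn=0596514557
print(headers)
```

The two values would then be passed along as `urlfetch.fetch(url, payload=payload, headers=headers, method=urlfetch.POST)`.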

Shalabh

EuGeNe

Apr 9, 2008, 5:26:34 AM
to Google App Engine
On Apr 9, 8:36 am, "Shalabh Chaturvedi" <shalabh.chaturv...@gmail.com>
wrote:
> The standard library urllib automatically adds the following headers
> for POST requests:
>
> h.putheader('Content-type', 'application/x-www-form-urlencoded')
> h.putheader('Content-length', '%d' % len(data))

Thanks, only the first one is required (adding the Content-length
doesn't work; it is probably already added by urlfetch.fetch). See the
code below for the proper syntax.

I bumped into another problem though: the SDK and the live version
don't behave the same way. The redirect has to be handled manually when
running locally in the SDK, but not on appspot. I guess I should report
a "bug".

---- test.py ----

import wsgiref.handlers

from urllib import urlencode

from google.appengine.api import urlfetch
from google.appengine.ext import webapp

BASE_URL = 'http://www.isbn.pl'

class MainPage(webapp.RequestHandler):
    """ Dummy experimentation with urlfetch

    Search for an ISBN (0596514557 - Ben Fry - Visualizing Data - O'Reilly)
    on http://www.isbn.pl and redirect to the result page.
    """
    def get(self):
        url = BASE_URL + "/search.php"
        isbn = "0596514557"
        data = urlencode({'isbn': isbn, 'lang[]': ['4', '2', '1', '8', '16']})
        headers = {'Content-type': 'application/x-www-form-urlencoded'}

        result = urlfetch.fetch(url, payload=data, headers=headers,
                                method=urlfetch.POST)
        # Required when run locally in the SDK but not on appspot ... weird!
        #if result.status_code == 302:
        #    url = BASE_URL + result.headers['location']
        #    self.redirect(url)

        self.response.headers['Content-Type'] = 'text/plain'
        self.response.out.write('failed: %s\n' % result.status_code)
        self.response.out.write('headers: %s\n' % result.headers)
        self.response.out.write('content: %s\n' % result.content)

def main():
    application = webapp.WSGIApplication(
        [('/', MainPage)],
        debug=True)
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()
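
The commented-out redirect handling above builds the target by concatenating BASE_URL and the Location header, which only works when the header is a path starting with "/". A minimal sketch of a more general approach, assuming the header may be either relative or absolute, is to resolve it against the request URL with the standard library's `urljoin`. `resolve_redirect` is a hypothetical helper name, not part of urlfetch, and the try/except import only makes the snippet run on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

def resolve_redirect(request_url, location):
    """Turn a possibly relative Location header into an absolute URL."""
    return urljoin(request_url, location)

request_url = 'http://www.isbn.pl/search.php'

# Relative Location, as reported in this thread.
print(resolve_redirect(request_url, '/C-5/I-0596514557/'))
# http://www.isbn.pl/C-5/I-0596514557/

# An absolute Location is returned unchanged.
print(resolve_redirect(request_url, 'http://www.isbn.pl/'))
# http://www.isbn.pl/
```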

Thanks again,

EuGeNe -- http://www.3kwa.com