html fetching with Python on appspot


max kraun

Jan 29, 2014, 9:07:29 PM
to google-a...@googlegroups.com
Hi guys,
I have Python code that logs in to a vBulletin forum, goes to 10 forum pages, and copies their HTML source to compute a ranking. When I run it from my computer everything works.
I tried to move the code to Google App Engine, where it should just print the ranking with response.write().
When I run it on localhost it works, but on appspot.com it looks as if some pages are not counted. Sometimes it works, sometimes it does not. Sometimes it also returns errors ("Deadline exceeded while waiting for HTTP response from URL" or similar).

I think that the login fails or that the session gets logged out at some point. I tried a few things, but I'm quite new to this, so I really don't know what to do with the web part of the code.

The login code is basically this (it is not mine, I just adapted it to my code); then I added two "for" loops like this:
       for i in range(5):
           response = opener.open('i.html')  # pages 1 to 5 (then 6 to 10)
           list1[i] = response.readlines()  # (then list2)

The login and HTML-copying parts of the code are the same whether I run it on my computer, on localhost, or on appspot; so is the ranking-computing part.

Then I save the ranking in a string and print it this way:
       self.response.write('''
               <html>
               <title>Title</title>
               <body>
               <table>
               %s
               </body>
               </html>''' % str(text))
Again, it works perfectly on localhost, but not when it runs on appspot.

Where am I going wrong? What could I do to solve this?
Thanks in advance

Vinny P

Feb 5, 2014, 2:26:21 AM
to google-a...@googlegroups.com
On Wed, Jan 29, 2014 at 8:07 PM, max kraun <krau...@gmail.com> wrote:
When I run it on localhost it works, but on appspot.com it looks as if some pages are not counted. Sometimes it works, sometimes it does not. Sometimes it also returns errors ("Deadline exceeded while waiting for HTTP response from URL" or similar).


Hi Max,

It's difficult to say exactly what the problem is without a more detailed look at the logs, but you can try a few things:

1. Try adding a time.sleep call between each page retrieval. See http://docs.python.org/2/library/time.html for usage. Sometimes App Engine doesn't like it when repeated URL Fetch calls are made.

2. Try making sure the authentication token you're using is still valid. For instance, you could make sure that the page you retrieved shows text consistent with a logged-in user - for example, only "sign out" links and not "sign in" links.

3. The "Deadline exceeded" error occurs because the other server took too long to send back the response. You can allow more time by setting the default fetch deadline (in seconds): urlfetch.set_default_fetch_deadline(60)
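
The three suggestions above could be combined roughly as follows. This is only a sketch: fetch_pages and looks_logged_in are hypothetical helper names, the "sign out" / "sign in" marker strings are assumptions that need checking against the forum's real HTML, and open_url stands for whatever performs the request (for instance a small wrapper around opener.open from the original post). On App Engine, urlfetch.set_default_fetch_deadline(60) would be called once before the loop.

```python
import time


def looks_logged_in(html):
    # Assumption: a logged-in page shows a "sign out" link and no "sign in" link.
    # These marker strings are placeholders; use whatever the real forum prints.
    lowered = html.lower()
    return "sign out" in lowered and "sign in" not in lowered


def fetch_pages(open_url, urls, delay=1.0, retries=2, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests and retrying failures.

    open_url is any callable that takes a URL and returns the page HTML as a
    string. sleep is injectable so the pause can be replaced in tests.
    """
    pages = []
    for url in urls:
        html = None
        for attempt in range(retries + 1):
            try:
                html = open_url(url)
                break
            except Exception:
                # On App Engine this is where a deadline error would land.
                if attempt == retries:
                    raise
                sleep(delay)
        if not looks_logged_in(html):
            # Fail loudly instead of silently ranking a logged-out page.
            raise RuntimeError("session appears logged out at %s" % url)
        pages.append(html)
        sleep(delay)  # pause before the next request
    return pages
```

Raising when a page looks logged out makes a dropped session visible in the logs, rather than letting it show up as pages that are silently "not counted".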
 
-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

App Engine Code Samples: http://www.learntogoogleit.com

rahul kumar

Sep 26, 2019, 9:46:13 AM
to Google App Engine
The sleep() function suspends only the thread in which it is called. The operating system is free to run other threads and processes while that thread sleeps, and the sleeping thread uses next to zero processing power.
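
A small illustration of that point, using only the standard library. The function name run_demo is made up for this example; it starts a worker thread, puts the calling thread to sleep, and then reports whether the worker managed to run in the meantime.

```python
import threading
import time


def run_demo(sleep_seconds=0.2):
    worker_ran = threading.Event()

    def worker():
        # Runs in a separate thread while the caller is asleep.
        worker_ran.set()

    thread = threading.Thread(target=worker)
    thread.start()
    time.sleep(sleep_seconds)  # suspends only the calling thread
    ran_during_sleep = worker_ran.is_set()
    thread.join()
    return ran_during_sleep
```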
