tree = lxml.html.parse(url) >>> Server Error

Bassam Aoun

May 13, 2012, 2:49:09 PM
to google-a...@googlegroups.com

Hello,

My application works fine on both GAE and the SDK. However, when I include the line
    tree = lxml.html.parse(url)

I get a Server Error once I deploy the application to GAE, although it still works fine in the SDK on my desktop.

All of these imports work fine on both GAE and the SDK:

import lxml
from lxml import html
import lxml.html

but tree = lxml.html.parse(url) gives me a Server Error on GAE.

I would appreciate any help or suggestions.

Thanks in advance.

The application address is http://hps109.appspot.com/?ticker=BIDU&manager=Blue

Dmitry Chusovitin

May 13, 2012, 5:42:03 PM
to google-a...@googlegroups.com
lxml is only available on the Python 2.7 runtime. You must change your app.yaml file.

Add this:

libraries:
- name: lxml
  version: "2.3"

On Sunday, May 13, 2012, 22:49:09 UTC+4, Bassam Aoun wrote:

Bassam Aoun

May 13, 2012, 10:59:31 PM
to google-a...@googlegroups.com
Thanks for the reply.

The app.yaml already has the following:

libraries:

- name: jinja2
  version: latest

- name: lxml
  version: "2.3"


The imports are working fine, but the line tree = lxml.html.parse(url) causes a Server Error when the application is deployed on GAE, although it works on the SDK. The webpage prints:

Error: Server Error

The server encountered an error and could not complete your request.


Any ideas?

Brian Quinlan

May 13, 2012, 11:33:16 PM
to google-a...@googlegroups.com
Have you looked at the logs for your application?

Cheers,
Brian
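If the traceback is hard to find in the logs, one option is to log it explicitly around the failing call. The following is only a minimal sketch using the standard logging module; call_and_log is a hypothetical helper name, not something from this thread or from App Engine.

```python
import logging


def call_and_log(func, *args):
    """Run func(*args); on failure, log the full traceback and re-raise.

    Hypothetical helper for illustration only.
    """
    try:
        return func(*args)
    except Exception:
        # logging.exception writes the message and the current traceback
        # at ERROR level, so the failure shows up in the application logs.
        logging.exception("call to %s failed", getattr(func, "__name__", func))
        raise
```

In the handler this would be used as tree = call_and_log(lxml.html.parse, url), so the real exception behind the generic "Server Error" page is recorded before the request fails.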

>
> On Sunday, May 13, 2012 3:42:03 PM UTC-6, Dmitry Chusovitin wrote:
>>
>> lxml is only available on the Python 2.7 runtime. You must change your app.yaml file.
>>
>> See https://developers.google.com/appengine/docs/python/python27/using27?hl=ru-RU#Configuring_Libraries
>>
>> add this code:
>>
>> libraries:
>> - name: lxml
>>   version: "2.3"

Maxim Lacrima

May 14, 2012, 2:28:16 AM
to google-a...@googlegroups.com
Hi,

I am not sure what the problem is here, but you can try fetching the URL with the URL Fetch service and then passing the response body to lxml (instead of giving lxml the URL). It may be that lxml has difficulty fetching URLs on its own on the GAE platform.
--
with regards,
Maxim
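A rough sketch of that suggestion, for illustration. The function names appengine_fetch and fetch_and_parse are made up here, and the status-code check is an added assumption; only urlfetch.fetch() and lxml.html.parse() are real calls. The fetch function is passed in as a parameter so the parsing step can also be tried outside App Engine.

```python
from io import BytesIO

import lxml.html


def appengine_fetch(url):
    """Fetch a URL through App Engine's URL Fetch service."""
    # Imported lazily: this module only exists on App Engine and in the SDK.
    from google.appengine.api import urlfetch
    return urlfetch.fetch(url)


def fetch_and_parse(url, fetch=appengine_fetch):
    """Download the page ourselves, then hand the bytes to lxml.

    Hypothetical helper; `fetch` must return an object with
    `status_code` and `content` attributes, as urlfetch.fetch() does.
    """
    response = fetch(url)
    if response.status_code != 200:
        raise IOError("fetch of %s returned HTTP %d" % (url, response.status_code))
    # parse() accepts a file-like object, so wrap the response body.
    return lxml.html.parse(BytesIO(response.content))
```

On App Engine the call would simply be tree = fetch_and_parse(url), so lxml never has to open the network connection itself.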

Bassam Aoun

May 14, 2012, 3:32:55 AM
to google-a...@googlegroups.com
Thank you so much. That seems to work:
file = urlfetch.fetch(url)
tree = lxml.html.parse(StringIO(file.content)) 

I don't know why. Using StringIO was also key, I think because of some characters in the HTML file.

It is odd that tree = lxml.html.parse(url) worked nicely on the SDK but not on GAE; that was confusing.
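One likely reason the StringIO wrapper mattered: lxml.html.parse() treats a plain string argument as a filename or URL, not as markup, so the downloaded body has to be wrapped in a file-like object. lxml.html.fromstring() takes the markup directly instead. A small sketch comparing the two; the content value is a made-up stand-in for the fetched response body.

```python
from io import BytesIO

import lxml.html

# Hypothetical stand-in for the bytes in the fetch result's .content attribute.
content = b"<html><head><title>Example</title></head><body><p>Hello</p></body></html>"

# parse() expects a filename, URL, or file-like object, so the raw markup
# is wrapped in BytesIO; it returns an ElementTree.
tree = lxml.html.parse(BytesIO(content))
title_from_parse = tree.getroot().findtext(".//title")

# fromstring() takes the markup itself and returns the root element.
root = lxml.html.fromstring(content)
title_from_string = root.findtext(".//title")
```

Both routes give the same document; fromstring() just skips the wrapper when the whole page is already in memory.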