Re: How use Scrapy encoding


Morad Edwar

Mar 20, 2015, 4:35:19 AM
to scrapy...@googlegroups.com
What about your database encoding?
I use .encode('utf-8') and it works great with Arabic characters, which need to be converted to UTF-8.
P.S.: I'm using PG, not MySQL.
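To illustrate what .encode('utf-8') does with Arabic text, here is a minimal sketch in Python 3 syntax (the thread itself uses Python 2). The sample word is an assumption chosen for illustration, not data from the thread.

```python
# Hypothetical sample: the Arabic word "marhaba" (hello), written with
# escapes so the example does not depend on the file's own encoding.
text = "\u0645\u0631\u062d\u0628\u0627"

# Encoding turns the 5 characters into UTF-8 bytes (2 bytes per letter here).
encoded = text.encode("utf-8")
print(len(text), len(encoded))  # 5 10

# Decoding with the same codec gives back the original text unchanged.
round_tripped = encoded.decode("utf-8")
print(round_tripped == text)  # True
```

The round trip only works if whatever reads the bytes back (database connection, web page) also treats them as UTF-8.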


On Thursday, March 19, 2015 at 5:59:04 PM UTC+2, Rico A Mada wrote:
Hi all,

I'm stuck on an encoding issue when using Scrapy and hope someone can help me.

  • In my spider: item['title'] = html.xpath('.//h5/text()')
  • In my pipeline: item['title'] = item['title'].extract()[0].encode('utf-8', 'replace')

This results in a string like Namontana \xe2\x80\x93 Une attaque \xc3\xa0 main arm\xc3\xa9e avort\xc3\xa9e. I save all items to a database (MySQL for now).

Now I want to show all these items on a website, but my problem is that I can't turn \xe2 (for example) into a visible character.
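One possible reading is that the \x escapes are just how Python displays UTF-8 encoded bytes, not corrupted data. A minimal sketch, in Python 3 syntax and using the exact bytes quoted above, shows that decoding them as UTF-8 gives readable text:

```python
# The byte string quoted in the post, i.e. the result of .encode('utf-8').
raw = b"Namontana \xe2\x80\x93 Une attaque \xc3\xa0 main arm\xc3\xa9e avort\xc3\xa9e"

# Decoding as UTF-8 recovers the characters the escapes stand for.
title = raw.decode("utf-8")
print(title)  # Namontana – Une attaque à main armée avortée
```

If this is what is happening, the stored value is fine, and the remaining work is making sure the database connection and the PHP page both treat the data as UTF-8.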

I've already tried:

  • Adding # -*- coding: utf-8 -*- at the beginning of every .py file
  • Using the htmlentities or utf8_decode functions when displaying with PHP code
  • Adding unicode(response.body.decode(response.encoding)).encode('utf-8') to my spider
  • Adding <meta http-equiv="content-type" content="text/html; charset=utf-8" /> to my HTML page
  • Checking and converting all files to UTF-8 without BOM
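The response.body.decode(response.encoding) attempt above amounts to transcoding: decode the bytes with the page's own encoding, then encode the text as UTF-8. A minimal sketch of that step in Python 3 syntax follows; to_utf8 is a hypothetical helper, and the Latin-1 input is an assumed example, not something taken from the scraped site.

```python
def to_utf8(body: bytes, source_encoding: str) -> bytes:
    """Hypothetical helper: re-encode bytes from source_encoding to UTF-8."""
    # Decode first to get real text, replacing undecodable bytes if any.
    text = body.decode(source_encoding, errors="replace")
    # Then encode that text as UTF-8.
    return text.encode("utf-8")


# Assumed example input: the word "armée" as Latin-1 bytes.
latin1_bytes = "arm\u00e9e".encode("latin-1")
utf8_bytes = to_utf8(latin1_bytes, "latin-1")
print(utf8_bytes)  # b'arm\xc3\xa9e'
```

The printed \xc3\xa9 is the same byte pair that appears in the title quoted earlier, which suggests the escapes are the expected display of UTF-8 bytes.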

For now, my only alternative is to use a custom function to replace every character (explained here), but I think there must be a better solution.

Thanks in advance for your help.
