Special characters problem when using the find_all function

Nikolis Galerakis

Sep 1, 2017, 1:12:35 PM
to beautifulsoup
Hello all, I need help with this one if possible.

I am trying to read an HTML file that contains characters like this: 10½oz. When I load the file with
soup = BeautifulSoup(htmlDoc, 'html.parser')
and then print it, the string 10½oz appears correctly. But when I use a method such as find_all,
tags = soup.find_all('div')
and print the tags that contain those characters, the result looks like this: 10\xbdoz
Does anyone know a workaround for this? Can I somehow specify an encoding for the find_all function?
Or is there an automated way to get rid of these escape characters and convert them back to the originals?
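
To make the symptom concrete, here is a minimal sketch of one possible cause, offered as an assumption rather than a diagnosis: `\xbd` is the escaped spelling of `½` (U+00BD), and Python shows that spelling when it prints the repr of a string, which is what happens when a whole list is printed at once under Python 2. The sketch uses only the Python 3 standard library, where `ascii()` produces the same escaped form. The `texts` list is a hypothetical stand-in for the text taken from the tags that find_all returns.

```python
# Stand-in for the text pulled out of the tags returned by find_all
# (hypothetical values; no HTML parsing happens in this sketch).
texts = ["10\u00bdoz", "plain text"]

# "\xbd" is only the escaped spelling of the character "½" (U+00BD),
# so the data itself is intact.
assert texts[0] == "10½oz"

# ascii() returns the escaped repr, similar to what Python 2 shows
# when a list containing Unicode strings is printed in one go.
escaped = ascii(texts)
print(escaped)

# Printing each string on its own shows the real characters.
for text in texts:
    print(text)
```

If this is what is happening, nothing needs converting back: looping over the results and printing each item individually should display 10½oz as expected.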

Sorry if it's a newbie question, but I am new to Python.
Thanks in advance.