Special characters problem when using the find_all function

Sep 1, 2017, 1:12:35 PM

to beautifulsoup

Hello all, I need help with this one if possible.

I am trying to read an HTML file that contains characters like this: 10½oz. The problem is that when I load the file with

soup = BeautifulSoup(htmlDoc, 'html.parser')

and then print the file, the string 10½oz appears correctly. But when I use a method such as find_all

tags = soup.find_all('div')

and then print the tags that contain those characters, the result looks like this: 10\xbdoz

Does anyone know a workaround for this? Can I somehow specify an encoding for the find_all function or something?

Or is there an automated way to get rid of these escape characters and convert them back to the originals?

Sorry if it's a newbie question, but I am new to Python.

thanks in advance
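
One possible explanation, offered as a guess rather than a confirmed diagnosis: find_all returns a list-like result, and printing a whole list shows the repr() of each element. Under Python 2, the repr() of a Unicode string displays non-ASCII characters as escapes such as \xbd, so the data itself may be intact and only the display is escaped. The sketch below uses only the standard library and Python 3, where the built-in ascii() is used to imitate that escaped display; the variable names are made up for illustration.

```python
# Sketch: "\xbd" can be just an escaped *display* of the character,
# not a change to the underlying string.

text = "10\u00bdoz"  # the string 10½oz; U+00BD is the ½ character

# ascii() on a list produces a repr-style view with non-ASCII escaped,
# similar to what Python 2 shows when a list of Unicode strings is printed.
escaped_view = ascii([text])
print(escaped_view)

# Printing each item on its own uses str() instead of repr(),
# so the original character is shown (given a UTF-8 capable terminal).
for item in [text]:
    print(item)
```

If this is what is happening, then looping over the result of find_all and printing each tag (or its get_text()) individually, instead of printing the whole list, should show 10½oz as expected.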
