Re: Crash on bad doctype declaration

Leonard Richardson
Jul 13, 2012, 1:20:22 PM
to beauti...@googlegroups.com
> Has anyone seen this? Do you have a workaround?
>
> It looks like this is a bug in libxml2 though.

It is a bug in libxml2. I filed the bug back in March and the lxml
developer committed a fix to the development branch.

https://bugs.launchpad.net/lxml/+bug/984936

Apart from upgrading lxml, the best workaround would be to parse the
document with html5lib or html.parser.

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser
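
For illustration, here is a minimal sketch of that workaround. The helper name parse_without_lxml and the sample markup are made up for this example; the only Beautiful Soup pieces it relies on are the BeautifulSoup constructor's parser argument and the FeatureNotFound exception it raises when a requested parser is not installed. It tries html5lib first and falls back to the built-in html.parser, so lxml (and libxml2) is never involved.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def parse_without_lxml(markup, parsers=("html5lib", "html.parser")):
    """Parse markup with the first available parser from `parsers`.

    Hypothetical helper: it never asks for lxml, so the libxml2
    doctype crash cannot be triggered through this code path.
    """
    for name in parsers:
        try:
            return BeautifulSoup(markup, name)
        except FeatureNotFound:
            # The requested parser is not installed; try the next one.
            continue
    raise RuntimeError("none of the requested parsers is installed")


# Placeholder document; substitute the markup that crashes under lxml.
soup = parse_without_lxml("<html><body><p>Hello</p></body></html>")
print(soup.p.get_text())
```

html5lib has to be installed separately, while html.parser ships with Python, so the fallback should always be available. The two parsers can build slightly different trees from invalid markup, so it is worth checking the result against the document that triggered the crash.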

Leonard