Hi,
Parsing with the 'html5lib' parser builds an erroneous document if the HTML input string includes "self-closing" HTML elements.
Tested on Debian Jessie with BeautifulSoup v4.4.0 and Python v2.7.9.
Take a look at the following example (notice the "self-closing" iframe):
root@14c0ede392c2:/tmp# python
Python 2.7.9 (default, Mar 1 2015, 12:57:24)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import bs4
>>>
>>> bs4.__version__
'4.4.0'
>>>
>>>
>>> bs4.BeautifulSoup( html, "html5lib" )
<html><head></head><body><iframe src="http://www.google.es"><div id="test"></div></html></iframe></body></html>
>>>
So, as you can see, the "div" element is lost with the 'html5lib' parser: it is swallowed into the iframe instead of following it as a separate element. On the other hand, if 'html.parser' is used, everything is parsed correctly.
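For anyone who wants to reproduce the comparison, here is a minimal sketch. The `MARKUP` string is an assumption, since the original `html` value does not appear in the session above, and `has_test_div` is a hypothetical helper name used only for illustration.

```python
from bs4 import BeautifulSoup

# Assumed input: the original `html` string is not shown in the session above.
MARKUP = '<html><iframe src="http://www.google.es" /><div id="test"></div></html>'


def has_test_div(markup, parser):
    """Return True if the parsed tree contains <div id="test"> as an element."""
    soup = BeautifulSoup(markup, parser)
    return soup.find("div", id="test") is not None


if __name__ == "__main__":
    # "html5lib" requires the html5lib package to be installed.
    for parser in ("html.parser", "html5lib"):
        print(parser, has_test_div(MARKUP, parser))
```

A possible workaround, if html5lib has to be used, is to write the iframe with an explicit closing tag (`<iframe ...></iframe>`), because HTML5 only honours the self-closing syntax on void elements.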
Regards,
Fanny