I need to use amara.bindery.html.parse on HTML pages, but sometimes
(depending on the page's encoding) it raises a ValueError:
  File "/home/lm/workspace/scrapinmo/src/dataextractor.py", line 94, in __init__
    self.page = html.parse(content)
  File "/home/lm/entornos/scraping/local/lib/python2.7/site-packages/amara/bindery/html.py", line 250, in parse
    doc = parser.parse(inputsource(source, None).stream, encoding=encoding)
  File "/home/lm/entornos/scraping/local/lib/python2.7/site-packages/amara/lib/_inputsource.py", line 84, in __new__
    raise ValueError("Does not appear to be well-formed XML")
ValueError: Does not appear to be well-formed XML
The source of _inputsource.py has a FIXME comment at that line:
    else:
        #FIXME L10N
        raise ValueError("Does not appear to be well-formed XML")
How can I pass the sourcetype parameter when calling parse?
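In case it helps to narrow things down, here is a minimal sketch of a workaround that could be tried on the caller's side, assuming the failure really is tied to the byte encoding of the page. It only uses the Python standard library; to_utf8 and its list of candidate encodings are hypothetical names chosen for illustration, not part of amara, and I have not confirmed that this avoids the ValueError.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper (not part of amara): normalise raw page bytes to UTF-8
# before handing them to the parser. The candidate list is an assumption.

def to_utf8(raw, candidates=("utf-8", "cp1252", "latin-1")):
    """Decode raw bytes with the first candidate encoding that succeeds
    and return the text re-encoded as UTF-8 bytes."""
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
        return text.encode("utf-8")
    raise ValueError("none of the candidate encodings could decode the input")


# Example input: a page served as Latin-1 with a non-ASCII character.
page = u"<html><body>caf\u00e9</body></html>".encode("latin-1")
content = to_utf8(page)
# content could then be passed to html.parse(content) as in dataextractor.py
```

The order of the candidates matters: latin-1 accepts any byte sequence, so it has to come last and only acts as a fallback.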
Regards,
-- luismiguel (@lmorillas)