sorry if this is a FAQ, I've done a quick search and haven't found an
answer.
I'm willing to convert a malformed HTML to XML using the same rules
that firefox uses. For example, in a document I have the HTML:
<tr>
<td></td>
<td>
<form>
<input ...>
</td>
</tr>
<tr>
<td></td>
<td>
</form>
</td>
</tr>
and firefox interpret it as:
<tr>
<td></td>
<td>
<form />
</td>
</tr>
<tr>
<td></td>
<td>
</td>
</tr>
Is it possible to reuse some firefox component to help me to do this
conversion? If so, is there any python binding to use it in a console
application?
Thank you for your attention.
regards,
Bruno
Your best bet is probably to use html5lib:
http://code.google.com/p/html5lib/
~fantasai
Thanks! I'll try it!