On Tue, Apr 20, 2021 at 11:09 AM Linus Torvalds
<torv...@linuxfoundation.org> wrote:
>
> Sadly, our error reporting in this area isn't great, because xml
> parsing is all done by a different library (libxml2), and it's just
> xmlReadMemory() failing. It doesn't tell us where or anything like
> that.
Dirk forwarded the problematic file to me for checking, and it
actually seems to be an outright problem with libxml2.
Adding the XML_PARSE_HUGE flag to the xmlReadMemory() call seems to
fix it. Very odd broken defaults for libxml2.
It might possibly be a good idea to also add

    XML_PARSE_COMPACT = 65536 : compact small text nodes; no
    modification of the tree allowed afterwards (will possibly crash if
    you try to modify the tree)

which should make the libxml2 parser use less memory, but I didn't
test that. Afaik, we never modify the XML tree - we only use libxml2
for parsing - so that should be good.
But I didn't do that part. I only added the XML_PARSE_HUGE flag, and
verified that that seems to fix the problem with that particular Greek
input.
I did a github pull request at
https://github.com/subsurface/subsurface/pull/3231
for this, but I really don't know libxml2 very well, so let's at least
wait for all the tests to finish.
Linus