Re: Segmentation fault loading dblp xml

Leonard Richardson

Nov 5, 2012, 8:58:44 AM
to beautifulsoup
Rafael,

This sounds like lxml bug 984936:

https://bugs.launchpad.net/lxml/+bug/984936

If this is it, you can fix the bug by upgrading your lxml installation
to 2.3.5. You can also work around the bug by removing the doctype at
the beginning of the document.
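
A minimal sketch of the doctype workaround, assuming the document (or at least its beginning) is available as a byte string. The helper name `strip_doctype` is hypothetical and not part of Beautiful Soup or lxml, and this has not been checked against the dblp file itself:

```python
import re

# Matches a DOCTYPE declaration, with or without an internal subset in [...].
# This pattern is an illustrative assumption; it does not handle a ']' or '>'
# that appears inside a quoted string within the internal subset.
_DOCTYPE_RE = re.compile(br"<!DOCTYPE[^>\[]*(\[[^\]]*\])?\s*>", re.IGNORECASE)


def strip_doctype(data):
    """Return `data` (bytes) with its first DOCTYPE declaration removed."""
    return _DOCTYPE_RE.sub(b"", data, count=1)
```

The result can then be passed to `BeautifulSoup(strip_doctype(data), "xml")`. Note that any character entities declared in the DTD will no longer be defined once the doctype is gone.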

It's also possible that the segfault happens because the document is
very large (over a gigabyte). If that's the case, I don't have a
solution. Once a document gets that large, it might be better to work
directly with lxml.
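
For what working directly with lxml might look like, here is a rough sketch using `iterparse`, which reads the file incrementally instead of building the whole tree. The function name `iter_titles` and the `article`/`title` element names are assumptions for illustration; adjust them to the records you need. The stdlib fallback is there only so the sketch runs without lxml installed:

```python
import io

try:
    from lxml import etree  # preferred, as suggested above
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback with a compatible iterparse


def iter_titles(source, record_tag="article"):
    """Yield the <title> text of each top-level record named `record_tag`."""
    depth = 0
    for event, elem in etree.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth == 1:  # just closed a direct child of the root element
            if elem.tag == record_tag:
                yield elem.findtext("title")
            elem.clear()  # drop the record's children and text to limit memory use


if __name__ == "__main__":
    demo = io.BytesIO(b"<dblp><article><title>A</title></article></dblp>")
    for title in iter_titles(demo):
        print(title)
```

The emptied record elements stay attached to the root, so memory use still grows slowly on a very large file. With lxml, `iterparse` also accepts `load_dtd=True` if the document relies on entities defined in its DTD.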

Leonard

On Mon, Nov 5, 2012 at 4:36 AM, Rafael Barbosa <rrba...@gmail.com> wrote:
> Hi,
>
> I am new to Beautiful Soup. I just installed 'python-bs4' (version 4.1.0-1)
> on my machine and tried to load this XML file:
> http://dblp.uni-trier.de/xml/dblp.xml
>
> After a few seconds I get a segmentation fault.
>
> I am using Python 2.7.3rc2 on Debian (3.2.0-3-amd64).