BeautifulStoneSoup xml parsing issue
Nikolay
Jan 19, 2009, 4:07:57 AM
to beautifulsoup
How can I disable the following xml-rewriting feature (or bug?):
In [4]: print BeautifulSoup.__version__
3.1.0.1
In [5]: print BeautifulStoneSoup("<a><b><a></a></b></a>").prettify()
<a>
<b>
</b>
</a>
<a>
</a>
For instance, Django's object-to-XML dumper generates similar markup, so
its dump becomes invalid after being processed by BS.
Have a nice day,
Nikolay.
Leonard Richardson
Jan 19, 2009, 8:36:59 AM
to beauti...@googlegroups.com
Nikolay,
You need to tell BS that <a> tags can be nested within themselves.
Customizing the list of nestable tags is covered here:
http://www.crummy.com/software/BeautifulSoup/documentation.html#Customizing%20the%20Parser
Leonard
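A side note on the rewriting reported above: the sketch below does not use the parser customization Leonard points to. It is an independent check with Python's standard-library xml.etree.ElementTree, a strict XML parser, showing what nesting the original input has and why the rewritten output no longer counts as a single well-formed document. The helper names tag_paths and is_well_formed are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the slash-separated path of every element, in document order."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        paths.append(path)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return paths


def is_well_formed(xml_text):
    """True if a strict XML parser accepts the text as one document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


original = "<a><b><a></a></b></a>"
rewritten = "<a><b></b></a><a></a>"  # the structure BS printed in the report

print(tag_paths(original))        # ['/a', '/a/b', '/a/b/a']
print(is_well_formed(original))   # True
print(is_well_formed(rewritten))  # False: two top-level elements
```

Comparing tag_paths before and after a round trip through any parser is a cheap way to confirm that the nesting survived.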
Nikolay Panov
Jan 19, 2009, 9:03:42 AM
to beauti...@googlegroups.com
Thank you!
Have a nice day,
Nikolay.
Nikolay Panov
Jan 19, 2009, 9:10:05 AM
to beauti...@googlegroups.com
Another question. Why does BS do this:
In [13]: print MyBeautifulSoup("<A></A>").prettify()
<a>
</a>
How can I prevent lowercasing?
Have a nice day,
Nikolay.
Leonard Richardson
Jan 19, 2009, 9:18:14 AM
to beauti...@googlegroups.com
> Another question. Why does BS do this:
> In [13]: print MyBeautifulSoup("<A></A>").prettify()
> <a>
> </a>
> How can I prevent lowercasing?
BS does this because HTMLParser does it. If you need to preserve tag
case, try lxml.
Leonard
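To illustrate the point about tag case, here is a minimal sketch using only the Python 3 standard library. It assumes html.parser.HTMLParser (the successor of the Python 2 HTMLParser module) behaves the same way in this respect, and it substitutes xml.etree.ElementTree for lxml so that it runs without extra packages. lxml.etree offers a largely ElementTree-compatible API, so the same idea should carry over, but that is not verified here. TagCollector, html_parser_tags, and xml_parser_tags are hypothetical names for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record start-tag names exactly as HTMLParser reports them."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes the tag name already converted to lower case.
        self.start_tags.append(tag)


def html_parser_tags(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


def xml_parser_tags(markup):
    # A strict XML parser treats names as case-sensitive and keeps them as written.
    return [elem.tag for elem in ET.fromstring(markup).iter()]


markup = "<A><B></B></A>"
print(html_parser_tags(markup))  # ['a', 'b']
print(xml_parser_tags(markup))   # ['A', 'B']
```

The lowercasing happens inside the HTML parser before any tree is built, which is why it cannot be switched off from the tree-building side.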