Hello everyone,
I'm trying to move <p> elements that are nested inside another <p> so that they come right after that parent <p>. I'm using BS4 with Python 2.7.3 on Gentoo. Here's the code:
for p in soup.find_all("p"):
    for desc in p.descendants:
        if isinstance(desc, Tag):
            if desc.name == "p":
                p.insert_after(desc)
My input may look like this:
<html>
<head>
<title>title</title>
</head>
<body>
<p>
<p>first nested paragraph</p>
<p><img src="img.png"></p>
<p><br></p>
Text inside parent paragraph.
</p>
</body>
</html>
However, once p.insert_after(desc) is called, the loop processes only the first nested paragraph. My guess is that the loop tries to continue through the descendants/siblings of the first nested paragraph, and since that paragraph has been moved, the iteration breaks. I also tried desc.extract(); after calling it, the loop stops with AttributeError: 'NoneType' object has no attribute 'next_element'.
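The only workaround I can think of so far is to take a snapshot of the nested paragraphs before moving anything, so that the moves never happen in the middle of a live iteration. Here is a minimal sketch of that idea. It assumes the "html.parser" backend (which, as far as I can tell, keeps the nested <p> as written, while other parsers may restructure them) and relies on find_all() returning a list rather than a generator. The function name hoist_nested_paragraphs is just something I made up for this example.

```python
from bs4 import BeautifulSoup


def hoist_nested_paragraphs(soup):
    """Move every <p> nested inside another <p> to just after that parent.

    hoist_nested_paragraphs is a made-up helper name, not part of bs4.
    """
    # find_all() returns a list, so both loops iterate over snapshots
    # and are not affected by the tree being modified underneath them.
    for parent in soup.find_all("p"):
        nested = parent.find_all("p")
        anchor = parent
        for child in nested:
            # insert_after() detaches child from its current position
            # and re-inserts it right after anchor.
            anchor.insert_after(child)
            # Advance the anchor so the original order is preserved.
            anchor = child
    return soup


# Simplified version of the input; "html.parser" is an assumption here.
html = "<body><p><p>one</p><p>two</p>Text inside parent.</p></body>"
soup = hoist_nested_paragraphs(BeautifulSoup(html, "html.parser"))
texts = [p.get_text() for p in soup.find_all("p")]
print(texts)
```

With that input, the nested paragraphs should end up as siblings following the outer one, in their original order, and no <p> should contain another <p> afterwards. I don't know whether this is the intended way to do it, though.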
Can anyone comment on this and shed some light on how the loop over descendants works? I'm open to any alternatives.
Thanks!
P.