Aron Griffis
May 20, 2013, 2:49:48 PM
to beauti...@googlegroups.com
Jan-Philip Gehrcke wrote: [Fri May 17 2013, 11:46:55AM EDT]
> Generally spoken, my question is if we can create a new
> bs4.BeautifulSoup object from a collection of existing tags.
Slightly on a tangent from your original question, creating a new
soup object from an existing tree is something I've wondered
about too.
BeautifulSoup doesn't seem to provide much in the way of
a functional interface; every operation mutates the tree rather
than returning a new data structure. So the best answer it
is probably to copy the tree, then extract from the copy, like this:
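>>> import copy
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<script>x=3</script>', 'html5lib')
>>> # extract() detaches the tag from the copy, so soup itself is untouched
>>> script = copy.deepcopy(soup).find('script').extract()
>>> print(script)
<script>x=3</script>
>>> print(soup)
<html><head><script>x=3</script></head><body></body></html>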
I don't know if that's interesting or helpful to you. The other
replies regarding the "hidden" attribute are probably more
useful. But it will be helpful for me to have this technique of
extracting a subtree in the future.
In particular it might be helpful for form handling similar to
what I mentioned regarding WebTest on the "ampersands" thread
(accidental subject that I forgot to fix before posting). It
seems like a better approach to extract the "form" subtree as
objects without serializing and re-parsing.
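For the form case, here is a rough sketch of what I have in mind. The markup, the field names and the extract_copy helper are all made up for illustration (the helper is not part of bs4), and I'm using the stdlib 'html.parser' builder only so the example has no extra dependency. It assumes deepcopy behaves on your bs4 version the way it does in the session above.

```python
import copy

from bs4 import BeautifulSoup

# Made-up page used only for this illustration.
HTML = """
<html><body>
<p>intro</p>
<form action="/login" method="post">
  <input type="text" name="user" value="aron">
  <input type="hidden" name="token" value="abc123">
</form>
</body></html>
"""


def extract_copy(soup, name):
    """Return a detached copy of the first <name> tag, or None if absent.

    Hypothetical helper, not part of bs4: it deep-copies the whole tree
    and extracts from the copy, so `soup` is left unchanged.
    """
    tag = copy.deepcopy(soup).find(name)
    return tag.extract() if tag is not None else None


soup = BeautifulSoup(HTML, "html.parser")
form = extract_copy(soup, "form")

# Read the field values straight off the detached subtree,
# without serializing and re-parsing it.
fields = {inp["name"]: inp.get("value", "") for inp in form.find_all("input")}
print(form["action"], fields)
```

Copying the whole tree just to pull out one subtree is wasteful for large documents, so this is only meant to show the shape of the idea.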
Aron