How to create a tag in version 4?


James

Mar 9, 2011, 3:08:32 PM
to beautifulsoup
I can't figure out how to make a new tag that works correctly in
version 4. I'm assuming it has something to do with my misuse of the
external parser, which I haven't used before.

I am getting my html like this from a local file:

parser = html5lib.HTMLParser(tree=html5lib.treebuilders.getTreeBuilder('beautifulsoup'))
builder = builder_registry.lookup('html5lib')()
html = file(filename).read()
data = parser.parse(html)

This seems to work, and I can do all the operations I would normally
do in BS3 except for creating tags. newTag = Tag(soup, "newTag") throws
the error that it needs 4 arguments, so I tried this after looking
through the source files:

newTag = Tag(parser, builder, "newTag")

This creates a tag, but it is a bs4.element.Tag and not a
BeautifulSoup.Tag like all of the tags in "data," and it doesn't work
or respond like a regular tag.

Aaron DeVore

Mar 10, 2011, 3:00:11 AM
to beauti...@googlegroups.com
On Wed, Mar 9, 2011 at 12:08 PM, James <janin...@gmail.com> wrote:
> I can't figure out how to make a new tag that works correctly in
> version 4. I'm assuming it has something to do with my misuse of the
> external parser, which I haven't used before.
>
> I am getting my html like this from a local file:
>
> parser = html5lib.HTMLParser(tree=html5lib.treebuilders.getTreeBuilder('beautifulsoup'))
> builder = builder_registry.lookup('html5lib')()
> html = file(filename).read()
> data = parser.parse(html)

parser.parse("...") is using the html5lib library's internal Beautiful
Soup 3 support (import BeautifulSoup) to create a Beautiful Soup 3
tree. To get a Beautiful Soup 4 tree, you need to use:

import bs4
data = bs4.BeautifulSoup(html, features="html5lib")

That uses Beautiful Soup 4's support for the html5lib parser instead
of the other way around. It will produce a tree with bs4.element.Tag
objects.
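
For later readers, here is a minimal sketch of adding a tag once the tree is built by bs4 itself. It assumes a released Beautiful Soup 4, where BeautifulSoup.new_tag() is available; that method is not mentioned in this thread and may not exist in the pre-release code discussed here. The sample HTML and the "div" tag name are made up for illustration, and the standard-library "html.parser" is used only so the sketch runs without html5lib installed.

```python
import bs4

# Hypothetical sample document, standing in for the file read in the question.
html = "<html><body><p>Existing paragraph</p></body></html>"

# Assumption: "html.parser" keeps the sketch dependency-free. Pass
# features="html5lib" instead, as in the reply above, if html5lib is installed.
soup = bs4.BeautifulSoup(html, features="html.parser")

# new_tag() creates a bs4.element.Tag tied to the same soup, so it matches
# the tags that came out of the parser.
new_tag = soup.new_tag("div")
new_tag.string = "Inserted text"
soup.body.append(new_tag)

# Both the parsed tag and the new tag are the same bs4 class.
existing = soup.find("p")
same_type = type(new_tag) is type(existing)
```

Because the new tag and the parsed tags come from the same bs4 tree, they should behave consistently, which is the mismatch the original code ran into when it mixed a Beautiful Soup 3 tree with a bs4 Tag.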

> This creates a tag, but it is a bs4.element.Tag and not a
> BeautifulSoup.Tag like all of the tags in "data," and it doesn't work
> or respond like a regular tag.

What are the differences in the behavior of Beautiful Soup 4's Tag?
Tag's behavior needs to be essentially identical to Beautiful Soup 3.

-Aaron DeVore

James

Mar 10, 2011, 10:46:21 AM
to beautifulsoup
Aaron, a million thanks. Now that I am importing the data correctly, and
my new tags and old tags are being built using the same tree builder,
everything is working as expected in terms of creating new tags.

James