Re: Converts attributes to lowercase?

Leonard Richardson

Sep 8, 2012, 9:03:45 AM
to beauti...@googlegroups.com
> The attribute names eg. data-docTitle, are automatically converted to
> lowercase ie. data-doctitle. Is this standard behavior?

It is for a document parsed as HTML:

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#other-parser-problems

"Because HTML tags and attributes are case-insensitive, all three HTML
parsers convert tag and attribute names to lowercase. That is, the
markup <TAG></TAG> is converted to <tag></tag>. If you want to
preserve mixed-case or uppercase tags and attributes, you’ll need to
parse the document as XML."

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(markup, "xml")

Passing in features="lxml", as you did, will use lxml's HTML parser,
not its XML parser.
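
To see the HTML-versus-XML difference without installing anything, here is a rough sketch using Python's standard-library parsers rather than Beautiful Soup itself. The markup string and the AttrNameCollector class are made up for illustration. The html.parser module lowercases attribute names before handing them over, while an XML parser (xml.etree.ElementTree here, standing in for lxml's XML parser) leaves them as written.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Illustrative markup with a mixed-case attribute name (an assumption,
# modelled on the data-docTitle attribute from the question).
MARKUP = '<div data-docTitle="Report"></div>'


class AttrNameCollector(HTMLParser):
    """Hypothetical helper: records attribute names as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the HTML parser has
        # already lowercased each name by the time this method is called.
        self.names.extend(name for name, _ in attrs)


# Parse as HTML: attribute names come back lowercased.
collector = AttrNameCollector()
collector.feed(MARKUP)
collector.close()
html_names = collector.names

# Parse as XML: attribute names are case-sensitive and kept as written.
xml_names = list(ET.fromstring(MARKUP).attrib)

print(html_names)  # ['data-doctitle']
print(xml_names)   # ['data-docTitle']
```

With Beautiful Soup the equivalent switch is the second argument: "xml" selects the XML parser, while "lxml" or "html.parser" select HTML parsers that lowercase names.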

Leonard