tag attribute returning list rather than string


James Burford

Jan 25, 2017, 7:28:47 AM
to beautifulsoup
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<p class="someclass">inner text</p>')
>>> attr = soup.p['class']
>>> print(attr)
['someclass']



Per the docs, shouldn't the variable attr contain a plain string, 'someclass', rather than a list?

Strangely, when I try the code above replacing the input with <a href="somehref">inner text</a>, the attribute returns a string as expected.

However, <a class="someclass">inner text</a> returns a list.

I searched several times trying to find others having the same issue, but couldn't find anything. Apologies if this is a well-known issue.

Using version 4.5.3 inside a virtualenv.


leonardr

Jan 25, 2017, 7:35:50 AM
to beautifulsoup
James,

Thanks for writing in. This is by design. HTML allows a few attributes, like 'class' (but not 'href'), to have multiple values, and Beautiful Soup returns the value of such an attribute as a list.

This is documented here:

https://www.crummy.com/software/BeautifulSoup/bs4/doc/#multi-valued-attributes
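
A minimal sketch of the behaviour described above, in case it helps. The markup, the second class name "other", and the variable names are made up for illustration, and the parser is named explicitly as "html.parser"; joining the list with spaces is just one way to get back a plain string.

```python
from bs4 import BeautifulSoup

# Illustrative markup: one multi-valued attribute (class) and one
# single-valued attribute (href) on the same tag.
html = '<a class="someclass other" href="somehref">inner text</a>'
soup = BeautifulSoup(html, "html.parser")
link = soup.a

classes = link["class"]  # multi-valued attribute: comes back as a list
href = link["href"]      # single-valued attribute: comes back as a str

# Rebuild the original attribute string from the list when needed.
class_string = " ".join(classes)

print(classes)       # ['someclass', 'other']
print(href)          # somehref
print(class_string)  # someclass other
```

With a single class, as in the original post, the list simply has one element, so link["class"][0] would also give the plain string.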

Thanks,
Leonard

James Burford

Jan 25, 2017, 8:58:48 AM
to beautifulsoup
Understood. I was confused by the Quick Start / dormouse example, in which a paragraph tag is shown to return a plain string when its class attribute is indexed.

Thank you for your response and for the effort you put into creating this software, which has allowed me to bring ideas to life.


