CSS selector does not support both class and attribute

PeterB

Jan 21, 2018, 11:26:12 AM
to beautifulsoup
The CSS selector syntax "element.class[attribute=value]" does not work.

I know the BeautifulSoup docs say that it only provides a subset of CSS selectors, but it would be really great if it supported such multi-level ones!

The code below illustrates the problem: theSpan2 is the selector that does not work (I am using BS version 4.6.0).

import bs4

myHTML = """
<html>
    <head>
        <style type="text/css">
            .theSpan[data-value=spanner] { background-color: grey; }
        </style>
    </head>
    <body>
        Here is
        <span class="theSpan" data-value="spanner">SPAN</span>
    </body>
</html>
"""

soup = bs4.BeautifulSoup(myHTML, "html.parser")

theSpan1 = soup.select('span.theSpan')
theSpan2 = soup.select('span.theSpan[data-value=spanner]')
theSpan3 = soup.select('span[data-value=spanner]')

print("theSpan1 matched", len(theSpan1), "item:", theSpan1)
print("theSpan2 matched", len(theSpan2), "item:", theSpan2)
print("theSpan3 matched", len(theSpan3), "item:", theSpan3)
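
For anyone stuck on a version where the combined selector fails, one possible workaround is to split the match into two steps, or to bypass CSS selectors and use find_all(). This is only a sketch: the shortened HTML string and the names by_filter and by_find_all are made up for illustration.

```python
import bs4

html = '<p>Here is <span class="theSpan" data-value="spanner">SPAN</span></p>'
soup = bs4.BeautifulSoup(html, "html.parser")

# Option 1: select on the class alone, then filter on the attribute in Python.
by_filter = [
    el for el in soup.select("span.theSpan")
    if el.get("data-value") == "spanner"
]

# Option 2: skip CSS and let find_all() match the tag name and both attributes.
by_find_all = soup.find_all(
    "span", attrs={"class": "theSpan", "data-value": "spanner"}
)

print("filter matched", len(by_filter), "item:", by_filter)
print("find_all matched", len(by_find_all), "item:", by_find_all)
```

Both options rely only on the class selector and on find_all(), so they should not depend on the combined class-plus-attribute syntax being supported.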


PeterB

Apr 22, 2019, 5:38:13 AM
to beautifulsoup
I have recently upgraded to BeautifulSoup 4.7.1, and this problem is now fixed, probably because of the new "soupsieve" CSS module it uses.

facelessuser

Apr 22, 2019, 1:15:40 PM
to beautifulsoup
Yes, this is definitely due to BeautifulSoup 4.7 using soupsieve. The old algorithm was very limited. While it handled basic selectors, it had a hard time when multiple basic selectors were combined into a compound selector. Selector support was rewritten from the ground up and should handle compound and complex selectors much better. Pseudo-class support is much better as well.
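
As a rough illustration of the difference, the sketch below runs a compound selector, a complex selector, and a pseudo-class against a small document. It assumes BeautifulSoup 4.7 or later with soupsieve installed. The HTML and the variable names are invented for the example and are not taken from the original report.

```python
import bs4

html = """
<div id="main">
    <span class="theSpan" data-value="spanner">SPAN</span>
    <span class="theSpan" data-value="other">OTHER</span>
    <span data-value="spanner">PLAIN</span>
</div>
"""

soup = bs4.BeautifulSoup(html, "html.parser")

# Compound selector: type, class, and attribute all on the same element.
compound = soup.select("span.theSpan[data-value=spanner]")

# Complex selector: two compound selectors joined by a child combinator.
complex_ = soup.select("div#main > span.theSpan[data-value=other]")

# Pseudo-class: spans with the class whose attribute is NOT "spanner".
negated = soup.select("span.theSpan:not([data-value=spanner])")

print([el.get_text() for el in compound])
print([el.get_text() for el in complex_])
print([el.get_text() for el in negated])
```

On a soupsieve-backed version, the first selector should match only the span that has both the class and the attribute value, which is the case from the original post.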