Incomplete support for [attribute=value] selector

Radu Dan

Oct 28, 2013, 5:55:30 AM
to beauti...@googlegroups.com
Using select() to search for nodes whose attribute values contain square brackets is pretty much impossible:

>>> import bs4
>>> bs4.__version__
'4.3.0'
>>> document = bs4.BeautifulSoup("<input name=\"foo[]\"/>")
>>> document.select("input")
[<input name="foo[]"/>]

>>> document.select('[name=foo\[\]]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "bs4\element.py", line 1309, in select
    'Unsupported or invalid CSS selector: "%s"' % token)
ValueError: Unsupported or invalid CSS selector: "[name=foo\[\]]"

>>> document.select('[name="foo[]"]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "bs4\element.py", line 1309, in select
    'Unsupported or invalid CSS selector: "%s"' % token)
ValueError: Unsupported or invalid CSS selector: "[name="foo[]"]"

>>> document.select('[name="foo\[\]"]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "bs4\element.py", line 1309, in select
    'Unsupported or invalid CSS selector: "%s"' % token)

>>> document.select('[name=foo\\5b \\5d ]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "bs4\element.py", line 1309, in select
    'Unsupported or invalid CSS selector: "%s"' % token)
ValueError: Unsupported or invalid CSS selector: "[name=foo\5b"

>>> document.select('[name="foo\\5b \\5d "]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "bs4\element.py", line 1309, in select
    'Unsupported or invalid CSS selector: "%s"' % token)
ValueError: Unsupported or invalid CSS selector: "[name="foo\5b"

>>> document.select('[name=foo\\00005b\\00005d]')
[]

>>> document.select('[name="foo\\00005b\\00005d"]')
[]


All of these escape forms are defined in CSS 2 ( http://www.w3.org/TR/CSS2/syndata.html#characters ) and thus should work. The supported selectors section of the docs links to the CSS 2 spec, which implies that you support it completely.
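
For reference, the three escape forms tried above (a backslash before the literal character, a short hex escape ended by a space, and a six-digit hex escape) should all decode to the same value under the linked CSS 2 section. Below is a minimal sketch of that decoding, assuming only the basic rules from that section. The helper name `css_unescape` is hypothetical and is not part of Beautiful Soup; backslash-newline handling inside strings and out-of-range code points are deliberately left out.

```python
import re

# One escape is either:
#   - a backslash, 1-6 hex digits, and one optional trailing whitespace
#     character that terminates the escape, or
#   - a backslash followed by any single non-newline character.
_CSS_ESCAPE = re.compile(r"\\(?:([0-9a-fA-F]{1,6})[ \t\r\n\f]?|([^\r\n\f]))")


def css_unescape(text):
    """Decode CSS backslash escapes in `text` (hypothetical helper, sketch only)."""
    def replace(match):
        hex_digits, literal = match.groups()
        if hex_digits is not None:
            # Hex escape: the digits name a code point.
            # Code points above 0x10FFFF are not handled in this sketch.
            return chr(int(hex_digits, 16))
        # Character escape: the backslash just removes any special meaning.
        return literal
    return _CSS_ESCAPE.sub(replace, text)


# The three spellings of the attribute value used in the session above.
forms = [r"foo\[\]", r"foo\5b \5d ", r"foo\00005b\00005d"]
decoded = [css_unescape(form) for form in forms]
```

Under these assumptions, every entry in `decoded` is the plain string `foo[]`, which is the value the selectors above are trying to match.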

Leonard Richardson

Oct 28, 2013, 7:23:08 AM
to beautifulsoup
> All these behaviors are defined in CSS 2 ( http://www.w3.org/TR/CSS2/syndata.html#characters ) and thus should work (the supported selectors section of the docs link to the CSS 2 spec, so you are implying that you support it completely)

I didn't mean to imply that Beautiful Soup 4 supported the complete CSS 2 spec. Since that link seems to confuse more than it helps, I've removed it from the docs and reworded that section to make it as clear as I could that Beautiful Soup 4 supports a *subset* of CSS selectors.

As always, Beautiful Soup's CSS selector implementation comes from user contributions. If you feel strongly enough about this to add support for square brackets to Tag.select(), I'll integrate your changes into Beautiful Soup.

Leonard
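
As a possible workaround while select() lacks this support, the same node can usually be reached without CSS selectors by matching the attribute value directly with find_all(). This is a minimal sketch and not something proposed in the thread. It reuses the markup from the original post, and the explicit "html.parser" argument is an assumption made here so that the example does not depend on an optional parser being installed.

```python
import bs4

# Same document as in the original report; the parser choice is an assumption.
document = bs4.BeautifulSoup('<input name="foo[]"/>', "html.parser")

# `name` is already the first positional parameter of find_all() (the tag
# name), so the attribute called "name" has to be passed through `attrs`.
matches = document.find_all("input", attrs={"name": "foo[]"})

# The attribute value is compared as a plain string, so the square
# brackets need no escaping here.
first_name = matches[0]["name"] if matches else None
```

Because the value is passed as an ordinary Python string rather than parsed as selector syntax, none of the escaping questions above come up.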
 
