I think I've found a deliberate quirk with BeautifulSoup but I'm not sure why it exists.
I'm trying to match an empty attribute, specifically the 'selected' attribute in the option tag. Below is an example chunk of HTML - I'm trying to match the option with value 'B'.
<select class="period" name="period">
<option value=" ">21 May 2013 until 01 September 2013</option>
<option value="A">02 September 2013 until 03 September 2013</option>
<option value="B" selected>04 September 2013 until 07 September 2013</option>
<option value="C">08 September 2013 until 12 September 2013</option>
<option value="D">13 September 2013 onwards</option>
</select>
According to the docs, the following command should match it.
soup.find_all('option',attrs={'selected':True})
... however, it matches nothing; an empty list is returned.
I'm using Python 3.3 with beautifulsoup4 4.3.1. I do not have any other XML/HTML parsers installed (neither lxml nor html5lib), so I believe the built-in parser (html.parser) is being used.
I believe this doesn't happen with lxml or html5lib because those parsers implicitly convert the missing attribute value into an empty string.
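To check that theory, here is a minimal sketch that uses only the standard library's html.parser, which as far as I understand is what BeautifulSoup falls back to when nothing else is installed. The AttrCollector class is my own name, purely for illustration, and is not part of BeautifulSoup; it just records the attributes the parser reports for each option tag.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records the attributes reported for each <option>."""

    def __init__(self):
        super().__init__()
        self.option_attrs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a valueless attribute
        # such as `selected` is reported with the value None.
        if tag == "option":
            self.option_attrs.append(dict(attrs))


collector = AttrCollector()
collector.feed(
    '<select class="period" name="period">'
    '<option value="A">02 September 2013 until 03 September 2013</option>'
    '<option value="B" selected>04 September 2013 until 07 September 2013</option>'
    '</select>'
)
collector.close()

for attrs in collector.option_attrs:
    print(attrs)
```

For option 'B' the attribute is present, but its value comes back as None rather than an empty string. That would explain the empty result if the True test treats a None value the same as a missing attribute.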
The fix is simple (remove the condition on that line), but I'm not sure what the consequences are.
Sam