find() not finding anything

Greg

Jun 15, 2016, 3:56:38 PM
to beautifulsoup
Here is an example that replicates the problem I am having:
from bs4 import BeautifulSoup
import re

soup = BeautifulSoup('<root><td><label for="pol_nbr">Policy number<br></label></td></root>', 'html.parser')
anchor = soup.find('label', text=re.compile('Policy number'))

But anchor is None. Why isn't it the label tag?

Jim Tittsler

Jun 15, 2016, 9:23:15 PM
to beautifulsoup
On Thu, Jun 16, 2016 at 4:20 AM, Greg <greg...@gmail.com> wrote:
> But anchor is none. Why is it not the label tag?

I can't help, but I note it does work if the <br> isn't present.
(Which makes me suspect I don't really understand the string=
(formerly text=) argument.)
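
A minimal sketch of what seems to be going on here. As I understand the Beautiful Soup documentation, string= on a tag search is compared against the tag's .string, and .string is None whenever a tag has more than one child. With the <br> present, the label has two children (the text and the <br> tag), so nothing matches. The variable names below are illustrative only, and the two workarounds at the end are just one possible approach.

```python
import re
from bs4 import BeautifulSoup

with_br = BeautifulSoup(
    '<root><td><label for="pol_nbr">Policy number<br></label></td></root>',
    'html.parser')
without_br = BeautifulSoup(
    '<root><td><label for="pol_nbr">Policy number</label></td></root>',
    'html.parser')

pattern = re.compile('Policy number')

# With the <br>, the label has two children, so .string is None.
label = with_br.find('label')
children = label.contents
label_string = label.string

# string= is checked against .string, so the first search finds nothing.
miss = with_br.find('label', string=pattern)
hit = without_br.find('label', string=pattern)

# Possible workaround 1: filter with a function that looks at all the text.
by_function = with_br.find(
    lambda tag: tag.name == 'label'
    and pattern.search(tag.get_text()) is not None)

# Possible workaround 2: find the text node itself, then step up to its parent.
by_parent = with_br.find(string=pattern).parent
```

Either workaround should return the label tag from the original example, because neither depends on the label having exactly one child.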

Givon

Aug 15, 2016, 11:02:37 PM
to beautifulsoup

Perhaps it's because of the parser. The <br> tag usually doesn't have to be closed, but in XHTML it does. Either way, the parser will close certain tags, and different parsers handle unclosed tags differently. Try html5lib as the parser and see what happens.
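
One hedged way to test the parser suggestion is to look at what each installed parser puts inside the label. The helper name children_by_parser is made up for this sketch, and lxml and html5lib are skipped if they are not installed. If the <br> ends up as a second child of the label under a given parser, I would expect find('label', string=...) to come back empty there too, since a tag with more than one child has .string set to None.

```python
from bs4 import BeautifulSoup, FeatureNotFound

MARKUP = '<root><td><label for="pol_nbr">Policy number<br></label></td></root>'


def children_by_parser(markup, parsers=('html.parser', 'lxml', 'html5lib')):
    """Map each available parser name to the label's children, as strings."""
    report = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(markup, name)
        except FeatureNotFound:
            # This parser is not installed here; skip it.
            continue
        label = soup.find('label')
        if label is None:
            report[name] = None
        else:
            report[name] = [str(child) for child in label.contents]
    return report


report = children_by_parser(MARKUP)
for parser_name, children in report.items():
    print(parser_name, children)
```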