wild card findAll


khinester

Nov 18, 2008, 11:04:51 AM
to beautifulsoup
Hello,
This is perhaps a simple question, but I have been unable to find a
solution.

I have a table whose <td> cells have different classes:

<table>
<tr>
<td class="col_red">$35</td>
<td class="col_orange">$65</td>
<td class="col_green">$85</td>
</tr>
</table>


What is the simplest way to extract the value of each <td>?

price = soup.find('td', {'class': 'col_*'})

Can I use a wildcard such as col_* for the class value, or is there a
simpler way to find all the cells whose text contains '$'?

Cheers
Norman

Eric Lee

Nov 18, 2008, 2:32:13 PM
to beauti...@googlegroups.com
Without seeing the whole document I can only guess, but you could query for all <td> elements that have a class attribute with

   soup.findAll('td', {'class': True})

which would match <td class="col_red"> but not a bare <td> (no class attribute given). Note that find() returns only the first match; findAll() returns every match.

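If you're trying to match <td> elements whose class attribute value begins with "col_", you could use a lambda expression:

   soup.findAll('td', {'class': lambda x: x and x.startswith('col_')})

The "x and" guard is there because the function is also called with None for any <td> that has no class attribute, and None has no startswith method.

Or something similar.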
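
For anyone reading this with the newer bs4 package, here is a minimal sketch of the same prefix match as a complete script. It assumes bs4 is installed and uses find_all, the bs4 spelling of findAll. The extra class-less <td> and the helper name is_col_class are made up for illustration and are not part of the original table.

```python
from bs4 import BeautifulSoup

# The table from the question, plus one hypothetical cell without a class.
html = """
<table>
<tr>
<td class="col_red">$35</td>
<td class="col_orange">$65</td>
<td class="col_green">$85</td>
<td>no class</td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")


def is_col_class(value):
    # value is None when the tag has no class attribute at all,
    # so check for that before calling startswith.
    return value is not None and value.startswith("col_")


# Collect every <td> whose class value starts with "col_".
cells = soup.find_all("td", {"class": is_col_class})

# Pull the text out of each matching cell.
prices = [cell.get_text(strip=True) for cell in cells]
print(prices)
```

A named function is used instead of a lambda only so the None check has room for a comment; a lambda with the same body should behave the same way.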

-e

Aaron DeVore

Nov 18, 2008, 2:59:40 PM
to beauti...@googlegroups.com

You can also use a regular expression if you need more sophisticated matching:

import re

# "^" anchors the pattern so that only values starting with "col_" match
pattern = re.compile("^col_")
soup.findAll('td', {'class': pattern})

However, the lambda works perfectly well in this situation.
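
One caveat with the regular expression, sketched below using only the standard re module: as far as I know, BeautifulSoup tests a compiled pattern with search() rather than match(), so a pattern without "^" can match in the middle of a value. The class values "protocol_x" and "total" are invented for the example.

```python
import re

# Pattern without an anchor versus one anchored to the start of the value.
unanchored = re.compile("col_.*")
anchored = re.compile("^col_")

# Hypothetical class values; only the first two start with "col_".
class_values = ["col_red", "col_orange", "protocol_x", "total"]

# search() scans the whole string, so the unanchored pattern also
# accepts "protocol_x", which merely contains "col_".
loose = [v for v in class_values if unanchored.search(v)]
strict = [v for v in class_values if anchored.search(v)]

print(loose)
print(strict)
```

If every class in the document really does begin with "col_" or has nothing like it, both patterns give the same result, but the anchored one states the intent more exactly.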

-Aaron
