Efficient way to unpack an HTML table?

OlyDLG

Oct 16, 2016, 8:36:47 PM

to beautifulsoup

Hi, all. I have a 2000-row x 19-column HTML table from which I need to extract the text while preserving the table-like structure (I'm loading it into a NumPy recarray).

contents = [[item.text for item in row.find_all('td')] for row in table.find_all('tr')]

(where table has been extracted using soup.find('table')) works; I'm simply wondering if there's a more "Soup-onic" way.

Thanks!
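
For readers who want to try the approach above, here is a minimal runnable sketch. The tiny two-column table is a made-up stand-in for the real 2000 x 19 one, and the use of get_text(strip=True) and the header-row filter are assumptions added for illustration, not part of the original one-liner. Note that a row containing only th cells yields an empty list from find_all('td'), so it is skipped here.

```python
from bs4 import BeautifulSoup

# Hypothetical sample data standing in for the real 2000 x 19 table.
html = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>alpha</td><td> 1.5 </td></tr>
  <tr><td>beta</td><td>2.0</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

contents = []
for row in table.find_all("tr"):
    # get_text(strip=True) is like .text but trims surrounding whitespace.
    cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
    if cells:  # header rows hold only <th> cells, so they produce no <td> text
        contents.append(cells)

print(contents)  # [['alpha', '1.5'], ['beta', '2.0']]
```

The result is a list of rows, each a list of strings, which is the same shape the nested list comprehension produces.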

OlyDLG

Oct 19, 2016, 11:45:05 PM

Just a little update: I've switched to loading the nested list comprehension result into a pandas DataFrame, and that's working well.
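
A rough sketch of what that might look like, assuming the table has a header row of th cells to use as column names. The sample table, the column names, and the numeric conversion step are all hypothetical and only there to make the example self-contained.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical sample data; the real table is 2000 rows x 19 columns.
html = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>alpha</td><td>1.5</td></tr>
  <tr><td>beta</td><td>2.0</td></tr>
</table>
"""

table = BeautifulSoup(html, "html.parser").find("table")

# Column names from the header cells (assumes the table has <th> cells).
header = [th.get_text(strip=True) for th in table.find_all("th")]

# Same nested list comprehension as in the first post, minus empty rows.
rows = [[td.get_text(strip=True) for td in tr.find_all("td")]
        for tr in table.find_all("tr")]
rows = [r for r in rows if r]

df = pd.DataFrame(rows, columns=header)

# Every cell arrives as a string, so numeric columns need converting.
df["value"] = pd.to_numeric(df["value"])
print(df)
```

pandas also offers read_html, which parses HTML tables into DataFrames directly, though it needs a supported HTML parser installed, so whether it is more convenient depends on the environment.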
