pandas cannot read an HTML file with read_html() when the table has many rows?


Ramin Farajpour Cami

Aug 19, 2017, 1:06:37 PM
to PyData
Hi,


I have an HTML file with a table of about 34K rows, and I use this code:

import pandas as pd
df = pd.read_html("/root/index.html")

but it does not work when the table has many rows. The same code works when the HTML table has only a few rows.

How can I resolve this?

Paul Hobson

Aug 19, 2017, 1:50:01 PM
to pyd...@googlegroups.com
Are you sure you don't have some malformed HTML? Without your data or even the error message, it's going to be nearly impossible to provide any real suggestions.

Assuming you have malformed HTML somewhere in your table, I recommend troubleshooting the data by bisection:

In other words, cut your data in half and read both halves separately. Then split whichever half gives you an error. Then split the quarter that gives you an error, read those... yada yada yada.
-Paul
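
A minimal sketch of the bisection idea, not something from the thread itself. It assumes the table has already been split into a list of row strings, and that any chunk containing a bad row fails to parse. The names find_bad_row, chunk_ok and pandas_can_read are hypothetical helpers made up for illustration. The pandas check only catches inputs that raise an exception; HTML parsers are often lenient, so a malformed row may instead show up as wrong output, and the check would then need to be adapted.

```python
from io import StringIO


def find_bad_row(rows, chunk_ok):
    """Bisect `rows` to find one row that makes `chunk_ok` return False.

    Assumes a chunk fails exactly when it contains a bad row.
    Returns the index of one bad row, or None if the whole list passes.
    """
    if not rows or chunk_ok(rows):
        return None
    lo, hi = 0, len(rows)  # invariant: rows[lo:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if not chunk_ok(rows[lo:mid]):
            hi = mid  # the first half fails, keep narrowing it
        else:
            lo = mid  # the first half is fine, so the problem is in the second
    return lo


def pandas_can_read(rows):
    """Hypothetical check: wrap the rows in a <table> and try pd.read_html."""
    import pandas as pd  # imported lazily so find_bad_row has no dependencies

    html = "<table>" + "".join(rows) + "</table>"
    try:
        pd.read_html(StringIO(html))
    except Exception:
        return False
    return True
```

With the rows of the table in a list called rows, find_bad_row(rows, pandas_can_read) would return the index of one offending row, which can then be inspected by hand. Each step halves the search range, so even 34K rows need only about 16 rounds of parsing.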

