import re
import urllib2
#to retrieve the contents of the page
page = urllib2.urlopen("http://example.com/page.html").read().strip()
#to create the tables list
tables=[[re.findall('<TD>(.*?)</TD>',r,re.S) for r in re.findall('<TR>(.*?)</TR>',t,re.S)] for t in re.findall('<TABLE>(.*?)</TABLE>',page,re.S)]
Pretty simple. Good luck!
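
A minimal sketch of the same idea on an inline string, so it can be tried without fetching anything. The sample HTML and the name extract_tables are made up for illustration. The patterns are loosened to ignore case and to allow attributes inside the tags, which the one-liner above does not handle. It still skips <TH> header cells and will misbehave on nested tables, so treat it as a quick hack rather than a real parser.

```python
import re

def extract_tables(html):
    """Return a list of tables; each table is a list of rows,
    and each row is a list of cell strings."""
    flags = re.S | re.I  # '.' matches newlines; tag names match in any case
    tables = []
    for t in re.findall(r'<table\b[^>]*>(.*?)</table>', html, flags):
        rows = []
        for r in re.findall(r'<tr\b[^>]*>(.*?)</tr>', t, flags):
            cells = re.findall(r'<td\b[^>]*>(.*?)</td>', r, flags)
            rows.append([c.strip() for c in cells])
        tables.append(rows)
    return tables

# Made-up sample page, mixing upper- and lower-case tags and an attribute
sample = """
<TABLE border="1">
<TR><TD>a</TD><TD>1</TD></TR>
<tr><td class="x">b</td><td>2</td></tr>
</TABLE>
"""

tables = extract_tables(sample)
print(tables)  # [[['a', '1'], ['b', '2']]]
```

tables[0] is the first table on the page, tables[0][1] its second row, and tables[0][1][0] the first cell of that row.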
----------------------------------------
> Date: Fri, 24 May 2013 10:32:26 -0700
> Subject: Total Beginner - Extracting Data from a Database Online (Screenshot)
> From: logan.c...@gmail.com
> To: pytho...@python.org
For example:
# to retrieve the contents of the whole '# fb' column (11th column in the image you sent)
c11 = [tables[0][r][10] for r in range(len(tables[0]))]
# tables[0][r][10]  is the content of the 11th cell of row 'r'
# len(tables[0])    is the quantity of rows in tables[0]
On 28 May 2013 02:21, "Carlos Nepomuceno" <carlosne...@outlook.com> wrote:
>
> ----------------------------------------
> > Date: Mon, 27 May 2013 17:58:00 -0700
> > Subject: Re: Total Beginner - Extracting Data from a Database Online (Screenshot)
> > From: logan.c...@gmail.com
> > To: pytho...@python.org
> [...]
> >
> > Oh goodness, yes, I have no clue.
>
> For example:
>
> # to retrieve the contents of all column '# fb' (11th column from the image you sent)
>
> c11 = [tables[0][r][10] for r in range(len(tables[0]))]
Or rather:
c11 = [row[10] for row in tables[0]]
In most cases, range(len(x)) is a sign that you're doing it wrong :)
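
A small sketch of the difference, using made-up data in place of the scraped tables. When the row number is actually needed alongside the row, enumerate() supplies it without going through range(len(...)).

```python
# Made-up stand-in for the scraped data: one table, three rows, two columns
tables = [[["a", "1"], ["b", "2"], ["c", "3"]]]

# Iterate over the rows directly to pull out the second column
c2 = [row[1] for row in tables[0]]
print(c2)  # ['1', '2', '3']

# If the index is needed as well, enumerate() yields (index, row) pairs
labelled = ["%d: %s" % (i, row[1]) for i, row in enumerate(tables[0])]
print(labelled)  # ['0: 1', '1: 2', '2: 3']
```

Both forms raise IndexError if a row is shorter than expected, so ragged tables need a length check first.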