I have now come to another tag, <li>. Within this tag, I want to "hit" the Next button so that it loads the next 100 recruits and then repeats the loop. Here is the sample HTML code:
<li><form method="post" action="recruit-search-results"><button type="submit" name="start" value="100">Next »<span class="value">100</span></button><input type="hidden" name="sport" value="football">
<input type="hidden" name="year" value="2011">
<input type="hidden" name="committed" value="1">
<input type="hidden" name="uncommitted" value="1">
<input type="hidden" name="loc" value="City, State or Zip Code">
<input type="hidden" name="hsprospects" value="1">
<input type="hidden" name="prepprospects" value="1">
<input type="hidden" name="jucoprospects" value="1">
<input type="hidden" name="start" value="0"></form></li>
I'm not really sure which tag I need to target in order to go to the "Next" page, i.e. the next 100 recruits. Typically you'd target an <a> tag and its href. Also, should I put the "row loop" into its own function so that it keeps repeating until there are no more "Next" pages?
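Since the sample has no <a>/href, the Next button submits its enclosing form: a POST to the form's action with the hidden inputs as fields. Here is a minimal sketch of reading those fields, using only the standard library so it runs on its own. The names NextFormParser and next_page_post are made up for illustration. It assumes the HTML passed in contains just this one form, and it assumes the button's start=100 should win over the hidden start=0, which I have not verified against the site.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2


class NextFormParser(HTMLParser):
    """Collect the action, hidden inputs and submit button of a form."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.action = None
        self.hidden = {}
        self.button = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("type") == "hidden":
            self.hidden[attrs.get("name")] = attrs.get("value", "")
        elif tag == "button" and attrs.get("type") == "submit":
            self.button[attrs.get("name")] = attrs.get("value", "")


def next_page_post(html):
    """Return (action, fields) for the Next form, or None if there is no button."""
    parser = NextFormParser()
    parser.feed(html)
    if not parser.button:
        return None
    fields = dict(parser.hidden)
    # Assumption: the submit button's value replaces the hidden field
    # of the same name (start=100 instead of start=0).
    fields.update(parser.button)
    return parser.action, fields
```

The returned dict could be urlencoded and sent as the POST data of a request to the action URL, the same way a dict of values is normally posted.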
Below is a snippet of the code that I have been working on. I highlighted my idea for "hitting" the Next button.
import urllib
import urllib2
from bs4 import BeautifulSoup

user_agent = 'Mozilla/5 (Solaris 10) Gecko'
headers = {'User-Agent': user_agent}
year = raw_input("Input recruiting year: ")
values = {'s': year}
data = urllib.urlencode(values)
request = urllib2.Request("http://rivals.yahoo.com/ncaa/football/recruiting/recruit-search", data, headers)
page = urllib2.urlopen(request)
soup = BeautifulSoup(page)
evens = soup.find_all('tr', 'even')
odds = soup.find_all('tr', 'odd')
rows = evens + odds
for row in rows:
    tdlist = row.find_all('td')
    data1 = tdlist[0].string
    data2 = row.find('th').a.string
    data3 = tdlist[1].contents[0].string
    data4 = tdlist[1].contents[1].string
    data5 = tdlist[2].string
    data6 = tdlist[3].string
    data7 = tdlist[4].string
    if tdlist[5].span is not None:
        data8 = tdlist[5].span.string
    else:
        data8 = ""
    data9 = tdlist[6].string
    data10 = tdlist[7].string
    data11 = tdlist[8].a.string
    print '%s %s %s %s %s %s %s %s %s %s %s' % (data1, data2, data3, data4, data5, data6, data7, data8, data9, data10, data11)
## After creating outfile with all the data on the page, go to next page then repeat loop
next = { value : 'Next' }
for i in next:
    if next == 'Next':
        continue
    else:
        break
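On the question of wrapping the row loop in a function: one possible shape is a loop that fetches a page, runs the row code on it, and stops when the page has no Next button. This is only a sketch under assumptions. scrape_all_pages and its three callback parameters are hypothetical names, the step of 100 is taken from the button's value in the sample HTML, and the fake_* helpers stand in for the real site so the example runs without a network connection.

```python
def scrape_all_pages(fetch_page, parse_rows, has_next, step=100, max_pages=1000):
    """Fetch start=0, step, 2*step, ... until a page has no Next button."""
    rows = []
    start = 0
    for _ in range(max_pages):          # safety cap against an endless loop
        page = fetch_page(start)        # e.g. POST the form with this start value
        rows.extend(parse_rows(page))   # e.g. the existing "for row in rows" code
        if not has_next(page):          # e.g. no Next button found on the page
            break
        start += step
    return rows


# Stand-in data: three "pages" keyed by their start offset.
FAKE_PAGES = {0: ["a", "b"], 100: ["c", "d"], 200: ["e"]}


def fake_fetch(start):
    return {"start": start, "rows": FAKE_PAGES[start]}


def fake_rows(page):
    return page["rows"]


def fake_has_next(page):
    return page["start"] + 100 in FAKE_PAGES


all_rows = scrape_all_pages(fake_fetch, fake_rows, fake_has_next)
```

In the real script, fetch_page would presumably build the request with the new start value and return the soup, and parse_rows would return the values that are currently printed.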
On Wednesday, July 18, 2012 12:06:31 PM UTC-4, LunkRat wrote: