Hello,

I am working with a basic Beautiful Soup implementation. This is what my HTML file looks like:

<html><head><script>...</script></head><body>...<center><script><table><table></table></table><table>...</table>
<table border="1px"><tbody><tr><td><b>Date</b></td><td><b>Email</b></td><td><b>Opened</b></td><td><b>Link 1</b></td><td><b>Link 2</b></td><td><b>Link 3</b></td><td><b>MES</b></td><td><b>Evals</b></td><td><b>Pricing</b></td><td><b>Newsletter</b></td><td><b>DVD Updates</b></td><td><b>Options</b></td></tr>
<tr> <td>2012-07-11</td> <td>21...@msn.com</td> <td><center><img src="./Campaign Manager 184_files/opened.png"></center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center><a name="0"></a><a href="/campaignTracking/viewCampaign.php?campaignID=184#0" onclick="showEmail('184', '216...@msn.com')">Details</a> - <a target="_NEW" href="/campaignTracking/viewEmailFull.php?email=21...@msn.com">Full Details</a></center></td> </tr>
<tr> <td>2012-07-11</td> <td>a....@tiscali.it</td> <td><center><img src="./ Campaign Manager 184_files/ opened.png"></center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center> </center></td> <td><center><a name="1"></a><a href="/campaignTracking/viewCampaign.php?campaignID=184#1" onclick="showEmail('184', 'a.c...@tiscali.it')">Details</a> - <a target="_NEW" href="/campaignTracking/viewEmailFull.php?email=a.c...@tiscali.it">Full Details</a></center></td> </tr>

...and many more <tr>'s (a big list). Basically, it is a table of email addresses and whether they were opened.

What I want to do: count the number of emails that were opened for each date in the document.

My method:

loop through the <td> elements, searching for an email address
if found, then
    note the date, set counter to 0
    if the next <td> contains the "opened" <img> tag, then counter++
else
    continue looping
My script:

    #!/usr/bin/env python
    from bs4 import BeautifulSoup
    import re

    page = open('./184.htm')
    soup = BeautifulSoup(page)
    # print soup.prettify()
    # for string in soup.stripped_strings:
    #     print(repr(string))

    srchtxt = re.compile(r'@', re.IGNORECASE)
    xdate = soup.find_all("td", text=srchtxt)
    print "\n", xdate[0], "\n", xdate[1]
    # print "\n", xdate[0].next  # --- gives a "max recursion depth exceeded" error

    # works: print soup.table.tr
    # works:
    # for atsym in soup.findAll("td", text=srchtxt):
    #     # print tr.text works
    #     print atsym

    # for eml in xdate:  # does not work
    #     print xdate.previous_sibling, xdate.next_sibling
    #     print xdate.td.previous_sibling
    #     print xdate.td.next_sibling

This script produces the output shown below.

As you can see, I tried several things, but no go! I am a Python newbie! I just do not know how to read or access the content of the previous/next <td> element (using Beautiful Soup) once I have found an email address. Can you please help? Thank you very much.

Output:

    <td>21...@msn.com</td>
    <td>a.c...@tiscali.it</td>
    Traceback (most recent call last):
      File "I:\Python27\readhtml\test.py", line 17, in <module>
        print "\n", xdate[0].next, "\n", xdate[1], "\n", xdate[2]
      File "I:\Python27\lib\idlelib\rpc.py", line 595, in __call__
        value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
      File "I:\Python27\lib\idlelib\rpc.py", line 210, in remotecall
        seq = self.asynccall(oid, methodname, args, kwargs)
      File "I:\Python27\lib\idlelib\rpc.py", line 225, in asynccall
        self.putmessage((seq, request))
      File "I:\Python27\lib\idlelib\rpc.py", line 324, in putmessage
        s = pickle.dumps(message)
      File "I:\Python27\lib\copy_reg.py", line 74, in _reduce_ex
        getstate = self.__getstate__
    RuntimeError: maximum recursion depth exceeded
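For reference, here is a rough sketch of the counting I am after, written against a small made-up sample rather than my real file. It assumes Python 3 and bs4, and that the date cell and the "opened" cell sit immediately before and after the email cell, as in the rows above. The names `SAMPLE_HTML` and `count_opened_by_date` and the example.com addresses are invented for illustration. The idea is to use `find_previous_sibling("td")` and `find_next_sibling("td")`, which skip the whitespace text nodes between cells that plain `.previous_sibling` and `.next_sibling` would hand back.

```python
from collections import Counter

from bs4 import BeautifulSoup

# Hypothetical sample mirroring the table layout described above;
# the addresses and the second date are made up for illustration.
SAMPLE_HTML = """
<table border="1px"><tbody>
<tr><td><b>Date</b></td><td><b>Email</b></td><td><b>Opened</b></td></tr>
<tr> <td>2012-07-11</td> <td>one@example.com</td> <td><center><img src="opened.png"></center></td> </tr>
<tr> <td>2012-07-11</td> <td>two@example.com</td> <td><center> </center></td> </tr>
<tr> <td>2012-07-12</td> <td>three@example.com</td> <td><center><img src="opened.png"></center></td> </tr>
</tbody></table>
"""


def count_opened_by_date(html):
    """Return a Counter mapping each date to its number of opened emails."""
    soup = BeautifulSoup(html, "html.parser")
    opened = Counter()
    for td in soup.find_all("td"):
        # Assumption: any cell whose visible text contains "@" is an email cell.
        if "@" not in td.get_text():
            continue
        # Tag-filtered sibling lookups skip the whitespace between cells.
        date_td = td.find_previous_sibling("td")
        opened_td = td.find_next_sibling("td")
        if date_td is None or opened_td is None:
            continue
        # Count the row only if the next cell holds the "opened" image.
        img = opened_td.find("img")
        if img is not None and "opened" in img.get("src", ""):
            opened[date_td.get_text(strip=True)] += 1
    return opened


counts = count_opened_by_date(SAMPLE_HTML)
for date, n in sorted(counts.items()):
    print(date, n)
```

On the sample this counts one opened email for each of the two dates. I have not tried it on the full page, where the nested tables might need the search narrowed to the right <table> first.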