"At this point you effectively have two parse trees: one rooted at the BeautifulSoup object you used to parse the document, and one rooted at the tag that was extracted."
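The quoted behaviour can be seen in a small sketch. The HTML snippet and the variable names below are made up for illustration, and the stdlib `html.parser` backend is used only to keep the example self-contained (the profiles in this post were taken with lxml):

```python
from bs4 import BeautifulSoup

# Hypothetical document, only for illustration.
html = "<div><p id='a'>first <b>bold</b></p><p id='b'>second</p></div>"
soup = BeautifulSoup(html, "html.parser")

# extract() detaches the tag from the soup and returns it.
first = soup.find("p", id="a").extract()

# The extracted tag is now the root of its own tree ...
detached = first.parent is None

# ... and the original tree no longer contains it.
remaining_ids = [p["id"] for p in soup.find_all("p")]

# Searching the extracted tag sees only its own descendants.
bold_in_first = first.find_all("b")
bold_in_soup = soup.find_all("b")

print(detached, remaining_ids, len(bold_in_first), len(bold_in_soup))
```

After the call to `extract()`, searches on `soup` and searches on `first` walk two separate trees.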
PROCESS TAGS INSIDE SOUP
Tue Sep 6 16:48:26 2016 tagSent
2138697 function calls (2106434 primitive calls) in 1.677 seconds
Ordered by: internal time
List reduced from 144 to 10 due to restriction <10>
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
15641/223    0.139    0.000    0.433    0.002 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:1065(decode)
      514    0.121    0.000    0.980    0.002 {method 'feed' of 'lxml.etree._FeedParser' objects}
    18093    0.105    0.000    0.524    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/builder/_lxml.py:136(start)
    54793    0.088    0.000    0.312    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/__init__.py:287(endData)
    83496    0.088    0.000    0.090    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:191(setup)
    19640    0.069    0.000    0.176    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:783(__init__)
15641/223    0.069    0.000    0.426    0.002 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:1164(decode_contents)
    32956    0.062    0.000    0.062    0.000 {built-in method __new__ of type object at 0x100188900}
   315646    0.060    0.000    0.071    0.000 {isinstance}
    57801    0.053    0.000    0.055    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
EXTRACT TAGS FROM SOUP, THEN PROCESS
Tue Sep 6 16:49:59 2016 tagSent
178885528 function calls (171646797 primitive calls) in 90.007 seconds
Ordered by: internal time
List reduced from 125 to 10 due to restriction <10>
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 14425177   11.030    0.000   15.582    0.000 {hasattr}
 58030351   10.841    0.000   26.147    0.000 {isinstance}
 10812650    9.615    0.000   75.500    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:1639(search)
  7211540    9.528    0.000   15.306    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/abc.py:128(__instancecheck__)
     2279    6.745    0.003   88.108    0.039 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:506(_find_all)
  3605513    6.433    0.000   32.307    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:1665(_matches)
  3605513    6.096    0.000   49.311    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:1598(search_tag)
  3610071    5.019    0.000   14.571    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:1562(_normalize_search_value)
 14422566    4.877    0.000    4.877    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/_weakrefset.py:70(__contains__)
  7207137    4.552    0.000    4.552    0.000 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py:704(__getattr__)
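For anyone wanting to reproduce this kind of report, here is a minimal sketch using `cProfile` and `pstats` from the standard library. The `work` function is a hypothetical stand-in for the real tag-processing code, which is not shown in this post, and the sketch is written for Python 3 rather than the Python 2.7 used for the profiles above. Sorting by `"tottime"` and restricting the output to 10 rows should give the same "Ordered by: internal time" layout:

```python
import cProfile
import io
import pstats


def work():
    # Hypothetical stand-in for the real tag-processing code.
    return sum(len(str(i)) for i in range(10000))


# Profile only the call of interest.
profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Sort by internal time (tottime) and keep at most the top 10 rows.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("tottime").print_stats(10)
report = buf.getvalue()
print(report)
```

Comparing the `ncalls` column between two such reports, as in the two profiles above (2279 calls to `_find_all` accounting for 88.108 of the 90.007 seconds in the second one), is usually the quickest way to see where the extra time goes.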