findall prematurely closes the group in svg text (regression?)


Cee Bee Lee

Apr 8, 2015, 9:37:26 AM
to beauti...@googlegroups.com


Hello. My code stopped working after an update at some point. Could this be a case of my parser "backend" changing?

The following code

from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (Python 2)

# Read the SVG source
with open('test.svg', 'r') as f:
    svgtext = f.read()
# Load into Beautiful Soup
soup = BeautifulSoup(svgtext, selfClosingTags=['defs', 'sodipodi:namedview'])
# Find geographic groups
paths = soup.findAll('g')
print paths[3]


returns a piece of text that does not exist in the source SVG (attached). The entire path (the actual country's borders) that should be inside this group (<g ...> ... </g>) is now excluded, rendering the output useless.
That is, for me the above returns:

<g class="land mg" id="mg" transform="matrix(1.229834,0,0,1.1888568,-278.10861,-149.0924)" style="1;stroke:#ffffff;stroke-width:0.99986994;stroke-miterlimit:3.97446823;stroke-dasharray:none;stroke-opacity:1">
</g>

whereas it used to return a whole lot of <path> tags after the opening tag shown above and before the </g>.
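
One way to confirm that the paths really are inside the group in the source file is to count them with a different parser. The following is a minimal sketch, assuming the SVG is well-formed XML with the standard SVG namespace declared. It uses the standard library's ElementTree rather than BeautifulSoup, and `count_paths_per_group` and the inline `sample` are hypothetical stand-ins for illustration, not the contents of test.svg.

```python
import xml.etree.ElementTree as ET

# Standard SVG namespace (assumed to be declared on the root element).
SVG_NS = "http://www.w3.org/2000/svg"


def count_paths_per_group(svg_text):
    """Return (group id, number of <path> descendants) for every <g>."""
    root = ET.fromstring(svg_text)
    results = []
    for group in root.iter("{%s}g" % SVG_NS):
        n_paths = len(list(group.iter("{%s}path" % SVG_NS)))
        results.append((group.get("id"), n_paths))
    return results


# Hypothetical stand-in for the attached file; replace with the real text,
# e.g. the result of reading test.svg.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<g class="land mg" id="mg">'
    '<path d="M0,0 L1,1"/><path d="M1,1 L2,2"/>'
    '</g>'
    '</svg>'
)

counts = count_paths_per_group(sample)
print(counts)
```

If a group reports a non-zero count here but prints as an empty <g></g> from BeautifulSoup, that would suggest the paths are being lost during parsing rather than missing from the file.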


(I'm on Ubuntu 15.04 with BeautifulSoup 3.2.1)

BeautifulSoup.__version__
Out[2]: '3.2.1'

I have html5lib and lxml installed.

Attachment: test.svg