Strange error


vt.ak...@gmail.com

Jan 12, 2010, 3:28:27 AM
to beautifulsoup
Hi All!
While trying to parse a simple XML string, I got an error:

>>> from BeautifulSoup import BeautifulStoneSoup
>>> xml = '<a> <b> asdf </b> </a>'
>>> bs = BeautifulStoneSoup(xml)
>>> bs.contents[0].contents[0].name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python26\Lib\site-packages\BeautifulSoup.py", line 447, in __getattr__
    raise AttributeError, "'%s' object has no attribute '%s'" % (self.__class__.__name__, attr)
AttributeError: 'NavigableString' object has no attribute 'name'

But for this slightly modified variant:
xml = '<a><b> asdf </b> </a>'
(the space between the first two tags was deleted)
there are no errors at all...

Aaron DeVore

Jan 12, 2010, 6:17:59 PM
to beauti...@googlegroups.com
I don't have Beautiful Soup right next to me to do testing, so I'll
just guess. It looks like Beautiful Soup gives this tree when you
don't use spaces.

soup
  a
    b
      "asdf"

a.contents[0] goes to <b> and all is well.

With the spaces, you get this instead.

soup
  a
    " "
    b
      "asdf"
    " "

When you do a.contents[0], you get the whitespace-only string instead
of <b>. NavigableString has no 'name' attribute, hence the error. The
first fix I can think of is to use something like a.find(True) to get
the first tag instead of the first child.

Cheers!
Aaron DeVore


vt.ak...@gmail.com

Jan 13, 2010, 3:13:30 AM
to beautifulsoup
Thank you, Aaron, for answering my newbie question!