where is this .text() method coming from?


Chris Papademetrious

Jan 13, 2025, 12:20:17 PM
to beautifulsoup
Hello fellow soupers!

My code was erroneously using .text on NavigableString objects. For example:

import bs4
ns = bs4.BeautifulSoup("<p>TEXT</p>", "lxml").find(string=True)
print(f"{ns.text=}")
# ns.text='TEXT'

I did not notice this bug until I subclassed NavigableString as MyNavigableString, which somehow caused .text to return an empty string instead of the string's contents:

import bs4

class MyNavigableString(bs4.NavigableString):
    pass

class MyBeautifulSoup(bs4.BeautifulSoup):
    def __init__(self, *args, **kwargs):
        kwargs = {
            **kwargs,
            "element_classes": {
                bs4.NavigableString: MyNavigableString,
            },
        }
        super().__init__(*args, **kwargs)

ns = MyBeautifulSoup("<p>TEXT</p>", "lxml").find(string=True)
print(f"{ns.text=}")
# ns.text=''

I have two questions:
  1. Where is this mysterious .text() method coming from?
  2. Why would subclassing NavigableString cause a behavior change?
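
For question 1, one generic way to investigate is to walk the object's method resolution order and report the first class whose own namespace defines the name. The following is a minimal sketch using only the standard library; `find_definer`, `Base`, and `MyString` are hypothetical stand-ins invented for illustration and are not part of bs4.

```python
import inspect


def find_definer(obj, name):
    """Return the first class in type(obj)'s MRO that defines `name` itself."""
    for cls in type(obj).__mro__:
        if name in vars(cls):
            return cls
    return None


# Hypothetical stand-ins (NOT bs4 classes): a str subclass that picks up
# a `text` property from a second base class.
class Base:
    @property
    def text(self):
        return str(self)


class MyString(str, Base):
    pass


s = MyString("TEXT")

# Which class supplies `text`?
definer = find_definer(s, "text")

# Is `text` a property (accessed without parentheses) rather than a method?
is_property = isinstance(inspect.getattr_static(s, "text"), property)

print(definer.__name__, is_property)
```

Applied to the example above, `find_definer(ns, "text")` should name whichever class supplies the attribute in the installed Beautiful Soup version, and the `inspect.getattr_static` check should show whether it is a property or a plain method.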
Thanks in advance!

 - Chris

