How to extract a NavigableString if the HTML contains line breaks

Evgeny

Jun 1, 2021, 8:11:15 AM

to beautifulsoup

Hi,

I am trying to extract a NavigableString object from HTML.

The idea is to be able to manipulate it further with string.replace_with("new string").

In the following example everything works fine:

from bs4 import BeautifulSoup

html = """ my string """

soup = BeautifulSoup(html, "html.parser")

soup.p.string.replace_with("New string")

print(soup)

# Output is 'New string'

However, if the HTML itself contains line breaks, no NavigableString is extracted; None is returned instead:

from bs4 import BeautifulSoup

html = """

my string

"""

soup = BeautifulSoup(html, "html.parser")

soup.p.string.replace_with("New string")

print(soup)

# Output is: AttributeError: 'NoneType' object has no attribute 'replace_with'
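A small sketch of what seems to be going on, in case it helps to reproduce the behaviour. The markup is an assumption (a <p> wrapping a <b>; the tag names are only illustrative): .string is only defined when a tag has exactly one child, and the line breaks become extra whitespace-only children.

```python
from bs4 import BeautifulSoup

# Assumed markup: the tag names are illustrative only.
compact = "<p><b>my string</b></p>"
spaced = "<p>\n<b>my string</b>\n</p>"

compact_p = BeautifulSoup(compact, "html.parser").p
spaced_p = BeautifulSoup(spaced, "html.parser").p

# One child: .string recurses into <b> and returns its NavigableString.
print(len(compact_p.contents), repr(compact_p.string))

# Three children (newline, <b>, newline): .string is ambiguous, so it is None.
print(len(spaced_p.contents), repr(spaced_p.string))
```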

So the question is: how can I extract a NavigableString if the HTML contains line breaks? I know that get_text(strip=True) will work, but it returns plain text rather than a NavigableString, so I would not be able to manipulate it further.
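One possible approach, sketched under assumptions: iterate over the tag's .strings, which yields NavigableString objects rather than plain text, and take the first one that is not just whitespace. The markup and the helper name first_non_blank_string are hypothetical, made up for this illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Assumed markup with line breaks; the tag names are illustrative only.
html = "<p>\n<b>my string</b>\n</p>"
soup = BeautifulSoup(html, "html.parser")


def first_non_blank_string(tag):
    """Hypothetical helper: first descendant string that is not only whitespace."""
    for s in tag.strings:
        if s.strip():
            return s
    return None


target = first_non_blank_string(soup.p)

# target is still attached to the tree, so it can be replaced in place.
if isinstance(target, NavigableString):
    target.replace_with("New string")

print(soup)
```

Unlike get_text(strip=True), this keeps a reference into the parse tree, so replace_with still works. It only picks the first non-blank string, so a tag with several text fragments would need a different selection rule.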
