Here are three different approaches to doing what I think should be the same thing:
import bs4
print bs4.__version__
from bs4 import BeautifulSoup
html = """
<html>
<body>
<p class="c1">
<span class="c5">
1 // Values.scala
</span>
</p>
<p class="c1">
<span class="c5">
2
</span>
</p>
<p class="c1">
<span class="c5">
3 val anInteger: Int = 11
</span>
</p>
<p class="c1">
<span class="c5">
4 val aDouble: Double = 1.4
</span>
</p>
<p class="c1">
<span class="c5">
5 // true or false:
</span>
</p>
<p class="c1">
<span class="c5">
6 val trueOrFalse: Boolean = true
</span>
</p>
</body>
</html>
"""
print '1' * 40
soup = BeautifulSoup(html)
for s in soup._all_strings():
    if s.strip():
        s.string.replace_with("Something else")
print(soup.prettify())
print '2' * 40
soup = BeautifulSoup(html)
for s in [s for s in soup._all_strings() if s.strip()]:
    s.string.replace_with("Something else")
print(soup.prettify())
print '3' * 40
soup = BeautifulSoup(html)
for s in reversed([s for s in soup._all_strings() if s.strip()]):
    s.string.replace_with("Something else")
print(soup.prettify())
-------------------------------------------------------------------------------------------
I would expect all three approaches to replace every string, and the list comprehension to produce the same result as the basic for loop. But in the output, the plain for loop replaces only the first string, whereas both list comprehensions (regardless of direction) replace all of them:
4.0.0b6
1111111111111111111111111111111111111111
<html>
<body>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
2
</span>
</p>
<p class="c1">
<span class="c5">
3 val anInteger: Int = 11
</span>
</p>
<p class="c1">
<span class="c5">
4 val aDouble: Double = 1.4
</span>
</p>
<p class="c1">
<span class="c5">
5 // true or false:
</span>
</p>
<p class="c1">
<span class="c5">
6 val trueOrFalse: Boolean = true
</span>
</p>
</body>
</html>
2222222222222222222222222222222222222222
<html>
<body>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
</body>
</html>
3333333333333333333333333333333333333333
<html>
<body>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
<p class="c1">
<span class="c5">
Something else
</span>
</p>
</body>
</html>
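My guess at the cause, sketched without BeautifulSoup: if _all_strings() is a generator that follows links from node to node, and replacing a node clears the old node's links, then the plain for loop loses its place after the first replacement, while a list built up front already holds every node. The Node class and the build, strings, and replace_with helpers below are hypothetical stand-ins, not bs4 code, and I have not confirmed that bs4 works this way internally.

```python
class Node(object):
    """Hypothetical stand-in for a parse-tree node; not a bs4 class."""
    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


def build(texts):
    """Chain nodes behind a sentinel root that is never replaced."""
    root = Node(None)
    prev = root
    for t in texts:
        node = Node(t)
        prev.next, node.prev = node, prev
        prev = node
    return root


def strings(root):
    """Lazily yield each node by following .next, one step per iteration."""
    node = root.next
    while node is not None:
        yield node
        node = node.next


def replace_with(old, text):
    """Splice a new node in place of old, then detach old."""
    new = Node(text)
    new.prev, new.next = old.prev, old.next
    if old.prev is not None:
        old.prev.next = new
    if old.next is not None:
        old.next.prev = new
    # Assumption: the removed node loses its links to the rest of the chain.
    old.prev = old.next = None


# Approach 1: mutate while the generator is still walking the chain.
lazy = build(["1", "2", "3"])
for s in strings(lazy):
    replace_with(s, "Something else")
lazy_result = [n.text for n in strings(lazy)]

# Approach 2: collect every node first, then mutate.
eager = build(["1", "2", "3"])
for s in list(strings(eager)):
    replace_with(s, "Something else")
eager_result = [n.text for n in strings(eager)]

print(lazy_result)
print(eager_result)
```

In this sketch the lazy loop stops after one replacement because the generator resumes from the detached node, whose next link is now None. If bs4 behaves the same way, that would account for the difference between approach 1 and approaches 2 and 3.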
-- Bruce Eckel
www.Reinventing-Business.com
www.MindviewInc.com