Inserting non-breaking spaces (&nbsp;) into generated HTML?


Bruce Eckel

Mar 30, 2015, 7:07:10 PM
to beauti...@googlegroups.com
Is there a way to do this?

I've encountered two problems:

1) Not sure what approach to use -- just trying to insert "&nbsp;" ends up with "&amp;nbsp;", and u'\xa0' seems to get removed when converting from soup to HTML.

2) I think bs4 handles non-breaking spaces specially.

Thanks for any insights.

leonardr

Jun 27, 2015, 10:34:51 AM
to beauti...@googlegroups.com, bruce...@gmail.com

> 1) Not sure what approach to use -- just trying to insert "&nbsp;" ends up with "&amp;nbsp;", and u'\xa0' seems to get removed when converting from soup to HTML.


The correct way to do it is to insert u'\xa0' into the tree, then use the 'html' formatter to turn it into &nbsp; on output. An example:

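from bs4 import BeautifulSoup

# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup("<a></a>", "html.parser")
# Insert the real U+00A0 characters, not the text "&nbsp;".
soup.a.append(soup.new_string(u"foo \N{NO-BREAK SPACE} bar\N{NO-BREAK SPACE}"))
# The 'html' formatter writes U+00A0 as &nbsp; on output.
print(soup.encode(formatter='html'))
print(soup.prettify(formatter='html'))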

If you're using soup.prettify(), your problem may be that it strips a lot of space characters, and non-breaking spaces count as space characters. Entity encoding happens before spaces are stripped, so if you specify the 'html' formatter it won't be a problem: by the time the stripping happens, the character has already become &nbsp;.

Leonard
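
For anyone comparing the two approaches from the original question, here is a minimal sketch that puts them side by side. It assumes Python 3 and the built-in "html.parser" backend; the render() helper is a made-up name for illustration, not part of bs4.

```python
from bs4 import BeautifulSoup


def render(text, formatter="minimal"):
    """Append `text` to an empty <a> tag and return the serialized markup.

    Hypothetical helper for this comparison only; "minimal" is the
    formatter bs4 uses when none is specified.
    """
    soup = BeautifulSoup("<a></a>", "html.parser")
    soup.a.append(text)
    return soup.decode(formatter=formatter)


# Inserting the entity as literal text: the ampersand itself gets escaped.
escaped = render("foo&nbsp;bar")

# Inserting the real character: the default formatter leaves it as U+00A0.
literal = render("foo\N{NO-BREAK SPACE}bar")

# Same character with the 'html' formatter: written out as a named entity.
entity = render("foo\N{NO-BREAK SPACE}bar", formatter="html")

print(repr(escaped))  # '<a>foo&amp;nbsp;bar</a>'
print(repr(literal))  # '<a>foo\xa0bar</a>'
print(repr(entity))   # '<a>foo&nbsp;bar</a>'
```

The second result still contains the non-breaking space; it is just written as the raw character, which looks like an ordinary space in most editors and terminals. That may be part of why it seemed to be removed.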