Preserving newlines


Cecil Westerhof

May 13, 2008, 12:41:54 AM
to TurboGears
I wanted to preserve newlines in the generated webpage. I had a
solution with split in my template. Something like:
#####
<element py:strip="" py:for="line in blogEntry.entryText.split('\n')">
    ${line}<br/>
</element>
#####

I made the following function:
#####
def splitLines(thisText):
    if thisText == None:
        return []
    thisText = thisText.replace('&', '&amp;')
    thisText = thisText.replace('<', '&lt;')
    thisText = thisText.replace('>', '&gt;')
    return '<br/>'.join(thisText.split('\n'))
#####

and now I can put the following in my template:
#####
${XML(tg.splitLines(blogEntry.entryText))}
#####

This is clearer, and I am rid of the <br/> at the end.
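
The difference between the two approaches can be seen in plain Python, outside any template. This is only an illustrative sketch; the sample string and the variable names are made up:

```python
text = "first line\nsecond line"

# Template-style loop: every line is followed by a break, including the last.
looped = "".join(line + "<br/>" for line in text.split("\n"))

# join() puts the separator only between items, so no trailing break appears.
joined = "<br/>".join(text.split("\n"))

print(looped)  # first line<br/>second line<br/>
print(joined)  # first line<br/>second line
```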

--
Cecil Westerhof

Daniel Fetchinson

May 13, 2008, 2:04:19 AM
to turbo...@googlegroups.com
> I wanted to preserve newlines in the generated webpage. I had a
> solution with split in my template. Something like:
> #####
> <element py:strip="" py:for="line in blogEntry.entryText.split('\n')">
>     ${line}<br/>
> </element>
> #####
>
> I made the following function:
> #####
> def splitLines(thisText):
>     if thisText == None:
>         return []
>     thisText = thisText.replace('&', '&amp;')
>     thisText = thisText.replace('<', '&lt;')
>     thisText = thisText.replace('>', '&gt;')
>     return '<br/>'.join(thisText.split('\n'))
> #####

You might want to add &quot; to your list. And if in the majority of
cases thisText doesn't contain the special characters the following
will be better performance-wise:

def splitLines(thisText):
    if thisText == None:
        return []

    if '&' in thisText: thisText = thisText.replace('&', '&amp;')
    if '<' in thisText: thisText = thisText.replace('<', '&lt;')
    if '>' in thisText: thisText = thisText.replace('>', '&gt;')
    if '"' in thisText: thisText = thisText.replace('"', '&quot;')

    return '<br/>'.join(thisText.split('\n'))

And if this part of your code becomes the real performance bottleneck
you might want to do the search/replace using the re module. Take a
look at xml.etree.ElementTree in the stdlib for an example (if you
have python > 2.4).
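
One possible shape of the re-based approach, as a minimal sketch: a single compiled pattern finds every special character and a lookup table supplies the entity, so the text is scanned once instead of once per character. The names `escape_text`, `_ENTITIES` and `_SPECIAL` are made up for this example, and this is not a copy of what ElementTree does internally.

```python
import re

# Map each special character to its entity.
_ENTITIES = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"}
_SPECIAL = re.compile(r'[&<>"]')


def escape_text(text):
    """Escape &, <, > and " in a single pass over the string."""
    return _SPECIAL.sub(lambda m: _ENTITIES[m.group(0)], text)


print(escape_text('a < b & "c"'))  # a &lt; b &amp; &quot;c&quot;
```

Because the substitution happens in one pass, the `&` inside an already inserted entity is never escaped a second time, so the order of the replacements no longer matters.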

> and now I can put the following in my template:
> #####
> ${XML(tg.splitLines(blogEntry.entryText))}
> #####
>
> This is clearer, and I am rid of the <br/> at the end.


Cheers,
Daniel
--
Psss, psss, put it down! - http://www.cafepress.com/putitdown

Christoph Zwerschke

May 13, 2008, 3:56:23 AM
to turbo...@googlegroups.com
Daniel Fetchinson wrote:

> You might want to add &quot; to your list. And if in the majority of
> cases thisText doesn't contain the special characters the following
> will be better performance-wise:
>
> def splitLines(thisText):
>     if thisText == None:
>         return []
>     if '&' in thisText: thisText = thisText.replace('&', '&amp;')
>     if '<' in thisText: thisText = thisText.replace('<', '&lt;')
>     if '>' in thisText: thisText = thisText.replace('>', '&gt;')
>     if '"' in thisText: thisText = thisText.replace('"', '&quot;')
>     return '<br/>'.join(thisText.split('\n'))

There is also a function cgi.escape for doing this in the standard lib.
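
A sketch of what that could look like on a current Python, where `cgi.escape` no longer exists (it was removed in Python 3.8) and `html.escape` does the same job. The function name `split_lines` is made up here, and returning an empty string for `None` is an assumption about what the template wants:

```python
from html import escape  # modern replacement for cgi.escape


def split_lines(text):
    """Escape markup characters, then turn newlines into <br/> tags."""
    if text is None:
        return ""
    # quote=True also escapes double and single quotes.
    return "<br/>".join(escape(text, quote=True).split("\n"))


print(split_lines('1 < 2\n"quoted" & done'))
# 1 &lt; 2<br/>&quot;quoted&quot; &amp; done
```

Escaping has to happen before the join, otherwise the inserted `<br/>` tags would be escaped as well.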

-- Christoph

Cecil Westerhof

May 13, 2008, 5:35:12 AM
to turbo...@googlegroups.com
2008/5/13 Daniel Fetchinson <fetch...@googlemail.com>:

> And if this part of your code becomes the real performance bottleneck
> you might want to do the search/replace using the re module. Take a
> look at xml.etree.ElementTree in the stdlib for an example (if you
> have python > 2.4).

I do not think this will be the bottleneck. Displaying is of course
what is done most, but retrieving the values from the database is more
expensive than the replacements, I think.
But it never hurts to think about performance. So I made it like this:
#####
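import re
import xml.etree.ElementTree

startSpaces = re.compile(' +')


def splitLines(thisText):
    if thisText == None:
        return None
    # _encode_entity is a private ElementTree helper that escapes markup characters
    thisText = xml.etree.ElementTree._encode_entity(thisText)
    lines = thisText.split('\n')
    for i in range(len(lines)):
        # Replace each leading space with &nbsp; so the indentation survives in HTML
        match = startSpaces.match(lines[i])
        if match:
            lines[i] = '&nbsp;' * match.end() + lines[i][match.end():]
    return '<br/>'.join(lines)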
#####

With xml.etree.ElementTree._encode_entity the conversion is done more
efficiently, and it does more at the same time.
I consider the spaces at the start of a line significant. That is why I
added the for loop: it replaces the spaces at the start of a line
with &nbsp;, so the indentation of the line is preserved.

--
Cecil Westerhof

Cecil Westerhof

May 13, 2008, 5:49:20 PM
to turbo...@googlegroups.com
2008/5/13 Cecil Westerhof <cldwes...@gmail.com>:

> #####
> startSpaces = re.compile(' +')
>
> def splitLines(thisText):
>     if thisText == None:
>         return None
>     thisText = xml.etree.ElementTree._encode_entity(thisText)
>     lines = thisText.split('\n')
>     for i in range(len(lines)):
>         match = startSpaces.match(lines[i])
>         if match:
>             lines[i] = '&nbsp;' * match.end() + lines[i][match.end():]
>     return '<br/>'.join(lines)
> #####

    if thisText == None:
        return None

should be changed to

    if thisText == None:
        return ''

Otherwise XML() crashes when it is handed None.
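
Putting the pieces of this thread together, a version that runs on a current Python might look like the sketch below. It makes two assumptions that are not in the original code: `html.escape` is swapped in for the private `_encode_entity` helper (a substitute, not a reimplementation of it), and the names `split_lines` and `LEADING_SPACES` are made up for the example.

```python
import re
from html import escape  # stand-in for the private _encode_entity helper

LEADING_SPACES = re.compile(r" +")


def split_lines(text):
    """Return text as an HTML fragment with line breaks and indentation kept."""
    if text is None:
        return ""  # always a string, so the caller never receives None
    lines = escape(text).split("\n")
    for i, line in enumerate(lines):
        # match() only matches at the start of the line, so inner spaces stay.
        match = LEADING_SPACES.match(line)
        if match:
            lines[i] = "&nbsp;" * match.end() + line[match.end():]
    return "<br/>".join(lines)


print(split_lines("if x < 1:\n    pass"))
# if x &lt; 1:<br/>&nbsp;&nbsp;&nbsp;&nbsp;pass
```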

--
Cecil Westerhof
