Preserving newlines

Cecil Westerhof

May 13, 2008, 12:41:54 AM

to TurboGears

I wanted to preserve newlines in the generated webpage. I had a
solution with split in my template. Something like:
#####
<element py:strip="" py:for="line in blogEntry.entryText.split('\n')">
${line}<br/>
</element>
#####

I made the following function:
#####
def splitLines(thisText):
    if thisText == None:
        return []
    thisText = thisText.replace('&', '&amp;')
    thisText = thisText.replace('<', '&lt;')
    thisText = thisText.replace('>', '&gt;')
    return '<br/>'.join(thisText.split('\n'))
#####

and now I can put the following in my template:
#####
${XML(tg.splitLines(blogEntry.entryText))}
#####

This is more clear and I am rid of the <br/> at the end.

--
Cecil Westerhof

Daniel Fetchinson

May 13, 2008, 2:04:19 AM

to turbo...@googlegroups.com

> I wanted to preserve newlines in the generated webpage. I had a
> solution with split in my template. Something like:
> #####
> <element py:strip="" py:for="line in blogEntry.entryText.split('\n')">
> ${line}<br/>
> </element>
> #####
>
> I made the following function:
> #####
> def splitLines(thisText):
>     if thisText == None:
>         return []
>     thisText = thisText.replace('&', '&amp;')
>     thisText = thisText.replace('<', '&lt;')
>     thisText = thisText.replace('>', '&gt;')
>     return '<br/>'.join(thisText.split('\n'))
> #####

You might want to add " to your list. And if in the majority of
cases thisText doesn't contain the special characters the following
will be better performance-wise:

def splitLines(thisText):
    if thisText == None:
        return []

    if '&' in thisText: thisText = thisText.replace('&', '&amp;')
    if '<' in thisText: thisText = thisText.replace('<', '&lt;')
    if '>' in thisText: thisText = thisText.replace('>', '&gt;')
    if '"' in thisText: thisText = thisText.replace('"', '&quot;')

    return '<br/>'.join(thisText.split('\n'))

And if this part of your code becomes the real performance bottleneck
you might want to do the search/replace using the re module. Take a
look at xml.etree.ElementTree in the stdlib for an example (if you
have python > 2.4).
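
As a rough illustration of the re-based approach, here is a minimal sketch that does all replacements in one pass over the text. It is not ElementTree's actual code; the names ESCAPES, ESCAPE_RE and escape_with_re are made up for this example.

```python
import re

# Hypothetical lookup table: each special character and its entity.
ESCAPES = {'&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;'}

# One character class matching every key of the table.
ESCAPE_RE = re.compile('[&<>"]')

def escape_with_re(text):
    # A single scan of the text; each match is replaced via the table,
    # so an already inserted '&amp;' is never escaped a second time.
    return ESCAPE_RE.sub(lambda m: ESCAPES[m.group(0)], text)
```

Because the substitution happens in one pass, the order of the replacements no longer matters, unlike the chained replace() calls where '&' has to come first.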

> and now I can put the following in my template:
> #####
> ${XML(tg.splitLines(blogEntry.entryText))}
> #####
>
> This is more clear and I am rid of the <br/> at the end.

Cheers,
Daniel
--
Psss, psss, put it down! - http://www.cafepress.com/putitdown

Christoph Zwerschke

May 13, 2008, 3:56:23 AM

to turbo...@googlegroups.com

Daniel Fetchinson wrote:

> You might want to add " to your list. And if in the majority of
> cases thisText doesn't contain the special characters the following
> will be better performance-wise:
>
> def splitLines(thisText):
>     if thisText == None:
>         return []
>     if '&' in thisText: thisText = thisText.replace('&', '&amp;')
>     if '<' in thisText: thisText = thisText.replace('<', '&lt;')
>     if '>' in thisText: thisText = thisText.replace('>', '&gt;')
>     if '"' in thisText: thisText = thisText.replace('"', '&quot;')
>     return '<br/>'.join(thisText.split('\n'))

There is also a function cgi.escape for doing this in the standard lib.
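
For readers on a current Python: cgi.escape was deprecated and later removed (in Python 3.8), and html.escape is its replacement in the standard library. A minimal sketch of the same idea with html.escape follows; the function name split_lines is only an illustrative stand-in for splitLines.

```python
import html

def split_lines(text):
    # Hypothetical helper mirroring splitLines from this thread.
    if text is None:
        return ''
    # html.escape replaces &, < and >; with the default quote=True
    # it also replaces double and single quotes.
    escaped = html.escape(text)
    # Turn every newline into an explicit line break tag.
    return '<br/>'.join(escaped.split('\n'))
```

Note that cgi.escape left quotes alone unless called with quote=True, whereas html.escape escapes them by default.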

-- Christoph

Cecil Westerhof

May 13, 2008, 5:35:12 AM

to turbo...@googlegroups.com

2008/5/13 Daniel Fetchinson <fetch...@googlemail.com>:

> And if this part of your code becomes the real performance bottleneck
> you might want to do the search/replace using the re module. Take a
> look at xml.etree.ElementTree in the stdlib for an example (if you
> have python > 2.4).

I do not think this will be the bottleneck. Displaying is of course
what is done most, but retrieving the values from the database is more
expensive than the replacements, I think.
But it never hurts to think about performance. So I made it like:
#####
import re
import xml.etree.ElementTree

startSpaces = re.compile(' +')

def splitLines(thisText):
    if thisText == None:
        return None
    thisText = xml.etree.ElementTree._encode_entity(thisText)
    lines = thisText.split('\n')
    for i in range(len(lines)):
        match = startSpaces.match(lines[i])
        if match:
            lines[i] = '&nbsp;' * match.end() + lines[i][match.end():]
    return '<br/>'.join(lines)
#####

With xml.etree.ElementTree._encode_entity the conversion is done more
efficiently, and at the same time it does more.
I find the spaces at the start of a line significant. That is why I
added the for-loop. In this way the spaces at the start of a line are
replaced with &nbsp; and the indentation of the line is preserved.
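
Since _encode_entity is a private (underscore-prefixed) helper, it is not a stable API and may not exist in other Python versions. A hedged sketch of the indentation handling that relies only on public functions could look like this; html.escape is swapped in for the escaping step, and the names LEADING_SPACES and split_lines_keep_indent are invented for the example.

```python
import html
import re

# Same pattern as startSpaces: a run of one or more spaces.
LEADING_SPACES = re.compile(' +')

def split_lines_keep_indent(text):
    if text is None:
        return ''
    lines = html.escape(text).split('\n')
    for i, line in enumerate(lines):
        # match() only matches at the start of the line,
        # so spaces further inside the line are left alone.
        match = LEADING_SPACES.match(line)
        if match:
            lines[i] = '&nbsp;' * match.end() + line[match.end():]
    return '<br/>'.join(lines)
```

Only leading spaces are converted here; tabs and runs of spaces inside a line are left untouched, which matches the behaviour described above.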

--
Cecil Westerhof

Cecil Westerhof

May 13, 2008, 5:49:20 PM

to turbo...@googlegroups.com

2008/5/13 Cecil Westerhof <cldwes...@gmail.com>:

> #####
> startSpaces = re.compile(' +')
>
> def splitLines(thisText):
>     if thisText == None:
>         return None
>     thisText = xml.etree.ElementTree._encode_entity(thisText)
>     lines = thisText.split('\n')
>     for i in range(len(lines)):
>         match = startSpaces.match(lines[i])
>         if match:
>             lines[i] = '&nbsp;' * match.end() + lines[i][match.end():]
>     return '<br/>'.join(lines)
> #####

if thisText == None:
    return None

should be changed to
if thisText == None:
    return ''

Otherwise XML crashes.

--
Cecil Westerhof
