the template system's whitespace handling

Gary Wilson

Aug 2, 2006, 11:05:35 PM

to Django developers

Malcolm Tredinnick wrote:
> On Tue, 2006-08-01 at 22:53 -0700, Gary Wilson wrote:
> > I never really liked how the templating system leaves all those
> > newlines. This middleware is cool, but it would really be nice if the
> > templating system could collapse the lines that only contain one or
> > more evaluating-to-nothing template tags.
> >
> > Thoughts on a possible implementation:
> >
> > What if an endofline token were added? Lexer.tokenize() creates the
> > tokens and Parser.parse() creates the nodelist like normal. Each node
> > in the nodelist is rendered, except for endofline nodes. Pass through
> > nodelist again, removing whitespace-strings and empty-strings if those
> > whitespace-strings and empty-strings are all that exist between two
> > endofline nodes. The endofline nodes following removed
> > whitespace-strings or empty-strings are also removed, while all other
> > endofline nodes get rendered to a newline. Join and return rendered
> > nodelist like normal.
> >
> > Would this work? Is there a better way?
>
> The big question with this sort of thing is always going to be speed.
> Rendering templates is pretty fast at the moment, but it wants to be,
> too. That being said, I haven't implemented or profiled your approach,
> so I have no idea of its real impact, but you are introducing another
> pass over the source text chunks (chunks == the results of an re.split()
> call).
>
> I've been experimenting with a somewhat funky reg-exp change inside the
> template parser that would have the same effect as yours. I'm still
> optimising it (I *knew* there was a reason that part of Friedl's book
> existed) and profiling the results, but it looks possible. Essentially,
> this would have the same effect you are after: a blank line that results
> from just template directives is removed entirely. Any spaces or other
> stuff on the line are left alone, though, so it's a very selective
> reaper.
>
> My motivation here was having to debug an email generation template
> yesterday that was like a train wreck with all the template tags jammed
> together to avoid spurious blank lines. It's going to be a few more days
> before I can work on this seriously, I suspect (there are two more
> urgent Django things I need to finish first, for a start), so you might
> like to experiment along those lines too, if you're keen. I'm not sure I
> like my solution a lot, either, since it makes things a little more
> opaque in the code; still having debates with myself about that.
>
> Regards,
> Malcolm

Breaking this discussion off from Strip Whitespace Middleware.

So, I think that the template system should not leave blank lines
behind if the tag(s) on that line evaluated to the empty string. To
make sure we are all on the same page, here is an example:

This template code:
====================
<h1>My list</h1>

<ul>
{% for item in items %}
<li>{{ item }}</li>
{% endfor %}
</ul>
====================

Would normally evaluate to:
====================
<h1>My list</h1>

<ul>

</ul>
====================

And with the {% spaceless %} tag evaluates to:
====================
<h1>My list</h1> <ul> <li>item 1</li> <li>item 2</li> <li>item 3</li>
</ul>
====================

And with the StripWhitespace Middleware evaluates to:
====================
<h1>My list</h1>
<ul>
<li>item 1</li>
<li>item 2</li>
<li>item 3</li>
</ul>
====================

But what I think it should evaluate to:
====================
<h1>My list</h1>

<ul>
<li>item 1</li>
<li>item 2</li>
<li>item 3</li>
</ul>
====================

In other words, leave lines that I add, but remove the empty lines that
had template tags on them.
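
A minimal sketch of that rule, assuming the output has already been split into lines and each line into (from_tag, text) pieces. The function name and the data layout are made up for illustration; this is not how Django's renderer is structured, and it ignores the fact that a loop body can expand into several lines.

```python
def collapse_rendered_lines(lines):
    """Join rendered lines, dropping those that only held empty template tags.

    `lines` is a list of lines; each line is a list of (from_tag, text)
    pairs, where from_tag is True if the text came from a template tag
    or variable (hypothetical layout, for illustration only).
    """
    kept = []
    for pieces in lines:
        text = ''.join(piece for _, piece in pieces)
        came_from_tag = any(from_tag for from_tag, _ in pieces)
        # Drop the line only if a tag was on it AND nothing visible remains.
        # Blank lines typed by the author have no tag pieces and are kept.
        if came_from_tag and text.strip() == '':
            continue
        kept.append(text)
    return '\n'.join(kept)
```

With the example above, the lines holding {% for %} and {% endfor %} are dropped, while the blank line after the heading survives because no tag was on it.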

Gary Wilson

Malcolm Tredinnick

Aug 3, 2006, 12:41:31 AM

to django-d...@googlegroups.com

On Wed, 2006-08-02 at 20:05 -0700, Gary Wilson wrote:
[...]

> In other words, leave lines that I add, but remove the empty lines that
> had template tags on them.

That was where I was coming from, too.

HTML output I don't care too much about (I mean, it's machine
interpreted, so beauty is a bit of a minor goal). For things like email
and text output, it becomes more relevant (and the spaceless tag is not
as handy in those cases).

Cheers,
Malcolm

Gary Wilson

Aug 3, 2006, 1:21:25 AM

to Django developers

Malcolm Tredinnick wrote:
> That was where I was coming from, too.

Oh yeah, I meant to ask you more about the implementation you have. Is
it functional?

Will McCutchen

Aug 3, 2006, 10:18:03 AM

to Django developers

Gary Wilson wrote:
> In other words, leave lines that I add, but remove the empty lines that
> had template tags on them.

+1

Tim Keating

Aug 3, 2006, 12:54:42 PM

to Django developers

This has been brought up a couple of times. See
http://code.djangoproject.com/ticket/696, which was marked wontfix,
reopened, and marked wontfix again. It's clearly not a concern the core
devs care about.

The solution seems to be to write your own middleware.

TK

James Bennett

Aug 3, 2006, 12:59:18 PM

to django-d...@googlegroups.com

On 8/3/06, Tim Keating <mrt...@gmail.com> wrote:
> This has been brought up a couple of times. See
> http://code.djangoproject.com/ticket/696, which was marked wontfix,
> reopened, and marked wontfix again. It's clearly not a concern the core
> devs care about.

Considering Malcolm is one of the "core devs", I'd be wary of your conclusion ;)

From looking at the ticket, the problem was not the general idea, but
rather the specific syntax proposed for it.

--
"May the forces of evil become confused on the way to your house."
-- George Carlin

je...@jeffcroft.com

Aug 4, 2006, 2:30:41 AM

to Django developers

As someone who spends most of my day job working in Django's template
system, I can say that I would also prefer the same evaluation as Gary
99% of the time. I really think it should be the default.

Tom Tobin

Aug 4, 2006, 11:48:04 AM

to django-d...@googlegroups.com

I have to concur with Jeff; while my day job primarily involves models
and views, the default template behavior vis-a-vis whitespace and
template tags drives me fairly crazy in my personal projects. ^_^ I
understand, theoretically speaking, the potential need for preserving
every last line of whitespace . . . but I'm having a hard time coming
up with a practical use case where Gary's proposed evaluation would
cause a problem.

Ahmad Alhashemi

Aug 4, 2006, 1:58:40 PM

to Django developers

In Rails, template tags can have an extra hyphen at the end to denote
the fact that they should consume the newline right after the tag. So:

{% some_tag %}

Would look like this:

{% some_tag -%}

Isn't it possible to just add this functionality by making the newline
that comes right after the tag part of the regex of the tag?

Something like this:

------------------------
# template syntax constants
[...]
BLOCK_TAG_END_NO_NEWLINE = '-%}\n'
[...]
tag_re = re.compile('(%s.*?(%s|%s)|%s.*?%s)' % \
                    (re.escape(BLOCK_TAG_START),
                     re.escape(BLOCK_TAG_END),
                     re.escape(BLOCK_TAG_END_NO_NEWLINE),
                     re.escape(VARIABLE_TAG_START),
                     re.escape(VARIABLE_TAG_END)))
------------------------

Just one more small change to Lexer.create_token and we are all set up.
This way the newline will be matched with the tag and consumed, and
therefore it will just disappear. No big performance hit sustained.

(Note that I did not test any of this)
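
A self-contained way to try the idea outside Django, as a rough sketch. The constant values and the split_chunks helper are assumptions for illustration, and the inner alternation is written as a non-capturing group here: the lexer works on the results of re.split(), and re.split() returns the text of every capturing group, so a nested capturing group would add the tag terminator as a stray chunk.

```python
import re

# Assumed delimiter values; BLOCK_TAG_END_NO_NEWLINE is the proposed addition.
BLOCK_TAG_START = '{%'
BLOCK_TAG_END = '%}'
BLOCK_TAG_END_NO_NEWLINE = '-%}\n'
VARIABLE_TAG_START = '{{'
VARIABLE_TAG_END = '}}'

# Same shape as the regex above, but with (?:...) for the inner alternation
# so that re.split() only returns whole tags and the text between them.
tag_re = re.compile('(%s.*?(?:%s|%s)|%s.*?%s)' % (
    re.escape(BLOCK_TAG_START),
    re.escape(BLOCK_TAG_END),
    re.escape(BLOCK_TAG_END_NO_NEWLINE),
    re.escape(VARIABLE_TAG_START),
    re.escape(VARIABLE_TAG_END),
))


def split_chunks(source):
    """Hypothetical helper: split a template into tag and text chunks."""
    return [bit for bit in tag_re.split(source) if bit]


chunks = split_chunks('<ul>\n{% for item in items -%}\n<li>{{ item }}</li>\n')
# The newline after "-%}" is part of the tag chunk, not of the next text chunk.
```

A tag ending in a plain %} leaves its newline in the following text chunk, so existing templates would split exactly as before.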

Tom Tobin

Aug 4, 2006, 3:01:57 PM

to django-d...@googlegroups.com

On 8/4/06, Ahmad Alhashemi <ahmad.a...@gmail.com> wrote:
>
> In Rails, template tags can have an extra hyphen at the end to denote
> the fact that they should consume the newline right after the tag. So:
>
> {% some_tag %}
>
> Would look like this:
>
> {% some_tag -%}

I can't help but feel that this syntax is ugly and, furthermore, easy
to miss while scanning through a template. I wouldn't mind having to
set something once in my root-level template (e.g.,
{% chomp %}{% endchomp %}) if it was picked up by child templates
inheriting from it.

Malcolm Tredinnick

Aug 4, 2006, 7:19:28 PM

to django-d...@googlegroups.com

Not targeted at this particular response specifically (although it also
applies here), but more to head off a bunch of "here's a syntax change
we could make" replies...

Slightly tricky syntax changes were one of the problems with the original
ticket that was marked as "wontfix". It's complicating things too much
to add all this extra syntax. Nobody has come up with a good reason not
to consume lines that only consist of empty template tags and leading
whitespace, so we should just do the Right Thing always. Not
configurable, not optional, just do it. (There may be a reason; anybody
who has concerns should feel free to sing out.)

General philosophy (eventually something like this will get written up
and placed in Django's documentation, along with a few other similar
points): It's very easy to go overboard when considering changes and try
to make everything configurable in order to please the mythical
"everybody". That has the effect of increasing the maintenance load (now
there are multiple code paths to maintain), increasing the learning curve
(I'm just starting out with Django; why are there 200 configuration
options? Which ones do I care about?), increasing the documentation load
and length (200 configuration options, again, that have to be kept up to
date) and increasing the amount of knowledge a template designer needs (in
this case; remember that the template language is an *extra* thing a web
page developer needs to learn on top of all their other skills). Now
you've managed to make life harder for four groups of people with one
change designed to make things easier. Possibly not the intended
win-loss ratio.

For the sort of change Gary proposed, which is essentially identical to
what I and no doubt others had arrived at as well, there is no
meaningful semantic or syntactic difference for almost all cases. There
is a way to emulate the old (current) behaviour, so no functionality is
lost. Plus, the new behaviour feels a bit more intuitive to many people,
it appears.

Back to the original problem:

My initial solution that I talked about the other day didn't work in a
few cases (ten minute hacks do that sometimes). I'll get back to this
today and see what we can come up with that is non-invasive, fast and
correct.

Best wishes,
Malcolm

Tim Keating

Aug 11, 2006, 11:39:09 AM

to Django developers

I agree. It seems to me people either care about this or they don't.
Ergo the Django way to do this would be to make the whitespace handler
pluggable and set it globally in settings.py.

TK

Gary Wilson

Aug 13, 2006, 1:49:16 AM

to Django developers

Tim Keating wrote:
> I agree. It seems to me people either care about this or they don't.
> Ergo the Django way to do this would be to make the whitespace handler
> pluggable and set it globally in settings.py.

If performance does not suffer, I'm with Malcolm that it should simply
be done by default with no extra settings.

Gary Wilson

Aug 23, 2006, 2:27:22 AM

to Django developers

I created a ticket:
http://code.djangoproject.com/ticket/2594

I also attached a patch that I have done a little testing with and
seems to work ok. I first attacked this at the Node level, but
realized that might not be the best way because the Nodes get rendered
recursively. In order to clean up the line's whitespace, you would
have to wait for everything to render and then remember which
lines/templatetags/templatevariables the render originated from.

The patch attacks the problem at the token level. It finds the end of
lines and then evaluates each line to determine if whitespace should be
cleaned up. But the patch is not perfect. If you have lines in your
template like:
{% for item in items %}<li>{{ item.name }}</li>{% endfor %}
or
{% if item.name %}<li>{{ item.name }}</li>{% endif %}

where blocks are on the same line as text and the block renders no
output, my patch is fooled and still will insert a blank line in the
rendered string.

If, however, you were to instead write the above examples as:
{% for item in items %}
<li>{{ item.name }}</li>
{% endfor %}
or
{% if item.name %}
<li>{{ item.name }}</li>
{% endif %}

then my patch will clean up the whitespace nicely. This patch may also
very well not work on Windows due to the different line endings.
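
To make the difference between the two layouts concrete, here is a rough sketch that works on the template source with a regular expression. It is not the attached patch, and the pattern and function name are invented for illustration. It assumes "\n" line endings and assumes that every tag on a tag-only line renders to nothing, which the regex itself cannot check.

```python
import re

# Hypothetical pattern: a whole line made only of {% ... %} tags and
# spaces/tabs. (?:(?!%\}).)* stops a single tag from running past its "%}".
TAG_ONLY_LINE = re.compile(
    r'^[ \t]*((?:\{%(?:(?!%\}).)*%\}[ \t]*)+)\n',
    re.MULTILINE,
)


def collapse_tag_only_lines(source):
    """Drop the indentation and newline of lines that hold only block tags."""
    return TAG_ONLY_LINE.sub(lambda m: m.group(1).rstrip(' \t'), source)


own_line = '<ul>\n{% if item.name %}\n<li>{{ item.name }}</li>\n{% endif %}\n</ul>\n'
same_line = '<ul>\n{% if item.name %}<li>{{ item.name }}</li>{% endif %}\n</ul>\n'

collapsed = collapse_tag_only_lines(own_line)
untouched = collapse_tag_only_lines(same_line)
```

When the tags sit on their own lines, their newlines can be removed before rendering. When a block shares a line with text, whether the line ends up empty is only known after rendering, so a source-level pass like this one has to leave it alone.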

There are a few drawbacks, though. It seems that with this patch,
render_to_string becomes 2-3 times slower. Toying with Python's
profile module for a few minutes shows that most of the time was being
spent in Token.__init__ and Node.__init__, which seems logical given all
the extra tokens, and hence nodes, that get created when using my patch.
