Stripping spaces and linebreaks from blocktrans?


Dmitri Fedortchenko

Oct 26, 2007, 11:04:29 AM
to Django developers
I've started using blocktrans in my templates and noticed that when
using make_messages it includes all the tabs and linebreaks around the
text that is surrounded by blocktrans tags.

For example:
{% blocktrans %}
Translate this string
{% plural %}
And this plural string
{% endblocktrans %}

Will turn into:
"\t\t\t\tTranslate this string"
"\t\t"
Plural:
"\t\t"
"\t\t\t\tAnd this plural string"

in the .po file... (it affects both normal blocktrans and plural
blocktrans)

This is rather ugly, I think. Also, in HTML, indentation and linebreaks
are generally ignored, so in that sense Django does not behave
correctly. If you have a string of some length, you might not want to
place it all on the same line, for the beauty of the syntax...

So I put in some preliminary work for stripping the translation
string.
I just realised that my patch only strips the ends of the string which
still leaves the middle full of ugliness if you have a multiline
msgid.
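
To illustrate the limitation just described, here is a small sketch in plain Python. The sample msgid is made up for illustration; the point is only that `str.strip()` touches the two ends of the string and leaves interior indentation alone.

```python
# Hypothetical multi-line msgid, as it might be collected from an
# indented blocktrans block (not taken from a real template).
msgid = "\n\t\tFirst line of the message\n\t\tSecond line of the message\n\t"

stripped = msgid.strip()

# The leading and trailing whitespace is gone, but the tabs in front of
# the second line are still part of the string.
print(repr(stripped))
```

So a patch based only on `.strip()` cleans up single-line msgids but not multi-line ones.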

Has this ever been discussed?
To what extent would the stripping be appropriate? Or perhaps there is
a simpler solution that I am missing?

Index: django/templatetags/i18n.py
===================================================================
--- django/templatetags/i18n.py (revision 6603)
+++ django/templatetags/i18n.py (working copy)
@@ -66,10 +66,12 @@
         for var,val in self.extra_context.items():
             context[var] = val.resolve(context)
         singular = self.render_token_list(self.singular)
+        singular = singular.strip()
         if self.plural and self.countervar and self.counter:
             count = self.counter.resolve(context)
             context[self.countervar] = count
             plural = self.render_token_list(self.plural)
+            plural = plural.strip()
             result = translation.ungettext(singular, plural, count)
         else:
             result = translation.ugettext(singular)
Index: django/utils/translation/trans_real.py
===================================================================
--- django/utils/translation/trans_real.py (revision 6603)
+++ django/utils/translation/trans_real.py (working copy)
@@ -442,13 +442,13 @@
                 pluralmatch = plural_re.match(t.contents)
                 if endbmatch:
                     if inplural:
-                        out.write(' ngettext(%r,%r,count) ' % (''.join(singular), ''.join(plural)))
+                        out.write(' ngettext(%r,%r,count) ' % (''.join(singular).strip(), ''.join(plural).strip()))
                         for part in singular:
                             out.write(blankout(part, 'S'))
                         for part in plural:
                             out.write(blankout(part, 'P'))
                     else:
-                        out.write(' gettext(%r) ' % ''.join(singular))
+                        out.write(' gettext(%r) ' % ''.join(singular).strip())
                         for part in singular:
                             out.write(blankout(part, 'S'))
                     intrans = False
Malcolm Tredinnick

Oct 26, 2007, 7:41:12 PM
to django-d...@googlegroups.com

There are really two things at work here: one not so good and one that I
don't want to change.

The first issue (which is worth fixing) is the leading whitespace and
particularly the bonus blank lines that appear to be introduced. They
are worth stripping.

The second issue is what to do when we fix blocktrans so that it can
correctly handle *blocks* of text (things with newlines). In that case,
I wouldn't want to strip anything beyond a common amount of leading
whitespace because the source layout can be indicative of meaning
sometimes (or at least, useful). PO files don't need to preserve
linebreaks, so the translator can do whatever they want there, but
arbitrarily stripping and merging feels slightly wrong to me.

Whitespace may well be ignored by HTML processors (not entirely true in
all cases, but a reasonable generalisation), but it's not ignored by
humans reading the text. Strip out all whitespace between tags in an
HTML page and notice how completely unreadable it becomes.

So I'd be happy to include something that stripped the same amount of
whitespace from the front of every line inside a blocktrans block but
kept other linebreaks and extra leading whitespace intact. That would
make the PO file more readable. Anything more compressing than that is
going to require stronger arguments.
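
As a rough sketch of that behaviour (not the eventual Django implementation), Python's standard `textwrap.dedent` already removes the whitespace common to every line while keeping linebreaks and any extra relative indentation. The block contents below are invented for illustration.

```python
import textwrap

# Hypothetical contents of a blocktrans block, indented as in a template.
block = (
    "        Dear customer,\n"
    "\n"
    "            your order has shipped.\n"
    "        Thanks!\n"
)

# Removes only the whitespace shared by every non-blank line.
dedented = textwrap.dedent(block)
print(dedented)
```

Note that `textwrap.dedent` treats tabs and spaces as different characters, so lines indented with a mixture of the two may share no common prefix at all.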

Regards,
Malcolm

--
Monday is an awful way to spend 1/7th of your life.
http://www.pointy-stick.com/blog/

doug.na...@gmail.com

Oct 26, 2007, 8:57:09 PM
to Django developers
I would argue against ANY implicit stripping. Some of us use the
template system for generating things like .ics files and other formats
in which every space counts as part of the syntax.

Please don't break the pycon website calendar support ;-)

-Doug



Malcolm Tredinnick

Oct 26, 2007, 9:06:28 PM
to django-d...@googlegroups.com
On Sat, 2007-10-27 at 00:57 +0000, doug.na...@gmail.com wrote:
> I would argue against ANY implicit stripping. some of us use the
> template system for generating things like .ics files and others for
> which every space counts as part of the syntax.
>
> Please don't break the pycon website calendar support ;-)

If leading whitespace is important in the translated string, you're
doomed. There's nothing to require translators to keep any whitespace
intact. If whitespace is significant it should be outside of the text
passed to translators.

We are only talking about stripping whitespace in the strings presented
to translators and it won't change the strings you see in untranslated
text.

Regards,
Malcolm

--
Everything is _not_ based on faith... take my word for it.
http://www.pointy-stick.com/blog/

doug.na...@gmail.com

Oct 27, 2007, 12:31:45 AM
to Django developers
Sorry, I somehow missed that.


Dmitri Fedortchenko

Oct 28, 2007, 5:46:17 PM
to Django developers
That's a great idea! I'll see if I can squeeze out a patch for this,
since I feel that I want to be able to indent blocktrans without
having the extra spaces in the po files (I realize one can remove the
extra spaces manually in the po files, but I like consistent
autogeneration hehe).

In other words, whitespace equal to the indentation of the first line
will be removed from every line, as well as leading and trailing
whitespace and line breaks.
One question that springs to mind: If the first line is on the same
line as the {% blocktrans %} tag itself (i.e. it has no indentation),
should we strip the subsequent lines based on the indentation of the
second line, or assume that the user wants to keep the text as is?

:)
Please advise.


Malcolm Tredinnick

Oct 28, 2007, 8:37:01 PM
to django-d...@googlegroups.com
On Sun, 2007-10-28 at 21:46 +0000, Dmitri Fedortchenko wrote:
> That's a great idea! I'll see if I can squeeze out a patch for this,
> since I feel that I want to be able to indent blocktrans without
> having the extra spaces in the po files (I realize one can remove the
> extra spaces manually in the po files, but I like consistent
> autogeneration hehe).
>
> In other words, whitespace equal to the amount of the indentation of
> first line will be removed, as well as leading and trailing whitespace
> and line breaks.
> One question that springs to mind: If the first line is on the same
> line as the {% blocktrans %}tag itself (i.e. it has no indentation),
> should we strip the subsequent lines based on the indentation of the
> second line, or assume that the user wants to keep the text as is?

Well, firstly, I'd rather you solved the two problems in a slightly
reverse order: first allow linebreaks inside blocktrans, then let's
worry about stripping whitespace. But it's up to you.

My ideal scenario would be that if we have a block of text that needs to
be translated, strip a constant amount of whitespace from the front of
each line. This amount will be the minimum amount of leading whitespace
on any of the lines in that block. So, at the end, one line (at least)
will always be flush against the left margin and all other lines will be
indented relative to that line as they were in the source text.

(My reason for writing that alternative formulation is because I don't
really understand what you were asking about being on the same level as
the blocktrans tag. I'm not sure why the tag is relevant here.)

Come up with a patch that does whatever you like and we can go from
there. I think we're thinking along similar lines, so it will be easy
enough to tweak once we have some concrete code to throw darts at.

Regards,
Malcolm

--
Remember that you are unique. Just like everyone else.
http://www.pointy-stick.com/blog/

Dmitri Fedortchenko

Oct 31, 2007, 1:52:26 PM
to django-d...@googlegroups.com
Ok I got the patch.

Should I open a ticket?
I attached it, but not sure if it will get through to the mailing list.

I also attached a semi-comprehensive battery of tests, showing the outcome of the stripping process.

Basically it strips indentation based on the indentation of the first line of text.
If the first line is empty, it will use the indentation of the second line.

If the first line has no indentation, then nothing other than a .strip() will be done to the string.

If any subsequent lines have less indentation than the first line, then those lines will be de-indented as much as possible.
If any subsequent lines have more indentation than the first line, then only the amount of indentation equal to the first line is stripped.

The only caveat is if mixed tabs and spaces are used for indentation, some strange results may be produced.
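
The attached patch is not reproduced in this thread, so the following is only a minimal sketch of the rules listed above, written from the description rather than from the patch itself. The function name `strip_by_first_line` and the sample text are hypothetical.

```python
def strip_by_first_line(text):
    """De-indent `text` by the indentation of its first line of text.

    Hypothetical helper following the rules described in the message;
    it is not the code from blocktrans_strip.diff.
    """
    lines = text.split("\n")
    # If the first line is empty, take the indentation from the second line.
    if lines and not lines[0].strip():
        lines = lines[1:]
    if not lines:
        return ""
    indent = len(lines[0]) - len(lines[0].lstrip())
    # No indentation on the first line: only strip the ends of the string.
    if indent == 0:
        return text.strip()
    result = []
    for line in lines:
        leading = len(line) - len(line.lstrip())
        # Lines with less indentation lose all of it; lines with more
        # lose only as much as the first line had.
        result.append(line[min(indent, leading):])
    return "\n".join(result).strip()


sample = "\n\t\tTranslate this string\n\t\t\tindented more\n\tindented less\n\t"
print(repr(strip_by_first_line(sample)))
```

The sketch counts whitespace characters without distinguishing tabs from spaces, which is exactly where the mixed-indentation caveat comes from.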

//D

blocktrans_strip.diff
strip_blocktrans_tests.py

Dmitri Fedortchenko

Oct 31, 2007, 1:58:08 PM
to django-d...@googlegroups.com
Now that I re-read your definition, there is one difference between your suggestion and my patch.
My patch just uses the first line to define the amount of whitespace to strip, while you suggest we use the shortest indentation in the whole block.

Which is better?
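
To make the difference concrete, here is a small comparison sketch. `dedent_by_first_line` is a hypothetical stand-in for the first-line rule, and `textwrap.dedent` from the standard library stands in for the shortest-indentation rule; the sample text is invented.

```python
import textwrap


def dedent_by_first_line(text):
    """Hypothetical helper: strip up to the first line's indentation."""
    lines = text.split("\n")
    indent = len(lines[0]) - len(lines[0].lstrip())
    return "\n".join(
        line[min(indent, len(line) - len(line.lstrip())):] for line in lines
    )


sample = (
    "        deeply indented first line\n"
    "    shallower second line\n"
    "            deepest third line"
)

by_first = dedent_by_first_line(sample)  # first-line rule
by_min = textwrap.dedent(sample)         # shortest-indentation rule

print(by_first)
print("---")
print(by_min)
```

The two rules only disagree when some later line is indented less than the first one: the first-line rule then flattens the first two lines to the same level, while the shortest-indentation rule keeps every line's position relative to the others.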

//D


Dmitri Fedortchenko

Oct 31, 2007, 2:17:18 PM
to django-d...@googlegroups.com
I opened a ticket, looking forward to your input and ideas.

http://code.djangoproject.com/ticket/5849

Malcolm Tredinnick

Oct 31, 2007, 10:10:35 PM
to django-d...@googlegroups.com
On Wed, 2007-10-31 at 19:17 +0100, Dmitri Fedortchenko wrote:
> I opened a ticket, looking forward to your input and ideas.
>
> http://code.djangoproject.com/ticket/5849

Thanks. I'll get to it eventually, but it might take a week or so. I'm
pretty busy right at the moment and have a few other Django commitments
to deliver on. At some point I have to spend a few days closing all the
internationalisation bugs, though. I'd like to do that sometime in
November.

Regards,
Malcolm

--
I just got lost in thought. It was unfamiliar territory.
http://www.pointy-stick.com/blog/

Dmitri Fedortchenko

Oct 31, 2007, 10:15:48 PM
to django-d...@googlegroups.com
Sounds good.
I'll try to clean up the patch as much as possible when I get some time too. Right now it's a little rough around the edges.

//D