#4796 - Needs an expert's eye.

Andrew Durdin

Sep 21, 2007, 6:43:33 PM

to Django developers

I was looking into #4796, and submitted a patch that fixes the error
-- and I'm fairly sure that it fixes it in the appropriate spot.
However, I didn't quite grok the runtime interaction between the
__proxy__ objects and the delayed_loader function, so I'd appreciate
it if someone else could cast their eye over this patch.

http://code.djangoproject.com/ticket/4796

Cheers,

Andrew.

Malcolm Tredinnick

Sep 21, 2007, 9:16:32 PM

to django-d...@googlegroups.com

Thanks, Andrew. That patch looks "intuitively" correct; it's in the
right area of code that I would have expected to find the problem, etc.
I'll have a think about it through today to see if there are any
technical problems, but I think you've nailed it. Well done (and many
thanks -- that was a big problem for people who were hitting it). :-)

Regards,
Malcolm

Andrew Durdin

Sep 22, 2007, 7:31:48 AM

to Django developers

On Sep 22, 2:16 am, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:

>
> Thanks, Andrew. That patch looks "intuitively" correct; it's in the
> right area of code that I would have expected to find the problem, etc.
> I'll have a think about it through today to see if there are any
> technical problems, but I think you've nailed it.

Thank goodness for intuition -- that's what led me to the fix, even
though I didn't understand quite why it fixed it at the time. In the
end it's got nothing to do with fcgi or mod_python, as the following
python script displays identical behaviour:

def delayed_loader(*args, **kw):
    def func(*strings):
        return u''.join([unicode(s) for s in strings])
    g = globals()
    g['real_string_concat'] = func
    #return func(*args, **kw) # This is the fix
    return string_concat(*args, **kw)

real_string_concat = delayed_loader
def _string_concat(*strings):
    return real_string_concat(*strings)

class __proxy__(object):
    def __init__(self, func, *args, **kw):
        self.__func = func
        self.__args = args
        self.__kw = kw

    def __unicode__(self):
        return self.__func(*self.__args, **self.__kw)

string_concat = lambda *args, **kw: __proxy__(_string_concat, *args, **kw)

verbose_name_plural = string_concat('thing', 's')
# In the ticket, this line would prevent the error:
# verbose_name_plural.__unicode__()
print unicode(verbose_name_plural)

I'm confident now that the patch is correct, as I now understand why
it works:

string_concat returns a __proxy__ that will call _string_concat when
__unicode__ is called, so we expect _string_concat to return a unicode
instance;

but _string_concat calls delayed_loader(), which returns the result of
string_concat, which is another __proxy__ instance, not a unicode
instance.

However, if we call __unicode__ directly on the __proxy__ instance
(which, unlike unicode(), doesn't care what type is returned) before
calling unicode(), then _string_concat will be calling the true
real_string_concat rather than the delayed loader, so it will return
a unicode instance, as expected.
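
The three steps above can be reproduced in a small Python 3 sketch. It is
only an approximation of the script earlier in this message: __str__ stands
in for __unicode__, and LazyProxy, make_lazy_concat and the state dict are
hypothetical names introduced for illustration, not Django's code.

```python
class LazyProxy:
    """Stand-in for the thread's __proxy__ class (hypothetical name)."""

    def __init__(self, func, *args, **kw):
        self._func = func
        self._args = args
        self._kw = kw

    def __str__(self):
        # Python 3 analogue of __unicode__ in the original script.
        return self._func(*self._args, **self._kw)


def make_lazy_concat(fixed):
    """Build a lazy concat whose delayed loader is either buggy or fixed."""
    state = {}

    def real_concat(*strings):
        return "".join(str(s) for s in strings)

    def delayed_loader(*args, **kw):
        # Swap in the real implementation on first use.
        state["impl"] = real_concat
        if fixed:
            return real_concat(*args, **kw)   # the fix: call the real function
        return lazy_concat(*args, **kw)       # the bug: returns another proxy

    state["impl"] = delayed_loader

    def _concat(*strings):
        return state["impl"](*strings)

    def lazy_concat(*args, **kw):
        return LazyProxy(_concat, *args, **kw)

    return lazy_concat


# Buggy loader: the first str() call gets a proxy back instead of a string.
buggy = make_lazy_concat(fixed=False)
try:
    str(buggy("thing", "s"))
    buggy_error = None
except TypeError as exc:
    buggy_error = exc

# Workaround seen in the ticket: calling the method directly has no type
# check, and its side effect swaps in the real function for later calls.
workaround = make_lazy_concat(fixed=False)
proxy = workaround("thing", "s")
proxy.__str__()
workaround_result = str(proxy)

# Fixed loader: the first call already returns a plain string.
fixed_result = str(make_lazy_concat(fixed=True)("thing", "s"))
```

With the buggy loader the first conversion fails because the proxy's
conversion method hands back another proxy; with the fix, or after the
direct method call, the result is the plain string.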

Nevertheless, I don't understand why string_concat() and friends
should be lazy at all. Care to explain?

Andrew

Malcolm Tredinnick

Sep 22, 2007, 8:53:17 PM

to django-d...@googlegroups.com

On Sat, 2007-09-22 at 04:31 -0700, Andrew Durdin wrote:
[...]

> I'm confident now that the patch is correct, as I now understand why
> it works:
>
> string_concat returns a __proxy__ that will call _string_concat when
> __unicode__ is called, so we expect _string_concat to return a unicode
> instance;
>
> but _string_concat calls delayed_loader(), which returns the result of
> string_concat, which is another __proxy__ instance, not a unicode
> instance.

This is the root cause: ideally, I'd like to have replaced
_string_concat by its real code, not the call to delayed_loader(). The
only reason we need delayed_loader() is because we don't know whether to
use trans_null or trans_real until late in the process.

Previously, I've wondered about pulling string_concat out of trans_real
and into __init__, since it doesn't depend on the USE_I18N setting at
all. That piece of tidying would avoid some of these problems, too, so I
think I'll do that as well.

[...]

> Nevertheless, I don't understand why string_concat() and friends
> should be lazy at all. Care to explain?

string_concat is essentially the lazy version of Python's standard
string.join. We don't want the results -- the pieces being joined -- to
be coerced to unicode or str objects until we are ready to display them,
because what they produce is dependent on the current locale setting.

The non-lazy version would mean that

s = string_concat("Result:", ugettext_lazy("foo"))

evaluated the ugettext_lazy() result immediately and once only, so the
resulting string would be translated in the locale at the time of
execution. In this case type(s) would be unicode.

The lazy version does not end up executing the ugettext_lazy() call
until rendering time (when unicode(s), or equivalent, is called) and so
the locale in effect at that moment participates in the conversion. If
it is later re-rendered in another locale, the new locale will take
effect. For example:

>>> from django.utils.translation import ugettext_lazy, string_concat, activate
>>> s = string_concat("Today's word is: ", ugettext_lazy("five"))
>>> print unicode(s)
Today's word is: five
>>> activate('es')
>>> print unicode(s)
Today's word is: cinco
>>> activate('de')
>>> print unicode(s)
Today's word is: fünf

Without making string_concat() lazy, this effect wouldn't be possible.
Often the real benefits of lazy evaluation are hidden because you don't
see them until you try to use two locales that aren't "en". So even
developing in a non-English language doesn't show the problems until you
use a third language and need things to be re-translated at output time.
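
The contrast between the eager and lazy versions can also be sketched
without Django. Everything here (CATALOG, activate, gettext, Lazy,
eager_concat, lazy_concat) is a hypothetical toy written for illustration
in Python 3, not the django.utils.translation API.

```python
# Toy message catalogue and "current locale" holder (hypothetical).
CATALOG = {"es": {"five": "cinco"}, "de": {"five": "fünf"}}
_current = {"locale": "en"}


def activate(locale):
    _current["locale"] = locale


def gettext(message):
    # Look the message up in the locale active right now.
    return CATALOG.get(_current["locale"], {}).get(message, message)


class Lazy:
    """Defers a function call until the object is converted to str."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args

    def __str__(self):
        return self._func(*self._args)


def gettext_lazy(message):
    return Lazy(gettext, message)


def eager_concat(*parts):
    # Forces every part to a string immediately.
    return "".join(str(p) for p in parts)


def lazy_concat(*parts):
    # Defers the join, and with it the translation of each lazy part.
    return Lazy(eager_concat, *parts)


eager = eager_concat("Today's word is: ", gettext_lazy("five"))
lazy = lazy_concat("Today's word is: ", gettext_lazy("five"))

activate("es")
print(eager)       # translated once, under the locale active at creation
print(str(lazy))   # translated now, under 'es'

activate("de")
print(str(lazy))   # translated again, under 'de'
```

The eager result is frozen in whatever locale was active when it was built,
while the lazy one is re-evaluated on every conversion.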

Don't worry too much if your brain melts when debugging lazy
interactions. Happens to me quite a bit, too. :-)

Hugo's __proxy__ class is a fairly tricky object, but it's worth
suffering a little complexity in there because it makes the end-user
code so much easier. If you've ever done internationalisation of C code,
you'll appreciate the benefit of ugettext_lazy over having to use
gettext_noop and then remember to include an extra gettext() call
everywhere you output something that might be translated. This is one of
those cases where the code maintainers have to work harder for the
benefit of our application writers.

Regards,
Malcolm

Andrew Durdin

Sep 24, 2007, 9:12:43 AM

to django-d...@googlegroups.com, mal...@pointy-stick.com

On 9/23/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
>
> Previously, I've wondered about pulling string_concat out of trans_real
> and into __init__, since it doesn't depend on the USE_I18N setting at
> all. That piece of tidying would avoid some of these problems, too, so I
> think I'll do that as well.

Makes a lot of sense.

> string_concat is essentially the lazy version of Python's standard
> string.join. We don't want the results -- the pieces being joined -- to
> be coerced to unicode or str objects until we are ready to display them,
> because what they produce is dependent on the current locale setting.

Ah, of course. I hadn't considered the possibility of changing
locales while running.