Feature request: making gettext more robust

Gergely Kalmár

Jun 15, 2023, 9:33:07 AM
to Django developers (Contributions to Django itself)
Hello all,

It seems that gettext is currently quite permissive – it falls back to the default language whenever a translation file is missing or if the requested message ID is missing from the translation file. This can lead to errors slipping through easily.

Consider this example from the documentation:

from django.http import HttpResponse
from django.utils.translation import gettext as _

def my_view(request):
    output = _("Welcome to my site.")
    return HttpResponse(output)

Let's also assume there are two languages used in the application: LANGUAGES = [('en', 'English'), ('de', 'German')]

Note that even if you display the view with German as the active language, you will see "Welcome to my site." and will not receive any error or warning, even though the German translation file doesn't exist yet.

Then create a translation catalog file and translate the sentence. Notice that the translated sentence now appears properly. Now change the output line to output = _("Welcome to my updated site."). Notice how the output turns back into English even with German as the active language, and again you don't get any warning or error.

I think it would be great if there was a way to make gettext raise an error when the translation file is missing or when the msgid is missing. To add this feature in a backwards-compatible manner, we could control this behavior through a new settings option. Alternatively, a warning could be emitted instead, which I could then convert into an error, at least during testing. Silently falling back to a different language upon changes is just not great, I think.

Thanks,
Gergely

Tobias Kunze

Jun 15, 2023, 10:15:37 AM
to django-d...@googlegroups.com
On 23-06-15 04:29:59, Gergely Kalmár wrote:
>It seems that gettext is currently quite permissive – it falls back to the
>default language whenever a translation file is missing or if the requested
>message ID is missing from the translation file. This can lead to errors
>slipping through easily.
>
>I think it would be great if there was a way to make gettext raise an error
>when the translation file is missing or when the msgid is missing.

Agreed that this is annoying behaviour, but as far as I can tell, there's not
much that Django can do. IIRC we only wrap Python's gettext module¹.

The relevant method, GNUTranslations.gettext, returns the original message if
no translation has been found, and it does so without indicating that this is
a fallback response².

AIUI this behaviour is rooted in GNU's gettext, which (just like the Python
version) allows you to set a priority list of languages to fall back to³.
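
A minimal sketch of the standard-library behaviour described above, plus one possible way around it. It builds a tiny .mo catalog in memory so that no files are needed. The names build_mo, MissingTranslation and StrictFallback are made up for this illustration; only GNUTranslations, NullTranslations and add_fallback come from Python's gettext module. How such a fallback would be wired into Django's own translation objects is not shown here.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Pack {msgid: msgstr} into a minimal in-memory GNU .mo file (ASCII only)."""
    items = sorted(messages.items())
    count = len(items)
    ids_table_at = 7 * 4                      # header: seven 32-bit integers
    strs_table_at = ids_table_at + 8 * count  # each table row: (length, offset)
    blob_at = strs_table_at + 8 * count

    blob = b""
    id_rows, str_rows = [], []
    for msgid, msgstr in items:
        for text, rows in ((msgid, id_rows), (msgstr, str_rows)):
            raw = text.encode("ascii")
            rows.append((len(raw), blob_at + len(blob)))
            blob += raw + b"\x00"             # strings are NUL-terminated

    header = struct.pack(
        "<7I", 0x950412DE, 0, count, ids_table_at, strs_table_at, 0, 0
    )
    tables = b"".join(struct.pack("<2I", *row) for row in id_rows + str_rows)
    return header + tables + blob


class MissingTranslation(LookupError):
    """Hypothetical exception; neither Python nor Django defines it."""


class StrictFallback(gettext.NullTranslations):
    """Last-resort fallback that raises instead of echoing the msgid."""

    def gettext(self, message):
        raise MissingTranslation(message)


german = gettext.GNUTranslations(
    io.BytesIO(build_mo({"Welcome to my site.": "Willkommen auf meiner Seite."}))
)

hit = german.gettext("Welcome to my site.")           # found in the catalog
miss = german.gettext("Welcome to my updated site.")  # msgid comes back unchanged

# With a raising fallback attached, the same miss becomes an exception.
german.add_fallback(StrictFallback())
try:
    german.gettext("Welcome to my updated site.")
    raised = False
except MissingTranslation:
    raised = True
```

Messages that are present in the catalog are still translated after add_fallback; only lookups that would otherwise return the msgid reach StrictFallback.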

Tobias

¹ https://docs.python.org/3/library/gettext.html
² https://docs.python.org/3/library/gettext.html#gettext.GNUTranslations
³ https://www.gnu.org/software/gettext/manual/gettext.html#The-LANGUAGE-variable

--
Tobias Kunze / rixx (er/he)
rixx.de software development
Mühlenbecker Weg 1, 16515 Oranienburg
https://rixx.de | https://pretalx.com
Tel.: +49 176 64636590

Jure Erznožnik

Jun 16, 2023, 1:42:19 AM
to django-d...@googlegroups.com
The behaviour is the same on Android. iOS makes it more straightforward
because you HAVE TO have all translations in all languages you support.

Best regards,
Jure

Michiel Beijen

Jun 16, 2023, 2:44:07 AM
to django-d...@googlegroups.com
> On 15 Jun 2023, at 16:15, Tobias Kunze <ri...@cutebit.de> wrote:
>
> On 23-06-15 04:29:59, Gergely Kalmár wrote:
>> It seems that gettext is currently quite permissive – it falls back to the
>> default language whenever a translation file is missing or if the requested
>> message ID is missing from the translation file. This can lead to errors
>> slipping through easily.
>>
>> I think it would be great if there was a way to make gettext raise an error
>> when the translation file is missing or when the msgid is missing.
>
> Agreed that this is annoying behaviour, but as far as I can tell, there's not
> much that Django can do. IIRC we only wrap Python's gettext module¹.
>
> The relevant method, GNUTranslations.gettext, returns the original message if
> no translation has been found, and it does so without indicating that this is
> a fallback response².
>
> AIUI this behaviour is rooted in GNU's gettext, which (just like the Python
> version) allows you to set a priority list of languages to fall back to³.

At runtime it is indeed difficult to get a warning for an untranslated string; the best way to go about it is to generate the translation file and check it for untranslated strings via some automated check such as a GitHub Action.

The added benefit is that a translation string hiding in a lesser-used part of your app, such as the password reset form, will still be spotted by the translation file generation, whereas you might otherwise miss it if you’re just clicking around in the app.
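
As a rough illustration of such an automated check, the sketch below scans the text of a .po file and lists the msgids whose msgstr is empty. It is a deliberately small parser written for this example (the function name untranslated_msgids and the sample content are made up), it ignores fuzzy flags, and it assumes the file has just been regenerated so that every msgid in the code is present in it.

```python
import re

# A keyword line such as: msgid "text"   or: msgstr[0] "text"
_KEYWORD = re.compile(r'^(msgctxt|msgid_plural|msgid|msgstr(?:\[\d+\])?)\s+"(.*)"$')
# A continuation line such as: "more text"
_CONTINUATION = re.compile(r'^"(.*)"$')


def untranslated_msgids(po_text):
    """Return msgids whose msgstr (or any plural msgstr[n]) is empty."""
    missing = []
    # Entries in a .po file are separated by blank lines.
    for block in re.split(r"\n\s*\n", po_text.strip()):
        fields = {}
        current = None
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("#"):  # comments, flags, obsolete "#~" entries
                continue
            keyword = _KEYWORD.match(line)
            if keyword:
                current = keyword.group(1)
                fields[current] = keyword.group(2)
                continue
            continuation = _CONTINUATION.match(line)
            if current and continuation:
                fields[current] += continuation.group(1)
        msgid = fields.get("msgid", "")
        if not msgid:  # the header entry has an empty msgid
            continue
        msgstrs = [v for k, v in fields.items() if k.startswith("msgstr")]
        if not msgstrs or any(v == "" for v in msgstrs):
            missing.append(msgid)
    return missing


SAMPLE_PO = '''\
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

#: views.py:5
msgid "Welcome to my site."
msgstr "Willkommen auf meiner Seite."

#: views.py:9
msgid "Welcome to my updated site."
msgstr ""

#: views.py:12
msgid ""
"A long message split "
"over two lines."
msgstr ""
'''

missing = untranslated_msgids(SAMPLE_PO)
```

In a CI job, one would read each django.po file, call the function, and exit with a non-zero status when the returned list is not empty.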


Michiel

Matthew Pava

Jun 16, 2023, 9:55:52 AM
to django-d...@googlegroups.com
I personally like the current behavior. If I don't have a translation prepared for a certain text, I want it to fall back on the default text. That said, how about Django incorporating a management command of some sort that can be used to examine the translation files and return the texts that are not translated?

אורי

Jun 16, 2023, 10:13:30 AM
to django-d...@googlegroups.com
Hi,

There is a command to check which texts are not translated. If my code is under the directory speedy, I can run this command to see how many texts are untranslated in each translation file, not including English (because English can use the default):

for k in speedy/*/locale/*/LC_MESSAGES/django.po; do echo $(msgattrib --untranslated $k | fgrep msgstr | wc -l) $k; done | grep -v '^0 ' | fgrep -v "/locale/en/LC_MESSAGES/django.po" | sort -nr

And this is with English included:
for k in speedy/*/locale/*/LC_MESSAGES/django.po; do echo $(msgattrib --untranslated $k | fgrep msgstr | wc -l) $k; done | grep -v '^0 ' | sort -nr

Thanks,
Uri.


Gergely Kalmár

Jun 16, 2023, 10:47:13 AM
to django-d...@googlegroups.com
I like the idea of having a management command for checking whether the translations are up-to-date. I know that many people use workarounds in their CI which essentially do that (like re-extracting messages and comparing them with the existing files). Note that I already developed a pytest check for myself that captures missing translations and outdated compiled files (see https://github.com/logikal-io/pytest-logikal/blob/main/pytest_logikal/translations.py), so I'm only really missing a check for outdated extractions.

Now, the only challenge is that I usually use Babel for managing translation files, so I'm not sure if a management command that builds on top of Django's makemessages would be all that useful.

I'm still thinking that it should be possible for Django to wrap gettext in a way that allows us to raise exceptions. It seems silly to me that we could not control this core aspect of the process.

Shai Berger

Jun 17, 2023, 10:01:04 AM
to django-d...@googlegroups.com
Hi Gergely,

On Fri, 16 Jun 2023 16:46:31 +0200
Gergely Kalmár <gergely....@gmail.com> wrote:

>
> I'm still thinking that it should be possible for Django to wrap
> gettext in a way that allows us to raise exceptions. It seems silly
> to me that we could not control this core aspect of the process.
>

I think indeed it is possible. Take a look at the code in
django/utils/translation/trans_real.py and in particular, the
TranslationCatalog class; I _think_ you should be able to insert a
"fallback" catalog which raises some non-KeyError exception in its
`__getitem__()`, and that should give you what you want.

Note that it may be a little complex -- the mechanism there is built to
handle not only fallback languages (i.e. "en-US => en"), but a set of
catalogs for each language (i.e. collecting the translations into one
language from different apps), and further, the translations in each
language need to be kept separate because each may have its own
pluralization formula (in English, last I checked, there is only
one rule -- if the number is not 1, it's plural -- but if you specify
this rule in the .po file, the formula you get is technically distinct
from the default. Other languages sometimes have actually different
rules in different files, mostly for historical reasons).
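
The idea can be sketched with a toy model. This is not Django's code: CatalogChain is a simplified stand-in for the kind of lookup described above (try each catalog in turn, treat KeyError as a miss), and StrictCatalog, MissingTranslation and translate are names invented for the example. Whether the real TranslationCatalog behaves the same way, and how plural rules interact with it, would need checking against trans_real.py.

```python
class MissingTranslation(Exception):
    """Hypothetical error type; deliberately NOT a KeyError subclass."""


class CatalogChain:
    """Toy lookup: try each catalog in order, a KeyError means 'not here'."""

    def __init__(self, *catalogs):
        self._catalogs = list(catalogs)

    def __getitem__(self, msgid):
        for catalog in self._catalogs:
            try:
                return catalog[msgid]
            except KeyError:
                continue  # plain miss: move on to the next catalog
        raise KeyError(msgid)


class StrictCatalog:
    """Catalog of last resort: every lookup raises a non-KeyError exception."""

    def __getitem__(self, msgid):
        raise MissingTranslation(msgid)


def translate(chain, msgid):
    """Permissive caller: a KeyError from the chain means 'use the msgid'."""
    try:
        return chain[msgid]
    except KeyError:
        return msgid


german = {"Welcome to my site.": "Willkommen auf meiner Seite."}

permissive = CatalogChain(german)
strict = CatalogChain(german, StrictCatalog())

silent_fallback = translate(permissive, "Welcome to my updated site.")

try:
    translate(strict, "Welcome to my updated site.")
    raised = False
except MissingTranslation:
    raised = True
```

Because MissingTranslation is not a KeyError, neither the chain nor the permissive caller swallows it, so it surfaces to whoever requested the translation.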

I'm explicitly not expressing an opinion about whether this is
desirable :).

HTH,
Shai.

Wim Feijen

Aug 25, 2023, 9:57:29 AM
to Django developers (Contributions to Django itself)
Hi Gergely,

What helps me is django-rosetta, https://django-rosetta.readthedocs.io/. It makes filtering on untranslated or fuzzy translations very easy.

Best regards, Wim

PS Sorry for the late response.
