Suggestion: Limit activated languages to settings.LANGUAGES


Shai Berger

Oct 4, 2022, 10:05:28 AM
to Django developers (Contributions to Django itself)
Hello Djangonauts,

This suggestion follows from discussions of the security issue
which was resolved in today's release. In essence, the issue is that
language codes are optionally used as prefixes in URLs, and for this
use they also become part of regular expressions used by the URL
resolution mechanisms. So, if an attacker manages to convince the
server to use a "language code" which encodes a pathologically complex
regex, the server can be DoS'd. The solution was to modify the
regex-inclusion part to regex-escape the language code -- ensuring
that language codes are never interpreted, in this role, as anything
other than a simple chain of single characters. This is the proper
security fix: it prevents the problem where it could manifest, with
minimal effects elsewhere.
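
To illustrate the escaping idea, here is a minimal sketch. It is not the
actual patch: language_prefix_pattern is a hypothetical helper, and the
"/<code>/" URL shape is an assumption made for the example.

```python
import re


def language_prefix_pattern(language_code):
    # Hypothetical helper, not Django's code: build a URL-prefix regex
    # in which the language code can only ever match itself literally.
    return re.compile(r"^/%s/" % re.escape(language_code))


# A hostile "language code" that would be a nested-quantifier regex
# if it were interpolated into the pattern unescaped.
hostile = "(a+)+$"
pattern = language_prefix_pattern(hostile)

# After escaping, it matches only the literal characters...
matches_literal = pattern.match("/(a+)+$/about/") is not None
# ...and no longer behaves as a regex over the path.
matches_as_regex = pattern.match("/aaaa/about/") is not None
```

With escaping, the hostile string is just an odd-looking literal prefix;
ordinary codes such as "en-gb" keep matching as before.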

But looking forward, I think we should reconsider the fact that
django.utils.translation.activate() will just activate whatever
language code it is given. We do have a setting, LANGUAGES, which
defines a list of the languages (and codes) supported by our site. Why
should activate() accept anything that is not in this list?

Two points have been raised in support of the current behavior: The
existence of custom languages, and of fallback language codes (that is,
e.g. where the user asks for zh-hk and gets, instead, zh-hant). But in
my opinion, they do not justify it:

- A custom language should be included in settings.LANGUAGES if it is
to be supported; otherwise, e.g. makemessages will not even handle
its translation file.

- When a language with a fallback is requested, the site should really
activate the fallback language, not pretend to activate the requested
one while actually using the fallback. As an example, if "en-us" is
used as a fallback for "en-gb", and the URL has "en-gb" in it, then a
British user would rightly be offended by all the American spelling
they would see. The site should be honest enough to say "yes, you
asked for en-gb, but we fell back to en-us; sorry, that's the best we
have for you".
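
A rough sketch of what "activate the fallback, and be honest about it"
could look like. Everything here is an assumption for illustration:
LANGUAGES stands in for settings.LANGUAGES, FALLBACKS is a hypothetical
table (Django's real fallback rules are more involved), and
resolve_language is not an existing Django function.

```python
# Stand-in for settings.LANGUAGES (hypothetical values for illustration).
LANGUAGES = [("en-us", "American English"), ("zh-hant", "Traditional Chinese")]

# Hypothetical fallback table; Django's real fallback rules are more involved.
FALLBACKS = {"en-gb": "en-us", "zh-hk": "zh-hant"}


def resolve_language(requested):
    """Return the supported code that should really be activated.

    Raises ValueError for anything that is neither listed nor has a
    listed fallback, instead of silently accepting it.
    """
    supported = {code for code, _name in LANGUAGES}
    code = requested.lower()
    if code in supported:
        return code
    fallback = FALLBACKS.get(code)
    if fallback in supported:
        return fallback
    raise ValueError("unsupported language code: %r" % requested)
```

A caller can compare the returned code with the requested one to learn
that a fallback happened, and the code it then activates is always one
of the listed languages.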

Note that there are all sorts of functions that check if a language
code is valid. For example, django.views.i18n.set_language() checks if
a translation for the language exists in the project or its apps (but
not, AFAICT, the setting). d.u.translation.get_language_from_request()
and get_language_from_path() do check the LANGUAGES setting. It is
likely that including the check in activate() will do some double work.
And yet, we found ourselves introducing the security fix.

(I should note that this suggestion was also, independently, raised by
Benjamin, who reported the vulnerability.)

Opinions, suggestions, and explanations of the value I miss in allowing
activate() to take arbitrary language codes are welcome.

Thanks,
Shai.

אורי

Oct 4, 2022, 11:33:15 AM
to django-d...@googlegroups.com
Hi Shai,

Actually, I think this issue is very similar to what I wrote in the "Model-level validation" thread. I have been using Django for 6 years, and I was not aware that one can call translation.activate() with a language that is not in django_settings.LANGUAGES. I assumed this would raise an exception, which in my opinion it should. If the language is not in django_settings.LANGUAGES, there are probably no .po files for it, so all its strings will remain untranslated. I checked, and we have a few tests in Speedy Net that call user.speedy_match_profile._set_active_languages with an unsupported language and verify that saving the model raises an exception. The field is defined so that it accepts only specific valid choices. You can see for example the following tests:


Another thing - we don't call translation.activate() with a language code received from the user. Instead, we check our own language codes and call translation.activate() if one of them matches the URL. Otherwise we redirect to the default URL. I think one should never call translation.activate() with a language code received from the user.

    # `domain`, `site` and `request` are defined earlier in the surrounding code.
    for language_code, language_name in django_settings.LANGUAGES:
        if domain == "{language_code}.{domain}".format(language_code=language_code, domain=site.domain):
            translation.activate(language=language_code)
            request.LANGUAGE_CODE = translation.get_language()
            return self.get_response(request=request)

Or, alternatively, you can check if the language_code is in django_settings.LANGUAGES, and only then activate it.
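
A minimal sketch of that alternative, under some assumptions:
activate_if_supported is a hypothetical helper, the LANGUAGES values are
made up for the example, and activate is passed in as a parameter so the
sketch runs without Django (in a project it would be
django.utils.translation.activate).

```python
def activate_if_supported(language_code, languages, activate):
    """Call activate(language_code) only if the code is listed.

    `languages` has the shape of settings.LANGUAGES: (code, name) pairs.
    `activate` is passed in so the sketch runs without Django; in a
    project it would be django.utils.translation.activate.
    Returns True if the language was activated, False otherwise.
    """
    supported_codes = {code for code, _name in languages}
    if language_code not in supported_codes:
        return False
    activate(language_code)
    return True


# Example run with a list standing in for the real activate().
activated = []
LANGUAGES = [("en", "English"), ("he", "Hebrew")]  # hypothetical values
ok = activate_if_supported("he", LANGUAGES, activated.append)
rejected = activate_if_supported("xx-unknown", LANGUAGES, activated.append)
```

Note that LANGUAGES holds (code, name) pairs, so the membership check
has to be made against the codes, not against the tuples themselves.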

Thanks,
Uri Rodberg, Speedy Net.



Adam Johnson

Nov 17, 2022, 6:07:15 AM
to django-d...@googlegroups.com
I do like this suggestion. I also find the current behaviour surprising.

Do you have a plan for implementation? I guess this would have to go through the regular deprecation pathway?
