[Django] #33318: Truncator class recognizes different length of ellipsis(...) depending on the LANGUAGE_CODE (ex. en-us, ko-kr...)

Django

Nov 24, 2021, 10:52:20 AM

to django-...@googlegroups.com

#33318: Truncator class recognizes different length of ellipsis(...) depending on
the LANGUAGE_CODE (ex. en-us, ko-kr...)
-------------------------------------+-------------------------------------
               Reporter:  YoungJoo Kim     |          Owner:  nobody
                   Type:  Bug              |         Status:  new
              Component:  Template system  |        Version:  3.2
               Severity:  Normal           |       Keywords:  truncatechars
                                           |  Truncator ellipsis LANGUAGE_CODE
           Triage Stage:  Unreviewed       |      Has patch:  0
    Needs documentation:  0                |    Needs tests:  0
Patch needs improvement:  0                |  Easy pickings:  0
                  UI/UX:  0                |
-------------------------------------+-------------------------------------
First, I apologize for my limited English.

The `truncatechars` filter of the Django Template Language (DTL) behaves
inconsistently across languages.
In the `add_truncation_text` method of the `Truncator` class, the length of
the default `truncate` string depends on the value of `LANGUAGE_CODE`
(e.g. `'en-us'`, `'ko-kr'`) in settings.py.

[[br]]

The `add_truncation_text` method of the Truncator class is as follows.
{{{
# django/utils/text.py

def add_truncation_text(self, text, truncate=None):
    if truncate is None:
        truncate = pgettext(
            'String to return when truncating text',
            '%(truncated_text)s…')
    if '%(truncated_text)s' in truncate:
        return truncate % {'truncated_text': text}
    # The truncation text didn't contain the %(truncated_text)s string
    # replacement argument so just append it to the text.
    if text.endswith(truncate):
        # But don't append the truncation text if the current text already
        # ends in this.
        return text
    return '%s%s' % (text, truncate)
}}}

When no `truncate` argument is given, the `truncate` variable is assigned
the translation of `'%(truncated_text)s…'` returned by the `pgettext`
function.

With `LANGUAGE_CODE = 'en-us'` (the default), the ellipsis is a single
character (`…`) of length `1`.
With `LANGUAGE_CODE = 'ko-kr'`, the translated string uses three dots
(`...`) instead, which have length `3`.

I think the string returned by `pgettext` is the cause.
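
The length difference can be checked in plain Python without Django. This is only a minimal sketch: the two strings are assumed values for the `en` and `ko` translations as described in this report, and `suffix_length` is a hypothetical helper, not part of Django.

```python
# Assumed translation values: the English default uses U+2026, while the
# Korean translation described in this report uses three ASCII dots.
en_truncate = '%(truncated_text)s…'
ko_truncate = '%(truncated_text)s...'


def suffix_length(truncate):
    """Length of what remains when the truncated text itself is empty."""
    return len(truncate % {'truncated_text': ''})


print(suffix_length(en_truncate))  # 1
print(suffix_length(ko_truncate))  # 3
```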

[[br]]

As a result, the same code produces different output depending on the
language.

In the `chars` method of the same `Truncator` class, a `for` loop
calculates how many characters of the original text are kept before the
truncation text.

The number of iterations of this loop is determined by the length of the
string returned by `add_truncation_text('', truncate)`, that is, by the
translated `truncate` string.

{{{
# django/utils/text.py

def chars(self, num, truncate=None, html=False):
    """
    Return the text truncated to be no longer than the specified number
    of characters.

    `truncate` specifies what should be used to notify that the string has
    been truncated, defaulting to a translatable string of an ellipsis.
    """
    self._setup()
    length = int(num)
    text = unicodedata.normalize('NFC', self._wrapped)

    # Calculate the length to truncate to (max length - end_text length)
    truncate_len = length
    for char in self.add_truncation_text('', truncate):
        if not unicodedata.combining(char):
            truncate_len -= 1
            if truncate_len == 0:
                break
    if html:
        return self._truncate_html(length, truncate, text, truncate_len,
                                   False)
    return self._text_chars(length, truncate, text, truncate_len)
}}}

[[br]]

In conclusion, the `truncatechars` filter renders the following template
differently depending on the language.
{{{ <p>{{ fruit|truncatechars:6 }}</p> }}}
1. LANGUAGE_CODE = 'en-us' (default)
{{{
straw...
pinea...
}}}
2. LANGUAGE_CODE = 'ko-kr'
{{{
str...
pin...
}}}
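
The difference between the two outputs above follows from the budget calculation in `chars`. The sketch below is a simplified stand-in and not Django's actual implementation: `truncate_chars` is a hypothetical function, it ignores the HTML path, and it counts code points with `len()` rather than handling combining characters in the text itself.

```python
import unicodedata


def truncate_chars(text, num, truncate):
    """Simplified stand-in for Truncator.chars() (plain text only)."""
    text = unicodedata.normalize('NFC', text)
    suffix = truncate % {'truncated_text': ''}

    # Every non-combining character of the suffix uses up one slot.
    budget = num
    for char in suffix:
        if not unicodedata.combining(char):
            budget -= 1
            if budget == 0:
                break

    if len(text) <= num:
        return text  # Short enough, nothing to truncate.
    return truncate % {'truncated_text': text[:budget]}


# One-character ellipsis: 5 characters of text are kept.
print(truncate_chars('strawberry', 6, '%(truncated_text)s…'))    # straw…
# Three dots: only 3 characters of text are kept.
print(truncate_chars('strawberry', 6, '%(truncated_text)s...'))  # str...
```

Both results are six characters long, but the three-dot variant spends half of the budget on the truncation text.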

[[br]]

Regardless of the language, the ellipsis should be treated as a single
character of length `1`.

Otherwise, users in other countries may misunderstand how the
`truncatechars` filter works.

Thank you!

--
Ticket URL: <https://code.djangoproject.com/ticket/33318>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Nov 24, 2021, 11:25:32 AM

to django-...@googlegroups.com


Changes (by Mariusz Felisiak):

* status: new => closed
* resolution: => invalid
* component: Template system => Internationalization

Comment:

Thanks for the report. However, this is an issue in the translations,
which are handled at
[https://docs.djangoproject.com/en/dev/internals/contributing/localizing/#translations.
Transifex] and not in this tracker.

I found "..." instead of the ellipsis character ("…") in the `ar`, `lt`,
`fa`, `sr`, `ca`, `pt_BR`, `ml`, `tk`, `es`, `ko`, `lv`, and `ky`
translations.
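
A check of this kind could be sketched as follows. This is an illustration only: `uses_ascii_dots` is a hypothetical helper, and the sample values are assumptions based on this ticket rather than entries copied from Django's translation catalogs.

```python
ELLIPSIS = '\u2026'  # The single-character ellipsis "…".


def uses_ascii_dots(msgstr):
    """Return True if a truncation string uses '...' and not U+2026."""
    return '...' in msgstr and ELLIPSIS not in msgstr


# Illustrative values only, not real catalog entries.
samples = {
    'en': '%(truncated_text)s…',
    'ko': '%(truncated_text)s...',
}

flagged = [lang for lang, msgstr in samples.items() if uses_ascii_dots(msgstr)]
print(flagged)  # ['ko']
```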

--
Ticket URL: <https://code.djangoproject.com/ticket/33318#comment:1>

Django

Nov 24, 2021, 11:31:36 AM

to django-...@googlegroups.com


Comment (by Mariusz Felisiak):

I fixed this at Transifex.

--
Ticket URL: <https://code.djangoproject.com/ticket/33318#comment:2>

Django

Nov 25, 2021, 11:25:48 PM

to django-...@googlegroups.com


Description changed by YoungJoo Kim:

--
Ticket URL: <https://code.djangoproject.com/ticket/33318#comment:3>
