DO WE REALLY NEED SUCH SYNTAX: T("Message {var1} with {0}").format(100, var1="blabla")?


Vladyslav Kozlovskyy

Jul 19, 2012, 10:18:25 AM
to web2py-developers
Hi all!

It is possible to implement ".format()" syntax for T() (http://docs.python.org/library/string.html#formatstrings).
Do we really need it?

The pluralization placeholder will be "%{{word}}" in this case.
Will such a placeholder cause problems in templates or not?
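
For reference, a minimal sketch of how Python's built-in str.format() treats the pieces in question. It uses plain str objects, not web2py's T(), so how T() itself would behave is an assumption here. The point is only that doubled braces are the format() escape for literal braces, which is why the plural placeholder would have to be written "%{{word}}", and that "{{ }}" is also the default delimiter pair in web2py templates.

```python
# Plain str.format(), not web2py's T(): positional arguments must come
# before keyword arguments in the call.
message = "Message {var1} with {0}".format(100, var1="blabla")
print(message)  # Message blabla with 100

# Doubled braces are the escape for literal braces, so a plural
# placeholder written "%{{file}}" survives format() as "%{file}".
placeholder = "{count} %{{file}}".format(count=3)
print(placeholder)  # 3 %{file}
```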

With best regards,
Vladyslav Kozlovskyy (Ukraine)

Mariano Reingart

Jul 19, 2012, 10:44:16 AM
to web2py-d...@googlegroups.com
I think a pluralization placeholder is a bad idea for this kind of
complex situation.

We should have simpler plural-form support, to avoid more syntax
issues and to handle special cases (such as avoiding argument-reordering
syntax).

A gettext approach would look like:

T ("One file removed", "%d files removed", n)

T ("Delete the selected file?",
"Delete the selected files?",
n)

The translation strings would look like:

msgid "One file removed"
msgid_plural "%d files removed"
msgstr[0] "%d slika je uklonjena"
msgstr[1] "%d datoteke uklonjenih"
msgstr[2] "%d slika uklonjenih"

Note that you don't need any special syntax for plural, and you could
handle special cases (one vs many)
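
As a rough illustration of the gettext-style call, here is a sketch using the Python standard library's gettext module rather than web2py's T(). The helper name removed_message is made up for this example, and NullTranslations is used so that no catalog is needed: it simply applies the English one-vs-many rule.

```python
import gettext

# NullTranslations has no catalog: ngettext() returns the singular msgid
# when n == 1 and the plural msgid otherwise.
translations = gettext.NullTranslations()

def removed_message(n):
    """Hypothetical helper: pick the right form, then fill in the count."""
    template = translations.ngettext("One file removed", "%d files removed", n)
    # The singular msgid has no %d, so only format when one is present.
    return template % n if "%d" in template else template

print(removed_message(1))  # One file removed
print(removed_message(3))  # 3 files removed
```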

Also note that there are almost 11 different language families, each
one with different requirements (from no plural at all, to very
complex combinations).

I think we should not reinvent the wheel here. Did you look at gettext?

http://www.gnu.org/savannah-checkouts/gnu/gettext/manual/html_node/Plural-forms.html

http://www.gnu.org/savannah-checkouts/gnu/gettext/manual/html_node/Translating-plural-forms.html#Translating-plural-forms

How does your proposed patch compare with this?

Best regards

Mariano Reingart
http://www.sistemasagiles.com.ar
http://reingart.blogspot.com
> --
> -- mail from:GoogleGroups "web2py-developers" mailing list
> make speech: web2py-d...@googlegroups.com
> unsubscribe: web2py-develop...@googlegroups.com
> details : http://groups.google.com/group/web2py-developers
> the project: http://code.google.com/p/web2py/
> official : http://www.web2py.com/
>
>

Vladyslav Kozlovskyy

Jul 19, 2012, 11:22:39 AM
to web2py-d...@googlegroups.com
Well. Very interesting.

How would you pluralize a string like this:

"%(hours)s hours %(min)s minutes %(sec)s seconds"?

With a placeholder we can do it simply:

"%(hours)s %%{hour(hours)} %(min)s %%{minute(min)} %(sec)s %%{second(sec)}"

and we'll get:

"1 hour 1 minute 1 second"
"1 hour 10 minutes 3 seconds"
"4 hours 2 minutes 1 second"
"11 hours 23 minutes 35 seconds"

in Ukrainian it's more complex:
"1 година 1 хвилина 1 секунда"
"1 година 10 хвилин 3 секунди"
"4 години 2 хвилини 1 секунда"
"11 годин 23 хвилини 35 секунд"

How many gettext strings would we need to create to translate this one simple message?

The gettext approach is clumsy and messy. I do not like it.
And it is very complex for languages with more than 2 forms.

And yes, I reinvented all 11 language families, as you asked. My approach covers them all.
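
To make the placeholder idea above concrete, here is a small self-contained sketch. It is not web2py's implementation: the names FORMS, plural_index_uk and render are invented for the example, and the word table is hard-coded instead of being read from a language file. The plural rule is the usual three-form rule for Ukrainian.

```python
import re

# Hypothetical word table: English key -> (one, few, many) Ukrainian forms.
FORMS = {
    "hour": ("година", "години", "годин"),
    "minute": ("хвилина", "хвилини", "хвилин"),
    "second": ("секунда", "секунди", "секунд"),
}

def plural_index_uk(n):
    """Return 0, 1 or 2 according to the three-form Ukrainian rule."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2

# Matches placeholders of the form %%{word(variable)}.
PLACEHOLDER = re.compile(r"%%\{(\w+)\((\w+)\)\}")

def render(template, symbols):
    """Expand plural placeholders first, then apply %-formatting."""
    def pick_form(match):
        word, var = match.group(1), match.group(2)
        return FORMS[word][plural_index_uk(symbols[var])]
    return PLACEHOLDER.sub(pick_form, template) % symbols

TEMPLATE = "%(hours)s %%{hour(hours)} %(min)s %%{minute(min)} %(sec)s %%{second(sec)}"
print(render(TEMPLATE, {"hours": 11, "min": 23, "sec": 35}))
```

The whole sentence stays in one translatable string, and only the word table grows with the number of plural forms.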

Vladyslav Kozlovskyy (Ukraine)



19.07.12 17:44, Mariano Reingart wrote:

Mariano Reingart

Jul 19, 2012, 1:25:02 PM
to web2py-d...@googlegroups.com
On Thu, Jul 19, 2012 at 12:22 PM, Vladyslav Kozlovskyy
<dbde...@gmail.com> wrote:
> Well. Very interesting.
>
> How about to pluralize such string:
>
> "%(hours)s hours %(min)s minutes %(sec)s seconds"?

Using the standard gettext approach, such a string should be split into
several translation pieces:

"%(hours)s hours"
"%(min)s minutes"
"%(sec)s seconds"

> With placeholder we can do it simply:
>
> "%(hours)s %%{hour(hours)} %(min) %%{minute(min)} %(sec)s %%{second(sec)}"

It looks too complex and error-prone to me, sorry.

> and we'll get:
>
> "1 hour 1 minute 1 second"
> "1 hour 10 minutes 3 seconds"
> "4 hours 2 minutes 1 second"
> "11 hours 23 minutes 35 seconds"
>
> in Ukrainian it's more complex:
> "1 година 1 хвилина 1 секунда"
> "1 година 10 хвилин 3 секунди"
> "4 години 2 хвилини 1 секунда"
> "11 годин 23 хвилини 35 секунд"
>
> How many gettext-strings we need to create to translate this simply one
> message?

Many, surely, but remember that a translator may have little programming
(Python) knowledge.

> gettext approach is badly and dirty. I do not like it.
> And it is very complex for languages with more then 2 forms.

I agree, but it is straightforward and well known...

> And yes, I reinvented all 11 languages families as you ask. My approach
> cover them all.

:-)

Regards

Vladyslav Kozlovskyy

Jul 19, 2012, 1:59:47 PM
to web2py-d...@googlegroups.com
I know. And it is stupid


19.07.12 20:25, Mariano Reingart wrote:

Vladyslav Kozlovskyy

Jul 19, 2012, 3:29:41 PM
to web2py-d...@googlegroups.com
I know. And it's stupid, because we have not 3 pieces but (for Ukrainian) 3x3=9 separate messages in the po-file. And that is difficult to maintain.

A real-life example:

In /welcome/appadmin/ccache we have 5 messages:

"**%(items)s** %%{item(items)}, **%(bytes)s** %%{byte(bytes)}"

"Cache contains items up to **%(hours)02d** %%{hour(hours)} **%(min)02d** %%{minute(min)} **%(sec)02d** %%{second(sec)} old."

"RAM contains items up to **%(hours)02d** %%{hour(hours)} **%(min)02d** %%{minute(min)} **%(sec)02d** %%{second(sec)} old."

"DISK contains items up to **%(hours)02d** %%{hour(hours)} **%(min)02d** %%{minute(min)} **%(sec)02d** %%{second(sec)} old."

"Hit Ratio: %(ratio)s%% (**%(hits)s** %%{hit(hits)} and **%(misses)s** %%{miss(misses)})"

It is very easy to write them (one message == one sentence). And it is very easy to translate them:
1. we can reorder variables as we wish
2. we can use pluralization or not
3. we can use markmin notation or not (because developers use T.M() for these messages)
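
A side note on the percent signs in messages like these, shown with plain Python %-formatting (no T() involved, and the numbers are made up): a literal percent sign has to be written as "%%", otherwise the formatting step fails.

```python
stats = {"ratio": 75, "hits": 6, "misses": 2}

# "%%" yields one literal percent sign after %-formatting.
line = "Hit Ratio: %(ratio)s%% (%(hits)s hits and %(misses)s misses)" % stats
print(line)  # Hit Ratio: 75% (6 hits and 2 misses)

# A lone "%" followed by " (" is not a valid conversion, so this raises.
try:
    "Hit Ratio: %(ratio)s% (%(hits)s hits)" % stats
    lone_percent_ok = True
except (ValueError, TypeError):
    lone_percent_ok = False
print(lone_percent_ok)  # False
```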

We have these 5 messages on one page (!!!). Do you really believe that you can find developers who want to write 14 messages instead of 5? And use only one variable per message? And have no chance to reorder the parts (how would you do this when the message has been split into 3-4 pieces?). Wake up! It's the XXI century now!

Also, if you really dislike my approach, you can write the messages as before:

"%(items)s items, %(bytes)s bytes"
"Cache contains items up to %(hours)02d hours %(min)02d minutes %(sec)02d seconds old."
"RAM contains items up to %(hours)02d hours %(min)02d minutes %(sec)02d seconds old."
"DISK contains items up to %(hours)02d hours %(min)02d minutes %(sec)02d seconds old."
"Hit Ratio: %(ratio)s%% (%(hits)s hits and %(misses)s misses)"

And this is enough for me to create a perfect translation into Ukrainian, Russian or Polish, using pluralization and having only your 5 sentences. Do you understand me?

I DO NOT WANT TO DO STUPID WORK. I WANT TO DO CREATIVE WORK ONLY.

That's all.

Vladyslav.

P.S. But what about T().format()? Do we really need it?



19.07.12 20:25, Mariano Reingart wrote:
On Thu, Jul 19, 2012 at 12:22 PM, Vladyslav Kozlovskyy
<dbde...@gmail.com> wrote:

Vladyslav Kozlovskyy

Jul 19, 2012, 3:47:49 PM
to web2py-d...@googlegroups.com
19.07.12 20:25, Mariano Reingart wrote:
> On Thu, Jul 19, 2012 at 12:22 PM, Vladyslav Kozlovskyy
> <dbde...@gmail.com> wrote:
>> Well. Very interesting.
>>
>> How about to pluralize such string:
>>
>> "%(hours)s hours %(min)s minutes %(sec)s seconds"?
> Using gettext standard approach, such string should be splitted in
> several translation pieces:
>
> "%(hours)s hours"
> "%(min)s minutes"
> "%(sec)s seconds"
We can't reorder the whole sentence, so the translation may contain mistakes.
>
>> With placeholder we can do it simply:
>>
>> "%(hours)s %%{hour(hours)} %(min) %%{minute(min)} %(sec)s %%{second(sec)}"
> It looks to me too complex and error prone, sorry
I agree with you. For your language you can use the standard approach (as I showed
in my previous message).

>
>> and we'll get:
>>
>> "1 hour 1 minute 1 second"
>> "1 hour 10 minutes 3 seconds"
>> "4 hours 2 minutes 1 second"
>> "11 hours 23 minutes 35 seconds"
>>
>> in Ukrainian it's more complex:
>> "1 година 1 хвилина 1 секунда"
>> "1 година 10 хвилин 3 секунди"
>> "4 години 2 хвилини 1 секунда"
>> "11 годин 23 хвилини 35 секунд"
>>
>> How many gettext-strings we need to create to translate this simply one
>> message?
> Many surely, but remember that translator may have little programming
> (python) knowledge.
The plural placeholder is not Python! It is simply a tiny language created from
scratch for plural translation. If you don't need to use pluralized messages,
you don't need to learn this language at all.

>
>> gettext approach is badly and dirty. I do not like it.
>> And it is very complex for languages with more then 2 forms.
> I Agree, but it is straight-forward and well-known...
Well, I think that if anybody contributes gettext patches for web2py, we can remove
my system from web2py. It's easy. And I can always use a forked web2py with my
patches. :)

>
>> And yes, I reinvented all 11 languages families as you ask. My approach
>> cover them all.
> :-)
:-)

> Regards
>
> Mariano Reingart
> http://www.sistemasagiles.com.ar
> http://reingart.blogspot.com
>
Regards

Vladyslav Kozlovskyy (Ukraine)


Mariano Reingart

Jul 19, 2012, 3:49:04 PM
to web2py-d...@googlegroups.com
Vladyslav, please don't get upset ;-)

Please read all of the mail, and please, do not use informal wording,
this is a public list that is archived and read by the whole
community.

What is wrong with splitting a long text into smaller parts?

I think a larger message like "Cache contains items up to %(hours)02d
hours %(min)02d minutes %(sec)02d seconds old." should be broken up
into smaller pieces, which are more "translatable" and much more reusable:

T("%s second", "%s seconds", n)

In the language files we could have (for example, for Spanish, which is
a simpler one):

{"%s second": ["%s segundo", "%s segundos"]}

For more complex languages, we could have:

{"%s second": {
1: "%s секунда",
3: "%s секунди",
4: "%s секунда",
11: "%s секунд",
}}

(1, 3, 4, 11 are just examples; we could even have a predefined set of
lambdas or anything else)
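
One way the "language file plus predefined lambdas" idea could be sketched is shown below. Everything here is hypothetical: LANGUAGES, RULES and translate_plural are names invented for this example, and the sketch uses a list of forms per message (as in the Spanish example) with a per-language lambda choosing the index, rather than the numeric keys shown for the more complex case.

```python
# Hypothetical language "files": singular msgid -> list of plural forms.
LANGUAGES = {
    "es": {"%s second": ["%s segundo", "%s segundos"]},
    "uk": {"%s second": ["%s секунда", "%s секунди", "%s секунд"]},
}

# Hypothetical predefined rules: map a count to an index into the list.
RULES = {
    "es": lambda n: 0 if n == 1 else 1,
    "uk": lambda n: (0 if n % 10 == 1 and n % 100 != 11 else
                     1 if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14 else
                     2),
}

def translate_plural(singular, plural, n, lang):
    """Look up the form for n, falling back to the English pair."""
    forms = LANGUAGES.get(lang, {}).get(singular)
    if forms is None:
        template = singular if n == 1 else plural
    else:
        template = forms[RULES[lang](n)]
    return template % n

print(translate_plural("%s second", "%s seconds", 5, "es"))  # 5 segundos
```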

Note that this in fact requires neither a large patch nor a complex syntax.

And yes, IMHO we should support all the formatting syntax Python has
(including format).

Best regards,
On Thu, Jul 19, 2012 at 4:29 PM, Vladyslav Kozlovskyy

Jonathan Lundell

Jul 19, 2012, 3:57:41 PM
to web2py-d...@googlegroups.com
On 19 Jul 2012, at 12:49 PM, Mariano Reingart wrote:
>
> Please read all of the mail, and please, do not use informal wording,
> this is a public list that is archived and read by the whole
> community.
>
> What is wrong with splitting a long text into smaller parts?

One problem is that it constrains the translation to use the same order of those parts. That's perhaps not an issue with this example, but it seems to me that it's better to translate by sentence rather than by fragment.
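
A small sketch of the ordering point, using plain %-formatting with made-up values. The "reordered" string stands in for a hypothetical translation that needs a different word order.

```python
values = {"hits": 6, "misses": 2}

# Whole sentence: the translator controls word order via named placeholders.
original = "%(hits)s hits and %(misses)s misses"
reordered = "%(misses)s misses, %(hits)s hits"  # stand-in for a translation
print(original % values)   # 6 hits and 2 misses
print(reordered % values)  # 2 misses, 6 hits

# Fragments: each piece is translated separately, but the order in which
# they are joined is fixed by the calling code, not by the translator.
fragments = ["%s hits" % values["hits"], "%s misses" % values["misses"]]
joined = " and ".join(fragments)
print(joined)  # 6 hits and 2 misses
```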

Bruno Rocha

Jul 19, 2012, 4:15:38 PM
to web2py-d...@googlegroups.com
Hi,

My wife does the translations for almost every system I code, and she almost goes crazy trying to understand the context of the strings. For this reason I am now using translation comments:

T('Most liked members # this is the title on home page top panel')

This way she can understand the context and translate well. If I told her to translate broken-up plural strings, it would be hell.

I prefer to have the whole sentence.

Vladyslav Kozlovskyy

Jul 19, 2012, 4:25:28 PM
to web2py-d...@googlegroups.com
19.07.12 22:49, Mariano Reingart wrote:
> Vladyslav, please don't get upset ;-)
:)
>
> Please read all of the mail, and please, do not use informal wording,
> this is a public list that is archived and read by the whole
> community.
>
> What is wrong with splitting a long text into smaller parts?
I have translated many open source projects, and I know that translating a whole
sentence is easier than translating its parts.
>
> I think a larger message like "Cache contains items up to %(hours)02d
> hours %(min)02d minutes %(sec)02d seconds old." should be breaked up
> into smaller pieces, more "translateable" and much more reusable:
>
> T("%s second", "%s seconds", n)
>
> In the languages files we could have (for example, for Spanish that is
> a simpler one):
>
> {"%s second": ["%s segundo", "% segundos"]}
>
> For more complex languages, we could have:
>
> {"%s second": {
> 1: "%s секунда"
> 3: "%s секунди"
> 4: "%s секунда"
> 11: "%s секунд"
> }}
>
> (1, 3, 4, 11 are just examples, we even could have a predefined set of
> lambas or anything else)
>
> Look that this in fact doesn't requires a large patch nor a complex syntax.
It looks more complex than my approach. Would you explain to me how this works?

Please translate one of the given sentences using this approach.

> An yes, IMHO we should support all formatting syntax python has
> (including format)
Thanks for your opinion. Will you use format() in your projects? Do you really
need it? :)

Massimo DiPierro

Jul 19, 2012, 4:49:59 PM
to web2py-d...@googlegroups.com
Just some general considerations.

1) There is no right or wrong. There are things that have consensus and things that do not, with all kinds of shades in between.

In general we accept patches if:
a) they add functionality, are general enough and do not slow down existing code
b) they make code shorter or simpler but remain backward compatible

2) We should all try to keep discussion on this list as professional as possible.

3) I do not have a strong preference about the syntax of the pluralization system, although longer strings are better than short strings in my view. We cannot use gettext because it translates/pluralizes assuming the entire process uses the same language. In the case of web2py, different apps in different threads need different rules.
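
The per-thread requirement in point 3 can be illustrated with a short sketch. This is not how web2py stores its translator; the names _state, set_language and seconds_label are hypothetical, and the word forms are hard-coded. It only shows two threads in one process applying different plural rules at the same time.

```python
import threading

_state = threading.local()  # hypothetical per-thread translator state

PLURAL_RULES = {
    "en": lambda n: 0 if n == 1 else 1,
    "uk": lambda n: (0 if n % 10 == 1 and n % 100 != 11 else
                     1 if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14 else
                     2),
}
FORMS = {
    "en": ("second", "seconds"),
    "uk": ("секунда", "секунди", "секунд"),
}

def set_language(lang):
    """Select the language for the current thread only."""
    _state.lang = lang

def seconds_label(n):
    lang = getattr(_state, "lang", "en")
    return "%d %s" % (n, FORMS[lang][PLURAL_RULES[lang](n)])

def worker(lang, n, results):
    set_language(lang)
    results[lang] = seconds_label(n)

results = {}
threads = [threading.Thread(target=worker, args=(lang, 3, results))
           for lang in ("en", "uk")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(results["en"], "|", results["uk"])  # 3 seconds | 3 секунди
```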

Massimo



Mariano Reingart

Jul 19, 2012, 4:51:08 PM
to web2py-d...@googlegroups.com
On Thu, Jul 19, 2012 at 5:26 PM, Vladyslav Kozlovskyy
<dbde...@gmail.com> wrote:
> 19.07.12 22:49, Mariano Reingart wrote:
>>
>> Vladyslav, please don't get upset ;-)
>
> :)
>>
>>
>> Please read all of the mail, and please, do not use informal wording,
>> this is a public list that is archived and read by the whole
>> community.
>>
>> What is wrong with splitting a long text into smaller parts?
>
> I have translated many open source project. And know that translate the
> whole sentence easier that parts.

Me too, but these examples are special cases.

We can provide the whole-sentence context as comments; many tools use this approach.

>>
>> I think a larger message like "Cache contains items up to %(hours)02d
>> hours %(min)02d minutes %(sec)02d seconds old." should be breaked up
>> into smaller pieces, more "translateable" and much more reusable:
>>
>> T("%s second", "%s seconds", n)
>>
>> In the languages files we could have (for example, for Spanish that is
>> a simpler one):
>>
>> {"%s second": ["%s segundo", "% segundos"]}
>>
>> For more complex languages, we could have:
>>
>> {"%s second": {
>> 1: "%s секунда"
>> 3: "%s секунди"
>> 4: "%s секунда"
>> 11: "%s секунд"
>> }}
>>
>> (1, 3, 4, 11 are just examples, we even could have a predefined set of
>> lambas or anything else)
>>
>> Look that this in fact doesn't requires a large patch nor a complex
>> syntax.
>
> It looks more complex then my approach. Would you explain me how does this
> work?

The syntax was just an example. The admin should be updated to show
the original text and the plural forms in the same place
(like now, but with several textboxes instead of one).

> Please translate one of given sentences using this approach.

I did.

>> An yes, IMHO we should support all formatting syntax python has
>> (including format)
>
> Thanks for your opinion. Will you use format() in your projects? Do you
> really need it? :)

I don't know. Whether I use it or you use it is not the point.
If we break format, the Python community will complain.

Mariano Reingart

Jul 19, 2012, 5:03:15 PM
to web2py-d...@googlegroups.com
I'm not recommending gettext. In fact, the threading issue is not the
only problem: the tools to compile PO files are also sometimes very
difficult to use (e.g. on Windows).

I'm just saying that Vladyslav's syntax seems too complex, and we are
analyzing just one use case of a very complex sentence.
Most sentences could be translated without being split.