Serbian Cyrillic translation


Janos

Jan 24, 2009, 6:58:40 AM
to Django I18N
Hello,

I'm a new member of this group. My name is Janos Guljas, and I am
working on a Serbian Cyrillic translation of Django modules for my
personal projects. It is based on the existing Serbian translation in
the Latin alphabet. The Cyrillic alphabet is official in Serbia, but
both alphabets are used.

I would like to contribute this translation, but the question is:
under what code? The available choices from http://www.i18nguy.com/unicode/language-identifiers.html
are sr, sr-Cyrl, or sr-Latn. The Cyrillic version should be sr because
it's official, but that code is already taken.

My suggestion is that the Cyrillic version should go under the sr code,
and the Latin version of the translation under sr-Latn. There is a
script, recode-sr-latin, in all major GNU/Linux distributions which
converts the Cyrillic alphabet to the Latin one. The reverse conversion
is ambiguous in some cases. The script can be used for automation and
consistency.
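A minimal sketch of why only the Cyrillic-to-Latin direction is safe to automate. The table below is a partial, hand-written illustration covering only the letters used in the two sample words; it is not the table used by recode-sr-latin.

```python
# Partial Serbian Cyrillic -> Latin table (illustrative, not complete).
CYR_TO_LAT = {
    "и": "i", "н": "n", "ј": "j", "е": "e", "к": "k",
    "ц": "c", "а": "a", "о": "o", "њ": "nj",
}


def cyr_to_lat(text):
    # Each Cyrillic letter maps to exactly one fixed Latin string.
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text)


# "коњ" (horse) uses the single letter њ;
# "инјекција" (injection) uses the two letters н + ј.
print(cyr_to_lat("коњ"))        # konj
print(cyr_to_lat("инјекција"))  # injekcija
```

Both results contain the Latin sequence "nj", so a Latin-to-Cyrillic converter cannot tell њ from нј without knowing the word.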

Thanks.

Malcolm Tredinnick

Jan 26, 2009, 8:07:36 PM
to Djang...@googlegroups.com
On Sat, 2009-01-24 at 03:58 -0800, Janos wrote:
[...]

> My suggestion is that under sr code should be Cyrillic, and under
> sr-Latn Latin version of the translation. There is a script
> recode-sr-latin in all major GNU/Linux distributions which converts
> Cyrillic to Latin alphabet. The reverse process is multi-fold in some
> cases. The script can be used for automation and consistency.

I'd like to hear some of the existing Serbian users and translators'
opinions on this before making a decision, but your logic seems
reasonable.

What version of gettext first contained recode-sr-latin? Most
importantly, is it in gettext-0.15? If so, we're probably fine to rely
on it, since that's a version that is commonly available for Windows. If
it only appeared in, say, gettext-0.17, that's more of a problem. We're
basically saying, these days, that if you need UTF-8 support, you can't
use gettext-0.14.4 on Windows, which is one of the more commonly
available binaries. For better or worse, we have a large Windows-based
user base, including translators, so that audience is of importance,
too. Compilation of binaries for Windows lags (by quite a few years in
the gettext case, sadly) source releases.

Regards,
Malcolm


Janos Guljas

Jan 27, 2009, 3:11:18 AM
to Djang...@googlegroups.com
I have sent the proposal and my translations (Cyrillic and Latin) to
Nebojsa Djordjevic (the current Serbian translator). He agreed that
there should be a Cyrillic version, but is uncertain under what code
(maybe sr for Cyrillic and sr_lat for Latin). I expect his reply on the
quality of the po files.

recode-sr-latin has been in gettext since version 0.15. In general,
users would not need it, because the translators would synchronize the
Latin version with the Cyrillic one whenever a change is made in the
official translations.
--
Janoš Guljaš <ja...@janos.in.rs>
WWW: http://www.janos.in.rs
GPG: public key ID 61D97459, http://www.janos.in.rs/janosguljas.asc

Malcolm Tredinnick

Jan 27, 2009, 8:36:46 PM
to Djang...@googlegroups.com
On Tue, 2009-01-27 at 09:11 +0100, Janos Guljas wrote:
[...]

> Recode-sr-latin is in gettext since version 0.15. In general, users
> would not need it, because the translators would synchronize Latin
> version with Cyrillic when every change is made in the official
> translations.

Yes, that's understood, but we have translators who use Windows as well
as Linux, BSD and MacOS. It's those people I'm concerned about here and
keeping the barriers to entry for them as low as possible.

Let's wait to see what Nebojsa says, anyway. It sounds like this isn't
going to be too hard to sort out one way or the other.

Regards,
Malcolm

Petar Marić

Feb 1, 2009, 5:15:50 PM
to Djang...@googlegroups.com
Hello Janoš,

I'm the other original Serbian translator :)

When Nebojša and I were originally translating Django to Serbian, we
had a long discussion on how to support both Serbian scripts. At the
time (2005) the browsers weren't reporting the scripts correctly - see
http://www.petarmaric.com/entry/2007/mar/20/ie6-izbor-izmedju-latinice-ili-cirilice/
AFAIK IE7 still has this problem and Firefox didn't even provide you
with a choice.
The even bigger problem was the lack of standards for marking scripts:
sr_LAT, sp_yu, sh_sp, sh_yu, sh, cs, sr_LAT_CS, sr@Latn, sr-lat,
sr-lat-utf-8, sr-latin, sr_yu just to name a few.

So since we couldn't support both scripts reliably, we defaulted to
Latin, mainly because using Cyrillic would have been the SEO equivalent
of suicide at the time.

If I were to choose today, I would go only for Cyrillic under the "sr"
code and drop Latin script support altogether - Google and Facebook
also do this.

PS: Just a heads-up - if you're also updating translations, the
problem you are sure to run into sooner or later is the need for the
accusative case and the lack of support for it, which leads to
silly-sounding translations.

Regards,
--
Petar Marić
*e-mail: petar...@gmail.com
*mobile: +381 (64) 6122467

*icq: 224720322
*jabber: petar...@gmail.com
*web: http://www.petarmaric.com/

Janos Guljas

Feb 1, 2009, 5:48:06 PM
to Djang...@googlegroups.com
Hello Petar,

The lack of standards is the main problem, I understand. So it is up to
us to choose. :)

There is the urlify function for SEO. I added mappings for Cyrillic and
made a jQuery plugin for forms outside the admin app.

And the accusative... The admin app would be perfect without that
annoyance.

Janos Guljas

Feb 2, 2009, 12:26:49 PM
to Djang...@googlegroups.com
> What version of gettext first contained recode-sr-latin? Most
> importantly, is it in gettext-0.15? If so, we're probably fine to rely
> on it, since that's a version that is commonly available for Windows. If
> it only appeared in, say, gettext-0.17, that's more of a problem. We're
> basically saying, these days, that if you need UTF-8 support, you can't
> use gettext-0.14.4 on Windows, which is one of the more commonly
> available binaries. For better or worse, we have a large Windows-based
> user base, including translators, so that audience is of importance,
> too. Compilation of binaries for Windows lags (by quite a few years in
> the gettext case, sadly) source releases.

I wrote a replacement for recode-sr-latin from gettext as a Python 2.x
script. Somebody should test it on Windows.

recode-cyr2lat-0.1.tar.bz2
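
The attached script is not reproduced here. As a rough idea of what a pure-Python converter involves, here is an independent sketch in Python 3 (not the Python 2.x attachment). The function name `recode_cyr2lat` and the rule for capitalised digraphs are assumptions made for illustration and may differ from what recode-sr-latin does.

```python
# The 30 letters of the Serbian Cyrillic alphabet and their Latin forms.
CYRILLIC = "абвгдђежзијклљмнњопрстћуфхцчџш"
LATIN = ["a", "b", "v", "g", "d", "đ", "e", "ž", "z", "i", "j", "k", "l", "lj",
         "m", "n", "nj", "o", "p", "r", "s", "t", "ć", "u", "f", "h", "c", "č",
         "dž", "š"]
LOWER = dict(zip(CYRILLIC, LATIN))


def recode_cyr2lat(text):
    """Convert Serbian Cyrillic letters to Latin; leave everything else alone."""
    out = []
    for i, ch in enumerate(text):
        low = ch.lower()
        if low not in LOWER:
            out.append(ch)              # Latin text, digits, punctuation
        elif ch == low:
            out.append(LOWER[low])      # lowercase Cyrillic letter
        else:
            lat = LOWER[low]
            prev_ch = text[i - 1] if i > 0 else ""
            next_ch = text[i + 1] if i + 1 < len(text) else ""
            # Assumed rule: a capital digraph is fully upper-cased only when
            # a neighbouring character is upper case ("ЊЕГОШ" -> "NJEGOŠ").
            if len(lat) == 1 or prev_ch.isupper() or next_ch.isupper():
                out.append(lat.upper())
            else:
                out.append(lat.capitalize())
    return "".join(out)


print(recode_cyr2lat("Љубав и ЊЕГОШ, џем"))  # Ljubav i NJEGOŠ, džem
```

Because non-Cyrillic characters pass through unchanged, English msgid lines and format placeholders in a po file are left as they are.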

Petar Marić

Feb 2, 2009, 2:36:23 PM
to Djang...@googlegroups.com
> I wrote a replacement for recode-sr-latin from gettext as python 2.x
> script. Somebody should test it on Windows.

You had 3 errors in the conversion tables (diff attached).
Otherwise it seems to work OK.

Also, have you created a ticket on code.djangoproject.com? I'd like to
review your updates to the translation when I find the time.

recode-cyr2lat.py.diff

Janos Guljas

Feb 2, 2009, 3:21:32 PM
to Djang...@googlegroups.com
On Mon, Feb 2, 2009 at 8:36 PM, Petar Marić <petar...@gmail.com> wrote:
>> I wrote a replacement for recode-sr-latin from gettext as python 2.x
>> script. Somebody should test it on Windows.
>
> You had 3 errors in the conversion tables (diff attached).
> Otherwise it seems to work OK.

Thank you for the patch. Here is the patched version.

>
> Also, have you created a ticket on code.djangoproject.com? I'd like to
> review your updates to the translation when I find the time.

Here is the ticket: http://code.djangoproject.com/ticket/10175

I'll continue to translate in my free time...

recode-cyr2lat-0.2.tar.bz2

Malcolm Tredinnick

Feb 2, 2009, 9:45:58 PM
to Djang...@googlegroups.com
On Mon, 2009-02-02 at 18:26 +0100, Janos Guljas wrote:
> > What version of gettext first contained recode-sr-latin? Most
> > importantly, is it in gettext-0.15? If so, we're probably fine to rely
> > on it, since that's a version that is commonly available for Windows. If
> > it only appeared in, say, gettext-0.17, that's more of a problem. We're
> > basically saying, these days, that if you need UTF-8 support, you can't
> > use gettext-0.14.4 on Windows, which is one of the more commonly
> > available binaries. For better or worse, we have a large Windows-based
> > user base, including translators, so that audience is of importance,
> > too. Compilation of binaries for Windows lags (by quite a few years in
> > the gettext case, sadly) source releases.
>
> I wrote a replacement for recode-sr-latin from gettext as python 2.x
> script. Somebody should test it on Windows.

Why is this needed? If recode-sr-latin is in gettext-0.15 and we decided
to go with Cyrillic as the default, then there's no problem. It was only
if the program was introduced in a very recent version of gettext (0.17,
say) that there's a problem.

Malcolm


Janos Guljas

Feb 2, 2009, 9:55:02 PM
to Djang...@googlegroups.com

It is an alternative. You may use it or not.

Petar Marić

Feb 3, 2009, 3:07:10 AM
to Djang...@googlegroups.com
> Why is this needed? If recode-sr-latin is in gettext-0.15 and we decided

Alas, only 0.14.4 is available on Windows.

> to go with Cyrillic as the default, then there's no problem. It was only
> if the program was introduced in a very recent version of gettext (0.17,
> say) that there's a problem.

I've been thinking about this - if Cyrillic is indeed made the
default, it's kind of a backwards incompatibility. Is that a problem
for 1.1?

Regards,

Malcolm Tredinnick

Feb 3, 2009, 3:16:10 AM
to Djang...@googlegroups.com
On Tue, 2009-02-03 at 09:07 +0100, Petar Marić wrote:
> > Why is this needed? If recode-sr-latin is in gettext-0.15 and we decided
>
> Alas, only 0.14.4 is available on Windows.

Then if "recode" isn't in that, we can use the Python only version.
However we get there, the end goal is that translators who choose to use
Windows must be able to work without too much pain (a little pain, such
as upgrading to the latest released binary version is fine, but not too
much pain).
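
A minimal sketch of that fallback idea: use the gettext tool when it is on the PATH, otherwise a Python-only converter. The helper name `to_latin` and its `fallback` parameter are hypothetical. The sketch assumes the tool reads standard input and writes standard output in a UTF-8 locale.

```python
import shutil
import subprocess


def to_latin(text, fallback, tool="recode-sr-latin"):
    """Convert text with the gettext tool if installed, else with fallback."""
    path = shutil.which(tool)
    if path is None:
        # Tool not available (e.g. an older gettext build): use Python code.
        return fallback(text)
    # Assumption: the tool acts as a stdin -> stdout filter and the current
    # locale uses UTF-8, so the bytes can be encoded and decoded as UTF-8.
    result = subprocess.run(
        [path], input=text.encode("utf-8"), stdout=subprocess.PIPE, check=True
    )
    return result.stdout.decode("utf-8")
```

With this shape, a translator on a system without the tool gets the same interface, and only the conversion backend differs.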

> > to go with Cyrillic as the default, then there's no problem. It was only
> > if the program was introduced in a very recent version of gettext (0.17,
> > say) that there's a problem.
>
> I've been thinking about this - if Cyrillic is indeed made the
> default, it's kind of a backwards incompatibility.

How is it backwards incompatible? If everybody who can read Serbian can
reasonably be expected to read both Cyrillic and Latin, then it
shouldn't be a problem.

That being said, the fact that apparently most people can read both is
also why I'm not particularly worried about which is the default. The
goal is to be readable by the people who speak the language. So script
choice in this case is quite possibly on the same level of importance as
word choice or whether to use formal or informal forms of address: we
pick something that works as the default and isn't too surprising to
people.

> Is that a problem
> for 1.1?

No. When we wrote the api-stability notes, I made sure that there's a
part included that explicitly excludes translated strings (in fact,
*any* strings) from any stability guarantee. Otherwise you wouldn't be
able to update translations, since people might be relying on the fact
that a certain string wasn't translated or was translated in a
particular fashion. Then we'd have endless debates about what's a bug
and what's an enhancement for translations and it would be horrible.

Strings aren't part of the "stable API", in other words.

Regards,
Malcolm


Petar Marić

Feb 28, 2009, 2:20:33 PM
to Djang...@googlegroups.com
I see that Jacob flagged #10175 as Accepted. Is anybody else
interested in completing the Serbian Cyrillic translation?

Petar Marić

Feb 28, 2009, 2:25:18 PM
to Djang...@googlegroups.com
Also I almost forgot - django/contrib/admin/media/js/urlify.js should
be updated to include Serbian Cyrillic mappings.

Malcolm Tredinnick

Feb 28, 2009, 8:03:03 PM
to Djang...@googlegroups.com
On Sat, 2009-02-28 at 20:20 +0100, Petar Marić wrote:
> I see that Jacob flagged #10175 as Accepted. Is anybody else
> interested in completing the Serbian Cyrillic translation?

Well, he moved it out of unreviewed, as part of triaging a bunch of
tickets. We need to resolve the ticket one way or the other.

I'm kind of hoping the Serbian people on this list will get together and
come up with a decision. If there's no clear consensus, then we won't
change anything. I like the fact that you originally thought about this
issue when making the original translation and had reasons for choosing
the Latin alphabet. If those reasons are still applicable, then we might
as well not bother changing.

So when you all come up with a decision, let me know. What we have now
works and isn't really harming anybody. So there's no huge rush here.

Regards,
Malcolm

Malcolm Tredinnick

Feb 28, 2009, 8:04:44 PM
to Djang...@googlegroups.com
On Sat, 2009-02-28 at 20:25 +0100, Petar Marić wrote:
> Also I almost forgot - django/contrib/admin/media/js/urlify.js should
> be updated to include Serbian Cyrillic mappings.

Hmm ... maybe.

At some point we're going to start getting clashes between existing
mappings from Cyrillic to Latin alphabets and new ones. At that point,
it's just going to be a case of "tough luck". Somebody won't have
perfect conversion. urlify.js is a useful aid, but not something that
will work across every language on the planet. So we may have to
compromise here sometimes.

Not saying that will happen in this case, but if you're looking at that
update, check for clashes with the existing Cyrillic character sets.
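
A check of that kind could be sketched as follows. The two tables are illustrative excerpts made up for this example, not the real contents of urlify.js, and `find_clashes` is a hypothetical helper.

```python
def find_clashes(existing, new):
    """Return {char: (existing_value, new_value)} where the mappings differ."""
    return {
        ch: (existing[ch], new[ch])
        for ch in sorted(set(existing) & set(new))
        if existing[ch] != new[ch]
    }


# Illustrative excerpts only (assumed values, not copied from urlify.js).
existing_cyrillic = {"ж": "zh", "х": "h", "ц": "c", "ч": "ch", "ш": "sh"}
serbian_cyrillic = {"ж": "z", "х": "h", "ц": "c", "ч": "c", "ш": "s", "ђ": "dj"}

for ch, (old, new) in find_clashes(existing_cyrillic, serbian_cyrillic).items():
    print(ch, old, new)
```

Characters that appear in only one table (such as ђ here) are not clashes; only shared characters with different targets are reported.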

Malcolm


Đorđević Nebojša - nesh

Mar 1, 2009, 2:57:28 AM
to Django I18N
On Feb 1, 11:15 pm, Petar Marić <petar.ma...@gmail.com> wrote:
> If I were to choose today I would only go for Cyrillic as "sr" code,
> and drop Latin script support altogether - Google and FaceBook also do
> this.

Well, I agree that we need Cyrillic support and that we can use
Cyrillic text as a base for generating Latin translations but I'm
against dropping Latin support altogether.

If browsers can't reliably report the preferred language, it will be
up to the developer to provide an alternative way for users to select
which variant they want. Cyrillic can be the default, but not the only
Serbian translation.

And yes, we should make the Cyrillic version the default and use it as
a base for the Latin translation (sr-Latin looks OK).
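
A minimal sketch of such a selection, where an explicit user choice wins over whatever the browser sends. The function name, the codes "sr" and "sr-latn", and the simplification of ignoring q-values are all assumptions for illustration.

```python
def pick_serbian_variant(accept_language, user_choice=None,
                         supported=("sr", "sr-latn"), default="sr"):
    """Choose a translation variant; an explicit user choice takes priority."""
    if user_choice in supported:
        return user_choice
    # Simplification: take tags in header order and ignore q-values.
    for item in accept_language.split(","):
        tag = item.split(";")[0].strip().lower()
        if tag in supported:
            return tag
    return default


print(pick_serbian_variant("sr-Latn,en;q=0.8"))
print(pick_serbian_variant("en-US,en", user_choice="sr-latn"))
```

The `user_choice` value would typically come from a cookie or a profile setting, which covers browsers that cannot express the script at all.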

Petar Marić

Mar 1, 2009, 6:54:37 AM
to Djang...@googlegroups.com
So we agree:
* Serbian Cyrillic will be the default script and have the 'sr' code and
* Serbian Latin will have the 'sr_LATIN' code and be automatically
generated from Serbian Cyrillic.

Any objections?

Nebojša Đorđević

Mar 1, 2009, 7:47:23 AM
to Djang...@googlegroups.com
On Sun, Mar 1, 2009 at 12:54, Petar Marić <petar...@gmail.com> wrote:
>
> So we agree:
> * Serbian Cyrillic will be the default script and have the 'sr' code and
> * Serbian Latin will have the 'sr_LATIN' code and be automatically
> generated from Serbian Cyrillic.
>
> Any objections?

+1

--
Nebojša Đorđević - nesh
Studio Quattro - Niš - Serbia
http://studioquattro.biz/
Registered Linux User 282159 [http://counter.li.org]

Jean-Luc Godard - "To be or not to be. That's not really a question."

Janos Guljas

Mar 1, 2009, 9:42:03 AM
to Djang...@googlegroups.com
I agree.

Malcolm Tredinnick

Mar 1, 2009, 6:42:08 PM
to Djang...@googlegroups.com
On Sun, 2009-03-01 at 15:42 +0100, Janos Guljas wrote:
> I agree.
>
> On Sun, Mar 1, 2009 at 12:54 PM, Petar Marić <petar...@gmail.com> wrote:
> >
> > So we agree:
> > * Serbian Cyrillic will be the default script and have the 'sr' code and
> > * Serbian Latin will have the 'sr_LATIN' code and be automatically
> > generated from Serbian Cyrillic.

Okay, that sounds like a plan, then.

I'll do the necessary juggling of files in subversion in the next day or
two (committing the Serbian Cyrillic patch that's already there).

I was thinking about the transcoding issue more (converting from
Cyrillic to Latin) and I think that we'll stick with the script that's
shipped in gettext. If somebody is using gettext-0.14.4 to make an
update, we can do the conversion when the patch is committed (in fact,
that might be the way we end up working anyway to keep things
synchronised).

Regards,
Malcolm


Đorđević Nebojša - nesh

Mar 15, 2009, 10:28:45 AM
to Django I18N
On Mar 1, 12:54 pm, Petar Marić <petar.ma...@gmail.com> wrote:
> So we agree:
> * Serbian Cyrillic will be the default script and have the 'sr' code and
> * Serbian Latin will have the 'sr_LATIN' code and be automatically

Just a quick note: pybabel uses the following codes for the Serbian language:

sr Serbian
sr_BA Serbian (Bosnia and Herzegovina)
sr_CS Serbian (Serbia and Montenegro)
sr_Cyrl Serbian (Cyrillic)
sr_Cyrl_BA Serbian (Cyrillic, Bosnia and Herzegovina)
sr_Cyrl_CS Serbian (Cyrillic, Serbia and Montenegro)
sr_Cyrl_ME Serbian (Cyrillic, Montenegro)
sr_Cyrl_RS Serbian (Cyrillic, Serbia)
sr_Cyrl_YU Serbian (Cyrillic)
sr_Latn Serbian (Latin)
sr_Latn_BA Serbian (Latin, Bosnia and Herzegovina)
sr_Latn_CS Serbian (Latin, Serbia and Montenegro)
sr_Latn_ME Serbian (Latin, Montenegro)
sr_Latn_RS Serbian (Latin, Serbia)
sr_Latn_YU Serbian (Latin)
sr_ME Serbian (Montenegro)
sr_RS Serbian (Serbia)
sr_YU Serbian

so I suggest that we use sr_Latn for the Latin version.
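
To keep the many spellings seen in this thread (sr-latn, sr_LATN, sr-Latn) consistent, they could be normalised to the form in the list above. This is a hypothetical helper sketched for illustration, not Django's or Babel's own code. It assumes a four-letter subtag is a script and a two-letter subtag is a region.

```python
def to_locale_name(tag):
    """Normalise a language tag to the form language_Script_REGION."""
    parts = tag.replace("_", "-").split("-")
    out = [parts[0].lower()]
    for part in parts[1:]:
        if len(part) == 4:
            out.append(part.title())   # script, e.g. Latn, Cyrl
        elif len(part) == 2:
            out.append(part.upper())   # region, e.g. RS, ME
        else:
            out.append(part)           # anything else is left untouched
    return "_".join(out)


print(to_locale_name("sr-latn"))     # sr_Latn
print(to_locale_name("SR_cyrl_rs"))  # sr_Cyrl_RS
```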

Janos Guljas

Mar 15, 2009, 11:48:05 AM
to Djang...@googlegroups.com

I agree.
