I'd like to hear opinions from some of the existing Serbian users and
translators before making a decision, but your logic seems reasonable.
What version of gettext first contained recode-sr-latin? Most
importantly, is it in gettext-0.15? If so, we're probably fine to rely
on it, since that's a version that is commonly available for Windows. If
it only appeared in, say, gettext-0.17, that's more of a problem. We're
basically saying, these days, that if you need UTF-8 support, you can't
use gettext-0.14.4 on Windows, which is one of the more commonly
available binaries. For better or worse, we have a large Windows-based
user base, including translators, so that audience is of importance,
too. Windows binaries lag behind source releases (by quite a few years
in the gettext case, sadly).
Regards,
Malcolm
Yes, that's understood, but we have translators who use Windows as well
as Linux, BSD and MacOS. It's those people I'm concerned about here and
keeping the barriers to entry for them as low as possible.
Let's wait to see what Nebojsa says, anyway. It sounds like this isn't
going to be too hard to sort out one way or the other.
Regards,
Malcolm
I'm the other original Serbian translator :)
When Nebojša and I were originally translating Django to Serbian we
had a long discussion on how to support both Serbian scripts. At the
time (2005) the browsers weren't reporting the scripts correctly - see
http://www.petarmaric.com/entry/2007/mar/20/ie6-izbor-izmedju-latinice-ili-cirilice/
AFAIK IE7 still has this problem and Firefox didn't even provide you
with a choice.
The even bigger problem was the lack of standards for marking scripts:
sr_LAT, sp_yu, sh_sp, sh_yu, sh, cs, sr_LAT_CS, sr@Latn, sr-lat,
sr-lat-utf-8, sr-latin, sr_yu just to name a few.
So since we couldn't support both scripts reliably, we defaulted to
Latin, mainly because using Cyrillic would have been the SEO equivalent
of suicide at the time.
If I were to choose today I would go with Cyrillic only as the "sr"
code and drop Latin script support altogether - Google and Facebook
also do this.
PS: Just a heads-up - if you're also updating translations, the
problem you are sure to run into sooner or later is the need for the
accusative case and the lack of support for it, which leads to
silly-sounding translations.
Regards,
--
Petar Marić
I wrote a replacement for gettext's recode-sr-latin as a Python 2.x
script. Somebody should test it on Windows.
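For readers following along, the kind of conversion being discussed can be
sketched roughly as below. This is not the script posted in this thread and
not gettext's implementation; it is a minimal Python 3 illustration, and the
names LOWER and to_latin are made up for the example. It assumes the usual
one-to-one mapping of the 30 Serbian Cyrillic letters, with the three
digraphs (lj, nj, dž) capitalised as "Lj" in a title-case word and "LJ" in
an all-caps word.

```python
# Hypothetical sketch: Serbian Cyrillic -> Latin transliteration.
# Lowercase table only; uppercase is derived from it below.
LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def to_latin(text):
    """Return text with Serbian Cyrillic letters replaced by Latin ones."""
    out = []
    for i, ch in enumerate(text):
        low = ch.lower()
        if low not in LOWER:
            # Not a Serbian Cyrillic letter: pass through unchanged.
            out.append(ch)
            continue
        latin = LOWER[low]
        if ch != low:  # the source letter was uppercase
            nxt = text[i + 1:i + 2]
            prev = text[i - 1:i] if i else ""
            # Treat the letter as part of an all-caps word if the next
            # character is uppercase, or if it ends a run of capitals.
            all_caps = nxt.isupper() or (not nxt.islower() and prev.isupper())
            if len(latin) == 1 or all_caps:
                latin = latin.upper()
            else:
                latin = latin.capitalize()
        out.append(latin)
    return "".join(out)
```

The digraph handling is the part most likely to differ between
implementations, so it is worth comparing any such script against the
gettext tool on real .po files.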
You had 3 errors in the conversion tables (diff attached).
Otherwise it seems to work OK.
Also, have you created a ticket on code.djangoproject.com? I'd like to
review your updates to the translation when I find the time.
Thank you for the patch. Here is the patched version.
>
> Also, have you created a ticket on code.djangoproject.com? I'd like to
> review your updates to the translation when I find the time.
Here is the ticket: http://code.djangoproject.com/ticket/10175
I'll continue translating in my free time...
Why is this needed? If recode-sr-latin is in gettext-0.15 and we decided
to go with Cyrillic as the default, then there's no problem. It was only
if the program was introduced in a very recent version of gettext (0.17,
say) that there's a problem.
Malcolm
It is an alternative. You may use it or not.
Alas, only 0.14.4 is available on Windows.
> to go with Cyrillic as the default, then there's no problem. It was only
> if the program was introduced in a very recent version of gettext (0.17,
> say) that there's a problem.
I've been thinking about this - if Cyrillic is indeed made the
default, it's kind of a backwards incompatibility. Is that a problem
for 1.1?
Regards,
Then if "recode" isn't in that, we can use the Python-only version.
However we get there, the end goal is that translators who choose to use
Windows must be able to work without too much pain (a little pain, such
as upgrading to the latest released binary version, is fine, but not too
much).
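One way to picture that fallback, purely as a sketch: look for the gettext
program on the PATH and use a Python converter only when it is missing. The
function name pick_converter and the fallback argument are hypothetical, and
the sketch assumes the tool reads text on stdin and writes the converted
text to stdout in a UTF-8 locale.

```python
import shutil
import subprocess


def pick_converter(fallback, tool="recode-sr-latin"):
    """Return a text -> text converter, preferring the external tool.

    `fallback` is any Python callable that does the same conversion
    (hypothetical; supplied by the caller).
    """
    path = shutil.which(tool)
    if path is None:
        # Tool not installed (e.g. an older Windows gettext build).
        return fallback

    def run_tool(text):
        # Assumption: the tool is a stdin-to-stdout filter and the
        # locale encoding is UTF-8.
        result = subprocess.run(
            [path],
            input=text.encode("utf-8"),
            stdout=subprocess.PIPE,
            check=True,
        )
        return result.stdout.decode("utf-8")

    return run_tool
```

A translator's machine would then behave the same either way, which is the
point: the choice of tool stays an implementation detail.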
> > to go with Cyrillic as the default, then there's no problem. It was only
> > if the program was introduced in a very recent version of gettext (0.17,
> > say) that there's a problem.
>
> I've been thinking about this - if Cyrillic is indeed made the
> default, it's kind of a backwards incompatibility.
How is it backwards incompatible? If everybody who can read Serbian can
reasonably be expected to read both Cyrillic and Latin, then it
shouldn't be a problem.
That being said, the fact that apparently most people can read both is
also why I'm not particularly worried about which is the default. The
goal is to be readable by the people who speak the language. So script
choice in this case is quite possibly on the same level of importance as
word choice or whether to use formal or informal forms of address: we
pick something that works as the default and isn't too surprising to
people.
> Is that a problem
> for 1.1?
No. When we wrote the api-stability notes, I made sure that there's a
part included that explicitly excludes translated strings (in fact,
*any* strings) from any stability guarantee. Otherwise you wouldn't be
able to update translations, since people might be relying on the fact
that a certain string wasn't translated or was translated in a
particular fashion. Then we'd have endless debates about what's a bug
and what's an enhancement for translations and it would be horrible.
Strings aren't part of the "stable API", in other words.
Regards,
Malcolm
Well, he moved it out of unreviewed, as part of triaging a bunch of
tickets. We need to resolve the ticket one way or the other.
I'm kind of hoping the Serbian people on this list will get together and
come up with a decision. If there's no clear consensus, then we won't
change anything. I like the fact that you originally thought about this
issue when making the original translation and had reasons for choosing
the Latin alphabet. If those reasons are still applicable, then we might
as well not bother changing.
So when you all come up with a decision, let me know. What we have now
works and isn't really harming anybody. So there's no huge rush here.
Regards,
Malcolm
Hmm ... maybe.
At some point we're going to start getting clashes between existing
mappings from Cyrillic to Latin alphabets and new ones. At that point,
it's just going to be a case of "tough luck". Somebody won't have
perfect conversion. urlify.js is a useful aid, but not something that
will work across every language on the planet. So we may have to
compromise here sometimes.
Not saying that will happen in this case, but if you're looking at that
update, check for clashes with the existing Cyrillic character sets.
Malcolm
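A check for such clashes could look something like the sketch below. The
two tables are illustrative stand-ins, not the actual contents of
urlify.js, and find_clashes is a made-up helper name: it simply reports
characters that both tables define but map to different Latin strings.

```python
# Illustrative tables only; real mappings would be read from urlify.js.
EXISTING_MAP = {"ж": "zh", "к": "k", "ч": "ch", "ш": "sh"}
NEW_MAP = {"ж": "z", "к": "k", "ч": "c", "ђ": "dj"}


def find_clashes(existing, new):
    """Return {char: (existing_value, new_value)} for conflicting entries."""
    shared = existing.keys() & new.keys()
    return {
        ch: (existing[ch], new[ch])
        for ch in sorted(shared)
        if existing[ch] != new[ch]
    }


clashes = find_clashes(EXISTING_MAP, NEW_MAP)
for ch, (old, proposed) in clashes.items():
    print("%s: existing %r vs proposed %r" % (ch, old, proposed))
```

Characters that appear in only one table are not clashes and are ignored;
only shared characters with different targets need a decision.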
Any objections?
+1
--
Nebojša Đorđević - nesh
Studio Quattro - Niš - Serbia
http://studioquattro.biz/
Okay, that sounds like a plan, then.
I'll do the necessary juggling of files in subversion in the next day or
two (committing the Serbian Cyrillic patch that's already there).
I was thinking about the transcoding issue more (converting from
Cyrillic to Latin) and I think that we'll stick with the script that's
shipped in gettext. If somebody is using gettext-0.14.4 to make an
update, we can do the conversion when the patch is committed (in fact,
that might be the way we end up working anyway to keep things
synchronised).
Regards,
Malcolm
I agree.