urlify.js to support Serbian alphabet

Janos Guljas

May 7, 2009, 3:49:37 PM

to Djang...@googlegroups.com

Hi,

Here is a patch for urlify.js in the admin application that should
support the Serbian alphabet. There are no collisions between the
existing and new mappings.

Is this OK with other language teams?

--
Janoš Guljaš <ja...@janos.in.rs>
WWW: http://www.janos.in.rs
GPG: public key ID 61D97459, http://www.janos.in.rs/janosguljas.asc

urlify.js.1.diff

Branko Vukelic

May 7, 2009, 3:54:54 PM

to Djang...@googlegroups.com

On Thu, May 7, 2009 at 9:49 PM, Janos Guljas <ja...@janos.in.rs> wrote:
> Hi,
>
> Here is a patch for urlify.js in the admin application that should
> support the Serbian alphabet. There are no collisions between the
> existing and new mappings.
>
> Is this OK with other language teams?

We have discussed this among ourselves in the Serbian translators'
group, but since Janos hasn't mentioned it here, perhaps I could.

The reason Janos mentioned 'collision' between existing and new tables
is that Serbian transliteration pairs collide with the Russian pairs
(a small subset, if I'm not mistaken). Serbian transliteration is not
by sound as Russian is, so

in ru: ш --> sh
in sr: ш --> s

This is a minor problem that most people will overlook, and there
doesn't seem to be a reasonable solution to it. If someone thinks of
something that could solve this issue, please let us know.
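
A minimal sketch of the collision, assuming a single flat lookup table
shared by all languages. The map names and helper functions below are
hypothetical and are not taken from urlify.js or from the patch; only
the two mappings for ш come from this thread.

```javascript
// Hypothetical maps for illustration only; not the actual urlify.js tables.
const COMMON_MAP = { 'а': 'a', 'м': 'm', 'у': 'u' };
const RUSSIAN_MAP = { 'ш': 'sh' }; // transliteration by sound
const SERBIAN_MAP = { 'ш': 's' };  // Serbian convention described above

// Merge several maps into one flat table; later maps overwrite earlier ones.
function mergeMaps(...maps) {
  return Object.assign({}, ...maps);
}

// Replace each character that has an entry in the map, keep the rest as-is.
function transliterate(text, map) {
  return Array.from(text)
    .map((ch) => (Object.prototype.hasOwnProperty.call(map, ch) ? map[ch] : ch))
    .join('');
}

// With one shared table, only one of the two conventions can win.
const russianWins = mergeMaps(COMMON_MAP, SERBIAN_MAP, RUSSIAN_MAP);
const serbianWins = mergeMaps(COMMON_MAP, RUSSIAN_MAP, SERBIAN_MAP);

console.log(transliterate('шума', russianWins)); // shuma
console.log(transliterate('шума', serbianWins)); // suma
```

Whichever table is merged last decides the result, which is why a
single shared table cannot serve both conventions at once.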

Best regards,

--
Branko
--------------------------------------------------
To leave a quick note to Branko, use this page:
http://www.wallwisher.com/wall/foxbunny
--------------------------------------------------

eml: bg.b...@gmail.com
alt: fox2...@yahoo.co.uk
blg1: http://sudologic.blogspot.com/
blg2: http://foxbunny.blogspot.com/
blg3: http://brankovukelic.blogspot.com/
img: http://picasaweb.google.com/bg.branko
twt: http://www.twitter.com/foxbunny/

Malcolm Tredinnick

May 7, 2009, 4:04:34 PM

to Djang...@googlegroups.com

On Thu, 2009-05-07 at 21:54 +0200, Branko Vukelic wrote:
> On Thu, May 7, 2009 at 9:49 PM, Janos Guljas <ja...@janos.in.rs> wrote:
> > Hi,
> >
> > Here is a patch for urlify.js in the admin application that should
> > support the Serbian alphabet. There are no collisions between the
> > existing and new mappings.
> >
> > Is this OK with other language teams?
>
> We have discussed this among ourselves in the Serbian translators'
> group, but since Janos hasn't mentioned it here, perhaps I could.
>
> The reason Janos mentioned 'collision' between existing and new tables
> is that Serbian transliteration pairs collide with the Russian pairs
> (a small subset, if I'm not mistaken). Serbian transliteration is not
> by sound as Russian is, so
>
> in ru: ш --> sh
> in sr: ш --> s
>
> This is a minor problem that most people will overlook, and there
> doesn't seem to be a reasonable solution to it. If someone thinks of
> something that could solve this issue, please let us know.

Good to mention it, because that's the question I would normally ask in
this type of situation (although Janos explicitly ruled it out). Fact of
life is that there isn't going to be a perfect way to do that and we
shouldn't worry about that at all. The "slugify" process is imperfect.
It's also an approximation and there is no correct answer in any
language at all, since it intentionally throws away information. If
people want to use it, they are free to do so. If it isn't what they are
after, then the solution is not to use it.

For the record, the issue gets much more complicated when you consider
East Asian languages (Chinese, Japanese, Korean, etc). Our goal is
reasonable efforts and acknowledging that this is, by definition, an
approximate solution.

So I think this is fine. Having the full explanation on record is
useful, too. You know that at some point in the future somebody is going
to open a ticket saying that Serbian transliteration handles ш
incorrectly and, even when we point out why, will argue that it should
be the Russians who suffer. I think your current approach is as good as
we can do with that particular function (locale-specific functions are a
more specialised possibility, but they shouldn't be in core, at least
not until they've been proven externally first).

Regards,
Malcolm

Branko Vukelic

May 7, 2009, 4:15:32 PM

to Djang...@googlegroups.com

On Thu, May 7, 2009 at 10:04 PM, Malcolm Tredinnick
<mal...@pointy-stick.com> wrote:
> Good to mention it, because that's the question I would normally ask in
> this type of situation (although Janos explicitly ruled it out). Fact of
> life is that there isn't going to be a perfect way to do that and we
> shouldn't worry about that at all. The "slugify" process is imperfect.
> It's also an approximation and there is no correct answer in any
> language at all, since it intentionally throws away information. If
> people want to use it, they are free to do so. If it isn't what they are
> after, then the solution is not to use it.
>
> For the record, the issue gets much more complicated when you consider
> East Asian languages (Chinese, Japanese, Korean, etc). Our goal is
> reasonable efforts and acknowledging that this is, by definition, an
> approximate solution.
>
> So I think this is fine. Having the full explanation on record is
> useful, too. You know that at some point in the future somebody is going
> to open a ticket saying that Serbian transliteration handles ш
> incorrectly and, even when we point out why, will argue that it should
> be the Russians who suffer. I think your current approach is as good as
> we can do with that particular function (locale-specific functions are a
> more specialised possibility, but they shouldn't be in core, at least
> not until they've been proven externally first).

Yes, I understand. Janos and I agreed that we can always maintain a
separate fork, and that is just fine for the time being. I just
thought it wouldn't hurt to point the issue out in case a solution
emerges in the future.

Janos Guljas

May 7, 2009, 4:58:10 PM

to Djang...@googlegroups.com

Maybe we could write a documentation page explaining how to include a
different urlify.js script in the admin app, for example through
admin.py or directly from the media folder.

There is another solution. To make future modifications easier, we
could organize the character maps not by language but by script (as
in fonts: Arabic, Armenian, Balinese, ..., Yi) and provide several
sets of mappings, one per type of transliteration. URLField could
take an optional argument for the type of slugging. In another,
uglier variant, URLField would not be modified, but different
JavaScript files would have to be added in admin.py. This way we
could handle transliteration by look and by sound, as well as the
different types of East Asian language transliteration.
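
The reorganization proposed above could look roughly like the sketch
below. It is only an illustration under stated assumptions: the names
MAPS_BY_SCRIPT, bySound, byLook and buildMap are hypothetical, and the
only mappings used are the two for ш mentioned earlier in this thread.

```javascript
// Hypothetical layout: outer key is the script, inner key is the
// transliteration style. Not the structure used by the real urlify.js.
const MAPS_BY_SCRIPT = {
  cyrillic: {
    bySound: { 'ш': 'sh' }, // Russian-style pair from this thread
    byLook: { 'ш': 's' },   // Serbian-style pair from this thread
  },
};

// Build one flat lookup table from a selection such as
// { cyrillic: 'byLook' }, picking one style per script.
function buildMap(selection) {
  const result = {};
  for (const [script, style] of Object.entries(selection)) {
    const styles = MAPS_BY_SCRIPT[script];
    if (!styles || !styles[style]) {
      throw new Error(`No mapping for script "${script}" and style "${style}"`);
    }
    Object.assign(result, styles[style]);
  }
  return result;
}

console.log(buildMap({ cyrillic: 'bySound' })['ш']); // sh
console.log(buildMap({ cyrillic: 'byLook' })['ш']);  // s
```

Because a style is chosen per script, the two Cyrillic conventions
never have to share one table; the open question of how a site would
pass its selection to the admin JavaScript remains.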

Branko Vukelic

May 7, 2009, 5:05:22 PM

to Djang...@googlegroups.com

On Thu, May 7, 2009 at 10:58 PM, Janos Guljas <ja...@janos.in.rs> wrote:
>
> Maybe we could write a documentation page explaining how to include a
> different urlify.js script in the admin app, for example through
> admin.py or directly from the media folder.
>
> There is another solution. To make future modifications easier, we
> could organize the character maps not by language but by script (as
> in fonts: Arabic, Armenian, Balinese, ..., Yi) and provide several
> sets of mappings, one per type of transliteration. URLField could
> take an optional argument for the type of slugging. In another,
> uglier variant, URLField would not be modified, but different
> JavaScript files would have to be added in admin.py. This way we
> could handle transliteration by look and by sound, as well as the
> different types of East Asian language transliteration.

I have a feeling these issues will eventually solve themselves once
Django gets its standard AJAX library. Meanwhile, there is definitely
no easy solution for this. You need at least one argument passed to
the function, defaulting to null, in which case the current set of
tables is used.

I don't know how this can be enabled without introducing
locale-related stuff into admin site's forms, which is not too clean a
solution. It's ok for slugify to accept such optional arguments, but I
don't think it's a good idea for JS.
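
To make the shape of that optional argument concrete, here is a
minimal sketch. It assumes a hypothetical function slugifyWithMap and
a tiny stand-in DEFAULT_MAP; neither is part of urlify.js, and the
cleanup rules are simplified for illustration.

```javascript
// Stand-in for the current built-in tables; illustrative only.
const DEFAULT_MAP = { 'ш': 'sh', 'а': 'a', 'м': 'm', 'у': 'u' };

// Hypothetical slug helper. When `overrides` is null (the default),
// only the built-in table is used; otherwise the overrides win.
function slugifyWithMap(text, overrides = null) {
  const map = overrides === null
    ? DEFAULT_MAP
    : Object.assign({}, DEFAULT_MAP, overrides);

  // Transliterate character by character, leaving unmapped ones alone.
  const latin = Array.from(text.toLowerCase())
    .map((ch) => (Object.prototype.hasOwnProperty.call(map, ch) ? map[ch] : ch))
    .join('');

  return latin
    .replace(/[^a-z0-9\s-]/g, '') // drop anything still outside the slug alphabet
    .trim()
    .replace(/[\s-]+/g, '-');     // collapse whitespace and hyphens
}

console.log(slugifyWithMap('Шума шума'));          // shuma-shuma
console.log(slugifyWithMap('Шума', { 'ш': 's' })); // suma
```

Callers that pass nothing get today's behaviour, so existing use would
be unaffected; the harder part, as noted above, is deciding where the
override would come from without tying the admin forms to a locale.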
