On Thu, May 7, 2009 at 9:49 PM, Janos Guljas <ja...@janos.in.rs> wrote:
> Hi,
>
> Here is a patch for urlify.js in the admin application that should support
> the Serbian alphabet. There are no collisions between existing and new
> mappings.
>
> Is this OK with other language teams?
We have mentioned this among ourselves in the Serbian translators group, but since Janos hasn't mentioned it here, perhaps I could.
The reason Janos mentioned 'collision' between the existing and new tables is that some Serbian transliteration pairs collide with the Russian pairs (a small subset, if I'm not mistaken). Unlike Russian, Serbian is not transliterated by sound, so:
in ru: ш --> sh
in sr: ш --> s
This is a minor problem that most people will overlook, and there doesn't seem to be a reasonable solution to it. If someone thinks of something that could solve this issue, please let us know.
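To make the collision concrete, here is a minimal sketch; it is not the actual urlify.js code. The names `RUSSIAN_MAP`, `SERBIAN_MAP`, `mergeMaps` and `downcode` are hypothetical, and only a few illustrative letters are shown. It assumes the per-language maps are folded into a single lookup table, so only one mapping per character can survive (here, the one merged last).

```javascript
// Hypothetical, heavily shortened maps; the real tables are much larger.
const RUSSIAN_MAP = { "ш": "sh", "ы": "y" };
const SERBIAN_MAP = { "ш": "s", "ђ": "dj" };

// Merge several maps into one lookup table. Assumption: a later map
// overwrites an earlier one when both define the same character.
function mergeMaps(maps) {
  const merged = {};
  for (const map of maps) {
    for (const chr of Object.keys(map)) {
      merged[chr] = map[chr];
    }
  }
  return merged;
}

// Replace every mapped character; leave unmapped characters untouched.
function downcode(text, table) {
  return Array.from(text, (chr) =>
    Object.prototype.hasOwnProperty.call(table, chr) ? table[chr] : chr
  ).join("");
}

const serbianLast = mergeMaps([RUSSIAN_MAP, SERBIAN_MAP]);
const russianLast = mergeMaps([SERBIAN_MAP, RUSSIAN_MAP]);

console.log(downcode("ш", serbianLast)); // "s"  (Serbian mapping wins)
console.log(downcode("ш", russianLast)); // "sh" (Russian mapping wins)
```

Whichever order is chosen, one of the two languages gets a result its speakers would not expect for ш, while letters that exist in only one map (ы, ђ) are unaffected.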
Best regards,
--
Branko
--------------------------------------------------
To leave a quick note to Branko, use this page:
http://www.wallwisher.com/wall/foxbunny
--------------------------------------------------
On Thu, 2009-05-07 at 21:54 +0200, Branko Vukelic wrote:
> The reason Janos mentioned 'collision' between existing and new tables
> is that Serbian transliteration pairs collide with the Russian pairs
> (a small subset, if I'm not mistaken). Serbian transliteration is not
> by sound as Russian is, so
>
> in ru: ш --> sh
> in sr: ш --> s
>
> This is a minor problem that most people will overlook, and there
> doesn't seem to be a reasonable solution to it. If someone thinks of
> something that could solve this issue, please let us know.
Good to mention it, because that's the question I would normally ask in this type of situation (although Janos explicitly ruled it out). Fact of life is that there isn't going to be a perfect way to do that and we shouldn't worry about that at all. The "slugify" process is imperfect. It's also an approximation and there is no correct answer in any language at all, since it intentionally throws away information. If people want to use it, they are free to do so. If it isn't what they are after, then the solution is not to use it.
For the record, the issue gets much more complicated when you consider East Asian languages (Chinese, Japanese, Korean, etc). Our goal is reasonable efforts and acknowledging that this is, by definition, an approximate solution.
So I think this is fine. Having the full explanation on record is useful, too. You know that at some point in the future somebody is going to open a ticket saying that Serbian transliteration handles ш incorrectly and, even when we point out why, will argue that it should be the Russians who suffer. I think your current approach is as good as we can do with that particular function (locale-specific functions are a more specialised possibility, but they shouldn't be in core, at least not until they've been proven externally first).
<malc...@pointy-stick.com> wrote:
> So I think this is fine. Having the full explanation on record is
> useful, too. You know that at some point in the future somebody is going
> to open a ticket saying that Serbian transliteration handles ш
> incorrectly and, even when we point out why, will argue that it should
> be the Russians who suffer. I think your current approach is as good as
> we can do with that particular function (locale-specific functions are a
> more specialised possibility, but they shouldn't be in core, at least
> not until they've been proven externally first).
Yes, I understand. Janos and I agreed that we can always maintain a fork separately, and that is just fine for the time being. Just thought it wouldn't hurt to point the issue out in case a solution emerges in the future.
Maybe we could write a documentation page explaining how to include a different urlify.js script in the admin app, for example through admin.py or directly from the media folder.
There is another solution. To make future modifications easier, we could reorganize the character maps by script rather than by language (as fonts do: Arabic, Armenian, Balinese, ..., Yi) and provide several sets of mappings, one per type of transliteration. URLField could take an optional argument for the type of slugging. In another, uglier variant, URLField would not be modified, but different JavaScript files would have to be added in admin.py. This way we could handle transliteration both by look and by sound, as well as the different types of East Asian language transliteration.
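A rough sketch of what maps organized by script and transliteration type could look like follows. Everything here is hypothetical: the `SCRIPT_MAPS` layout, the type names `by_sound` and `by_look`, and the `transliterate` function are made up for illustration, and only a handful of Cyrillic letters are included.

```javascript
// Hypothetical layout: one entry per script, then one table per
// transliteration type. Only a few letters are shown.
const SCRIPT_MAPS = {
  cyrillic: {
    by_sound: { "ш": "sh", "у": "u", "м": "m", "а": "a" },
    by_look: { "ш": "s", "у": "u", "м": "m", "а": "a" },
  },
};

// Look up the table for the requested script and type, then replace
// every mapped character; unmapped characters pass through unchanged.
function transliterate(text, script, type) {
  const tables = SCRIPT_MAPS[script];
  if (!tables || !tables[type]) {
    throw new Error("No mapping for " + script + "/" + type);
  }
  const table = tables[type];
  return Array.from(text, (chr) =>
    Object.prototype.hasOwnProperty.call(table, chr) ? table[chr] : chr
  ).join("");
}

console.log(transliterate("шума", "cyrillic", "by_sound")); // "shuma"
console.log(transliterate("шума", "cyrillic", "by_look"));  // "suma"
```

With this shape the caller, not the table order, decides which convention applies, which is why some extra argument or a separate script file would be needed on the admin side.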
On Thu, May 7, 2009 at 10:58 PM, Janos Guljas <ja...@janos.in.rs> wrote:
> There is another solution. For easier modifications in the future,
> we could reorganize character maps, not by language, but by script (as
> in fonts: Arabic, Armenian, Balinese, ..., Yi). And provide several sets
> of mappings by type of transliteration. URLField could have an
> optional argument for type of slugging. In another, uglier variant,
> URLField would not be modified, but different JavaScript files would
> have to be added in admin.py. This way we could solve transliteration
> by look and by sound, and different types of East Asian language
> transliterations.
I have a feeling these issues will eventually solve themselves once Django gets its standard AJAX library. Meanwhile, there is definitely no easy solution for this. You need at least one argument passed to the function, defaulting to null, in which case the current set of tables is used.
I don't know how this can be enabled without introducing locale-related stuff into the admin site's forms, which is not too clean a solution. It's OK for slugify to accept such optional arguments, but I don't think it's a good idea for the JS.
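For illustration only, the optional argument could look roughly like this. `urlifySketch` and `DEFAULT_TABLE` are hypothetical names, not the real urlify.js interface, and the tables hold just two letters.

```javascript
// Hypothetical stand-in for the tables the script currently ships with.
const DEFAULT_TABLE = { "ш": "sh", "а": "a" };

// Hypothetical signature: `table` is optional and defaults to null,
// in which case the current (default) tables are used.
function urlifySketch(text, table = null) {
  const active = table === null ? DEFAULT_TABLE : table;
  const downcoded = Array.from(text.toLowerCase(), (chr) =>
    Object.prototype.hasOwnProperty.call(active, chr) ? active[chr] : chr
  ).join("");
  // Collapse runs of whitespace into single hyphens.
  return downcoded.trim().replace(/\s+/g, "-");
}

console.log(urlifySketch("Шаш"));                         // "shash"
console.log(urlifySketch("Шаш", { "ш": "s", "а": "a" })); // "sas"
```

The open question from the paragraph above remains: something in the admin forms would still have to decide which table to pass.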