Difference between Arabic-Indic and Persian digits in directionality


Ebrahim

Oct 26, 2012, 5:53:06 AM
to persian-...@googlegroups.com
Hi!

I want to know why Arabic-Indic digits differ in directionality from Persian digits.

For example, test this: data:text/html;charset=utf-8,<div dir="RTL"><p>|age = %D9%A5</p><p>|age = %DB%B5</p></div>

This is reproducible in Firefox, IE and Chrome.
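
For readers who want to see which characters the two percent-encoded sequences in that test URL stand for, here is a minimal Python sketch. The helper name `describe_encoded` is made up for illustration; it relies only on the standard library.

```python
import unicodedata
from urllib.parse import unquote


def describe_encoded(encoded):
    """Decode one percent-encoded UTF-8 character and describe it."""
    ch = unquote(encoded)  # unquote() decodes the bytes as UTF-8 by default
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


# The two digits used in the test URL above.
for encoded in ("%D9%A5", "%DB%B5"):
    print(encoded, *describe_encoded(encoded))
```

The first sequence is the Arabic-Indic digit five (U+0665) and the second is the Persian (Extended Arabic-Indic) digit five (U+06F5).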

This inconsistency is really annoying when editing mixed RTL/LTR text. For example, see this edit on Wikipedia, http://fa.wikipedia.org/w/index.php?diff=8335954, where the fifth and seventh lines are readable before the Arabic-Indic digits are converted to Persian ones (right side of the diff) but become unreadable after the edit (left side).

(A screenshot of Chrome on Windows 8 is also attached in case you cannot see the problem on your OS. I don't know whether the direction properties depend on the OS.)
Difference between Persian and Arabic-indic digits.png

Shervin Afshar

Oct 26, 2012, 5:28:29 PM
to Ebrahim, persian-...@googlegroups.com
Hi, 

First of all, Arabic-Indic digits should not be used with Persian text. Eastern Arabic-Indic digits should be used. I think all the bots on w:fa: are trained to do this replacement fix. 
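
The replacement mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code the w:fa: bots actually run; the names `ARABIC_TO_PERSIAN` and `to_persian_digits` are made up for this example.

```python
# Map ARABIC-INDIC DIGIT ZERO..NINE (U+0660..U+0669) onto
# EXTENDED ARABIC-INDIC DIGIT ZERO..NINE (U+06F0..U+06F9).
ARABIC_TO_PERSIAN = {0x0660 + i: 0x06F0 + i for i in range(10)}


def to_persian_digits(text):
    """Return text with Arabic-Indic digits replaced by Persian ones."""
    # str.translate() accepts a dict mapping code points to code points.
    return text.translate(ARABIC_TO_PERSIAN)


print(to_persian_digits("\u0661\u0666"))  # Arabic-Indic "16" as Persian digits
```

All other characters, including Latin digits, pass through unchanged.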

The issue that you are facing while editing the text is a bidi display issue and not an inconsistency. Since the Unicode bidi class of Eastern Arabic-Indic digits is EN (http://unicode.org/cldr/utility/character.jsp?a=%DB%B6&B1=Show), in contrast to AN for Arabic-Indic digits (http://unicode.org/cldr/utility/character.jsp?a=%D9%A6&B1=Show), they are resolved differently from the Arabic-Indic ones when going through the Unicode bidirectional algorithm. See the resolution details below:
Since this is not generated in the actual page, there's not much to worry about. If users are still concerned about how this is displayed while editing, I changed the template (http://fa.wikipedia.org/wiki/%D8%A7%D9%84%DA%AF%D9%88:Infobox_fossil) so that it uses Arabic-script names for parameters: http://fa.wikipedia.org/wiki/%D8%A7%D9%84%DA%AF%D9%88:%D9%81%D8%B3%DB%8C%D9%84. Here is the diff: http://fa.wikipedia.org/w/index.php?title=%D8%AC%D9%85%D8%AC%D9%85%D9%87_%DA%AF%D9%88%DB%8C%D8%B3&diff=8339502&oldid=8335954
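
To illustrate how the two classes can end up resolved differently, here is a partial Python sketch that applies only rules W2 and W7 of the Unicode bidirectional algorithm (UAX #9) to the two test strings from the first message. It is a simplification under stated assumptions: explicit embeddings, isolates, and every other rule are ignored, the start-of-sequence direction is taken straight from the paragraph direction, and the function name `resolve_digits` is hypothetical.

```python
import unicodedata

STRONG = {"L", "R", "AL"}


def resolve_digits(text, paragraph_rtl=True):
    """Return the bidi classes of text after applying only W2 and W7."""
    classes = [unicodedata.bidirectional(ch) for ch in text]
    sos = "R" if paragraph_rtl else "L"  # assumed start-of-sequence type

    # W2: an EN whose nearest preceding strong type is AL becomes AN.
    last_strong = sos
    for i, cls in enumerate(classes):
        if cls in STRONG:
            last_strong = cls
        elif cls == "EN" and last_strong == "AL":
            classes[i] = "AN"

    # W7: an EN whose nearest preceding strong type is L becomes L.
    # (By this point W3 would have turned AL into R, so AL counts as R.)
    last_strong = sos
    for i, cls in enumerate(classes):
        if cls in STRONG:
            last_strong = "R" if cls == "AL" else cls
        elif cls == "EN" and last_strong == "L":
            classes[i] = "L"
    return classes


print(resolve_digits("|age = \u0665"))  # Arabic-Indic five
print(resolve_digits("|age = \u06F5"))  # Persian five
```

In this sketch the Persian digit is reclassified as L because the nearest strong character before it is the Latin parameter name, so it is pulled into that left-to-right run. The Arabic-Indic digit stays AN and is not. A Persian digit that follows an Arabic letter would instead become AN under W2.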

Shervin

Connie Bobroff

Oct 28, 2012, 2:57:32 AM
to Shervin Afshar, Ebrahim, persian-...@googlegroups.com
This is a very serious issue because in order to "solve" the problem, people are now changing the order of input (the order in which they type the digits). A date like this
 ۲/۲/۷۵
will display very differently in different browsers.
See, for example, Internet Explorer (digits displayed correctly) and Chrome and Firefox (digits displayed incorrectly).

Behnam Rassi

Oct 27, 2012, 9:45:00 AM
to Shervin Afshar, persian-...@googlegroups.com
(off topic)
I think the decimal separator is being used instead of the thousands separator. This is not noticeable when viewing with Tahoma, but my computer automatically replaces Tahoma with a font I made (I can't stand Tahoma!), and there it's quite noticeable.
-b

Behnam Esfahbod

Oct 28, 2012, 3:15:45 AM
to Connie Bobroff, Persian Computing
Connie,

The problem you are referring to is fairly different from the thread topic. Your concern has been discussed previously on P-C and on the Unicode mailing list (http://unicode.org/mail-arch/unicode-ml/y2011-m10/0081.html).

-Behnam


--
Behnam Esfahbod | بهنام اسفهبد
http://behnam.es/
GPG Fingerprint: 3E7F B4B6 6F4C A8AB 9BB9 7520 5701 CA40 259E 0F8B


Ebrahim Byagowi

Oct 30, 2012, 5:31:13 PM
to Persian Computing
Oops, I just noticed that I sent my mail directly to Shervin Afshar instead of sending it to the group. This was my reply to Shervin:
"Thanks for introducing me to those tools and for your related edits on Wikipedia.

I did know that Arabic-Indic digits must not be used in Persian text, but my main question has nothing to do with Wikipedia pages, templates, or user requests. I showed that edit only as an example where Arabic-Indic digits are used with Arabic (Persian) script. It shows that Arabic-Indic digits, with their specification, make editing mixed RTL/LTR text easier than Persian digits do.

For example on: data:text/html;charset=utf-8,<div><p>|date discovered = %D9%A1%D9%A6 %D9%81%D9%88%D8%B1%DB%8C%DB%80 %D9%A2%D9%A0%D9%A0%D9%A6 %D9%85%DB%8C%D9%84%D8%A7%D8%AF%DB%8C</p><p>|date discovered = %DB%B1%DB%B6 %D9%81%D9%88%D8%B1%DB%8C%D9%87%D9%94 %DB%B2%DB%B0%DB%B0%DB%B6 %D9%85%DB%8C%D9%84%D8%A7%D8%AF%DB%8C</p></div> 
Which one do you think is more readable? I think the one with Arabic-Indic digits.

To restate my question: why don't Persian digits have the same bidi class as Arabic-Indic digits? From my point of view it is an inconsistency that the "ARABIC-INDIC DIGIT"s have the "Arabic_Number" bidi class while the "EXTENDED(?) ARABIC-INDIC DIGIT"s have the "European_Number" bidi class. Why are they not in the same class?

IMO someone should request a change to the specification that puts the "EXTENDED ARABIC-INDIC DIGIT"s in the same bidi class as the "ARABIC-INDIC DIGIT"s. Where can I submit such a request?"
Shervin responded with "[..] it's better to first discuss this here [..]" and I agree with him on this, so could somebody please tell me/us the reason for this difference? Thanks :)
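
For reference, the character names and bidi classes quoted in this message can be checked against Python's bundled Unicode database. This is a small sketch; the helper name `digit_table` is made up for illustration.

```python
import unicodedata


def digit_table(first):
    """List (code point, name, bidi class) for the ten digits from first."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)), unicodedata.bidirectional(chr(cp)))
        for cp in range(first, first + 10)
    ]


# 0x0660: ARABIC-INDIC DIGIT ZERO, 0x06F0: EXTENDED ARABIC-INDIC DIGIT ZERO
for first in (0x0660, 0x06F0):
    for row in digit_table(first):
        print(*row)
```

Every digit in the first block reports AN (Arabic_Number) and every digit in the second reports EN (European_Number), which is the difference being asked about.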




--
Ebrahim Byagowi

Shervin Afshar

Oct 31, 2012, 3:16:00 PM
to Ebrahim Byagowi, Persian Computing
Here's my complete response:

All right. Now it's clear what you meant. According to every source I checked, using Arabic-Indic digits with Persian text is considered unacceptable (e.g. http://behdad.org/download/Publications/persiancomputing/a007.pdf, page 9). I checked the UBA and the Standard and couldn't find (or work out) the rationale behind the difference in bidi class.
All in all, I think there is a strong reason for this being the way it is. So it's better to first discuss this here. You can always find a way to make yourself heard to Unicode (http://www.unicode.org/reporting.html).