Bugs that affect Persian typography on the web

Mostafa Hajizadeh

Aug 28, 2012, 5:33:42 AM
to persian-...@googlegroups.com
Hi all,

I've started to make a list of bugs that affect Persian typography on the web. You can find it here <https://trello.com/board/browser-bugs-for-persian/4fae6912e93d630e7137e07f>. It's very limited now. Everybody can help by mentioning bugs missing from the list. You can also help by giving or gathering more information about the bugs and reporting them to the developers.

Would be happy to hear from you about this.

I have a blog post (in Persian) about this <http://sarkeshtype.ir/writings/technical-problems-for-persian-typography-on-web/>.

Regards,
Mostafa

Ehsan Akhgari

Aug 28, 2012, 10:31:12 AM
to Mostafa Hajizadeh, persian-...@googlegroups.com
Hi Mostafa,

Thanks for doing this; this is great!  While gathering lists like this is useful, it's usually best to report these bugs to the vendors of the browsers in question, so that they get a chance to fix them (there is always a chance that they don't know about a bug before you report it to them!).  Here is a list of bug trackers for the popular browser vendors:

1. Mozilla: https://bugzilla.mozilla.org/ (if you file a Mozilla bug, please make sure to CC me on it directly using my first name @mozilla.com).
2. WebKit (the engine behind Safari, Chrome, and Android's web browser): https://bugs.webkit.org/
3. Chrome (if you're not sure whether something is a WebKit or Chrome bug, report it here): http://crbug.com/
4. Safari: https://developer.apple.com/bugreporter/ (I'm not really sure if they actually look at the bugs people report...)
5. Android: http://source.android.com/source/report-bugs.html (In my experience, the Android team is not really responsive to external reports... but I hope that has changed now)
6. IE: http://connect.microsoft.com/IE (I have mixed experience with reporting IE bugs, but it's always best to try)

For the public bug tracking sites among the above (numbers 1, 2, 3, and 5), it's also a good idea to include a link to the reported bugs in your list for future reference.

Cheers,
--
Ehsan
<http://ehsanakhgari.org/>


Behdad Esfahbod

Aug 28, 2012, 3:31:11 PM
to Ehsan Akhgari, Mostafa Hajizadeh, persian-...@googlegroups.com
+1.

For Chrome, Android, and WebKit issues, feel free to CC me on the bugs you
report (beh...@google.com or beh...@chromium.org should do), and I'll make
sure they get fixed.

The ZWNJ issues you see on OS X were a WebKit regression that we fixed
recently. See:

https://bugs.webkit.org/show_bug.cgi?id=89843

I also got the infamous Chrome Linux spacing issue fixed a couple months ago:

http://code.google.com/p/chromium/issues/detail?id=105685#c46

I think I'll go write a blog post about these.

behdad

> On Tue, Aug 28, 2012 at 5:33 AM, Mostafa Hajizadeh <most...@gmail.com
> <mailto:most...@gmail.com>> wrote:
>
> Hi all,
>
> I've started to make a list of bugs that affect Persian typography on the
> web. You can find it here
> <https://trello.com/board/browser-bugs-for-persian/4fae6912e93d630e7137e07f>.
> It's very limited now. Everybody can help by mentioning bugs missed in the
> list. You can also help by giving or gathering more information about the
> bugs and reporting them to the developers.
>
> Would be happy to hear from you about this.
>
> I have a blog post (in Persian) about this
> <http://sarkeshtype.ir/writings/technical-problems-for-persian-typography-on-web/>.

Mostafa Hajizadeh

Aug 29, 2012, 8:03:29 AM
to persian-...@googlegroups.com
Thanks a lot for your help.

Sure, the list is just the beginning. Your information helps a lot in filing bug reports.

I made a new column and a bunch of labels to keep track of progress on these bugs.

I’ll file necessary bug reports very soon.

Cheers,
Mostafa

reza moshksar

Oct 10, 2012, 3:51:19 PM
to Mostafa Hajizadeh, persian-...@googlegroups.com



--
--
REZA MOSHKSAR
Phd candidate in Building, Environment, Science and Technology
B.E.S.T Department - Politecnico di Milano
Via Bonardi 9 20133 Milano Italy


Mohsen BANAN

Oct 11, 2012, 10:40:38 PM
to persian-...@googlegroups.com

For that bug list, the following is a bug that
affects more than Persian typography on the
web. It affects communication.

Mozilla/Firefox and Chrome (as of late 2012) do
not conform to auto paragraph detection as
specified in the Unicode Bidirectional Algorithm,
UAX #9 (http://unicode.org/reports/tr9/).
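
For readers unfamiliar with the rule in question: UAX #9 rules P2 and P3 take the paragraph direction from the first strong character (bidi class L, R, or AL). Below is a minimal Python sketch of that idea, using the standard unicodedata module. The function name paragraph_direction is made up for illustration, and the sketch does not skip characters inside directional isolates as P2 requires, so it is a simplification rather than a conforming implementation.

```python
import unicodedata

def paragraph_direction(text, default="ltr"):
    """Return 'rtl' or 'ltr' from the first strong character (UAX #9 P2/P3).

    Simplification: characters inside directional isolates are not skipped.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # No strong character found: fall back to the caller's default.
    return default
```

A paragraph that opens with Persian words would come out as "rtl" here even if English words follow, which is the behavior being asked for.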

This is an egregious Mozilla/Firefox and Chrome
bug, because it materially impacts human
communication.

I have expanded on this with an example in:

http://www.persoarabic.org/answers#bidiBrowserProblems

Was that already included in the list?

Best,

...Mohsen

Mostafa Hajizadeh

Oct 16, 2012, 8:39:49 AM
to persian-...@googlegroups.com, list-g...@mohsen.1.banan.byname.net
Hi Mohsen,

Thanks for the explanation of this bug. Is this about paragraphs or inline text? This seems to be about elements that don’t have an explicit direction. Does anybody know what the W3C says about this? Should browsers detect paragraph direction based on UAX #9 when it’s not given explicitly? Also, a sample page would help in understanding this bug clearly.

And a general update: I updated the board: moved a few cards to resolved, added new ones which need more information, and reported others to developers. Take a look at the board to see the current state.

Mostafa

Ehsan Akhgari

Oct 16, 2012, 8:49:12 AM
to Mostafa Hajizadeh, persian-...@googlegroups.com, list-g...@mohsen.1.banan.byname.net
Browsers resolve the base directionality of text based on the value of the dir HTML attribute.  The new dir=auto value, which is implemented in WebKit and will soon be implemented in Firefox, will let the example you linked to render properly.


Cheers,
--
Ehsan
<http://ehsanakhgari.org/>


Mostafa Hajizadeh

Oct 16, 2012, 8:54:00 AM
to persian-...@googlegroups.com, Mostafa Hajizadeh, list-g...@mohsen.1.banan.byname.net
Thanks Ehsan. So this can’t be considered a bug currently.

Mohsen BANAN

Oct 17, 2012, 12:31:38 PM
to Mostafa Hajizadeh, Ehsan Akhgari, persian-...@googlegroups.com, list-g...@mohsen.1.banan.byname.net

Salaam Mostafa and Ehsan,

My responses and comments are in-line below:

>>>>> On Tue, 16 Oct 2012 05:54:00 -0700 (PDT), Mostafa Hajizadeh <most...@gmail.com> said:

Mostafa> Thanks Ehsan. So this can’t be considered a bug currently.

No, I disagree.

This was a bug yesterday, and the fact that dir=auto
has been added to HTML does not make it any less
of a bug today.

I see that you are both using gmail.com -- a mostly web
based mail user agent.

Let me show you the bug. Here it is:

یک دو سه four five six هفت هشت نه ten eleven.


Now, if you see that in natural counting order and
natural mixed direction, as an RTL paragraph, then this
can't be considered a bug currently.

If you see it all out of order and laid out as an LTR
paragraph, then there is a bug.

Note that as a result, email communication has
been impacted and note that I am not even sending
this with any use of HTML at all.

This entire email is generated in plain text,
using emacs's Gnus through gmane (not
googlegroups) and currently from Tehran -- fully
conforming to internet email specifications.

I had sent two previous emails to the group about
Persian use of Emacs and Gnus with the following
subject lines:

PersoArabic With Emacs -- نگارش به فارسی با ایمکس
Persian Input Methods -- شیوه‌هایِ درج به فارسی

Both of these seem to have been suppressed
(censored) by the list administrator.

Note to List Admins: If that is the case, at a
minimum you should say that you suppressed it and
explain why.

I chose to bring up this bug, because it impacts
Persian interoperability.

Emacs properly conforms to auto paragraph detection as
specified in the Unicode Bidirectional Algorithm
http://unicode.org/reports/tr9/ (UAX #9).

And the browser-based webmail clients (gmail.com ...) are non-conformant.

Below is my suggestion for a proper fix.

Mostafa> On Tuesday, October 16, 2012 4:19:53 PM UTC+3:30, Ehsan Akhgari wrote:

Ehsan> Browsers resolve the base directionality of text based on the value of the
Ehsan> dir HTML element. The new dir=auto HTML attribute which is implemented in
Ehsan> WebKit and soon to be implemented in Firefox will enable the example you
Ehsan> quote in the link below to be rendered properly.

Ehsan, that alone can't be the fix.

Here is the right way of doing it -- which is what
has been done in emacs.

You look at the characters of what is to be
rendered. If they include characters from an RTL
script, then you assume auto direction detection
unless it is otherwise specified.
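
The rule described above can be sketched in Python as follows. This is one reading of the proposal, not code taken from Emacs: the function name effective_direction and the explicit parameter are hypothetical, "includes an RTL language" is approximated as "contains at least one character of bidi class R or AL", and the auto detection step uses the first strong character.

```python
import unicodedata

RTL_CLASSES = ("R", "AL")

def effective_direction(text, explicit=None):
    """Pick a base direction for a block of text.

    explicit: 'ltr' or 'rtl' if a direction was specified, else None.
    """
    if explicit is not None:
        return explicit  # an explicitly specified direction always wins
    classes = [unicodedata.bidirectional(ch) for ch in text]
    if not any(c in RTL_CLASSES for c in classes):
        return "ltr"  # no RTL characters at all: keep the LTR default
    # Text contains RTL characters: auto-detect from the first strong character.
    for c in classes:
        if c == "L":
            return "ltr"
        if c in RTL_CLASSES:
            return "rtl"
    return "ltr"  # not reached: an R or AL character was found above
```

Under this reading the result is the same as always applying first-strong detection with an LTR fallback; the RTL check only spells out the condition stated in the message.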

This should be browser behavior even outside of
HTML. For example, a plain-text bidi file should
render correctly when you visit something
like

file:///tmp/example.bidi

Only such a fix really is a fix --
just adding dir=auto to HTML and implementing
that won't cut it.

Best,

...Mohsen

Mostafa Hajizadeh

Oct 17, 2012, 2:52:15 PM
to Mohsen BANAN, Ehsan Akhgari, persian-...@googlegroups.com
Dear Mohsen,

I see your point. We should be able to communicate with plain text in Persian. That’s a legitimate concern and I really appreciate your dedication to changing that.

But even if we go ahead and report this as a bug, it probably won’t be accepted, because we don’t have much chance of convincing developers to change the default browser behavior for every element when the W3C does not recommend that.

I guess the best way to go about this is to ask service providers to set dir="auto" for user-generated content.
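
As a rough illustration of what that could look like on the service side, here is a Python sketch that wraps each paragraph of user-submitted plain text in an element carrying dir="auto". The function name render_user_text is hypothetical, and the sketch assumes paragraphs are separated by blank lines; a real service would fit this into its own templating and sanitizing code.

```python
import html

def render_user_text(text):
    """Turn plain user text into HTML paragraphs that each carry dir="auto"."""
    # Assumption: paragraphs are separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Escape the user's text so it cannot inject markup.
    return "\n".join(
        '<p dir="auto">{}</p>'.format(html.escape(p)) for p in paragraphs
    )
```

Putting dir="auto" on each paragraph rather than on one outer container lets the browser resolve the direction per paragraph, so a Persian paragraph and an English one in the same message can each get their own base direction.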

Thank you,
Mostafa

Ehsan Akhgari

Oct 17, 2012, 5:08:57 PM
to Mohsen BANAN, Mostafa Hajizadeh, persian-...@googlegroups.com
Dear Mohsen,

I understand all of your concerns, and I also understand that this is frustrating to you.  And I totally agree that a plain text mail client should be able to handle this by applying the Unicode Bidirectional Algorithm (UBA).  What I was trying to point out was that this was never intended to work *in HTML*, and dir=auto is the first attempt by browser vendors to fix this problem.  For more information about the concept of directionality in HTML, please see <http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-directionality>.

I have been part of the group that proposed the new dir=auto directionality mode, so I would be very happy to know if you think that using dir=auto would not cover your concern in the case of HTML email clients.  The biggest issue with dir=auto is that the algorithm used to guess the directionality of a text is very simplistic, since it only looks at the first strong character in the text, but after a long discussion we decided that it is not possible to design an algorithm that handles each and every case.  One example where the dir=auto directionality mode would resolve the wrong directionality (in this case, the correct directionality would be LTR) is the following:

<p dir=auto>احسان is my name spelled in Persian.</p>
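
To make the limitation concrete, the following Python sketch applies a first-strong check to the text of that example and, purely for comparison, counts the strong characters in each direction. The helper name strong_classes is made up, and the counting is only an illustration of why the result feels wrong; it is not part of dir=auto or of any specification, and it is not being proposed as a replacement.

```python
import unicodedata

def strong_classes(text):
    """Bidi classes of the strong characters in text, in order."""
    classes = (unicodedata.bidirectional(ch) for ch in text)
    return [c for c in classes if c in ("L", "R", "AL")]

example = "احسان is my name spelled in Persian."
strong = strong_classes(example)

# What a first-strong heuristic resolves for this text.
first_strong = "rtl" if strong[0] in ("R", "AL") else "ltr"

# For comparison only: how many strong characters point each way.
ltr_count = sum(1 for c in strong if c == "L")
rtl_count = len(strong) - ltr_count

print(first_strong, rtl_count, ltr_count)
```

Run as written, this reports rtl for the first-strong result even though 24 of the 29 strong characters are left-to-right.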


Cheers,
--
Ehsan
<http://ehsanakhgari.org/>


Behdad Esfahbod

Oct 17, 2012, 9:21:15 PM
to Ehsan Akhgari, Mohsen BANAN, Mostafa Hajizadeh, persian-...@googlegroups.com
On 12-10-17 02:08 PM, Ehsan Akhgari wrote:
...
> <http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-directionality>.
>
> I have been part of the group who proposed the new dir=auto directionality
> mode, so I would be very happy to know if you think that using dir=auto would
> not cover your concern in the case of HTML email clients. The biggest issue
> about dir=auto is that the algorithm used to guess the directionality of a
> text is very simplistic, since it only looks at the first strong character for
> a sentence, but after a long discussion we decided that it is not possible to
> design an algorithm which can handle each and every case. One example where
> the dir=auto directionality mode would resolve the wrong directionality (in
> this case, the correct directionality would be LTR) would be the following;
>
> <p dir=auto>احسان is my name spelled in Persian.</p>

That's bad form anyway :).

b

