Rich Gillam
Feb 5, 2025, 7:51:24 PM
to design-wg (CLDR), icu-design
Hi everybody—
I’ve run into a number of bidi-related problems lately, and I need some guidance. I’m thinking about filing tickets, but thought it might be good to discuss by email before filing, so I have a better idea of what to say in the tickets.
In a couple of recent cases, the bidi behavior of an element differs depending on which digits we’re using to format a number, and there doesn’t seem to be any facility in CLDR to deal with this.
We offer the ability for users to choose their numbering system independent of their locale, so if your system language is Arabic or Urdu, you can use either native digits or Latin digits. But consider the degree sign: If you’re using Latin digits, you want it to stick to the right-hand side of the number, but if you’re using native digits, you want it to stick to the left-hand side of the number. We get that behavior in Arabic “for free” due to the characters’ bidi properties: the Arabic-Indic digits (U+0660 to U+0669) have bidi class AN, while Latin digits are EN. But we don’t get that behavior in Urdu or Persian, whose Extended Arabic-Indic digits (U+06F0 to U+06F9) are EN, just like Latin digits. And I can’t just deal with this by changing our copy of CLDR to put an RLM in front of the degree sign, because that’ll move it to the left-hand side regardless of my numbering system. I end up having to include clumsy special-case code.
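To make the property difference behind the degree-sign case concrete, here is a minimal Python sketch. It uses the standard `unicodedata` module to print the bidi class of each character. `with_degree_sign` is a hypothetical helper, not an ICU or CLDR API, that illustrates the kind of special case described above. It assumes an RLM is wanted only when the number is written in Extended Arabic-Indic digits.

```python
import unicodedata

RLM = "\u200F"     # RIGHT-TO-LEFT MARK
DEGREE = "\u00B0"  # DEGREE SIGN, bidi class ET (European terminator)


def bidi_classes(text):
    """Return the Unicode bidi class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


def with_degree_sign(number_text):
    """Hypothetical special case: append a degree sign to a formatted number.

    Assumption: only Extended Arabic-Indic digits (U+06F0..U+06F9, used for
    Persian and Urdu) need an RLM between the number and the degree sign.
    """
    uses_extended_digits = any("\u06F0" <= ch <= "\u06F9" for ch in number_text)
    if uses_extended_digits:
        return number_text + RLM + DEGREE
    return number_text + DEGREE


samples = [
    ("Latin", "25"),
    ("Arabic-Indic", "\u0662\u0665"),
    ("Extended Arabic-Indic", "\u06F2\u06F5"),
]
for label, digits in samples:
    # Latin and Extended Arabic-Indic digits are EN; Arabic-Indic digits are AN.
    print(label, bidi_classes(digits + DEGREE))
```

The helper has to know about the numbering system explicitly, which is exactly the knowledge a static CLDR pattern cannot carry.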
I’ve run into other variations on this: if I format a time in 12-hour format, which side “AM” or “PM” appears on might depend on the digits I’m using for the time, but I can’t control that, either (I also can’t peg it to one side or the other by changing the “AM” and “PM” strings, because the bidi mark has to go on the other side of the time). I’ve also run into problems with currency formats where I’m operating in an RTL language but a particular currency symbol is all LTR characters.
So what’s the preferred solution to these kinds of problems? Right now I can’t think of anything other than special-case code.
For these and many other reasons, it seems like we should be getting away from embedding bidi controls in our CLDR data and moving to a code-based solution based on the bidi isolate characters (and even then, I’m not quite sure how to solve the above problems). What would it take to make that move?
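As a rough illustration of what a code-based approach could look like, the Python sketch below wraps a substituted value in Unicode bidi isolate characters at format time instead of storing marks in the pattern data. The `isolate` function and its `direction` parameter are hypothetical names chosen for this example, not an existing ICU API. The sketch only shows the wrapping step and does not by itself decide which side a degree sign or “AM”/“PM” should land on.

```python
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate(text, direction="auto"):
    """Wrap text in a bidi isolate so it cannot reorder its neighbors.

    direction is "ltr", "rtl", or "auto" (hypothetical parameter); "auto"
    uses FSI, which takes its direction from the first strong character.
    """
    opener = {"ltr": LRI, "rtl": RLI, "auto": FSI}[direction]
    return opener + text + PDI


# Example: isolate an all-LTR currency symbol before inserting it into an
# RTL pattern. The amount uses Extended Arabic-Indic digits.
symbol = isolate("US$", "ltr")
amount = "\u06F1\u06F2\u06F3"
formatted = amount + "\u00A0" + symbol
```

Because the isolate is applied in code, the caller can choose the direction per value, which static bidi marks embedded in the data cannot do.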
—Rich Gillam