c.b...@posteo.jp
Feb 5, 2024, 6:10:04 AM
Dear Stephen,
thanks a lot. Your explanation helped me a bit.
How is this represented as bytes on disk?
As a simple example, let's assume "ABC" is a word in a left-to-right script.
If it were a Hebrew word (e.g. via translation), it would be displayed as "CBA"
because it is read from right to left, starting with "A", then "B", with "C"
at the end.
Am I right so far?
Now let's add such a shortcut indicator to the first letter (the "A").
Weblate and Qt seem to use the correct BiDi algorithm and display
it correctly, like this:
"CB&A" (or an underlined "A" in a Qt GUI)
But a terminal that does not apply the BiDi algorithm shows it like
this:
"&CBA"
I am aware that a Unicode character can consist of multiple bytes, and
that one visible character can be built from several code points. I
remember the emoji example of a black astronaut:
human+rocket+black (or something like this).
But please let's keep it simple and not open the Unicode box too much. I
assume there is a hidden control character indicating the reading
direction?
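
A minimal sketch of how one could check a string for such hidden direction characters. The helper name find_bidi_controls is made up for illustration, and the Hebrew letters alef, bet, gimel are assumed as stand-ins for A, B, C:

```python
import unicodedata

# Explicit directional formatting characters defined by Unicode:
# ALM, LRM, RLM, LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI.
BIDI_CONTROLS = set(
    "\u061c\u200e\u200f"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)


def find_bidi_controls(text):
    """Return (index, Unicode name) for every explicit bidi control in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


# Assumed example data: '&' followed by alef, bet, gimel.
plain = "&\u05d0\u05d1\u05d2"
# The same string with a RIGHT-TO-LEFT MARK put in front by hand.
marked = "\u200f" + plain

print(find_bidi_controls(plain))
print(find_bidi_controls(marked))
```

An empty list for a string would mean that no such control character is stored in it.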
So what is in the file?
&ABC
or
&CBA
I guess it is the first (&ABC), right? Is it encoded in Unicode that
the A, the B and the C need to be read the "other way around"?
So the I/O algorithm reads something like this?
& reverted-A reverted-B reverted-C
Or even in Python:
myletters = ['&', 'A', 'B', 'C']
# but myletters[1:] are somehow coded as "other way around"
OK?
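
A small sketch to test this guess, again assuming alef, bet, gimel as stand-ins for A, B, C (the variable names are only illustrative). It prints the UTF-8 bytes of the string in the order the letters are listed, and the bidirectional class that the Unicode database assigns to each character:

```python
import unicodedata

# Assumed stand-ins: '&' followed by alef, bet, gimel in reading order.
letters = ["&", "\u05d0", "\u05d1", "\u05d2"]
text = "".join(letters)

# The bytes as they would be written to a UTF-8 file.
encoded = text.encode("utf-8")
print(encoded.hex(" "))  # 26 d7 90 d7 91 d7 92

# The direction is looked up per character, not stored as extra bytes:
# 'R' is right-to-left, 'ON' is "other neutral".
bidi_classes = [unicodedata.bidirectional(ch) for ch in text]
print(bidi_classes)  # ['ON', 'R', 'R', 'R']
```

If this is right, the list order in Python matches the reading order, and only the display step reorders the letters.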
Kind
Christian