eid="/akn/il/act/1961-08-01/@he/!main#point_64ב_א_3" <- It is mixed!
This section gives examples of bidirectional IRIs, in Bidi Notation. It shows legal IRIs with the relationship between logical and visual representation, and explains how certain phenomena in this relationship may look strange to somebody not familiar with bidirectional behavior, but familiar to users of Arabic and Hebrew. It also shows what happens if the restrictions given in Section 4.2 are not followed. The examples below can be seen at [BidiEx], in Arabic, Hebrew, and Bidi Notation variants.
To read the bidi text in the examples, read the visual representation from left to right until you encounter a block of rtl text. Read the rtl block (including slashes and other special characters) from right to left, then continue at the next unread ltr character.
Example 1: A single component with rtl characters is inverted:
logical representation: http://ab.CDEFGH.ij/kl/mn/op.html
visual representation: http://ab.HGFEDC.ij/kl/mn/op.html
Components can be read one-by-one, and each component can be read in its natural direction.
Example 2: More than one consecutive component with rtl characters is inverted as a whole:
logical representation: http://ab.CDE.FGH/ij/kl/mn/op.html
visual representation: http://ab.HGF.EDC/ij/kl/mn/op.html
A sequence of rtl components is read rtl, in the same way as a sequence of rtl words is read rtl in a bidi text.
Example 3: All components of an IRI (except for the scheme) are rtl. All rtl components are inverted overall:
logical representation: http://AB.CD.EF/GH/IJ/KL?MN=OP;QR=ST#UV
visual representation: http://VU#TS=RQ;PO=NM?LK/JI/HG/FE.DC.BA
The whole IRI (except the scheme) is read rtl. Delimiters between rtl components stay between the respective components; delimiters between ltr and rtl components don't move.
Example 4: Several sequences of rtl components are each inverted on their own:
logical representation: http://AB.CD.ef/gh/IJ/KL.html
visual representation: http://DC.BA.ef/gh/LK/JI.html
Each sequence of rtl components is read rtl, in the same way as each sequence of rtl words in an ltr text is read rtl.
Example 5: Example 2, applied to components of different kinds:
logical representation: http://ab.cd.EF/GH/ij/kl.html
visual representation: http://ab.cd.HG/FE/ij/kl.html
The inversion of the domain name label and the path component may be unexpected, but is consistent with other bidi behavior. For reassurance that the domain component really is "ab.cd.EF", it may be helpful to read aloud the visual representation following the bidi algorithm. After "http://ab.cd." one reads the RTL block "E-F-slash-G-H", which corresponds to the logical representation.
Example 6: Same as example 5, with more rtl components:
logical representation: http://ab.CD.EF/GH/IJ/kl.html
visual representation: http://ab.JI/HG/FE.DC/kl.html
The inversion of the domain name labels and the path components may be easier to identify because the delimiters also move.
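The reading rule behind Examples 1 to 6 can be sketched in a few lines of Python. This is only an illustration in Bidi Notation (upper-case ASCII stands for rtl letters, lower-case for ltr), not an implementation of the Unicode bidi algorithm: the function name `visual` and the regular expression are assumptions made for this sketch, and digits are rejected because their handling (Examples 7 to 10) is more involved.

```python
import re

# Bidi Notation only: upper-case ASCII letters stand for rtl characters.
# An rtl run starts and ends with an upper-case letter and may contain
# delimiters (anything that is not an ASCII letter or digit) in between.
RTL_RUN = re.compile(r"[A-Z](?:[^A-Za-z0-9]*[A-Z])*")


def visual(logical: str) -> str:
    """Return a simplified visual form of a Bidi Notation IRI.

    Each maximal rtl run is reversed as a whole, so delimiters between
    rtl components move with them, while delimiters that touch an ltr
    character stay where they are.
    """
    if any(ch.isdigit() for ch in logical):
        # Digits need extra rules that this sketch does not model.
        raise ValueError("digits are outside the scope of this sketch")
    return RTL_RUN.sub(lambda m: m.group(0)[::-1], logical)


print(visual("http://ab.CDEFGH.ij/kl/mn/op.html"))  # Example 1
print(visual("http://AB.CD.ef/gh/IJ/KL.html"))      # Example 4
print(visual("http://ab.CD.EF/GH/IJ/kl.html"))      # Example 6
```

Under these assumptions the three calls reproduce the visual representations given for Examples 1, 4, and 6.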
Example 7: A single rtl component with included digits:
logical representation: http://ab.CDE123FGH.ij/kl/mn/op.html
visual representation: http://ab.HGF123EDC.ij/kl/mn/op.html
Numbers are written ltr in all cases, but are treated as an additional embedding inside a run of rtl characters. This is completely consistent with usual bidirectional text.
Example 8 (not allowed): Numbers at the start or end of a rtl component:
logical representation: http://ab.cd.ef/GH1/2IJ/KL.html
visual representation: http://ab.cd.ef/LK/JI1/2HG.html
The sequence '1/2' is interpreted by the bidi algorithm as a fraction, fragmenting the components and leading to confusion. There are other characters that are interpreted in a special way close to numbers, in particular '+', '-', '#', '$', '%', ',', '.', and ':'.
Example 9 (not allowed): The numbers in the previous example are percent-encoded:
logical representation: http://ab.cd.ef/GH%31/%32IJ/KL.html
visual representation (Hebrew): http://ab.cd.ef/%31HG/LK/JI%32.html
visual representation (Arabic): http://ab.cd.ef/31%HG/%LK/JI32.html
Depending on whether the upper-case letters represent Arabic or Hebrew, the visual representation is different.
Example 10 (allowed, but not recommended):
logical representation: http://ab.CDEFGH.123/kl/mn/op.html
visual representation: http://ab.123.HGFEDC/kl/mn/op.html
Components consisting of only numbers are allowed (it would be rather difficult to prohibit them), but may interact with adjacent RTL components in ways that are not easy to predict.
--
Finally, I assume that subpoint and paragraph correspond to actual structures in the XML document, and therefore you cannot omit their names under the AKN Naming Convention. So:
NOT
> eId="point_64ב_א_ב"
BUT
> eId="point_64ב__subpoint_א__paragraph_ב"
and somewhere there, I do not know exactly, you should place the LRE and/or PDF and/or RLM characters.
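For reference, the three characters named here are U+202A (LRE, LEFT-TO-RIGHT EMBEDDING), U+202C (PDF, POP DIRECTIONAL FORMATTING), and U+200F (RLM, RIGHT-TO-LEFT MARK). Since the right placement is left open above, the following Python sketch shows only one possible approach, and it is an assumption rather than a recommendation from this thread: wrap the whole identifier in LRE ... PDF when showing it to a reader, and strip the control characters again before the value is used as an identifier. The helper names `wrap_ltr` and `strip_bidi_controls` are hypothetical.

```python
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING
RLM = "\u200F"  # RIGHT-TO-LEFT MARK


def wrap_ltr(text: str) -> str:
    """Wrap text in an ltr embedding, for display purposes only."""
    return f"{LRE}{text}{PDF}"


def strip_bidi_controls(text: str) -> str:
    """Remove LRE, PDF, and RLM so the stored identifier stays unchanged."""
    return text.translate({ord(c): None for c in (LRE, PDF, RLM)})


eid = "point_64ב__subpoint_א__paragraph_ב"
shown = wrap_ltr(eid)                       # two characters longer than eid
assert strip_bidi_controls(shown) == eid    # round-trips to the original
```

Keeping the control characters out of the stored value means the XML source itself contains only the visible identifier.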
To view this discussion on the web visit https://groups.google.com/d/msgid/akomantoso-xml/8FAB5A2F-7191-4ACA-BF7B-F2D2589D249F%40gmail.com.
For each component, the following restrictions apply:
1. A component SHOULD NOT use both right-to-left and left-to-right
characters.
2. A component using right-to-left characters SHOULD start and end
with right-to-left characters.
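These two restrictions can be checked mechanically. The sketch below is a rough illustration, not a validator defined by the RFC: the function name `check_component` and the message strings are assumptions, and "right-to-left character" is approximated by the Unicode bidi classes R and AL, "left-to-right character" by class L, as reported by Python's standard `unicodedata` module. Digits and punctuation belong to neither group in this approximation.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # Hebrew-style and Arabic-style strong rtl letters


def check_component(component: str) -> list:
    """Return the restrictions (1 and/or 2 above) that a component breaks."""
    classes = [unicodedata.bidirectional(ch) for ch in component]
    has_rtl = any(c in RTL_CLASSES for c in classes)
    has_ltr = any(c == "L" for c in classes)

    problems = []
    if has_rtl and has_ltr:
        problems.append("uses both right-to-left and left-to-right characters")
    if has_rtl and not (classes[0] in RTL_CLASSES and classes[-1] in RTL_CLASSES):
        problems.append("does not start and end with right-to-left characters")
    return problems


# The fragment identifier suggested earlier in this thread mixes Latin
# and Hebrew letters, so under this approximation it breaks both rules.
print(check_component("point_64ב__subpoint_א__paragraph_ב"))
print(check_component("point_3"))  # no rtl characters: nothing to report
```

Because both restrictions are stated as SHOULD, a non-empty result is a warning about possible display confusion, not a sign that the IRI is invalid.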
Example 8 (not allowed): Numbers at the start or end of a rtl component:
logical representation: http://ab.cd.ef/GH1/2IJ/KL.html
visual representation: http://ab.cd.ef/LK/JI1/2HG.html
The sequence '1/2' is interpreted by the bidi algorithm as a fraction, fragmenting the components and leading to confusion. There are other characters that are interpreted in a special way close to numbers, in particular '+', '-', '#', '$', '%', ',', '.', and ':'.
--
Fabio Vitali
Dept. of Informatics
Univ. of Bologna ITALY
phone: +39 051 2094872
e-mail: fa...@cs.unibo.it
http://vitali.web.cs.unibo.it/

The sage and the fool go to their graves alike in this respect: both believe the sage to be a fool. Where, then, may wisdom be found?
Qi, "Neither Yes nor No", The codeless code
--
You received this message because you are subscribed to the Google Groups "akomantoso-xml" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akomant...@googlegroups.com.
Hi friends!

Fabio and Ashok, thank you very much for the detailed and fast response! I appreciate it.

First, Fabio, of course you are right: it is about href, not eId; that was a copy error. You are also right that I should have written the full path rather than the shortened one. When I write the full path (!main#point_64ב__point_א__point_3 instead of point_64ב_א_3), the problem is solved: I can read the XML file and understand it, and it is no longer mixed.

There are two issues here:
1. How to read the XML document in a text or XML editor.
2. How an internet browser or PDF displays or interprets the XML.

The solutions you mentioned, Fabio, require adding special characters (such as U+202A), which helps the browser's interpretation but makes the XML document itself less human-readable. That loss of readability in the XML document itself is the main thing that bothered me.

Beyond that, I didn't understand some of the RFC's guidelines.

For each component, the following restrictions apply:
1. A component SHOULD NOT use both right-to-left and left-to-right
characters.
2. A component using right-to-left characters SHOULD start and end
with right-to-left characters.

It says that a component SHOULD NOT contain both RTL and LTR characters, and that a path SHOULD NOT end with an RTL letter if it starts with an LTR one. But there are paths that end this way, such as !main#point_64ב__point_א. Is this not allowed?
The above restrictions are given as shoulds, rather than as musts.
For IRIs that are never presented visually, they are not relevant.
However, for IRIs in general, they are very important to ensure
consistent conversion between visual presentation and logical
representation, in both directions.
In the RFC text that Ashok quoted above, Example 8 reads:

Example 8 (not allowed): Numbers at the start or end of a rtl component:
logical representation: http://ab.cd.ef/GH1/2IJ/KL.html
visual representation: http://ab.cd.ef/LK/JI1/2HG.html
The sequence '1/2' is interpreted by the bidi algorithm as a fraction, fragmenting the components and leading to confusion. There are other characters that are interpreted in a special way close to numbers, in particular '+', '-', '#', '$', '%', ',', '.', and ':'.

It says that a component SHOULD NOT end with a number, as in !main#point_64ב__point_א__point_3. Is this not allowed?