Experimental bidirectional support for Arabic and Hebrew

Neil Hodgson

May 24, 2018, 12:02:00 AM
to scite-i...@googlegroups.com, Scintilla mailing list
Scintilla has recently added some support for bidirectional text on Windows and a ‘bidirectional’ property is now available in SciTE to control it. Bidirectional support allows text that contains Arabic or Hebrew characters (which are drawn from right to left) to be edited more correctly.

The main changes in bidirectional mode are that selections and indicators should cover the correct characters, which may mean that two visual ranges are drawn for a single logical range. Carets should be drawn in the correct location. This means that pressing the right arrow to move through Arabic text will actually move the caret to the left. Mouse clicks and hovers should resolve to the correct position, so clicking will show a caret near the click in most cases.
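
To illustrate why one logical selection can be drawn as two visual ranges, here is a minimal sketch. It is not Scintilla's code: the function VisualRanges and the visualOrder vector (which gives, for each visual slot from left to right, the logical index of the character drawn there) are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Scintilla.
// visualOrder[v] is the logical index of the character drawn in visual
// slot v, with slots numbered from left to right.
// Returns half-open visual slot ranges [first, second) that together cover
// the logical selection [selStart, selEnd).
std::vector<std::pair<size_t, size_t>> VisualRanges(
    const std::vector<size_t> &visualOrder, size_t selStart, size_t selEnd) {
    assert(selStart <= selEnd);
    std::vector<std::pair<size_t, size_t>> ranges;
    for (size_t v = 0; v < visualOrder.size(); v++) {
        const size_t logical = visualOrder[v];
        const bool selected = logical >= selStart && logical < selEnd;
        if (!selected)
            continue;
        if (!ranges.empty() && ranges.back().second == v) {
            ranges.back().second = v + 1;   // extend the current visual run
        } else {
            ranges.emplace_back(v, v + 1);  // start a new visual run
        }
    }
    return ranges;
}
```

For a line holding three Latin letters followed by three Arabic letters, the visual order would be {0, 1, 2, 5, 4, 3}. Selecting logical positions 2 to 4 then covers visual slot 2 and visual slot 5, which are not adjacent, so two rectangles are needed.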

The implementation works with the most common modes but has issues with less common settings such as virtual space. In bidirectional mode, selections in virtual space may not be visible but the caret should be. However, if the last character on a line is Arabic then a mouse click will place the caret at the line end - not in virtual space.

The ‘bidirectional’ property may be set to
0 - disabled, like previous SciTE
1 - enabled, with text left-justified
2 - enabled, with text right-justified (*not yet implemented*)
SciTE should be using DirectWrite (technology=1) and the file must be UTF-8.
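
As a sketch of the settings described above, a user options file might contain the lines below. The bidirectional and technology values come from this message; code.page=65001 is the usual SciTE setting for treating files as UTF-8 and is an assumption here about how the UTF-8 requirement would be met.

```
# Experimental: enable bidirectional text with left-justified layout
bidirectional=1
# Draw with DirectWrite
technology=1
# Assumption: treat files as UTF-8
code.page=65001
```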

The limitations mean that this mode is experimental and the ‘bidirectional’ property will not be described in the documentation for now.

The code is in the repository. Test source and executables are available from:
https://www.scintilla.org/scite.zip Source
https://www.scintilla.org/wscite.zip Windows executable

Most of the work on bidirectional text was contributed by Uniface and implemented by Raghda Morsy.

Neil

Raghda Morsy

May 24, 2018, 11:34:07 AM
to scintilla-interest
Thank you so much Neil, I'm excited to see others' feedback :)

Florian Balmer

May 27, 2018, 2:56:18 PM
to scintilla...@googlegroups.com
Now that's great news!

It's amazing to see "runs" of right-to-left text rendered with correct
right-to-left layout. And, with 'bidirectional=1', it's surprising to see
the caret perform "unexpected" jumps (at least, unexpected for me).

But here my use as a tester is already exhausted. Unfortunately, I do not
have the slightest idea of any language with right-to-left layout, and I
have never been using a right-to-left capable text editor, so far (except,
maybe Word, but not to edit bidirectional text). For my language, the
character sets of Windows-1252 or ISO-8859-1 are already sufficient.

I hope you will get a lot of feedback from people with knowledge of Arabic
and Hebrew.

I'm closely following anything posted to scintilla-interest, and I'm
periodically having a look at the Scintilla commit history. I know that
Scintilla as a whole, not only the very latest bidirectional support, is
hard work, with a lot of routine maintenance, and it's consistently being
delivered at the highest level, with a lot of enthusiasm.

After following this project for more than 15 years, I'm still very
impressed. Thank you very much for all your appreciated efforts!

My ideas for Notepad2 would almost be enough for a second life, but it's
just a hobby, and it has to share its dedicated time with other things.
Nevertheless, I hope I will be able to do some more work on Notepad2, and
in any case, it will be a pleasure to use Scintilla!

--Florian

Raghda Morsy

May 27, 2018, 4:43:51 PM
to scintilla-interest
Thank you Florian for your support :) 

Neil Hodgson

May 27, 2018, 7:43:26 PM
to scintilla...@googlegroups.com
Each method in the bidirectional implementation on Win32 contains some common code:

* get the initial font;
* create a wide string version of the line’s text;
* create an IDWriteTextLayout;
* fill out the text layout with each character’s font.

Then, each method extracts the information it wants from that IDWriteTextLayout, performing some correlation between char and wide strings.

The shared setup code should be extracted into a common function or class.

Further, the primary result, an IDWriteTextLayout, is independent of the drawing surface and could be cached for re-use. Because a single line redraw may draw a selection with a caret and a couple of indicators, this would amortize the cost of creating the IDWriteTextLayout. The same benefit applies between draws: when a caret blinks or moves, it can re-use the IDWriteTextLayout from the previous draw.
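
The caching idea can be sketched in platform-neutral C++. This is only an illustration under stated assumptions: LineLayoutSketch stands in for a platform object such as IDWriteTextLayout, LayoutCacheSketch and its Builder callback are hypothetical names, and the sketch keys only on line number and text, ignoring font and style changes that a real cache would also have to track.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a platform layout object such as IDWriteTextLayout.
struct LineLayoutSketch {
    std::string text;   // the line text the layout was built from
};

// Hypothetical cache, not Scintilla's implementation: keeps one layout per
// document line and rebuilds it only when that line's text has changed.
class LayoutCacheSketch {
public:
    using Builder =
        std::function<std::shared_ptr<LineLayoutSketch>(const std::string &)>;

    explicit LayoutCacheSketch(Builder builder) : builder_(std::move(builder)) {}

    std::shared_ptr<LineLayoutSketch> Get(size_t line, const std::string &text) {
        auto it = cache_.find(line);
        if (it != cache_.end() && it->second->text == text)
            return it->second;          // reuse for selection, caret, indicators
        auto layout = builder_(text);   // the expensive step, once per change
        assert(layout);
        cache_[line] = layout;
        return layout;
    }

    // Drop the cached layout for a line, for example after a style change.
    void Invalidate(size_t line) { cache_.erase(line); }

private:
    Builder builder_;
    std::map<size_t, std::shared_ptr<LineLayoutSketch>> cache_;
};
```

With this shape, drawing a selection, a caret and two indicators on the same unchanged line calls the builder once rather than four times, and a later caret blink reuses the same object.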

Some experimenting on macOS has shown that it can also work similarly, with a Core Text CTLine object fulfilling the same role as IDWriteTextLayout on Win32. Each platform would have its own text layout class that provided a common interface. This may not work so well if a platform needs a drawing surface for measuring text but that doesn’t appear to be the case nowadays with resolution-independent text positioning.
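
A common interface of this kind might look roughly like the sketch below. All names here (ITextLayoutSketch, Width, IndexFromX, FixedWidthLayoutSketch) are hypothetical and are not taken from Scintilla or from the attached patches. The fixed-width class only stands in for a real platform implementation that would wrap an IDWriteTextLayout or a CTLine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical surface-independent layout interface.
class ITextLayoutSketch {
public:
    virtual ~ITextLayoutSketch() = default;
    // Total width of the laid-out line.
    virtual double Width() const = 0;
    // Character index nearest the left of horizontal position x.
    virtual size_t IndexFromX(double x) const = 0;
};

// Toy implementation: every character has the same width and runs left to
// right. A real platform class would query its native layout object instead.
class FixedWidthLayoutSketch : public ITextLayoutSketch {
public:
    FixedWidthLayoutSketch(std::string text, double charWidth)
        : text_(std::move(text)), charWidth_(charWidth) {
        assert(charWidth_ > 0.0);
    }

    double Width() const override {
        return charWidth_ * static_cast<double>(text_.size());
    }

    size_t IndexFromX(double x) const override {
        if (x <= 0.0)
            return 0;
        const size_t index = static_cast<size_t>(x / charWidth_);
        return std::min(index, text_.size());   // clamp to the line end
    }

private:
    std::string text_;
    double charWidth_;
};
```

Neither method takes a drawing surface, which is the property that would allow such an object to be created once and kept between draws.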

The attached patch goes part way towards this: it encapsulates the IDWriteTextLayout in a ScreenLineLayout object but retains the current Surface interface, not allowing the ScreenLineLayout to be extracted and cached.

Neil
ScreenLineCommon.patch

Neil Hodgson

May 29, 2018, 9:11:16 AM
to scintilla...@googlegroups.com
Attached is an initial patch implementing the bidirectional methods for Cocoa.

This is to confirm that the interface is not just Windows-only - the first port is commonly the most difficult and likely to reveal poor assumptions.

With Core Text, Cocoa’s text subsystem, there is no equivalent of the DirectWrite HitTestTextRange call but the same information can be retrieved with some deeper examination of the layout at the glyph-run level.

The SurfaceImpl::MeasureWidths method was changed to match DirectWrite’s behaviour for bidirectional text - the positions returned are in ascending order based on the absolute value of the width (or advancement) of each character. Thus, the positions returned by MeasureWidths can’t be used for placing the caret or selection for R2L text. In bidirectional mode that information is provided by the new methods. It’s important that the last element of the returned positions is the width of that text run so that the runs can be placed on screen.
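
The ascending-positions behaviour can be illustrated with a small sketch. This is not SurfaceImpl::MeasureWidths itself: CumulativePositions is a hypothetical helper, and the use of a signed advance per character (negative for R2L characters) is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, not Scintilla code.
// advances[i] is the signed advance of character i; in this illustration an
// R2L character is assumed to report a negative advance.
// The result holds cumulative end positions built from absolute values, so
// it never decreases and its last element is the width of the whole run.
std::vector<double> CumulativePositions(const std::vector<double> &advances) {
    std::vector<double> positions;
    positions.reserve(advances.size());
    double total = 0.0;
    for (const double advance : advances) {
        total += std::fabs(advance);
        positions.push_back(total);
    }
    return positions;
}
```

For advances of 8, 8, -6 and -6 the positions are 8, 16, 22 and 28. The final value gives the run width needed to place the next run, but the intermediate values do not say where the R2L characters are drawn, which is why caret and selection placement needs the separate bidirectional methods.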

Core Text automatically flips neutral characters around to follow right-to-left characters where DirectWrite doesn’t. So the comment
//ج
appears as
ج//
It’s possible that the direction of neutral characters could be changed with Core Text if needed.

An image of this code running can be seen at the following URL. It’s an unreasonably messy example that tries to exercise some edge cases, but it shows how the single selection drawn as two rectangles on line 9 matches several two-rectangle locations marked with dotted outlines.
https://www.scintilla.org/CocoaBidiMarks.png

It’s unlikely I’ll work on adding implementations for GTK+ or Qt at this time but other contributors may be interested.

Some parts of the patch may be inefficient and could be optimized if needed.

Neil
ScreenLayoutMac3.patch

Neil Hodgson

May 30, 2018, 7:21:51 PM
to scintilla...@googlegroups.com
The approach of returning a cacheable ScreenLineLayout object from Surface appears viable. The attached patch implements this for both Win32 DirectWrite and Cocoa.

Unless problems are raised for implementing this on other platforms, it will be committed soon.

Neil
ScreenLineCommon4.patch

Neil Hodgson

Jun 1, 2018, 7:45:31 PM
to Scintilla mailing list
The IScreenLineLayout approach has been committed.

Bidirectional text has been implemented on Cocoa.

Selections in virtual space are now visible when bidirectional mode is active. However, mouse clicks in virtual space do not work when the final visible character on the line is R2L.

Commits:
https://sourceforge.net/p/scintilla/code/ci/23a98ab36601cce0d19db64be427cd5de71f1957/
https://sourceforge.net/p/scintilla/code/ci/5f4011e010f96cf2a2c033d960e8d76fff2fce9f/
https://sourceforge.net/p/scintilla/code/ci/0d86879c5ca5ea41ee142072e89d6f8cffa2586e/

Neil
