RECONVERT STRING for CJK IMES on GTK


johnsonj

Aug 6, 2015, 4:59:56 AM
to scintilla-interest
I have managed to get an initial working version on GTK.
With a Korean IME, the Hanja candidate box can now be shown for an already committed Hangul character.

IMErecovert-gtk-0806.patch

johnsonj

Aug 8, 2015, 6:01:51 AM
to scintilla-interest
Demonstration on lubuntu 15.04, gtk2.4, with fcitx:


https://www.youtube.com/watch?v=OUQAR9bYzGk&feature=youtu.be


johnsonj

Aug 8, 2015, 9:43:09 PM
to scintilla-interest
(O = works, X = does not work)

reconvert
-----------------------------------
          qt4.8     qt5.3
ibus        X         X
fcitx       O         O
* ibus does not go well with qt5.3


gtk2.4
---------------------------------------------
          hanja convert     reconvert
ibus            O               X
fcitx           O               O

* So qt5.X + fcitx appears to be a good testing environment on Linux.


windows
---------------------------------------------
              win32     qt4     qt5
reconvert       O        X       X
* It currently appears impossible to support reconversion with Qt on Windows.

johnsonj

Aug 12, 2015, 1:54:54 AM
to scintilla-interest
It works nicely!
reconvert-gtk-0812.patch

johnsonj

Aug 12, 2015, 10:10:06 PM
to scintilla-interest
Removed support for overstrike.
Extended the surrounding text to the paragraph.
Added support for multiple carets.

reconvert-gtk-0813.patch

johnsonj

Aug 14, 2015, 7:03:13 AM
to scintilla-interest

streamlined


reconvert-gtk-0814.patch

Neil Hodgson

Aug 17, 2015, 9:15:34 AM
to scintilla...@googlegroups.com
johnsonj:

> streamlined

Yes, it’s worthwhile implementing the context commands for GTK+ as well as Qt.

However, the documentation for gtk_im_context_set_surrounding says not to include the preedit string in the text, but the preedit string does not appear to be cut out of the text. Perhaps there isn’t any preedit text at this point, but it’s not clear. I don’t see the justification for different treatment of empty and non-empty selections.

Neil
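
The point about the preedit string can be illustrated with a small sketch. It assumes the composition (preedit) text sits inline in the text being fed and that its byte position and length are known. `Surrounding` and `WithoutPreedit` are hypothetical names used only for illustration; they are not Scintilla or GTK API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper types, for illustration only.
struct Surrounding {
    std::string text;  // text that would be handed to the IME
    size_t cursor;     // byte index of the caret within text
};

// Cut an inline preedit run out of the text and move the caret accordingly.
Surrounding WithoutPreedit(const std::string &text, size_t cursor,
                           size_t preeditStart, size_t preeditLength) {
    Surrounding result{text, cursor};
    if (preeditStart >= text.size() || preeditLength == 0)
        return result;  // nothing to remove
    // Never remove more than what is actually there.
    preeditLength = std::min(preeditLength, text.size() - preeditStart);
    result.text.erase(preeditStart, preeditLength);
    if (cursor >= preeditStart + preeditLength)
        result.cursor = cursor - preeditLength;  // caret was after the preedit
    else if (cursor > preeditStart)
        result.cursor = preeditStart;            // caret was inside the preedit
    return result;
}
```

The resulting text and cursor index are what a retrieve-surrounding handler could then pass on, if the preedit really is present in the buffer at that point.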

johnsonj

Aug 17, 2015, 10:42:48 AM
to scintilla-interest
"""However, the documentation for gtk_im_context_set_surrounding says not to include the preedit string in the text, but the preedit string does not appear to be cut out of the text."""
In my experiments with the Korean IME, it returns the correct result even while composing.


"""I don’t see the justification for different treatment of empty and non-empty selections."""
Yes, you are right.

The GTK primary selection handling in Scintilla does not seem to be synchronized with Scintilla's main selection.
For example, GTK may keep all selections while reconversion feeds only the main selection.

Currently this implementation does not work well with multiple selections.
To reproduce: place multiple carets with Ctrl+click, then select ranges with Shift+arrow keys.
Unfortunately the reconversion key then gets no response.
If the ranges are selected one by one, it does respond.

Maybe this is why selections have to be handled separately.
I am sure there is some interaction between the GTK selection event and gtk_im_context_set_surrounding(),
but I cannot figure it out yet.



johnsonj

Aug 18, 2015, 12:42:43 AM
to scintilla-interest
I have traced it to:
CopySelectionRange(&primary);

All ranges go into primary,
which never matches the main range used for reconversion.
If the reconversion selection is set up the same way CopySelectionRange() does it,
reconversion responds with multiple simultaneous selections.
But the result is that the characters of all ranges, as the preedit string, fill each range respectively.

johnsonj

Aug 18, 2015, 9:34:47 AM
to scintilla-interest
Removed the selection check.
Solved the mystery of the surrounding text.

Selecting multiple ranges at the same time still gets no response from the reconversion key.
It must be related to the GTK selection.

Giving GTK only the main selection makes it respond.
I think I need your guidance here.




reconvert-gtk-0818-NoSelection.patch

johnsonj

Aug 19, 2015, 2:23:36 AM
to scintilla-interest
This is only a test to make simultaneous multiple selections respond to reconversion.
It may have a bigger impact than I can take responsibility for.

Please advise on how to proceed.

void ScintillaGTK::SelectionGet(GtkWidget *widget,
                                GtkSelectionData *selection_data, guint info, guint) {
    ScintillaGTK *sciThis = ScintillaFromWidget(widget);
    try {
        //Platform::DebugPrintf("Selection get\n");
        if (SelectionOfGSD(selection_data) == GDK_SELECTION_PRIMARY) {
            if (sciThis->primary.Empty()) {
                if (sciThis->sel.Count() > 1) {
                    // Experimental: with multiple selections, give PRIMARY only the
                    // main range so that it matches what reconversion feeds to the IME.
                    int start = sciThis->sel.RangeMain().Start().Position();
                    int end = sciThis->sel.RangeMain().End().Position();
                    std::string text = sciThis->RangeText(start, end);
                    sciThis->primary.Copy(text, sciThis->pdoc->dbcsCodePage,
                             sciThis->vs.styles[STYLE_DEFAULT].characterSet, false, false);
                } else {
                    // Single selection: copy the selection into PRIMARY as before.
                    sciThis->CopySelectionRange(&sciThis->primary);
                }
            }
            sciThis->GetSelection(selection_data, info, &sciThis->primary);
        }
    } catch (...) {
        sciThis->errorStatus = SC_STATUS_FAILURE;
    }
}


johnsonj

Aug 20, 2015, 7:56:03 AM
to scintilla-interest

johnsonj

Aug 20, 2015, 10:04:59 PM
to scintilla-interest
+    if (baseStart == mainEnd)
+        return FALSE;
+


reconvert-gtk-0821.patch

johnsonj

Aug 21, 2015, 2:43:22 AM
to scintilla-interest
Sorry for the experimental code that was introduced.
It will be removed in a later patch.

johnsonj

Aug 28, 2015, 8:40:55 AM
to scintilla-interest
It has been a long time!
This follows the Qt patch.

rc-gtk-0828-LTRselection.patch

johnsonj

Sep 4, 2015, 4:40:05 AM
to scintilla-interest
Your instructions worked well.
They solve the 'randomly works' problem.
I am very pleased.
Thank you very much.

But one problem still remains.
With multiple simultaneous selections,
the GTK primary selection contains all ranges while Scintilla feeds only the main range,
so reconversion of multiple selections does not work.

This problem appears to be beyond me.

rc-gtk-0904.patch

Neil Hodgson

Sep 7, 2015, 2:43:04 AM
to scintilla...@googlegroups.com
johnsonj:

> But one problem still remains.
> With multiple simultaneous selections,
> the GTK primary selection contains all ranges while Scintilla feeds only the main range,
> so reconversion of multiple selections does not work.



> <rc-gtk-0904.patch>

To me, this is another case where the platform API should be followed simply instead of trying to influence its behaviour in an indirect way. Respond to retrieve-surrounding with a call to gtk_im_context_set_surrounding with some text instead of bouncing out for overstrike over line end. It’s unlikely that refusing to return text in this case will be robust over all languages and future platform changes. Good support for languages like Thai (as mentioned in the GTK+ documentation) may require changes here that will interact poorly. It appears to be a simple (no side effect) retrieval query so moving the candidate window may not be reasonable.

Neil

johnsonj

Sep 7, 2015, 7:57:43 PM
to scintilla-interest
"""To me, this is another case where the platform API should be followed simply
instead of trying to influence its behaviour in an indirect way."""

Sorry, I am implementing this with a lot of trial and error.


"""Respond to retrieve-surrounding with a call to gtk_im_context_set_surrounding with some text
instead of bouncing out for overstrike over line end."""

No, whether reconversion is allowed or not should be controlled in the retrieve-surrounding handler.


"""It’s unlikely that refusing to return text in this case will be robust over all languages and
future platform changes. Good support for languages like Thai (as mentioned in the GTK+
documentation) may require changes here that will interact poorly."""

Yes, you are right. I just wanted to show you that this version does not work with multiple lines or overstrike.


"""It appears to be a simple (no side effect) retrieval query so moving the candidate window may not be reasonable."""

No, if there is no selection, it may show the candidate box for choosing hanja in the Korean IME.

CONCLUSION: Limit the reconversion base to one line, without the EOL. Is this the way to go?

johnsonj

Sep 8, 2015, 5:43:21 AM
to scintilla-interest
reconversion - one line without EOL.
replacement - paragraphs.

rc-gtk-0908-commit.patch
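
A minimal sketch of the "one line without EOL" rule, in plain C++ with no GTK dependency. `LineFeed` and `LineSurrounding` are hypothetical names, not taken from the patch. The sketch assumes UTF-8 text and byte offsets, since gtk_im_context_set_surrounding takes its cursor index in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type, for illustration only.
struct LineFeed {
    std::string text;    // the line containing the caret, EOL excluded
    size_t cursorIndex;  // caret offset in bytes from the start of that line
};

// Extract the caret's line (without "\r" or "\n") from a whole document.
LineFeed LineSurrounding(const std::string &doc, size_t caret) {
    size_t lineStart = 0;
    if (caret > 0) {
        // Last EOL character strictly before the caret, if any.
        const size_t prevEol = doc.find_last_of("\r\n", caret - 1);
        if (prevEol != std::string::npos)
            lineStart = prevEol + 1;
    }
    // First EOL character at or after the caret, else end of document.
    size_t lineEnd = doc.find_first_of("\r\n", caret);
    if (lineEnd == std::string::npos)
        lineEnd = doc.size();
    return {doc.substr(lineStart, lineEnd - lineStart), caret - lineStart};
}
```

With the document "first\nsecond line\r\nthird" and the caret at byte 9, this yields the text "second line" and a cursor index of 3.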

johnsonj

Sep 9, 2015, 10:43:37 PM
to scintilla-interest
One line without EOL, according to gtktextview.c.
It works nicely with emoji.
====================================================================================
static gboolean
gtk_text_view_retrieve_surrounding_handler (GtkIMContext *context,
                                            GtkTextView  *text_view)
{
  GtkTextIter start;
  GtkTextIter end;
  gint pos;
  gchar *text;

  gtk_text_buffer_get_iter_at_mark (text_view->priv->buffer, &start,
                                    gtk_text_buffer_get_insert (text_view->priv->buffer));
  end = start;

  pos = gtk_text_iter_get_line_index (&start);
  gtk_text_iter_set_line_offset (&start, 0);
  gtk_text_iter_forward_to_line_end (&end);

  text = gtk_text_iter_get_slice (&start, &end);
  gtk_im_context_set_surrounding (context, text, -1, pos);
  g_free (text);

  return TRUE;
}

johnsonj

Sep 9, 2015, 10:44:24 PM
to scintilla-interest
patch attached

rc-gtk-0910.patch

johnsonj

Sep 10, 2015, 10:06:18 AM
to scintilla-interest

johnsonj

Sep 14, 2015, 10:07:14 AM
to scintilla-interest
Changed to feed the whole line for reconversion.


johnsonj

Sep 14, 2015, 10:08:03 AM
to scintilla-interest
patch attached

rc-gtk-0915.patch

johnsonj

Sep 20, 2015, 6:28:51 AM
to scintilla-interest
                 * Thai language usage of IME
          commit     preedit     delete surrounding
----------------------------------------------------------------------------------------
gtk         O           X                X
qt          X           X                X
----------------------------------------------------------------------------------------

The Thai IME cannot serve as the reference model for reconversion that the GTK documentation describes.


johnsonj

Sep 30, 2015, 6:28:17 AM
to scintilla-interest
I have reinstalled Fedora 22 in VirtualBox to test reconversion.

Here is a demo of GTK reconversion on Fedora 22:
https://www.youtube.com/watch?v=-h9nzk5V-I4

It works as I expected.

I find that ibus is supposed to support reconversion.
It just does not work well.

The current state of ibus:
1. It interferes with fcitx.
2. It does not work on Qt5.
3. It does not work well for reconversion.

johnsonj

Oct 26, 2015, 3:49:36 AM
to scintilla-interest
Welcome to lubuntu 15.10!
Now it defaults to fcitx instead of ibus.

johnsonj

Nov 11, 2015, 6:25:08 PM
to scintilla-interest
Reporting on reconversion with SCIM.
1. It has retrieve-surrounding but not delete-surrounding.
2. Reconversion for Japanese works well on Qt and GTK with some patching.

It turns out that reconversion can work without delete-surrounding.

* With Anthy, SCIM calls back 4 times for every keystroke.
For testing reconversion, I made Scintilla accept only the first call.

Sorry about SCIM.


johnsonj

Jan 18, 2016, 12:07:58 AM
to scintilla-interest
I found a valuable web page explaining the Thai reconversion support in ibus.

http://blog.du-a.org/2010/10/29/ibus-and-surrounding-text/

Although I tested on lubuntu 15.10, it works just as the web page explains.
As far as I can tell, the “Thai – kesmanee (m17n)” engine really does use 'retrieve-surrounding' and 'delete-surrounding'.

* ibus-anthy does not work though. I guess anthy has an interaction bug with ibus.
* fcitx has no “Thai – kesmanee (m17n)” engine.
But this is beside my interest.

What I want to say is that I have finally found a way to test Thai reconversion support: the “Thai – kesmanee (m17n)” engine.

johnsonj

Jan 20, 2016, 6:59:57 AM
to scintilla-interest
Reporting: tested under ibus 1.5.10, lubuntu 15.10, Qt4.6.8, Gtk2.x with “Thai – kesmanee (m17n)”.

https://youtu.be/s5ea0tngnlo

It works the same as in gedit and kate.

But ScintillaQt does not respond to the retrieve signal, while kate works well.

ibus is very strange: it works only for “Thai – kesmanee (m17n)”,
not for Korean or Japanese.

Since ibus is likely to get entangled with fcitx,
I will remove it completely.
So I am keeping a record here of one successful case of reconversion.



johnsonj

Jan 26, 2016, 8:09:34 AM
to scintilla-interest
Just keeping a record,
from fcitx\fcitx-anthy-0.2.2\src\utils.cpp
-----------------------------------------------------------------------------------------------------------------------------
static bool search_anchor_pos_forward(
...
   size_t offset = fcitx_utf8_get_nth_char(surrounding_text.c_str(), cursor_pos) - surrounding_text.c_str();

    std::string new_start = surrounding_text.substr(offset);

    if (new_start.compare(0, new_start.size(), selected_text) != 0) {
        return false;
    }
...
------------------------------------------------------------------------------------------------------------------------------
1. new_start runs from offset to the end of surrounding_text.
2. new_start must equal selected_text.

So the end of surrounding_text must coincide with the end of selected_text if anchor != caret.
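
The consequence of that check can be reproduced in isolation. This is only a sketch: `AnchorFoundForward` is a hypothetical stand-in for the quoted fragment, and it uses plain byte offsets instead of fcitx_utf8_get_nth_char, so it is valid for ASCII input only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the quoted check (ASCII only, byte offsets).
// Returns true when the text from the cursor to the END of the surrounding
// text is exactly the selected text.
bool AnchorFoundForward(const std::string &surroundingText,
                        size_t cursorPos,
                        const std::string &selectedText) {
    const std::string newStart = surroundingText.substr(cursorPos);
    return newStart.compare(0, newStart.size(), selectedText) == 0;
}
```

With "def" selected in "abc def ghi" (cursor at byte 4), feeding the whole line fails the check because new_start is "def ghi". Feeding only "abc def", so that the surrounding text stops at the selection end, passes.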


johnsonj

Feb 6, 2016, 6:45:04 AM
to scintilla-interest
I am not sure this is the right thread, but
I reported the feedEndLimit problem to the fcitx-anthy group.

https://groups.google.com/forum/#!topic/fcitx/o4xCbx-wSp0

There is also a point of disagreement.
That is, fcitx should fetch the selection info from GDK_SELECTION_PRIMARY, rather than from GTK_TEXT_VIEW.

https://groups.google.com/forum/#!topic/fcitx/s5Q-r7Ojs1c

To support my argument, ScintillaGTK should supply GDK_SELECTION_PRIMARY in UTF-8 encoding.

Neil Hodgson

Feb 8, 2016, 1:28:58 AM
to scintilla...@googlegroups.com
johnsonj:

> There is also a point of disagreement.
> That is, fcitx should fetch the selection info from GDK_SELECTION_PRIMARY, rather than from GTK_TEXT_VIEW.
>
> https://groups.google.com/forum/#!topic/fcitx/s5Q-r7Ojs1c
>
> To support my argument, ScintillaGTK should supply GDK_SELECTION_PRIMARY in UTF-8 encoding.

   This is heading into difficult areas. A more robust approach may be to talk to the GTK+ developers to add anchor support.

   Neil

johnsonj

Feb 8, 2016, 1:57:57 AM
to scintilla-interest
Neil:
   """This is heading into difficult areas. A more robust approach may be to talk to the GTK+ developers to add anchor support."""
Probably, maybe.
But in fact there is no need to give the anchor position to the IME.
The IME should check that the selection is inside the surrounding text.
I am sure the selection is supposed to be provided through PRIMARY.

Unfortunately fcitx fails, since it tries to get the selection from GtkTextView.
Fortunately fcitx-anthy succeeds in getting the selection from PRIMARY,
so ScintillaGTK is able to work with the surrounding-text feature.
(I marvel at how well ScintillaGTK implements PRIMARY.)

I am looking forward to fcitx-anthy fixing the feedEndLimit bug.

johnsonj

Feb 8, 2016, 10:01:29 PM
to scintilla-interest
I have finally got reconversion with mozc working on GTK.

https://youtu.be/xBKrCzJlfVQ

johnsonj

Feb 9, 2016, 8:03:54 AM
to scintilla-interest

johnsonj

Feb 9, 2016, 8:05:01 AM
to scintilla-interest
mozc turned out not to have the anchor bug in reconversion.

johnsonj

Mar 8, 2016, 2:16:22 AM
to scintilla-interest
Applied what has been found so far.
Tested on lubuntu 15.10 and fedora 23.
----------------------------------------------------------
fcitx-hangul
fcitx-anthy
fcitx-mozc
* fcitx-thai has no kesmanee (m17n)
----------------------------------------------------------
ibus-hangul
ibus-mozc
ibus-thai “Thai – kesmanee (m17n)”
* ibus-anthy has no retrieve-surrounding feature implemented
----------------------------------------------------------
scim-hangul
scim-anthy
* scim needs fixing for normal IME function.
* scim has no delete-surrounding feature implemented.
----------------------------------------------------------


rcGTK0308.patch

johnsonj

Oct 10, 2016, 2:55:58 AM
to scintilla-interest
Reporting: accented input with the reconvert string feature.

I guessed that accented input is like the hanja input mode in the Korean IME.
So I tried adding a mapping table of accented characters to mssymbol.txt, which is a mapping table for special characters.

DEMO: https://youtu.be/2WfqPA9rHiI

It turns out that it works well with reconvert string, without any coding.
It is interesting.

Neil Hodgson

Oct 11, 2016, 4:05:25 AM
to Scintilla mailing list
johnsonj:

> I guessed that accented input is like the hanja input mode in the Korean IME.
> So I tried adding a mapping table of accented characters to mssymbol.txt, which is a mapping table for special characters. …
> It turns out that it works well with reconvert string, without any coding.
> It is interesting.

Reconversion isn’t a known function to users of European languages I think.

Neil

johnsonj

Oct 11, 2016, 6:55:38 AM
to scintilla-interest
Here is the Quick Phrase feature in fcitx.

https://fcitx-im.org/wiki/QuickPhrase

It works on the preedit string:
it is similar to the abbreviation feature of Scintilla, but with a candidate box.
I imagine it would be interesting to combine Quick Phrase with reconvert string.

johnsonj

Oct 11, 2016, 8:06:06 AM
to scintilla-interest
Accented input with Quick Phrase feature of Fcitx.

DEMO: https://youtu.be/AJ-m5u0yf-o

Welcome Candidate box!

johnsonj

Dec 11, 2016, 8:14:49 AM
to scintilla-interest
Catch up:


rcGtk1212.patch

Neil Hodgson

Dec 13, 2016, 6:21:25 PM
to scintilla...@googlegroups.com
johnsonj:

> Catch up:
> <rcGtk1212.patch>

Is it likely the fcitx bugs will be fixed soon? Including workarounds for bugs in fcitx could make Scintilla respond poorly when fcitx is fixed or to other IMEs that do not have the same problems.

Neil

johnsonj

Dec 14, 2016, 6:10:27 AM
to scintilla-interest
"""  Is it likely the fcitx bugs will be fixed soon? """

I think it is not likely.
You can see the fcitx author's opinion about PRIMARY in the thread below,
where I asked him how fcitx determines the anchor position when the focused widget is not a GtkTextView.
https://groups.google.com/forum/#!topic/fcitx/s5Q-r7Ojs1c

I tried out a way to guess the anchor position on Qt.
I think it works well, with no workaround.
https://groups.google.com/forum/#!topic/fcitx/rAZOsxGmNu8

I implemented this idea on Windows.
https://bugreports.qt.io/browse/QTBUG-50791


""" Including workarounds for bugs in fcitx could make Scintilla respond poorly when fcitx is fixed or to other IMEs that do not have the same problems. """

I am sure my workaround does not adversely affect Scintilla, fcitx or other IMEs.
It just performs, in advance, sanity checks on behalf of the IMEs.

+    // Temporary work around for fcitx bugs.
+    if (!sel.Empty()) {
+        // Limit surrounding for anchor position.
+        // Check if selection is in surrounding.
+        int selectionEnd = std::max(mainCaret, sel.MainAnchor());
+        feedEnd = std::min(feedEnd, selectionEnd);
+    }

At any rate, this workaround is needed for as long as fcitx remains unfixed.
If fcitx is fixed later, it is fine either to delete this workaround or to leave it in.

But I do not expect that to happen.
This workaround is especially necessary for Gtk.
So I want Qt to follow the Gtk path: I am dropping Qt::ImAnchorPosition.
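
The clamp in the workaround can be tried on its own. In this sketch `FeedRange` and `SurroundingRange` are hypothetical names and the positions are plain integers; in the patch the equivalent values come from Scintilla's selection and line positions.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical range type, for illustration only.
struct FeedRange {
    int start;  // first position fed to the IME
    int end;    // one past the last position fed to the IME
};

// Feed the whole line unless there is a selection; with a selection,
// stop the surrounding text at the selection end.
FeedRange SurroundingRange(int lineStart, int lineEnd, int caret, int anchor) {
    FeedRange feed{lineStart, lineEnd};
    if (caret != anchor) {
        const int selectionEnd = std::max(caret, anchor);
        feed.end = std::min(feed.end, selectionEnd);
    }
    return feed;
}
```

For a line spanning positions 0 to 11 with a selection from 4 to 7, the fed range ends at 7 whichever end holds the caret. With an empty selection the whole line is fed.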


