Polished Keyboard API

Yuan Xulei

Jul 13, 2013, 3:10:36 AM

to dev-webapi, Ehsan Akhgari, David Flanagan, "Evan Tseng (曾增仁)", James Ho, Evelyn Hung, Mounir Lamouri, Salvador de la Puente González, Tim Guan-tin Chien, Olli Pettay, Jonas Sicking

Hi all,

I wrapped up an API proposal based on our internal discussion and added
an interface to support listing and switching IMEs.

I also made some changes to the composition APIs.

Some use cases that I think are important are listed below to illustrate
how to use these APIs.

==Use cases==

1. Input Method API - for developing keyboards with complex IMEs

Activating an IME by moving focus:
* The user selects an input element and starts typing.
InputMethodConnection.onstart notifies the IME that text input has started.

Activating an IME by switching to a new IME:
* The keyboard is already active and the user switches to another IME.
InputMethodConnection.onfinish notifies the current IME to close and clean
up, while InputMethodConnection.onstart notifies the new IME that text
input has started.

Erasing multiple characters in one API call instead of simulating it by
sending several backspace keys:
* InputMethodConnection.deleteSurroundingText

Moving the cursor or selecting a specified range of text:
* InputMethodConnection.setSelectionRange

Word suggestion:
* The user types 'go' and the suggested word 'good' is shown.
* When the user selects 'good', 'od' is appended to 'go'. The keyboard app
uses InputMethodConnection.commitText to append the text.
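
The word-suggestion flow above can be sketched roughly as follows. This is only an illustration: `completionSuffix`, `acceptSuggestion` and `createRecordingConnection` are hypothetical helpers, and the connection object is a stand-in that records calls rather than the real InputMethodConnection.

```javascript
// Hypothetical helper (not part of the proposal): returns the part of
// `suggestion` that still has to be committed after the user typed `typed`,
// or null when the suggestion does not start with the typed text.
function completionSuffix(typed, suggestion) {
  if (!suggestion.startsWith(typed)) {
    return null;
  }
  return suggestion.slice(typed.length);
}

// Stand-in for the proposed InputMethodConnection. It only records calls so
// the flow can be followed without a real implementation.
function createRecordingConnection() {
  const calls = [];
  return {
    calls,
    commitText(contextId, text, offset = 0, length = 0) {
      calls.push({ method: "commitText", contextId, text, offset, length });
      return Promise.resolve(true);
    },
  };
}

// Called when the user picks a suggestion from the candidate bar.
function acceptSuggestion(connection, contextId, typed, suggestion) {
  const suffix = completionSuffix(typed, suggestion);
  if (suffix === null) {
    return Promise.resolve(false);
  }
  return connection.commitText(contextId, suffix);
}
```

With 'go' typed and 'good' selected, only 'od' is passed to commitText, so the text already in the field is left untouched.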

Auto-correction:
* InputMethodConnection.commitText could be used to replace the given
string with another string.
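
As a rough sketch of the auto-correction case, the arguments for commitText could be derived from the text before the cursor. The correction table and `correctionArgs` are hypothetical, and the sketch assumes that a negative offset addresses text before the cursor, which the proposal does not spell out.

```javascript
// Hypothetical correction table; a real keyboard would use a dictionary.
const CORRECTIONS = new Map([
  ["teh", "the"],
  ["recieve", "receive"],
]);

// Given the text before the cursor, returns the arguments for a
// commitText(contextId, text, offset, length) call that replaces the last
// word, or null when no correction applies.
// Assumption: a negative offset addresses text before the cursor.
function correctionArgs(beforeText, corrections = CORRECTIONS) {
  const match = /(\S+)$/.exec(beforeText);
  if (match === null) {
    return null; // the cursor is not directly after a word
  }
  const word = match[1];
  const replacement = corrections.get(word);
  if (replacement === undefined) {
    return null; // the word is not in the correction table
  }
  return { text: replacement, offset: -word.length, length: word.length };
}
```

The keyboard would then call commitText(contextId, args.text, args.offset, args.length) whenever `correctionArgs` returns a non-null value.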

Composition:
* The user types '\sum' and the composing text '\sum' is shown via
InputMethodConnection.setComposition.
* The user selects 'Σ' from the candidates view of the keyboard. The
composing text is cleared and 'Σ' is committed by
InputMethodConnection.commitText.
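
The composition sequence above might look like this on the IME side. `Composer` and `createCompositionRecorder` are hypothetical names, and the connection is a recording stand-in, so this is a sketch of the call order rather than a real implementation.

```javascript
// Stand-in for the proposed InputMethodConnection; it only records calls.
function createCompositionRecorder() {
  const calls = [];
  const record = (method, args) => {
    calls.push({ method, ...args });
    return Promise.resolve(true);
  };
  return {
    calls,
    setComposition: (contextId, text, cursor) =>
      record("setComposition", { contextId, text, cursor }),
    commitText: (contextId, text) => record("commitText", { contextId, text }),
  };
}

// Hypothetical IME-side helper that keeps the composing buffer.
class Composer {
  constructor(connection, contextId) {
    this.connection = connection;
    this.contextId = contextId;
    this.buffer = "";
  }

  // Each keystroke extends the composing text shown in the input field.
  type(character) {
    this.buffer += character;
    return this.connection.setComposition(
      this.contextId, this.buffer, this.buffer.length);
  }

  // Picking a candidate commits it; per the proposal, commitText also
  // clears the current composition.
  choose(candidate) {
    this.buffer = "";
    return this.connection.commitText(this.contextId, candidate);
  }
}

const compositionRecorder = createCompositionRecorder();
const composer = new Composer(compositionRecorder, 1);
for (const character of "\\sum") {
  composer.type(character);
}
composer.choose("Σ");
```

Here the four keystrokes produce four setComposition calls, followed by a single commitText for the chosen candidate.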

Hide keyboard:
* Most Chinese IMEs on Android provide a "Hide" button that allows the user
to close the keyboard. InputMethodConnection.removeFocus covers this case.

2. Layout switching API - for switching between different keyboard
layouts (built-in/3rd-party keyboard apps).

Switching to the next layout:
* The user taps the switch button on the keyboard. The keyboard app calls
|InputMethodManager.switchToNextInputMethod| to switch to the next IME.

Switching to a specified layout:
* The user long-presses the switch button on the keyboard.
* The keyboard app shows a list of enabled IMEs, which can be obtained via
|InputMethodManager.getEnabled|. The current IME needs to be marked in
the list and can be retrieved via |InputMethodManager.getCurrentInputMethod|.
* The user selects an IME. The keyboard app calls
|InputMethodManager.setCurrentInputMethod| to switch to that IME.
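
A keyboard app could build the "switch to a specified layout" list roughly like this. The helper names are hypothetical, `manager` is anything shaped like the proposed InputMethodManager, and the sketch assumes that the `url` field identifies an IME uniquely.

```javascript
// Marks the current IME in the list of enabled IMEs.
// Assumption: the `url` field identifies an IME uniquely.
function buildPickerEntries(enabled, current) {
  return enabled.map((info) => ({
    label: `${info.name} (${info.locale})`,
    info,
    isCurrent: current !== null && info.url === current.url,
  }));
}

// Loads both pieces of information from an InputMethodManager-like object.
function loadPickerEntries(manager) {
  return Promise.all([
    manager.getEnabled(),
    manager.getCurrentInputMethod(),
  ]).then(([enabled, current]) => buildPickerEntries(enabled, current));
}

// Called when the user taps an entry in the list.
function selectEntry(manager, entry) {
  if (entry.isCurrent) {
    return Promise.resolve(true); // nothing to switch
  }
  return manager.setCurrentInputMethod(entry.info);
}
```

Keeping `buildPickerEntries` free of promises makes the marking logic easy to check on its own.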

==Proposed API==

partial interface Navigator {
    readonly attribute InputMethodManager InputMethodManager;
    readonly attribute InputMethodConnection InputMethodConnection;
};

dictionary InputMethodInfo {
    url: 'XXX',
    name: 'IME name',
    locale: 'en-US',
};

// Manages the list of IMEs, enables/disables IMEs and switches to an IME.
interface InputMethodManager {
    // Show a (system) menu of all enabled input methods for the user to
    // select from.
    void showInputMethodPicker();

    // Get a list of all installed IMEs.
    Promise<InputMethodInfo[]> getAll();

    // Get a list of all enabled IMEs.
    Promise<InputMethodInfo[]> getEnabled();

    // Get input method info of the caller IME.
    Promise<InputMethodInfo> getSelf();

    // Switch to the given IME.
    // Privileged API. Only the system app and current IME can switch IME.
    Promise<boolean> setCurrentInputMethod(InputMethodInfo info);

    // Switch to next IME.
    // We may not need this method and use setCurrentInputMethod instead.
    Promise<boolean> switchToNextInputMethod();

    // Get current IME.
    Promise<InputMethodInfo> getCurrentInputMethod();

    // Enable an IME.
    // Privileged API. Only the system app can enable an IME.
    Promise<boolean> enable(InputMethodInfo info);

    // Disable an IME.
    // Privileged API. Only the system app can disable an IME.
    Promise<boolean> disable(InputMethodInfo info);
};

// The input context, i.e. the attributes and information of the current
// input field.
dictionary InputMethodContext {
    // This is used to specify the target of input field operations.
    // This ID becomes invalid as soon as the user leaves the input field
    // and the blur event is sent.
    long contextId;

    // The tag name of the input field, which is one of "input",
    // "textarea", or "contenteditable".
    DOMString name;
    // The type of the input field, which is one of text, number,
    // password, url, search and email.
    DOMString type;
    /*
     * The input mode string.
     * https://bugzilla.mozilla.org/show_bug.cgi?id=796544
     * It can be one of the following values:
     * "none"
     * "verbatim" - no capitalization, no word suggestions
     * "latin" - word suggestions but no capitalization
     * "latin-prose" - word suggestions and capitalization at the
     *   start of sentences
     * "latin-name" - word suggestions and capitalize each word
     * "digit" - digits (0-9) only.
     */
    DOMString inputmode;
    /*
     * The primary language for the input field.
     * It is the value of HTMLElement.lang.
     * See
     * http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#htmlelement
     */
    DOMString lang;
};

interface InputMethodConnection: EventTarget {
    // Informs the IME that text input has started in an input field. If
    // the IME is activated, this event is sent when focus enters a new
    // input field. Otherwise this event is sent when the user switches to
    // the IME and activates it.
    attribute EventHandler onstart;

    // Informs the IME that text input has finished in the last input
    // field. This event is sent when focus leaves the current input field
    // or the user switches to another IME.
    attribute EventHandler onfinish;

    // This event is sent when the attributes of the input context have
    // changed, such as type, but focus has not.
    attribute EventHandler oncontextchange;

    /*
     * This event is sent when the text around the cursor is changed, due
     * to either text editing or cursor movement. The text length is
     * limited to 100 characters in each direction (before and after the
     * cursor).
     *
     * The event handler function is specified as:
     * @param beforeText Text before and including cursor position.
     * @param afterText Text after and excluding cursor position.
     * function(long contextId, DOMString beforeText, DOMString afterText) {
     * ...
     * }
     */
    attribute SurroundingTextChangeEventHandler onsurroundingtextchange;

    // The user moves the cursor, changes the selection, or alters the
    // composing text length.
    attribute EventHandler onselectionchange;

    // TODO: maybe the parameters could be simpler?
    Promise<boolean> sendKey(long contextId, long keyCode, long charCode,
                             long modifiers);
    // Or Promise<boolean> sendKey(long contextId, KeyboardEvent event)

    /*
     * Get the whole text content of the input field.
     */
    Promise<DOMString> getText(long contextId);

    /*
     * Commit text to the current input field and replace text around the
     * cursor position. It will clear the current composition.
     *
     * @param text The string to be replaced with.
     * @param offset The offset from the cursor position where replacing
     *   starts. Defaults to 0.
     * @param length The length of text to replace. Defaults to 0.
     */
    Promise<boolean> commitText(long contextId, DOMString text,
                                [optional] long offset, [optional] long length);

    /*
     * Delete text around the cursor.
     * @param offset The offset from the cursor position where deletion
     *   starts.
     * @param length The length of text to delete.
     */
    Promise<boolean> deleteSurroundingText(long offset, long length);

    // The start and stop position of the selection.
    readonly attribute long selectionStart;
    readonly attribute long selectionEnd;

    /*
     * Set the selection range of the editable text.
     * Note: This method cannot be used to move the cursor during
     * composition. Calling this method will cancel composition.
     * @param start The beginning of the selected text.
     * @param length The length of the selected text.
     *
     * Note that the start position should be less than or equal to the
     * end position.
     * To move the cursor, set the start and end position to the same
     * value.
     */
    Promise<boolean> setSelectionRange(long contextId, long start,
                                       long length);

    /*
     * Set current composition. It will start or update composition.
     * @param cursor Position in the text of the cursor.
     */
    Promise<boolean> setComposition(long contextId, DOMString text,
                                    long cursor);

    /*
     * Clear composition. Called to cancel composition.
     * Use commitText to end composition and commit text.
     */
    Promise<boolean> clearComposition(long contextId);

    // Clear the focus of the current input field and hide the keyboard.
    Promise<boolean> removeFocus(long contextId);

    // The input method context
    readonly attribute InputMethodContext elementInfo;
};

[TreatNonCallableAsNull]
callback SurroundingTextChangeEventHandlerNonNull = void (long contextId,
    DOMString beforeText, DOMString afterText);
typedef SurroundingTextChangeEventHandlerNonNull?
    SurroundingTextChangeEventHandler;
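
To illustrate how an IME might consume the surrounding-text notification, here is a small sketch. `wordAtCursor` is a hypothetical helper and `stubConnection` is a plain object standing in for InputMethodConnection; only the handler signature (contextId, beforeText, afterText) comes from the proposal.

```javascript
// Returns the word that the cursor is currently inside or directly after,
// built from the two strings delivered to onsurroundingtextchange.
function wordAtCursor(beforeText, afterText) {
  const head = /\S*$/.exec(beforeText)[0]; // trailing non-space run
  const tail = /^\S*/.exec(afterText)[0]; // leading non-space run
  return head + tail;
}

// Stand-in for the proposed InputMethodConnection.
const stubConnection = { onsurroundingtextchange: null };

let currentWord = "";
stubConnection.onsurroundingtextchange = (contextId, beforeText, afterText) => {
  currentWord = wordAtCursor(beforeText, afterText);
};

// Simulate the notification an implementation would send.
stubConnection.onsurroundingtextchange(1, "hello wor", "ld again");
```

A suggestion engine could then look up candidates for `currentWord` each time the handler fires.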

Axel Hecht

Jul 15, 2013, 11:35:13 AM

to mozilla-d...@lists.mozilla.org

I got a question below.

Can you detail a bit on how this relates to non-latin scripts, both
alphabet-based and not?

Thanks

Axel

Yuan Xulei

Jul 15, 2013, 10:50:46 PM

to Axel Hecht, mozilla-d...@lists.mozilla.org

On Mon 15 Jul 2013 11:35:13 PM CST, Axel Hecht wrote:
> I got a question below.
>
> On 7/13/13 9:10 AM, Yuan Xulei wrote:

> Can you detail a bit on how this relates to non-latin scripts, both
> alphabet-based and not?

Non-Latin scripts such as Chinese, Japanese and Korean don't need
capitalization and handle suggestions differently, so the value of
inputmode will be ignored, except for the "digit" mode.

Axel Hecht

Jul 16, 2013, 4:34:15 AM

to mozilla-d...@lists.mozilla.org

That doesn't cover thai, cyrl, arab, etc, though.

Axel

janjo...@gmail.com

Jul 17, 2013, 9:53:18 AM

to

Why so elaborate? I think the only thing that we need to expose to 3rd-party apps is `showInputMethodPicker()`. That delegates to the system app, which shows an IME list; if the user switches, it updates the keyboard, etc. All the other stuff does not seem necessary in Gecko and should be handled by keyboard_manager.js in Gaia (which keyboards are active and installed, which IMEs they have, etc.), derived from their manifest files.

Yuan Xulei

Jul 17, 2013, 1:13:11 PM

to Axel Hecht, mozilla-d...@lists.mozilla.org

On 07/16/2013 04:34 PM, Axel Hecht wrote:
> On 7/16/13 4:50 AM, Yuan Xulei wrote:

> That doesn't cover thai, cyrl, arab, etc, though.

Could you give some information and suggestions on how to cover those
scripts?
I know nothing about thai, cyrl and arab.

Yuan Xulei

Jul 17, 2013, 1:26:06 PM

to janjo...@gmail.com, dev-w...@lists.mozilla.org

Leaving most of the work to keyboard_manager.js in Gaia is an option. If
we need to keep a minimal set of methods, I think we also need to keep
`switchToNextInputMethod()` besides `showInputMethodPicker()`.
`switchToNextInputMethod()` allows the user to switch directly to the next
IME without selecting from a list.
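
For illustration, the system side of `switchToNextInputMethod()` could pick the next IME along these lines. `nextInputMethod` is a hypothetical helper, and the sketch assumes that enabled IMEs are kept in an ordered list and that `url` identifies an IME uniquely; the thread does not fix either detail.

```javascript
// Picks the IME that follows `current` in the list of enabled IMEs, wrapping
// around at the end. Assumption: `url` identifies an IME uniquely.
function nextInputMethod(enabled, current) {
  if (enabled.length === 0) {
    return null;
  }
  const index = enabled.findIndex((info) => info.url === current.url);
  // When the current IME is not in the list, findIndex returns -1 and the
  // first entry is chosen.
  return enabled[(index + 1) % enabled.length];
}
```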

Jan Jongboom

Jul 17, 2013, 4:16:00 PM

to Yuan Xulei, dev-w...@lists.mozilla.org

Great. Sounds like a good solution to me. Just for clarification this would
be exposed under navigator.mozInputMethodManager right? Or would you want
to revisit API design in that case? (Keep it under mozKeyboard? I don't
really know a nice place).

2013/7/17 Yuan Xulei <xy...@mozilla.com>

> On 07/17/2013 09:53 PM, janjo...@gmail.com wrote:
>

>> On Saturday, July 13, 2013 9:10:36 AM UTC+2, Yuan Xulei wrote:
>>
>>> ==Proposed API==
>>>
>>> partial interface Navigator {
>>> readonly attribute InputMethodManager InputMethodManager;
>>> readonly attribute InputMethodConnection InputMethodConnection;
>>> };
>>>
>>> dictionary InputMethodInfo {
>>> url: 'XXX',
>>> name: 'IME name',
>>> locale: 'en-US',
>>> };
>>>
>>> // Manages the list of IMEs, enables/disables IME and switches to an IME.
>>> interface InputMethodManager {
>>> // Show a (system) menu for all enabled input methods that allow
>>> user to select.
>>> void showInputMethodPicker();
>>>
>>> // Get a list of all installed IMEs.
>>> Promise<InputMethodInfo[]> getAll();
>>>
>>> // Get a list of all enabled IMEs.
>>> Promise<InputMethodInfo[]> getEnabled();
>>>
>>> // Get input method info of the caller IME.
>>> Promise<InputMethodInfo> getSelf();
>>>
>>> // Switch to the given IME.
>>> // Privileged API. Only the system app and current IME can switch
>>> IME.

>>> Promise<boolean> setCurrentInputMethod(InputMethodInfo info);

>>>
>>> // Switch to next IME.
>>> // We may not need this method and use setCurrentInputMethod
>>> instead.
>>> Promise<boolean> switchToNextInputMethod();
>>>
>>> // Get current IME.
>>> Promise<InputMethodInfo> getCurrentInputMethod();
>>>
>>> // Enable an IME.
>>> // Privileged API. Only the system app can enable an IME.
>>> Promise<boolean> enable(InputMethodInfo info);
>>>
>>> // Disable an IME.
>>> // Privileged API. Only the system app can disable an IME.
>>> Promise<boolean> disable(InputMethodInfo info);
>>> };
>>>
>> Why so elaborate? I think the only thing that we need exposable to 3rd
>> party apps is `showInputMethodPicker()`. That delegates it to the system
>> app and shows an IME list, if a user switches it updates the keyboard etc.
>> All this other stuff does not seem necessary in Gecko and should be handled
>> by keyboard_manager.js in Gaia (which keyboards are active, installed,
>> which IMEs do they have etc.) derived from their manifest files.
>>

Axel Hecht

Jul 18, 2013, 4:22:13 AM

to mozilla-d...@lists.mozilla.org

I sadly don't know much either. I am concerned about designing an API
around Latin script, though.

Maybe it helps if you can elaborate who'd be using that api, both
reading and writing? We could open up a thread in mozilla.dev.l10n to
gather feedback on how folks expect their script to act in that
situation, and maybe that will help.

Axel

Tim Chien

Jul 18, 2013, 4:35:01 AM

to Axel Hecht, mozilla-d...@lists.mozilla.org

On Thu, Jul 18, 2013 at 4:22 PM, Axel Hecht <l1...@mozilla.com> wrote:
> On 7/17/13 7:13 PM, Yuan Xulei wrote:
>>

>> On 07/16/2013 04:34 PM, Axel Hecht wrote:
>

> I sadly don't know much either. I am concerned to design an api around latin
> script, though.
>
> Maybe it helps if you can elaborate who'd be using that api, both reading
> and writing? We could open up a thread in mozilla.dev.l10n to gather
> feedback on how folks expect their script to act in that situation, and
> maybe that will help.
>

Axel,

The detail here is a little bit side-tracked; the goal of the Virtual
Keyboard API (to be renamed to "Input Method API", revised proposal to
come) is to allow a web app to implement a virtual keyboard or an input
method by mutating an input field on another web page. The proposed
property |inputmode| simply exposes the information available on the
web page to the app. If those keywords are not sufficient for non-Latin
languages, we should probably revise the inputmode spec [1], not the
API here.

[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#input-modalities:-the-inputmode-attribute

--
Tim Guan-tin Chien, Engineering Manager and Front-end Lead, Firefox
OS, Mozilla Corp. (Taiwan)

Yuan Xulei

Jul 18, 2013, 5:04:37 AM

to Jan Jongboom, dev-w...@lists.mozilla.org

In that case, I'd like to move the InputMethodManager inside mozKeyboard
and it'll be something like this:
navigator.mozKeyboard.mozInputMethodManager = {
    void showInputMethodPicker();
    void switchToNextInputMethod();
}
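
A keyboard app might wire its switch (globe) button to these two methods roughly as follows. `handleSwitchButton` and the 500 ms threshold are hypothetical; `manager` is any object exposing the two methods named above.

```javascript
const LONG_PRESS_MS = 500; // hypothetical long-press threshold

// Decides what the switch (globe) button does once the press has ended.
// `manager` is any object with showInputMethodPicker() and
// switchToNextInputMethod().
function handleSwitchButton(manager, pressDurationMs) {
  if (pressDurationMs >= LONG_PRESS_MS) {
    manager.showInputMethodPicker(); // long press: let the user choose
    return "picker";
  }
  manager.switchToNextInputMethod(); // short tap: go straight to the next IME
  return "next";
}
```

This keeps the minimal two-method surface while still covering both layout-switching use cases from the first message.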

On 07/18/2013 04:16 AM, Jan Jongboom wrote:
> Great. Sounds like a good solution to me. Just for clarification this
> would be exposed under navigator.mozInputMethodManager right? Or would
> you want to revisit API design in that case? (Keep it under
> mozKeyboard? I don't really know a nice place).
>
>

> 2013/7/17 Yuan Xulei <xy...@mozilla.com <mailto:xy...@mozilla.com>>

>
> On 07/17/2013 09:53 PM, janjo...@gmail.com

janjo...@gmail.com

Jul 18, 2013, 6:29:03 AM

to

I added a preliminary version of this implementation to https://bugzilla.mozilla.org/show_bug.cgi?id=885692, as I'm on PTO as of tomorrow. Feel free to change it based on the discussion results here.

Axel Hecht

Jul 19, 2013, 10:08:22 AM

to mozilla-d...@lists.mozilla.org

I confess, those modes are pretty horrible. "Hermann van Veen" isn't a
latin-script name, and one shouldn't enter a US phone number on a German
device. Is there still a way to influence those states?

Anywho, we probably need to live with some of this.

I'm trying to recap:

A webdeveloper sets inputmode to one of those values, and the api
forwards that back to the keyboard app (3rd party or not), right?

Should the API implement the fallback chain that is specified in the spec?

Also, we hope/expect that we'll practically never see anything but
numeric, hopefully? And that we'd generally end up with the default
inputmode, which is the user's selected mode for keyboards?

As for how the api is written, I think it'd be better to refer to the
html spec than to a lengthy bug, and detail inline why some of the html
spec modes are not showing up.

Axel

Evelyn Hung

Jul 21, 2013, 2:19:34 PM

to Axel Hecht, mozilla-d...@lists.mozilla.org

----- Original Message -----
> On 7/18/13 10:35 AM, Tim Chien wrote:

> I confess, those modes are pretty horrible. "Hermann van Veen" isn't a
> latin-script name, and one shouldn't enter a US phone number on a German
> device. Is there still a way to influence those states?
>
> Anywho, we probably need to live with some of this.
>
> I'm trying to recap:
>
> A webdeveloper sets inputmode to one of those values, and the api
> forwards that back to the keyboard app (3rd party or not), right?

Yes

> Should the API implement the fallback chain that specified in the spec?

No, simply expose the information and let the keyboard app decide the best way to interact, because the API doesn't know whether the keyboard app supports the specified inputmode.

> Also, we hope/expect that we'll practically never see anything but
> numeric, hopefully? And that'd we'd generally end up with the default
> inputmode, which is the user's selected mode for keyboards?

The logic is all controlled by the keyboard app. I'm not sure whether numeric is the only common case in practice, and we don't assume that here. When a keyboard sees this inputmode hint, it can ignore it, fall back to a state it can handle, or use the user's preference to show a layout.
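
As a sketch of what "ignore, fall back, or use the user's preference" could mean inside a keyboard app: the fallback table below is purely illustrative and is not taken from the HTML spec, and `chooseLayout` is a hypothetical helper.

```javascript
// Hypothetical fallback table, for illustration only; it is not taken from
// the HTML spec.
const FALLBACKS = new Map([
  ["latin-name", "latin"],
  ["latin-prose", "latin"],
  ["latin", "verbatim"],
]);

// Returns the first mode in the fallback chain that the keyboard supports,
// or the user's default layout when nothing in the chain is supported.
function chooseLayout(inputmode, supported, userDefault) {
  const seen = new Set(); // guards against cycles in the table
  let mode = inputmode;
  while (mode !== undefined && !seen.has(mode)) {
    if (supported.has(mode)) {
      return mode;
    }
    seen.add(mode);
    mode = FALLBACKS.get(mode);
  }
  return userDefault;
}
```

Because the decision lives in the app, two keyboards can treat the same hint differently without any change to the API.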

> As for how the api is written, I think it'd be better to refer to the
> html spec than to a lengthy bug, and detail inline why some of the html
> spec modes are not showing up.
> Axel

> _______________________________________________
> dev-webapi mailing list
> dev-w...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-webapi
>

Tim Chien

Jul 23, 2013, 3:31:58 AM

to Yuan Xulei, Ehsan Akhgari, Evan Tseng (曾增仁), dev-webapi, James Ho, Evelyn Hung, Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Jonas Sicking

Hi all,

I've worked with Xulei to come up with the new Input Method API IDL. I
believe we have covered many of the use cases needed, so we would
really love to have your prompt feedback.

Basically, the methods to interact with the text field are categorized
into 4 kinds of use cases, see

https://wiki.mozilla.org/WebAPI/KeboardIME#Use_cases_for_each_of_the_methods

1) For a simple virtual keyboard action (send a character and key
events w/ each user action), use sendKey().
2) For spellcheck, autocomplete etc., use the surrounding text methods.
3) For cursor movement helper features, use setSelectionRange() and
related attributes.
4) For Asian IMEs that send characters and composition along with the
composition events, use setComposition() and endComposition().
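
For case 1, a simple virtual keyboard could map each key tap to the `sendKey(keyCode, charCode, modifiers)` signature from the quoted proposal below. This is only a sketch: `keyArgs` and `onKeyTap` are hypothetical, and it assumes that printable characters are sent with keyCode 0 plus their character code while special keys use their DOM key code with charCode 0.

```javascript
// DOM key codes for a few non-printable keys (the values of the DOM
// KeyEvent constants DOM_VK_BACK_SPACE and DOM_VK_RETURN).
const SPECIAL_KEYS = new Map([
  ["Backspace", 8],
  ["Enter", 13],
]);

// Maps a key label to { keyCode, charCode, modifiers }.
// Assumption: printable characters are sent with keyCode 0 and their
// character code, special keys with their key code and charCode 0.
function keyArgs(label) {
  if (SPECIAL_KEYS.has(label)) {
    return { keyCode: SPECIAL_KEYS.get(label), charCode: 0, modifiers: 0 };
  }
  return { keyCode: 0, charCode: label.codePointAt(0), modifiers: 0 };
}

// Forwards one key tap to an InputContext-like object.
function onKeyTap(inputContext, label) {
  const { keyCode, charCode, modifiers } = keyArgs(label);
  return inputContext.sendKey(keyCode, charCode, modifiers);
}
```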

One thing on which I disagree with Xulei is the methods that actually
change the text of the fields. Eventually we kept the use case
separation, but just like Xulei you might find the APIs a bit
redundant. We need more pairs of eyes to give us feedback on
limiting the redundancy while keeping the clear separation. Thanks,

On Tue, Jul 23, 2013 at 5:51 AM, Yuan Xulei <xy...@mozilla.com> wrote:
> Hi all,
>
> Thanks for your feedback and comments. We moved forward a big step and
> a new version of the API is ready for discussion, which is quite near what
> we want.
>
> I list the API proposal at the bottom of the letter and you can also check
> it here
> https://wiki.mozilla.org/WebAPI/KeboardIME.
>
> Tim also made some example code to help understand the API.
> https://wiki.mozilla.org/WebAPI/KeboardIME#Use_cases_for_each_of_the_methods.
>
> The main changes are:
>
> 1. Merged `InputMethodConnection` and `InputMethodContext` as
> `InputContext`. `InputContext` represents the input field. The IME will
> start or stop by listening to the `inputcontextchange` event.
>
> 2. Reduced the function of `InputMethodManager` and kept only the methods
> for switching IMEs. We also simplified the names of the methods under
> InputMethodManager.
>
> 3. Added a dedicated method `endComposition` to end composition and commit
> text. `commitText` can only be used to commit text without composing text
> and was renamed to `replaceSurroundingText`.
>
> --------- API proposal v2----------
>
> partial interface Navigator {
> readonly attribute InputMethod inputMethod;
> };
>
> interface InputMethod: EventTarget {
> // Input Method Manager contain a few global methods expose to apps
> readonly attribute InputMethodManager mgmt;
>
> // Fired when the input context changes, include changes from and to
> null.
> // The new InputContext instance will be available in the event object
> under |inputcontext| property.
> // When it changes to null it means the app (the user of this API) no
> longer has the control of the original focused input field.
> // Note that if the app saves the original context, it might get void;
> implementation decides when to void the input context.
> attribute EventHandler oninputcontextchange;
>
> // An "input context" is mapped to a text field that the app is allow to
> mutate.
> // this attribute should be null when there is no text field currently
> focused.
> readonly attribute InputContext? inputcontext;
> };
>
>
> // Manages the list of IMEs, enables/disables IME and switches to an IME.
> interface InputMethodManager {
> // Ask the OS to show a list of available IMEs for users to switch from.
> // OS should ignore this request if the app is currently not the active
> // one.
> void showAll(); // was: showInputMethodPicker()
>
> // Ask the OS to switch away from the current active Keyboard app.
> // OS should ignore this request if the app is currently not the active
> // one.
> void next(); // was: switchToNextInputMethod()
>
> // To know if the OS supports IME switching or not.
> // Use case: let the keyboard app know if it is necessary to show the
> // "IME switching" (globe) button. We have a use case that when there is
> // only one IME enabled, we should not show the globe icon.
> boolean supportsSwitching();
>
> // Ask the OS to hide the current active Keyboard app. (was:
> // |removeFocus()|)
> // OS should ignore this request if the app is currently not the active
> // one.
> // The OS will void the current input context (if it exists).
> // This method belongs to |mgmt| because we would like to allow the
> // Keyboard to access this method w/o an input context.
> void hide();
> };
>
> // The input context, which consists of attributes and information of
> current input field.
> // It also hosts the methods available to the keyboard app to mutate the
> input field represented.
> // An "input context" gets void when the app is no longer allowed to
> interact with the text field,
> // e.g., the text field does no longer exist, the app is being switched to
> background, and etc.
> // [JJ] I doubt whether we should have 'name', 'type', etc. here. In the
> manifest we should
> // have entry points where the keyboard specifies which view to load
> when going into a
> // certain context. Requiring to do this manually will give extra work.
> // The system should guarantee that the right view is rendered based on
> entry_points in
> // in manifest (e.g. navigate keyboard to #text/en, or something, based
> on manifest.
> // [Tim] I don't think they are exclusive. A keyboard app might choose to
> load the same page with the same hash
> // for different types but only to deal with the |type| or |inputmode|
> difference later.
> interface InputContext: EventTarget {

> // The tag name of input field, which is enum of "input", "textarea", or
> "contenteditable"
> DOMString name;
>
> // The type of the input field, which is enum of text, number, password,

> url, search, email, and so on.
> // See
> http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html#states-of-the-type-attribute
> DOMString type;
>
> /*
> * The inputmode string, representing the input mode.
> * See
> http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#input-modalities:-the-inputmode-attribute
> */
> DOMString inputmode;

>
> /*
> * The primary language for the input field.
> * It is the value of HTMLElement.lang.

> * See

> http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#htmlelement
> */
> DOMString lang;
>

> /*
> * Get the whole text content of the input field.
> */

> Promise<DOMString> getText([optional] offset, [optional] length);

>
> // The start and stop position of the selection.
> readonly attribute long selectionStart;
> readonly attribute long selectionEnd;
>
> /*
> * Set the selection range of the the editable text.
> * Note: This method cannot be used to move the cursor during
> composition. Calling this
> * method will cancel composition.
> * @param start The beginning of the selected text.
> * @param length The length of the selected text.
> *
> * Note that the start position should be less or equal to the end
> position.
> * To move the cursor, set the start and end position to the same value.
> *

> * [JJ] I think that this method should return the same info as the
> selectionchange event
> * rather than a boolean.
> * [yxl] I don't think so. We could get selection range info by checking
> the attributes of
> * selectionStart and selectionEnd.
> */
> Promise<boolean> setSelectionRange(long start, long length);
>
> /* User moves the cursor, or changes the selection with other means. If
> the text around
> * cursor has changed, but the cursor has not been moved, the IME won't
> get notification.
> *
> * [JJ] I would merge this with onsurroundingtextchange to have 1 state
> event.
> * in the end, every onselectionchange event will also generate a
> surrounding
> * text change event.
> */
> attribute EventHandler onselectionchange;

>
> /*
> * Commit text to current input field and replace text around cursor
> position. It will clear the current composition.
> *
> * @param text The string to be replaced with.
> * @param offset The offset from the cursor position where replacing
> starts. Defaults to 0.
> * @param length The length of text to replace. Defaults to 0.
> */

> Promise<boolean> replaceSurroundingText(DOMString text,
> [optional] long offset, [optional] long length); // was: commitText
>
> /*
> *
> * Delete text around the cursor.
> * @param offset The offset from the cursor position where deletion
> starts.
> * @param length The length of text to delete.

> * TODO: maybe updateSurroundingText(DOMString beforeText, DOMString
> afterText); ?
> * [JJ] Rather do a replaceSurroundingText(long offset, long length,
> optional DOMString text)
> * If text is null or empty, it behaves the same

> */
> Promise<boolean> deleteSurroundingText(long offset, long length);
>

> /*
> * Notifies when the text around the cursor is changed, due to either
> text
> * editing or cursor movement. If the cursor has been moved, but the text
> around has not
> * changed, the IME won't get notification.

> *
> * The event handler function is specified as:
> * @param beforeString Text before and including cursor position.
> * @param afterString Text after and excluing cursor position.

> * function(DOMString beforeText, DOMString afterText) {

> * ...
> * }
> */
> attribute SurroundingTextChangeEventHandler onsurroundingtextchange;
>

> /*
> * send a character with its key events.
> * @param modifiers see
> http://mxr.mozilla.org/mozilla-central/source/dom/interfaces/base/nsIDOMWindowUtils.idl#206
> * @return true if succeeds. Otherwise false if the input context
> becomes void.
> * Alternative: sendKey(KeyboardEvent event), but we will likely waste
> memory for creating the KeyboardEvent object.
> */
> Promise<boolean> sendKey(long keyCode, long charCode, long modifiers);

>
> /*
> * Set current composition. It will start or update composition.
> * @param cursor Position in the text of the cursor.
> *

> * The API implementation should automatically ends the composition
> * session (with event and confirm the current composition) if
> * endComposition is never called. Same apply when the inputContext is
> lost
> * during a unfinished composition session.
> */
> Promise<boolean> setComposition(DOMString text, long cursor);
>
> /*
> * End composition and actually commit the text. (was |commitText(text,
> * offset, length)|)
> * Ending the composition with an empty string will not send any text.
> * Note that composition always ends automatically (with the current
> * composition committed) if it was not ended explicitly with
> * |endComposition()| but was interrupted by |sendKey()|,
> * |setSelectionRange()|, the user moving the cursor, removal of focus, etc.
> *
> * @param text The text
> */
> Promise<boolean> endComposition(DOMString text);
> };

>

> _______________________________________________
> dev-webapi mailing list
> dev-w...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-webapi
>
>

Yuan Xulei

Jul 24, 2013, 5:46:03 AM

to dev-webapi, Ehsan Akhgari, Tim Guan-tin Chien, Evelyn Hung, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, David Flanagan, Olli Pettay, Jan Jongboom, Jonas Sicking

void showAll(); // was: showInputMethodPicker()

// Ask the OS to switch away from the current active Keyboard app.
// OS should ignore this request if the app is currently not the active one.
void next(); // was: switchToNextInputMethod()

// To know if the OS supports IME switching or not.
// Use case: let the keyboard app know if it is necessary to show the "IME switching"
// (globe) button. We have a use case that when there is only one IME enabled, we
// should not show the globe icon.
boolean supportsSwitching();

// Ask the OS to hide the current active Keyboard app. (was: |removeFocus()|)
// OS should ignore this request if the app is currently not the active one.
// The OS will void the current input context (if it exists).
// This method belongs to |mgmt| because we would like to allow the Keyboard to access
// this method w/o an input context.
void hide();
};

// The input context, which consists of attributes and information of current input field.
// It also hosts the methods available to the keyboard app to mutate the input field represented.
// An "input context" gets void when the app is no longer allowed to interact with the text field,
// e.g., the text field no longer exists, the app is being switched to the background, etc.
// [JJ] I doubt whether we should have 'name', 'type', etc. here. In the manifest we should
// have entry points where the keyboard specifies which view to load when going into a
// certain context. Requiring to do this manually will give extra work.
// The system should guarantee that the right view is rendered based on entry_points
// in the manifest (e.g. navigate keyboard to #text/en, or something, based on the manifest).
// [Tim] I don't think they are exclusive. A keyboard app might choose to load the same page with the same hash
// for different types but only to deal with the |type| or |inputmode| difference later.
interface InputContext: EventTarget {
// The tag name of input field, which is enum of "input", "textarea", or "contenteditable"
DOMString name;

// The type of the input field, which is enum of text, number, password, url, search, email, and so on.
// See http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html#states-of-the-type-attribute
DOMString type;

/*
* The inputmode string, representing the input mode.
* See http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#input-modalities:-the-inputmode-attribute
*/
DOMString inputmode;

/*
* The primary language for the input field.
* It is the value of HTMLElement.lang.
* See http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#htmlelement

* @param modifiers see http://mxr.mozilla.org/mozilla-central/source/dom/interfaces/base/nsIDOMWindowUtils.idl#206

Jonas Sicking

Jul 30, 2013, 6:16:56 PM

to Tim Chien, Ehsan Akhgari, Evelyn Hung, dev-webapi, James Ho, Evan Tseng (曾增仁), Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Yuan Xulei

Hi All,

I sent this feedback in a separate thread, but just saw this one so I
figured it's better to send it here instead.

I just looked over the keyboard API and in general it looks great! I
have a few comments but in general it feels like something that is
very polished and thought through.

I take it that the API in the Wiki is the latest version? It seems
like it has a few updates compared to what's in the email below, so I
reviewed what's in the wiki.

I added a few minor comments to the wiki and marked them with [JS]. I
also had a few bigger comments below:

* InputContext for contenteditable
We need to define what constitutes an inputcontext when dealing with
contenteditable.

I think generally we should treat any separate block-level HTML
element as a separate inputcontext. That keeps things simpler in that
the keyboard API doesn't have to deal with the concept of paragraphs.
I.e. it can keep considering the inputcontext as a continuous piece of
text.

So I don't think that we need to look at what CSS is applied. Instead
simply looking at element names should be enough to determine where
the border between inputcontexts goes.

* Race conditions between keyboard and app processes
Another issue that needs to be defined is what to do if the selection
changes between the point when we fire "onselectionchange" and the
point when the keyboard app calls replaceSurroundingText or sendKey
etc. I.e. consider the following scenario:

1. The page focuses an <input> element
2. We bring up the keyboard and fire an oninputcontextchange event
3. The user clicks inside a misspelled word in the <input> element
4. We fire an onselectionchange event in the keyboard app
5. The keyboard app calls getText to investigate the text surrounding the cursor
6. We return a result to the keyboard app
7. The keyboard app realizes that the word is misspelled and displays
a suggested correction to the user.
8. The user clicks the suggestion and the keyboard app calls
replaceSurroundingText() to change the misspelled word.
9. Before the app receives the replaceSurroundingText command, the app
calls a function to move the cursor, either to move it to a new
inputcontext, or to move it inside the current inputcontext.
10. The app receives the replaceSurroundingText.

At this point, I think we need to detect that the text or cursor has
changed since the last time the keyboard app was notified. In this
case the replaceSurroundingText command should fail. I.e. no
modification should be done and we should make the Promise that was
returned report an error.

One way to do this is to keep a "generation-number" in both the app
process and the keyboard process. Any time the current inputcontext is
modified or the cursor is moved, we increase this generation-number in
the app process.

Whenever an onselectionchange, oninputcontextchange or
onsurroundingtextchange notification is sent from the app process to
the keyboard process, we include the updated generation-number in that
internal notification.

Whenever we actually fire the onselectionchange, oninputcontextchange
or onsurroundingtextchange events in the keyboard app, we update the
generation-number in the keyboard process to the value carried by the
notification.

Whenever we send a sendKey(), replaceSurroundingText() or other
modification from the keyboard process to the app process, we include
the keyboard's current generation-number. When that notification is
processed by the app process, we compare it with the app process's
generation-number. If the two are different, we don't do the
modification and instead cause a failure.

If this is complicated to implement then maybe we can wait for 1.3 to fix this.
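
The generation-number scheme can be sketched roughly as follows. This is only an illustrative model, not the proposed implementation: the class names AppProcessField and KeyboardProcess, the EditResult type, and the way offsets are interpreted relative to the cursor are all assumptions made for the example, and the two "processes" are plain objects in one program.

```typescript
// Hypothetical model of the generation-number check; none of these names
// are part of the proposed keyboard API.
type EditResult = { ok: true } | { ok: false; reason: string };

// Stands in for the app (content) process that owns the text field.
class AppProcessField {
  private generation = 0;

  constructor(public text: string, public cursor: number) {}

  currentGeneration(): number {
    return this.generation;
  }

  // Any cursor move made on the app side bumps the generation.
  moveCursor(position: number): number {
    this.cursor = position;
    return ++this.generation;
  }

  // Applies the edit only if the keyboard has seen the latest state.
  replaceSurroundingText(
    keyboardGeneration: number,
    text: string,
    offset: number,
    length: number
  ): EditResult {
    if (keyboardGeneration !== this.generation) {
      return { ok: false, reason: "stale generation-number" };
    }
    const start = this.cursor + offset;
    this.text = this.text.slice(0, start) + text + this.text.slice(start + length);
    this.cursor = start + text.length;
    this.generation++; // the edit itself is also a modification
    return { ok: true };
  }
}

// Stands in for the keyboard process.
class KeyboardProcess {
  private generation = 0;

  // Called when a change event is actually fired in the keyboard app.
  onNotification(generation: number): void {
    this.generation = generation;
  }

  knownGeneration(): number {
    return this.generation;
  }
}
```

In this model a replaceSurroundingText call that races with a cursor move is rejected without touching the text, and the keyboard can retry once it has processed the next notification.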

* Apply spell-fix automatically on lost focus.
When editing Latin text, we sometimes want behavior similar to IME
composition characters. Consider the following scenario:

1. User starts messaging app and starts composing a message
2. User types "see you tomorro"
3. Keyboard shows as suggested spell fix "tomorrow", but doesn't
automatically fix the text yet since the user hasn't finished typing
the word.
4. User presses "send" button

At this point we need to give the keyboard the opportunity to fix the
spelling before the message is sent.

Same thing if the user clicked in a different text box or simply
closed the keyboard.

There are at least two ways we can fix this. Either we could let the
keyboard provide a range and a "replacement text". This is used so
that if the user moves focus, the text in the defined range is
replaced with the "replacement text". Note that this wouldn't affect
the selection, since in the example above, the selection is still
collapsed at the end of the text. But the replacement range consists
of the "tomorro".
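
A minimal sketch of the first option, a range plus a "replacement text" that is applied when focus moves, might look like this. The names FocusedField, setPendingReplacement and blur are hypothetical and only model the idea; nothing like them exists in the proposed API.

```typescript
// Hypothetical model of "replace on lost focus"; these names are
// illustrative only and are not part of the proposed keyboard API.
interface PendingReplacement {
  start: number; // inclusive index into the field value
  end: number;   // exclusive index into the field value
  text: string;  // text that replaces value[start, end) when focus is lost
}

class FocusedField {
  private pending: PendingReplacement | null = null;

  constructor(public value: string) {}

  // The keyboard registers a correction without touching text or selection.
  setPendingReplacement(start: number, end: number, text: string): void {
    if (start < 0 || end < start || end > this.value.length) {
      throw new RangeError("replacement range is outside the field value");
    }
    this.pending = { start, end, text };
  }

  // The keyboard withdraws the correction, e.g. because the user kept typing.
  clearPendingReplacement(): void {
    this.pending = null;
  }

  // Called by the system when focus leaves the field.
  blur(): string {
    if (this.pending !== null) {
      const { start, end, text } = this.pending;
      this.value = this.value.slice(0, start) + text + this.value.slice(end);
      this.pending = null;
    }
    return this.value;
  }
}
```

Because the replacement is applied inside the app process, no extra event has to be sent to the keyboard process before the focus change.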

Another way to fix it is to fire an event in the keyboard API whenever
focus is moved, but *before* the focus is moved. This would give the
keyboard app the ability to call replaceSurroundingText before the
cursor is moved or focus is lost.

However I'm worried that having to fire an event in a separate process
before the cursor is moved could be a performance problem.

Is this something that can be done with the existing API? I don't
quite understand how the setComposition/endComposition functions work
so maybe they can be used for this?

/ Jonas

On Tue, Jul 23, 2013 at 12:31 AM, Tim Chien <timd...@mozilla.com> wrote:
> Hi all,
>
> I've worked with Xulei to come up with the new Input Method API IDL. I
> believe we have covered many of the use cases needed, so we would
> really love to have your prompt feedback.
>
> Basically, methods to interact with the text field are categorized into
> 4 kinds of use cases, see
>
> https://wiki.mozilla.org/WebAPI/KeboardIME#Use_cases_for_each_of_the_methods
>
> 1) For a simple virtual keyboard action (send a character and key
> events w/ each user action), use sendKey().
> 2) For spellcheck, autocomplete, etc., use the surrounding text methods.
> 3) For cursor movement helper features, use setSelectionRange() and
> related attributes.
> 4) For Asian IMEs that send characters and composition along with the
> composition events, use setComposition() and endComposition().
>
> One thing on which I disagree with Xulei is the set of methods that
> actually change the text of the fields. Eventually we kept the use case
> separation, but just like Xulei you might find the APIs a bit
> redundant. We need more pairs of eyes to give us feedback on
> limiting the redundancy while keeping the clear separation. Thanks,
>
>
> On Tue, Jul 23, 2013 at 5:51 AM, Yuan Xulei <xy...@mozilla.com> wrote:

>> void showAll(); // (was: |showInputMethodPicker()|)
>>
>> // Ask the OS to switch away from the current active Keyboard app.
>> // OS should ignore this request if the app is currently not the active one.
>> void next(); // (was: |switchToNextInputMethod()|)
>>
>> // To know if the OS supports IME switching or not.
>> // Use case: let the keyboard app know whether it is necessary to show the "IME switching"
>> // (globe) button. We have a use case where only one IME is enabled, in which case we
>> // should not show the globe icon.
>> boolean supportsSwitching();
>>
>> // Ask the OS to hide the current active Keyboard app. (was: |removeFocus()|)
>> // OS should ignore this request if the app is currently not the active one.
>> // The OS will void the current input context (if it exists).
>> // This method belongs to |mgmt| because we would like to allow the Keyboard to access
>> // this method w/o an input context.
>> void hide();

>> partial interface Navigator {

>> readonly attribute InputMethodManager InputMethodManager;
>> readonly attribute InputMethodConnection InputMethodConnection;
>> };
>>
>> dictionary InputMethodInfo {
>> url: 'XXX',
>> name: 'IME name',
>> locale: 'en-US',
>> };
>>

>> // Manages the list of IMEs, enables/disables IME and switches to an IME.
>> interface InputMethodManager {

>> // Show a (system) menu for all enabled input methods that allow user
>> to select.
>> void showInputMethodPicker();
>>
>> // Get a list of all installed IMEs.
>> Promise<InputMethodInfo[]> getAll();
>>
>> // Get a list of all enabled IMEs.
>> Promise<InputMethodInfo[]> getEnabled();
>>
>> // Get input method info of the caller IME.
>> Promise<InputMethodInfo> getSelf();
>>
>> // Switch to the given IME.
>> // Privileged API. Only the system app and current IME can switch IME.
>> Promise<boolean> setCurrentInputMethod(InputMethodInfo info);
>>
>> // Switch to next IME.
>> // We may not need this method and use setCurrentInputMethod instead.
>> Promise<boolean> switchToNextInputMethod();
>>
>> // Get current IME.
>> Promise<InputMethodInfo> getCurrentInputMethod();
>>
>> // Enable an IME.
>> // Privileged API. Only the system app can enable an IME.
>> Promise<boolean> enable(InputMethodInfo info);
>>
>> // Disable an IME.
>> // Privileged API. Only the system app can disable an IME.
>> Promise<boolean> disable(InputMethodInfo info);
>> };
>>

>> // The input context, which are attributes and information of current input
>> field.

>> dictionary InputMethodContext {
>> // This is used to specify the target of input field operations. This ID
>> becomes invalid as soon as user leaves input field and blur event is sent.
>> long contextId;
>>

>> // The tag name of input field, which is enum of "input", "textarea", or
>> "contenteditable"
>> DOMString name;
>> // The type of the input field, which is enum of text, number, password,

>> url, search and email.
>> DOMString type;
>> /*

>> * The input mode string.
>> * https://bugzilla.mozilla.org/show_bug.cgi?id=796544
>> * It can be one of the following values:
>> * "none"
>> * "verbatim" - no capitalization, no word suggestions
>> * "latin" - word suggestions but no capitalization
>> * "latin-prose" - word suggestions and capitalization at the start of
>> sentences
>> * "latin-name" - word suggestions and capitalize each word
>> * "digit" - digits(0-9) only.

>> */
>> DOMString inputmode;
>> /*
>> * The primary language for the input field.
>> * It is the value of HTMLElement.lang.

>> * see
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#htmlelement
>> */
>> DOMString lang;
>> };
>>
>> interface InputMethodConnection: EventTarget {
>> // It informs the IME that text input has started in an input field. If
>> the IME is activated, this event is sent when focus enters a new input
>> field. Otherwise this event is sent when user switches to the IME and
>> activates it.
>> attribute EventHandler onstart;
>>
>> // It informs the IME that text input has finished in last input field.
>> This event is sent when focus leaves current input field or user switches to
>> another IME.
>> attribute EventHandler onfinish;
>>
>> // This event is sent when the attributes of input context has changed,
>> such as type, but focus has not.
>> attribute EventHandler oncontextchange;
>>
>> /*

>> * This event is sent when the text around the cursor is changed, due to
>> either text

>> * editing or cursor movement. The text length is limited to 100
>> characters for each
>> * back and forth direction.

>> *
>> * The event handler function is specified as:
>> * @param beforeText Text before and including cursor position.
>> * @param afterText Text after and excluding cursor position.
>> * function(long contextId, DOMString beforeText, DOMString afterText) {

>> * ...
>> * }
>> */
>> attribute SurroundingTextChangeEventHandler onsurroundingtextchange;
>>

>> // User moves the cursor, changes the selection, or alters the composing
>> text length
>> attribute EventHandler onselectionchange;
>>
>> // TODO: maybe the parameters could be simpler?

>> Promise<boolean> sendKey(long contextId, long keyCode, long charCode,
>> long modifiers);

>> // Or Promise<boolean> sendKey(long contextId, KeyboardEvent event)
>>

>> /*
>> * Get the whole text content of the input field.
>> */

>> Promise<DOMString> getText(long contextId);

>>
>> /*
>> * Commit text to current input field and replace text around cursor
>> position. It will clear the current composition.
>> *
>> * @param text The string to be replaced with.
>> * @param offset The offset from the cursor position where replacing
>> starts. Defaults to 0.
>> * @param length The length of text to replace. Defaults to 0.
>> */

>> Promise<boolean> commitText(long contextId, DOMString text, [optional]

>> long offset, [optional] long length);
>>
>> /*
>> *
>> * Delete text around the cursor.
>> * @param offset The offset from the cursor position where deletion
>> starts.
>> * @param length The length of text to delete.

>> */
>> Promise<boolean> deleteSurroundingText(long offset, long length);
>>

>> // The start and stop position of the selection.
>> readonly attribute long selectionStart;
>> readonly attribute long selectionEnd;
>>
>> /*
>> * Set the selection range of the editable text.
>> * Note: This method cannot be used to move the cursor during
>> composition. Calling this
>> * method will cancel composition.
>> * @param start The beginning of the selected text.
>> * @param length The length of the selected text.
>> *
>> * Note that the start position should be less or equal to the end
>> position.
>> * To move the cursor, set the start and end position to the same
>> value.

>> */
>> Promise<boolean> setSelectionRange(long contextId, long start, long
>> length);
>>

>> /*
>> * Set current composition. It will start or update composition.
>> * @param cursor Position in the text of the cursor.

>> */
>> Promise<boolean> setComposition(long contextId, DOMString text, long
>> cursor);
>>
>> /*

Yuan Xulei

Jul 31, 2013, 5:32:23 AM

to Jonas Sicking, Ehsan Akhgari, Tim Chien, Evelyn Hung, dev-webapi, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay

Hi Jonas,

I updated the
wiki(https://wiki.mozilla.org/WebAPI/KeboardIME#Proposed_API) and
please see my comments below.

On 07/31/2013 06:16 AM, Jonas Sicking wrote:
> Hi All,
>
> I sent this feedback in a separate thread, but just saw this one so I
> figured it's better to send it here instead.
>
> I just looked over the keyboard API and in general it looks great! I
> have a few comments but in general it feels like something that is
> very polished and thought through.
>
> I take it that the API in the Wiki is the latest version? It seems
> like it has a few updates compared to what's in the email below, so I
> reviewed what's in the wiki.
>
> I added a few minor comments to the wiki and marked them with [JS]. I
> also had a few bigger comments below:
>
> * InputContext for contenteditable
> We need to define what constitutes an inputcontext when dealing with
> contenteditable.
>
> I think generally we should treat any separate block-level HTML
> element as a separate inputcontext. That keeps things simpler in that
> the keyboard API doesn't have to deal with the concept of paragraphs.
> I.e. it can keep considering the inputcontext as a continuous piece of
> text.

I can't agree more. Dealing with an entire contentEditable element
is complex and has performance issues.

Agree. I'd like to implement this in 1.2.

> * Apply spell-fix automatically on lost focus.
> When editing latin text, we sometimes want behavior similar to IME
> composition characters. Consider the following scenario:
>
> 1. User starts messaging app and starts composing a message
> 2. User types "see you tomorro"
> 3. Keyboard shows as suggested spell fix "tomorrow", but doesn't
> automatically fix the text yet since the user hasn't finished typing
> the word.
> 4. User presses "send" button
>
> At this point we need to give the keyboard the opportunity to fix the
> spelling before the message is sent.

For me it's not a good idea to change text after I press "send", as
I cannot guarantee that the sent text is what I want. For example, my
keyboard doesn't know the word `IDE` and mistakenly fixes it to `DIE`.
I don't want it to become `DIE` automatically just before the message
is sent.

> Same thing if the user clicked in a different text box or simply
> closed the keyboard.
>
> There are at least two ways we can fix this. Either we could let the
> keyboard provide a range and a "replacement text". This is used so
> that if the user moves focus, the text in the defined range is
> replaced with the "replacement text". Note that this wouldn't affect
> the selection, since in the example above, the selection is still
> collapsed at the end of the text. But the replacement range consists
> of the "tomorro".
>
> Another way to fix it is to fire an event in the keyboard API whenever
> focus is moved, but *before* the focus is moved. This would give the
> keyboard app the ability to call replaceSurroundingText before the
> cursor is moved or focus is lost.
>
> However I'm worried that having to fire an event in a separate process
> before the cursor is moved could be a performance problem.
>
> Is this something that can be done with the existing API? I don't
> quite understand how the setComposition/endComposition functions work
> so maybe they can be used for this?

setComposition/endComposition cannot be used for this. The two
functions are used to display intermediate text (named "composition
text") before the user commits the final text. When the focus of the
input field changes, the intermediate text being displayed is cleared
and the composition is cancelled. For more about composition, refer to
https://dvcs.w3.org/hg/ime-api/raw-file/default/Overview.html#composer-section
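
The composition behavior described here can be modeled with a small sketch. The class CompositionModel and its method signatures are assumptions for illustration only; in particular, having endComposition take the final text as its argument is a guess, since the real signatures are defined on the wiki.

```typescript
// Hypothetical model of composition text; not the proposed API.
class CompositionModel {
  private committed = ""; // text that is really part of the field value
  private composing = ""; // intermediate text, shown but not yet committed

  // Start or update the composition with new intermediate text.
  setComposition(text: string): void {
    this.composing = text;
  }

  // Commit the final text and clear the intermediate text.
  endComposition(finalText: string): void {
    this.committed += finalText;
    this.composing = "";
  }

  // Focus change: the composition is cancelled and nothing is committed.
  blur(): void {
    this.composing = "";
  }

  // What the user sees in the field.
  display(): string {
    return this.committed + this.composing;
  }
}
```

The blur() case is why composition does not help with the spell-fix scenario: uncommitted text is dropped on focus change rather than replaced with a correction.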
>
> / Jonas
>

Ehsan Akhgari

Jul 31, 2013, 1:33:18 PM

to Yuan Xulei, Tim Guan-tin Chien, "Evan Tseng (曾增仁)", dev-webapi, Masayuki Nakano, James Ho, Evelyn Hung, Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Jonas Sicking

I'm not sure if Masayuki has been looped in about this, but he knows a
lot about IME, so CCing him on this thread.

Ehsan

On 2013-07-22 11:51 PM, Yuan Xulei wrote:
> Hi all,
>

> void showAll(); // (was: |showInputMethodPicker()|)
>
> // Ask the OS to switch away from the current active Keyboard app.
> // OS should ignore this request if the app is currently not the active one.
> void next(); // (was: |switchToNextInputMethod()|)
>
> // To know if the OS supports IME switching or not.
> // Use case: let the keyboard app know whether it is necessary to show the "IME switching"
> // (globe) button. We have a use case where only one IME is enabled, in which case we
> // should not show the globe icon.
> boolean supportsSwitching();
>
> // Ask the OS to hide the current active Keyboard app. (was: |removeFocus()|)
> // OS should ignore this request if the app is currently not the active one.
> // The OS will void the current input context (if it exists).
> // This method belongs to |mgmt| because we would like to allow the Keyboard to access
> // this method w/o an input context.
> void hide();

> };
>
> // The input context, which consists of attributes and information of current input field.
> // It also hosts the methods available to the keyboard app to mutate the input field represented.
> // An "input context" gets void when the app is no longer allowed to interact with the text field,
> // e.g., the text field no longer exists, the app is being switched to the background, etc.
> // [JJ] I doubt whether we should have 'name', 'type', etc. here. In the manifest we should
> // have entry points where the keyboard specifies which view to load when going into a
> // certain context. Requiring to do this manually will give extra work.
> // The system should guarantee that the right view is rendered based on entry_points
> // in the manifest (e.g. navigate keyboard to #text/en, or something, based on the manifest).
> // [Tim] I don't think they are exclusive. A keyboard app might choose to load the same page with the same hash
> // for different types but only to deal with the |type| or |inputmode| difference later.
> interface InputContext: EventTarget {
> // The tag name of input field, which is enum of "input", "textarea", or "contenteditable"
> DOMString name;
>
> // The type of the input field, which is enum of text, number, password, url, search, email, and so on.
> // See http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html#states-of-the-type-attribute
> DOMString type;
>
> /*
> * The inputmode string, representing the input mode.
> * See http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#input-modalities:-the-inputmode-attribute
> */
> DOMString inputmode;
>
> /*
> * The primary language for the input field.
> * It is the value of HTMLElement.lang.
> * See http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#htmlelement

> * @param modifiers see http://mxr.mozilla.org/mozilla-central/source/dom/interfaces/base/nsIDOMWindowUtils.idl#206

Masayuki Nakano

Jul 31, 2013, 10:23:51 PM

to Ehsan Akhgari, Yuan Xulei, Tim Guan-tin Chien, "Evan Tseng (曾增仁)", dev-webapi, James Ho, Evelyn Hung, Mounir Lamouri, Salvador de la Puente González, David Flanagan, Tomoya Asai, Jan Jongboom, Olli Pettay, Jonas Sicking

I've already been here ;-)

I'm going to attend a meeting with a Japanese IME vendor next week, and I
believe that they should join the API design, because I've never
created an IME on any platform and I'm not familiar with the details of
mobile IME behavior. So their suggestions should be more helpful.

I have a couple of questions:

1. Does the draft of the API document contain all of the discussion?
<https://wiki.mozilla.org/WebAPI/KeboardIME> I mean, is it the
latest plan for implementing the API?

2. Have you started implementing the API? If so, I'd like to see the source
code. Could you tell me the URL?

I think that we will probably use the draft at the meeting.

On 2013/08/01 2:33, Ehsan Akhgari wrote:
> I'm not sure if Masayuki has been looped in about this, but he knows a
> lot about IME, so CCing him on this thread.
>
> Ehsan

--
Masayuki Nakano <masa...@d-toybox.com>
Manager, Internationalization, Mozilla Japan.

Kan-Ru Chen (陳侃如)

Aug 1, 2013, 12:24:07 AM

to Masayuki Nakano, Ehsan Akhgari, David Flanagan, Evelyn Hung, dev-webapi, James Ho, Evan Tseng (曾增仁), Mounir Lamouri, Salvador de la Puente González, Tim Guan-tin Chien, Tomoya Asai, Jan Jongboom, Olli Pettay, Yuan Xulei, Jonas Sicking

Masayuki Nakano <masa...@d-toybox.com> writes:

> I've already been here ;-)
>
> I'm going to attend a meeting with a Japanese IME vendor next week, and I
> believe that they should join the API design, because I've never
> created an IME on any platform and I'm not familiar with the details of
> mobile IME behavior. So their suggestions should be more helpful.
>
> I have a couple of questions:
>
> 1. Does the draft of the API document contain all of the discussion?
> <https://wiki.mozilla.org/WebAPI/KeboardIME> I mean, is it the
> latest plan for implementing the API?

Yes this is the latest design of the API.

> 2. Have you started implementing the API? If so, I'd like to see the source
> code. Could you tell me the URL?

We have started implementing the API. The tracking bug is 737110. The
source code is still in patch form, and some WIPs are attached to the
individual bugs.

Kanru

Jonas Sicking

Aug 1, 2013, 2:41:07 AM

to Yuan Xulei, Ehsan Akhgari, Tim Chien, Evelyn Hung, dev-webapi, James Ho, Evan Tseng (曾增仁), Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay

Note that I think that we should still try to make it so that inline
elements are considered part of the same context. So if we have markup
like:

<div contenteditable=true>
here is some text
here is another paragraph of text
</div>

Then I think it would be good if that results in two different
inputcontexts. From the keyboard API's point of view the second
inputcontext contains the string "here is another paragraph of text".
I.e. the inline element is invisible to the keyboard app.

Would this be possible without performance issues?

Cool!

>> * Apply spell-fix automatically on lost focus.
>> When editing latin text, we sometimes want behavior similar to IME
>> composition characters. Consider the following scenario:
>>
>> 1. User starts messaging app and starts composing a message
>> 2. User types "see you tomorro"
>> 3. Keyboard shows as suggested spell fix "tomorrow", but doesn't
>> automatically fix the text yet since the user hasn't finished typing
>> the word.
>> 4. User presses "send" button
>>
>> At this point we need to give the keyboard the opportunity to fix the
>> spelling before the message is sent.
>
> For me it's not a good idea to change text after I press "send", as
> I cannot guarantee that the sent text is what I want. For example, my
> keyboard doesn't know the word `IDE` and mistakenly fixes it to `DIE`.
> I don't want it to become `DIE` automatically just before the message
> is sent.

I do understand that this can cause bad results. But my experience is
that it fixes far more bad results than it causes. And the user can
always see in the keyboard app that there's a suggested correction and
so the user will learn to look for that if he/she wants an unusual
word. That's something you have to do when you have such a word in the
middle of the text anyway.

I think ultimately this is a UX decision, and so we should ask the UX
team what behavior they want here. Either way I think we can live
without this for the initial release if needed.

Thanks for the link!

/ Jonas

Yuan Xulei

Aug 1, 2013, 4:37:34 AM

to Jonas Sicking, Ehsan Akhgari, Tim Chien, Evelyn Hung, dev-webapi, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay

Yes, I think so, but it may be complex to define and determine the
context border.
Why couldn't we treat the inline element as a different context?

Jonas Sicking

Aug 1, 2013, 4:58:11 AM

to Yuan Xulei, Ehsan Akhgari, Tim Chien, Evelyn Hung, dev-webapi, James Ho, Evan Tseng (曾增仁), Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay

Any element that isn't an inline element should be a context border.
The following elements are inline elements:
https://developer.mozilla.org/en-US/docs/HTML/Inline_elements

> Why couldn't we treat the inline element as a different context?

For the same reason that we don't use a separate context for each
word. Inline elements are often used to mark off a single word, and
sometimes even part of a word.

If we were to make each inline element a new context, that would mean
that the keyboard would lose context any time the user clicked in a
bolded or italicised word, discarding anything the keyboard app had
remembered. It would also mean that the keyboard couldn't do
predictive spell checking by inspecting the previous word.
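
The tag-name rule can be sketched as a tree walk. This is only an illustration over a simplified node model rather than the real DOM: the EditableNode type, the collectContexts function and the short INLINE_TAGS list (a small subset of the inline elements on the MDN page) are all assumptions made for the example.

```typescript
// Hypothetical sketch: split editable content into input contexts using
// tag names only. EditableNode is a simplified stand-in for DOM nodes.
type EditableNode =
  | { text: string }
  | { tag: string; children: EditableNode[] };

// Small illustrative subset of inline elements; the real list is longer.
const INLINE_TAGS = new Set(["a", "b", "code", "em", "i", "span", "strong"]);

function collectContexts(root: EditableNode): string[] {
  const contexts: string[] = [];
  let buffer = "";

  // Close the current context, skipping whitespace-only runs.
  const flush = (): void => {
    if (buffer.trim() !== "") {
      contexts.push(buffer);
    }
    buffer = "";
  };

  const visit = (node: EditableNode): void => {
    if ("text" in node) {
      buffer += node.text; // text always joins the current context
      return;
    }
    // Any element that is not inline is a context border on both sides.
    const isBorder = !INLINE_TAGS.has(node.tag);
    if (isBorder) flush();
    node.children.forEach(visit);
    if (isBorder) flush();
  };

  visit(root);
  return contexts;
}
```

With this rule, text inside a bold or italic element stays in the same context as the words around it, so the keyboard can still inspect the previous word.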

/ Jonas

"Yuan Xulei(袁徐磊)"

Aug 1, 2013, 5:37:49 AM

to Jonas Sicking, Ehsan Akhgari, Tim Chien, Evelyn Hung, dev-webapi, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay

Thanks, that's clear for me.

Ehsan Akhgari

Aug 1, 2013, 10:47:57 AM

to Jonas Sicking, Tim Chien, Evelyn Hung, dev-webapi, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Yuan Xulei

I would prefer that we not use tag names to determine whether something
is inline or not, and rely on the information from the style system
instead. There is nothing preventing an author from making a <span> a
block, for example, and our layout code for things like text selection,
etc. uses the style system information, so doing something different
here would lead to weird bugs.

>> Why couldn't we treat the inline element as a different context?
>
> For the same reason that we don't use a separate context for each
> word. Inline elements are often used to mark off a single word, and
> sometimes even part of a word.
>
> If we were to make each inline element a new context, that would mean
> that the keyboard would lose context any time the user clicked in a
> bolded or italicised word, discarding anything the keyboard app had
> remembered. It would also mean that the keyboard couldn't do
> predictive spell checking by inspecting the previous word.

Another thing to think about is how we would deal with content like this:

a <span>mispeld</span> work

Here, it's not clear to me what should happen if the keyboard API
attempts to replace "mispeld" with the correct spelling of the word, for
example. Should we remove the span? Or should we try to reconstruct
the original structure, and if so, what should the algorithm look like?

Ehsan

Jonas Sicking

Aug 1, 2013, 5:28:15 PM

to Ehsan Akhgari, Tim Chien, Evelyn Hung, dev-webapi, James Ho, Evan Tseng (曾增仁), Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Yuan Xulei

I think people turning <span>s into blocks is really rare, especially
within contenteditable areas. Rare enough that it's ok that the
keyboard doesn't see it as a context limit. The problems with getting
context limits wrong are really fairly small.

Paying attention to CSS would be a relatively big increase in
complexity. First of all, there are performance implications of having
to hit the style system for each element. Second, we'd have to monitor
any style changes and then create fairly complex rules for what
happens if a <span> is dynamically changed into a block.

What problems are you worried we'd hit if text selection and keyboard
handling treat block limits differently?

>>> Why couldn't we treat the inline element as a different context?
>>
>>
>> For the same reason that we don't use a separate context for each
>> word. Inline elements are often used to mark off a single word, and
>> sometimes even part of a word.
>>
>> If we were to make each inline element a new context, that would mean
>> that the keyboard would lose context any time the user clicked in a
>> bolded or italicised word, discarding anything the keyboard app had
>> remembered. It would also mean that the keyboard couldn't do
>> predictive spell checking by inspecting the previous word.
>
>
> Another thing to think about is how we would deal with content like this:
>
> a <span>mispeld</span> work
>
> Here, it's not clear to me what should happen if the keyboard API attempts
> to replace "mispeld" with the correct spelling of the word, for example.
> Should we remove the span? Or should we try to reconstruct the original
> structure, and if so, what should the algorithm look like?

I think we should use the same algorithms as ranges use. The keyboard
API provides a range that is to be replaced. If that range encompasses
the whole span, then it simply gets removed when the word is replaced.

There are some tricky situations like:
a mispeld word

This would likely result in the DOM looking something like the
following after spell correction
a misspelled word

Which isn't ideal. Does the editor code have smarts for fixing stuff
like that up?

/ Jonas

Ehsan Akhgari

Aug 1, 2013, 5:46:33 PM8/1/13

to Jonas Sicking, Tim Chien, Evelyn Hung, dev-webapi, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Yuan Xulei

On 2013-08-01 5:28 PM, Jonas Sicking wrote:
>> I prefer if we did not use tag names to determine whether something is
>> inline or not, and relied on the information from the style system instead.
>> There is nothing preventing an author from making an inline element a block, for example,
>> and our layout code for things like text selection, etc uses the style
>> system information and doing something different here would lead to weird
>> bugs.
>
> I think people turning inline elements into blocks is really rare. Especially
> within contenteditable areas. Rare enough that it's ok that the
> keyboard doesn't see it as a context limit. The problems with getting
> context limits wrong are really fairly small.
>
> Paying attention to CSS would be a relatively big increase in
> complexity. First of all, there are performance implications of having to
> hit the style system for each element. Second, we'd have to monitor
> any style changes and then create fairly complex rules for what
> happens if an inline element is dynamically changed into a block.

Yes, agreed. The first problem isn't that expensive if you avoid
modifying the style tree (since the information would mostly be there
when you need it.) The second problem is a lot harder to solve though.

> What problems are you worried we'd hit if text selection and keyboard
> handling treat block limits differently?

I don't have a lot of examples in mind right now, but for example, it
would be weird for text selection to treat something as two separate
entities, and for the keyboard to treat them as one, or vice versa.

I mean, the problem really stems from the fact that it is the layout
system that determines what users will see on their screens.

>> Another thing to think about is how we would deal with content like this:
>>
>> a mispeld work
>>
>> Here, it's not clear to me what should happen if the keyboard API attempts
>> to replace "mispeld" with the correct spelling of the word, for example.
>> Should we remove the span? Or should we try to reconstruct the original
>> structure, and if so, what should the algorithm look like?
>
> I think we should use the same algorithms as ranges use. The keyboard
> API provides a range that is to be replaced. If that range encompasses
> the whole span, then it simply gets removed when the word is replaced.
>
> There are some tricky situations like:
> a mispeld word
>
> This would likely result in the DOM looking something like the
> following after spell correction
> a misspelled word
>
> Which isn't ideal. Does the editor code have smarts for fixing stuff
> like that up?

Yes, it does. It can handle things like splitting elements when
necessary, and joining them as well. These hacks are not exactly specified
anywhere though, and they're sometimes fragile. If we wanted to, we
could use most of the editor code for this purpose.

Cheers,
Ehsan

Jonas Sicking

Aug 1, 2013, 6:49:09 PM8/1/13

to Ehsan Akhgari, Tim Chien, Evelyn Hung, dev-webapi, James Ho, Evan Tseng (曾增仁), Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Yuan Xulei

I think this problem is rare enough, complex enough to solve, and
doesn't disturb users enough that we should ignore it for now.

>>> Another thing to think about is how we would deal with content like this:
>>>
>>> a mispeld work
>>>
>>> Here, it's not clear to me what should happen if the keyboard API
>>> attempts
>>> to replace "mispeld" with the correct spelling of the word, for example.
>>> Should we remove the span? Or should we try to reconstruct the original
>>> structure, and if so, what should the algorithm look like?
>>
>>
>> I think we should use the same algorithms as ranges use. The keyboard
>> API provides a range that is to be replaced. If that range encompasses
>> the whole span, then it simply gets removed when the word is replaced.
>>
>> There are some tricky situations like:
>> a mispeld word
>>
>> This would likely result in the DOM looking something like the
>> following after spell correction
>> a misspelled word
>>
>> Which isn't ideal. Does the editor code have smarts for fixing stuff
>> like that up?
>
> Yes, it does. It can handle things like splitting elements when necessary,
> and joining them as well. These hacks are not exactly specified anywhere
> though, and they're sometimes fragile. If we wanted to, we could use most
> of the editor code for this purpose.

That would be great. I don't think we ever need to split anything
given that the keyboard can't set styles. But it might need to merge
things and we might need to remove useless elements.

/ Jonas

Yuan Xulei

Aug 1, 2013, 11:26:34 PM8/1/13

to Ehsan Akhgari, Tim Chien, "Evan Tseng (曾增仁)", dev-webapi, James Ho, Evelyn Hung, Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Jonas Sicking

I'm not familiar with the style system. Could you give a hint or an example
of how to achieve this?

>
>>> Why couldn't we treat the inline element as different context?
>>
>> For the same reason that we don't use a separate context for each
>> word. Inline elements are often used to mark off a single word, and
>> sometimes even part of a word.
>>
>> If we were to make each inline element a new context, that would mean
>> that the keyboard would lose context any time that the user clicked
>> in a bolded or italicised word which could lose any context that
>> keyboard app had remembered. It would also mean that the keyboard
>> couldn't do predictive spell checking by inspecting the previous word.
>
> Another thing to think about is how we would deal with content like this:
>
> a mispeld work
>
> Here, it's not clear to me what should happen if the keyboard API
> attempts to replace "mispeld" with the correct spelling of the word,
> for example. Should we remove the span? Or should we try to
> reconstruct the original structure, and if so, what should the
> algorithm look like?

I prefer removing the span and letting the user re-apply the style to
the replaced word.
Sometimes it is impossible for an algorithm to determine where the new
span should start.
>
> Ehsan
>

Ehsan Akhgari

Aug 2, 2013, 3:34:45 PM8/2/13

to Jonas Sicking, Tim Chien, Evelyn Hung, dev-webapi, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Yuan Xulei

Fair enough.

>>>> Another thing to think about is how we would deal with content like this:
>>>>
>>>> a mispeld work
>>>>
>>>> Here, it's not clear to me what should happen if the keyboard API
>>>> attempts
>>>> to replace "mispeld" with the correct spelling of the word, for example.
>>>> Should we remove the span? Or should we try to reconstruct the original
>>>> structure, and if so, what should the algorithm look like?
>>>
>>>
>>> I think we should use the same algorithms as ranges use. The keyboard
>>> API provides a range that is to be replaced. If that range encompasses
>>> the whole span, then it simply gets removed when the word is replaced.
>>>
>>> There are some tricky situations like:
>>> a mispeld word
>>>
>>> This would likely result in the DOM looking something like the
>>> following after spell correction
>>> a misspelled word
>>>
>>> Which isn't ideal. Does the editor code have smarts for fixing stuff
>>> like that up?
>>
>> Yes, it does. It can handle things like splitting elements when necessary,
>> and joining them as well. These hacks are not exactly specified anywhere
>> though, and they're sometimes fragile. If we wanted to, we could use most
>> of the editor code for this purpose.
>
> That would be great. I don't think we ever need to split anything
> given that the keyboard can't set styles. But it might need to merge
> things and we might need to remove useless elements.

The editor does the splits and merges internally. The caller doesn't
get a say over what it does. :-)

Ehsan

Ehsan Akhgari

Aug 2, 2013, 3:36:43 PM8/2/13

to Yuan Xulei, Tim Chien, "Evan Tseng (曾增仁)", dev-webapi, James Ho, Evelyn Hung, Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Jonas Sicking

JS callers can do this by calling window.getComputedStyle. C++ callers
need to get the frame for a content node by calling GetFrame() on it and
then calling StyleDisplay() etc on the frame.
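
For the JS side, a rough sketch of what such a check might look like. Only window.getComputedStyle is a real API here; the helper name isBlockLevelDisplay and the set of display values are assumptions made up for illustration, not part of the proposal.

```javascript
// Hypothetical helper: decide whether a computed `display` value should be
// treated as a context boundary. The set of values below is an illustrative
// assumption, not an exhaustive list.
const BLOCK_LIKE_DISPLAYS = new Set(["block", "list-item", "table", "flex", "grid"]);

function isBlockLevelDisplay(display) {
  return BLOCK_LIKE_DISPLAYS.has(display);
}

// In a page, the value would come from the style system, e.g.:
//   const display = window.getComputedStyle(element).display;
//   if (isBlockLevelDisplay(display)) { /* treat as a context limit */ }
```

A check like this reflects the style at the moment it is called; it does nothing about style changes that happen afterwards.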

>>>> Why couldn't we treat the inline element as different context?
>>>
>>> For the same reason that we don't use a separate context for each
>>> word. Inline elements are often used to mark off a single word, and
>>> sometimes even part of a word.
>>>
>>> If we were to make each inline element a new context, that would mean
>>> that the keyboard would lose context any time that the user clicked
>>> in a bolded or italicised word which could lose any context that
>>> keyboard app had remembered. It would also mean that the keyboard
>>> couldn't do predictive spell checking by inspecting the previous word.
>>
>> Another thing to think about is how we would deal with content like this:
>>
>> a mispeld work
>>
>> Here, it's not clear to me what should happen if the keyboard API
>> attempts to replace "mispeld" with the correct spelling of the word,
>> for example. Should we remove the span? Or should we try to
>> reconstruct the original structure, and if so, what should the
>> algorithm look like?
> I prefer removing the span and letting the user re-apply the style to
> the replaced word.
> Sometimes it is impossible for an algorithm to determine where the new
> span should start.

Note that if you decide to use the editor to take care of this, you
wouldn't need to worry about handling this explicitly.

Ehsan

Jonas Sicking

Aug 4, 2013, 4:21:58 AM8/4/13

to Ehsan Akhgari, Tim Chien, Evelyn Hung, dev-webapi, James Ho, Evan Tseng (曾增仁), Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Yuan Xulei

I think you would also have to explicitly flush style, and watch out
for style changes which can happen later.

So I would recommend not messing with this. At least not for now.

/ Jonas

"Yuan Xulei(袁徐磊)"

Aug 23, 2013, 9:18:44 AM8/23/13

to dev-webapi, Ehsan Akhgari, David Flanagan, Evelyn Hung, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, Tim Chien, Jan Jongboom, Olli Pettay, Jonas Sicking

Hi all,

Thanks for all your help. The new Keyboard API is being implemented
and most of it is done. See bug 737110 for
details (https://bugzilla.mozilla.org/show_bug.cgi?id=737110).

Before we complete the implementation, we'd like to introduce a few
changes, described below.

1. Add two attributes to the InputContext interface so that the IME can
get the surrounding text directly.
This lets the IME read the text around the cursor without an extra
asynchronous call to InputContext#getText.
Without these attributes, the IME needs to call InputContext#getText to
initialize auto-correction.

The cost of adding these attributes is low: their values are already
passed to onsurroundingtextchange, so all we need to do is save them
as attributes.

interface InputContext: EventTarget {
...
// The text before and after the beginning of the selected text.
readonly attribute DOMString textBeforeCursor;
readonly attribute DOMString textAfterCursor;
...
}
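
As a rough illustration of how an IME might use the two attributes to seed auto-correction without calling InputContext#getText: the helper wordAtCursor and the plain object standing in for an InputContext are hypothetical, and splitting on whitespace is a simplifying assumption.

```javascript
// Hypothetical helper: extract the word surrounding the cursor from the two
// proposed attributes. Treating whitespace as the only word boundary is a
// simplifying assumption.
function wordAtCursor(textBeforeCursor, textAfterCursor) {
  const head = /\S*$/.exec(textBeforeCursor)[0]; // trailing non-space run
  const tail = /^\S*/.exec(textAfterCursor)[0];  // leading non-space run
  return head + tail;
}

// A plain object standing in for an InputContext that exposes the attributes.
const fakeContext = { textBeforeCursor: "a misp", textAfterCursor: "eld word" };
const word = wordAtCursor(fakeContext.textBeforeCursor, fakeContext.textAfterCursor);
console.log(word); // the word the cursor currently sits in
```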

2. Change the return type from `Promise<boolean>` to `Promise<void>`.

Several methods of InputContext, such as `setSelectionRange`, use
`Promise<boolean>` as the return type to indicate whether the operation
succeeded. This is unnecessary, because the Promise's reject callback
already signals failure. We can change the return type to `Promise<void>`.
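
A small sketch of the calling pattern this implies. The function fakeSetSelectionRange is a stand-in written for this example, not the real InputContext method; its extra text parameter, validation rule, and error message are assumptions.

```javascript
// Stand-in for a setSelectionRange-style method under the proposed change:
// it resolves with no value on success and rejects on failure.
function fakeSetSelectionRange(text, start, length) {
  return new Promise((resolve, reject) => {
    if (start < 0 || length < 0 || start + length > text.length) {
      reject(new Error("selection range out of bounds"));
      return;
    }
    resolve(); // success carries no boolean
  });
}

// The caller distinguishes success from failure by which callback runs.
fakeSetSelectionRange("hello", 0, 5).then(
  () => console.log("selection set"),
  (err) => console.log("failed: " + err.message)
);
```

With this shape, a caller never has to check a resolved `false` in addition to handling rejection.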

I have added these changes to the
wiki (https://wiki.mozilla.org/WebAPI/KeboardIME#Proposed_API).

Waiting for your comments and feedback.

Have a nice weekend.

Yuan

Ehsan Akhgari

Aug 23, 2013, 3:42:42 PM8/23/13

to "Yuan Xulei(袁徐磊)", Tim Chien, Evelyn Hung, dev-webapi, James Ho, "Evan Tseng (曾增仁)", Mounir Lamouri, Salvador de la Puente González, David Flanagan, Jan Jongboom, Olli Pettay, Jonas Sicking

On 2013-08-23 9:18 AM, "Yuan Xulei(袁徐磊)" wrote:
> Hi all,
>

These changes look good to me!

Ehsan

janjo...@gmail.com

Sep 4, 2013, 5:49:56 AM9/4/13

to

+1