emacs 24's forward-char vs right-char behavior

Xah Lee

Apr 24, 2012, 6:19:29 PM
if it's not too late, i think the semantics of “forward-char” and
“right-char” should be exchanged (from what it currently is in emacs
24).

“forward-char”'s direction should be context sensitive.

“right-char” should always move to the right.

At first i thought emacs can't do that because lots of elisp code depends
on “forward-char”'s existing behavior.

But on second thought, am thinking it's probably rare that elisp is
used to process {Arabic, Persian (Iran), Hebrew} languages. After all,
these weren't supported up to now. So, perhaps changing “forward-
char”'s behavior isn't too bad?

Xah

Joost Kremers

Apr 24, 2012, 7:07:58 PM
Xah Lee wrote:
> if it's not too late, i think the semantics of “forward-char” and
> “right-char” should be exchanged (from what it currently is in emacs
> 24).
>
> “forward-char”'s direction should be context sensitive.

i think you're somehow mistaken, because it already is. from C-h f
forward-char RET:

====
Depending on the bidirectional context, the movement may be to the
right or to the left on the screen. This is in contrast with
<right>, which see.
====

> “right-char” should always move to the right.

which it does already.

> But on second thought, am thinking it's probably rare that elisp is
> used to process {Arabic, Persian (Iran), Hebrew} languages.

actually, i'm doing that right now. ;-)

--
Joost Kremers joostk...@yahoo.com
Selbst in die Unterwelt dringt durch Spalten Licht
EN:SiS(9)

Eli Zaretskii

Apr 25, 2012, 2:07:04 AM
to help-gn...@gnu.org
> From: Xah Lee <xah...@gmail.com>
> Date: Tue, 24 Apr 2012 15:19:29 -0700 (PDT)
>
> if it's not too late, i think the semantics of “forward-char” and
> “right-char” should be exchanged (from what it currently is in emacs
> 24).
>
> “forward-char”'s direction should be context sensitive.

What exactly do you mean by "direction"? There are 2 "directions"
involved here: the direction of the movement in the buffer (either
forward or backward), and the direction on the screen (either right or
left). 'forward-char' always moves forward in the buffer, but as a
result could move either to the right or to the left on the screen,
because with bidirectional languages screen position is no longer
monotonically increasing with buffer position.

> “right-char” should always move to the right.

It mostly does. It always does move to the right when the surrounding
text is either all made of left-to-right characters (which is what
happens in Latin languages, for example), and also when the
surrounding text is all made of right-to-left characters. If the
surrounding text is mixed L2R and R2L, then 'right-char' switches its
screen direction, but still generally moves to the right,
i.e. movements to the left are normally short (e.g., when short
sequences of Latin text or numbers are embedded in a generally
right-to-left text).

IOW, Emacs 24 implements the so-called "logical" cursor motion.
Visual cursor motion, the one where 'right-char' would always move to
the right, no matter what the surrounding text, is not implemented.

> At first i thought emacs can't do that because lots elisp code depends
> on “forward-char”'s existing behavior.

That's one reason. The more important one is that reordering of
bidirectional text in Emacs is a display-only feature. It was
designed not to affect any buffer-related commands and movements.
Therefore, 'forward-char' must still move in the forward direction,
i.e. in the direction of increasing buffer or string positions.

> But on second thought, am thinking it's probably rare that elisp is
> used to process {Arabic, Persian (Iran), Hebrew} languages.

Emacs 24 includes full support for editing and displaying plain text
in these languages. And there was no conflict related to
'forward-char' in adding that support that would require a kind of
compromise you are hinting at.

Btw, the cursor movement implemented in Emacs 24 is the same one you
will see in MS products, like Word or Notepad. Emacs didn't invent
anything in this area, at least from the user POV.

> So, perhaps changing “forward-char”'s behavior isn't too bad?

The behavior of 'forward-char' didn't change at all. It still moves
forward.


Xah Lee

Apr 25, 2012, 3:43:22 AM
Thanks Joost, and Eli for the informative answer.

first, here's the emacs version i'm testing with.
GNU Emacs 24.0.93.1 (i386-mingw-nt6.1.7601) of 2012-02-15 on MARVIN

Now, paste this sentence in emacs “(كتاب ألف ليلة و ليلة)”. Then, hold
down right arrow key (which is bound to “right-char”), then when
cursor moves into the Arabic text, it'll suddenly reverse direction,
and move right to left, until it reaches the left most arabic char
sequence, it'll jump back to the english text and continue move right.

Now, do the same but using “forward-char” 【Ctrl+f】. Actually, the same
behavior is observed visually!

from Eli's post, it seems to be the expected behavior. But then what's
the difference of forward-char and right-char? Am totally confused
now.

In emacs 23, holding right arrow (or Ctrl+f) simply move cursor to the
right, ALWAYS. I was expecting this from emacs 24's “right-char”.

Xah

Joost Kremers

Apr 25, 2012, 4:16:15 AM
Eli Zaretskii wrote:
> Btw, the cursor movement implemented in Emacs 24 is the same one you
> will see in MS products, like Word or Notepad. Emacs didn't invent
> anything in this area, at least from the user POV.

with the exception that IMHO it's implemented better in emacs. it's much
less painful to edit mixed text in emacs than what i remember in word or
other programs. especially the automatic paragraph alignment is great.
libreoffice to this day displays a full stop at the end of a paragraph in
arabic (r2l) text at the *beginning* of that paragraph...

so hats off to the emacs devs who worked on this. great job.

Eli Zaretskii

Apr 25, 2012, 4:21:04 AM
to help-gn...@gnu.org
> From: Xah Lee <xah...@gmail.com>
> Date: Wed, 25 Apr 2012 00:43:22 -0700 (PDT)
>
> Now, paste this sentence in emacs “(كتاب ألف ليلة و ليلة)”. Then, hold
> down right arrow key (which is bound to “right-char”), then when
> cursor moves into the Arabic text, it'll suddenly reverse direction,
> and move right to left, until it reaches the left most arabic char
> sequence, it'll jump back to the english text and continue move right.
>
> Now, do the same but using “forward-char” 【Ctrl+f】. Actually, the same
> behavior is observed visually!
>
> from Eli's post, it seems to be the expected behavior.

Indeed, expected behavior.

> But then what's the difference of forward-char and right-char? Am
> totally confused now.

Don't feel bad: this bidi business is complicated, especially for
someone who is not a native speaker of one of the bidi languages.

To see the difference between forward-char and right-char, do this:

emacs -Q
C-x b foo RET

Now paste the string "(كتاب ألف ليلة و ليلة)" into the buffer "foo"
you just created, and then try both C-f and <right>. See the
difference now?

Explanation: the difference only shows up in paragraphs whose "base
direction" is right-to-left. (See the Emacs manual's "Bidirectional
Editing" node for more about this.) In the *scratch* buffer, all
paragraphs are forced to be left-to-right, because *scratch* is mostly
used for code snippets. When you create a new buffer "foo", its
default value of bidi-paragraph-direction is nil, which means Emacs
determines the direction from the text of the paragraph. Pasting
Arabic text causes Emacs to treat the paragraph as right-to-left and
render it starting at the right margin of the window. As a side
effect, that affects the behavior of <right> vs forward-char.

> In emacs 23, holding right arrow (or Ctrl+f) simply move cursor to the
> right, ALWAYS. I was expecting this from emacs 24's “right-char”.

Type "C-h k <right>", and you will see that the commands bound to this
key in Emacs 23 and Emacs 24 are different. Then follow the link to
the code of right-char in Emacs 24, and look at its definition. I
think the code is self-explanatory.
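
For readers without the Emacs source at hand, the behavior described above can be modelled in a few lines. This is only a Python sketch of the logic as explained in this thread, not the actual Emacs Lisp definition; the function names and the direction strings are made up for the illustration, and buffer boundaries are ignored.

```python
def forward_char(point, n=1):
    """Logical motion: always toward higher buffer positions."""
    return point + n


def right_char(point, paragraph_direction, n=1):
    """Model of <right>: the buffer direction depends on the paragraph.

    In a left-to-right paragraph, moving right on screen means moving
    forward in the buffer; in a right-to-left paragraph it means moving
    backward. Either way point changes by exactly n buffer positions.
    """
    if paragraph_direction == "left-to-right":
        return point + n
    if paragraph_direction == "right-to-left":
        return point - n
    raise ValueError("unknown paragraph direction: %r" % (paragraph_direction,))


# Paragraph forced left-to-right (as in *scratch*): both commands agree.
print(forward_char(10), right_char(10, "left-to-right"))   # 11 11
# Paragraph detected as right-to-left (as in buffer "foo"): they differ.
print(forward_char(10), right_char(10, "right-to-left"))   # 11 9
```

This is why the two commands look identical in *scratch* and only diverge once the paragraph's base direction is right-to-left.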


Joost Kremers

Apr 25, 2012, 4:32:36 AM
Xah Lee wrote:
> Now, paste this sentence in emacs “(كتاب ألف ليلة و ليلة)”. Then, hold
> down right arrow key (which is bound to “right-char”), then when
> cursor moves into the Arabic text, it'll suddenly reverse direction,
> and move right to left, until it reaches the left most arabic char
> sequence, it'll jump back to the english text and continue move right.

ah, *now* i see what you're referring to... (the stuff i'm working on right
now basically involves only r2l or l2r paragraphs, not paragraphs with
mixed text.)

> Now, do the same but using “forward-char” 【Ctrl+f】. Actually, the same
> behavior is observed visually!

yes, true.

i have no idea how difficult it would be to make right/left-char always
move to the right/left even in cases of mixed paragraphs, but i suspect it
wouldn't be easy to get it right. it'd involve checking if right/left-char
happens to move over a direction switch and if so, searching
forward/backward to find the buffer position where the direction switches
again and jump there.

but then again, that's basically what is happening right now, except not in
the buffer but on the screen.

> from Eli's post, it seems to be the expected behavior. But then what's
> the difference of forward-char and right-char? Am totally confused
> now.

the difference is with paragraphs that are completely l2r or r2l. c&p the
following into an empty emacs buffer (or put blank lines before and after):

كانت الخبازة تخبز أفضل خبز في المدينة ولذلك أحبها الجميع. كانت تخرج
الخبز الطازج من فرنها الكبير كل صباح. كانت رائحته تعمُ في جميع أنحاء
الشارع.

the text will be right-aligned, as it's an all-arabic paragraph. now move
the cursor through it with forward-char and with right-char, you'll see
they behave differently. though both behave in the way you'd logically
expect.

Eli Zaretskii

Apr 25, 2012, 4:50:53 AM
to help-gn...@gnu.org
> From: Joost Kremers <joostk...@yahoo.com>
> Date: 25 Apr 2012 08:32:36 GMT
>
> i have no idea how difficult it would be to make right/left-char always
> move to the right/left even in cases of mixed paragraphs, but i suspect it
> wouldn't be easy to get it right. i'd involve checking if right/left-char
> happens to move over a direction switch and if so, searching
> forward/backward to find the buffer position where the direction switches
> again and jump there.

Yes, it's not easy at all. Especially if you think about such
complications as moving cursor with the Shift key pressed, which marks
the region you move across. With the visual cursor motion, the intent
of the user wrt which buffer positions should be included in the
region is ambiguous.

> the difference is with paragraphs that are completely l2r or r2l. c&p the
> following into an empty emacs buffer (or put blank lines before and after):

> كانت الخبازة تخبز أفضل خبز في المدينة ولذلك أحبها الجميع. كانت تخرج
> الخبز الطازج من فرنها الكبير كل صباح. كانت رائحته تعمُ في جميع أنحاء
> الشارع.

> the text will be right-aligned, as it's an all-arabic paragraph.

More accurately, a paragraph is considered right-to-left if its first
strong directional character (after skipping punctuation, digits,
etc.) is R2L.


Joost Kremers

Apr 25, 2012, 8:48:28 AM
Eli Zaretskii wrote:
>> the text will be right-aligned, as it's an all-arabic paragraph.
>
> More accurately, a paragraph is considered right-to-left if its first
> strong directional character (after skipping punctuation, digits,
> etc.) is R2L.

yes, i found that out the hard way (started a paragraph that was meant to
be r2l with an english word... ;-)

Eli Zaretskii

Apr 26, 2012, 7:17:03 AM
to help-gn...@gnu.org
> From: Joost Kremers <joostk...@yahoo.com>
> Date: 25 Apr 2012 12:48:28 GMT
>
> Eli Zaretskii wrote:
> >> the text will be right-aligned, as it's an all-arabic paragraph.
> >
> > More accurately, a paragraph is considered right-to-left if its first
> > strong directional character (after skipping punctuation, digits,
> > etc.) is R2L.
>
> yes, i found that out the hard way (started a paragraph that was meant to
> be r2l with an english word... ;-)

You can still have such a paragraph displayed right-to-left if you
insert the RLM (U+200F) character at its beginning.
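
The "first strong directional character" rule, the pitfall of starting with an English word, and the RLM fix can all be tried outside Emacs. The following is a rough Python sketch using the standard `unicodedata` module; the function name `base_direction` and the "L2R"/"R2L" labels are invented for the illustration, and the explicit embedding and isolate controls that the full Unicode algorithm handles are ignored here.

```python
import unicodedata

# Bidi classes that count as "strong": L is left-to-right,
# R and AL are right-to-left (e.g. Hebrew and Arabic letters).
STRONG_L2R = {"L"}
STRONG_R2L = {"R", "AL"}


def base_direction(paragraph, default="L2R"):
    """Return "L2R" or "R2L" from the first strong directional character.

    Punctuation, digits and spaces are skipped because their bidi
    classes are not strong. If no strong character is found, the
    default is returned.
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_L2R:
            return "L2R"
        if bidi_class in STRONG_R2L:
            return "R2L"
    return default


RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, but a strong R2L character

arabic = "(كتاب ألف ليلة و ليلة)"
mixed = "Emacs كتاب"

# Prints: R2L (the leading parenthesis is skipped)
print(base_direction(arabic))
# Prints: L2R (the English word comes first)
print(base_direction(mixed))
# Prints: R2L (the RLM is now the first strong character)
print(base_direction(RLM + mixed))
```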

Xah Lee

Apr 26, 2012, 10:24:06 AM
Eli and Joost, thanks a lot for the explanation.

Just one curious question, can't right-char always move to the right
even in mixed R2L/L2R situations? is it because logically it
shouldn't?

Xah

Eli Zaretskii

Apr 26, 2012, 11:00:50 AM
to help-gn...@gnu.org
> From: Xah Lee <xah...@gmail.com>
> Date: Thu, 26 Apr 2012 07:24:06 -0700 (PDT)
>
> can't right-char always move to the right even in mixed R2L/L2R
> situations?

It can: that's what "visual" cursor motion is about. But moving
always to the right requires that after each move, Emacs figures out
the value of point, because moving by one screen position no longer
means moving by one buffer position; it could potentially mean jumping
over many buffer positions.

By contrast, the current "logical" cursor motion always moves point by
one buffer position, then positions the cursor accordingly. The only
new aspect of right-char in Emacs 24 is that sometimes it moves
forward in the buffer and sometimes backward. But it still moves only
one buffer position.

The significant difference between logical and visual cursor motion is
that the former doesn't care about what's on the screen, because it
moves by buffer positions. Therefore, it can easily and predictably
do its job even when the Emacs display is not up to date (e.g.,
because Emacs cannot keep up with input events, or has some prolonged
command running). By contrast, visual cursor movement needs either to
consult the contents of the screen (which would mean difficulties when
you move the cursor off the screen, or if the display is not yet up to
date), or figure out how the text _would_ be displayed (which means
each <right> keypress will need to do much more work than it does
now). These difficulties are the reason why visual cursor motion was
not yet implemented in Emacs.
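
To make the contrast concrete, here is a toy Python model of the two kinds of motion. It is only a sketch under strong assumptions: the paragraph is left-to-right, each maximal run of strong right-to-left characters is simply reversed for display, and neutrals, digits and nested levels (which the real Unicode Bidirectional Algorithm resolves) are ignored. All function names are made up for the illustration.

```python
import unicodedata


def is_r2l(ch):
    """True for strong right-to-left characters (bidi class R or AL)."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


def visual_order(text):
    """Map screen columns (left to right) to buffer positions.

    Toy reordering: every maximal run of strong R2L characters is
    reversed, everything else stays where it is.
    """
    order = []
    i = 0
    while i < len(text):
        if is_r2l(text[i]):
            j = i
            while j < len(text) and is_r2l(text[j]):
                j += 1
            order.extend(range(j - 1, i - 1, -1))  # the run, reversed
            i = j
        else:
            order.append(i)
            i += 1
    return order


def logical_forward(point):
    """Logical motion: one buffer position, no display knowledge needed."""
    return point + 1


def visual_right(text, point):
    """Visual motion: one screen column right; needs the reordering."""
    order = visual_order(text)
    column = order.index(point)
    return order[min(column + 1, len(order) - 1)]


text = "abc\u05d0\u05d1\u05d2def"  # "abc", three Hebrew letters, "def"
print(visual_order(text))      # [0, 1, 2, 5, 4, 3, 6, 7, 8]
print(logical_forward(2))      # 3, which is displayed at screen column 5
print(visual_right(text, 2))   # 5, a jump of three buffer positions
```

Note that `logical_forward` never looks at the text at all, while `visual_right` has to reorder it on every call, which is the extra work per keypress mentioned above.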

Joost Kremers

Apr 26, 2012, 2:19:08 PM
Eli Zaretskii wrote:
> You can still have such a paragraph displayed right-to-left if you
> insert the RLM (U+200F) character at its beginning.

ah, thanks for the info. didn't know that.

Eli Zaretskii

Apr 26, 2012, 4:15:48 PM
to help-gn...@gnu.org
> From: Joost Kremers <joostk...@yahoo.com>
> Date: 26 Apr 2012 18:19:08 GMT
>
> Eli Zaretskii wrote:
> > You can still have such a paragraph displayed right-to-left if you
> > insert the RLM (U+200F) character at its beginning.
>
> ah, thanks for the info. didn't know that.

It's in the manual, FWIW.

Jason Rumney

Apr 26, 2012, 9:58:25 PM
to help-gn...@gnu.org
On Wednesday, 25 April 2012 16:50:53 UTC+8, Eli Zaretskii wrote:

> Yes, it's not easy at all. Especially if you think about such
> complications as moving cursor with the Shift key pressed, which marks
> the region you move across. With the visual cursor motion, the intent
> of the user wrt which buffer positions should be included in the
> region is ambiguous.

Indeed. When I tested just now, I observed that Chrome uses visual
cursor motion when merely moving the cursor, but switches to logical
cursor motion when selecting text. I'm not sure which is more
confusing: logical cursor motion when moving, or changing behaviour
when selecting text. But you definitely need logical motion when
selecting text.
