I don't know about the word-boundary behaviour in vim, or the elisp code for it, but the behaviour of backward-kill-word is simple: kill the previous word, where a word is a run of alphanumeric characters. Any non-alphanumeric characters, like : and (, that lie between point and that word are deleted along with it. There is no concept here of : or ( being word boundaries. kill-word (M-d) works the same way in the forward direction: with point at the beginning of the string, M-d on ":67a" deletes the whole thing, while on "67a:" the : remains.
--- On Wed, 6/16/10, Paul Drummond <paul.d...@iode.co.uk> wrote:
Emacs doesn't so much care about word-boundaries as about words.
So when you forward-word, it just skips to the end of the next word,
where "abc" is a word, but ";-( )" is not.
So in many cases, it ends up doing in one step what VI would do in two:
first skip over the non-word chars, and then skip the next few
word-chars, whereas VI would stop after the run of non-word chars and
stop again after the subsequent run of word chars.
I don't think there's a very good reason for doing it like Emacs vs doing
it like VI. Each one has its advantages. VI's approach stops more
often, so there's less chance that it'll skip the position in which
you're interested, which is why you like it. In Emacs's approach OTOH
you'll often get away with fewer operations.
Stefan
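To see the one-step behaviour described above, here is a minimal elisp
sketch. It runs in a temporary buffer, so it assumes the default
(standard) syntax table rather than any particular major mode.

```elisp
;; Call `forward-word' twice on a short string and record where point
;; lands each time.
(with-temp-buffer
  (insert "abc ;-( ) def")
  (goto-char (point-min))
  (forward-word 1)
  (let ((first-stop (point)))
    ;; The second call skips the run ";-( )" and the word "def" in one go.
    (forward-word 1)
    (list first-stop (point))))
;; => (4 14)
```

Point stops after "abc" and after "def"; the non-word run in between
never gets a stop of its own.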
> Again, vim does the right thing here - pressing 'b' takes the point to
> the closing bracket of Page(this) so it doesn't recognise the semi-colon
> as a bracket which is intuitive and what I would expect. This is really
> the point I am trying to make. I have never taken the time to
> understand the behaviour of word boundaries in Vim because *it just
> works*. In Emacs I am forced to think about word boundaries because
> Emacs keeps surprising me with its weird behaviour!
I never thought about this issue actively. I do have a vague recollection of
facing it when I first moved back from vi to Emacs.
Separating words and word boundaries feels more semantic and less mechanical.
And it seems that you can get more done with the same key binding than we
currently can. Seems like a good idea to implement it:
forward-word-or-boundary, kill-word-or-boundary, ...
My example would be, say "apples, oranges and peaches". Now think of deleting
"apples, ".
Cheers,
Uday
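A rough sketch of what such commands could look like follows. The names
my-forward-word-or-boundary and my-kill-word-or-boundary are made up
here after the suggestion above; nothing like them ships with Emacs,
and the definition of a "boundary" (a run of non-word, non-blank
characters) is only one possible reading.

```elisp
;; Hypothetical commands, not part of Emacs.
(defun my-forward-word-or-boundary ()
  "Move over one run of word chars, or one run of other non-blank chars.
Trailing whitespace is skipped too, so point lands on the next run."
  (interactive)
  (or (> (skip-syntax-forward "w") 0)     ; a run of word characters
      (> (skip-syntax-forward "^w ") 0))  ; else a run of punctuation etc.
  (skip-syntax-forward " "))

(defun my-kill-word-or-boundary ()
  "Kill from point to where `my-forward-word-or-boundary' would move."
  (interactive)
  (kill-region (point)
               (progn (my-forward-word-or-boundary) (point))))
```

With point at the start of "apples, oranges and peaches", calling
my-kill-word-or-boundary twice kills "apples" and then ", ", leaving
"oranges and peaches".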
To be more specific, I think it depends on what the syntax table of
the active mode looks like. You can make your own syntax table to
change the behavior of "word commands" to some extent.
--
Deniz Dogan
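As an illustration of the syntax-table point, here is a small sketch
that runs the same forward-word under two tables. The helper name
my-first-word-end is invented for this example, and the choice of "_"
as the character to reclassify is just an assumption for the demo.

```elisp
;; Hypothetical helper: where does one `forward-word' land in TEXT when
;; TABLE is the active syntax table?
(defun my-first-word-end (text table)
  "Return point after one `forward-word' in TEXT under syntax TABLE."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (with-syntax-table table
      (forward-word 1))
    (point)))

;; A fresh table inherits from the standard syntax table, where "_" has
;; symbol syntax; the modified copy treats it as a word constituent.
(let ((plain (make-syntax-table))
      (custom (make-syntax-table)))
  (modify-syntax-entry ?_ "w" custom)
  (list (my-first-word-end "foo_bar baz" plain)     ; stops after "foo"
        (my-first-word-end "foo_bar baz" custom)))  ; stops after "foo_bar"
;; => (4 8)
```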
Good point.
I remember i felt something similar some 5 or 7 years ago and was
annoyed. But now i can't remember any detail... i just got used to
emacs and can't say i find it a problem at all.
Actually, i think the point is a valid one, and the details are a bit
technically involved.
i'll have to study this in detail some other day, but here are some
points.
For testing, save a file with this line as content:
something in the water does not compute
Now, you can try the word movement in different editors.
I tested this on Notepad, Notepad++, vim, emacs, Mac's TextEdit.
In short, different text editors all have a bit different behavior.
Here, Notepad, Notepad++, vim have the same behavior, while emacs and
TextEdit have similar behavior.
In Notepad, Notepad++, vim, the cursor always ends at the beginning of
each word.
In emacs and TextEdit, the cursor ends at the beginning of the word if
you are using backward-word, but at the end of the word if you are
using forward-word.
That's the first major difference.
--------------------------------------------------
Now, try this line:
something !! in @@ the ## water $$ does %% not ^^ compute
Now, vim's and Notepad++'s behavior is identical, and as simple as
before: they put the cursor at the beginning of each run of
non-whitespace characters, no matter what the characters are. Notepad
is similar, except that it also stops in the middle of %%.
Emacs and TextEdit behave similarly to each other.
Emacs skips the symbol clusters entirely, except %% (this depends
on what mode you are in).
TextEdit stops in the middle of $$ and ^^, and otherwise skips the
symbol clusters entirely.
So, from this, it is clear that different editors have different
concepts of syntax groups, or no such concept at all.
I understand the emacs case well. Emacs has a syntax table concept
that groups chars into classes such as “whitespace”, “word”,
“symbol”, “punctuation”, etc. When you use backward-word, it skips
back over any non-word chars, then simply moves until it reaches a
char that's not in the “word” group. So, depending on which mode you
are in (and hence which chars count as “word” chars), it'll either
skip a run of identical symbol chars entirely, or stop at its
boundary. And if the run mixes different symbols, such as !@#$%&*(),
then emacs may stop in the middle of it.
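One way to check which class each char falls into is to ask emacs
directly with char-syntax. The helper below is only a sketch with a
made-up name (my-syntax-classes); the answers depend on the syntax
table of whatever buffer it is evaluated in.

```elisp
;; Hypothetical helper: pair each character of TEXT with the one-letter
;; code of its syntax class in the current buffer's syntax table
;; ("w" word, "_" symbol, "." punctuation, " " whitespace, and so on).
(defun my-syntax-classes (text)
  "Return an alist mapping each character of TEXT to its syntax class."
  (mapcar (lambda (ch)
            (cons (string ch) (string (char-syntax ch))))
          (string-to-list text)))

;; Evaluate this in buffers with different major modes and compare.
(my-syntax-classes "a !@#$%^")
```

Any char reported as "w" is one that forward-word and backward-word
treat as part of a word in that buffer.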
The question is whether other editors have a notion of syntax groups,
or whether their word movement behavior depends on the language mode
at all.
--------------------------------------------------
Now, the interesting question is which model is more efficient for
general everyday coding of different languages.
First question is: is it more efficient in general for forward/
backward word motions to always land in front of the word as in vim,
Notepad, Notepad++ ?
Certainly i think it is more intuitive that way. But otherwise i don't
know. I'll have to do research on this some day.
The second question is whether it is good to have the movement
dependent on the language mode. Again i don't know.
Though, i do find emacs syntax table annoying from my experience of
working with it a bit in the past few years... from the little i know,
i felt that it doesn't do much, its power to model syntax is quite
weak, and very complicated to use... but i don't know for sure.
Btw, one of your examples, this one:
Page *page = new _Page(this);
page.load();
i cannot duplicate.
Xah
∑ http://xahlee.org/
☄
I wrote up a cleaned-up version here:
http://xahlee.blogspot.com/2010/06/text-editors-cursor-movement-behavior.html
Here's an excerpt of the question:
-------------------------
Now, create a file with this content for more tests.
something in the water does not compute
something !! in @@ the ## water $$ does %% not ^^ compute
something!!in@@the##water$$does%%not^^compute
(defun insert-p-tag () "Insert <p></p> at cursor point."
(interactive) (insert "<p></p>") (backward-char 4))
for (my $i = 0; $i < 9; $i++) { print "done!";}
<a><b>a b c</b> d e</a>
Answer this:
* Do the positions where the cursor stops depend on whether you are
moving left or right?
* Does the word motion behavior change depending on what language
mode you are in?
* What is your editor? on what OS?
Thanks.
Xah
>
> Emacs doesn't so much care about word-boundaries as about words.
> So when you forward-word, it just skip until the end of the next word,
> where "abc" is a word, but ";-( )" is not.
> So in many cases, it ends up doing in one step what VI would do in [two]:
> first skip over the non-word chars, and then skip the next few
> word-chars, whereas VI would stop after the run of non-word chars and
> stop again after the subsequent run of word chars.
Indeed, reducing two down to one is an advantage.
But if I have "abs;-()" and I want to delete the whole jing bang, Emacs loses
big time!
Cheers,
Uday
Try c-subword-mode for CamelCase. There is also capitalized-words-mode,
but I've never tried it.
Hi,
seems not a question of word-boundaries, but a feature:
as you describe, Vim says: when word-chars are under cursor, kill them.
When non-word chars are there, kill until next word.
Interesting.
> ** Example 3
>
> I have loads of problems when deleting and navigating words over multiple
> lines. In the following C++ code for instance:
>
> Page *page = new _Page(this);
> page.load();
> ^
>
> When point is after "page", before the dot on the second line and I hit M-b
> (backward-word) point ends up at the first opening bracket of "Page(" !!!
>
> Again, vim does the right thing here - pressing 'b' takes the point to the
> closing bracket of Page(this) so it doesn't recognise the semi-colon as a
> bracket which is intuitive and what I would expect. This is really the
> point I am trying to make. I have never taken the time to understand the
> behaviour of word boundaries in Vim because *it just works*. In Emacs I am
> forced to think about word boundaries because Emacs keeps surprising me with
> its weird behaviour!
Forward-moves stop after the object, backward-moves before.
When a mode defines '()' as word-characters, M-x backward-word will stop
at the semi-colon in your example.
Andreas
>
> Note: My examples happen to be C++ but I use lots of other languages too
> including elisp, Clojure, JavaScript, Python and Java and the
> word-boundaries seem to be wrong for all of them.
>
> I have tried several different elisp solutions but each one has at least one
> feature that isn't quite right. Here are some links I kept, I've tried many
> other solutions but don't have the links to hand:
>
> http://stackoverflow.com/questions/2078855/about-the-forward-and-backward-a-word-behaviour-in-emacs
> http://stackoverflow.com/questions/1771102/changing-emacs-forward-word-behaviour/1772365#1772365
Paul Drummond writes:
> ** Example 2
>
> When editing C++ files I often need to delete the "ClassName::" part when
> declaring functions in the header:
>
> void ClassName::function();
> ^
>
> With point at the start of ClassName I want to press M-d twice to delete
> ClassName and :: but "::" isn't recognised as a word. In Vim I just
Twice? Three times, shirley? Class and Name are both words...
Because it needs to be defined somewhat differently for natural
languages and different programming languages, at a guess. What a word
is depends entirely on the context you (and I) decide, and they may well
be different (see two versus three key presses above).
I suspect the answer is b. ;-)
There is another answer: (c) looking at sexps instead of words.
thi
Is it possible to specify word boundaries for a particular mode?
Yes, it's part of the syntax table. See e.g. `modify-syntax-entry'.
Regarding camel case word jumping, see subword-mode (previously known
as c-subword-mode) which is part of Emacs.
--
Deniz Dogan
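For an init file, both suggestions could be wired up roughly as below.
This is a sketch under assumptions: the function name
my-c-word-syntax-setup is made up, and making "_" a word constituent in
C-like modes is just one example of a per-mode change.

```elisp
;; Assumption for illustration: in C-like buffers, treat "_" as a word
;; constituent so that M-f and M-d take foo_bar as a single word.
(defun my-c-word-syntax-setup ()
  "Make underscore a word constituent in the current syntax table."
  ;; With no table argument, `modify-syntax-entry' changes the syntax
  ;; table of the current buffer.
  (modify-syntax-entry ?_ "w"))

(add-hook 'c-mode-common-hook #'my-c-word-syntax-setup)

;; Make word commands stop at the capitalized parts of CamelCase names.
(add-hook 'c-mode-common-hook #'subword-mode)
```

Note that buffers in the same major mode normally share one syntax
table, so the change affects all of them, not just the buffer in which
the hook happened to run.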
Here's the answer again, in case you missed it.
• Text Editor's Cursor Movement Behavior (emacs, vi, Notepad++)
http://xahlee.org/emacs/text_editor_cursor_behavior.html
plain text version follows.
-------------------------------------
Text Editor's Cursor Movement Behavior (emacs, vi, Notepad++)
Xah Lee, 2010-06-17
This article discusses some differences in cursor movement behavior
among editors. That is, when you press “Ctrl+→” on a line of
programming language code with lots of different sequences of symbols,
where exactly does the cursor stop?
--------------------------------------------------
Always End at Beginning of Word?
Type the following in your favorite text editor.
something in the water does not compute
Now, you can try the word movement in different editors.
I tested this on Notepad, Notepad++, vim, emacs, Mac's TextEdit.
In Notepad, Notepad++, vim, the cursor always ends at the beginning of
each word.
In emacs, TextEdit, Xcode, the cursor ends at the beginning of the
word if you are moving backward, but at the end of the word if you are
moving forward.
That's the first major difference.
--------------------------------------------------
Does Movement Depend on the Language Mode?
Now, try this line:
something !! in @@ the ## water $$ does %% not ^^ compute
Now, vim's and Notepad++'s behavior is identical, and as simple as
before: they put the cursor at the beginning of each run of
non-whitespace characters, no matter what the characters are. Notepad
is similar, except that it also stops in the middle of %%.
Emacs and TextEdit behave similarly to each other. Emacs (when in
text-mode) will skip the symbol clusters !!, @@, ##, ^^ entirely,
while stopping at the boundaries of $$ and %%. TextEdit will stop in
the middle of $$ and ^^, but skip the other symbol clusters entirely.
I don't know about other editors, but i understand the behavior of
emacs well. Emacs has a syntax table concept. Each and every character
is classified into one of “whitespace”, “word”, “symbol”,
“punctuation”, and others. When you use backward-word, it skips back
over any non-word chars, then simply moves until it reaches a char
that's not in the “word” group.
Each major mode usually has a different syntax table. So, depending
on which mode you are in, it'll either skip a sequence of identical
chars entirely, or stop at its boundary.
(info "(elisp) Syntax Tables")
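To compare modes without clicking through them by hand, something like
the following sketch can be used. The helper name
my-word-stops-in-mode is invented here, and text-mode and
emacs-lisp-mode are just two convenient modes to compare.

```elisp
;; Hypothetical helper: list the positions where `forward-word' stops in
;; TEXT after switching a temporary buffer to major mode MODE.
(defun my-word-stops-in-mode (mode text)
  "Return the `forward-word' stop positions in TEXT under major MODE."
  (with-temp-buffer
    (insert text)
    (funcall mode)
    (goto-char (point-min))
    (let (stops)
      ;; `forward-word' returns nil once it runs into the end of the buffer.
      (while (forward-word 1)
        (push (point) stops))
      (nreverse stops))))

;; Compare the two lists; any difference comes from the syntax tables.
(let ((line "something !! in @@ the ## water $$ does %% not ^^ compute"))
  (list (my-word-stops-in-mode #'text-mode line)
        (my-word-stops-in-mode #'emacs-lisp-mode line)))
```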
The question is whether other editors' word movement behavior changes
depending on what language mode they are currently in. And if so, how
does the behavior change? Do they use a concept similar to emacs's
syntax table?
In Notepad++, cursor word-motion behavior does not change with respect
to what language mode you are in. A 5-minute test suggests it does not
for vim either.
--------------------------------------------------
More Test
Now, create a file with this content for more tests.
something in the water does not compute
something !! in @@ the ## water $$ does %% not ^^ compute
something!!in@@the##water$$does%%not^^compute
(defun insert-p-tag () "Insert <p></p> at cursor point."
(interactive) (insert "<p></p>") (backward-char 4))
for (my $i = 0; $i < 9; $i++) { print "done!";}
<a><b>a b c</b> d e</a>
Answer this:
* Do the positions where the cursor stops depend on whether you are
moving left or right?
* Does the word motion behavior change depending on what language
mode you are in?
* What is your editor? on what OS?
--------------------------------------------------
Which is More Efficient?
Now, the interesting question is which model is more efficient for
general everyday coding of different languages.
First question is: is it more efficient in general for left/right word
motions to always land on the left boundary of the word, as in vim,
Notepad, Notepad++ ?
Certainly i think it is more intuitive that way. But otherwise i don't
know.
The second question is whether it is good to have the movement change
depending on the language mode.
I don't know. But again, mode-independent movement seems more
intuitive, because users can predict where the cursor will stop
regardless of what language they are coding in. Though, of course, it
MAY be less efficient, because logically one'd think that it might be
better to have word motion behavior adapt to different languages. But
i am not sure about this in real-world situations.
Though, i do find emacs syntax table annoying from my experience of
working with it a bit in the past few years... from the little i know,
i felt that it doesn't do much, its power to model syntax is quite
weak, and very complicated to use... but i don't know for sure.
This article was inspired by Paul Drummond's question in gnu.emacs.help.
--------------------------------------------------
2010-06-18
On 2010-06-17, Elena <egarr...@gmail.com> wrote:
is there some elisp code to move by tokens when a programming mode is
active? For instance, in the following C code:
double value = f ();
the point - represented by | - would move like this:
|double value = f ();
double |value = f ();
double value |= f ();
double value = |f ();
double value = f |();
double value = f (|);
double value = f ()|;
cc-mode has functions c-forward-token-1 and c-forward-token-2. (thanks
to Andreas Politz)
It is easy to write elisp code to do what you want, though it might be
tedious depending on what you mean by token, and on whether you really
want the cursor to move by token (that might be too many stops).
Here's a function i wrote and have been using for a couple of
years. You can mod it to get what you want. Basically that's the idea.
(defun forward-block ()
  "Move cursor forward to the next occurrence of a double newline.
In most major modes this is the same as `forward-paragraph'. However,
this function behaves the same in any mode, whereas `forward-paragraph'
is mode dependent: it relies on the variables `paragraph-start' and
`paragraph-separate', which major modes set differently."
  (interactive)
  ;; Step off any newlines at point so repeated calls keep advancing.
  (skip-chars-forward "\n")
  (unless (search-forward-regexp "\n[[:blank:]]*\n" nil t)
    (goto-char (point-max))))

(defun backward-block ()
  "Move cursor backward to the previous occurrence of a double newline.
See: `forward-block'."
  (interactive)
  (skip-chars-backward "\n")
  (unless (search-backward-regexp "\n[[:blank:]]*\n" nil t)
    (goto-char (point-min))))
Actually, you can just mod it so that it always skips over chars in
the whitespace syntax class... but then if you have 1+1+8, it'll skip
the whole thing...
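For the token movement asked about above, a minimal sketch based on
syntax classes could look like this. The name my-forward-token is made
up, it is not how cc-mode's own token functions work, and its notion of
a token (a run of word or symbol chars, or any other single char) is an
assumption.

```elisp
;; Hypothetical command, not part of Emacs or cc-mode.
(defun my-forward-token ()
  "Move point to the start of the next token.
A token here is a run of word or symbol characters, or else a single
character of any other syntax class."
  (interactive)
  ;; `skip-syntax-forward' returns the distance moved; zero means point
  ;; was not on a word or symbol character, so step over one character.
  (when (and (zerop (skip-syntax-forward "w_"))
             (not (eobp)))
    (forward-char 1))
  ;; Skip the whitespace that separates this token from the next one.
  (skip-syntax-forward " "))
```

On the line "double value = f ();", repeated calls from the start of
the line stop before value, =, f, (, ) and ;, which matches the
sequence in the question. How something like 1+1+8 splits depends on
the mode's syntax table: where + has punctuation syntax each operator
is its own token, but where + has symbol syntax the whole expression is
taken as one token.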
Xah
∑ http://xahlee.org/
☄
> Regarding camel case word jumping, see subword-mode (previously known
> as c-subword-mode) which is part of Emacs.
Thanks for the info on subword-mode!
Great discovery. A few years ago i searched the web and found one or
two camelCase modes; i installed one and it worked, but a bundled
package is much better!
thanks.
Xah