an independent Chinese IME is available


Sean

Jan 3, 2009, 10:12:20 PM
to vim_use
Hi,

I just uploaded a script used as an IME (Input Method Editor) for
typing Chinese as an example:
http://vim.sourceforge.net/scripts/script.php?script_id=2506

The screen-shot can be found on http://maxiangjiang.googlepages.com/vim_ime.gif

For years, I have been dreaming of an independent IME to input Chinese
from within vim. After all, vim is a text editor, which should be able
to type in any language including CJK. It is a pain to configure and
use the supported |+multi_byte_ime| API, which is dependent on
binaries outside of vim.

As background for those who have never used CJK, IMEs are big
business in Asian countries. The basic idea is how to input tens of
thousands of characters comfortably using an English keyboard.

Now, I hope vim can get on board comfortably.

Thanks

Sean

Tony Mechelynck

Jan 3, 2009, 11:10:17 PM
to vim...@googlegroups.com

Interesting.

The "data file" http://maxiangjiang.googlepages.com/ChineseIME.dict
doesn't seem to be in the "key, space, hanzi" format described in the
ChineseIME.vim plugin.

Maybe someday (if I get around to it) I'll try to build a datafile by
Kangxi (or similar) radicals and strokes, for people who can "see" hanzi
and maybe even "understand" them but not "say" them. I might even
already have the needed data in a different format somewhere on this
computer.


Best regards,
Tony.
--
Leibowitz's Rule:
When hammering a nail, you will never hit your finger if you
hold the hammer with both hands.

Sean

Jan 3, 2009, 11:31:26 PM
to vim_use
The data file is valid. I just downloaded it and these are the first
three lines:

a 啊
a 阿
a 呵

The data file is in UTF-8 encoding. In order to read Chinese using vim,
I have the following settings in my .vimrc:

set enc=utf8
set gfn=Courier_New:h12:w7
set gfw=SimSun-18030,Arial_Unicode_MS

Yes, you are free to make your own data file. Actually, hundreds if
not thousands of "patents" were created on how to define the <key>
part of the mapping, among other things.

Welcome to the vim world of IME :))

Sean




anhnmncb

Jan 3, 2009, 11:31:19 PM
to vim...@googlegroups.com
Very interesting! But it is too slow, and if the preceding word is
Chinese, it makes vim hang. So when I tried

imap <space> <C-x><C-u><C-u><C-p><C-n>

vim hung. :(

Anyway, it's a good start.



--
Regards,
anhnmncb

anhnmncb

Jan 3, 2009, 11:41:48 PM
to vim...@googlegroups.com

I also find that punctuation symbols can't be put in the data file to input
Chinese punctuation; in that case vim hangs.

bill lam

Jan 3, 2009, 11:49:46 PM
to vim...@googlegroups.com
On Sun, 04 Jan 2009, Tony Mechelynck wrote:
> Maybe someday (if I come around to it) I'll try to build a datafile by
> Kangxi (or similar) radicals and strokes, for people who can "see" hanzi
> and maybe even "understand" them but not "say" them. I might even
> already have the needed data in a different format somewhere on this
> computer.

If you are interested I can send you the source file of radicals for
traditional Chinese (about 30000 characters).

I haven't used this plugin, but I guess it should limit the characters
displayed in the popup box in order to be useful. Anyway, for beginners,
it is easier to input just the first and the last radical, and
let it display a list of characters to choose from. If it can accept
exactly two keystrokes before calling up the popup box, there should be
no speed penalty.
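
A minimal sketch of that idea, assuming a tiny in-memory list in place of the real data file. The names TwoKeyComplete, s:min_keys and s:max_items are hypothetical, and this version waits for at least two keys rather than exactly two:

```vim
scriptencoding utf-8
" Hypothetical stand-in for the data file: one "key hanzi" pair per entry.
let s:dict = ['a 啊', 'a 阿', 'ma 马', 'ma 吗', 'mai 买']
let s:min_keys = 2      " assumed threshold before the popup is offered
let s:max_items = 10    " assumed cap on the number of candidates

function! TwoKeyComplete(findstart, base)
  if a:findstart
    " First call: find where the run of ASCII letters before the cursor starts.
    let line = getline('.')
    let start = col('.') - 1
    while start > 0 && line[start - 1] =~# '\a'
      let start -= 1
    endwhile
    return start
  endif
  " Second call: offer nothing until enough keys have been typed.
  if strlen(a:base) < s:min_keys
    return []
  endif
  let matches = []
  for entry in s:dict
    let [key, hanzi] = split(entry)
    if key ==# a:base
      call add(matches, hanzi)
      if len(matches) >= s:max_items
        break
      endif
    endif
  endfor
  return matches
endfunction

setlocal completefunc=TwoKeyComplete
```

With this in place, typing ma and then <C-X><C-U> should offer the two candidates stored under ma, while a single letter offers nothing.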

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
唐詩163 馬戴 楚江懷古
露氣寒光集 微陽下楚丘 猿啼洞庭樹 人在木蘭舟
廣澤生明月 蒼山夾亂流 雲中君不見 竟夕自悲秋

pansz

Jan 3, 2009, 11:56:17 PM
to vim...@googlegroups.com
Sean wrote:

> Yes, you are free to make your own data file. Actually, hundreds if
> not thousands of "patents" were created on how to define the <key>
> part of the mapping, among other things.
>
> Welcome to the vim world of IME :))

As for patents... It seems that the name IME is proprietary to Microsoft.
That might be the reason most input methods in the Linux world are named
IM or XIM; they never use the name IME.

I suspect you may have to change the name 'IME' to something else for
legal reasons. Anyway, 'Input Method Editor' isn't the best
description of your masterpiece; 'Input Method Inside an Editor' is,
so IMIE is probably a better name.

Sean

Jan 3, 2009, 11:57:28 PM
to vim_use

I am interested in how to improve performance, though it does not seem
to be an issue for me.

This is my computing environment:

laptop: HP MODEL nc6320
OS: Windows XP
CPU: 1.66GHz
RAM: 2GB

Also, the gvim.exe Memory Usage is less than 7MB with or without my
plugin.

My own data file is over 32K lines, and the Chinese shows up within
one second after I type ma<C-I>, as shown in my screenshot. When I
typed <C-I> on empty space, I did find it kind of "hanging" and I
had to type <C-C> to stop it. However, it makes little sense to type
<C-I> over empty space.

I also tried sorting the data file after every update, but
it makes no difference in performance for me.

It would be interesting to know whether it makes a difference if you cut
the sample data file in half.
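
One way to compare timings is sketched below, assuming Vim was built with +reltime. TimeLookup is a hypothetical helper, not part of the plugin, and it times a plain linear scan, which may or may not be what the plugin does internally:

```vim
" Sketch: time one linear scan of the data file for a given key.
function! TimeLookup(file, key)
  let start = reltime()
  let lines = readfile(a:file)
  " Keep only lines of the form "<key> <hanzi>".
  let hits = filter(lines, 'v:val =~# "^" . a:key . " "')
  let elapsed = reltimestr(reltime(start))
  echo printf('%d matches in %s seconds', len(hits), elapsed)
endfunction

" Example call (the path is an assumption):
" call TimeLookup(expand('~/ChineseIME.dict'), 'ma')
```

Running it once on the full file and once on a copy cut in half would show whether file size is the dominant cost.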

By the way, I disabled the standard $VIMRUNTIME/plugin directory.

Thanks

Sean





Tony Mechelynck

Jan 4, 2009, 12:49:16 AM
to vim...@googlegroups.com
On 04/01/09 05:49, bill lam wrote:
> On Sun, 04 Jan 2009, Tony Mechelynck wrote:
>> Maybe someday (if I come around to it) I'll try to build a datafile by
>> Kangxi (or similar) radicals and strokes, for people who can "see" hanzi
>> and maybe even "understand" them but not "say" them. I might even
>> already have the needed data in a different format somewhere on this
>> computer.
>
> If you are interested I can send you the source file that radical for
> traditional chinese in (about 30000 characters).

Yes, why not? The Unihan database is more like 30 _million_ characters,
but of course it includes a lot of info that would be superfluous in
this case.

>
> I didn't use this plugin, but I guess it should limit the characters
> displayed in popup box in order to be useful. Anyway, for beginners,
> it is easier to just input the first and the last radical only, and
> let it display a list of characters to chose from. If it can accept
> exactly two keystroke before calling up the popup box, there should be
> no speed penalty.
>

OTOH, it might be advantageous to modify the plugin to take advantage of
Vim's ability to display a comment next to each completion (for
instance, depending on the data source, a translation or a reading for
the hanzi in question).
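
A rough sketch of how that could look, using the dictionary form of completion items with the 'word' and 'menu' keys. The data layout and the function name AnnotatedMatches are assumptions, not the plugin's format:

```vim
scriptencoding utf-8
" Hypothetical data: [key, hanzi, gloss] triples.
let s:entries = [
      \ ['ma', '马', 'horse'],
      \ ['ma', '吗', 'question particle'],
      \ ]

" Build completion items: 'word' is inserted, 'menu' is shown beside it.
function! AnnotatedMatches(base)
  let items = []
  for [key, hanzi, gloss] in s:entries
    if key ==# a:base
      call add(items, {'word': hanzi, 'menu': gloss})
    endif
  endfor
  return items
endfunction
```

A completion function would return this list from its second call, so the popup shows each hanzi with its gloss next to it.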


Best regards,
Tony.
--
Underlying Principle of Socio-Genetics:
Superiority is recessive.

Tony Mechelynck

Jan 4, 2009, 12:54:51 AM
to vim...@googlegroups.com
On 04/01/09 05:41, anhnmncb wrote:
[...]

> And I find punctation symbol can't be put in the data file to input Chinese
> punctation, in that case vim will hang.
[...]

You could create a keymap for CJK punctuation, or even just use
|i_CTRL-V_digit| -- most of them are in the block starting at U+3000
after all.
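
For illustration, a mapping-based variant might look like the sketch below. The choice of keys is an assumption, and the script file must be saved as UTF-8:

```vim
scriptencoding utf-8
" Sketch only: buffer-local insert-mode mappings, so other buffers keep
" their ASCII punctuation. The key choices are assumptions.
inoremap <buffer> . 。
inoremap <buffer> , ,
inoremap <buffer> ? ?
" To undo:  iunmap <buffer> .   (and likewise for the other keys)
```

Without any mapping, typing CTRL-V u 3 0 0 2 in Insert mode enters U+3002 (。) directly. Note that the fullwidth comma and question mark used above sit in the U+FF00 block rather than the one starting at U+3000.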

Best regards,
Tony.
--
"... an experienced, industrious, ambitious, and often quite often
picturesque liar."
-- Mark Twain

anhnmncb

Jan 4, 2009, 1:20:29 AM
to vim...@googlegroups.com
On 2009-01-04, Sean wrote:
>
>
> I am interested in how to improve performance, which seems not the
> issue for me, though.
>
> These are my computing environments:
>
> laptop: HP MODEL nc6320
> OS: Windows XP
> CPU: 1.66GHz
> RAM: 2GB

Please don't top post.
windows 2000, centrino 1.66GHz and 768 RAM here.

>
> Also, the gvim.exe Memory Usage is less than 7MB with or without my
> plugin.
>
> My own data file is over 32K lines, and the Chinese shows up within
> one second after I type ma<C-I>, as shown on my screenshot. If I

Hey, one second is too long. If the candidate list takes 0.5s to
pop up, then the maximum input speed will be just 120 per minute, too slow!

> typed <C-I> on empty space, I did find it is kind of "hanging" and I
> had to type <C-C> to stop it. However, it makes little sense to type
><C-I> over empty space.

But hitting <space> to input hanzi is quicker and more convenient than <C-i>.

>
> I also tried to sort the data file every time after I do update, but
> it makes no difference in performance to me.
>
> It would be interesting to know if it makes difference if you cut the
> sample data file by half?

My data file is for the cangjie input method and is 30790 lines long; I don't
think that is a big data file for an IM.

>
> By the way, I disabled the standard $VIMRUNTIME/plugin directory.
>
> Thanks
>
> Sean


--
Regards,
anhnmncb

anhnmncb

Jan 4, 2009, 1:23:52 AM
to vim...@googlegroups.com
On 2009-01-04, Tony Mechelynck wrote:
>
> On 04/01/09 05:41, anhnmncb wrote:
> [...]
>> And I find punctation symbol can't be put in the data file to input Chinese
>> punctation, in that case vim will hang.
> [...]
>
> You could create a keymap for CJK punctuation, or even just use
>|i_CTRL-V_digit| -- most of them are in the block starting at U+3000
> after all.

Yes, but that is the vim way, not the IM way.

--
Regards,
anhnmncb

Tony Mechelynck

Jan 4, 2009, 1:34:47 AM
to vim...@googlegroups.com
On 04/01/09 07:20, anhnmncb wrote:
> On 2009-01-04, Sean wrote:
>>
>> I am interested in how to improve performance, which seems not the
>> issue for me, though.
>>
>> These are my computing environments:
>>
>> laptop: HP MODEL nc6320
>> OS: Windows XP
>> CPU: 1.66GHz
>> RAM: 2GB
>
> Please don't top post.
> windows 2000, centrino 1.66GHz and 768 RAM here.
>
>> Also, the gvim.exe Memory Usage is less than 7MB with or without my
>> plugin.
>>
>> My own data file is over 32K lines, and the Chinese shows up within
>> one second after I type ma<C-I>, as shown on my screenshot. If I
>
> hey, one second is too long, and if the candidate list needs to take 0.5s to
> pop up, then the max speed of input per minute will just 120/s, too slow!
>
>> typed<C-I> on empty space, I did find it is kind of "hanging" and I
>> had to type<C-C> to stop it. However, it makes little sense to type
>> <C-I> over empty space.
>
> But hitting<space> to input hanzi is more quick and convenient than<C-i>.

Ctrl-I and Tab are synonymous to Vim anyway.

>
>> I also tried to sort the data file every time after I do update, but
>> it makes no difference in performance to me.
>>
>> It would be interesting to know if it makes difference if you cut the
>> sample data file by half?
>
> My data file is for cangjie input method, and has 30790 lines long, I don't
> think it's a big data file for IM.
>
>> By the way, I disabled the standard $VIMRUNTIME/plugin directory.
>>
>> Thanks
>>
>> Sean
>
>

Best regards,
Tony.
--
THE WOMBAT

The wombat lives across the seas,
Among the far Antipodes.
He may exist on nuts and berries,
Or then again, on missionaries;
His distant habitat precludes
Conclusive knowledge of his moods.
But I would not engage the wombat
In any form of mortal combat.

pansz

Jan 4, 2009, 1:37:47 AM
to vim...@googlegroups.com
Sean wrote:

> I am interested in how to improve performance, which seems not the
> issue for me, though.
>
> These are my computing environments:
>
> laptop: HP MODEL nc6320
> OS: Windows XP
> CPU: 1.66GHz
> RAM: 2GB
>
> Also, the gvim.exe Memory Usage is less than 7MB with or without my
> plugin.
>
> My own data file is over 32K lines, and the Chinese shows up within
> one second after I type ma<C-I>, as shown on my screen shot. If I

> typed <C-I> on empty space, I did find it is kind of "hanging" and I
> had to type <C-C> to stop it. However, it makes little sense to type
> <C-I> over empty space.
<C-I> does not work for me; I have to use <C-X><C-U>. I don't see
<C-I> mapped to anything in the script, so I suspect you mapped it in
your .vimrc?

It does make sense to type it over empty space, since that may simply be
an extra keystroke, i.e. you typed 马 and then pressed <C-I> after 马.

The performance is low compared to most existing input methods. It
does improve when I truncate the dict file; in practice, however, the
dict file would only be larger than the sample, not smaller.

After all, is there any use case where this can be used for a full
sentence? Consider the following:
1. cannot use space for completion, which is very common in IMEs.
2. cannot use number keys for selection, e.g. when there are multiple
selections there is no way to select the 3rd quickly.
3. completion window does not pop up while typing; we must type <C-I>,
which means we don't know whether a word is available and we don't notice
an error while typing.
4. cannot match the first characters if the whole word is missing: type
woyao<C-I> and nothing happens; we expect 'wo' to be matched and show
我, since the problem here is that woyao is not recorded as vocabulary.

At the current stage, the ease of use is low, and if we rely on vim's
built-in completion, things cannot easily be improved.

Sean

Jan 4, 2009, 1:56:08 AM
to vim_use
An updated version is uploaded at http://vim.sourceforge.net/scripts/script.php?script_id=2506
with the following three improvements:

(1) fix completion hanging when started from non-word characters
(2) add pumheight=10 to limit the height of the popup menu
(3) add two entries to the data file for testing it:
    (3.1) english<C-^> should show the Chinese translation for "English"
    (3.2) chinese<C-^> should show the Chinese translation for "Chinese"

Sean

Sean

Jan 4, 2009, 2:32:28 AM
to vim_use

> 1. cannot use space for completion, which is very common in IMEs.

If you want, you can map <space> as
imap <Space> <C-X><C-U><C-U><C-P><C-N>

A general problem is that the input is a mixture of Chinese and
English. This is a good idea if we want to make it a fully-fledged
IME. In that case, another plugin, "Automatically open the popup
menu for completion" from Takeshi, can be used together with it. It is
available at http://www.vim.org/scripts/script.php?script_id=1879

> 2. cannot use number keys for selection, e.g. when there's multiple
> selections there is no way to select the 3rd quickly.

This is a limitation of vim omni completion feature. I am not sure if
Bram also thinks it is a limitation or not.

> 3. completion window does not pop-up when typing, must type<C-I> which
> means we don't know whether a word is available and we don't know an
> error when typing.

This is an advanced feature, and it is available in the other plugin
mentioned above.

> 4. cannot match the first characters if the whole word missing: type
> woyao<C-I> and nothing happens, we expect 'wo' can be matched and show
> 我, since the problem here is woyao not recorded as vocabulary.

Yes, it depends on the data file. For demonstration purposes, I uploaded
a new version of the data file that includes woyao as the <key>. Therefore,
you can type woyao<C-^> now.

The feature you mentioned here seems like "associate" (联想), which is
not available for vim omni completion.

Anyway, the best part of an independent vim IME like my script is that
the data file is under your control. For example, my own data file is a
mixture of PinYin and English, which suits my own background.

By the way, <C-^> is used in the script now, as it is consistent with
the original meaning of i_<C-^> in vim.

Sean

anhnmncb

Jan 4, 2009, 2:49:29 AM
to vim...@googlegroups.com
Hi Sean, I think the first thing that needs to be done is to improve the
speed of completion. I can't live with typing 10 hanzi in 5s and then
waiting 8s for vim to complete them.

--
Regards,
anhnmncb

pansz

Jan 4, 2009, 3:09:41 AM
to vim...@googlegroups.com
Sean wrote:

>
>> 1. cannot use space for completion, which is very common in IMEs.
>
> If you want, you can map <space> as
> imap <Space> <C-X><C-U><C-U><C-P><C-N>
Well, it works, but then we are unable to type the space character at
all. For <space> to work we may need "modes": a simple ASCII insert
mode and a complex character input mode.


>> 4. cannot match the first characters if the whole word missing: type
>> woyao<C-I> and nothing happens, we expect 'wo' can be matched and show
>> 我, since the problem here is woyao not recorded as vocabulary.

> The feature you mentioned here seems like "associate" (联想), which is
> not available for vim omni completion.

What I meant is not "associate" 联想; it is (say) partial completion.

For example, suppose we have the following in the dict:
aaa 阿
aaa 啊
bbb 波
bbb 播

but we don't have the vocabulary entry aaabbb.

Then how can I complete when I type aaabbb? A typical IM should try to
complete as much as possible, so when we press <space> it tries to
complete aaa as 阿 and/or opens the selection menu for aaa, leaving bbb
behind; then if we press <space> again, it tries to complete bbb as
波 and/or opens the selection menu for bbb.

This is also useful even if we have an item in the dict and we want other
combinations:

aaabbb 阿波

Now if we type aaabbb and want it to complete to 啊播, how? When we
press the first <space> we should get a menu which lists 阿波 as the
first item and 阿 and 啊 as the second and third items. If we select 阿波
then everything is finished; if we select 啊, then we get the menu
for the next character, which shows 波 and 播.
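
The candidate ordering described here can be sketched as a longest-prefix-first lookup. The dictionary layout and the function name PrefixCandidates are assumptions for illustration, and nothing here is wired into vim's completion:

```vim
scriptencoding utf-8
" Hypothetical in-memory dict: key -> list of candidates.
let s:dict = {
      \ 'aaa': ['阿', '啊'],
      \ 'bbb': ['波', '播'],
      \ 'aaabbb': ['阿波'],
      \ }

" Collect candidates for every prefix of a:typed, longest prefix first.
" Each item records the keys still left to convert in 'rest'.
function! PrefixCandidates(typed)
  let result = []
  let n = strlen(a:typed)
  while n > 0
    let key = strpart(a:typed, 0, n)
    for hanzi in get(s:dict, key, [])
      call add(result, {'word': hanzi, 'rest': strpart(a:typed, n)})
    endfor
    let n -= 1
  endwhile
  return result
endfunction
```

For the input aaabbb this lists 阿波 first with nothing left over, followed by 阿 and 啊, each with bbb remaining for the next round.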


anhnmncb

Jan 4, 2009, 3:14:16 AM
to vim...@googlegroups.com
On Sun, 04 Jan 2009 16:09:41 +0800, pansz <pans...@routon.com> wrote:

>
> Sean wrote:
>>
>>> 1. cannot use space for completion, which is very common in IMEs.
>>
>> If you want, you can map <space> as
>> imap <Space> <C-X><C-U><C-U><C-P><C-N>
> Well, it works, but we will be unable to type the space character at
> all. For <space> to work we may need a "mode" then, simple ascii insert
> mode and complex character input mode.

I use these lines to toggle the IM:

let s:ywim = 0
function! YW_IMtoggle()
    if s:ywim == 0
        " IM on: <space> triggers user completion
        imap <space> <C-x><C-u><C-u><C-p><C-n>
        let s:ywim = 1
    elseif s:ywim == 1
        " IM off: <space> types a space again
        iunmap <space>
        let s:ywim = 0
    endif
endfunction

imap <C-\> <C-o>:call YW_IMtoggle()<CR>
--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

anhnmncb

Jan 4, 2009, 3:37:00 AM
to vim...@googlegroups.com
These lines in your script change vim options globally; I don't think that
is a good approach.

set completeopt=menu,preview,longest
set pumheight=10

set completefunc=ChineseIME

And why not suggest the imap on vimscript.org instead of writing it into
your plugin?

imap <C-^> <C-X><C-U><C-U><C-P><C-N>

--
Regards,
anhnmncb

StarWing

Jan 4, 2009, 9:05:54 AM
to vim_use
I think it is a little difficult to use... it just needs some improvement.

Tony Mechelynck

Jan 4, 2009, 9:54:46 AM
to vim...@googlegroups.com
On 04/01/09 15:05, StarWing wrote:
> i think it used a little difficulty....just need some imploved.
>
> On Jan 4, 4:37 pm, anhnmncb <anhnm...@sina.com> wrote:
>> These lines in your script change the vim options globally, I don't think it's
>> a good way.
>>
>> set completeopt=menu,preview,longest
>> set pumheight=10
>>
>> set completefunc=ChineseIME

Yes, all these should use ":setlocal" and be in a filetype-plugin, or
maybe in some function called only on demand; not in a global plugin
used for every file (including, let's say, C sources in ASCII or,
speaking of "human" languages, Russian text in Cyrillic).

>>
>> And why not add a advice for imap in vimscript.org instead of writing it in your
>> plugin?
>>
>> imap <C-^> <C-X><C-U><C-U><C-P><C-N>

Here also, ":imap <buffer> <C-^> etc." might be better; and, as you
say, it should perhaps be commented-out by default, since some users
might prefer using another {lhs}. Myself, I don't know if my fr_BE
keyboard has a key or key combo for Ctrl-^ so I might prefer <F12> as
the {lhs}.
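
A small sketch of a user-configurable, buffer-local mapping along these lines. The variable name g:ChineseIME_lhs is hypothetical and not something the plugin defines:

```vim
" Let the user pick the trigger key; fall back to <F12> if nothing is set.
if !exists('g:ChineseIME_lhs')
  let g:ChineseIME_lhs = '<F12>'
endif

" Buffer-local, non-recursive mapping to the user-completion key sequence.
execute 'inoremap <buffer> ' . g:ChineseIME_lhs . ' <C-X><C-U><C-U><C-P><C-N>'
```

A user who prefers another key would put something like let g:ChineseIME_lhs = '<C-^>' in their vimrc before this code runs.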


Best regards,
Tony.
--
A poem: read aloud:

<> !*''# Waka waka bang splat tick tick hash,
^"`$$- Caret quote back-tick dollar dollar dash,
!*=@$_ Bang splat equal at dollar under-score,
%*<> ~#4 Percent splat waka waka tilde number four,
&[]../ Ampersand bracket bracket dot dot slash,
|{,,SYSTEM HALTED Vertical-bar curly-bracket comma comma CRASH.

Fred Bremmer and Steve Kroese (Calvin College & Seminary of Grand
Rapids, MI.)

Sean

Jan 4, 2009, 6:57:17 PM
to vim_use
Hello,

Thanks for the comments and encouragement from everyone. I have just
uploaded a new version with all suggestions considered. It is available at
http://vim.sourceforge.net/scripts/script.php?script_id=2506

Improvements were made using the following options:
---------------------------------------------------------------
g:CacheIMEDataAtStartup:
add data file in memory at startup
:pro: fast popup menu shown when <key> is matched
:con: slow initial loading when vim is opened
default: 0

g:ChineseIMESpaceToggle:
toggle the use of <Space> to trigger popup menu
i_<Tab> is used to toggle this feature
:pro: convenient and consistent like other IMEs
:con: need to get used to <Space> key
default: 0

g:ChineseIMEMappingCtrl6:
define i_<C-^> as <C-X><C-U><C-U><C-P><C-N>
default: 0
---------------------------------------------------------------
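
As an illustration, turning all three options on from a vimrc might look like this. It assumes, as is usual for plugin variables, that they need to be set before the plugin is loaded:

```vim
" Trade slower startup for a faster popup menu.
let g:CacheIMEDataAtStartup = 1
" Let <Space> trigger the popup menu; i_<Tab> toggles this.
let g:ChineseIMESpaceToggle = 1
" Define i_<C-^> as <C-X><C-U><C-U><C-P><C-N>.
let g:ChineseIMEMappingCtrl6 = 1
```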

Sean


Tony Mechelynck

Jan 4, 2009, 8:22:52 PM
to vim...@googlegroups.com
On 05/01/09 00:57, Sean wrote:
> Hello,
>
> Thanks for comments and encourages from everyone, I have just uploaded
> a new version with all suggestions considered. It is available on
> http://vim.sourceforge.net/scripts/script.php?script_id=2506
[...]

Note that if (like me) you have downloaded a previous version of this
script, you may need to wipe your browser's cache in order to see the
new version (e.g. refresh the page with Ctrl-Shift-R in Firefox or
SeaMonkey).

Best regards,
Tony.
--
Life would be so much easier if we could just look at the source code.

Sean

Jan 4, 2009, 10:45:49 PM
to vim_use
Now, this plugin is ready.

Performance is boosted.
No more cache and no more full table scan, now that the data file is sorted
first.
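
For readers wondering how a sorted file avoids a full scan, here is a sketch using binary search. This is an assumption about the general technique, not the plugin's actual code; LowerBound and LookupSorted are hypothetical names, and the lines are assumed to be sorted byte-wise with alphabetic keys:

```vim
" Index of the first line whose key is >= a:key (a:lines sorted by key).
function! LowerBound(lines, key)
  let lo = 0
  let hi = len(a:lines)
  while lo < hi
    let mid = (lo + hi) / 2
    let k = matchstr(a:lines[mid], '^\S\+')
    if k <# a:key
      let lo = mid + 1
    else
      let hi = mid
    endif
  endwhile
  return lo
endfunction

" Collect the hanzi of all consecutive lines whose key equals a:key.
function! LookupSorted(lines, key)
  let i = LowerBound(a:lines, a:key)
  let found = []
  while i < len(a:lines) && matchstr(a:lines[i], '^\S\+') ==# a:key
    call add(found, matchstr(a:lines[i], '\S\+$'))
    let i += 1
  endwhile
  return found
endfunction
```

On a 32K-line file the search needs about 15 comparisons to find the first match, instead of looking at every line.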

The new version can be downloaded from
http://vim.sourceforge.net/scripts/script.php?script_id=2506

The new sample data file can be downloaded from
http://maxiangjiang.googlepages.com/ChineseIME.dict

Feedback is welcome.

Sean


pansz

Jan 5, 2009, 12:45:09 AM
to vim...@googlegroups.com
Sean wrote:

> Now, this plugin is ready.
>
> Performance is boosted.
> No more cache, no more full table scan, after the data file is sorted
> first.
>
> The new version can be downloaded from
> http://vim.sourceforge.net/scripts/script.php?script_id=2506
>
> The new sample data file can be downloaded from
> http://maxiangjiang.googlepages.com/ChineseIME.dict
>
> Feedback is welcome.

There seems to be a typo on line 122:

if !exists("g:ChineseIMESpaceToggle")
let g:ChineseIMEMappingCtrl6 = 0
endif
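
Presumably each guard is meant to test the same variable it sets. A sketch of what the corrected defaults might look like, assuming the default of 0 given for both options in the announcement:

```vim
" Each option gets its default only when the user has not set it.
if !exists("g:ChineseIMESpaceToggle")
  let g:ChineseIMESpaceToggle = 0
endif
if !exists("g:ChineseIMEMappingCtrl6")
  let g:ChineseIMEMappingCtrl6 = 0
endif
```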

Sean

Jan 5, 2009, 12:59:58 AM
to vim_use
Thanks. It is fixed, and there are more 'convenient' features for <Space>:

g:ChineseIMESpaceToggle:
toggle punctuation
toggle the use of <Space> to trigger popup
toggle cursor color to identify the 'IME mode'

anhnmncb

Jan 5, 2009, 3:54:15 AM
to vim...@googlegroups.com
On 2009-01-05, Sean wrote:
>
> Now, this plugin is ready.
>
> Performance is boosted.
> No more cache, no more full table scan, after the data file is sorted
> first.

Excellent work, thank you! But I find that searching for a single character
still takes some time (maybe 3~5 seconds); maybe it can still be improved?

One suggestion: if the preceding character is Chinese, then <space> should
input a real <space> instead of reporting "pattern not found".
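
A possible way to get that behavior is an <expr> mapping, sketched below. SmartSpace is a hypothetical name, and the test for "something to convert" is simply whether an ASCII letter sits right before the cursor:

```vim
" Sketch only: trigger completion after an ASCII letter, otherwise
" insert a literal space (e.g. after a hanzi or at the start of a line).
function! SmartSpace()
  let before = strpart(getline('.'), 0, col('.') - 1)
  if before =~# '\a$'
    return "\<C-X>\<C-U>\<C-U>\<C-P>\<C-N>"
  endif
  return ' '
endfunction

inoremap <expr> <Space> SmartSpace()
```

The mapping is non-recursive, so the returned keys are interpreted as vim's built-in completion commands rather than being remapped again.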

Anyway, this time it's really ready for use!

>
> The new version can be downloaded from
> http://vim.sourceforge.net/scripts/script.php?script_id=2506
>
> The new sample data file can be downloaded from
> http://maxiangjiang.googlepages.com/ChineseIME.dict
>
> Feedback is welcome.
>
> Sean
>


--
Regards,
anhnmncb

anhnmncb

Jan 5, 2009, 4:25:24 AM
to vim...@googlegroups.com
On 2009-01-05, Sean wrote:
>
> Now, this plugin is ready.
>
> Performance is boosted.
> No more cache, no more full table scan, after the data file is sorted
> first.
>
> The new version can be downloaded from
> http://vim.sourceforge.net/scripts/script.php?script_id=2506
>
> The new sample data file can be downloaded from
> http://maxiangjiang.googlepages.com/ChineseIME.dict

Actually, the options 'pumheight' and 'completeopt' are global, so it's no
use setting them locally. I think you can add some variables so that it
works this way: when the IME is turned on, store their current settings and
then set them; when it is turned off, restore them.

>
> Feedback is welcome.
>
> Sean
>


--
Regards,
anhnmncb

anhnmncb

Jan 5, 2009, 4:31:27 AM
to vim...@googlegroups.com
On 2009-01-05, anhnmncb wrote:
>
> On 2009-01-05, Sean wrote:
>>
>> Now, this plugin is ready.
>>
>> Performance is boosted.
>> No more cache, no more full table scan, after the data file is sorted
>> first.
>>
>> The new version can be downloaded from
>> http://vim.sourceforge.net/scripts/script.php?script_id=2506
>>
>> The new sample data file can be downloaded from
>> http://maxiangjiang.googlepages.com/ChineseIME.dict
>
> Actually, the options of 'pumheight' and 'completeopt' take effect globally,
> so it's no use to set them locally, I think you can add some variables, so it
> works in this way: when IME is on, store their settings then set them, when
> off, restore theirs.

Here is my way:

let s:ywim = 0
function! YW_IMtoggle()
    if s:ywim == 0
        " Turning the IME on: remember the global options first.
        let s:oldpumheight = &pumheight
        let s:oldcompleteopt = &completeopt
        let g:ChineseIMESpaceToggle = 1
        set pumheight=10
        set completeopt=menu,preview,longest
        let s:ywim = 1
    elseif s:ywim == 1
        " Turning the IME off: restore the saved values. ":set" cannot
        " take a variable as its value, so assign through &option.
        let g:ChineseIMESpaceToggle = 0
        let &pumheight = s:oldpumheight
        let &completeopt = s:oldcompleteopt
        let s:ywim = 0
    endif
endfunction
imap <C-\> <C-o>:call YW_IMtoggle()<CR>

I think using <C-\> or something else instead of <tab> is better.

Sean

Jan 5, 2009, 12:50:27 PM
to vim_use
Thanks. I just made two cosmetic modifications:

(1) g:ChineseIME_Toggle_InertMode (i_<C-\> as default now)
(2) g:ChineseIME_Toggle_i_Ctrl6

Sean

John Beckett

Jan 5, 2009, 4:45:24 PM
to vim...@googlegroups.com
Sean wrote:
> Thanks. I just made two cosmetical modification:

Good stuff, but please follow the conventions of this mailing list:

* Bottom post, that is:
* Quote a few relevant lines; put your reply underneath.

If nothing is really relevant, delete it all.

In case anyone is wondering ... these issues have been discussed several
times, and consensus prefers bottom posting. Whatever the merits, it is
best for us to use the same conventions. I am a manager of this list
(preventing spam), and have resolved to irritate people with replies
like this so everyone is aware of how we like things. See
http://groups.google.com/group/vim_use/web/vim-information

John

anhnmncb

Jan 5, 2009, 6:36:38 PM
to vim...@googlegroups.com
On Tue, 06 Jan 2009 01:50:27 +0800, Sean <maxian...@gmail.com> wrote:

>
> Thanks. I just made two cosmetical modification:
>
> (1) g:ChineseIME_Toggle_InertMode (i_<C-\> as default now)
> (2) g:ChineseIME_Toggle_i_Ctrl6
>
> Sean

The new version is missing a leading " on line 192.
And IMHO you also need to save/restore pumheight and
completeopt.

What about improving the <space> behavior in different conditions
and the slow search for a single character? :)

pansz

Jan 5, 2009, 7:52:50 PM
to vim...@googlegroups.com
John Beckett wrote:

> Sean wrote:
>> Thanks. I just made two cosmetical modification:
>
> Good stuff, but please follow the conventions of this mailing list:
>
> * Bottom post, that is:
> * Quote a few relevant lines; put your reply underneath.
>
> If nothing is really relevant, delete it all.

I suggest using Thunderbird, which has good support for bottom posting. If
anybody knows other mail clients that support bottom posting well, please
tell me, thanks.

For the OP:
It seems that you mix Chinese characters (字) and words (词) together
in the same dict, so it would be hard for different input methods to
share the same word table (词汇表).

For example, if there's a new word “一个新词”, you may have pinyin,
shuangpin, wubi, or something else, each with different mappings
for Chinese characters, but a new word should be valid
for all input methods.

Any workaround?

bill lam

Jan 5, 2009, 8:36:26 PM
to vim...@googlegroups.com
On Tue, 06 Jan 2009, pansz wrote:
> I suggest using thunderbird, which has good support of bottom post, if
> anybody knows more mail client which support bottom post well, please
> tell me, thanks.

I don't think the email client is a good excuse for top (or bottom) posting.
Anyway, if vim is configured as the external editor for composing email,
it can search for the first blank line, which should be the bottom
of the quoted message.

set editor="vim -c \"set spell tw=70 et\" \"+call search('^$')\""

FWIW I use mutt.

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
唐詩284 張祜 集靈臺二首之一
日光斜照集靈台 紅樹花迎曉露開 昨夜上皇新授籙 太真含笑入簾來

Tony Mechelynck

Jan 5, 2009, 9:20:58 PM
to vim...@googlegroups.com
On 06/01/09 01:52, pansz wrote:
[...]

> I suggest using thunderbird, which has good support of bottom post, if
> anybody knows more mail client which support bottom post well, please
> tell me, thanks.
[...]

I use SeaMonkey (version 2.0 alpha) which shares much of its Mail&News
code with Thunderbird, and in particular supports both top- and
bottom-posting exactly the way Thunderbird does. One of the reasons why
I prefer SeaMonkey over Thunderbird is its "Suite" architecture -- it
includes not only a Mail/News client and an address book but also a
browser, a chat client, and, for those who know how to use it (I don't
-- but then I use Vim instead) an HTML editor. Another thing which I
like in this Suite concept is that HTTP/FTP links in mail will open the
page in the browser, RSS/Atom feeds seen in pages in the browser can
(with a one-click action in the browser) be subscribed in the mailer,
and an IRC link in either mailer or browser will open the chat client,
in all cases regardless of the "default" application settings in your
OS. Of course, other people will call "bloat" what I call "versatility",
so you should use what suits you (not me) best.

...and, yes, I realize that most of this reply is OT for this list.


Best regards,
Tony.
--
Put your Nose to the Grindstone!
-- Amalgamated Plastic Surgeons and Toolmakers, Ltd.

Tony Mechelynck

Jan 5, 2009, 9:46:51 PM
to vim...@googlegroups.com
On 06/01/09 02:36, bill lam wrote:
> On Tue, 06 Jan 2009, pansz wrote:
>> I suggest using thunderbird, which has good support of bottom post, if
>> anybody knows more mail client which support bottom post well, please
>> tell me, thanks.
>
> I don't think email client is a good excuse for top (or bottom) post.
> Anyway if vim is configured as an external editor for composing email.
> It can search for the first blank line and that should be the bottom
> of quoted message.
>
> set editor="vim -c \"set spell tw=70 et\" \"+call search('^$')\""
>
> FWIW I use mutt.
>

Depending on how the mail is passed to the external editor (and by what
mail useragent program), the first blank line might actually be the
boundary between the headers and the body, i.e., the top of the message
proper.

The bottom of the quoted message (before you compose your reply) should
be the bottom of what is passed to the external editor, so I recommend

GO
or
:$append

Best regards,
Tony.
--
Some programming languages manage to absorb change but withstand
progress.

Sean

Jan 6, 2009, 12:06:25 AM
to vim_use
>> The new version lacks a head " in the line 192.
>> And IMHO you should also need to save/restore the pumheight and completeopt.

DONE. Thanks

>> What about the improve for <space> behavior when in different condition

<Space> mapping is for convenience. I am not sure whether more
"overloading" is good or not. Also, to insert a normal space, one has
to use <C-V><Space>.

The i_<C-\> IME Insert Mode can do more now, for inputting Chinese
comfortably.

=> toggle punctuation
=> toggle the use of <Space> to trigger popup
=> toggle cursor color to identify the 'IME mode'
=> toggle options 'pumheight', 'completeopt', 'lazyredraw'

>> and the slow speed for searching just one charactor? :)

Technically, nothing more can be done there, as the search algorithm
now does no scanning, no looping, and no caching at all.

Perhaps we can limit the maximum number of items shown on the popup menu
to something like 30? But I expect complaints if someone wants to
<PageDown> through everything available. The best way, in my opinion, is
to avoid one-character searches, because nobody (human or machine) can
know what to offer other than the whole list.

Now, the new version is uploaded to improve the following two things:

(1) i_<C-\> IME Insert Mode has more default settings
(2) optional comments can be added in the middle of each line of the
data file

Item (2) above makes the data file a little more controllable,
based on my new algorithm.

For example, in the data file we have:
(The corresponding data file is also updated)

-----------------------
ma3 1 马
ma3 2 馬
ma3 3 码
ma3 4 玛
ma3 9 吗
-----------------------

By typing ma3<Space>, 5 Chinese characters will be shown in the order
specified, and the default is the first one. The design idea is to
make the whole data file "sortable" by vim sort function. I also made
the middle parts optional, avoiding restrictions as much as possible.
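
A minimal sketch of how such a sortable data file could be read, written in Python rather than the plugin's Vim script. The names parse_entry and candidates are hypothetical, and the sketch assumes whitespace-separated fields with an optional middle field, as described above; it is not the plugin's actual code.

```python
def parse_entry(line):
    """Split 'key [middle] value' into (key, middle, value).

    Assumes whitespace-separated fields; the middle field may be absent.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least a key and a value: %r" % line)
    return fields[0], " ".join(fields[1:-1]), fields[-1]


def candidates(lines, key):
    """Return the values for `key`, ordered by a plain sort of the lines."""
    matching = sorted(line for line in lines if line.split()[:1] == [key])
    return [parse_entry(line)[2] for line in matching]


# The example from the message above, deliberately shuffled.
DATA = ["ma3 9 吗", "ma3 2 馬", "ma3 1 马", "ma3 4 玛", "ma3 3 码"]
print(candidates(DATA, "ma3"))  # ['马', '馬', '码', '玛', '吗']
```

Because the ordering comes from sorting whole lines, the middle field doubles as a priority: changing "9" to "0" on the last entry would move it to the front.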

Feedback is always welcome.


Sean

anhnmncb

Jan 6, 2009, 12:31:06 AM
to vim...@googlegroups.com
On Tue, 06 Jan 2009 13:06:25 +0800, Sean wrote:

>
>>> The new version lacks a head " in the line 192.
>>> And IMHO you should also need to save/restore the pumheight and
>>> completeopt.
>
> DONE. Thanks
>
>>> What about the improve for <space> behavior when in different condition
>
> <Space> mapping is for convenience. I am not sure if more
> "overloading" is good or not. Also, to insert a normal space, one has
> to use <C-V><Space>.

I think it is needed to speed up input. An input method is not only
for inputting Chinese characters but also for input speed, and the
total time you need to switch back and forth between English and
Chinese modes is a bit long.

>
> The i_<C-\> IME Insert Mode can do more now, for inputing Chinese
> comfortably.
>
> => toggle punctuation
> => toggle the use of <Space> to trigger popup
> => toggle cursor color to identify the 'IME mode'
> => toggle options 'pumheight', 'completeopt', 'lazyredraw'
>
>>> and the slow speed for searching just one charactor? :)
>
> Technically, nothing more can be done there, as the algorithm for
> searching is now doing no scanning, no loop, and no cache at all.

Maybe it can be done this way? When searching for, say, "w",
ChineseIME would only search the entries that
start with "w" and are just one or two characters long.
And when searching for "ww", it would search the entries
that start with "ww" and are just two or three characters long.

anhnmncb

Jan 6, 2009, 1:19:39 AM
to vim...@googlegroups.com
> On Tue, 06 Jan 2009 13:06:25 +0800, Sean wrote:
>
> Feedback is always welcome.

1. Maybe define an autocmd so that ChineseIME is disabled
when leaving insert mode?

2. Why not name your script "vim", i.e. Vim Input Method for Chinese? :)

pansz

Jan 6, 2009, 1:43:06 AM
to vim...@googlegroups.com
anhnmncb wrote:

>> On Tue, 06 Jan 2009 13:06:25 +0800, Sean wrote:
>>
>> Feedback is always welcome.
>
> 1. Maybe need to define an autocmd so that
> when leave from insert mode, then disable ChineseIME?
>
> 2. Why not name your script to "vim" i.e. Vim Input Method for Chinese? :)
>
The name ChineseIM.vim already contains "vim".

As for the name, I think a capitalized filename creates unnecessary
inconvenience when working on the command line, especially on
case-sensitive file systems.

Since vim is designed to be cross-platform, I think it would be better to
use an all-lowercase filename, which works well on all platforms. For
example, inputmethod.vim would be a lot better, since the script can be
used to input much more than just Chinese characters.

Tony Mechelynck

Jan 6, 2009, 1:55:11 AM
to vim...@googlegroups.com

If you really want _all_ platforms, use an 8.3 name in single case.
However in this case 8.3 is probably irrelevant since IIRC Dos versions
of Vim, if still used, almost never have +multi_byte.

Best regards,
Tony.
--
"I'd love to go out with you, but there are important world issues that
need worrying about."

Sean

Jan 6, 2009, 2:52:59 AM
to vim_use
>> Technically, nothing more can be done there, as the algorithm for
>> searching is now doing no scanning, no loop, and no cache at all.
>
> Maybe it can be done in this way? If search for, say, "w",
> then ChineseIME will just search the list that
> starting from "w" and consisting just one or two charactors long.
> And if search for "ww", it searches the lists
> that start with "ww" and just two or three charactors long

Good idea! I just implemented it, and uploaded another new version :)

Improved performance when searching by one or two letters:
(1) the search is limited when a short key is typed
(2) the length limit is hard-coded to 2
(3) for example, a<C-^> shows the mappings for "a" only, not for 'aiqing'
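
A rough sketch of that rule in Python (not the plugin's Vim script). The names lookup and SHORT_KEY_LIMIT are hypothetical, and the assumption that keys longer than the limit match by prefix is mine; the message above only states what happens for short keys.

```python
SHORT_KEY_LIMIT = 2  # hypothetical name; the thread only says the limit is hard-coded to 2


def lookup(entries, typed, limit=SHORT_KEY_LIMIT):
    """Return candidate values for the typed key.

    entries is a list of (key, value) pairs in data-file order.
    Keys of `limit` letters or fewer match exactly; longer keys are
    assumed to match by prefix.
    """
    if len(typed) <= limit:
        return [value for key, value in entries if key == typed]
    return [value for key, value in entries if key.startswith(typed)]


ENTRIES = [("a", "啊"), ("a", "阿"), ("aiqing", "爱情"), ("ma", "马"), ("mama", "妈妈")]
print(lookup(ENTRIES, "a"))    # ['啊', '阿']
print(lookup(ENTRIES, "aiq"))  # ['爱情']
```

The short-key branch is what keeps the popup small: "a" no longer drags in every key that merely begins with "a".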

Sean

Sean

Jan 6, 2009, 3:00:19 AM
to vim_use
> > set editor="vim -c \"set spell tw=70 et\" \"+call search('^$')\""
>
> > FWIW I use mutt.
>
> Depending on how the mail is passed to the external editor (and by what
> mail useragent program), the first blank line might actually be the
> boundary between the headers and the body, i.e., the top of the message
> proper.
>
> The bottom of the quoted message (before you compose your reply) should
> be the bottom of what is passed to the external editor, so I recommend

I have not used any standalone mail client for years. I read this
thread in Firefox, by the way.

To reply, I simply copied the text into vim and then pasted my reply
back into the browser.

Among other things, I have never lost my typing :)

Sean

anhnmncb

Jan 6, 2009, 3:01:11 AM
to vim...@googlegroups.com
Yes, now the speed is great :) But...

It still needs some improvement: say "ww" has no Chinese characters but
"www" has; then it should search the longer list instead of reporting
that no pattern was found.

I'm so glad that ChineseIME gets better and better every day!
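
The fallback suggested above could look roughly like this, sketched in Python rather than Vim script. lookup_with_fallback is a hypothetical name, the keys and values are placeholders rather than real dictionary entries, and picking the shortest longer keys is one possible reading of "search the longer list".

```python
def lookup_with_fallback(entries, typed):
    """Exact matches first; otherwise the shortest longer keys with this prefix."""
    exact = [value for key, value in entries if key == typed]
    if exact:
        return exact
    longer = [(key, value) for key, value in entries if key.startswith(typed)]
    if not longer:
        return []  # only now would "pattern not found" be justified
    shortest = min(len(key) for key, _ in longer)
    return [value for key, value in longer if len(key) == shortest]


# Placeholder keys and values, not real dictionary entries.
ENTRIES = [("w", "W1"), ("www", "W3a"), ("www", "W3b"), ("wwww", "W4")]
print(lookup_with_fallback(ENTRIES, "ww"))  # ['W3a', 'W3b']
```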

>
> Sean

_sc_

Jan 6, 2009, 3:12:39 AM
to vim...@googlegroups.com
>
> I have not used any standalone mail client for years. I read this
> thread on firefox, by the way.
>
> To reply, I simply did copy/paste to vim, from where I did another
> copy/paste back to browser.
>
> Among other things, I have never lost my typing :)

you might want to check out the "It's All Text" plugin for
firefox -- it allows you to use gvim to edit any editable
thing in firefox -- it just might save you some copying and
pasting

sc


anhnmncb

Jan 6, 2009, 3:34:16 AM
to vim...@googlegroups.com
On Tue, 06 Jan 2009 15:52:59 +0800, Sean wrote:

>
Would you please consider the following suggestions?

1. If the character under the cursor is not a non-blank ASCII character,
then <Space> should just act as a normal <Space>, i.e. insert a real space,
instead of reporting "pattern not found", which is really useless.

2. Add an autocmd to toggle the IME off when leaving insert mode.

Neither of them is hard to implement, I think.
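
A sketch of the decision behind suggestion 1, in Python rather than Vim script. space_triggers_popup is a hypothetical name, and the sketch assumes the relevant character is the one just before the insert-mode cursor.

```python
def space_triggers_popup(text_before_cursor):
    """True when <Space> should open the candidate popup.

    Assumption: a lookup only makes sense right after a printable,
    non-blank ASCII character; otherwise <Space> inserts a plain space.
    """
    if not text_before_cursor:
        return False
    last = text_before_cursor[-1]
    return last.isascii() and last.isprintable() and not last.isspace()


print(space_triggers_popup("ma"))   # True: a key has just been typed
print(space_triggers_popup("中"))   # False: nothing to look up
print(space_triggers_popup("ma "))  # False: the previous character is blank
```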

>
> Sean

anhnmncb

Jan 6, 2009, 3:51:22 AM
to vim...@googlegroups.com
A weird problem that I can't figure out: ChineseIME only works
when I start a new vim and use it in the [No Name] buffer.
After I edit other files, <C-\> has no effect. Why?
My .vimrc has just one line related to ChineseIME:

let g:ChineseIME_Toggle_InertMode=1

Dominique Pelle

Jan 6, 2009, 3:53:52 AM
to vim...@googlegroups.com
bill lam wrote:

> I don't think email client is a good excuse for top (or bottom) post.
> Anyway if vim is configured as an external editor for composing email.
> It can search for the first blank line and that should be the bottom
> of quoted message.
>
> set editor="vim -c \"set spell tw=70 et\" \"+call search('^$')\""
>
> FWIW I use mutt.

A tad shorter and doing the same:

set editor="vim +'/^$/' -c 'set spell tw=70 et'"

-- Dominique

bill lam

Jan 6, 2009, 4:00:03 AM
to vim...@googlegroups.com
On Tue, 06 Jan 2009, Dominique Pelle wrote:
> set editor="vim +'/^$/' -c 'set spell tw=70 et'"

I once used that but it raises an error message when there is no empty
line.

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3

唐詩191 杜甫 詠懷古跡五首之二
搖落深知宋玉悲 風流儒雅亦吾師 悵望千秋一灑淚 蕭條異代不同時
江山故宅空文藻 雲雨荒臺豈夢思 最是楚宮俱泯滅 舟人指點到今疑

Dotan Cohen

Jan 6, 2009, 3:57:07 AM
to vim...@googlegroups.com
2009/1/6 bill lam <cbil...@gmail.com>:

> I don't think email client is a good excuse for top (or bottom) post.

Exactly. I use Gmail, which uses top post and has no option to change.
So I _move_the_cursor_down_manually_. Takes a lot of effort, let me
tell you!

--
Dotan Cohen

http://what-is-what.com
http://gibberish.co.il

א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת
ا-ب-ت-ث-ج-ح-خ-د-ذ-ر-ز-س-ش-ص-ض-ط-ظ-ع-غ-ف-ق-ك-ل-م-ن-ه‍-و-ي
А-Б-В-Г-Д-Е-Ё-Ж-З-И-Й-К-Л-М-Н-О-П-Р-С-Т-У-Ф-Х-Ц-Ч-Ш-Щ-Ъ-Ы-Ь-Э-Ю-Я
а-б-в-г-д-е-ё-ж-з-и-й-к-л-м-н-о-п-р-с-т-у-ф-х-ц-ч-ш-щ-ъ-ы-ь-э-ю-я
ä-ö-ü-ß-Ä-Ö-Ü

anhnmncb

Jan 6, 2009, 8:53:34 AM
to vim...@googlegroups.com
On Tue, 06 Jan 2009 16:51:22 +0800, anhnmncb <anhn...@sina.com> wrote:

> A weird problem that I can't figure it out, ChineseIME just works
> when I start a new vim, and use it in [no name] buffer,
> then after I edit other files, <C-\> has no any effect, why?
> My .vimrc just has one line relative to ChineseIME:
>
> let g:ChineseIME_Toggle_InertMode=1
>

I found the problem: the script uses a buffer-local imap for <C-\> at
line 237. You'd better change it from:

imap <buffer> <C-\> <C-O>:call ChineseIME_Toggle_InertMode()<CR>

to:

imap <C-\> <C-O>:call ChineseIME_Toggle_InertMode()<CR>

Sean

Jan 6, 2009, 8:48:05 PM
to vim_use
An updated version of ChineseIM.vim and data file are available.
Following are improvements:

(1) Made new naming
(a) Plugin File => ChineseIM.vim
(b) Data File => http://maxiangjiang.googlepages.com/ChineseIM.dict
(c) global-variable => g:ChineseIM_InsertMode_Toggle
(d) global-variable => g:ChineseIM_Ctrl6_Toggle

(2) Added a quick demo, to play without data file installed.
(a) Assumption: vim is configured to show Chinese
(b) source this script file by :source %
(c) when in Insert mode, type:
ma<C-X><C-U>
chin<C-X><C-U>

(3) Showing more information on the popup menu:
It is great when we have English Data File.
To demo the usage, I added 2 entries ("english", "chinese") to the
new data file. Now type:
engl<C-X><C-U>
chin<C-X><C-U>

(4) In ChineseIM_InsertMode, made punctuation intelligent :))

, ==> Chinese , plus <Space>
. ==> Chinese . plus <Space>
: ==> Chinese : plus <Space>
; ==> Chinese ; plus <Space>
? ==> Chinese ? plus <Space>
\ ==> Chinese \ plus <Space>

(5) In ChineseIM_InsertMode, added one more indication
(a) cursor color turns to green
(b) Status line shows " -- INSERT (lang) --"

(6) Uploaded a new data file, adding pinyin with tones to limit
selection.
For example, "ma1", "ma2", "ma3", "ma4" work now

(7) Downloading:
(a) Data File: http://maxiangjiang.googlepages.com/ChineseIM.dict
(b) Plugin: http://vim.sourceforge.net/scripts/script.php?script_id=2506

Again, feedback is always welcome.

Thanks

Sean


Sean

Jan 6, 2009, 9:42:48 PM
to vim_use

Yue Wu

Jan 6, 2009, 10:39:40 PM
to vim...@googlegroups.com

Didn't you see my reply? Your script still uses a buffer-local imap,
so it only works for the [No Name] buffer or the file
you open when starting vim from the command line.

>
> Again, feedback is always welcome.

1. I think you can define a variable to toggle between inputting Chinese
and English punctuation.

2. The search still needs some improvement: say "ww" and "www"
have no Chinese characters but "wwww" has; then typing "ww" should search
the longer list instead of reporting that no pattern was found.

3. If the character under the cursor is not a non-blank ASCII character,
then <Space> should just act as a normal <Space>, i.e. insert a real space,
instead of reporting "pattern not found", which is really useless. I think
that would be more intelligent.

4. Add an autocmd to toggle the IME off when leaving insert mode.
Something like this:

au InsertLeave * call ChineseIM_InsertMode_ToggleOff()

function ChineseIM_InsertMode_ToggleOff()
    " ----------------------------------- options
    let &pumheight=s:saved_pumheight
    let &completeopt=s:saved_completeopt
    let &lazyredraw=s:saved_lazyredraw
    " ----------------------------------- IM mode indication
    let &iminsert=s:saved_iminsert
    highlight Cursor guifg=bg guibg=fg
    " ----------------------------------- <Space>
    imap <Space> <Space>
    " ----------------------------------- bracket
    imap ( (
    imap ) )
    imap < <
    imap > >
    imap [ [
    imap ] ]
    " ----------------------------------- punctuation
    imap , ,
    imap . .
    imap : :
    imap ; ;
    imap ? ?
    imap \\ \\
    " -----------------------------------
    let s:n += 1
endfunction


--
Regards,
Van.

bill lam

Jan 6, 2009, 11:38:19 PM
to vim...@googlegroups.com
On Tue, 06 Jan 2009, Sean wrote:
> Again, feedback is always welcome.

Does it (or will it) support wildcard?
eg,
ap?e will match ape, apple but not apply as key.
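
A small sketch of matching keys that way, in Python. Going by the example (ape and apple match, apply does not), "?" is assumed here to stand for any run of characters, including none; wildcard_to_regex and match_keys are hypothetical names, not part of the plugin.

```python
import re


def wildcard_to_regex(pattern):
    """Compile a key pattern in which '?' matches any run of characters.

    This reading of '?' is an assumption inferred from the example.
    """
    literal_parts = [re.escape(part) for part in pattern.split("?")]
    return re.compile("^" + ".*".join(literal_parts) + "$")


def match_keys(keys, pattern):
    """Return the keys that match the wildcard pattern, in input order."""
    regex = wildcard_to_regex(pattern)
    return [key for key in keys if regex.match(key)]


print(match_keys(["ape", "apple", "apply", "apricot"], "ap?e"))  # ['ape', 'apple']
```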

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3

唐詩155 許渾 早秋
遙夜汎清瑟 西風生翠蘿 殘螢栖玉露 早雁拂銀河
高樹曉還密 遠山晴更多 淮南一葉下 自覺老煙波

Sean

Jan 7, 2009, 12:54:25 AM
to vim_use
> Does it (or will it) support wildcard?
>> eg, ap?e will match ape, apple but not apply as key.

Yes! I can certainly make vim do it, and much more.

The problem is performance. It takes time to load the whole data
file into a cache and build a data tree there.

As of now, I have tried to make it work as fast as possible without a
full table scan or a huge cache. I also want to avoid making it a
dedicated IM editor unless we explicitly choose that option.

These are my design goals with Chinese IM plugin:

(1) without any negative impact to vim if we don't use the IM
(2) with decent performance if we start to use the IM

bill lam

Jan 7, 2009, 9:46:42 PM
to vim...@googlegroups.com
On Tue, 06 Jan 2009, Sean wrote:
> (2) with decent performance if we start to use the IM

Can the dict be pre-compiled to speed up load time?

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3

唐詩166 崔塗 孤雁
幾行歸塞盡 片影獨何之 暮雨相呼失 寒塘欲下遲
渚雲低暗渡 關月冷相隨 未必逢矰繳 孤飛自可疑

Sean

Jan 8, 2009, 5:09:45 PM
to vim_use
>> (2) with decent performance if we start to use the IM
>Can the dict be pre-compiled to speed up load time?

Well, data file in binary is not an option. The data file format
used is the simplest one, while readable and editable.

The only "overhead" for the vimim plugin is loading the data file into
a vim list, and nothing more. That is the minimum requirement if we
want to input Chinese, after all.

Another thing is the utf-8 encoding, which takes 30% more space
for multi-byte characters. But vim is doing utf-8 internally anyway.

How about other editors, say, emacs? I don't believe they can
load or search faster than vimim does.

Sean

Antony Scriven

Jan 9, 2009, 8:04:21 AM
to vim...@googlegroups.com
On Thu, Jan 8, 2009 at 10:09 PM, Sean <maxian...@gmail.com> wrote:

> > > [...]
> > [...]
>
> Well, data file in binary is not an option. [...]

Which has nothing to do with the subject line. C'mon
people, get it together! --Antony

Sean

Jan 19, 2009, 4:15:33 PM
to vim_use

> > Does it (or will it) support wildcard?
> >> eg, ap?e will match ape, apple but not apply as key.

Yes, it is supported now:
http://maxiangjiang.googlepages.com/vimim_wildcard.gif

The latest version also starts to support fuzzy search:
http://maxiangjiang.googlepages.com/vimim_fuzzy.gif


Sean


bill lam

Jan 19, 2009, 10:15:51 PM
to vim...@googlegroups.com
On Mon, 19 Jan 2009, Sean wrote:
> > > Does it (or will it) support wildcard?
> > >> eg, ap?e will match ape, apple but not apply as key.
>
> Yes, it is supported now:
> http://maxiangjiang.googlepages.com/vimim_wildcard.gif

Oh, thanks. It is a very nice feature.

I'm not sure whether it supports another feature, numerals in keys. E.g. in
pinyin input there are 4 tones:
fu1
fu2
fu3
fu4

When I type 'fu' it displays a screenful of selections, but if I
continue to type 2 (intending to input fu2), will it be mistaken for
choosing the second entry in the popup box?

I usually display choices as 5 6 7... and reserve 1 2 3 4 for tones.

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3

唐詩202 盧綸 晚次鄂州
雲開遠見漢陽城 猶是孤帆一日程 估客晝眠知浪靜 舟人夜語覺潮生
三湘愁鬢逢秋色 萬里歸心對月明 舊業已隨征戰盡 更堪江上鼓鼙聲

Tony Mechelynck

Jan 19, 2009, 10:27:38 PM
to vim...@googlegroups.com
On 20/01/09 04:15, bill lam wrote:
> On Mon, 19 Jan 2009, Sean wrote:
>>>> Does it (or will it) support wildcard?
>>>>> eg, ap?e will match ape, apple but not apply as key.
>> Yes, it is supported now:
>> http://maxiangjiang.googlepages.com/vimim_wildcard.gif
>
> Oh, thanks. It is a very nice feature.
>
> I'm not sure if it supports another feature on numerals in keys. eg in
> pinyin input, they are 4 tones
> fu1
> fu2
> fu3
> fu4
>
> When I type 'fu' it display a screenful of selection, but if I
> continue to type 2 (intended to input fu2), will it be mistaken for
> choosing the second entry in the popup box?
>
> I usually display choices as 5 6 7... and reserve 1 2 3 4 for tone.
>

If that's a problem (and 1 2 3 4 are used for the first 4 results), you
could decide to represent the tones respectively by postfixed - / ~ \
instead -- or go whole hog and use marked vowels, ā á ǎ à and the like,
which can be entered by means of digraphs (using - ' < ! as the second
element of the digraph) if your keyboard hasn't got them built-in.

Best regards,
Tony.
--
Religion has done love a great service by making it a sin.
-- Anatole France

Sean

Jan 19, 2009, 11:40:01 PM
to vim_use
To avoid too many candidates for short keys (one or two characters),
I chose a special design:

(1) For one-char word, no more search:
For example: in your data file,

a xxx
a xyz
a whatever_your_multibyte

You always have 3 choices when you type a<C-^>. (No more search
for "ab", "ac", etc)

(2) For a two-char word, it is the same as above if there is a match.
Thus, "fu" in your example only searches for matches for "fu" and stops
there. (No search for "fu3", for example.) The label is always used for
selection and navigation.

(2.1) New feature: fuzzy search is now supported. If there is
no match for a two-char word, a fuzzy search is started. Thus "zg"
may bring up "zhongguo" if it exists in the data file.

(3) Feel free to use "fu1", "fu2", "fu3", "fu4" if they exist in your
data file.
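
How the fuzzy search in (2.1) decides on a match is not spelled out here. One simple reading, sketched in Python under that assumption, is that the typed letters must appear in the key in the same order (a subsequence test). is_subsequence and fuzzy_lookup are hypothetical names, and this rule is fairly loose, so the real plugin may well be stricter.

```python
def is_subsequence(typed, key):
    """True if the letters of `typed` occur in `key` in the same order."""
    remaining = iter(key)
    # `letter in remaining` advances the iterator past the first hit.
    return all(letter in remaining for letter in typed)


def fuzzy_lookup(entries, typed):
    """Exact matches win; otherwise fall back to subsequence matches."""
    exact = [value for key, value in entries if key == typed]
    if exact:
        return exact
    return [value for key, value in entries if is_subsequence(typed, key)]


ENTRIES = [("fu", "福"), ("zhongguo", "中国"), ("zaijian", "再见")]
print(fuzzy_lookup(ENTRIES, "zg"))  # ['中国']
print(fuzzy_lookup(ENTRIES, "fu"))  # ['福']
```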

Please take a look at the pictures at:
http://maxiangjiang.googlepages.com/vimim.html

Sean

Feb 5, 2009, 6:05:26 PM
to vim_use


On Jan 19, 7:27 pm, Tony Mechelynck <antoine.mechely...@gmail.com>
wrote:
> On 20/01/09 04:15, bill lam wrote:
>
>
>
> > On Mon, 19 Jan 2009, Sean wrote:
> >>>> Does it (or will it) support wildcard?
> >>>>> eg, ap?e will match ape, apple but not apply as key.
> >> Yes, it is supported now:
> >> http://maxiangjiang.googlepages.com/vimim_wildcard.gif
>
> > Oh, thanks. It is a very nice feature.
>
> > I'm not sure if it supports another feature on numerals in keys. eg in
> > pinyin input, they are 4 tones
> > fu1
> > fu2
> > fu3
> > fu4
>
> > When I type 'fu' it display a screenful of selection, but if I
> > continue to type 2 (intended to input fu2), will it be mistaken for
> > choosing the second entry in the popup box?
>
> > I usually display choices as 5 6 7... and reserve 1 2 3 4 for tone.
>
> If that's a problem (and 1 2 3 4 are used for the first 4 results), you
> could decide to represent the tones respectively by postfixed - / ~ \
> instead -- or go whole hog and use marked vowels, ā á ǎ à and the like,
> which can be entered by means of digraphs (using - ' < ! as the second
> element of the digraph) if your keyboard hasn't got them built-in.
>
> Best regards,
> Tony.
> --
> Religion has done love a great service by making it a sin.
>                 -- Anatole France

Actually, what Tony offered is an ideal solution. However, the majority
of users may not want to learn anything new. :))

I came up with a compromise: reserve the numbers 1, 2, 3, 4 for
pinyin tones, and leave the rest of the numbers for selection.

The result is shown on http://maxiangjiang.googlepages.com/vimim_pinyin_tone.gif.

The plugin can be downloaded from http://maxiangjiang.googlepages.com/vimim.vim

Sean

bill lam

Feb 5, 2009, 9:11:45 PM
to vim...@googlegroups.com
On Thu, 05 Feb 2009, Sean wrote:
> I came up with a compromise: reserve the numbers 1, 2, 3, 4 for
> pinyin tones, and leave the rest of the numbers for selection.

I can see it is hardcoded to use 4 numbers only. Can it be made
user-configurable to any number from 1 to 9? Yes, 9. There are 9
tones in Cantonese.

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3

唐詩312 王維 渭城曲
渭城朝雨浥輕塵 客舍青青柳色新 勸君更盡一杯酒 西出陽關無故人

Sean

Feb 5, 2009, 10:34:24 PM
to vim_use


On Feb 5, 6:11 pm, bill lam <cbill....@gmail.com> wrote:
> On Thu, 05 Feb 2009, Sean wrote:
> > I came up with a compromise: reserve the numbers 1, 2, 3, 4 for
> > pinyin tones, and leave the rest of the numbers for selection.
>
> I can see it is hardcoded to use 4 numbers only, Can it be set to be
> user configurable from any number from 1 to 9?  Yes, 9. There are 9
> tones in cantonese.
>

Wow. There are 9 tones out there? No wonder everyone laughs at me when
I try to speak Cantonese. :)

Here you go: you can try this global option:
let g:vimim_enable_pinyin_tone_input=9
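
A sketch of how a typed digit could be interpreted under such a setting, in Python rather than Vim script. selection_labels and interpret_digit are hypothetical names, tone_count stands in for the value of the option above, and the plugin's real handling may differ.

```python
def selection_labels(tone_count=4):
    """Digits left over for picking popup entries once tones are reserved."""
    return [str(digit) for digit in range(tone_count + 1, 10)]


def interpret_digit(digit, tone_count=4):
    """Classify a digit typed after a pinyin key.

    Returns ('tone', n) for a reserved tone digit, ('select', index) for
    a popup label (index 0 is the first entry), or ('none', None).
    """
    labels = selection_labels(tone_count)
    number = int(digit)
    if 1 <= number <= tone_count:
        return ("tone", number)
    if digit in labels:
        return ("select", labels.index(digit))
    return ("none", None)


print(interpret_digit("2"))      # ('tone', 2): completes the key as fu2
print(interpret_digit("5"))      # ('select', 0): first popup entry
print(selection_labels(9))       # []: with 9 tones no digit is left as a label
```

With all nine digits reserved for tones, selection in this sketch would have to rely on the popup menu's own navigation keys.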

Tony Mechelynck

Feb 5, 2009, 10:42:55 PM
to vim...@googlegroups.com
On 06/02/09 00:05, Sean wrote:
[...]

> Actually, what Tony offered is an ideal result. However, the majority
> users may not want to learn anything new. :))
[...]

If you (well, not _you_ of course) don't want to learn anything new,
don't use Vim but stay with good^H^H^H^Hbad old Notepad.

Best regards,
Tony.
--
According to Kentucky state law, every person must take a bath at least
once a year.

Tony Mechelynck

Feb 6, 2009, 12:13:03 AM
to vim...@googlegroups.com
On 06/02/09 04:34, Sean wrote:
[...]

> Wow. There are 9 tones out there? No wonder everyone laughs at me when
> I tried to speak Cantonese. :)
[...]

IIUC, depending whom you ask, how you count them, and how you romanize
the language, there may be 6, 8 or 9 tones in Cantonese: see among
others http://en.wikipedia.org/wiki/Cantonese

Best regards,
Tony.
--
If the code and the comments disagree, then both are probably wrong.
-- Norm Schryer

bill lam

Feb 6, 2009, 2:36:45 AM
to vim...@googlegroups.com
On Fri, 06 Feb 2009, Tony Mechelynck wrote:
>
> On 06/02/09 04:34, Sean wrote:
> [...]
> > Wow. There are 9 tones out there? No wonder everyone laughs at me when
> > I tried to speak Cantonese. :)
> [...]
>
> IIUC, depending whom you ask, how you count them, and how you romanize
> the language, there may be 6, 8 or 9 tones in Cantonese: see among
> others http://en.wikipedia.org/wiki/Cantonese

IIUC, there are 9 tones. However, for romanization or IME purposes,
the last three tones are always associated with romanizations ending in
p, t, or k (入聲, the entering tone), so they can be merged with the first
6 tones without causing ambiguity: 7->1, 8->3, 9->6. Hypothetically, if
Mandarin also encoded the 4th tone using a trailing letter, there would
be only 3 tones in Mandarin, and obviously that would not be correct.
Not sure about 8 tones.
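
The folding described above can be sketched as follows, in Python. fold_tone and ENTERING_TONE_FOLD are hypothetical names, and the sample syllables are only illustrative romanizations.

```python
ENTERING_TONE_FOLD = {7: 1, 8: 3, 9: 6}  # the mapping given above


def fold_tone(syllable, tone):
    """Map a 9-tone number onto the 6-tone scheme.

    Tones 7 to 9 are only accepted on syllables ending in p, t or k,
    which is what keeps the folded number unambiguous.
    """
    if tone in ENTERING_TONE_FOLD:
        if syllable[-1:] not in ("p", "t", "k"):
            raise ValueError("tone %d needs a final p, t or k: %r" % (tone, syllable))
        return ENTERING_TONE_FOLD[tone]
    if 1 <= tone <= 6:
        return tone
    raise ValueError("tone out of range: %r" % (tone,))


print(fold_tone("sik", 7))   # 1
print(fold_tone("baat", 8))  # 3
print(fold_tone("luk", 9))   # 6
```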

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3

唐詩133 劉長卿 秋日登吳公臺上寺遠眺
古臺搖落後 秋日望鄉心 野寺人來少 雲峰水隔深
夕陽依舊壘 寒磬滿空林 惆悵南朝事 長江獨至今
