Interesting.
The "data file" http://maxiangjiang.googlepages.com/ChineseIME.dict
doesn't seem to be in the "key, space, hanzi" format described in the
ChineseIME.vim plugin.
Maybe someday (if I get around to it) I'll try to build a data file by
Kangxi (or similar) radicals and strokes, for people who can "see" hanzi
and maybe even "understand" them but not "say" them. I might even
already have the needed data in a different format somewhere on this
computer.
Best regards,
Tony.
--
Leibowitz's Rule:
When hammering a nail, you will never hit your finger if you
hold the hammer with both hands.
I also find that punctuation symbols can't be put in the data file to input
Chinese punctuation; in that case Vim hangs.
If you are interested, I can send you a radical-based source file for
Traditional Chinese (about 30000 characters).
I haven't used this plugin, but I guess it should limit the characters
displayed in the popup box in order to be useful. Anyway, for beginners,
it is easier to just input the first and the last radical only, and
let it display a list of characters to choose from. If it can accept
exactly two keystrokes before calling up the popup box, there should be
no speed penalty.
--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
唐詩163 馬戴 楚江懷古
露氣寒光集 微陽下楚丘 猿啼洞庭樹 人在木蘭舟
廣澤生明月 蒼山夾亂流 雲中君不見 竟夕自悲秋
As for patents... It seems that the name IME is proprietary to Microsoft.
That might be the reason most input methods in the Linux world are named
IM or XIM; they are never called IME.
I suspect you may have to change the name 'IME' into something else for
legal reasons. Anyway, 'Input Method Editor' isn't the best
description for your masterpiece; 'Input Method Inside an Editor' is, so
IMIE is probably a better name.
Yes, why not? The Unihan database is more like 30 _million_ characters,
but of course it includes a lot of info that would be superfluous in
this case.
>
> I haven't used this plugin, but I guess it should limit the characters
> displayed in the popup box in order to be useful. Anyway, for beginners,
> it is easier to just input the first and the last radical only, and
> let it display a list of characters to choose from. If it can accept
> exactly two keystrokes before calling up the popup box, there should be
> no speed penalty.
>
OTOH, it might be advantageous to modify the plugin to take advantage of
Vim's ability to display a comment next to each completion (for
instance, depending on the data source, a translation or a reading for
the hanzi in question).
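
A minimal sketch of that idea, assuming a made-up lookup table rather than the plugin's real data file: a 'completefunc' may return dictionaries instead of plain strings, and the "menu" entry is shown next to each candidate in the popup. The names g:demo_table and DemoComplete are hypothetical and not part of ChineseIME.vim.

```vim
scriptencoding utf-8

" Hypothetical sample data: key -> list of [hanzi, comment] pairs.
let g:demo_table = {}
let g:demo_table['ma'] = [['马', 'horse'], ['吗', 'question particle']]

function! DemoComplete(findstart, base)
  if a:findstart
    " First call: return the column where the run of ASCII letters
    " before the cursor starts.
    let l:line = getline('.')
    let l:start = col('.') - 1
    while l:start > 0 && l:line[l:start - 1] =~# '\a'
      let l:start -= 1
    endwhile
    return l:start
  endif
  " Second call: one completion item per candidate; "menu" holds the
  " comment displayed beside the hanzi.
  let l:items = []
  for [l:hanzi, l:note] in get(g:demo_table, a:base, [])
    call add(l:items, {'word': l:hanzi, 'menu': l:note})
  endfor
  return l:items
endfunction

setlocal completefunc=DemoComplete
```

With this sourced, typing ma followed by CTRL-X CTRL-U in Insert mode should list both hanzi with their comments.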
Best regards,
Tony.
--
Underlying Principle of Socio-Genetics:
Superiority is recessive.
You could create a keymap for CJK punctuation, or even just use
|i_CTRL-V_digit| -- most of them are in the block starting at U+3000
after all.
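
As a rough illustration of the keymap idea (not a real keymap file, and the choice of keys is only an assumption), language mappings can stand in for one: they are active only while 'iminsert' is 1, and CTRL-^ toggles them in Insert mode.

```vim
scriptencoding utf-8

" Language mappings for a few CJK punctuation marks. They only apply
" while 'iminsert' is 1; CTRL-^ in Insert mode switches them on and off.
lnoremap <buffer> . 。
lnoremap <buffer> , ，
lnoremap <buffer> <lt> 《
lnoremap <buffer> > 》
setlocal iminsert=1
```

Without any mapping, CTRL-V u3002 in Insert mode enters 。 directly (U+3002, in the block starting at U+3000); the fullwidth comma ， is U+FF0C and lives outside that block.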
Best regards,
Tony.
--
"... an experienced, industrious, ambitious, and often quite
picturesque liar."
-- Mark Twain
Please don't top post.
Windows 2000, Centrino 1.66GHz and 768 MB RAM here.
>
> Also, the gvim.exe Memory Usage is less than 7MB with or without my
> plugin.
>
> My own data file is over 32K lines, and the Chinese shows up within
> one second after I type ma<C-I>, as shown on my screenshot. If I
Hey, one second is too long. Even if the candidate list takes 0.5s to
pop up, the maximum input speed is just 120 characters per minute, too slow!
> typed <C-I> on empty space, I did find it is kind of "hanging" and I
> had to type <C-C> to stop it. However, it makes little sense to type
><C-I> over empty space.
But hitting <space> to input hanzi is quicker and more convenient than <C-i>.
>
> I also tried to sort the data file every time after I do update, but
> it makes no difference in performance to me.
>
> It would be interesting to know if it makes difference if you cut the
> sample data file by half?
My data file is for the Cangjie input method and is 30790 lines long; I don't
think that's a big data file for an IM.
>
> By the way, I disabled the standard $VIMRUNTIME/plugin directory.
>
> Thanks
>
> Sean
--
Regards,
anhnmncb
Yes, but this is vim's way, it's not an IM's way.
--
Regards,
anhnmncb
Ctrl-I and Tab are synonymous to Vim anyway.
>
>> I also tried to sort the data file every time after I do update, but
>> it makes no difference in performance to me.
>>
>> It would be interesting to know if it makes difference if you cut the
>> sample data file by half?
>
> My data file is for the Cangjie input method and is 30790 lines long; I don't
> think that's a big data file for an IM.
>
>> By the way, I disabled the standard $VIMRUNTIME/plugin directory.
>>
>> Thanks
>>
>> Sean
>
>
Best regards,
Tony.
--
THE WOMBAT
The wombat lives across the seas,
Among the far Antipodes.
He may exist on nuts and berries,
Or then again, on missionaries;
His distant habitat precludes
Conclusive knowledge of his moods.
But I would not engage the wombat
In any form of mortal combat.
It does make sense to type <C-I> over empty space, since that may be an
extra keystroke, i.e. you typed 马, then pressed <C-I> after 马.
The performance is low compared to most existing input methods. It
does improve when I truncate the dict file; in practice, however, the
dict file would only be larger than the sample, not smaller.
After all, is there any use case where this can be used for a full
sentence? Consider the following:
1. Space cannot be used for completion, which is very common in IMEs.
2. Number keys cannot be used for selection, e.g. when there are multiple
candidates there is no way to select the 3rd quickly.
3. The completion window does not pop up while typing; we must type <C-I>,
which means we don't know whether a word is available and we don't
notice typing errors.
4. The first characters cannot be matched if the whole word is missing:
type woyao<C-I> and nothing happens. We expect 'wo' to be matched and
我 shown, since the problem here is only that woyao is not recorded as
vocabulary.
At the current stage, the ease of use is low, and if we rely on Vim's
built-in completion, things cannot easily be improved.
--
Regards,
anhnmncb
>> 4. The first characters cannot be matched if the whole word is missing:
>> type woyao<C-I> and nothing happens. We expect 'wo' to be matched and
>> 我 shown, since the problem here is only that woyao is not recorded as
>> vocabulary.
> The feature you mentioned here seems like "associate" (联想), which is
> not available for vim omni completion.
What I meant is not "associate" (联想); it is, say, part-completion.
For example, suppose we have the following in the dict:
aaa 阿
aaa 啊
bbb 波
bbb 播
but we don't have the vocabulary entry aaabbb.
Then how can I complete when I type aaabbb? A typical IM should try to
complete as much as possible, so when we press <space> it tries to
complete aaa as 阿 and/or opens the selection menu for aaa, leaving bbb
behind; then if we press <space> again, it tries to complete bbb as
波 and/or opens the selection menu for bbb.
This is also useful even if we have an item in the dict and we want a
different combination:
aaabbb 阿波
Now if we type aaabbb and want it to complete to 啊播, how? When we
press the first <space> we should get a menu which lists 阿波 as the
first item and 阿 and 啊 as the second and third items. If we select 阿波
then everything is finished; if we select 啊, then we get the menu
for the next character, which shows 波 and 播.
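
The part-completion described above could be sketched roughly as follows. This is only an illustration under assumptions: the dictionary is a hand-written Vim Dictionary rather than the plugin's data file, and g:demo_dict and PartCandidates are hypothetical names. Every prefix of the typed string is tried, longest first, and each candidate remembers the part left over for the next <space>.

```vim
scriptencoding utf-8

" Hypothetical sample dictionary: key -> list of candidates.
let g:demo_dict = {}
let g:demo_dict['aaa'] = ['阿', '啊']
let g:demo_dict['bbb'] = ['波', '播']
let g:demo_dict['aaabbb'] = ['阿波']

" Return a list of {'word': candidate, 'rest': untranslated remainder},
" trying the longest prefix of a:typed first so that whole-word matches
" lead the menu.
function! PartCandidates(typed, dict)
  let l:result = []
  let l:len = strlen(a:typed)
  while l:len > 0
    let l:key = strpart(a:typed, 0, l:len)
    for l:word in get(a:dict, l:key, [])
      call add(l:result, {'word': l:word, 'rest': strpart(a:typed, l:len)})
    endfor
    let l:len -= 1
  endwhile
  return l:result
endfunction

" PartCandidates('aaabbb', g:demo_dict) yields 阿波 (rest ''), then
" 阿 and 啊 (both with rest 'bbb').
```

Selecting 啊 would leave 'bbb' as the remainder, and calling PartCandidates('bbb', g:demo_dict) then gives the menu with 波 and 播.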
set completeopt=menu,preview,longest
set pumheight=10
set completefunc=ChineseIME
And why not add a suggestion for the imap on vimscript.org instead of putting it in your
plugin?
imap <C-^> <C-X><C-U><C-U><C-P><C-N>
--
Regards,
anhnmncb
Yes, all these should use ":setlocal" and be in a filetype-plugin, or
maybe in some function called only on demand; not in a global plugin
used for every file (including, let's say, C sources in ASCII or,
speaking of "human" languages, Russian text in Cyrillic).
>>
>> And why not add a suggestion for the imap on vimscript.org instead of putting it in your
>> plugin?
>>
>> imap<C-^> <C-X><C-U><C-U><C-P><C-N>
Here also, ":imap <buffer> <C-^> etc." might be better; and, as you
say, it should perhaps be commented-out by default, since some users
might prefer using another {lhs}. Myself, I don't know if my fr_BE
keyboard has a key or key combo for Ctrl-^ so I might prefer <F12> as
the {lhs}.
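
A sketch of the on-demand approach, assuming the plugin's completion function is named ChineseIME as in the settings quoted above; EnableChineseInput and the :ChineseInput command are hypothetical names, and <F12> is just one possible {lhs}.

```vim
" Hypothetical on-demand setup: nothing changes until the user asks for
" it, and then only the current buffer is affected.
function! EnableChineseInput()
  " 'completefunc' is local to the buffer, so :setlocal keeps it out of
  " other files (C sources, Russian text, ...).
  setlocal completefunc=ChineseIME
  " Buffer-local mapping that triggers user completion.
  inoremap <buffer> <F12> <C-X><C-U>
endfunction

command! ChineseInput call EnableChineseInput()
```

Running :ChineseInput in a buffer enables the input method there and leaves every other buffer untouched.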
Best regards,
Tony.
--
A poem: read aloud:
<> !*''# Waka waka bang splat tick tick hash,
^"`$$- Caret quote back-tick dollar dollar dash,
!*=@$_ Bang splat equal at dollar under-score,
%*<> ~#4 Percent splat waka waka tilde number four,
&[]../ Ampersand bracket bracket dot dot slash,
|{,,SYSTEM HALTED Vertical-bar curly-bracket comma comma CRASH.
Fred Bremmer and Steve Kroese (Calvin College & Seminary of Grand
Rapids, MI.)
Note that if (like me) you have downloaded a previous version of this
script, you may need to wipe your browser's cache in order to see the
new version (e.g. refresh the page with Ctrl-Shift-R in Firefox or
SeaMonkey).
Best regards,
Tony.
--
Life would be so much easier if we could just look at the source code.
There seems to be a typo on line 122:
if !exists("g:ChineseIMESpaceToggle")
let g:ChineseIMEMappingCtrl6 = 0
endif
Here is my way:
let s:ywim = 0
function! YW_IMtoggle()
  if s:ywim == 0
    " Save the current settings before switching the input method on.
    let s:oldpumheight = &pumheight
    let s:oldcompleteopt = &completeopt
    let g:ChineseIMESpaceToggle = 1
    set pumheight=10
    set completeopt=menu,preview,longest
    let s:ywim = 1
  else
    let g:ChineseIMESpaceToggle = 0
    let s:ywim = 0
    " ":set" cannot take a variable as its value; restore with ":let &opt".
    let &pumheight = s:oldpumheight
    let &completeopt = s:oldcompleteopt
  endif
endfunction
imap <C-\> <C-o>:call YW_IMtoggle()<CR>
I think using <C-\> or something else instead of <Tab> is better.
Good stuff, but please follow the conventions of this mailing list:
* Bottom post, that is:
* Quote a few relevant lines; put your reply underneath.
If nothing is really relevant, delete it all.
In case anyone is wondering ... these issues have been discussed several
times, and consensus prefers bottom posting. Whatever the merits, it is
best for us to use the same conventions. I am a manager of this list
(preventing spam), and have resolved to irritate people with replies
like this so everyone is aware of how we like things. See
http://groups.google.com/group/vim_use/web/vim-information
John
I suggest using Thunderbird, which has good support for bottom posting. If
anybody knows of other mail clients which support bottom posting well, please
tell me, thanks.
For the OP:
It seems that you mixed Chinese characters (字) and words (词) together
in the same dict, which makes it hard for different input methods to
share the same word table (词汇表).
For example, if there's a new word “一个新词” ("a new word"), you may have
pinyin, shuangpin, wubi, or something else. Each of them has different
mappings for Chinese characters, but a new word should be valid
for all input methods.
Any workaround?
I don't think the email client is a good excuse for top (or bottom) posting.
Anyway, if Vim is configured as the external editor for composing email,
it can search for the first blank line, and that should be the bottom
of the quoted message.
set editor="vim -c \"set spell tw=70 et\" \"+call search('^$')\""
FWIW I use mutt.
--
regards,
唐詩284 張祜 集靈臺二首之一
日光斜照集靈台 紅樹花迎曉露開 昨夜上皇新授籙 太真含笑入簾來
I use SeaMonkey (version 2.0 alpha) which shares much of its Mail&News
code with Thunderbird, and in particular supports both top- and
bottom-posting exactly the way Thunderbird does. One of the reasons why
I prefer SeaMonkey over Thunderbird is its "Suite" architecture -- it
includes not only a Mail/News client and an address book but also a
browser, a chat client, and, for those who know how to use it (I don't
-- but then I use Vim instead) an HTML editor. Another thing which I
like in this Suite concept is that HTTP/FTP links in mail will open the
page in the browser, RSS/Atom feeds seen in pages in the browser can
(with a one-click action in the browser) be subscribed in the mailer,
and an IRC link in either mailer or browser will open the chat client,
in all cases regardless of the "default" application settings in your
OS. Of course, other people will call "bloat" what I call "versatility",
so you should use what suits you (not me) best.
...and, yes, I realize that most of this reply is OT for this list.
Best regards,
Tony.
--
Put your Nose to the Grindstone!
-- Amalgamated Plastic Surgeons and Toolmakers, Ltd.
Depending on how the mail is passed to the external editor (and by what
mail useragent program), the first blank line might actually be the
boundary between the headers and the body, i.e., the top of the message
proper.
The bottom of the quoted message (before you compose your reply) should
be the bottom of what is passed to the external editor, so I recommend
GO
or
:$append
Best regards,
Tony.
--
Some programming languages manage to absorb change but withstand
progress.
1. Maybe define an autocmd so that ChineseIME is disabled
when leaving insert mode?
2. Why not name your script "vim", i.e. Vim Input Method for Chinese? :)
As for the name, I think a capitalized filename creates unnecessary
inconvenience when editing on the command line, especially on
case-sensitive file systems.
Since Vim is designed to be cross-platform, I think it might be better to
use an all-lowercase filename, which works well on all platforms. For
example, inputmethod.vim would be a lot better, since this can be used to
input much more than just Chinese characters.
If you really want _all_ platforms, use an 8.3 name in single case.
However in this case 8.3 is probably irrelevant since IIRC Dos versions
of Vim, if still used, almost never have +multi_byte.
Best regards,
Tony.
--
"I'd love to go out with you, but there are important world issues that
need worrying about."
you might want to check out the "It's All Text" plugin for
firefox -- it allows you to use gvim to edit any editable
thing in firefox -- it just might save you some copying and
pasting
sc
let g:ChineseIME_Toggle_InertMode=1
> I don't think the email client is a good excuse for top (or bottom) posting.
> Anyway, if Vim is configured as the external editor for composing email,
> it can search for the first blank line, and that should be the bottom
> of the quoted message.
>
> set editor="vim -c \"set spell tw=70 et\" \"+call search('^$')\""
>
> FWIW I use mutt.
A tad shorter and doing the same:
set editor="vim +'/^$/' -c 'set spell tw=70 et'"
-- Dominique
I once used that but it raises an error message when there is no empty
line.
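
One way around that error, as a sketch only (JumpToReplyPosition is a hypothetical name and would have to be defined in the vimrc): search() returns 0 when nothing matches instead of raising E486, so the function can fall back to the last line.

```vim
" Hypothetical helper: put the cursor on the first blank line, or on the
" last line of the buffer if there is no blank line at all.
function! JumpToReplyPosition()
  call cursor(1, 1)
  " 'c' accepts a match at the cursor position, 'W' prevents wrapping
  " around the end of the file; the result is 0 when nothing matches.
  if search('^$', 'cW') == 0
    call cursor(line('$'), 1)
  endif
endfunction
```

The editor setting could then end in -c 'call JumpToReplyPosition()' in place of the +'/^$/' search.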
--
regards,
唐詩191 杜甫 詠懷古跡五首之二
搖落深知宋玉悲 風流儒雅亦吾師 悵望千秋一灑淚 蕭條異代不同時
江山故宅空文藻 雲雨荒臺豈夢思 最是楚宮俱泯滅 舟人指點到今疑
Exactly. I use Gmail, which uses top post and has no option to change.
So I _move_the_cursor_down_manually_. Takes a lot of effort, let me
tell you!
--
Dotan Cohen
http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת
ا-ب-ت-ث-ج-ح-خ-د-ذ-ر-ز-س-ش-ص-ض-ط-ظ-ع-غ-ف-ق-ك-ل-م-ن-ه-و-ي
А-Б-В-Г-Д-Е-Ё-Ж-З-И-Й-К-Л-М-Н-О-П-Р-С-Т-У-Ф-Х-Ц-Ч-Ш-Щ-Ъ-Ы-Ь-Э-Ю-Я
а-б-в-г-д-е-ё-ж-з-и-й-к-л-м-н-о-п-р-с-т-у-ф-х-ц-ч-ш-щ-ъ-ы-ь-э-ю-я
ä-ö-ü-ß-Ä-Ö-Ü
> A weird problem that I can't figure out: ChineseIME only works
> when I start a new Vim and use it in the [No Name] buffer;
> then after I edit other files, <C-\> no longer has any effect. Why?
> My .vimrc has just one line related to ChineseIME:
>
> let g:ChineseIME_Toggle_InertMode=1
>
I found the problem: the script uses a buffer-local imap for <C-\> at line 237
of the script. You'd better change it from:
imap <buffer> <C-\> <C-O>:call ChineseIME_Toggle_InertMode()<CR>
to:
imap <C-\> <C-O>:call ChineseIME_Toggle_InertMode()<CR>
Didn't you see my reply? Your script still uses a buffer-local imap,
so it works only in the [No Name] buffer or in the file
you start Vim with from the command line.
>
> Again, feedback is always welcome.
1. I think you can define a variable to toggle between inputting Chinese
and English punctuation.
2. The searching still needs some improvement: say "ww" and "www"
have no Chinese characters but "wwww" has; then typing "ww" should search
the longer keys instead of saying no pattern was found.
3. If the character under the cursor isn't a non-blank ASCII character,
then <space> should just act as a normal <space>, i.e. insert a real <space>,
instead of saying "pattern not found", which is really useless. I think
that's more intelligent.
4. Add an autocmd to toggle the IME off when leaving insert mode.
Something like this:
au InsertLeave * call ChineseIM_InsertMode_ToggleOff()
function! ChineseIM_InsertMode_ToggleOff()
  " This has to live in the plugin script itself: it relies on the
  " script-local s:saved_* values stored when the IM was switched on.
  " ----------------------------------- options
  let &pumheight=s:saved_pumheight
  let &completeopt=s:saved_completeopt
  let &lazyredraw=s:saved_lazyredraw
  " ----------------------------------- IM mode indication
  let &iminsert=s:saved_iminsert
  highlight Cursor guifg=bg guibg=fg
  " ----------------------------------- <Space>
  inoremap <Space> <Space>
  " ----------------------------------- bracket
  inoremap ( (
  inoremap ) )
  inoremap < <
  inoremap > >
  inoremap [ [
  inoremap ] ]
  " ----------------------------------- punctuation
  inoremap , ,
  inoremap . .
  inoremap : :
  inoremap ; ;
  inoremap ? ?
  inoremap \\ \\
  " -----------------------------------
  let s:n += 1
endfunction
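
Point 2 of the list above (falling back to longer keys) might look something like this. It is a sketch with a hypothetical function name and a Vim Dictionary standing in for the data file, not the plugin's actual search code.

```vim
" Hypothetical helper: return the candidates stored under a:prefix; if
" there are none, collect the candidates of every longer key that
" starts with a:prefix, in sorted key order.
function! CandidatesForPrefix(prefix, dict)
  let l:exact = get(a:dict, a:prefix, [])
  if !empty(l:exact)
    return copy(l:exact)
  endif
  let l:found = []
  for l:key in sort(keys(a:dict))
    " stridx() returns 0 when l:key begins with a:prefix.
    if stridx(l:key, a:prefix) == 0
      call extend(l:found, a:dict[l:key])
    endif
  endfor
  return l:found
endfunction
```

With only "wwww" in the dictionary, typing "ww" would then offer the "wwww" candidates instead of reporting that no pattern was found.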
--
Regards,
Van.
Does it (or will it) support wildcards?
E.g., used as a key,
ap?e would match ape and apple but not apply.
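
To make the question concrete, here is a sketch under one reading of that example: since ap?e is meant to match both ape and apple, "?" is taken to stand for any run of letters, including none. That reading, and the name WildcardKeys, are assumptions; keys are assumed to be plain ASCII letters.

```vim
" Hypothetical helper: return the keys matching a pattern in which "?"
" stands for any run of ASCII letters (possibly empty).
function! WildcardKeys(pattern, keys)
  " Turn each "?" into the Vim regex atom \a* and anchor both ends.
  let l:regex = '^' . substitute(a:pattern, '?', '\\a*', 'g') . '$'
  return filter(copy(a:keys), 'v:val =~# l:regex')
endfunction

" WildcardKeys('ap?e', ['ape', 'apple', 'apply']) returns ['ape', 'apple'].
```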
--
regards,
唐詩155 許渾 早秋
遙夜汎清瑟 西風生翠蘿 殘螢栖玉露 早雁拂銀河
高樹曉還密 遠山晴更多 淮南一葉下 自覺老煙波
Can the dict be pre-compiled to speed up load time?
--
regards,
唐詩166 崔塗 孤雁
幾行歸塞盡 片影獨何之 暮雨相呼失 寒塘欲下遲
渚雲低暗渡 關月冷相隨 未必逢矰繳 孤飛自可疑
> > > [...]
> > [...]
>
> Well, data file in binary is not an option. [...]
Which has nothing to do with the subject line. C'mon
people, get it together! --Antony
Oh, thanks. It is a very nice feature.
I'm not sure if it supports another feature, numerals in keys. E.g. in
pinyin input, there are 4 tones:
fu1
fu2
fu3
fu4
When I type 'fu' it displays a screenful of candidates, but if I
continue to type 2 (intending to input fu2), will it be mistaken for
choosing the second entry in the popup box?
I usually display choices as 5 6 7... and reserve 1 2 3 4 for tones.
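
The convention of reserving the low digits for tones can be sketched like this. InterpretDigit and its tone_count parameter are hypothetical and are not how the plugin currently behaves.

```vim
" Hypothetical helper: decide what a typed digit means when digits
" 1..tone_count are tones and the following digits select candidates.
" Returns ['tone', n], ['select', zero-based index] or ['none', -1].
function! InterpretDigit(digit, tone_count, candidate_count)
  let l:n = str2nr(a:digit)
  if l:n >= 1 && l:n <= a:tone_count
    return ['tone', l:n]
  endif
  " The first digit after the tone digits picks candidate 0, and so on.
  let l:index = l:n - a:tone_count - 1
  if l:index >= 0 && l:index < a:candidate_count
    return ['select', l:index]
  endif
  return ['none', -1]
endfunction
```

With four tones, typing 2 after fu is read as the tone of fu2 and typing 5 picks the first candidate. With nine tones no digit is left over for selection, so another selection key would be needed.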
--
regards,
唐詩202 盧綸 晚次鄂州
雲開遠見漢陽城 猶是孤帆一日程 估客晝眠知浪靜 舟人夜語覺潮生
三湘愁鬢逢秋色 萬里歸心對月明 舊業已隨征戰盡 更堪江上鼓鼙聲
If that's a problem (and 1 2 3 4 are used for the first 4 results), you
could decide to represent the tones respectively by postfixed - / ~ \
instead -- or go whole hog and use marked vowels, ā á ǎ à and the like,
which can be entered by means of digraphs (using - ' < ! as the second
element of the digraph) if your keyboard hasn't got them built-in.
Best regards,
Tony.
--
Religion has done love a great service by making it a sin.
-- Anatole France
I can see it is hardcoded to use 4 numbers only. Can it be made
user-configurable to any number from 1 to 9? Yes, 9. There are 9
tones in Cantonese.
--
regards,
唐詩312 王維 渭城曲
渭城朝雨浥輕塵 客舍青青柳色新 勸君更盡一杯酒 西出陽關無故人
If you (well, not _you_ of course) don't want to learn anything new,
don't use Vim but stay with good^H^H^H^Hbad old Notepad.
Best regards,
Tony.
--
According to Kentucky state law, every person must take a bath at least
once a year.
IIUC, depending whom you ask, how you count them, and how you romanize
the language, there may be 6, 8 or 9 tones in Cantonese: see among
others http://en.wikipedia.org/wiki/Cantonese
Best regards,
Tony.
--
If the code and the comments disagree, then both are probably wrong.
-- Norm Schryer
IIUC, there are 9 tones. However, for romanization or IME purposes,
the last three tones are always associated with romanizations ending in
p, t or k (入聲, the entering tone), so they can be merged with the first
6 tones without causing ambiguity: 7->1, 8->3, 9->6. Hypothetically, if
Mandarin also encoded the 4th tone using a trailing letter, there would
be only 3 tones in Mandarin, and obviously that would not be correct.
Not sure about 8 tones.
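
The 7->1, 8->3, 9->6 merge can be written down as a small sketch. FoldCantoneseTone is a hypothetical name, and the keys are assumed to be a romanized syllable followed by one tone digit.

```vim
" Hypothetical helper: fold the entering tones 7, 8, 9 into 1, 3, 6 for
" syllables ending in p, t or k; any other key is returned unchanged.
function! FoldCantoneseTone(syllable)
  let l:fold = {'7': '1', '8': '3', '9': '6'}
  " matchstr() gives '' when the key does not end in a digit.
  let l:tone = matchstr(a:syllable, '\d$')
  if has_key(l:fold, l:tone) && a:syllable =~# '[ptk]\d$'
    return substitute(a:syllable, '\d$', l:fold[l:tone], '')
  endif
  return a:syllable
endfunction
```

A key such as sik7 (an illustrative spelling) would become sik1, while a key without a final p, t or k is left alone.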
--
regards,
唐詩133 劉長卿 秋日登吳公臺上寺遠眺
古臺搖落後 秋日望鄉心 野寺人來少 雲峰水隔深
夕陽依舊壘 寒磬滿空林 惆悵南朝事 長江獨至今