Tony Mechelynck wrote:
> On 07/10/13 14:02, Ken Takata wrote:
> > Hi,
> >
> > I wrote a patch for the following items from todo.txt:
> >
> >> Have an option for spell checking to not mark any Chinese, Japanese or other
> >> double-width characters as error. Or perhaps all characters above 256.
> >> (Bill Sun) Helps a lot for mixed Asian and latin text.
> >
> >> - have some way not to give spelling errors for a range of characters.
> >> E.g. for Chinese and other languages with specific characters for which we
> >> don't have a spell file. Useful when there is also text in other
> >> languages in the file.
> >
> > When I write mixed Japanese and English text, it really annoys me.
> > Current Vim's spell checking algorithm doesn't support Chinese, Japanese or
> > other East Asian languages. So I just exclude these characters from spell
> > checking. (No options)
> > Please check the attached patch.
> >
> > Regards,
> > Ken Takata
> >
>
> "All characters above 256" would seem a little rash IMHO: after all,
> Russian, Ukrainian, Bulgarian, Greek, etc. can (or should be able to)
> use spell checking even though their writing systems are entirely above
> U+00FF, and even in Latin script, some French nouns such as œil (eye),
> œuf (egg), bœuf (ox or beef), œil-de-bœuf (a small round window), vœu
> (wish), Œdipe (Oedipus), œsophage (oesophagus), etc., use characters
> (the oe / OE digraphs, which in French are one character each) above
> U+00FF. Similarly for the accented letters of non-West-European
> languages, many of which fall outside the Latin1 range.
>
> I suppose that excluding CJK is the right thing to do, since the nearest
> thing to "spell checking" for handwritten CJK would mean checking that
> the correct brush strokes were used, but "wrong" brush stroke
> combinations (other than simplified vs. traditional glyphs, or than
> Japanese "national" /kokuji/ characters in a Chinese text, etc.) cannot
> be produced as computer text even in Unicode; or else it might mean
> checking that word elements ("Han syllables") are meaningfully combined,
> which IMHO is more akin to checking semantics or syntax than orthography.
I was wondering if this should be an option or a spell setting of some
kind. So, you argue that we won't ever have useful spell checking for
CJK characters, so we should just ignore them.
What if I have some text in a language that is spell checked, and by
some mistake a few CJK characters show up (copy/paste error, encoding
conversion mistake, etc.)? Then they should be marked as errors, right?
For me, I occasionally get these characters when an Asian name is used.
I don't really care if that is highlighted as an error or not (can't
read it anyway). Other names are marked as errors, so perhaps foreign
names should be as well?
Following that line of thinking, it should be an option. Perhaps a
special entry in 'spelllang': "cjk"?
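
To make the proposal concrete, here is a rough sketch in Python. It is not
Vim's code and not the attached patch. The helper names (is_cjk_like,
parse_spelllang, misspelled) are made up for illustration, and using the
Unicode East Asian Width property (Wide or Fullwidth) as the test for "CJK"
is an assumption; the patch may well classify characters differently.

```python
import unicodedata


def is_cjk_like(ch):
    """Rough test (an assumption): East Asian Wide or Fullwidth characters."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def parse_spelllang(value):
    """Split a hypothetical 'spelllang' value into (languages, skip_cjk)."""
    entries = [e.strip() for e in value.split(",") if e.strip()]
    skip_cjk = "cjk" in entries
    langs = [e for e in entries if e != "cjk"]
    return langs, skip_cjk


def misspelled(words, dictionary, spelllang):
    """Return the words that would be flagged as spelling errors."""
    _, skip_cjk = parse_spelllang(spelllang)
    bad = []
    for word in words:
        # With "cjk" present, words containing wide characters are skipped.
        if skip_cjk and any(is_cjk_like(c) for c in word):
            continue
        if word.lower() not in dictionary:
            bad.append(word)
    return bad


dictionary = {"vim", "checks", "spelling", "œuf"}
words = ["Vim", "checks", "speling", "日本語", "œuf", "привет"]

print(misspelled(words, dictionary, "en,cjk"))
print(misspelled(words, dictionary, "en"))
```

With "cjk" in the list, only "speling" and "привет" are reported. Cyrillic
and œ are above U+00FF but are not wide characters, so they are still
checked. Without "cjk", the stray "日本語" is reported as well, which covers
the copy/paste case. Note that the width property is only an approximation:
it also matches emoji and misses halfwidth katakana.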
--
DEAD PERSON: I'm getting better!
CUSTOMER: No, you're not -- you'll be stone dead in a moment.
MORTICIAN: Oh, I can't take him like that -- it's against regulations.
The Quest for the Holy Grail (Monty Python)
/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///