On 19 May 2010 10:54, Nikolay Elenkov <nikolay...@gmail.com> wrote:
> Thanks for the explanation. I added the option to toggle look ahead
> via a checkbox. You can see how it looks at
> http://code.google.com/p/wwwjdic/
> (I failed screenshot resizing, will add better ones later).
I couldn't see the pictures on that page.
> I pushed the new
> version to the market yesterday, so you might start getting more requests.
I got 130 requests between yesterday morning and today (GMT).
> One thing I noticed with the look ahead is that when it's on, simple
> (radical level) kanji are sometimes not recognized. You can easily
> reproduce it with the web interface too. Draw 虫 for example and you get
> a list of kanji that have it as their left-side radical, but not the
> actual '虫' kanji. If you switch off look ahead, it matches the simple
> kanji.
I know the phenomenon you mean. See this video:
http://www.youtube.com/watch?v=5hSWD0HXfs8
As you can see from the video, it mostly happens if one writes the 虫
tall and thin. A tall, thin 虫 looks like part of another character.
Also, there is a bug in the software which generates the recognition
data from the KanjiVG data, and that contributes to this as well. I
discussed this a bit on the KanjiVG mailing list. Basically, the KanjiVG
data for the final stroke of 虫 is curved, and the code which creates
the recognition data, due to a poor algorithm, biases it towards a very
curved line. The KanjiVG data for, say, 蝠 is not curved, so the
recognizer thinks you have drawn 蝠 rather than 虫 if the final stroke
is straight. If Alexandre is reading this, this kind of bug is why I
haven't responded yet about automatically recognizing stroke shapes for
KanjiVG. I really need to fix this problem.
> Since people are likely to try it with simple kanji first, I left
> look ahead off by default. Is this the intended behaviour?
To some extent it's the intended behaviour. I made a design decision
here to assume that the person writing the character starts on the left
and works rightwards, or starts at the top and works downwards. When a
character is read in, it is normalized to a size of 100x100, but the
horizontal and vertical are not scaled independently, so the shape of
the drawing is kept. If it is tall and thin, it is aligned to the left
of the box, and if it is short and fat, it is aligned to the top. That
matches the normal order of drawing kanji, top to bottom and left to
right. Sometimes that assumption is wrong, though.
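To make the normalization concrete, here is a minimal Python sketch. It is
not the recognizer's actual code. It assumes that the drawing is a list of
strokes, that each stroke is a list of (x, y) points, and that the origin is
the top-left corner with y growing downwards, as on a canvas. The names
`normalize`, `Point` and `Stroke` are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def normalize(strokes: List[Stroke], size: float = 100.0) -> List[Stroke]:
    """Fit a drawing into a size x size box without changing its shape.

    One scale factor is used for both axes, and the drawing's bounding
    box is moved to the top-left corner. A tall, thin drawing therefore
    ends up on the left of the box, and a short, fat one at the top.
    """
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    if not xs:
        return []
    min_x, min_y = min(xs), min(ys)
    width = max(xs) - min_x
    height = max(ys) - min_y
    longest = max(width, height)
    # A single dot has no extent; leave its scale alone (assumed behaviour).
    scale = size / longest if longest > 0 else 1.0
    return [
        [((x - min_x) * scale, (y - min_y) * scale) for x, y in stroke]
        for stroke in strokes
    ]
```

With this sketch, a drawing 20 units wide and 200 units tall comes out 10
wide and 100 tall, occupying only the left tenth of the box, which is why a
tall, thin 虫 can resemble the left-hand component of a wider character.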
Now that I come to mention it, the lookahead is an experimental
feature, and I still haven't set up any way of testing how accurately
it works on real user data. I think the next thing to do is to work out
some kind of testing regimen for lookahead, and see which parameters or
algorithms give the best matches.
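One possible shape for such a test, sketched in Python under stated
assumptions: the samples are (drawing, expected kanji) pairs collected from
users, and each recognizer setting is a function that takes a drawing and
returns a ranked list of candidates. The names `top_n_hit_rate` and
`compare_settings` are hypothetical, and no such harness exists yet.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Any, str]                 # (drawing, expected kanji)
Recognizer = Callable[[Any], List[str]]  # drawing -> ranked candidates


def top_n_hit_rate(samples: Sequence[Sample],
                   recognize: Recognizer,
                   n: int = 5) -> float:
    """Fraction of samples whose expected kanji is in the top n results."""
    if not samples:
        return 0.0
    hits = sum(1 for drawing, expected in samples
               if expected in recognize(drawing)[:n])
    return hits / len(samples)


def compare_settings(samples: Sequence[Sample],
                     recognizers: Dict[str, Recognizer],
                     n: int = 5) -> Dict[str, float]:
    """Score each named recognizer setting on the same samples."""
    return {name: top_n_hit_rate(samples, recognize, n)
            for name, recognize in recognizers.items()}
```

Running `compare_settings` with one entry for lookahead on and one for
lookahead off would give a hit rate per setting on identical input, which is
the kind of comparison needed before choosing a default.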
> Btw, with some kanji, say '本' it matches the simple kanji even with
> look ahead on.
There is no kanji with 本 as a left-hand component, so it won't do that
quite so much. If you draw the 本 very tall and thin you'll get other
candidates before 本 though.
> P.S. People are already saying kanji recognition rocks :) Thanks for
> all your work.
I'm glad to hear it.
BTW, could I ask you to change your link to my site to the following, please?
http://kanji.sljfaq.org/kanji16/draw-canvas.html
The Android browser should be able to cope with this, I think.
At least one person came to the site via your link yesterday, then went to
http://kanji.sljfaq.org/draw.html
and drew one picture before leaving :). I don't know if it works on
the phone. I know that the Nintendo DSi browser doesn't work with the
web page, unfortunately, since the touch pen insists on scrolling.
The reason I ask is that I think it would make sense for them to come
to the nearest approximation to the thing they're seeing on their
phone, rather than the kanji.sljfaq.org top page, which is basically
something which I put there only for people who type the URL in.
Actually, originally the top page was only there to carry the "privacy
policy". There aren't any links to the top page from anywhere else on
the site, and the draw-canvas page is a better jumping-off page for
users.