On Tue, 2 Feb 2021 at 22:35, Ben Bullock <benkasmi...@gmail.com> wrote:
> One I would like to draw your attention to is 隺, for which kanjidic appears to have an incorrect stroke count of 11; it should be 10. Amusingly, it's possible to work out which kanji sites are using kanjidic as their information source by looking at what stroke count this has at each web site.
Yes, it should be 10 and the SKIP should be 2-3-7. Fixed.
Having spread Japanese lexical data around the network world for
about 30 years, I find it hard not to come across it all the time. It's
also often possible to detect the sites/systems which don't update
their files. I get a bit cross when people contact me about errors
that were in fact fixed ages ago. I get even crosser when I cop abuse
on various forums for those long-fixed errors. (And don't get me
started on people who prefer to spray criticisms over proposing
corrections.)
> Also I would guess that Halpern has 竸 in the dictionary, but kanjidic has two different things for it, 1-11-11 and 2-2-8, both mathematically unlikely given the symmetry in the character (how would two identical parts result in an odd number of strokes or a different number of strokes left and right?)
Halpern only has this in one of his later dictionaries, of which I
don't have a copy. I think the 1-11-11 is correct; it's consistent
with the 1-10-10 he has for 競. The misclassification code for that one
is 2-10-10, so for 竸 I'm making it 2-10-12. The stroke count of 22 is
supported by several sources, including Unihan.
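
The stroke-count reasoning above can be checked mechanically. Below is a
minimal sketch in Python, assuming the usual SKIP convention that for
patterns 1 to 3 the second and third numbers are the stroke counts of the
two parts (so they add up to the total), and that for pattern 4 the second
number is the total. The function names are made up for illustration; they
are not part of any kanjidic tooling.

```python
from typing import Optional


def skip_stroke_total(skip: str) -> Optional[int]:
    """Return the stroke total implied by a SKIP code, or None if malformed."""
    parts = skip.split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    pattern, second, third = (int(p) for p in parts)
    if pattern in (1, 2, 3):
        # Left-right, up-down, enclosure: the two parts sum to the total.
        return second + third
    if pattern == 4:
        # Solid pattern: the second number is the total stroke count.
        return second
    return None


def is_consistent(skip: str, stroke_count: int) -> bool:
    """True if the SKIP code implies exactly the given stroke count."""
    return skip_stroke_total(skip) == stroke_count


# Values taken from this thread.
print(is_consistent("2-3-7", 10))    # 隺 with the corrected count
print(is_consistent("2-3-7", 11))    # 隺 with the old count
print(is_consistent("1-11-11", 22))  # 竸
print(is_consistent("2-2-8", 22))    # 竸, the other code that was in the file
```

A check like this would flag the old 11 for 隺 and the 2-2-8 for 竸.
Misclassification codes are deliberately "wrong", so in general they need
not pass it, although 2-10-12 happens to sum to 22.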
> According to the version of kanjidic2.xml mentioned on the page above, there are 13108 characters in total but only 12156 have skip codes.
Yes, the ~950 without SKIP codes are the kanji that are in JIS X 0213 but not in JIS X 0212.
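
For anyone wanting to reproduce those counts (13108 - 12156 leaves 952
characters without a code), here is a rough Python sketch. It assumes the
kanjidic2.xml layout as I understand it: one <character> element per kanji,
with SKIP codes held in <q_code qc_type="skip"> and misclassification codes
marked by a skip_misclass attribute. The three-entry sample, its placeholder
third character and the skip_misclass value shown are invented for
illustration, not copied from the real file.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for kanjidic2.xml; illustrative only.
SAMPLE = """<kanjidic2>
  <character>
    <literal>隺</literal>
    <misc><stroke_count>10</stroke_count></misc>
    <query_code><q_code qc_type="skip">2-3-7</q_code></query_code>
  </character>
  <character>
    <literal>竸</literal>
    <misc><stroke_count>22</stroke_count></misc>
    <query_code>
      <q_code qc_type="skip">1-11-11</q_code>
      <q_code qc_type="skip" skip_misclass="posn">2-10-12</q_code>
    </query_code>
  </character>
  <character>
    <literal>X</literal>
    <misc><stroke_count>5</stroke_count></misc>
  </character>
</kanjidic2>"""


def count_skip_coverage(root):
    """Return (total characters, characters with a primary SKIP code)."""
    total = 0
    with_skip = 0
    for character in root.iter("character"):
        total += 1
        # A primary code is a skip q_code without a skip_misclass attribute.
        has_primary = any(
            q.get("qc_type") == "skip" and q.get("skip_misclass") is None
            for q in character.iter("q_code")
        )
        if has_primary:
            with_skip += 1
    return total, with_skip


total, with_skip = count_skip_coverage(ET.fromstring(SAMPLE))
print(total, with_skip, total - with_skip)
```

On the real file you would replace ET.fromstring(SAMPLE) with
ET.parse(path).getroot(), where path points at a local copy.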
Thanks. That looks very useful. I'll add them to the kanjidic file for
JIS213. Great to have them.
Jim