On Mon, 11 Jun 2012, TGara wrote:
> Now, which of these do the mods consider actual errors and would be willing
> to start a however slow process of correcting, given an error listing?
I'm all for having consistent values of stroke types, but I think that has
to begin with at least a strong first draft of what the list of allowed
values for the field actually is. To my knowledge, no such list currently
exists. The closest thing we have is the "strokes.txt" file, which
doesn't match the current database, and so your description of some values
as "wrong" isn't terribly meaningful - there's nothing we could really say
would be "right." Unicode's list of strokes in the CJK Strokes range
(U+31C0 to U+31E3) seems to be the basis for what's currently in the
database, but it's not the only reasonable way to classify strokes, and
different approaches have different advantages and disadvantages, so
choosing the best one is not trivial. I wrote about
this a bit in my March 12 message here about consistency rules; and my
test suite for KanjiVG could easily check for consistency of the stroke
type field at such time as there's a definition of what the correct values
are allowed to be.
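
To make the idea concrete, here is a minimal sketch of what such a check
might look like once an allowed list exists. It is not the actual test
suite code. The namespace URI, the attribute name "type", the sample SVG,
and the contents of ALLOWED_TYPES are all assumptions made up for
illustration; the real list is exactly the thing that doesn't exist yet.

```python
import xml.etree.ElementTree as ET

# Assumption: stroke types are stored in a "type" attribute in this namespace.
KVG_NS = "http://kanjivg.tagaini.net"
TYPE_ATTR = "{%s}type" % KVG_NS

# Hypothetical placeholder; the agreed list of allowed values is still to be written.
ALLOWED_TYPES = {"㇐", "㇑", "㇒", "㇔"}


def find_bad_stroke_types(svg_text, allowed=ALLOWED_TYPES):
    """Return (path id, type value) for every path whose type is missing
    or not in the allowed set, in document order."""
    root = ET.fromstring(svg_text)
    bad = []
    for elem in root.iter():
        # Strip any "{namespace}" prefix and keep only <path> elements.
        if elem.tag.rsplit("}", 1)[-1] != "path":
            continue
        value = elem.get(TYPE_ATTR)  # None when the attribute is absent
        if value not in allowed:
            bad.append((elem.get("id"), value))
    return bad


# Made-up sample data, not taken from the database.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:kvg="http://kanjivg.tagaini.net">
  <path id="s1" kvg:type="㇐" d="M10,50 L90,50"/>
  <path id="s2" kvg:type="㇐a" d="M10,70 L90,70"/>
  <path id="s3" d="M50,10 L50,90"/>
</svg>"""

if __name__ == "__main__":
    for path_id, value in find_bad_stroke_types(SAMPLE):
        print(path_id, repr(value))
```

The check itself is trivial; all of the difficulty is in deciding what
goes into the allowed set.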
Having a consistent number of points (and also a consistent pattern of
things like which points are "corner" points) per stroke type would
certainly be nice, but I don't think there's much point even starting on
that piece of the puzzle until we have a list of stroke types in the first
place. Once we have a list of stroke types, and choose the pattern of
control points (how many, and which ones are corners) for each stroke type,
then there's some reasonable possibility of fixing just that one thing in
an automated way by computing the standard-pattern control points that
best match the whatever-pattern curves already in the database.
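
As a rough sketch of what "best match" could mean, consider the simplest
imaginable case: a stroke type whose standard pattern is a single cubic
Bezier segment with no corner points. One possible approach (an
assumption for illustration, not a settled method) is to sample the
existing curve, keep its endpoints, and choose the two inner control
points by least squares. The function names below are hypothetical.

```python
import math


def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t, 0 <= t <= 1."""
    s = 1.0 - t
    b0, b1, b2, b3 = s**3, 3 * s * s * t, 3 * s * t * t, t**3
    return (b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0],
            b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1])


def chord_length_params(points):
    """Give each sample a parameter in [0, 1] proportional to the
    polyline distance travelled up to that sample."""
    dists = [0.0]
    for a, b in zip(points, points[1:]):
        dists.append(dists[-1] + math.dist(a, b))
    if dists[-1] == 0.0:
        raise ValueError("all sample points coincide")
    return [d / dists[-1] for d in dists]


def fit_cubic(points, params=None):
    """Least-squares fit of one cubic Bezier to sampled points.

    The first and last samples are kept as the curve's endpoints; only
    the two inner control points are solved for.
    """
    if params is None:
        params = chord_length_params(points)
    p0, p3 = points[0], points[-1]
    a11 = a12 = a22 = 0.0
    rhs1, rhs2 = [0.0, 0.0], [0.0, 0.0]
    for point, t in zip(points, params):
        s = 1.0 - t
        b0, b1, b2, b3 = s**3, 3 * s * s * t, 3 * s * t * t, t**3
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        for k in range(2):
            # Residual left over after the fixed endpoints' contribution.
            r = point[k] - b0 * p0[k] - b3 * p3[k]
            rhs1[k] += b1 * r
            rhs2[k] += b2 * r
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("not enough distinct samples to fit")
    # Solve the 2x2 normal equations once per coordinate (Cramer's rule).
    p1 = tuple((a22 * rhs1[k] - a12 * rhs2[k]) / det for k in range(2))
    p2 = tuple((a11 * rhs2[k] - a12 * rhs1[k]) / det for k in range(2))
    return p0, p1, p2, p3
```

Stroke types with corner points would need one such fit per smooth piece,
plus some rule for where the corners go, which is again something that
has to be decided per stroke type first.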
--
Matthew Skala
msk...@ansuz.sooke.bc.ca People before principles.
http://ansuz.sooke.bc.ca/