On Fri, 21 Oct 2016, Karl Rosvold wrote:
> forward, I think. Just identify the component as an enclosure and say how
> many of its strokes are written after the next thing. There are other kanji
It's certainly do-able, either that way or by simply adding a stroke
number attribute to each stroke and not requiring the XML file to be
sorted on that attribute. But I don't know of a database that actually does
record this information in a straightforward way - everybody either breaks
the hierarchy, or does not store stroke order in the first place.
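
A minimal sketch of the stroke-number idea, assuming a made-up XML layout:
the <kanji>, <component>, and <stroke> element names and the "id" and
"number" attributes are hypothetical, not taken from any existing database.
The enclosure stays a single node in the hierarchy, and the writing order
is recovered by sorting on the attribute rather than relying on file order.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for an 8-stroke character with a 3-stroke enclosure:
# two enclosure strokes are written first, then five interior strokes,
# then the stroke that closes the enclosure. File order is NOT stroke order.
SAMPLE = """
<kanji>
  <component id="enclosure">
    <stroke number="1"/>
    <stroke number="2"/>
    <stroke number="8"/>
  </component>
  <component id="interior">
    <stroke number="3"/>
    <stroke number="4"/>
    <stroke number="5"/>
    <stroke number="6"/>
    <stroke number="7"/>
  </component>
</kanji>
"""


def strokes_in_writing_order(xml_text):
    """Return (stroke number, component id) pairs sorted by stroke number."""
    root = ET.fromstring(xml_text)
    pairs = []
    # iter() also visits nested components; findall() takes only the
    # strokes that are direct children of each component.
    for component in root.iter("component"):
        for stroke in component.findall("stroke"):
            pairs.append((int(stroke.get("number")), component.get("id")))
    return sorted(pairs)


if __name__ == "__main__":
    for number, component_id in strokes_in_writing_order(SAMPLE):
        print(number, component_id)
```

The hierarchy stays intact because the enclosure is one component, and the
interleaving only shows up once the strokes are sorted by number.
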
KanjiVG breaks the hierarchy by storing the enclosing component as two
separate groups in the SVG file. From that data it's difficult to recover
the structure of which groups go together, although my IDSgrep import
script does a fair job.
As a matter of fact I'm departing for Tokyo tomorrow morning, on a trip
that will include giving a talk at SISAP 2016 about the algorithmic
aspects of IDSgrep. This talk isn't really about where the data comes
from, only how bit vectors can be used to speed up the search. It will quite
possibly be my last academic conference talk. I'm leaving academia and
going into the music synthesizer business. More on that in my Web log,
URL in signature.