On Sun, 27 Mar 2022, Alexandre Courbot wrote:
> It's just an idea for now. But I have the feeling that the time spent
> improving the tooling would get amortized pretty quickly and make the
> project more inviting to potential contributors with the right
> knowledge.
About ten years ago, I said much the same thing. The syntax and semantics
of the KanjiVG files are not well-defined, and I thought (and still think)
the project really needed to not only define the files, but also put a
high priority on having automated tools to verify them.
I'd really like to see a situation of being able to type "make check" and
get a report of which files were well-formed and which ones did not meet
the spec. With the current purely manual maintenance, anything that tries
to extract information from KanjiVG has trouble actually reading the
files, because the same fields are used in different ways in different
kanji. It's not even possible to say which kanji use those fields
correctly and which don't, because nothing is written down saying what
the "correct" meaning of the values is.
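To make the "make check" idea concrete, here is a minimal sketch in Python of what such a checker might look like. It assumes the data files are *.svg files in a single directory. The one rule it enforces (every path element has a non-empty "d" attribute) is only a placeholder, since there is no written KanjiVG spec to test against, and the names check_data, check_tree, and main are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_data(data):
    """Return a list of problems found in one file's contents (empty if none)."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    problems = []
    # Placeholder rule standing in for a real spec (an assumption, not
    # KanjiVG's actual definition): every <path> needs a non-empty "d".
    for elem in root.iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # drop any "{namespace}" prefix
        if tag == "path" and not elem.get("d"):
            problems.append("path element without a 'd' attribute")
    return problems


def check_tree(directory):
    """Check every *.svg file in directory; return {filename: problems}."""
    report = {}
    for path in sorted(Path(directory).glob("*.svg")):
        report[path.name] = check_data(path.read_bytes())
    return report


def main(argv):
    """Print a per-file report; return 1 if any file has problems, else 0."""
    report = check_tree(argv[1] if len(argv) > 1 else ".")
    for name, problems in report.items():
        print(f"{name}: {'ok' if not problems else '; '.join(problems)}")
    return 1 if any(report.values()) else 0
```

A Makefile "check" target could then run a script that ends with sys.exit(main(sys.argv)), so that any file failing the rules makes the target fail.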
My own IDSgrep package (https://tsukurimashou.osdn.jp/idsgrep.php) is an
example of the kind of thing that could benefit from KanjiVG, but that
suffers because KanjiVG's semantics are not really defined. Its KanjiVG
importer has to do a lot of guessing and patching to extract usably
consistent data from the inconsistent KanjiVG format. The task is further
complicated by the fact that IDSgrep handles kanji as trees of components,
whereas in KanjiVG the first-class entities are strokes, and the mapping
between strokes and components is not simple; but doing that mapping
would be a lot easier if the KanjiVG format were consistent from one
kanji to the next.
I thought that the top priority, before even starting on automated tests,
would be to package KanjiVG like a free software package,
with a configure script that would do whatever was necessary to get the
software to work on whatever system was trying to run it, and a Makefile
that would build whatever needed to be built, and run whatever tests
existed.
However, the suggestion of packaging KanjiVG like a software package was
not at all popular. Everyone else who participated in the discussion
seemed to think not only that it was not the top priority, but that it
would be an actively bad thing and should not be done at all.
The general consensus was that KanjiVG should remain purely a set of data
files, possibly with some scripts on the side but without an automated
framework for running the scripts, and with any validity checking of the
files done only by human intervention. Although I was willing to do most
of the work of implementing the packaging and build system, I wasn't
willing to do it if it would be actively rejected, and nobody else was
volunteering, so that went nowhere.
My life is a lot different now from what it was in 2012 and I'm not in a
position to make such an offer of participation now even if it were to be
accepted. But I'll be interested to see if such ideas have become more
popular.
--
Matthew Skala
msk...@ansuz.sooke.bc.ca People before tribes.
https://ansuz.sooke.bc.ca/