Proposal: remove single-element groups

Ben Bullock

Apr 3, 2022, 1:50:00 AM
to KanjiVG

I'd like to propose the wholesale removal of single-element groups like kvg:0723b-g2 in kanji/0723b.svg:

   <g id="kvg:0723b-g1" kvg:element="乂" kvg:position="top">
       <g id="kvg:0723b-g2" kvg:element="丿">
           <path id="kvg:0723b-s1" kvg:type="㇒" d="M73.16,9.14c0.15,0.8,0.31,2.06-0.3,3.2c-3.62,6.75-24.39,21.56-52.81,30.62"/>
       </g>
       <path id="kvg:0723b-s2" kvg:type="㇏" d="M32.27,17.58C49.75,20,73,32.25,79.81,40.72"/>
   </g>

Specifically, I'm thinking of removing every group whose element has only one stroke and which contains only a single path.

But before doing that, is anyone using these single-element groups for some purpose?

(See also https://github.com/KanjiVG/kanjivg/issues/250).
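The candidate groups described above could be listed with a short script. This is only a sketch: it assumes the `kvg:` prefix is bound to the namespace URI `http://kanjivg.tagaini.net`, it works on the inline 0723b snippet rather than a real file, and it checks only the "one path within the group" half of the criterion. Whether the element itself has only one stroke would still need a separate lookup. The names `local` and `single_path_groups` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the kvg: prefix is bound to this namespace URI.
KVG_NS = "http://kanjivg.tagaini.net"

# The 0723b fragment quoted in this message.
SAMPLE = f"""
<g xmlns:kvg="{KVG_NS}" id="kvg:0723b-g1" kvg:element="乂" kvg:position="top">
  <g id="kvg:0723b-g2" kvg:element="丿">
    <path id="kvg:0723b-s1" kvg:type="㇒" d="M73.16,9.14c0.15,0.8,0.31,2.06-0.3,3.2c-3.62,6.75-24.39,21.56-52.81,30.62"/>
  </g>
  <path id="kvg:0723b-s2" kvg:type="㇏" d="M32.27,17.58C49.75,20,73,32.25,79.81,40.72"/>
</g>
"""


def local(tag):
    """Strip a '{namespace}' prefix so the check also works on namespaced SVG tags."""
    return tag.rsplit("}", 1)[-1]


def single_path_groups(root):
    """Return the ids of <g> elements whose only child is a single <path>."""
    found = []
    for node in root.iter():
        children = list(node)
        if (
            local(node.tag) == "g"
            and len(children) == 1
            and local(children[0].tag) == "path"
        ):
            found.append(node.get("id"))
    return found


root = ET.fromstring(SAMPLE.strip())
candidates = single_path_groups(root)
print(candidates)
```

For the snippet above, only `kvg:0723b-g2` is reported; the outer group has two children and is left alone.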


Alexandre Courbot

Apr 3, 2022, 10:32:28 PM
to KanjiVG
I'd be careful before removing this kind of data, as I suspect it may have some semantic value. We really need someone with a strong Japanese academic background to make the calls on such cases.

I agree that the added-value of single-stroke groups for the katakana "no" radical is limited, but on the other hand you have single-stroke kanji like 乙 and 一 which may be more meaningful.

--
--
You received this message because you are subscribed to the "KanjiVG" group.
For options and unsubscribing, visit this group at
http://groups.google.com/group/kanjivg
---
You received this message because you are subscribed to the Google Groups "KanjiVG" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kanjivg+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kanjivg/2ab69dfc-dfcf-4f6f-b4c0-00dadac7cbb2n%40googlegroups.com.

Ben Bullock

Apr 4, 2022, 1:53:52 AM
to KanjiVG
On Mon, 4 Apr 2022 at 11:32, Alexandre Courbot <gnu...@gmail.com> wrote:
> I'd be careful before removing this kind of data, as I suspect it may have some semantic value. We really need someone with a strong Japanese academic background to make the calls on such cases.

I'm not referring to anything that requires linguistic judgement. I'm referring to things like the groups around each line of 彡, or around the first line of each of the upper and lower parts of 爻. These superfluous groups carry no meaning at all, and they seem to have been mechanically added at some point. If they are removed and then turn out to have been necessary for some reason, they can just be mechanically added back again. I doubt these are being used by anybody, but I thought I should ask before changing the files, just in case.

The formatting of the example I posted got tangled by Google Groups. Perhaps these pull requests make it clearer what I'm proposing to do:


I don't think it's at all controversial, but let me know if this does break something.
 
> I agree that the added-value of single-stroke groups for the katakana "no" radical is limited, but on the other hand you have single-stroke kanji like 乙 and 一 which may be more meaningful.

I'm not at all proposing deleting that kind of thing.

Ulrich Apel

Apr 4, 2022, 3:24:47 AM
to kan...@googlegroups.com
Hi everybody,

If you want more "human" animations of the stroke order, you have to insert writing pauses of varying length not only between strokes but also between stroke groups, elements, positions, etc. This should also apply to single-stroke groups. I am not sure whether this feature is used at the moment, but I would prefer to keep the group information somewhere, perhaps as an attribute, if that makes things easier.
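To illustrate the kind of timing information that depends on the groups, here is a rough sketch that derives a pause before each stroke from the group boundaries crossed since the previous stroke. The timing constants, the function `stroke_pauses`, and the "count groups left plus groups entered" rule are all assumptions made up for this example, not something KanjiVG or any existing viewer is known to do. The `d` attributes are omitted for brevity.

```python
import xml.etree.ElementTree as ET

KVG_NS = "http://kanjivg.tagaini.net"  # assumed namespace URI for the kvg: prefix

# Hypothetical timing values in milliseconds; nothing in KanjiVG prescribes them.
STROKE_PAUSE_MS = 200
GROUP_PAUSE_MS = 300

WITH_GROUP = f"""
<g xmlns:kvg="{KVG_NS}" id="kvg:0723b-g1" kvg:element="乂">
  <g id="kvg:0723b-g2" kvg:element="丿"><path id="kvg:0723b-s1"/></g>
  <path id="kvg:0723b-s2"/>
</g>
"""

WITHOUT_GROUP = f"""
<g xmlns:kvg="{KVG_NS}" id="kvg:0723b-g1" kvg:element="乂">
  <path id="kvg:0723b-s1"/>
  <path id="kvg:0723b-s2"/>
</g>
"""


def local(tag):
    """Strip a '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def stroke_pauses(root):
    """Return (stroke id, pause before it in ms) pairs in document order."""
    pauses = []
    previous = None  # ids of the groups enclosing the previous stroke

    def walk(node, enclosing):
        nonlocal previous
        name = local(node.tag)
        if name == "g":
            enclosing = enclosing | {node.get("id")}
        elif name == "path":
            if previous is None:
                pause = 0  # no pause before the first stroke
            else:
                # Groups left plus groups entered since the previous stroke.
                crossed = len(enclosing ^ previous)
                pause = STROKE_PAUSE_MS + GROUP_PAUSE_MS * crossed
            pauses.append((node.get("id"), pause))
            previous = enclosing
        for child in node:
            walk(child, enclosing)

    walk(root, frozenset())
    return pauses


with_group = stroke_pauses(ET.fromstring(WITH_GROUP.strip()))
without_group = stroke_pauses(ET.fromstring(WITHOUT_GROUP.strip()))
print(with_group)
print(without_group)
```

Under these assumptions the second stroke gets a longer pause when the first stroke sits in its own group, and that difference disappears once the single-stroke group is removed.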

Best wishes

Ulrich


Lee Hericks

Apr 4, 2022, 4:38:57 AM
to kan...@googlegroups.com
This is an excellent point. 

Was proper documentation of KanjiVG ever completed? The project needs a manifesto of sorts, documenting the features and the reasons behind them, to avoid proposals that would remove important information.


Alexandre Courbot

Apr 4, 2022, 7:28:56 AM
to KanjiVG
Hi Ulrich,

Would animation timing be the only information lost by removing these one-stroke groups? Do you remember the motivation for the current layout?

Cheers,
Alex.

Ulrich Apel

Apr 4, 2022, 8:09:39 AM
to kan...@googlegroups.com
Hi Alex,

Animation timing is probably the main information connected to these groups.

Chances are high that strokes which are already a radical or an element on their own are in these extra groups because we treated all radicals in the same way. If there is a way for an XML layman like me to restore this group information, it is fine with me to remove the groups.

Best wishes,
Ulrich

msk...@ansuz.sooke.bc.ca

Apr 4, 2022, 8:44:40 AM
to KanjiVG
On Mon, 4 Apr 2022, Alexandre Courbot wrote:
> I'd be careful before removing this kind of data, as I suspect it may have
> some semantic value. We really need someone with a strong Japanese academic
> background to make the calls on such cases.

When this came up ten years ago I advocated that we should have a rule
that every group contains *either* two or more other groups, *or* one or
more strokes; never a stroke being sibling to a group. Following this
rule would entail creating more single-stroke groups in many cases, and
retaining most if not all of those that currently exist.

https://groups.google.com/g/kanjivg/c/R2qdhAGnABY/m/R5FsXNd9zOQJ

The reason I advocated that was that "no stroke sibling to a group" is a
rule that can be tested easily and automatically. If we say "no
single-stroke groups ever" then we lose important information in some
cases - so, okay, we probably have a consensus against absolutely
forbidding single-stroke groups. If we say "single-stroke groups only
when required to represent semantics" then we have a rule that cannot
easily be tested (the test needs to know when semantics are required) and
for which objective answers may not even exist (because humans may not
agree on what is or isn't necessary semantic information).

Note that if my other recommendation about representing stroke order
explicitly instead of implicitly is *not* adopted, then that's an
additional reason single-stroke groups might be needed: because stroke
order might require you to split a multi-stroke component into individual
strokes or subsequences of strokes represented in different parts of the
file, and then information needed to undo the splitting might need to be
attached to a group that happens to contain only one of the strokes.

I thought and still think that rules for structuring the files ought to go
with automated tests of whether the rules are followed; else, we will end
up not actually following the rules. And that in turn means rules that
can be automatically tested, may be preferable to rules that can't.
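As a sketch of how the "no stroke sibling to a group" rule could be tested automatically, the following checks each group for the two conditions stated above: mixed group and stroke children, or fewer than two subgroups with no strokes. It assumes the `kvg:` prefix is bound to `http://kanjivg.tagaini.net`; the function names, the message strings, and the group id `kvg:0723b-g3` in the second sample are hypothetical. The `d` attributes are omitted for brevity.

```python
import xml.etree.ElementTree as ET

KVG_NS = "http://kanjivg.tagaini.net"  # assumed namespace URI for the kvg: prefix

# Current layout of 0723b as quoted earlier in the thread.
CURRENT = f"""
<g xmlns:kvg="{KVG_NS}" id="kvg:0723b-g1" kvg:element="乂">
  <g id="kvg:0723b-g2" kvg:element="丿"><path id="kvg:0723b-s1"/></g>
  <path id="kvg:0723b-s2"/>
</g>
"""

# Same strokes with the second one wrapped in a hypothetical group g3.
WRAPPED = f"""
<g xmlns:kvg="{KVG_NS}" id="kvg:0723b-g1" kvg:element="乂">
  <g id="kvg:0723b-g2" kvg:element="丿"><path id="kvg:0723b-s1"/></g>
  <g id="kvg:0723b-g3"><path id="kvg:0723b-s2"/></g>
</g>
"""


def local(tag):
    """Strip a '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def check_group(group):
    """Return a message if the group breaks the rule, else None."""
    kinds = [local(child.tag) for child in group]
    n_groups = kinds.count("g")
    n_paths = kinds.count("path")
    if n_groups and n_paths:
        return "stroke is sibling to a group"
    if n_paths == 0 and n_groups < 2:
        return "fewer than two subgroups and no strokes"
    return None


def violations(root):
    """Return (group id, message) for every group that breaks the rule."""
    found = []
    for node in root.iter():
        if local(node.tag) == "g":
            message = check_group(node)
            if message:
                found.append((node.get("id"), message))
    return found


current_report = violations(ET.fromstring(CURRENT.strip()))
wrapped_report = violations(ET.fromstring(WRAPPED.strip()))
print(current_report)
print(wrapped_report)
```

The quoted 0723b layout fails the check because the second stroke sits next to a group; wrapping that stroke in its own group makes it pass, which is the "more single-stroke groups" consequence mentioned above.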

--
Matthew Skala
msk...@ansuz.sooke.bc.ca People before tribes.
https://ansuz.sooke.bc.ca/

Ben Bullock

Apr 4, 2022, 9:29:41 AM
to KanjiVG
To clarify, I'm proposing to remove redundant single-stroke groups like the ones around the strokes in 彡 or 爻, not to remove every group that contains only a single stroke, which is what my message seems to say on reading it back. I think I've been looking at the KanjiVG file contents for too long, and I wrongly assumed that everyone else would understand what I meant when I said I wanted to remove these groups.

The change I meant to propose, but didn't make very clear, will not remove anything other than redundant information, and it will not affect stroke order animations. I apologise for wasting people's time, but I was only referring to the redundant single-stroke groups, not to all of the groups that contain only a single stroke.

These diffs show some of what I propose to do, but they are a little confusing, since in the same set of edits I also reordered some strokes where a "tree" element was wrongly grouped as a child of 爻:


Similarly here:


These groups are completely redundant and can be mechanically removed or added at will since they cover every instance of 彡 in KanjiVG.
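A minimal sketch of what "mechanically removed or added" could look like, using the 0723b fragment from the first message: `unwrap` replaces each single-path group for a given element with its path and remembers the group's tag and attributes, and `rewrap` puts the groups back. The function names and the saved-attributes dictionary are hypothetical, the namespace URI is an assumption, and the sketch does not renumber the remaining group ids, which a real edit might need to do. The `d` attributes are omitted for brevity.

```python
import xml.etree.ElementTree as ET

KVG_NS = "http://kanjivg.tagaini.net"  # assumed namespace URI for the kvg: prefix

SAMPLE = f"""
<g xmlns:kvg="{KVG_NS}" id="kvg:0723b-g1" kvg:element="乂">
  <g id="kvg:0723b-g2" kvg:element="丿"><path id="kvg:0723b-s1"/></g>
  <path id="kvg:0723b-s2"/>
</g>
"""


def local(tag):
    """Strip a '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def is_single_path_group(node):
    children = list(node)
    return (
        local(node.tag) == "g"
        and len(children) == 1
        and local(children[0].tag) == "path"
    )


def unwrap(root, element):
    """Replace single-path groups for `element` by their path.

    Returns {path id: (group tag, group attributes)} so the groups can be restored.
    """
    saved = {}
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if is_single_path_group(child) and child.get(f"{{{KVG_NS}}}element") == element:
                path = child[0]
                path.tail = child.tail  # keep surrounding whitespace
                parent.remove(child)
                parent.insert(index, path)
                saved[path.get("id")] = (child.tag, dict(child.attrib))
    return saved


def rewrap(root, saved):
    """Wrap each path recorded in `saved` in a group with the saved attributes."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            entry = saved.get(child.get("id"))
            if local(child.tag) == "path" and entry is not None:
                tag, attrib = entry
                group = ET.Element(tag, attrib)
                group.tail, child.tail = child.tail, None
                parent.remove(child)
                parent.insert(index, group)
                group.append(child)


def ids(root):
    """Ids of all elements in document order."""
    return [node.get("id") for node in root.iter()]


root = ET.fromstring(SAMPLE.strip())
before = ids(root)
saved = unwrap(root, "丿")
after_unwrap = ids(root)
rewrap(root, saved)
after_rewrap = ids(root)
print(before, after_unwrap, after_rewrap, sep="\n")
```

Here the round trip restores the original element order, but only because the attributes of the removed group were saved. Restoring groups without such a record relies on the groups being derivable from the stroke itself, as claimed above for 彡.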

