New script for checking kanjivg data; eight errors found
From: Ben Bullock <benkasminbull...@gmail.com>
Date: Fri, 15 Jul 2011 10:22:08 +0900
Local: Thurs, Jul 14 2011 9:22 pm
Subject: New script for checking kanjivg data; eight errors found

The following script:

https://github.com/benkasminbullock/kanjivg/blob/master/check-all-str...

is based on using the stroke "type" information and parsing the path
information.

For each stroke type, it averages the direction of all the strokes of
that type (using the start and end points, as extracted by
Image::SVG::Path). It then looks at each individual stroke and prints
out any stroke which seems to be very much in the wrong direction
compared with that average (the limit is set to 1.0 radians at the
moment, but can be set to any number).
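
As a rough illustration of the idea, here is a minimal sketch in Python
(not the Perl script linked above, and not its actual code). It assumes
the start and end points of each stroke have already been extracted. The
function names, the "horizontal" type label, and the sample data are all
made up for the example, and averaging the directions with a circular
mean is my own assumption; the real script may average them differently.

```python
import math
from collections import defaultdict

def direction(start, end):
    """Angle in radians of the straight line from start to end."""
    return math.atan2(end[1] - start[1], end[0] - start[0])

def angle_difference(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def flag_outliers(strokes, limit=1.0):
    """Return (label, type) for strokes far from their type's mean direction.

    strokes is a list of (label, stroke_type, start, end) tuples, where
    start and end are (x, y) points. limit is in radians.
    """
    by_type = defaultdict(list)
    for label, stroke_type, start, end in strokes:
        by_type[stroke_type].append((label, direction(start, end)))

    flagged = []
    for stroke_type, items in by_type.items():
        # Circular mean (an assumption): add up unit vectors, take the angle.
        sum_x = sum(math.cos(angle) for _, angle in items)
        sum_y = sum(math.sin(angle) for _, angle in items)
        mean = math.atan2(sum_y, sum_x)
        for label, angle in items:
            if angle_difference(angle, mean) > limit:
                flagged.append((label, stroke_type))
    return flagged

# Made-up data: three strokes drawn left to right, one drawn right to left.
sample = [
    ("a", "horizontal", (0, 0), (10, 0)),
    ("b", "horizontal", (0, 5), (10, 6)),
    ("c", "horizontal", (0, 9), (10, 9)),
    ("d", "horizontal", (10, 3), (0, 3)),
]
print(flag_outliers(sample))
```

With this data only stroke "d" is reported, since it runs almost exactly
opposite to the other three. Passing a smaller limit, such as 0.5, makes
the check stricter.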

When the script is run on the "normal" files, excluding the -Kaisho
and others, the output looks like this:

https://github.com/benkasminbullock/kanjivg/blob/master/check-1.0

Using this, the eight errors shown in the attached pictures were
found, from looking at about a quarter of the file (from line 1 to
line 117 of check-1.0). Many more errors might be found if the limit
is set to less than one radian, for example 0.5. The script should
also be run on the -Kaisho and other files. These aren't visible at
kanji.sljfaq.org, which is why it was difficult to check them.

Another point is that many of the things found by this script seem to
arise from the KanjiVG classification of the strokes, rather than from
the kind of stroke errors which even a non-expert can pick out. Sorting
this out is a job for someone who understands the system of
classification, so apologies, but that is not done here.

There are many more cases to examine! If anyone on the list would like
to volunteer to check just a few of the lines of check-1.0 that would
be very helpful.

All the corrections are now uploaded to github:

https://github.com/benkasminbullock/kanjivg

Attachments:

  7109.png (23K)
  5ae3.png (30K)
  7274.png (24K)
  7bf6.png (29K)
  8831.png (36K)
  9ae2.png (28K)
  7e66.png (37K)
  88d8.png (25K)

 
From: Alexandre Courbot <gnu...@gmail.com>
Date: Sun, 17 Jul 2011 23:03:51 +0900
Local: Sun, Jul 17 2011 10:03 am
Subject: Re: [kanjivg] New script for checking kanjivg data; eight errors found

> When the script is run on the "normal" files, excluding the -Kaisho
> and others, the output looks like this:

> https://github.com/benkasminbullock/kanjivg/blob/master/check-1.0

I see this file has been checked into your branch. It's probably not
necessary to have generated content in the git repo, mind if I remove
it?

> There are many more cases to examine! If anyone on the list would like
> to volunteer to check just a few of the lines of check-1.0 that would
> be very helpful.

Indeed, it would be great to have some sanity tests to ensure quality
and consistency of everything. This looks like a huge job though.

> All the corrections are now uploaded to github:

> https://github.com/benkasminbullock/kanjivg

Merged, thanks!

Alex.


 