Online viewer http://kanjivg.lemoda.net/


Ben Bullock

Jun 13, 2009, 10:35:34 PM
to KanjiVG
I made a primitive online viewer for kanjivg (not sure what the VG
stands for?).

It is available here: http://kanjivg.lemoda.net/

[This is only a temporary site to prove the concept.]

Unfortunately this does not work with Internet Explorer since it only
produces SVG, which Internet Explorer doesn't do. I didn't have time
to make something which produces PNG images yet.

Firefox users need to click on the image (the broken image symbol is a
link to the image itself) since Firefox does not seem to be able to
show SVG and text on the same page (I think).

Google Chrome/Safari users should be able to see the picture updated
instantly.

Incidentally I updated the parser program so that it puts the data
into an SQLite database, and the above image viewer actually pulls
each image from the database and creates the SVG from it.

If you downloaded the previous parser, note that it had a bug whereby
it didn't parse parts ("smooth curves") of the SVG and so if you tried
parsing "学" bits would be missing. I'm not sure how the SVG was
created but the format is a little untidy (dare I say it?). But the
kanji images seem to be very high quality.

The numbers are the stroke numbers. I haven't done the work to make
the program put them somewhere intelligent so sometimes the numbers
clash with each other or with the lines at the moment.

If anyone wants the updated parser/db interface for kanjivg, I can
send it by email. I think the "files" section here is not the best
place for this so I'll think about making it a project on Source Forge
or CPAN or something like that.

Alexandre Courbot

Jun 13, 2009, 11:16:34 PM
to kan...@googlegroups.com
Hi Ben,

> I made a primitive online viewer for kanjivg (not sure what the VG
> stands for?).

Vector Graphics, same as in SVG, since that is the format used.

> It is available here: http://kanjivg.lemoda.net/

Amazing. This is working great and the rendered images are really cool already.

> If you downloaded the previous parser, note that it had a bug whereby
> it didn't parse parts ("smooth curves") of the SVG and so if you tried
> parsing "学" bits would be missing. I'm not sure how the SVG was
> created but the format is a little untidy (dare I say it?). But the
> kanji images seem to be very high quality.

I think it's Adobe Illustrator. Indeed, most paths could be cleaned up
and simplified - that would also make the file smaller.

> If anyone wants the updated parser/db interface for kanjivg, I can
> send it by email. I think the "files" section here is not the best
> place for this so I'll think about making it a project on Source Forge
> or CPAN or something like that.

Actually I'm planning to do something similar for KanjiVG's website -
each kanji would have its own page where the user can see the data in
a user-friendly form as well as different renderings (image, stroke
order animation, etc.) and submit fixes. I was initially planning to
do that in Python (and a little bit of PHP for the front-end), as it's
the language my lib is written in so far (and I don't know about Perl
anyway). Would you like to join in? We could build the most complete
(although not most error-free at the moment) site proposing stroke
order diagrams!

On the main repository, each kanji is split into its own set of
files (one for the XML, one for the SVG), so I don't think the SQLite
database would be useful there.

Alex.

Ben Bullock

Jun 14, 2009, 5:18:35 AM
to kan...@googlegroups.com
2009/6/14 Alexandre Courbot <gnu...@gmail.com>:

>> I made a primitive online viewer for kanjivg (not sure what the VG
>> stands for?).
>
> Vector Graphics, same as in SVG, since that is the format used.

Oh, I should have guessed!

>> It is available here: http://kanjivg.lemoda.net/
>
> Amazing. This is working great and the rendered images are really cool already.

Thanks.

>> If you downloaded the previous parser, note that it had a bug whereby
>> it didn't parse parts ("smooth curves") of the SVG and so if you tried
>> parsing "学" bits would be missing. I'm not sure how the SVG was
>> created but the format is a little untidy (dare I say it?). But the
>> kanji images seem to be very high quality.
>
> I think it's Adobe Illustrator. Indeed, most paths could be cleaned up
> and simplified - that would also make the file smaller.

I didn't check for possible simplification, but it seems odd that
relative and absolute path commands are mixed together in the same
kanji stroke.

I have also noticed that about twenty or so strokes don't have path information.

>> If anyone wants the updated parser/db interface for kanjivg, I can
>> send it by email. I think the "files" section here is not the best
>> place for this so I'll think about making it a project on Source Forge
>> or CPAN or something like that.
>
> Actually I'm planning to do something similar for KanjiVG's website -
> each kanji would have its own page where the user can see the data in
> a user-friendly form as well as different renderings (image, stroke
> order animation, etc.) and submit fixes.

I think it's better to start that project at the back end and think
about what you are going to do about version control, applying fixes
and so on. I think the user interface (web site or something) is the
easy part of this problem.

> I was initially planning to
> do that in Python (and a little bit of PHP for the front-end), as it's
> the language my lib is written in so far (and I don't know about Perl
> anyway).

The site I made is in fact mostly JavaScript; there are only a few
lines of Perl, which just pull the SVG from the database and write
it to a file.

Once the data is put into a database it is possible to then access it
using another language. If I get around to making a PNG version of the
stroke diagrams I might write it in C since there is a fairly easy
library in C called Cairo which I already use to make PNGs. SVG is
obviously much better than PNG for storing the kanji stroke data, but
it is not a very good format for presentation since Internet Explorer
doesn't support it and there are (really bad) errors in all the
renderers I know about. Inkscape, libsvg on Linux, Chrome, and Firefox
are all definitely buggy.

> Would you like to join in? We could build the most complete
> (although not most error-free at the moment) site proposing stroke
> order diagrams!

I'm sorry but unfortunately I already have a lot of other things to
do, so I don't have enough free time to make a commitment. The above
viewer is something I made to visually check that the curve
information was inserted into the database correctly.

Jeroen Hoek

Jun 14, 2009, 6:15:06 AM
to kan...@googlegroups.com
2009/6/14 Ben Bullock <benkasmi...@gmail.com>:

> Unfortunately this does not work with Internet Explorer since it only
> produces SVG, which Internet Explorer doesn't do. I didn't have time
> to make something which produces PNG images yet.
>
> Firefox users need to click on the image (the broken image symbol is a
> link to the image itself) since Firefox does not seem to be able to
> show SVG and text on the same page (I think).

Hi Ben,

Could you try referencing the SVG with object tags instead of img
tags? That works fine in Firefox at least.

<object id="strokediagram" data="img/kanji39340.svg" type="image/svg+xml"/>

For reference:
http://wiki.svg.org/SVG_and_HTML

The last section seems the proper way of handling this type of data.
See attached screenshot as well.

Kind regards,

Jeroen Hoek

Screenshot-KanjiVG - Mozilla Firefox.png

Ben Bullock

Jun 14, 2009, 6:54:52 AM
to kan...@googlegroups.com
2009/6/14 Jeroen Hoek <ma...@jeroenhoek.nl>:

> Could you try referencing the SVG with object tags instead of img
> tags? That works fine in Firefox at least.
>
> <object id="strokediagram" data="img/kanji39340.svg" type="image/svg+xml"/>

I changed it as you suggested and it now displays OK in both Firefox
and Chrome. Thanks for this fix.

Ben Bullock

Jun 14, 2009, 5:39:22 PM
to kan...@googlegroups.com
2009/6/14 Ben Bullock <benkasmi...@gmail.com>:

Update: I changed it back to the way it was before, because it was not
working very well with Google Chrome (my default browser) like that. I
thought it was a bug on the Linux version of Chrome but it's the same
on Windows.

Obviously SVG just isn't well supported in browsers so the next step
is to make PNG from the data rather than trying to show SVG.

Alexandre Courbot

Jun 14, 2009, 10:42:37 PM
to kan...@googlegroups.com
> I have also noticed that about twenty or so strokes don't have
> path information.

Right, I know about that too. These are the few I need to fix in order
to have a perfect match between XML and SVG data. About 50 kanji are
affected.

> I think it's better to start that project at the back end and think
> about what you are going to do about version control, applying fixes
> and so on. I think the user interface (web site or something) is the
> easy part of this problem.

Yeah, this part has been a headache, but we're going through it.

We already have version control (SVN), with the data split between
XML descriptions and complete, editable SVG files, one of each per
kanji. This makes editing easy. Every day, a script generates the
release files if a commit has been performed. This keeps things easy
and practical.

In an ideal world, users would be able to submit fixes straight from
the kanji page, but that seems difficult as of now. Maybe we will just
make the SVN public so that anyone can submit patches. I have to clean
it up first.


> I'm sorry but unfortunately I already have a lot of other things to
> do, so I don't have enough free time to make a commitment. The above
> viewer is something I made to visually check that the curve
> information was inserted into the database correctly.

Sure, I understand that. Still your program gave me some good ideas on
how to do the per-kanji pages.

Alex.

Ben Bullock

Jun 15, 2009, 10:18:30 AM
to kan...@googlegroups.com
2009/6/15 Ben Bullock <benkasmi...@gmail.com>:

> Obviously SVG just isn't well supported in browsers so the next step
> is to make PNG from the data rather than trying to show SVG.

Another update: now it makes both png and svg files.

Just for fun, I made the PNG one have random stroke colours.

Alexandre Courbot

Jun 16, 2009, 12:02:06 AM
to kan...@googlegroups.com
> Another update: now it makes both png and svg files.
>
> Just for fun, I made the PNG one have random stroke colours.

Works like a charm. Great!

By the way, KanjiVG is not my project. I'm just contributing some code
here and there and using it in my own project (Tagaini Jisho); credit
for KanjiVG is due to Ulrich.

Alex.

Jeroen Hoek

Jun 16, 2009, 5:25:50 AM
to kan...@googlegroups.com
Is Ulrich Apel planning on publishing an article on KanjiVG? I would love to read it.

~ Jeroen

2009/6/16 Alexandre Courbot <gnu...@gmail.com>

Alexandre Courbot

Jun 16, 2009, 11:22:06 AM
to kan...@googlegroups.com
> Is Ulrich Apel planning on publishing an article on KanjiVG? I would love to
> read it.

There is at least this one:

http://www2005.org/cdrom/docs/p1152.pdf

And probably a couple of others, although KanjiVG is not referred to
by its present name. I'm pretty sure there is some research potential
to be exploited there, so I hope Ulrich will come up with a couple of
perspectives for KanjiVG!

Alex.

Dr. Ulrich Apel

Jun 16, 2009, 1:30:24 PM
to kan...@googlegroups.com
Hi everybody,

I am very excited about the progress of KanjiVG in the last months,
thanks to Alex and all of you.

> Is Ulrich Apel planning on publishing an article on KanjiVG? I would
> love to read it.

Thank you, Jeroen, for the question!

There are some papers on the topic, written together with Julien Quint,
who did the programming of the animations and much of the theoretical
work on the information science side of the project, pretty much as
Alex is doing now. Unfortunately, Julien became very busy with
another job and other projects and we couldn't continue together.

One paper by Julien and me from SVGOpen 2004 is "Teaching and
Reference Material on Japanese Kanji in SVG: Stroke Order, Animated
Drawing of Characters, Kanji Components and their Relationships",
<http://www.svgopen.org/2004/papers/svgopen/>.

A similar paper from Coling 2004 is part of:
<http://acl.ldc.upenn.edu/coling2004/W10/pdf/proceedings.pdf>

Julien and I even won the Best Poster Award at the 14th World Wide Web
Conference (WWW 2005): <http://www.www2005.org/award.html>,
<http://delivery.acm.org/10.1145/1070000/1062914/p1152-quint.pdf?key1=1062914&key2=3106415421&coll=GUIDE&dl=GUIDE&CFID=39765720&CFTOKEN=95415220>

I am also planning a research project about the extension of KanjiVG at
my new employer, Tuebingen University.


By the way, the original Illustrator data also contains information
about numbering the starting points of the strokes. These numbers
should be placed in a way that avoids overlap and misunderstanding.
Perhaps a new version of KanjiVG might include this information too.

Best wishes

Ulrich

Alexandre Courbot

Jun 17, 2009, 8:57:36 PM
to kan...@googlegroups.com
Hi Ulrich,

Thanks for the list of papers!

> By the way, the original Illustrator data also contains information
> about numbering the starting points of the strokes. These numbers
> should be placed in a way that avoids overlap and misunderstanding.
> Perhaps a new version of KanjiVG might include this information too.

I did not include them because this information would better be
computed IMHO. It wouldn't hurt much to add an attribute with the
position of the number, but this implies that we don't forget to
update them every time we edit a file, which is rather error-prone.

Alex.

Ben Bullock

Jun 18, 2009, 2:18:00 AM
to kan...@googlegroups.com
2009/6/18 Alexandre Courbot <gnu...@gmail.com>:
>
> Hi Ulrich,
>
> Thanks for the list of papers!
>
>> By the way, the original Illustrator data also contains information
>> about numbering the starting points of the strokes. These numbers
>> should be placed in a way that avoids overlap and misunderstanding.
>> Perhaps a new version of KanjiVG might include this information too.
>
> I did not include them because this information would better be
> computed IMHO.

Do you have an algorithm for this? It's not obvious to me how to compute it.

Alexandre Courbot

Jun 18, 2009, 9:51:46 PM
to kan...@googlegroups.com
>> I did not include them because this information would better be
>> computed IMHO.
>
> Do you have an algorithm for this? It's not obvious to me how to compute it.

Nor is it to me, to be honest. I guess that even manually, it is
difficult to always find a comprehensive layout where there are no
ambiguities as to which path a number belongs to. Still, using
bounding boxes for paths and numbers and ensuring they don't overlap
sounds like a way to do that.
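
To make the bounding-box idea a little more concrete, here is a rough C sketch. Everything in it is an assumption for illustration: the `Box` type, `boxes_overlap` and `place_label` are hypothetical names, not part of KanjiVG or of any existing tool. It tries four positions around a stroke's starting point and keeps the first label box that overlaps none of the given obstacle boxes.

```c
#include <assert.h>
#include <stddef.h>

/* Axis-aligned bounding box (hypothetical type, not part of KanjiVG). */
typedef struct { double x0, y0, x1, y1; } Box;

/* Return 1 if the two boxes share any interior area, 0 otherwise.
   Boxes that merely touch along an edge do not count as overlapping. */
int boxes_overlap(Box a, Box b)
{
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

/* Try four candidate positions around (sx, sy), the start of a stroke,
   and store in *out the first label box of size w x h that overlaps none
   of the n obstacle boxes.  Returns 1 on success, 0 if all candidates
   collide. */
int place_label(double sx, double sy, double w, double h,
                const Box *obstacles, size_t n, Box *out)
{
    /* Top-left corner of the label as a multiple of its size: upper
       left, upper right, lower left, lower right of the start point. */
    static const double fx[4] = { -1.0, 0.0, -1.0, 0.0 };
    static const double fy[4] = { -1.0, -1.0, 0.0, 0.0 };
    size_t i, j;

    assert(w > 0.0 && h > 0.0 && out != NULL);
    for (i = 0; i < 4; i++) {
        Box cand;
        int clash = 0;

        cand.x0 = sx + fx[i] * w;
        cand.y0 = sy + fy[i] * h;
        cand.x1 = cand.x0 + w;
        cand.y1 = cand.y0 + h;
        for (j = 0; j < n && !clash; j++)
            clash = boxes_overlap(cand, obstacles[j]);
        if (!clash) {
            *out = cand;
            return 1;
        }
    }
    return 0;
}
```

The obstacles would be the boxes of the other strokes plus the labels already placed. A bounding box is a coarse stand-in for a long diagonal stroke, so a real layout would probably need something finer; this only shows the overlap test itself.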

Alex.

Ben Bullock

Jun 19, 2009, 12:32:22 AM
to kan...@googlegroups.com
2009/6/19 Alexandre Courbot <gnu...@gmail.com>:

>> Do you have an algorithm for this? It's not obvious to me how to compute it.
>
> Nor is it to me, to be honest. I guess that even manually, it is
> difficult to always find a comprehensive layout where there are no
> ambiguities as to which path a number belongs to. Still, using
> bounding boxes for paths and numbers and ensuring they don't overlap
> sounds like a way to do that.

My vote is to include the information on the location of the number in
the data file.

Alexandre Courbot

Jun 19, 2009, 12:35:23 AM
to kan...@googlegroups.com
>> Nor is it to me, to be honest. I guess that even manually, it is
>> difficult to always find a comprehensive layout where there are no
>> ambiguities as to which path a number belongs to. Still, using
>> bounding boxes for paths and numbers and ensuring they don't overlap
>> sounds like a way to do that.
>
> My vote is to include the information on the location of the number in
> the data file.

Okay, I'll try to update the file then. Note that the information is
still subject to human error.

Alex.

Enrique Saul Gonzalez

Jun 23, 2009, 12:45:34 AM
to kan...@googlegroups.com
Hello everyone,

I'm coming a bit late to this discussion, but I was wondering if you
had considered using a different color scheme for the kanji images.
Having an intuitive, easy-to-follow scheme and providing a legend
could reduce the need for actual numbers, which also has the effect of
reducing clutter.

For example, I am using a scheme inspired by the colors of the rainbow
for a kanji learning game I'm developing.
I uploaded a couple samples (to Picasa for convenience):
http://picasaweb.google.com/esaulgd/Kanji?authkey=Gv1sRgCKb3mIykpK3KFA&feat=directlink
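
For anyone who wants to try a rainbow-style scheme, here is a minimal C sketch of one possible mapping from stroke number to colour. It is not the exact scheme used in the samples above; the function name `stroke_colour` and the 0 to 270 degree hue range are assumptions made for this example.

```c
#include <assert.h>

/* Map stroke number `index` (0-based) out of `count` strokes to an RGB
   colour running from red (first stroke) to violet (last stroke).
   The 0..270 degree hue range is an arbitrary choice for this sketch. */
void stroke_colour(int index, int count, unsigned char rgb[3])
{
    double hue, sector, f;
    int rising, falling;

    assert(count > 0 && index >= 0 && index < count);
    hue = (count > 1) ? 270.0 * index / (count - 1) : 0.0;

    /* HSV to RGB with saturation and value both fixed at 1. */
    sector = hue / 60.0;                 /* 0 <= sector <= 4.5 */
    f = sector - (int) sector;           /* position inside the sector */
    rising = (int) (255.0 * f + 0.5);
    falling = 255 - rising;

    switch ((int) sector) {
    case 0:  /* red to yellow */
        rgb[0] = 255; rgb[1] = (unsigned char) rising; rgb[2] = 0;
        break;
    case 1:  /* yellow to green */
        rgb[0] = (unsigned char) falling; rgb[1] = 255; rgb[2] = 0;
        break;
    case 2:  /* green to cyan */
        rgb[0] = 0; rgb[1] = 255; rgb[2] = (unsigned char) rising;
        break;
    case 3:  /* cyan to blue */
        rgb[0] = 0; rgb[1] = (unsigned char) falling; rgb[2] = 255;
        break;
    default: /* blue to violet */
        rgb[0] = (unsigned char) rising; rgb[1] = 0; rgb[2] = 255;
        break;
    }
}
```

With a legend showing the same gradient, the first stroke is always red and the last always violet, whatever the stroke count.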

BTW, you'll see the shapes themselves are pretty rough. This is
because I'm currently using the data from the Tomoe project. I'd like
to use the KanjiVG data instead, and for this purpose it would be
great if I could take a look at the code for the online viewer. Would
it be possible to obtain a copy?

Thanks and congrats on the great work so far.

-- Enrique
--
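Graduate School of Interdisciplinary Information Studies, The University of Tokyo
Baba Laboratory, M2 (second-year master's student), Enrique Saul Gonzalez
Phone: 080-5679-1659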

Alexandre Courbot

Jun 23, 2009, 3:00:36 AM
to kan...@googlegroups.com
> I'm coming a bit late to this discussion, but I was wondering if you
> had considered using a different color scheme for the kanji images.
> Having an intuitive, easy-to-follow scheme and providing a legend
> could reduce the need for actual numbers, which also has the effect of
> reducing clutter.

Actually KanjiVG does not impose any color scheme. It just gives you
the stroke paths, so you can apply whatever shape or color that
pleases you.

> For example, I am using a scheme inspired by the colors of the rainbow
> for a kanji learning game I'm developing.
> I uploaded a couple samples (to Picasa for convenience):
> http://picasaweb.google.com/esaulgd/Kanji?authkey=Gv1sRgCKb3mIykpK3KFA&feat=directlink

Looks interesting. Be careful as the licence for KanjiVG prohibits
commercial use.

> BTW, you'll see the shapes themselves are pretty rough. This is
> because I'm currently using the data from the Tomoe project. I'd like
> to use the KanjiVG data instead, and for this purpose it would be
> great if I could take a look at the code for the online viewer. Would
> it be possible to obtain a copy?

Ben posted the code of the viewer he used for
http://kanjivg.lemoda.net/ . You can find it here:

http://kanjivg.googlegroups.com/web/kanjivg.tar?gda=mJNv2T0AAACZ1HcjKkQOwo5IfNM_r-emiwmoZe8lhpcIAhudhKj761dp9oANqoIL0POiyte4AGLlNv--OykrTYJH3lVGu2Z5

Anyway, parsing SVG paths is not very hard. You can find documentation
about that here: http://www.w3.org/TR/SVG/paths.html#PathData
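
As a starting point, the lexical part can be sketched in a few lines of C. This is only an illustration, not code from the viewer: `next_path_token` and the `TOK_*` constants are hypothetical names, and the sketch relies on `strtod` in the default "C" locale to read each number. It splits path data into command letters and numbers; interpreting the commands is left to the caller.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

enum { TOK_END, TOK_COMMAND, TOK_NUMBER, TOK_ERROR };

/* Read the next token from the SVG path data at *cursor and advance
   *cursor past it.  A command letter is stored in *command, a number in
   *number.  Returns one of the TOK_* values above. */
int next_path_token(const char **cursor, char *command, double *number)
{
    const char *s;
    char *end;

    assert(cursor != NULL && *cursor != NULL);
    assert(command != NULL && number != NULL);
    s = *cursor;

    /* Whitespace and commas only separate tokens. */
    while (*s == ',' || isspace((unsigned char) *s))
        s++;
    if (*s == '\0') {
        *cursor = s;
        return TOK_END;
    }
    if (isalpha((unsigned char) *s)) {
        *command = *s;
        *cursor = s + 1;
        return TOK_COMMAND;
    }
    /* strtod stops at the first character that cannot continue the
       number, so "1.5-2" is read as 1.5 followed by -2. */
    *number = strtod(s, &end);
    if (end == s)
        return TOK_ERROR;
    *cursor = end;
    return TOK_NUMBER;
}
```

Calling it in a loop on a string such as "M10,20c1.5-2,3,.5 4,0" yields M, 10, 20, c, 1.5, -2 and so on, which is enough to fill in the points of each curve.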

Have fun,
Alex.

Mathieu Blondel

Jun 23, 2009, 11:42:14 AM
to KanjiVG
Hi everyone,

I contacted Julien Quint years ago in the hope of getting the kanji
data released under a free license after the "AAAA" website went down.
Apparently it was difficult at that time, so I'm glad the kanji finally
made it to the public! Thank you for putting this together.

Incidentally, the data could prove to be useful for handwriting
recognition as well. They could be used either as training data for
learning algorithms or as test data. It would be just a matter of
sampling points from the lines and bézier curves.
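
To illustrate the sampling step, here is a small C sketch that evaluates a cubic Bezier curve at evenly spaced parameter values. The `Point` type and the function names are assumptions made for this example and do not come from any existing recognizer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y; } Point;

/* Evaluate the cubic Bezier curve with control points p[0..3] at
   parameter t, where t runs from 0 (start point) to 1 (end point). */
Point bezier_point(const Point p[4], double t)
{
    double u = 1.0 - t;
    double b0 = u * u * u;          /* Bernstein weights */
    double b1 = 3.0 * u * u * t;
    double b2 = 3.0 * u * t * t;
    double b3 = t * t * t;
    Point r;

    r.x = b0 * p[0].x + b1 * p[1].x + b2 * p[2].x + b3 * p[3].x;
    r.y = b0 * p[0].y + b1 * p[1].y + b2 * p[2].y + b3 * p[3].y;
    return r;
}

/* Fill out[0..n-1] with n points at evenly spaced parameter values,
   including both end points of the curve.  Requires n >= 2. */
void sample_bezier(const Point p[4], Point *out, size_t n)
{
    size_t i;

    assert(p != NULL && out != NULL && n >= 2);
    for (i = 0; i < n; i++)
        out[i] = bezier_point(p, (double) i / (double) (n - 1));
}
```

Evenly spaced values of t are not evenly spaced along the curve, so for recognition it may be worth resampling the resulting polyline by arc length afterwards.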

Mathieu

Ben Bullock

Jun 23, 2009, 7:31:57 PM
to kan...@googlegroups.com
2009/6/24 Mathieu Blondel <mblo...@gmail.com>:

> Incidentally, the data could prove to be useful for handwriting
> recognition as well. They could be used either as training data for
> learning algorithms or as test data.

The original reason I requested the data from Ulrich Apel was in order
to use it to test against a data set for "handwriting" recognition.
The current release of the data sprang from a discussion on Jim
Breen's dictionary mailing list. I'm also very glad that the data was
released.

> It would be just a matter of
> sampling points from the lines and bézier curves.

FYI, all the data in KanjiVG is cubic Bezier curves.

Ben Bullock

Jun 23, 2009, 10:32:11 PM
to kan...@googlegroups.com
2009/6/23 Enrique Saul Gonzalez <esa...@gmail.com>:

> BTW, you'll see the shapes themselves are pretty rough. This is
> because I'm currently using the data from the Tomoe project. I'd like
> to use the KanjiVG data instead, and for this purpose it would be
> great if I could take a look at the code for the online viewer. Would
> it be possible to obtain a copy?

I don't plan to release this source code at the moment. The basis of
the viewer is converting the SVG curves (or paths) from KanjiVG into
calls to the Cairo graphics library. The Cairo routines
"cairo_curve_to" and "cairo_rel_curve_to" correspond exactly to the
"C" and "c" curve information in KanjiVG. For the "S" and "s" curve
information you need to go back to the previously drawn curve and get
its second control point, since Cairo doesn't offer anything directly
equivalent. In my implementation I only use "cairo_curve_to", so I
convert all the c/s/S stuff. If you can read C (the computer
language), here is the algorithm:

>>>>>>>> Start C snippet

/* If the curves are relative to the current point ("c" or "s" in
   SVG path notation), add the value of the current point to make
   them absolute. */

if (curve_type == 'c' || curve_type == 's') {
    int i;
    for (i = 0; i < 3; i++) {
        c.pt[i].x += last_point.x;
        c.pt[i].y += last_point.y;
    }
    /* Switch to the absolute form of the same command. */
    curve_type = (curve_type == 'c') ? 'C' : 'S';
}

/* "S" leaves out the first control point. Fetch the second control
   point of the previously drawn curve and reflect it about the
   current point to get the missing one. */

if (curve_type == 'S') {
    point second_point;
    secondpoint (k, & second_point);
    c.pt[2] = c.pt[1];
    c.pt[1] = c.pt[0];
    c.pt[0].x = 2.0*last_point.x - second_point.x;
    c.pt[0].y = 2.0*last_point.y - second_point.y;
}

<<<<<<<<<<<< end snippet
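
The snippet depends on variables and a helper (`c`, `k`, `secondpoint`) that are not shown. For readers who want something they can compile on its own, here is a self-contained sketch of just the reflection step. The `Point` type and the name `smooth_to_cubic` are made up for this example and are not the names used in the viewer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y; } Point;

/* Expand an absolute smooth curve ("S x2 y2 x y") into the three points
   of the equivalent "C" command.
     current   - the current point (end of the previous curve)
     prev_ctl2 - second control point of the previous curve
     s         - the two points given by "S": s[0] = (x2, y2), s[1] = (x, y)
     c         - output: c[0] and c[1] are control points, c[2] the end point */
void smooth_to_cubic(Point current, Point prev_ctl2,
                     const Point s[2], Point c[3])
{
    assert(s != NULL && c != NULL);
    /* Reflect the previous second control point about the current point. */
    c[0].x = 2.0 * current.x - prev_ctl2.x;
    c[0].y = 2.0 * current.y - prev_ctl2.y;
    c[1] = s[0];
    c[2] = s[1];
}
```

The three output points can be passed straight to a "curve to" call. According to the SVG path specification, when the previous command is not a curve the first control point is simply the current point, which this sketch handles if the caller passes `current` as `prev_ctl2`.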

Enrique Saul Gonzalez

Jun 24, 2009, 1:20:13 AM
to kan...@googlegroups.com
Thank you for the quick reply.

On Tue, Jun 23, 2009 at 4:00 PM, Alexandre Courbot <gnu...@gmail.com> wrote:
>
> Actually KanjiVG does not impose any color scheme. It just gives you
> the stroke paths, so you can apply whatever shape or color that
> pleases you.
>

I was talking about the online viewer, so this would be mostly a
suggestion to Ben, I think.

> Looks interesting. Be careful as the licence for KanjiVG prohibits
> commercial use.

This is a purely academic development (my master's thesis, actually). Of
course I will give full credit and follow all licensing restrictions.
I can post more info on the project as it develops if that's okay.

> Ben posted the code of the viewer he used for
> http://kanjivg.lemoda.net/ . You can find it here:
>
> http://kanjivg.googlegroups.com/web/kanjivg.tar?gda=mJNv2T0AAACZ1HcjKkQOwo5IfNM_r-emiwmoZe8lhpcIAhudhKj761dp9oANqoIL0POiyte4AGLlNv--OykrTYJH3lVGu2Z5
>
> Anyway, parsing SVG paths is not very hard. You can find documentation
> about that here: http://www.w3.org/TR/SVG/paths.html#PathData

Thanks a lot for the references. They'll be quite useful. Same goes for Ben.

I'm developing in XNA (Visual C#). There doesn't seem to be native SVG
support, but hopefully the graphics libraries support bezier curves.

Thanks.
-- Enrique
