--
You received this message because you are subscribed to the "KanjiVG" group.
For options and unsubscribing, visit this group at
http://groups.google.com/group/kanjivg
> handwriting-zh_TW.xml (11853 characters)
> handwriting_zh_CN.xml (6763 characters)
handwriting_zh_CN.xml is the same as in Tomoe. handwriting_zh_TW.xml
has been constructed by Christoph Burgmer by aggregating components
from other characters (both Chinese and Japanese).
However, in both cases, these characters were designed to serve as
templates for handwriting recognition. If you try to render them,
you'll find out that they consist of straight lines and are pretty
ugly. Therefore, if your purpose is to use characters as a visual help
for learning, the Tomoe data is probably not the best, although it may
be useful to bootstrap a new project.
My 2 cents,
Mathieu
> Nevertheless, more than 3000 likely correct stroke orders with stroke
> types and components is not nothing.
> Maybe a good editor could help industrialize the production. A
> Java-based one would be a good idea since the recognizer could be called (I
> managed to call both HanziDict and zinnia).
> A wiki too.
If you want access to the wiki I would be glad to give you the password.
As for the editor, yes, this is definitely needed. I am working on one
(using Python/Qt), but unfortunately lack the time to put it to
completion right now. In addition, some rules must be set up for the
file naming of non-Japanese characters, and I am not knowledgeable
enough to make the right decisions for it. Also, in order to provide
font-quality rendering, I need to know how the stroke height should
vary according to the control points. Ulrich sent me some weight
variations a while ago and I played with it a little bit, but the
result is not quite as good as the samples he showed me. I can do the
programming and maintain the project, but I really need *directions*
for that.
So, basically some people with deep knowledge of sinographs are needed
for the project to advance. Ulrich is the man, but unfortunately he
seems busy with other things. So people, please don't be afraid -
anybody who wants to get involved and has the necessary knowledge will
have a free hand, and if we can bring that editor thing to completion the
project will already feel much better.
Alex.
> Ulrich is the man, but unfortunately he
> seems busy with other things.
I am very sorry for my long silence. I do try to follow what is going on in the mailing list, but yes, I am pretty busy at the moment. It should be better next month, when two deadlines are over and we have semester vacations. By the way, one deadline is connected to getting financing for KanjiVG, too.
It seems that colleagues from Buddhist studies in Japan have developed a system for the description of kanji variants. I will try to look into this in February. I am also planning to discuss KanjiVG with our colleagues from Sinology in Tuebingen.
Adapting the stroke order shouldn't be difficult, and neither should a rough version of simplified Chinese for existing characters. If our IT assistant Roger, who is a member of this mailing list too, has some spare time in February or March, I guess the two of us could get a first working version.
I used a Japanese schoolbook font as the model for KanjiVG. This won't be a solution for Chinese. Probably one will have to go directly to the esthetic rules of calligraphy.
For missing non-variant characters, a component analysis is necessary.
I have worked with students on missing Unicode characters, but have no results from them yet.
For better looking fonts with KanjiVG one could apply ideas from METAFONT: laying shapes over the paths, defining stroke ends, defining stroke weights and so on: ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/974/CS-TR-83-974.pdf or http://www.tug.org/TUGboat/Articles/tb05-2/tb10hobby.pdf. The paper is from 1983, but the approach stood a prototype. I wouldn't be astonished, if a combination with KanjiVG would lead to very nice results.
> Ulrich sent me some weight
> variations a while ago and I played with it a little bit, but the
> result is not quite as good as the samples he showed me. I can do the
> programming and maintain the project, but I really need *directions*
> for that.
Alex, could you make your preliminary results available somewhere? Then I will have a look at them.
Ulrich
> @Alex:
> What do you mean by : "In addition, some rules must be set up for the
> file naming of non-Japanese characters"
> I created the file naming conventions for the Commons Stroke Order Project. If I understand your requirement better, I may be able to help.
> Basically, the naming convention on the Commons SOP are:
> ROC              PRC                 Japan
> *-tbw.png (5)    *-bw.png (1,006)    *-jbw.png (52)
>
> * : the CJK character (unicode)
> - : a separator.
> t,,j : the country code. With t for Taiwan, [nothing] for China, j for Japan.
> bw : the image kind, here bw for 'Black and White diagrams'
> .png : the extension.
>
> Clarify your meaning, then I will move further.
Basically, something like this. The most important thing is to be able
to differentiate between the different versions of a character. The
current naming scheme of KanjiVG is:
xxxxx-Variant.svg
where the x's are the Unicode code point in hexadecimal, and -Variant is an
optional code indicating that the kanji is a variant, and of what.
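To make the current scheme concrete, here is a minimal Python sketch that splits a file name of the form xxxxx-Variant.svg into its code point and optional variant code. The function name `parse_kanjivg_name`, the `KanjiFile` tuple, and the exact regular expression are assumptions made for illustration; this is not code from KanjiVG itself, and the real files may follow stricter rules (e.g. a fixed number of hex digits).

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: hex code point, optional "-Variant" suffix, ".svg" extension.
_NAME_RE = re.compile(r"^([0-9A-Fa-f]+)(?:-([A-Za-z0-9]+))?\.svg$")


class KanjiFile(NamedTuple):
    char: str                # the character itself
    codepoint: int           # its Unicode code point
    variant: Optional[str]   # e.g. "Kaisho", or None for the base file


def parse_kanjivg_name(filename: str) -> KanjiFile:
    """Split 'xxxxx-Variant.svg' into character, code point and variant code."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a KanjiVG-style file name: {filename!r}")
    codepoint = int(match.group(1), 16)
    return KanjiFile(chr(codepoint), codepoint, match.group(2))


if __name__ == "__main__":
    print(parse_kanjivg_name("07d42-Kaisho.svg"))
    print(parse_kanjivg_name("099ac.svg"))
```

Note that the parser treats the variant code as an opaque string: it says nothing about what a suffix such as Kaisho or VtLst means, or what it is a variant of.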
This brings two questions that I think we should answer first (and put
down on the wiki for the record):
1) If there are variants, what are they variants of? I.e. what is the
reference version of the stroke? If KanjiVG turns international,
shouldn't it have its own -Variant code for consistency?
2) There are various suffixes as of now: Kaisho, Jinmei, JinmeiKaisho,
VtLst, HzFst, Vt4, ... It would greatly help if we could clearly
explain what they stand for (I have no idea) and again write it down
on the wiki (which I don't mind doing provided I get clear
explanations and e.g. supporting sources).
Which brings me to 3):
3) Since there seem to be many kinds of variants across countries,
wouldn't it make more sense to classify the variants according to
their characteristics (what I think the current variants are about)
instead of nationality? I mean, even in China there seem to be
several schools, so maybe we could cover the range of variants more
accurately this way.
Speaking without any sense to back up here - but I hope I can finally
make sense of these strange names soon. ;)
Alex.
> I am very sorry for my long silence. I do try to follow what is going on in the mailing list, but yes, I am pretty busy at the moment. It should be better next month, when two deadlines are over and we have semester vacations. By the way, one deadline is connected to getting financing for KanjiVG, too.
That would be fantastic - especially if that allows you to get back on
the ship! :)
> For better looking fonts with KanjiVG one could apply ideas from METAFONT: laying shapes over the paths, defining stroke ends, defining stroke weights and so on: ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/974/CS-TR-83-974.pdf or http://www.tug.org/TUGboat/Articles/tb05-2/tb10hobby.pdf. The paper is from 1983, but the approach stood a prototype. I wouldn't be astonished, if a combination with KanjiVG would lead to very nice results.
I will have a look at that.
>> Ulrich sent me some weight
>> variations a while ago and I played with it a little bit, but the
>> result is not quite as good as the samples he showed me. I can do the
>> programming and maintain the project, but I really need *directions*
>> for that.
>
>
> Alex, could you make your preliminary results available somewhere? Then I will have a look at them.
I have attached a couple screenshots. I used the stroke weight
variations you sent me some time ago (the ones in AppleScript) and an
algorithm that makes a linear progression along the length of the
path. owari-sample is the one you sent me and is the reference.
owari-rendered is the one rendered using my algorithm. If you look at
the first two strokes, you will notice that they get thinner, then
bigger before the curve. This is probably not desired and suggests my
linear progression algorithm is wrong. There is probably something to
be done with the path's control points, but somehow my brain seems to
be hermetic to AppleScript and I could not make any sense out of it
(hence my naive try). Any clue here would be very welcome and would
also help me establish some rules about the number and nature of
control points for the editor (i.e. when the user selects a stroke
type, a sample is inserted and only some parameters of it could be
changed).
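For readers who want to experiment, a linear progression of stroke weight along the length of a path could look roughly like the Python sketch below. It is not the algorithm used for the attached screenshots, which is not shown in this thread. It assumes the stroke has already been flattened into a polyline (real KanjiVG strokes are curves) and that the weights are evenly spaced from the start to the end of the stroke. The names `cumulative_lengths` and `linear_widths` are made up for this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cumulative_lengths(points: Sequence[Point]) -> List[float]:
    """Arc length from the first point to each point of a polyline."""
    lengths = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        lengths.append(lengths[-1] + math.hypot(x1 - x0, y1 - y0))
    return lengths


def linear_widths(points: Sequence[Point], weights: Sequence[float]) -> List[float]:
    """Stroke width at each point, interpolated linearly over arc length.

    Assumption: `weights` are evenly spaced along the stroke, first to last.
    """
    if len(points) < 2 or len(weights) < 2:
        raise ValueError("need at least two points and two weights")
    lengths = cumulative_lengths(points)
    total = lengths[-1]
    if total == 0:
        raise ValueError("degenerate path: all points coincide")
    widths = []
    for s in lengths:
        # Map arc length to a position between 0 and len(weights) - 1.
        t = s / total * (len(weights) - 1)
        i = min(int(t), len(weights) - 2)   # index of the left-hand weight
        frac = t - i                        # position between weights i and i+1
        widths.append(weights[i] * (1 - frac) + weights[i + 1] * frac)
    return widths


if __name__ == "__main__":
    stroke = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
    print(linear_widths(stroke, [4.0, 2.0, 4.0]))
```

With this kind of interpolation the width depends only on the distance travelled along the path, so it ignores where the curve actually bends. Tying the weights to the control points instead would need a different parameterisation.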
Alex.
> Yes, you should describe the variant inside the XML file so that a
> file can be used accurately for several locales.
>
> The variant XML tag should link to locales.
>
> <variants>
> <variant locale="ja_JP" style="Kaisho"/>
> <variant locale="zh_TW" style="Kaishu" source="bishuen"/>
> <variant locale="zh_CN" style="Kaishu" source="kangxi"/>
> </variants>
>
> Why locale? Because it is ISO, so that everybody knows...
> Moreover, in XML, you can make up your mind and change the format
> whenever you want to extend it.
So do you mean that the variant tag would link to another file? Or
that all the variants would be described (strokes + structure) in the
same file?
For the next version of KanjiVG format, I aim at having XML and SVG
files merged into a single file (this is already possible for
non-variants) that would be SVG-compliant. Therefore I'd rather be in
favour of having a single variant per file + an easy-to-understand naming
system. The style and source tags could also be added to the SVG file.
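As an illustration of the locale-keyed metadata proposed in the quoted message, the Python sketch below reads a variants element and indexes its entries by locale. The sample is the fragment from the quoted message, with the opening tag spelled `<variants>` so that it is well-formed. The function name `variants_by_locale` is an assumption, and nothing here is part of the actual KanjiVG format, which is still under discussion in this thread.

```python
import xml.etree.ElementTree as ET

# Fragment proposed in the thread (opening tag spelled so it matches the closing one).
SAMPLE = """<variants>
  <variant locale="ja_JP" style="Kaisho"/>
  <variant locale="zh_TW" style="Kaishu" source="bishuen"/>
  <variant locale="zh_CN" style="Kaishu" source="kangxi"/>
</variants>"""


def variants_by_locale(xml_text: str) -> dict:
    """Map each locale to the remaining attributes of its <variant> element."""
    root = ET.fromstring(xml_text)
    result = {}
    for node in root.findall("variant"):
        attrs = dict(node.attrib)
        locale = attrs.pop("locale")   # raises KeyError if the attribute is missing
        result[locale] = attrs
    return result


if __name__ == "__main__":
    for locale, attrs in variants_by_locale(SAMPLE).items():
        print(locale, attrs)
```

The same attributes (style, source) could just as well sit on a single element inside a one-variant-per-file SVG; the lookup code would then shrink to reading one element.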
> 9a6c-zh_CN.xml (simplified for 99ac chinese simplified only)
>
> The debate about the suffix is therefore less important than the
> variant information that should in any case reside in the XML...
Indeed, but we would still need to know how to rename the large number
of existing files, and for this we need a match between the current
naming scheme and an eventual new one.
Alex.