[kanji] New search method by shape


Ben Bullock

Mar 6, 2022, 7:26:54 PM
to sljfaq.org
I'm going to introduce a new "by shape only" search method to kanji.sljfaq.org, based on the kana lookup I posted previously. This new method has the advantage that it can look up things like the swastika symbol 卍:

swastika.png

or the 亞 symbol:

asia.png

without knowing the stroke order or even dividing the input into "strokes". It has a lot of disadvantages as well, in that it's quite poor at distinguishing characters when they start to get complicated or distorted. 

The orange circles in the above illustrations, and the scores, will not be present in the kanji.sljfaq.org version.
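
For readers wondering how a lookup can work without strokes at all, here is a minimal sketch, assuming a simple grid-overlap comparison. It is not the code used on kanji.sljfaq.org, whose method is not described in this thread. The names `segment`, `rasterize`, `shape_score` and `best_matches`, the 8x8 grid, and the Jaccard overlap score are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def segment(start: Point, end: Point, samples: int = 20) -> List[Point]:
    """Sample evenly spaced points along a straight pen movement."""
    (x0, y0), (x1, y1) = start, end
    return [
        (x0 + (x1 - x0) * i / (samples - 1), y0 + (y1 - y0) * i / (samples - 1))
        for i in range(samples)
    ]


def rasterize(points: List[Point], grid: int = 8) -> Set[Cell]:
    """Scale the points to fill a grid x grid box and return the occupied cells.

    Only the set of points matters here: stroke order, stroke direction and
    stroke boundaries are all thrown away (hypothetical design, see lead-in).
    """
    if not points:
        return set()
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    # Fall back to 1.0 so a perfectly flat drawing does not divide by zero.
    width = (max(xs) - min_x) or 1.0
    height = (max(ys) - min_y) or 1.0
    cells: Set[Cell] = set()
    for x, y in points:
        col = min(int((x - min_x) / width * grid), grid - 1)
        row = min(int((y - min_y) / height * grid), grid - 1)
        cells.add((row, col))
    return cells


def shape_score(a: Set[Cell], b: Set[Cell]) -> float:
    """Jaccard overlap of two cell sets: shared cells divided by all cells."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_matches(
    drawing: List[Point], templates: Dict[str, List[Point]], grid: int = 8
) -> List[Tuple[float, str]]:
    """Return (score, name) pairs for every template, best score first."""
    drawn = rasterize(drawing, grid)
    scored = [
        (shape_score(drawn, rasterize(points, grid)), name)
        for name, points in templates.items()
    ]
    return sorted(scored, reverse=True)


# Two toy templates: a cross and a square outline.
TEMPLATES: Dict[str, List[Point]] = {
    "十": segment((0, 5), (10, 5)) + segment((5, 0), (5, 10)),
    "口": (
        segment((0, 0), (10, 0))
        + segment((10, 0), (10, 10))
        + segment((10, 10), (0, 10))
        + segment((0, 10), (0, 0))
    ),
}

# A cross drawn elsewhere on the canvas, larger, with the strokes in the
# opposite order and direction to the template.
drawing = segment((105, 120), (105, 100)) + segment((115, 110), (95, 110))
ranked = best_matches(drawing, TEMPLATES)  # ranked[0] is the best-scoring template
```

A comparison this coarse probably also shows the weakness mentioned above: once characters get complicated or distorted, many of them fill nearly the same cells, so their scores bunch together.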

I think what I'll do is to have a "radio button" to select between search methods, rather than the "on/off".

The software is basically ready to go, but installing it on kanji.sljfaq.org requires some server downtime, so I'm thinking about doing it on Sunday morning, 13 March.

As usual, this does depend on how many things I have to deal with other than web sites.


Ben Bullock

Mar 12, 2022, 3:00:54 AM
to sljfaq.org
On Mon, 7 Mar 2022 at 09:26, Ben Bullock <benkasmi...@gmail.com> wrote:

> The software is basically already ready to go, but installing it on the kanji.sljfaq.org requires some server down time, so I'm thinking about doing it on Sunday morning, 13 March.

I'm going to put this off until next week, 20th March around 6 am GMT.

The shape match stuff is now working OK, but I'm going to take advantage of the server downtime to switch the stroke-order-independent matching over to the system used at qhanzi.com at the same time. The switch of server type required to get the shape matching working is the same switch required to use the qhanzi system.

Hopefully at some point I will be able to merge the shape matching, the stroke-order-independent matching, and the stroke-order-dependent matching into one thing. For the time being, the stroke-order-dependent matching will stay as the CGI thing, as long as that goes on functioning with the new server type, and there will be two separate server processes for the stroke-order-independent and the shape-only matching.


Ben Bullock

Mar 13, 2022, 12:26:34 AM
to sljfaq.org
On Sat, 12 Mar 2022 at 17:00, Ben Bullock <benkasmi...@gmail.com> wrote:

> I'm going to put this off until next week, 20th March around 6 am GMT.

The old Kudo lookup system for kanji.sljfaq.org is well beyond its sell-by date, so I’m also going to retire it at the same time as these other upgrades. The default input will then become the Handwritten (HTML5) input, which uses the HTML canvas element driven by JavaScript. Canvas is now very standard, so I don't think the old Kudo input method is useful any more. The canvas links will become redirects.

I may have mentioned before but Kudo's system itself is long gone from the web:


I don't know what he did with the data he'd collected. Anyway, I now have terabytes of unlabelled kanji input data.

Ben Bullock

Mar 17, 2022, 1:44:41 AM
to sljfaq.org
A short progress report on this upgrade:

I've changed my mind about removing the old drawing method; I think I'll keep it as an option.

The basic work of changing the stroke-order-independent lookup to the qhanzi method is done. There is no deployment script to install it on the web site yet, though.

The shape search work is done.

A lot of the front-end work on JavaScript is done, but I'm taking this opportunity to try to revamp the site as much as possible.

It might be ready to go on Sunday morning, but I'm not too sure about that, so it might slip to the following Sunday. I might turn on just the persistent process option on the web site on Sunday, March 20, 2022, to see if it will be able to cope, and also install the new stroke-order-independent lookup if it seems OK.

The announcement page English version is basically done, and I should translate it to Japanese as well.

Unfortunately I'd neglected the kanji.sljfaq.org web site for a long time, because I was worried that I'd break something, but it is now definitely time to overhaul it.


Paul Zecher

Mar 17, 2022, 1:45:00 AM
to sljf...@googlegroups.com
Hi Ben,

This is Paul (or Kanjiguy) writing from Fort Collins, Colorado.

Thank you for the update. I was actually going to test myself pretty soon, and I was going to do this by writing all the kanji down and using your “platform” to see how many I can recall on my own. And for proof make an image for it. Last I was there I noted the image function, which I hadn’t used before, and it has a nice color for the strokes. I’m not sure it accomplishes much, but my memory is not so great. It never was, but… we’ll see how I do!

Thanks again. And good luck and all that jazz.

Paul

Ben Bullock

Mar 17, 2022, 1:48:22 AM
to sljfaq.org
On Thursday, 17 March 2022 at 14:45:00 UTC+9 Paul Zecher wrote:

> Hi Ben,
>
> This is Paul (or Kanjiguy) writing from Fort Collins, Colorado

Hi Paul, I apologise for holding your message for so long. I don't think I received a notification from Google about it. Perhaps I have done something wrong in my email setup; I'll check that. Anyway, sorry about that.

> Thank you for the update. I was actually going to test myself pretty soon and I was going to do this by writing all the kanji down and using your “platform” to see how many I can recall on my own. And for proof make an image for it. Last I was there I noted the image function, which I hadn’t used before, and it has a nice color for the strokes. I’m not sure it accomplishes much, but my memory is not so great. It never was, but… we’ll see how I do!


The colours on the image should be the same as the colours on the "draw-canvas.html" page here:


This is going to become the default page, hopefully within March.

You can turn off the colours here, by unchecking "Use multicoloured strokes":

 
Thanks for the feedback, it is useful for me.

Ben Bullock

Mar 17, 2022, 1:55:29 AM
to sljfaq.org
On Thursday, 17 March 2022 at 14:48:22 UTC+9 Ben Bullock wrote:

> Hi Paul, I apologise for holding your message for so long. I did not receive a notification from Google about it I think, perhaps I have done something wrong in my email setup, I'll check that. Anyway sorry about that.


On checking the email, I didn't have any filters for those messages, but it seems that Google thinks that its own Google Groups notification messages and other messages sent to this list are spam, because they are "similar to other messages identified as spam in the past", which is odd. I definitely have not marked those as spam. If anyone here thinks this email list is spam, do feel free to unsubscribe. Thanks! Or could it be that other people's groups or Google Groups in general contain too much spam? This is a little frustrating to deal with. In my opinion Google's anti-spam should not be doing this: Google can control what is sent from its own groups, so if people genuinely are using them for spam, Google should just shut those groups down. Anyway.

Ben Bullock

Mar 17, 2022, 1:57:24 AM
to sljfaq.org
Checking further, I found a whole conversation from the jmdict mailing list marked as spam too, so evidently Google has some kind of autoimmune disease where the antibodies are attacking the host organism.

Jim Breen

Mar 17, 2022, 2:05:34 AM
to sljf...@googlegroups.com
On Thu, 17 Mar 2022 at 16:57, Ben Bullock <benkasmi...@gmail.com> wrote:

> Checking further I found a whole conversation from the jmdict mailing list marked as spam too, so evidently Google has some kind of autoimmune disease where the antibodies are attacking the host organism.

Fortunately this is easily overcome in Gmail, once you have detected
it. Simply set a "never send to spam" filter for the selected "From"
address. I've had to do that in a few instances.

Jim

--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/
http://nihongo.monash.edu/

Ben Bullock

Mar 19, 2022, 9:02:44 AM
to sljfaq.org
Another progress report.

There is no possibility at all that this will be ready by tomorrow. The kanji.sljfaq.org site was not in very good shape, so I've spent a lot of time working on just the HTML templates, the JavaScript, and the CSS. As a result, I haven't done any of the necessary work on the server code, which needs to be in place before the new system starts running. The front-end stuff is now in much better shape than it has been for several years.

Obviously I will make an announcement later on. I'm not 100% sure I can get this all done by the end of March.

I'm not sure why I had let kanji.sljfaq.org get into such a bad state, with inconsistencies and so forth, and some pages in a complete mess.

Ben Bullock

Mar 19, 2022, 9:55:50 PM
to sljfaq.org
Here is the announcement page for the upgraded kanji.sljfaq.org:


(This is a sneak preview, manually uploaded since the HTML upgrades are not in place yet, and some of the pages linked from there have not been uploaded yet, so sorry but the links don't all work.)
