[Qhanzi] Multiradical for all hanzi


Ben Bullock

Nov 24, 2020, 1:11:43 AM
to sljfaq.org
I've now updated the multiradical server so that it covers all the hanzi in Unicode's Basic Multilingual Plane (plane 0), the 28,000 hanzi set, and not just the 6,000 hanzi GB set that the current version handles. This is what it now looks like:

multirad-all-unicode.png

Before doing the upgrade to the new version, I thought I should check how many people are using the current thing:


mr-qhanzi-users-ga.png

About 100 people a day seem to be using the current multiradical lookup, so suddenly switching to a new version might be a bit drastic. I'm planning to make a "new version" announcement and put the new version alongside the old version for a time, then see what happens. If necessary I can keep the GB-only version going in some form or another.


Ben Bullock

Nov 24, 2020, 8:22:25 PM
to sljfaq.org
On Tue, 24 Nov 2020 at 15:11, Ben Bullock <benkasmi...@gmail.com> wrote:
I've now updated the multiradical server so that it covers all the hanzi in Unicode's Basic Multilingual Plane (plane 0), the 28,000 hanzi set, and not just the 6,000 hanzi GB set that the current version handles.

I've done some modification of the output using CSS grids so that the results appear on the right of the selector if there is enough screen space:

mr-24-nov-2020.png

I also removed the old scrolling mechanism I was using, but it turns out to have had unanticipated benefits. With the new scrolling shown above, a very large set of results becomes extremely slow to display: the web server responds in a fraction of a second, but the display of the results is very slow. I have therefore truncated the results returned to a maximum of 400 in total.

Another problem can be seen at the top left of the results: some of the Unicode characters don't have any representation in common fonts. Looking at Wiktionary, the image they use is taken from a Google web font. The reason it is sorted into that box is that it doesn't have stroke count information in Unihan. It is this thing:


Wiktionary gives it 29 strokes, so I might have to scrape the stroke count from Wiktionary. The online version of the Unihan database doesn't seem to be maintained:


It says "The Unicode Standard (Version 3.2)"! And Unicode is now on 13.0.0. Too busy adding emojis to bother with boring old Unihan?

I'll probably post another progress report on this thing later today or tomorrow.
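
The 400-result cap described above could be sketched roughly as follows. This is only a minimal illustration in Python, not the server's actual code: the function name `truncate_results` and the sample result list are made up for the example, and the only detail taken from the post is the limit of 400.

```python
MAX_RESULTS = 400  # the cap mentioned in the post


def truncate_results(results, limit=MAX_RESULTS):
    """Return at most `limit` results, plus a flag saying whether any were dropped."""
    truncated = len(results) > limit
    return results[:limit], truncated


# Hypothetical oversized result list: a run of code points from the main CJK block.
all_matches = [chr(cp) for cp in range(0x4E00, 0x9FA6)]
shown, truncated = truncate_results(all_matches)
print(len(shown), truncated)
```

Returning the flag alongside the shortened list would let the page tell the user that more matches exist, so they know to add another radical to narrow the search.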




 

Ben Bullock

Nov 24, 2020, 8:57:35 PM
to sljfaq.org


On Wed, 25 Nov 2020 at 10:22, Ben Bullock <benkasmi...@gmail.com> wrote:
... Another problem can be seen at the top left of the results: some of the Unicode characters don't have any representation in common fonts. Looking at Wiktionary, the image they use is taken from a Google web font. The reason it is sorted into that box is that it doesn't have stroke count information in Unihan.

On closer inspection, it seems these characters do have stroke information in Unihan: the second part of the kRSUnicode field gives the number of strokes in addition to the radical. I've been able to sort them using that, but they don't display in my browser yet.
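
For reference, here is a rough Python sketch of pulling that number out of a kRSUnicode value. It assumes the usual tab-separated layout of the Unihan data files (code point, field name, value) and the "radical.strokes" form of the value, where the radical number may be followed by apostrophes and the second part counts the strokes beyond the radical rather than the total. The function names and the sample line are illustrative only, not taken from the multiradical server.

```python
import re

# Radical number, optional apostrophes (variant radical form), ".", residual strokes.
RS_PATTERN = re.compile(r"^(\d{1,3})('*)\.(-?\d{1,2})$")


def parse_krsunicode(value):
    """Parse the first value of a kRSUnicode field into (radical, residual_strokes)."""
    first = value.split()[0]  # the field may hold several space-separated values
    match = RS_PATTERN.match(first)
    if match is None:
        raise ValueError(f"unrecognised kRSUnicode value: {value!r}")
    return int(match.group(1)), int(match.group(3))


def parse_unihan_line(line):
    """Return (character, radical, residual) for a kRSUnicode line, else None."""
    if line.startswith("#") or not line.strip():
        return None  # comment or blank line
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3 or fields[1] != "kRSUnicode":
        return None  # some other Unihan field
    codepoint = int(fields[0][2:], 16)  # strip the leading "U+"
    radical, residual = parse_krsunicode(fields[2])
    return chr(codepoint), radical, residual


# Illustrative input line in the assumed format.
sample = "U+6797\tkRSUnicode\t75.4"
print(parse_unihan_line(sample))
```

Because the second number excludes the radical's own strokes, sorting on it groups characters by residual strokes; getting a total stroke count would need the radical's stroke count added, or a different source.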

 

Ben Bullock

Nov 25, 2020, 6:54:18 AM
to sljfaq.org
I'm trying out CSS grids for a responsive design for the multiradical page. At the moment it looks like this (iPhone):

mr-iphone-25-11-2020.png

On a narrow screen it looks like this:

mr-narrow-25-11-2020.png

On a wider screen it looks like this:

mr-wide-25-11-2020.png

I was worried that users would find the buttons difficult to locate, but in practice it hasn't been a problem for me.


Ben Bullock

Dec 2, 2020, 2:38:27 AM
to sljfaq.org
The new "all hanzi multiradical" is now on the web for testing:

https://www.qhanzi.com/new/mr.html
