Hi, I'm back. I went a little crazy with the CNS data, laying the foundation for what will become a discussion of Taiwan's educational character sets and how they relate to fonts and publishing. There are things I don't understand about variation selectors, compatibility ideographs, and how fonts work, so that discussion will have to wait. The pages for CNS 11643 Planes 1-7 are finished, though, at least for the 99.95% of it that is in Unicode.
These are not yet linked from the main site, as I haven't yet decided how to approach it. For the new site, really what I want to do is provide the four basic lists from Taiwan in their entirety:
常用國字標準字體表 (4,808 hanzi) common
次常用國字標準字體表 (6,341 hanzi) less common
罕用字體表 (18,480 hanzi) rare
異體國字字表 (18,609 hanzi) variants
These date to 1982-1984, and I'm not sure whether or how they have been updated since (all of these are in CNS 11643 Planes 1-7 and Unicode). Handling the actual data is trivial once you learn a little bit of Ruby (in my case) or any other language with good regular-expression support. Ken Lunde provides the first two lists, but I haven't yet tried to find the data to generate the other two.
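To give a sense of how little code that takes, here is a minimal Ruby sketch. It assumes a tab-separated mapping file with one `plane-code<TAB>Unicode hex` pair per line; that layout, the three sample lines, and the `parse_cns_line` name are all assumptions for illustration, not a description of any particular data file.

```ruby
# Hypothetical sample data: "plane-code<TAB>Unicode hex", one pair per line.
SAMPLE = <<~DATA
  1-4421\t4E00
  1-4422\t4E59
  3-272A\t2F98F
DATA

# Plane number, four hex digits of CNS code, a tab, then 4-6 hex digits of Unicode.
LINE_PATTERN = /\A(\d+)-(\h{4})\t(\h{4,6})\z/

# Returns a hash for a well-formed line, or nil for anything else.
def parse_cns_line(line)
  m = LINE_PATTERN.match(line.chomp)
  return nil unless m

  {
    plane: m[1].to_i,
    cns:   m[2].upcase,
    char:  m[3].hex.chr(Encoding::UTF_8) # code point -> actual character
  }
end

entries = SAMPLE.each_line.map { |line| parse_cns_line(line) }.compact

# Print just the Plane 1 characters as one string.
puts entries.select { |e| e[:plane] == 1 }.map { |e| e[:char] }.join
```

From there, building a page per plane is mostly a matter of grouping the entries by `:plane` and writing them out.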
NOTE: If you are on macOS Sierra or High Sierra, you'll see a blank glyph at CNS T3-272A (U+2F98F) on Plane 3 -- this is due to the Hiragino Sans CNS font, which is the default for that code point but doesn't actually have a glyph for it. There are a lot of blank glyphs in that font from the CJK Compatibility Ideographs Supplement, but I think only one of them is on the first seven planes of CNS.
Curiously, Baoli TC does have a glyph for it in Sierra/High Sierra, but it gets bumped by the Hiragino Sans glyph, which macOS doesn't know is blank.
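One way to check that guess would be to filter the Plane 1-7 mappings by Unicode block. The sketch below is hypothetical: the CJK Compatibility Ideographs Supplement range (U+2F800 to U+2FA1F) comes from the Unicode block list, but `MAPPINGS` holds only a few sample entries rather than the real data, and `compat_supplement?` is a helper name made up here. It also says nothing about which glyphs a given font actually has; it only narrows down the code points worth looking at.

```ruby
# Unicode block: CJK Compatibility Ideographs Supplement.
COMPAT_SUPPLEMENT = (0x2F800..0x2FA1F)

# True if the code point falls inside that block.
def compat_supplement?(codepoint)
  COMPAT_SUPPLEMENT.cover?(codepoint)
end

# Hypothetical sample of CNS -> Unicode mappings (not the full data set).
MAPPINGS = {
  "1-4421" => 0x4E00,
  "1-4422" => 0x4E59,
  "3-272A" => 0x2F98F
}

# Keep only the entries whose Unicode value is in the supplement block.
suspects = MAPPINGS.select { |_cns, cp| compat_supplement?(cp) }
suspects.each { |cns, cp| puts format("CNS %s -> U+%04X", cns, cp) }
```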
CNS Planes 10-14 are more problematic. They are probably best approached from the perspective of Unicode's source data, rather than CNS. Plane 15 has CNS characters not yet in Unicode. This includes a steady flow of new submissions used in names and places in Taiwan. A lot of these have to do with Hakka, Southern Min, and other languages. Maybe fun if you know them or need them for your research, but outside of my wheelhouse...
ER