Performance isn't an issue, but I'm a bit of a premature optimiser anyway... so knowing the fastest solution would let me sleep easy!
As for what I'm trying to achieve: it's the automatic creation of full Unicode support for the Sphinx full-text search engine, basically mapping and normalising characters and creating charset tables for supplied Unicode block ranges.
I've just pushed the code if anyone's interested (sorry, it's in a rough state at the moment):
https://github.com/Mutatio/sphinx-character-map/blob/master/characterMap.go
CJK is still pending (but support is easy enough to add); it's the normalisation of accented Latin / Greek / other scripts that I want to master first. Also, all the other command-line features are missing; it's purely at the prototyping stage.
Cheers,
- Martin