New issue 398: Support logograms (like CJK characters) as account names.
https://bitbucket.org/blais/beancount/issues/398/support-logograms-like-cjk-characters-as
Alex Hou:
According to the Beancount language syntax documentation, account names should follow this rule:
> Each component of the account names begin with a capital letter or a number and are followed by letters, numbers or dash \(-\) characters. All other characters are disallowed.
But logograms and other caseless scripts \(Chinese characters, Japanese hiragana and katakana, the Korean alphabet, etc.\) have no capital letters.
Please support these characters as account names.
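To illustrate the problem, here is a minimal sketch of the documented rule written as an ASCII-only regular expression. The pattern and the `is_valid_component` helper are assumptions made for illustration; they are not taken from Beancount's actual lexer.

```python
import re

# Hypothetical pattern mirroring the documented rule (ASCII only).
# This is an illustration, not Beancount's real implementation.
COMPONENT = re.compile(r'[A-Z0-9][A-Za-z0-9-]*')


def is_valid_component(name: str) -> bool:
    """Return True if `name` satisfies the documented ASCII-only rule."""
    return COMPONENT.fullmatch(name) is not None


print(is_valid_component('Cash'))  # True: starts with a capital letter
print(is_valid_component('现金'))   # False: CJK characters have no capital form
```

Under a rule like this, no account component written purely in CJK characters can ever match.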
[http://www.programminginkorean.com/programming/hangul-in-unicode/](http://www.programminginkorean.com/programming/hangul-in-unicode/)
> Hangul Syllables \(AC00–D7A3\)
> Hangul Jamo \(1100–11FF\)
> Hangul Compatibility Jamo \(3130–318F\)
> Hangul Jamo Extended-A \(A960–A97F\)
> Hangul Jamo Extended-B \(D7B0–D7FF\)
[https://stackoverflow.com/a/30200250/4458143](https://stackoverflow.com/a/30200250/4458143)
[http://www.rikai.com/library/kanjitables/kanji\_codes.unicode.shtml](http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml)
> Hiragana \(3040 - 309F\)
> Katakana \(30A0 - 30FF\)
[https://en.wikipedia.org/wiki/Halfwidth\_and\_Fullwidth\_Forms\_\(Unicode\_block\)](https://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms_(Unicode_block))
> halfwidth forms of compatibility jamo characters for Hangul \(FFA0–FFDC\)
> half-width katakana \( FF65–FF9F \)
[https://stackoverflow.com/a/1366113/4458143](https://stackoverflow.com/a/1366113/4458143)
[https://en.wikipedia.org/wiki/CJK\_Unified\_Ideographs](https://en.wikipedia.org/wiki/CJK_Unified_Ideographs)
```
Block                                    Range        Comment
CJK Unified Ideographs                   4E00-9FFF    Common
CJK Unified Ideographs Extension A       3400-4DBF    Rare
CJK Unified Ideographs Extension B       20000-2A6DF  Rare, historic
CJK Unified Ideographs Extension C       2A700-2B73F  Rare, historic
CJK Unified Ideographs Extension D       2B740-2B81F  Uncommon, some in current use
CJK Unified Ideographs Extension E       2B820-2CEAF  Rare, historic
CJK Compatibility Ideographs             F900-FAFF    Duplicates, unifiable variants, corporate characters
CJK Compatibility Ideographs Supplement  2F800-2FA1F  Unifiable variants
```
I’ve organized these Unicode ranges:
```python
r'\u1100-\u11FF' # Hangul Jamo
r'\u3040-\u309F' # Hiragana
r'\u30A0-\u30FF' # Katakana
r'\u3130-\u318F' # Hangul Compatibility Jamo
r'\u3400-\u4DBF' # CJK Unified Ideographs Extension A
r'\u4E00-\u9FFF' # CJK Unified Ideographs
r'\uA960-\uA97F' # Hangul Jamo Extended-A
r'\uAC00-\uD7A3' # Hangul Syllables
r'\uD7B0-\uD7FF' # Hangul Jamo Extended-B
r'\uF900-\uFAFF' # CJK Compatibility Ideographs
r'\uFF65-\uFF9F' # half-width katakana
r'\uFFA0-\uFFDC' # halfwidth forms of compatibility jamo characters for Hangul
# Code points above U+FFFF need the 8-digit \UXXXXXXXX escape;
# \u consumes exactly four hex digits.
r'\U00020000-\U0002A6DF' # CJK Unified Ideographs Extension B
r'\U0002A700-\U0002B73F' # CJK Unified Ideographs Extension C
r'\U0002B740-\U0002B81F' # CJK Unified Ideographs Extension D
r'\U0002B820-\U0002CEAF' # CJK Unified Ideographs Extension E
r'\U0002F800-\U0002FA1F' # CJK Compatibility Ideographs Supplement
```
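As a rough sketch of how these ranges might be used, the snippet below joins them into one character class and allows them wherever the documented rule allows a capital letter, a digit, or a following letter. The names `CJK_RANGES`, `COMPONENT` and `is_valid_component` are hypothetical, and the pattern is an assumption about how the rule could be extended, not Beancount's actual lexer. The supplementary-plane ranges are written with the 8-digit `\U` escape because `\u` takes exactly four hex digits.

```python
import re

# Ranges from the list above, concatenated for use inside a character class.
CJK_RANGES = (
    r'\u1100-\u11FF'          # Hangul Jamo
    r'\u3040-\u309F'          # Hiragana
    r'\u30A0-\u30FF'          # Katakana
    r'\u3130-\u318F'          # Hangul Compatibility Jamo
    r'\u3400-\u4DBF'          # CJK Unified Ideographs Extension A
    r'\u4E00-\u9FFF'          # CJK Unified Ideographs
    r'\uA960-\uA97F'          # Hangul Jamo Extended-A
    r'\uAC00-\uD7A3'          # Hangul Syllables
    r'\uD7B0-\uD7FF'          # Hangul Jamo Extended-B
    r'\uF900-\uFAFF'          # CJK Compatibility Ideographs
    r'\uFF65-\uFF9F'          # Half-width katakana
    r'\uFFA0-\uFFDC'          # Half-width Hangul compatibility jamo
    r'\U00020000-\U0002A6DF'  # CJK Unified Ideographs Extension B
    r'\U0002A700-\U0002B73F'  # CJK Unified Ideographs Extension C
    r'\U0002B740-\U0002B81F'  # CJK Unified Ideographs Extension D
    r'\U0002B820-\U0002CEAF'  # CJK Unified Ideographs Extension E
    r'\U0002F800-\U0002FA1F'  # CJK Compatibility Ideographs Supplement
)

# Hypothetical extended rule: the first character may be a capital letter,
# a digit, or any character in CJK_RANGES; later characters may also be
# lowercase letters or a dash (the trailing '-' in the class is literal).
COMPONENT = re.compile(
    '[A-Z0-9' + CJK_RANGES + '][A-Za-z0-9' + CJK_RANGES + '-]*'
)


def is_valid_component(name: str) -> bool:
    """Return True if `name` matches the hypothetical extended rule."""
    return COMPONENT.fullmatch(name) is not None


for candidate in ['Cash', '现金', 'こんにちは', '한글', 'cash']:
    print(candidate, is_valid_component(candidate))
```

With this pattern, names such as `现金` are accepted, while a lowercase ASCII start such as `cash` is still rejected, so the existing rule for Latin names is unchanged.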