Problem displaying Chinese characters in terminal

matthieu...@gmail.com

Jul 23, 2012, 12:40:47 AM
to cjklib...@googlegroups.com
Hi,
I just installed the latest version of cjklib on my Linux machine (Ubuntu 12.04 / Python 2.7.3).

After following the instructions to install, I tried the simple examples, but instead of the expected results, I see this:
>>> from cjklib import characterlookup
>>> cjk = characterlookup.CharacterLookup('C')
>>> cjk.getStrokeOrder(u'说')
[u'\u31d4', u'\u31ca', u'\u31d4', u'\u31d2', u'\u31d1', u'\u31d5', u'\u31d0', u'\u31d3', u'\u31df']

I searched various forums, Stack Overflow, and the like for a way to see the symbols directly, but no luck so far.

I am able to do this:
>>> def f (x): print(x)
... 
>>> map(f, cjk.getStrokeOrder(u'说'))
[None, None, None, None, None, None, None, None, None]
>>> 

But this is not ideal...
Am I missing something that would let me see the actual characters directly instead of those Unicode escape codes?

Thanks for the help.

Matthieu

Christoph Burgmer

Jul 23, 2012, 4:16:05 PM
to cjklib...@googlegroups.com, matthieu...@gmail.com
Hi Matthieu,

The interactive interpreter shows the repr() of a result, and in Python 2 the repr() of a unicode string escapes non-ASCII characters as their hexadecimal Unicode code points (\uXXXX). print, on the other hand, writes the characters themselves.

To be fair, I do think that makes sense: when working on the console, certain characters can be difficult to tell apart.

If you want to print those characters to stdout, your approach works fine. Here's an alternative:

>>> print ' '.join([u'\u31d4', u'\u31ca', u'\u31d4', u'\u31d2', u'\u31d1', u'\u31d5', u'\u31d0', u'\u31d3', u'\u31df'])
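
A minimal sketch of the difference between the two kinds of output, written for Python 3 rather than the Python 2.7 used in this thread (an assumption, so that the snippet runs on current interpreters). The built-in ascii() stands in for the Python 2 repr() of a unicode string, and the helper names escaped_view and readable_view are hypothetical, not part of cjklib:

```python
# Stroke list copied from the getStrokeOrder() output earlier in the thread.
strokes = [u'\u31d4', u'\u31ca', u'\u31d4', u'\u31d2', u'\u31d1',
           u'\u31d5', u'\u31d0', u'\u31d3', u'\u31df']


def escaped_view(items):
    # ascii() escapes every non-ASCII character, much like repr() did for
    # unicode strings in Python 2 (minus the u'' prefix on each item).
    return ascii(items)


def readable_view(items, sep=u' '):
    # Joining gives one plain string; passing it to print() writes the
    # real characters instead of their escaped form.
    return sep.join(items)


escaped = escaped_view(strokes)    # a string of escaped code points
readable = readable_view(strokes)  # a string of the strokes themselves
```

Calling print(readable) then shows the strokes themselves, provided the terminal's encoding can represent them. A plain for loop over strokes with print() inside would give one stroke per line without the list of None values that map() returns.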

Does that help?

-Christoph

matthieu...@gmail.com

Jul 24, 2012, 12:36:26 AM
to cjklib...@googlegroups.com
That works perfectly... and I agree, it makes sense. I just thought I was missing something, since I was typing in the example and not getting the same result.
Thanks for the tip, and good job with this library. I am just getting familiar with it, but it looks very impressive.
Danke,
Matthieu

Christoph Burgmer

Jul 24, 2012, 11:05:55 AM
to cjklib...@googlegroups.com, matthieu...@gmail.com
Oh right, now I think I understand where you are coming from. I believe you compared this to the documentation?

The funny thing is that if you copy the Python output 1:1 into the documentation, the documentation engine will then decode the escapes and render the real characters when generating the HTML page.
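
As a rough illustration of that decoding step (not the documentation engine's actual code, which this thread does not show), escaped text can be turned back into real characters with Python's unicode_escape codec. This sketch assumes Python 3 and ASCII-only input; unescape is a hypothetical helper name:

```python
def unescape(escaped_text):
    # Encode to ASCII bytes first, then let the unicode_escape codec
    # interpret sequences such as \u31d4 as the characters they name.
    return escaped_text.encode('ascii').decode('unicode_escape')


# Raw string: the backslashes are literal, as in copied console output.
copied = r"\u31d4 \u31ca \u31d4"
decoded = unescape(copied)  # three stroke characters separated by spaces
```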

I was kind of happy about this specific behavior, as presenting encoded Unicode strings doesn't make for an intuitive example. I didn't see that it might irritate new users.

-Christoph

matthieu...@gmail.com

Jul 24, 2012, 11:13:31 AM
to cjklib...@googlegroups.com
I see. Makes sense and yes, it's definitely better looking on the webpage.
No worries, did not irritate me at all... :)

Matthieu