Re: ncurses getch & unicode (was: decoding keyboard input when using curses)

Iñigo Serna

Aug 20, 2009, 6:12:07 PM

to pytho...@python.org

Hi again,

2009/8/20 Iñigo Serna <inigo...@gmail.com>
>
> I have the same problem mentioned in http://groups.google.com/group/comp.lang.python/browse_thread/thread/c70c80cd9bc7bac6?pli=1 some months ago.
>
> A Python 2.6 program which uses the ncurses module in a terminal configured to use UTF-8 encoding.
>
> When trying to get input from the keyboard, a non-ASCII character (like ç) is returned as 2 integers < 255, needing 2 calls to the getch method to get both.
> These two integers, \xc3 \xa7, form the UTF-8 encoded representation of the ç character.
>
> The ncurses get_wch documentation states the function should return a single integer > 255 with the ordinal representation of that Unicode char encoded in UTF-8, \xc3a7.

Answering myself, I've copied a working solution at the bottom of this
email, but the question still remains: why doesn't win.getch()
return the correct value?

Kind regards,
Iñigo Serna

######################################################################
# test.py
import curses

import locale
locale.setlocale(locale.LC_ALL, '')
print locale.getpreferredencoding()

def get_char(win):
    # Read one UTF-8 encoded character from win, one byte per getch() call.
    def get_check_next_byte():
        # Continuation bytes must lie in the range 0x80..0xBF (128..191).
        c = win.getch()
        if 128 <= c <= 191:
            return c
        else:
            raise UnicodeError

    bytes = []
    c = win.getch()
    # The lead byte determines how many bytes the character occupies.
    if c <= 127:
        # 1 byte (ASCII)
        bytes.append(c)
    elif 194 <= c <= 223:
        # 2 bytes
        bytes.append(c)
        bytes.append(get_check_next_byte())
    elif 224 <= c <= 239:
        # 3 bytes
        bytes.append(c)
        bytes.append(get_check_next_byte())
        bytes.append(get_check_next_byte())
    elif 240 <= c <= 244:
        # 4 bytes
        bytes.append(c)
        bytes.append(get_check_next_byte())
        bytes.append(get_check_next_byte())
        bytes.append(get_check_next_byte())
    # Any other value leaves the list empty, so an empty string is returned.
    buf = ''.join([chr(b) for b in bytes])
    buf = buf.decode('utf-8')
    return buf

def getcodes(win):
    codes = []
    while True:
        try:
            ch = get_char(win)
        except KeyboardInterrupt:
            return codes
        codes.append(ch)

lst = curses.wrapper(getcodes)
print lst
for c in lst:
    print c.encode('utf-8'),
print
######################################################################
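
For comparison, here is a minimal sketch of the same byte-by-byte assembly using the standard library's incremental UTF-8 decoder instead of hand-written range checks. This is a different technique from the script above, and it is written in Python 3 syntax, which is an assumption since the thread targets Python 2.6. The name read_char and the next_byte callable are hypothetical; next_byte stands in for win.getch so that the logic can be tried without a terminal.

```python
import codecs


def read_char(next_byte):
    """Assemble one character from a byte source.

    next_byte is a hypothetical callable returning ints in 0..255,
    standing in for win.getch in a UTF-8 terminal.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    while True:
        # The decoder returns '' until a full character has been fed.
        text = decoder.decode(bytes([next_byte()]))
        if text:
            return text


# Simulated key presses: 'a', then the UTF-8 bytes of ç and €.
source = iter([0x61, 0xC3, 0xA7, 0xE2, 0x82, 0xAC])
chars = [read_char(lambda: next(source)) for _ in range(3)]
print(chars)  # ['a', 'ç', '€']
```

An invalid sequence raises UnicodeDecodeError here. Values outside 0..255, such as curses key codes for function keys, are not handled by this sketch and would need to be filtered out before decoding.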

Thomas Dickey

Aug 21, 2009, 4:47:42 AM

On Aug 20, 6:12 pm, Iñigo Serna <inigose...@gmail.com> wrote:
> Hi again,
>

> 2009/8/20 Iñigo Serna <inigose...@gmail.com>
> > I have the same problem mentioned in http://groups.google.com/group/comp.lang.python/browse_thread/thread/... some months ago.
> >
> > A Python 2.6 program which uses the ncurses module in a terminal configured to use UTF-8 encoding.
> >
> > When trying to get input from the keyboard, a non-ASCII character (like ç) is returned as 2 integers < 255, needing 2 calls to the getch method to get both.
> > These two integers, \xc3 \xa7, form the UTF-8 encoded representation of the ç character.
> >
> > The ncurses get_wch documentation states the function should return a single integer > 255 with the ordinal representation of that Unicode char encoded in UTF-8, \xc3a7.
>
> Answering myself, I've copied a working solution at the bottom of this
> email, but the question still remains: why doesn't win.getch()
> return the correct value?

The code looks consistent with the curses functions...

> Kind regards,
> Iñigo Serna
>
> ######################################################################
> # test.py
> import curses
>
> import locale
> locale.setlocale(locale.LC_ALL, '')
> print locale.getpreferredencoding()
>
> def get_char(win):
>     def get_check_next_byte():
>         c = win.getch()

You're using "getch", not "get_wch" (Python's ncurses binding may or
may not have the latter).
curses getch returns 8-bit values; get_wch would return wider values.
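
To illustrate the difference, a small sketch in Python 3 syntax (newer than the Python 2.6 used in this thread, so treat that as an assumption). It shows the two byte values that successive getch calls deliver for ç in a UTF-8 terminal, next to the single value a wide-character read would be expected to report on systems where wchar_t holds Unicode code points:

```python
ch = "\u00e7"  # the character ç

# What a byte-oriented read sees: one value per UTF-8 byte.
utf8_values = list(ch.encode("utf-8"))

# What a wide-character read is expected to see: one value per character.
code_point = ord(ch)

print([hex(v) for v in utf8_values])  # ['0xc3', '0xa7']
print(hex(code_point))                # 0xe7
```

Under that assumption the wide value is the code point 0xE7, not the two UTF-8 bytes packed into one integer (0xC3A7).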

Iñigo Serna

Aug 21, 2009, 7:58:38 AM

to Thomas Dickey, pytho...@python.org

2009/8/21 Thomas Dickey <dic...@his.com>:

You are right, the ncurses binding does not have get_wch, only getch, and
the latter is the only one called in the ncurses library bindings.

Anyway, I've written a patch to include the get_wch method in the bindings.
See http://bugs.python.org/issue6755

Thanks for the clarification,
Iñigo
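
For readers on a later Python: the curses module in Python 3.3 and newer provides a window.get_wch() method, which returns a str for ordinary characters and an int for function keys and other special keys. The sketch below shows one way to tell the two apart. The names describe_key and FakeWindow are hypothetical; FakeWindow only stands in for a real curses window so that the logic can be run without a terminal.

```python
def describe_key(win):
    """Classify one key press read through win.get_wch()."""
    key = win.get_wch()
    if isinstance(key, str):
        # A complete character, already decoded (no byte assembly needed).
        return ("char", key)
    # An integer key code, as used for function and keypad keys.
    return ("special", key)


class FakeWindow:
    """Hypothetical stand-in for a curses window, for demonstration only."""

    def __init__(self, keys):
        self._keys = list(keys)

    def get_wch(self):
        return self._keys.pop(0)


# 259 is an arbitrary integer standing in for a special key code.
demo = FakeWindow(["\u00e7", 259])
results = [describe_key(demo), describe_key(demo)]
print(results)  # [('char', 'ç'), ('special', 259)]
```

With a real window, for example inside curses.wrapper, the same describe_key function would be passed the window object instead of FakeWindow.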

Thomas Dickey

Aug 21, 2009, 3:53:17 PM

to Iñigo Serna, pytho...@python.org

On Fri, 21 Aug 2009, Iñigo Serna wrote:

> 2009/8/21 Thomas Dickey <dic...@his.com>:

> You are right, the ncurses binding does not have get_wch, only getch, and
> the latter is the only one called in the ncurses library bindings.
>
>
> Anyway, I've written a patch to include the get_wch method in the bindings.
> See http://bugs.python.org/issue6755
>
>
> Thanks for the clarification,

no problem (report bugs)

> Iñigo
>

--
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net