Paul Rubin <no.e...@nospam.invalid> writes:
> but for decades before anyone cared about Unicode, keyboards
>had cursor keys and function keys that send escape sequences. Should we
>expect KEY to properly read those and encode them somehow? Are there
>even Unicode codepoints for them (I don't know)?
No.
Already Forth-94 has EKEY for this kind of usage, and Forth-2012 has
extended it with words like EKEY>FKEY and K-UP to allow checking for
cursor keys.
>What does your keyboard actually transmit when you type "Ä" (capital A
>with umlaut, codepoint 00C4)?
xev reports:
KeyPress event, serial 34, synthetic NO, window 0x4400001,
root 0x13f, subw 0x0, time 198524391, (117,102), root:(155,197),
state 0x0, keycode 50 (keysym 0xffe1, Shift_L), same_screen YES,
XLookupString gives 0 bytes:
XmbLookupString gives 0 bytes:
XFilterEvent returns: False
KeyPress event, serial 37, synthetic NO, window 0x4400001,
root 0x13f, subw 0x0, time 198525304, (117,102), root:(155,197),
state 0x1, keycode 48 (keysym 0xc4, Adiaeresis), same_screen YES,
XLookupString gives 2 bytes: (c3 84) "Ä"
XmbLookupString gives 2 bytes: (c3 84) "Ä"
XFilterEvent returns: False
KeyRelease event, serial 37, synthetic NO, window 0x4400001,
root 0x13f, subw 0x0, time 198525463, (117,102), root:(155,197),
state 0x1, keycode 48 (keysym 0xc4, Adiaeresis), same_screen YES,
XLookupString gives 2 bytes: (c3 84) "Ä"
XFilterEvent returns: False
KeyRelease event, serial 37, synthetic NO, window 0x4400001,
root 0x13f, subw 0x0, time 198525780, (117,102), root:(155,197),
state 0x1, keycode 50 (keysym 0xffe1, Shift_L), same_screen YES,
XLookupString gives 0 bytes:
XFilterEvent returns: False
>My guess is it actually sends an ISO
>8859-1 character (single byte) which also happens to be 00C4 so your
>EMIT possibly has to translate it to some other encoding like UTF-8 on
>output.
As you can see, the keyboard transmits events that contain key codes,
and X translates the event to a keysym (which seems to use Latin-1 or
the Unicode code point) and a string (which uses UTF-8). Let's take
an example where there is no overlap between Unicode code points and
Latin-1: Pressing AltGr-E on a German keyboard, giving the Euro sign:
KeyPress event, serial 37, synthetic NO, window 0x4400001,
root 0x13f, subw 0x0, time 198870996, (161,147), root:(199,242),
state 0x0, keycode 108 (keysym 0xfe03, ISO_Level3_Shift), same_screen YES,
XKeysymToKeycode returns keycode: 92
XLookupString gives 0 bytes:
XmbLookupString gives 0 bytes:
XFilterEvent returns: False
KeyPress event, serial 37, synthetic NO, window 0x4400001,
root 0x13f, subw 0x0, time 198872498, (161,147), root:(199,242),
state 0x80, keycode 26 (keysym 0x20ac, EuroSign), same_screen YES,
XLookupString gives 3 bytes: (e2 82 ac) "€"
XmbLookupString gives 3 bytes: (e2 82 ac) "€"
XFilterEvent returns: False
KeyRelease event, serial 37, synthetic NO, window 0x4400001,
root 0x13f, subw 0x0, time 198872574, (161,147), root:(199,242),
state 0x80, keycode 26 (keysym 0x20ac, EuroSign), same_screen YES,
XLookupString gives 3 bytes: (e2 82 ac) "€"
XFilterEvent returns: False
KeyRelease event, serial 37, synthetic NO, window 0x4400001,
root 0x13f, subw 0x0, time 198872931, (161,147), root:(199,242),
state 0x80, keycode 108 (keysym 0xfe03, ISO_Level3_Shift), same_screen YES,
XKeysymToKeycode returns keycode: 92
XLookupString gives 0 bytes:
XFilterEvent returns: False
So for this key, too, X uses the Unicode code point for the keysym.
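For anyone who wants to check the byte values without counting bits by hand, here is a minimal Python sketch. It takes only the two keysym values and byte strings from the xev output above, treats each keysym as a Unicode code point, and compares its UTF-8 encoding with what XLookupString reported. The helper names (utf8_of_keysym, latin1_or_none) are made up for this illustration, and the check covers only these two keys; it says nothing about how X assigns keysyms in general.

```python
# Values copied from the xev output above:
# name -> (keysym value, bytes reported by XLookupString)
observed = {
    "Adiaeresis": (0x00C4, bytes.fromhex("c384")),
    "EuroSign": (0x20AC, bytes.fromhex("e282ac")),
}

def utf8_of_keysym(keysym: int) -> bytes:
    """Treat the keysym value as a Unicode code point and encode it as UTF-8."""
    return chr(keysym).encode("utf-8")

def latin1_or_none(keysym: int):
    """Encode the same code point as ISO 8859-1, or return None if it has no Latin-1 byte."""
    try:
        return chr(keysym).encode("latin-1")
    except UnicodeEncodeError:
        return None

for name, (keysym, xev_bytes) in observed.items():
    matches = utf8_of_keysym(keysym) == xev_bytes
    latin1 = latin1_or_none(keysym)
    print(name, hex(keysym), "UTF-8 matches xev:", matches,
          "Latin-1:", latin1.hex() if latin1 is not None else "not encodable")
```

In Latin-1 the Ä would be the single byte c4, not the two bytes c3 84 that xev shows, and the Euro sign has no Latin-1 byte at all, so the string that X delivers cannot be ISO 8859-1.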