
Unicode UTF-8 problem with Putty


RedGrittyBrick

Apr 19, 2012, 6:46:35 AM
In UTF-8 mode Putty ignores ANSI (DEC) box-drawing escape sequences.
Is this a bug in Putty?

I am converting an application from code-page 437 to UTF-8.
Previously Putty was set to interpret data as CP437; I changed that to
UTF-8 (Settings, Window, Translation).

Putty correctly displays the UTF-8 emitted by the app, but it now ignores
the DEC graphics characters used for drawing lines and boxes as borders
around the application's text-mode windows.

Here's the problem, cut and pasted from the Putty window:

1 ────┼─────────┼─────────┼─────────┼─────────┤
2 ────┼─────────┼─────────┼─────────┼─────────┤
3 ────┼─────────┼─────────┼─────────┼─────────┤
4 ────┼─────────┼─────────┼─────────┼─────────┤
5 ────┼lqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqk──────┤
6 ────┼x x──────┤
7 ────┼x x──────┤
8 ────┼xFoo { } ©RGB я ...x──────┤
9 ────┼xBar { } ®Acme ئ . x──────┤
10 ────┼x x──────┤
11 ────┼mqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqj──────┤
12 ────┼─────────┼─────────┼─────────┼─────────┤
13 ────┼─────────┼─────────┼─────────┼─────────┤
14 ────┼─────────┼─────────┼─────────┼─────────┤
15 ────┼─────────┼─────────┼─────────┼─────────┤
16 ────┼─────────┼─────────┼─────────┼─────────┤

The background grid is displayed using Unicode characters U+2500 and
U+253c sent in UTF-8 encoding (e2 94 80 and e2 94 bc). That works.
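
As a quick sanity check of those byte values, here is a minimal Python sketch (Python 3.8 or later; the variable names are illustrative and not taken from the app) that encodes both code points and prints the bytes in hex:

```python
# Encode the two grid characters and show their UTF-8 bytes in hex.
grid_chars = {"U+2500": "\u2500", "U+253C": "\u253c"}

# bytes.hex(" ") separates the bytes with spaces, as in a hexdump.
encoded = {name: ch.encode("utf-8").hex(" ") for name, ch in grid_chars.items()}

for name, hex_bytes in encoded.items():
    print(name, hex_bytes)
# U+2500 e2 94 80
# U+253C e2 94 bc
```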

The border "lqqq...qqqk" is sent using the DEC graphics feature of
VT100/ANSI terminals. Esc ( 0 is sent to switch the G0 set to the
line-drawing character set. This is being ignored.

Here's part of a `hexdump -C` of data captured using Linux's `script`
command:

80 e2 94 80 e2 94 80 e2 94 80 e2 94 80 e2 94 a4 |................|
1b 5b 32 34 3b 35 36 48 1b 5b 35 3b 31 31 48 1b |.[24;56H.[5;11H.|
28 30 1b 5b 33 34 6d 1b 5b 34 30 6d 6c 71 71 71 |(0.[34m.[40mlqqq|
71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 |qqqqqqqqqqqqqqqq|
71 71 71 71 71 71 71 71 71 71 71 71 6b 1b 28 42 |qqqqqqqqqqqqk.(B|
1b 5b 30 6d 1b 5b 33 37 6d 1b 5b 34 30 6d 1b 5b |.[0m.[37m.[40m.[|
36 3b 31 31 48 1b 28 30 1b 5b 33 34 6d 1b 5b 34 |6;11H.(0.[34m.[4|
30 6d 78 1b 28 42 1b 5b 30 6d 1b 5b 33 37 6d 1b |0mx.(B.[0m.[37m.|
5b 34 30 6d 20 20 20 20 20 20 20 20 20 20 20 20 |[40m            |
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |

You can see the UTF-8 triplets for the background grid.
You can see Esc ( 0 to select DEC graphics (line draw) and
later Esc ( B to reselect normal characters. There is a colour selection
inside that sequence but the following test suggests that is OK.
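
To make the structure of that capture easier to see, the bytes can be split into character-set designators, CSI sequences and plain text. This is only a rough sketch: the regular expression covers just the sequences that appear in this dump, and the names `TOKEN` and `tokenize` are made up for the illustration.

```python
import re

# Rows 2 to 5 of the hexdump above (cursor moves, ESC ( 0, colours, top border, ESC ( B).
capture = bytes.fromhex(
    "1b 5b 32 34 3b 35 36 48 1b 5b 35 3b 31 31 48 1b "
    "28 30 1b 5b 33 34 6d 1b 5b 34 30 6d 6c 71 71 71 "
    "71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 "
    "71 71 71 71 71 71 71 71 71 71 71 71 6b 1b 28 42"
)

# Assumption: only these three kinds of token occur in the stream.
TOKEN = re.compile(
    rb"(?P<designate>\x1b\([0B])"          # ESC ( 0  or  ESC ( B
    rb"|(?P<csi>\x1b\[[0-9;]*[A-Za-z])"    # e.g. ESC [ 5;11 H or ESC [ 34 m
    rb"|(?P<text>[^\x1b]+)"                # anything that is not an escape
)

def tokenize(data):
    """Return (kind, raw_bytes) pairs in the order they appear in data."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(data)]

tokens = tokenize(capture)
for kind, raw in tokens:
    print(f"{kind:10} {raw!r}")
```

The text token between the two designators is the top border ("l", 31 times "q", "k"), with the two colour sequences sitting between ESC ( 0 and the border itself.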

I updated Putty to 0.62 - same issue.

As a further confirmation, I set Putty Window Translation to CP437 and
re-ran the program ...

ΓöÇ3öΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇ
ΓöÇ4öΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇ
ΓöÇ5öΓöÇΓö┌───────────────────────────────┐ÇΓöÇΓö╝ΓöÇ
ΓöÇ6öΓöÇΓö│ │ÇΓöÇΓö╝ΓöÇ
ΓöÇ7öΓöÇΓö│ │ÇΓöÇΓö╝ΓöÇ
ΓöÇ8öΓöÇΓö│Foo { } ┬⌐RGB ╤Å ...ΓöÇΓö╝ΓöÇ
ΓöÇ9öΓöÇΓö│Bar { } ┬«Acme ╪ª .│ÇΓöÇΓö╝ΓöÇ
Γö10öΓöÇΓö│ │ÇΓöÇΓö╝ΓöÇ
Γö11öΓöÇΓö└───────────────────────────────┘ÇΓöÇΓö╝ΓöÇ
Γö12öΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇ
Γö13öΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇ
Γö14öΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓöÇΓö╝ΓöÇ

As expected the background grid is now interpreted as triplets of CP437
characters rather than as UTF-8 3-byte characters.
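
The same garbage can be reproduced off-line. A minimal Python sketch, assuming Python's built-in `cp437` codec agrees with Putty's CP437 table for these bytes:

```python
# Encode the two grid characters as UTF-8, then misread the bytes as CP437.
grid = "\u2500\u253c"          # U+2500 and U+253C, the background grid
mojibake = grid.encode("utf-8").decode("cp437")

print(mojibake)                # three CP437 characters per grid character
```

Each 3-byte UTF-8 sequence comes out as three unrelated CP437 characters, which is the pattern seen in the capture above.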

But interestingly, the window border is now displayed correctly. This
suggests that the relevant Putty settings for "handling of line-drawing
characters" are OK.

What is wrong?

--
RGB

Simon Tatham

Apr 19, 2012, 6:52:09 AM
RedGrittyBrick <RedGrit...@spamweary.invalid> wrote:
> In UTF-8 mode Putty ignores ANSI (DEC) box-drawing escape sequences.
> Is this a bug in Putty?

Not as far as we're concerned. It is on our wishlist, but listed as a
possible enhancement rather than a bug:

http://www.chiark.greenend.org.uk/~sgtatham/putty/wishlist/utf8-plus-vt100.html

--
Simon Tatham What do we want? ROT13!
<ana...@pobox.com> When do we want it? ABJ!

RedGrittyBrick

Apr 19, 2012, 10:15:29 AM
On 19/04/2012 11:52, Simon Tatham wrote:
> RedGrittyBrick<RedGrit...@spamweary.invalid> wrote:
>> In UTF-8 mode Putty ignores ANSI (DEC) box-drawing escape sequences.
>> Is this a bug in Putty?
>
> Not as far as we're concerned. It is on our wishlist, but listed as a
> possible enhancement rather than a bug:
>
> http://www.chiark.greenend.org.uk/~sgtatham/putty/wishlist/utf8-plus-vt100.html
>

Ok, thanks.

SecureCRT displays the output as I expected/wanted but that obviously
doesn't affect the factors that determined the priority of the Putty
enhancement.

Does anyone have ideas for workarounds? The app is developed using a
tool that has its own variant of termcap. I have at my disposal:

gs=\E(0     start graphics mode
ge=\E(B     end graphics mode
gb=lmkjqx   a block of 6 ASCII bytes representing ┌ └ ┐ ┘ ─ │

I can't put UTF-8 sequences in gb.

I could leave gs and ge empty and put in something like ++++-| or ..`'-|
for gb but the resulting borders would be a bit sad.

Any ideas?

--
RGB

Simon Tatham

Apr 19, 2012, 10:18:57 AM
RedGrittyBrick <RedGrit...@spamweary.invalid> wrote:
> Does anyone have ideas for workarounds?
[...]
> I can't put UTF-8 sequences in gb.

I'm afraid I would have suggested that you have the app output the
UTF-8 encodings of the line-drawing characters. That's what any
sensible UTF-8 application _should_ be doing. (PuTTY is not the only
terminal which won't be happy with the VT100 approach, and in any case
doing it by UTF-8 is more robust since it won't leave the terminal in
a strange state if the application terminates unexpectedly in between
ESC(0 and ESC(B.)

So the best I can recommend is fixing the termcap implementation...
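
For illustration only, the translation being asked for (the six `gb` letters to their Unicode line-drawing equivalents) could be sketched in Python as below. This is not PuTTY code and not the tool's termcap layer: `dec_to_utf8`, `SEQUENCE` and `TABLE` are hypothetical names, only ESC ( 0, ESC ( B and simple CSI sequences are recognised, and an escape sequence split across two reads is not handled.

```python
import re

# DEC Special Graphics letters from the gb string, mapped to Unicode box drawing.
TABLE = str.maketrans({
    "l": "\u250c",  # upper-left corner
    "m": "\u2514",  # lower-left corner
    "k": "\u2510",  # upper-right corner
    "j": "\u2518",  # lower-right corner
    "q": "\u2500",  # horizontal line
    "x": "\u2502",  # vertical line
})

# Designators and CSI sequences are captured so that split() keeps them.
SEQUENCE = re.compile(r"(\x1b\([0B]|\x1b\[[0-9;?]*[A-Za-z])")

def dec_to_utf8(text):
    """Drop ESC ( 0 / ESC ( B and translate the text between them."""
    out = []
    graphics = False
    for part in SEQUENCE.split(text):
        if part == "\x1b(0":
            graphics = True            # remember the state, emit nothing
        elif part == "\x1b(B":
            graphics = False
        elif graphics and not part.startswith("\x1b"):
            out.append(part.translate(TABLE))
        else:
            out.append(part)           # CSI sequences and normal text pass through
    return "".join(out)

sample = "\x1b(0\x1b[34m\x1b[40mlqqk\x1b(B\x1b[0m x"
converted = dec_to_utf8(sample)
print(converted.encode("utf-8"))
```

CSI sequences have to be skipped explicitly, because the final "m" of a colour sequence such as ESC [ 34 m would otherwise be turned into a corner character while graphics mode is active.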
--
Simon Tatham "You may call that a cheap shot.
<ana...@pobox.com> I prefer to think of it as good value."
