Before we go any further, be aware that font/character issues are deeply
complex. They bamboozle many programmers, especially those who assume
that one byte is one character, or that a given byte value always means
the same character. Neither is the case. Take care!
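To make that concrete, here is a minimal Tcl sketch that counts characters and bytes for the three Chinese characters from the filename quoted below. The variable names are purely illustrative, and the characters are written as \u escapes so the script does not depend on how the source file itself is encoded.

```tcl
# The three Chinese characters from the filename, as \u escapes.
set s "\u5939\u7ebf\u6761"

# Characters in the string, as Tcl sees it.
set nchars [string length $s]

# Bytes needed once the string is encoded as UTF-8.
set nbytes [string length [encoding convertto utf-8 $s]]

puts "characters: $nchars, UTF-8 bytes: $nbytes"   ;# characters: 3, UTF-8 bytes: 9
```

Each of these characters takes three bytes in UTF-8, so any code that treats the byte count as the character count will be off by a factor of three here.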
On 27/01/2012 10:48, TM wrote:
> I haven't installed a Chinese keyboard. My system has standard UK
> keyboard & language settings.
>
> TKcon is happy to display Chinese characters:
> ls --> GOL10001700A00 夹线条.pdf
Tkcon works at the level of Tcl result strings and has access to Tk's
main font handling system. That means it knows directly what the
characters are — there's no misinterpretation step involved — and knows
exactly how to display those characters.
> Wish outputs gibberish:
> ls --> GOL10001700A00 å¤¹çº¿æ ¡.pdf
The issue there is that although the output goes to a Tk window, it has
been passed through a (fake) channel that's set to the system encoding,
and that mangles it. (Alas, it seems to be the "wrong sort" of mangling
too, with the bytes on the channel being UTF-8 but being interpreted as
a single-byte encoding; that looks like a bug.)
You can get the same displayed output in Tkcon (or something very
similar in Tclsh) by using:
    encoding convertto utf-8 $filename
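As a rough illustration of why that reproduces the gibberish, the sketch below dumps the UTF-8 bytes of the three Chinese characters and then reads them back as single-byte characters. The variable names are made up for the example, and iso8859-1 is used only as a stand-in for "some single-byte encoding"; it is an assumption, not necessarily what Wish does internally.

```tcl
# The three Chinese characters from the filename, as \u escapes.
set name "\u5939\u7ebf\u6761"

# Encode to UTF-8 and show the raw bytes in hex.
set bytes [encoding convertto utf-8 $name]
binary scan $bytes H* hex
puts $hex                          ;# e5a4b9e7babfe69da1

# Misread those nine bytes as nine single-byte characters.
# 0xE5 0xA4 0xB9 become "a-ring", "currency sign", "superscript one", etc.
set mangled [encoding convertfrom iso8859-1 $bytes]
puts [string length $mangled]      ;# 9

# Nothing was lost: decoding the same bytes as UTF-8 restores the name.
set restored [encoding convertfrom utf-8 $bytes]
puts [string equal $restored $name]   ;# 1
```

The point of the last step is that this kind of mangling is reversible, because the original bytes are all still there.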
> And so does DOS:
> dir --> GOL10001700A00 ???.pdf
Again, this is mangling through an encoding, though this time in a
"correct" way (there are no Chinese characters in the encoding, so they
*can't* be represented at all and are instead converted to "?" symbols).
This conversion loses information.
You can get the same displayed output in Tkcon using:
    encoding convertto cp1252 $filename
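A small sketch of that lossy conversion follows. It assumes Tcl 8.5/8.6 behaviour, where unrepresentable characters are replaced with "?"; as far as I know, Tcl 9 defaults to a strict profile and raises an error instead, so the call is wrapped in catch. The variable names are illustrative only.

```tcl
# The three Chinese characters from the filename, as \u escapes.
set name "\u5939\u7ebf\u6761"

# Try to squeeze them into cp1252, which has no slots for them.
set failed [catch {encoding convertto cp1252 $name} out]

if {$failed} {
    # Assumed Tcl 9 behaviour: strict conversion refuses outright.
    puts "conversion refused: $out"
} else {
    # Tcl 8.x behaviour: one "?" per unrepresentable character.
    puts $out
    # Decoding again cannot bring the original characters back.
    puts [string equal [encoding convertfrom cp1252 $out] $name]
}
```

Either way the original characters do not survive the trip through cp1252, which is the difference from the UTF-8 case.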
> Tkcon: encoding system --> cp1252
> Wish: encoding system --> cp1252
> tclsh: encoding system --> cp1252
Yes, cp1252 can't represent characters from any East Asian script.
Tkcon doesn't care and doesn't need to care. Wish cares (but shouldn't,
as it is directing output to a channel where we can ensure that both
ends are correct). Tclsh cares and is directing output to a channel
where that care is properly justified.
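If you want to see which encodings are in play in a given shell, something along these lines should do. It is only a sketch: the variable names are illustrative, and what it prints depends entirely on the platform and on which shell runs it.

```tcl
# Encoding Tcl uses when talking to the operating system.
set sysenc [encoding system]

# Encoding applied to text written to the standard output channel.
set outenc [fconfigure stdout -encoding]

puts "system encoding: $sysenc"
puts "stdout encoding: $outenc"
```

Comparing the two values in Tkcon, Wish and Tclsh is a quick way to check where a conversion might be happening.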
Donal.