My script has to process accented characters such as 'á' and 'é'.
These characters must also be saved to and loaded from a file. To make this
platform-independent, I force the unicode encoding, as in this example:
----------
# test.tcl
set enc unicode
# print file content
if {[file exists test]} {
    set h [open test r]
    puts [encoding convertfrom $enc [gets $h]]
    close $h
}
# read new file content
puts -nonewline "> "
flush stdout
set h [open test w]
puts -nonewline $h [encoding convertto $enc [gets stdin]]
close $h
----------
When I run this with TclKit 8.4.13 on WinXP, I get the following output:
----------
C:\> tclkitsh-win32 test.tcl
> hello
C:\> tclkitsh-win32 test.tcl
hello
> áéíóú
C:\> tclkitsh-win32 test.tcl
´¥á´╝┐´¥í´¥ó´¥ú
>
C:\>
----------
Running the same script with pure Tcl, I get the following output:
----------
C:\> del test
C:\> tclsh test.tcl
> hello
C:\> tclsh test.tcl
hello
> áéíóú
C:\> tclsh test.tcl
áéíóú
>
C:\>
----------
Has anyone already encountered a similar issue and found a work-around?
Thanks a lot,
Christoph
The German console encoding is not included in the default set of
encodings distributed with Tclkit. Windows uses a different encoding for
its DOS console window than for the rest of the system.
By the way, it's much easier to force an encoding on a channel with fconfigure
-encoding than to convert manually with encoding convertfrom/convertto.
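
To check whether a missing console encoding really is the cause, a short
sketch like the following could be run under both tclsh and tclkitsh. The
helper name encoding_available is made up for this illustration, and cp850
and cp1252 are only an assumption about the code pages involved on a German
Windows XP; the thread itself does not name them.

```tcl
# Sketch only: encoding_available is a hypothetical helper, not a Tcl built-in.
# [encoding names] lists every encoding this interpreter can load.
proc encoding_available {name} {
    return [expr {[lsearch -exact [encoding names] $name] >= 0}]
}

# What Tcl currently uses for the console channel and for the system.
set consoleEnc [fconfigure stdout -encoding]
set systemEnc  [encoding system]
puts "stdout encoding: $consoleEnc"
puts "system encoding: $systemEnc"

# cp850 and cp1252 are assumed candidates, see the note above.
foreach name [list $consoleEnc $systemEnc cp850 cp1252] {
    puts "$name available: [encoding_available $name]"
}
```

If the two interpreters report different encodings for stdout, or one of them
lacks an encoding the other has, that would fit the explanation above.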
Michael
Hi Michael,
thanks for the hint. I added all "normal" Tcl encodings to TclKit as described
in http://www.equi4.com/tkunicode.html and now it works fine :-)
Thanks a lot,
Christoph
You seem to have your solution already, but for completeness and for
the archives I'll try to expand on this anyway.
Christoph Schmidt writes:
> I have to process in my script some accented characters like 'á', 'é'
> and so on. These characters must also be saved to and loaded from a
> [...]
> # print file content
> if {[file exists test]} {
> set h [open test r]
> puts [encoding convertfrom $enc [gets $h]]
[gets] already gives you decoded text, not raw bytes, so there is no
encoding from which to convert after that. [gets] follows the
encoding that you can set with [fconfigure]. Using [encoding
convertfrom] means that you do a second conversion. This will only
work if you are lucky, but it is not by design. What you probably
want instead is this:
set h [open test r]
fconfigure $h -encoding $enc
puts [gets $h]
NB: The same goes for [puts]. It also uses whatever encoding is set
on its stream, in this case stdout, for generating text in the right
external encoding.
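
Putting both directions together, the file handling could look roughly like
the sketch below. The proc names save_line and load_line and the file name
roundtrip.txt are invented for this example; the only thing taken from the
original script is the use of the unicode encoding.

```tcl
# Hypothetical helpers (names are not from the original script).
# The channel does all the encoding work, so no [encoding convertto]
# or [encoding convertfrom] is needed.
proc save_line {path enc text} {
    set h [open $path w]
    fconfigure $h -encoding $enc      ;# encode on write
    puts -nonewline $h $text
    close $h
}

proc load_line {path enc} {
    set h [open $path r]
    fconfigure $h -encoding $enc      ;# decode on read
    set line [gets $h]
    close $h
    return $line
}

# Round trip with the accented characters from the original post, written
# as \u escapes so the encoding of the script file itself does not matter.
set sample "\u00e1\u00e9\u00ed\u00f3\u00fa"
save_line roundtrip.txt unicode $sample
puts [string equal $sample [load_line roundtrip.txt unicode]]
file delete roundtrip.txt
```

In the original script this would amount to "puts [load_line test $enc]" for
the first half and "save_line test $enc [gets stdin]" for the second, with
the conversion for stdin and stdout left to whatever encoding Tcl has set on
those channels.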
benny