
Bug in TclKit encoding?


Christoph Schmidt

Jan 7, 2007, 6:44:22 AM
Hi *,

My script has to process accented characters such as 'á' and 'é'.
These characters must also be saved to and loaded from a file. To keep this
platform-independent, I force the unicode encoding, as in this example:

----------
# test.tcl

set enc unicode

# print file content
if {[file exists test]} {
    set h [open test r]
    puts [encoding convertfrom $enc [gets $h]]
    close $h
}

# read new file content
puts -nonewline "> "
flush stdout
set h [open test w]
puts -nonewline $h [encoding convertto $enc [gets stdin]]
close $h
----------

When I run this with TclKit 8.4.13 on WinXP, I get the following output:

----------
C:\> tclkitsh-win32 test.tcl
> hello

C:\> tclkitsh-win32 test.tcl
hello
> áéíóú

C:\> tclkitsh-win32 test.tcl
´¥á´╝┐´¥í´¥ó´¥ú
>

C:\>
----------

Running the same script with pure Tcl, I get the following output:

----------
C:\> del test

C:\> tclsh test.tcl
> hello

C:\> tclsh test.tcl
hello
> áéíóú

C:\> tclsh test.tcl
áéíóú
>

C:\>
----------

Has anyone already encountered a similar issue and found a work-around?

Thanks a lot,
Christoph

Michael Schlenker

Jan 7, 2007, 8:08:11 AM
Christoph Schmidt wrote:

The German console encoding is not included in the default set of
encodings distributed with Tclkit. Windows uses a different encoding for
its DOS console window than for the rest of the system.
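
One way to check whether a particular build is missing an encoding is to ask the interpreter directly. The following is only a sketch: the helper name hasEncoding is made up for illustration, and cp850 is an assumed example of a Western European console code page, not something confirmed for this setup.

```tcl
# Hypothetical helper: report whether this interpreter knows an encoding.
proc hasEncoding {name} {
    return [expr {[lsearch -exact [encoding names] $name] >= 0}]
}

# Show which encodings are currently in effect.
puts "system encoding: [encoding system]"
puts "stdout encoding: [fconfigure stdout -encoding]"
puts "stdin encoding:  [fconfigure stdin -encoding]"

# cp850 is an assumption; substitute the code page your console really uses.
set wanted cp850
if {[hasEncoding $wanted]} {
    puts "$wanted is available"
} else {
    puts "$wanted is not available in this build"
}
```

Running this once under tclkitsh and once under tclsh should make any difference in the available encodings visible.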

By the way, it's much easier to force an encoding on a channel with fconfigure
-encoding than to convert manually with encoding convertfrom/convertto.
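
A minimal sketch of that approach, with the channel doing all of the conversion work. The proc names saveText and loadText, the file name roundtrip.txt, and the utf-8 default are assumptions made for illustration; any encoding listed by [encoding names] could be passed instead.

```tcl
# Write text to a file in a fixed encoding; the channel converts on output.
proc saveText {path text {enc utf-8}} {
    set h [open $path w]
    fconfigure $h -encoding $enc
    puts -nonewline $h $text
    close $h
}

# Read the text back; the channel converts on input, so no
# [encoding convertfrom] is needed afterwards.
proc loadText {path {enc utf-8}} {
    set h [open $path r]
    fconfigure $h -encoding $enc
    set text [read $h]
    close $h
    return $text
}

# Sample text: a, e, i, o, u, each with an acute accent.
set sample "\u00e1\u00e9\u00ed\u00f3\u00fa"
saveText roundtrip.txt $sample
if {[loadText roundtrip.txt] eq $sample} {
    puts "round trip ok"
} else {
    puts "mismatch"
}
file delete roundtrip.txt
```

Because the file's encoding is fixed explicitly, the result should not depend on which system or console encoding the interpreter happens to pick up.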

Michael

Christoph Schmidt

Jan 7, 2007, 9:19:39 AM
Michael Schlenker wrote:

Hi Michael,

Thanks for the hint. I added all "normal" Tcl encodings to TclKit as described
in http://www.equi4.com/tkunicode.html and now it works fine :-)

Thanks a lot,
Christoph

Benjamin Riefenstahl

Jan 12, 2007, 10:48:28 AM
Hi Christoph,

You seem to have your solution already, but for completeness and for
the archives I'll try to expand on this anyway.

Christoph Schmidt writes:
> I have to process in my script some accented characters like 'á', 'é'
> and so on. These characters must also be saved to and loaded from a

> [...]


> # print file content
> if {[file exists test]} {
> set h [open test r]
> puts [encoding convertfrom $enc [gets $h]]

[gets] already gives you decoded text, not raw bytes, so there is no
encoding from which to convert after that. [gets] follows the
encoding that you can set with [fconfigure]. Using [encoding
convertfrom] means that you do a second conversion. This will only
work if you are lucky, but it is not by design. What you probably
want instead is this:

set h [open test r]
fconfigure $h -encoding $enc
puts [gets $h]

NB: The same goes for [puts]. It also uses whatever encoding is set
on its stream, in this case stdout, for generating text in the right
external encoding.
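
To illustrate why a second conversion is harmful, here is a small sketch that needs no file or console. It uses utf-8 rather than the unicode encoding from the original script, purely as an assumption to keep the byte counts easy to follow.

```tcl
# Decoded text as a script sees it after [gets]: one character, a-acute.
set text "\u00e1"

# One conversion to the external form: UTF-8 needs two bytes for this character.
set once [encoding convertto utf-8 $text]

# Converting again treats each of those bytes as a character and encodes it anew.
set twice [encoding convertto utf-8 $once]

puts "characters:            [string length $text]"
puts "after one conversion:  [string length $once] bytes"
puts "after two conversions: [string length $twice] bytes"

# A single decode undoes exactly one conversion.
if {[encoding convertfrom utf-8 $once] eq $text} {
    puts "single conversion round-trips"
}
```

The doubly converted value no longer decodes to the original text in one step, which is the kind of damage that shows up as garbage on the console.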

benny
