
Tclkit 8.5.x encoding problem


s-imai

12 March 2008, 00:02:40
Tclkit 8.5.x doesn't judge added encoding automatically.
Is this a change of specifications or a bug?

Here is a patch.
http://reddog.s35.xrea.com/software/kitInit.c.patch_for_8.5

-----
Satoshi Imai
s-i...@japan.interq.or.jp


--
Message posted using http://www.talkaboutprogramming.com/group/comp.lang.tcl/
More information at http://www.talkaboutprogramming.com/faq.html

Donald G Porter

12 March 2008, 11:15:39
s-imai wrote:
> Tclkit 8.5.x doesn't judge added encoding automatically.

I don't understand what you mean.

> Is this a change of specifications or a bug?

Tcl 8.5 added routines so that programs like Tclkit could more easily
initialize their encodings correctly.

http://tip.tcl.tk/258

If Tclkit sources still haven't been updated to use them, please
complain to the maintainers. Or adapt your patch to use the new
interfaces, and contribute it.

--
| Don Porter Mathematical and Computational Sciences Division |
| donald...@nist.gov Information Technology Laboratory |
| http://math.nist.gov/~DPorter/ NIST |
|______________________________________________________________________|

s-imai

12 March 2008, 20:11:13
Hi.

> I don't understand what you mean.

Sorry about that.
Here is a transcript of the steps needed to add all encodings:

http://www.equi4.com/tclkit/unicode.html

Tclkit84x and Tclkit85x don't include cp932.enc among their default
encodings. cp932 is a Japanese encoding.
Tclkit84x, for example, automatically picks up a cp932.enc added to the
VFS and uses it as the default encoding.
Tclkit85x does not automatically pick up a cp932.enc added to the VFS in
this way.

This is because Tclkit 8.5.x doesn't call TclpSetInitialEncodings(). Why?

s-imai

16 March 2008, 08:54:07
Hi.

I found a workaround.
I modified xxx.vfs/lib/app-xxx/pkgIndex.tcl like this:

[pkgIndex.tcl]
package ifneeded app-xxx 1.0 [list source -encoding cp932 [file join $dir tkfind.tcl]]

I inserted the "-encoding cp932" option.
Is this the best solution that doesn't require modifying Tclkit?

Don Porter

16 March 2008, 10:48:03
s-imai wrote:
> I found a workaround.
> I modified xxx.vfs/lib/app-xxx/pkgIndex.tcl like this:
>
> [pkgIndex.tcl]
> package ifneeded app-xxx 1.0 [list source -encoding cp932 [file join $dir tkfind.tcl]]
>
> I inserted the "-encoding cp932" option.
> Is this the best solution that doesn't require modifying Tclkit?

If you have a script stored in a file in an encoding
other than iso8859-1 which contains any character not
found in the iso8859-1 encoding, then the best way to
support that is with the [source -encoding] and
`tclsh -encoding` options, depending on whether your
file contains a package or an application.

I expect the most common successful practice will
become to store all scripts in the utf-8 encoding
and do all [source]-ing with -encoding utf-8.

My expectation (what I will try to make happen) is to
have the default encoding of [source] become utf-8
in Tcl 9.
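
A minimal sketch of the practice described above, assuming Tcl 8.5 or later (where [source -encoding] is available). The file name demo_utf8.tcl and the variable greeting are made up for illustration.

```tcl
# Write a one-line script to disk as utf-8 (file name is hypothetical).
set path [file join [pwd] demo_utf8.tcl]
set f [open $path w]
fconfigure $f -encoding utf-8
# The \u escapes are expanded here, so the file holds real non-ASCII text.
puts $f "set greeting \"\u3053\u3093\u306b\u3061\u306f\""
close $f

# Source it with an explicit encoding, so the result does not depend on
# the system encoding of the machine running the script.
source -encoding utf-8 $path
file delete $path
```

The same option works inside a pkgIndex.tcl [package ifneeded] script, which is how the workaround earlier in the thread uses it.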

Donal K. Fellows

16 March 2008, 12:06:32
Don Porter wrote:
> If you have a script stored in a file in an encoding
> other than iso8859-1 which contains any character not
> found in the iso8859-1 encoding, then the best way to
> support that is with the [source -encoding] and
> `tclsh -encoding` options, depending on whether your
> file contains a package or an application.

Actually it's only the ascii encoding that's reasonably safe. Even
iso8859-1 is unsafe when running on a platform where the system encoding
is something different (more and more common).

Donal.
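
To illustrate the point about non-ASCII bytes, here is a small sketch that decodes the same byte under two encodings. It assumes the cp437 encoding is available in the running Tcl (it may be absent from a stripped-down Tclkit); cp437 is picked arbitrarily to stand in for "a different system encoding".

```tcl
# One byte, 0xE9, as it might appear in a script saved as iso8859-1.
set bytes [binary format H2 e9]

# Decoded as iso8859-1 it is U+00E9 (e with acute accent).
set asLatin1 [encoding convertfrom iso8859-1 $bytes]

# Decoded as cp437 the same byte becomes a different character.
set asCp437 [encoding convertfrom cp437 $bytes]

puts [expr {$asLatin1 eq $asCp437 ? "same" : "different"}]
```

Bytes in the ASCII range decode identically under both encodings, which is why a pure-ASCII script is not affected.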

Alexandre Ferrieux

16 March 2008, 18:06:44
On Mar 16, 5:06 pm, "Donal K. Fellows"

Would it make sense to introduce explicit channel options in the file
itself (as in <?xml ... encoding=... ?>)? Something like a first-line
comment such as:

#? -encoding iso8859-1 -eofchar '' -translation crlf

This could be done with little loss of performance in Tcl_FSEvalFileEx
by imposing that this comment be in 1st column of 1st line in order to
be effective (one [gets] and one [read] instead of just one [read]).

-Alex
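
The proposal can be approximated at script level, without touching Tcl_FSEvalFileEx. The sketch below is only an illustration: the proc name sourceWithHint is hypothetical, the "#?" syntax is the one proposed in this post (not something Tcl understands), and only the -encoding option is honored. It also assumes the hint line itself is readable as ASCII-compatible bytes.

```tcl
# Hypothetical helper: honor a "#? -encoding NAME" comment on line 1.
proc sourceWithHint {path} {
    # Peek at the first line byte-for-byte; iso8859-1 maps each byte to
    # exactly one character, so any input can be read this way.
    set f [open $path r]
    fconfigure $f -encoding iso8859-1
    gets $f first
    close $f

    if {[regexp {^#\?.*-encoding\s+(\S+)} $first -> enc]} {
        # Hint found: source the file in the caller's context with it.
        uplevel 1 [list source -encoding $enc $path]
    } else {
        # No hint: fall back to the default behavior of [source].
        uplevel 1 [list source $path]
    }
}
```

This opens the file twice, once to read the hint and once inside [source], rather than the single [gets] plus [read] described above.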

Donal K. Fellows

16 March 2008, 20:14:18
Alexandre Ferrieux wrote:
> Would it make sense to introduce explicit channel options in the file
> itself (as in <?xml ... encoding=... ?>)?

I seem to recall this was discussed (at length!) back when TIP #137 was
under development. Can't remember what exactly was said though. (I do
know that it's tricky if you've got an encoding that doesn't embed
ASCII, which is true of some of them. I believe the XML specification
has extensive discussion of this.)

Donal.

suchenwi

17 March 2008, 04:53:44
On 16 Mar., 23:06, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

>       #? -encoding iso8859-1 -eofchar '' -translation crlf

I'm skeptical of introducing "magic" comments. Besides, the first line
of Tcl scripts is often reserved to be
#!/usr/bin/env tclsh
or similar instructions telling the shell what to start the script
with.

However, this reminds me of an older idea: why not allow "" as channel
for fconfigure, meaning "the current file itself"? E.g.:

fconfigure "" -encoding utf-8

I find it plausible (it is self-referential in the same way that ""
means the current interpreter in interp commands), and it does not
conflict with existing scripts, where a channel name could never be the
empty string.

Donal K. Fellows

17 March 2008, 06:31:20
suchenwi wrote:
> However, this reminds me of an older idea: why not allow "" as channel
> for fconfigure, meaning "the current file itself"?

That's too late. The implementation of [source] reads the file in its
entirety (up to any ^Z) before executing it.

Donal.
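
The ^Z behavior mentioned above can be observed with a short sketch. The file name demo_ctrlz.tcl and the variable seen are made up for illustration.

```tcl
# Build a script with a ^Z (\u001a) between two commands.
set path [file join [pwd] demo_ctrlz.tcl]
set f [open $path w]
fconfigure $f -translation binary
puts -nonewline $f "set seen before\n\u001aset seen after\n"
close $f

# [source] treats ^Z as end of file, so the second command is never read.
source $path
file delete $path
puts $seen
```

Everything after the ^Z is ignored by [source], which is what lets a script file carry trailing non-script data.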

Alexandre Ferrieux

17 March 2008, 09:15:53
On Mar 17, 9:53 am, suchenwi <richard.suchenwirth-bauersa...@siemens.com> wrote:
> On 16 Mar., 23:06, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
> wrote:
>
> > #? -encoding iso8859-1 -eofchar '' -translation crlf
>
> I'm skeptical of introducing "magic" comments. Besides, the first line
> of Tcl scripts is often reserved to be
> #!/usr/bin/env tclsh
> or similar instructions telling the shell what to start the script
> with.

You're right, but there's a dichotomy here:
 - scripts designed as "main":
      #!/usr/bin/env tclsh -encoding iso8859-1
 - scripts designed to be sourced by another one:
      #? some magic here

> fconfigure "" -encoding utf-8

No, Tcl_FSEvalFileEx currently does a huge [read] before [eval]ling
anything, so any cross-encoding stuff must occur on the resulting
string (not the channel). But even with something like

parser-reconfigure -encoding utf-8

you're exposed to the possibility of the initial string being already
smashed (as in "loss of information") because it contained a forbidden
byte in the initial encoding.

-Alex

suchenwi

19 March 2008, 08:57:11
On 17 Mar., 14:15, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

> You're right, but there's a dichotomy here:
>  - scripts designed as "main":
>       #!/usr/bin/env tclsh -encoding iso8859-1
>  - scripts designed to be sourced by another one:
>       #? some magic here

I often write "library" scripts still with the shell prefix, and at
their end provide the usual self-test section (which could also be a
demo):

if {[file tail [info script]] eq [file tail $argv0]} {
.....
}
This way, they can be sourced, but still do something useful when
activated by accident or intention :^)
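
A fuller sketch of this pattern, for illustration only: the proc double and its self-test are invented here, and the extra [info exists argv0] guard is an addition for interpreters that do not define argv0.

```tcl
#!/usr/bin/env tclsh
# Library part: available to any script that sources this file.
proc double {x} {
    return [expr {2 * $x}]
}

# Self-test part: runs only when this file is the main script.
if {[info exists argv0] && [file tail [info script]] eq [file tail $argv0]} {
    if {[double 21] != 42} {
        error "self-test failed"
    }
    puts "self-test passed"
}
```

When the file is sourced from another script, [info script] names this file while argv0 names the main script, so the self-test block is skipped.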
