sled...@gmail.com wrote:
[Note: quoting some context from the prior article is always a good
idea. This is Usenet after all, not Google. Note the quotations below.]
> thanks for the reply...
>
> set j ????????
>
> Is there a more direct way to get the \uxxxx values of the characters
> stored in a string?
What I showed you is the direct way to get the hex values of the
characters stored in a string. And it sounded (from your first vague
posting) like this was what you wanted.
But if you want the code point values (the xxxx part of the \uxxxx
escape is a "code point" value, which is *different* from the bytes of
a utf-8 encoded string), you'd just iterate over the string by character
and use the %c conversion of the 'scan' command to obtain the code
point value. Then you can use the %x conversion of 'format' to get the
hex values:
$ rlwrap tclsh
% set str "Hello."
Hello.
% foreach c [split $str ""] {
    puts -nonewline [format {\u%04x} [scan $c %c z ; set z]]
} ; puts ""
\u0048\u0065\u006c\u006c\u006f\u002e
%
And that string of \uxxxx items is the equivalent of the "Hello."
string that was first put into the "str" variable. You could also do:
set str "\u0048\u0065\u006c\u006c\u006f\u002e"
And you end up with the identical string in "str" because the Tcl
parser interprets the \uxxxx escapes for you, converting each to the
character represented by that Unicode code point.
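If you want that round trip as something reusable, here is a minimal
sketch. The proc name "toEscapes" is just something I made up for
illustration, and it assumes every character fits in four hex digits
(code points up to U+FFFF). It relies on 'scan' returning the converted
value directly when no variable name is given, and on 'subst' with only
backslash substitution enabled to turn the escapes back into characters:

```tcl
# Hypothetical helper: turn each character of a string into a \uxxxx escape.
# Assumes all code points are <= 0xFFFF (four hex digits are enough).
proc toEscapes {str} {
    set out ""
    foreach c [split $str ""] {
        # scan %c with no variable name returns the code point itself
        append out [format {\u%04x} [scan $c %c]]
    }
    return $out
}

set esc [toEscapes "Hello."]
puts $esc    ;# prints \u0048\u0065\u006c\u006c\u006f\u002e

# Do backslash substitution only, so nothing else in the data is
# treated as a command or a variable reference.
set back [subst -nocommands -novariables $esc]
puts [string equal $back "Hello."]    ;# prints 1
```

The -nocommands and -novariables switches matter if the data could ever
contain "[" or "$" characters, since plain 'subst' would try to evaluate
those.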
But if you already have a string, unless you want to write it out in a
Tcl script or otherwise convert it to the \uxxxx format for feeding to
the Tcl parser, you don't need the hex values. The hex escape (\uxxxx)
is for entering 'characters' into your script code which you cannot
otherwise type directly on your keyboard.
> It seemed obtaining the hex values was a prereq to getting the
> corresponding \u values. But is that really necessary?
You need the code point values for creating proper \uxxxx escapes. But
unless you are writing code to output Tcl code, or unless wherever you
are sending this data understands how to interpret the \uxxxx escapes,
they are not very useful to you other than as a debugging aid to see
what code points are actually in the string.
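As a debugging aid, it can help to print both views of the same
character side by side. This is only a sketch, and the variable names
are arbitrary. It uses a single non-ASCII character, where the code
point and the utf-8 bytes visibly differ:

```tcl
# One character: LATIN SMALL LETTER E WITH ACUTE
set ch "\u00e9"

# View 1: the code point, via scan %c
set cp [format {\u%04x} [scan $ch %c]]
puts $cp     ;# prints \u00e9

# View 2: the bytes of the utf-8 encoding, via encoding convertto
binary scan [encoding convertto utf-8 $ch] H* hex
puts $hex    ;# prints c3a9
```

One character, one code point (00e9), but two bytes (c3 a9) once it is
encoded as utf-8. For plain ASCII text such as "Hello." the two views
show the same numbers, which is why the difference is easy to overlook.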