On 04/04/2019 16:16, Alexandru wrote:
> I guess I need to use something like "split" but then I have a
> problem if the list elements also contain empty spaces.
In that case, you need to write a parser that actually extracts the list
of content words. Often, but not always, this is done with a bit of
[regexp], possibly with [lmap] to filter the result:

    set elem {"Geh\X2\00E4\X0\use" ""}
    set elems [lmap {quoted value} \
            [regexp -inline -all {"([^""]*)"} $elem] {set value}]

That produces the list “{Geh\X2\00E4\X0\use} {}” (excluding quote
marks), in which the first element is what I believe to be the first
word in your input, and the second element is the empty string.
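
To see why the [lmap] takes two variables, it may help to look at the raw [regexp -inline -all] result first: with one capture group it is a flat list that alternates between the full match and the captured text. A small sketch, reusing the same sample input (the variable name "pairs" is only for illustration):

```tcl
set elem {"Geh\X2\00E4\X0\use" ""}

# With -inline -all and one capture group, the result is a flat list:
# full match, capture, full match, capture, ...
set pairs [regexp -inline -all {"([^""]*)"} $elem]
puts [llength $pairs]

# Walk the list two items at a time and keep only the captured text.
set elems [lmap {quoted value} $pairs {set value}]
puts [llength $elems]
```

For this input the two lengths printed are 4 and 2: two quoted words, each contributing a full match (with quote marks) and a capture (without).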
I've not necessarily got it right! But further decoding (which would be
nice to put inside the [lmap] body) depends on knowing more about the
format than I actually do. Perhaps this does it?

    set elems [lmap {- value} [regexp -inline -all {"([^""]*)"} $elem] {
        subst -nocommands -novariables [
            regsub -all {\\X2\\(\w{4})\\X0\\} $value {\\u\1}]
    }]

Decoding this sort of thing can be a total black art, and doing it well
requires knowing what is really going on with quoting rules. (In this
case, Tcl 8.7's [regsub -command] would be very helpful.)
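
For what it's worth, here is a rough sketch of how [regsub -command] could stand in for the [subst] step. It needs Tcl 8.7 or later, and it assumes (as the pattern above does) that each \X2\...\X0\ escape holds exactly one four-digit code; the sketch narrows \w to hex digits, and the helper name decodeX2 is made up for this example. The point of doing it this way is that only the matched escape gets converted, whereas [subst] would also interpret any other backslash sequence that happens to be in the value.

```tcl
# Requires Tcl 8.7 or later; [regsub -command] does not exist in 8.6.

# Hypothetical helper: called once per match with the whole match and
# the captured hex digits appended as arguments; its result replaces
# the match.
proc decodeX2 {whole hex} {
    return [format %c [scan $hex %x]]
}

set value {Geh\X2\00E4\X0\use}
set decoded [regsub -all -command \
        {\\X2\\([[:xdigit:]]{4})\\X0\\} $value decodeX2]
puts $decoded
```

With the sample word this should print "Gehäuse". Whether a single code per escape is the right reading of the format is exactly the sort of thing that needs checking against its real quoting rules.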
Donal.
--
Donal Fellows — Tcl user, Tcl maintainer, TIP editor.