[Request Tcl 8.1] concat support for list objects


scriptic...@auto.genned.post

May 19, 1999, 3:00:00 AM

Tcl 8.1 Request: Generated by Scriptics' bug entry form at
http://www.scriptics.com/support/bugForm.html
Responses to this post are encouraged.
------

Submitted by: Jeffrey Hobbs
CVS: May 17, tclUtil.c $Id: tclUtil.c,v 1.11 1999/05/06 19:21:11 stanton Exp $
OperatingSystem: Sun Solaris
OperatingSystemVersion: 2.5.1
CustomShell: mod core? not me...
Synopsis: concat support for list objects

DesiredBehavior:
As pointed out on c.l.tcl, concat actually converts everything to
strings and returns a string, as opposed to what the docs say. This
feature enhancement covers the case where concat is passed only objects
of the list type. In that case, it concatenates them into one list and
returns an object of type list.

Patch:
The following needs to be inserted as the first code in Tcl_ConcatObj
in tclUtil.c:

for (i = 0; i < objc; i++) {
    if (objv[i]->typePtr != &tclListType) {
        break;
    }
}
if (i == objc) {
    Tcl_Obj **listv;
    int listc;

    /*
     * All the items are of list type, concat them together
     * as lists, and return as a list obj
     */
    objPtr = Tcl_NewListObj(0, NULL);
    for (i = 0; i < objc; i++) {
        /*
         * No check is made on the result code, as we checked
         * above that they are all of list type...
         */
        Tcl_ListObjGetElements(NULL, objv[i], &listc, &listv);
        Tcl_ListObjReplace(NULL, objPtr, INT_MAX, 0, listc, listv);
    }
    return objPtr;
}

PatchFiles:
generic/tclUtil.c

Comments:
In tests on larger lists, this actually takes a bit more time to
assemble and return the object than the regular bytes concat mode.
However, when the returned object is next used as a list (as with
foreach), there is a notable time savings.

Again, this only affects calls to concat where all args passed are
already recognized list objects. This passes all tests, and doesn't
break any syntax logic.

Alexandre Ferrieux

May 19, 1999, 3:00:00 AM
scriptic...@auto.genned.post wrote:
>
> Synopsis: concat support for list objects
>
> As pointed out on c.l.tcl, concat actually converts everything to
> strings
> and returns a string, as opposed to what the docs say. This features
> enhancement involves situations where concat was passed only objects of
> the list type. In this case, it concats them together into one list,
> and returns an object of type list.


Sorry but this 'feature enhancement' actually breaks the semantics :)

Indeed, the current [concat $a $b] (just like "$a $b"), besides being
ill-documented, actually does a good job of preserving \n inside the
arguments, which is important for eval/uplevel (which use [concat], yes,
and this is even documented). Now if you transform [concat] into what I
called [lconcat], you will clearly lose those \n. Worse, this behavior
will look nondeterministic because it will happen only if (by mistake)
someone has previously performed a list operation on the code "puts a\nputs
b"...

Boom :^P

-Alex

Jeffrey Hobbs

May 19, 1999, 3:00:00 AM
to Alexandre Ferrieux
Alexandre Ferrieux wrote:
> scriptic...@auto.genned.post wrote:
> > Synopsis: concat support for list objects
> >
> > As pointed out on c.l.tcl, concat actually converts everything to
> > strings
> > and returns a string, as opposed to what the docs say. This features
> > enhancement involves situations where concat was passed only objects of
> > the list type. In this case, it concats them together into one list,
> > and returns an object of type list.
>
> Sorry but this 'feature enhancement' actually breaks the semantics :)

You are correct that it breaks semantics, but not quite exact on how badly.
:)
Actually, embedded \n's would be maintained. If the value really is already
a list with a \n in it, you still get the \n out. If instead you convert
the string to a list and then get the string back out, you have lost
your \n anyway, either way. Items containing whitespace would be surrounded
by {}s in the string rep, though. The point in the semantics that is totally
left out is that leading and trailing whitespace should be eliminated.

However, that is an interesting point when considering lists. If
something is in list form, then any whitespace that got there was
meant to be there, and thus one could say that "excess" leading and
trailing whitespace was already removed.

In any case, you can build cases with whitespace that break on this
patch, so forget it and wait for lconcat (which isn't much more than
the code in this patch).
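
For illustration only, a script-level stand-in for the proposed lconcat might look like the sketch below. The proc name and its behavior are assumptions based on this thread; no such command exists in the core. Unlike concat, it treats every argument strictly as a list and never looks at surrounding whitespace.

```tcl
# Hypothetical script-level lconcat: treats every argument as a list
# and returns one list holding all of their elements, in order.
proc lconcat {args} {
    set result {}
    foreach sublist $args {
        foreach element $sublist {
            lappend result $element
        }
    }
    return $result
}

# Nested elements stay intact; an empty list contributes nothing.
set merged [lconcat {a b} {c {d e}} {}]
```

A C version would presumably do the same work with Tcl_ListObjGetElements and Tcl_ListObjReplace, as the patch does, but without the "all arguments are already lists" precondition.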

** Jeffrey Hobbs jeff.hobbs @SPAM acm.org **
** I'm really just a Tcl-bot My opinions are MY opinions **


Bob Techentin

May 19, 1999, 3:00:00 AM
Jeffrey Hobbs wrote:
>
> Alexandre Ferrieux wrote:
> > scriptic...@auto.genned.post wrote:
> > > Synopsis: concat support for list objects
> > >
> > > As pointed out on c.l.tcl, concat actually converts everything to
> > > strings and returns a string, as opposed to what the docs say.

[snip]


>
> In any case, you can build cases with whitespace that break on this
> patch, so forget it and wait for lconcat (which isn't much more than
> the code in this patch).
>

I hadn't realized that concat had this behavior.

Just FYI, I took the initiative yesterday and threw together a straw-man
implementation framework for a list extension package on the Tcler's
Wiki. See
http://216.71.55.6/cgi-bin/wikit/InternalOrganizationOfTheListExtension
Let me know what you think. :-)

I had included a 'concat' function, but I just stated that it was the
same as the core function. Now that I see the utility of an "all lists
all the time" version, I have changed the docs a little.

--
Bob Techentin techenti...@mayo.edu
Mayo Foundation (507) 284-2702
Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html

Harald Kirsch

May 19, 1999, 3:00:00 AM
In article <3742D8...@cnet.francetelecom.fr> Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> writes:
> Indeed, the current [concat $a $b] (just as "$a $b"), beside being
> ill-documented, actually does a good job of preserving \n inside the
> arguments, which is important for eval/uplevel (which use [concat], yes,
> and this is even documented). Now if you transform [concat] in what I
> called [lconcat], you will clearly lose those \n. Worse, this behavior
> will look nondeterministic because it will happen only if (by mistake)
> someone has previously made a list operation on the code "puts a\nputs
> b"...

The very root of the problem stated is that commands are not lists,
i.e. if there are two variables cmd1 and cmd2 holding commands which
could be evaluated like

eval $cmd1
eval $cmd2

neither

eval [concat $cmd1 $cmd2]

nor

eval [list $cmd1 $cmd2]

will work. Try

set cmd1 "puts one ;# "
set cmd2 "puts two"
set cmd [concat $cmd1 \; $cmd2]
eval $cmd

to see that it does not work.

What is the GENERAL way to concat two (or n) commands such that,
whenever consecutive evaluation with eval is possible, the
concatenation can also be evaluated with the same overall result, if
possible? (The last condition may require that the single commands do
not contain (parts of) control structures.)

Also look at

set cmd1 "puts one ;# {"
set cmd2 "puts two"
set cmd [concat $cmd1 \;\n\; $cmd2]
eval $cmd

Is that the way to go, or can it be tricked into behaving wrongly?

Harald Kirsch


--
---------------------+------------------+--------------------------
Harald Kirsch (@home)| | Now I rebooted.
k...@iitb.fhg.de | | --- Jerry Pournelle, BYTE
gegen Punktfilitis hilft nur `chmod u-w ~'

Bryan Oakley

May 19, 1999, 3:00:00 AM
Harald Kirsch <k...@iitb.fhg.de> wrote in message
news:KIR.99Ma...@Gauss.iitb.fhg.de...

> In article <3742D8...@cnet.francetelecom.fr> Alexandre Ferrieux
<alexandre...@cnet.francetelecom.fr> writes:
> > Indeed, the current [concat $a $b] (just as "$a $b"), beside being
> > ill-documented, actually does a good job of preserving \n inside the
> > arguments, which is important for eval/uplevel (which use [concat], yes,
> > and this is even documented). Now if you transform [concat] in what I
> > called [lconcat], you will clearly lose those \n. Worse, this behavior
> > will look nondeterministic because it will happen only if (by mistake)
> > someone has previously made a list operation on the code "puts a\nputs
> > b"...
>
> The very root of the problem stated is that commands are not lists,

Not true. A command is a list, but a script is not.

> i.e. if there are two variables cmd1 and cmd2 holding commands which
> could be evaluated like
>
> eval $cmd1
> eval $cmd2
>
> neither
>
> eval [concat $cmd1 $cmd2]
>
> nor
>
> eval [list $cmd1 $cmd2]
>
> will work.

Sure it will. Just not as you are apparently expecting... Reread the eval
man page to see that it will itself concat all of its arguments to create a
string which it passes to the interpreter.

> Try
>
> set cmd1 "puts one ;# "
> set cmd2 "puts two"
> set cmd [concat $cmd1 \; $cmd2]
> eval $cmd
>
> to see that it does not work.

It works exactly as I would expect. Remember that a comment goes to the end
of the line. When you do the concat, you end up with a single string that
looks like this:

puts one ;# ; puts two

Your comment is not empty; it is actually (sans the quotes) '# ; puts two'.

That's exactly what I would expect if I typed in that command line in an
editor. Which is good, because I'd hate for eval to treat a command line
differently than if a command line were read in from a file.

Remember that eval does its own concat of arguments; it doesn't treat each
argument as a discrete command to be executed. So you can't create a well
formed list of commands, pass it to eval, and have each list element
executed as a complete command in turn.
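
A small sketch of that point, assuming a hypothetical helper named record that appends to a global log instead of printing, so the effect of the comment can be checked rather than eyeballed:

```tcl
set cmd1 "puts one ;# "
set cmd2 "puts two"
set cmd [concat $cmd1 \; $cmd2]
# cmd is now the single line:  puts one ;# ; puts two

# Hypothetical helper: remember which commands actually ran.
set log {}
proc record {word} { lappend ::log $word }

# Same construction, with record standing in for puts.
set script [concat "record one ;# " \; "record two"]
eval $script
# Only "record one" ran; everything after the # is one comment.
```

After the eval, log holds just the single word one, which is what a file containing that same line would do.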

>
> What is the GENERAL way to concat two (or n) commands such that
> whenever consecutive evaluation with eval is possible, also the
> concatenation can be evaluated, with the same overall result, if
> possible. (The last condition may require that the single commands do
> not to contain (parts of) control structures.)

I would say the right thing is to use join to make sure each command is
separated by a newline:

set cmd [join [list $cmd1 $cmd2] \n]
eval $cmd

That, or execute each command discretely in a loop, much like the
interpreter does (conceptually, at any rate) when reading in code from a
file:

foreach command [list $cmd1 $cmd2] {
    eval $command
}

Again, remember that eval doesn't work with a list of commands -- it will
accept a list but uses concat to create a single command line that gets
handed to the interpreter, much as if the interpreter had read that line in
from a file. If you have comments embedded in that command line, you will
get the results you are seeing.

>
> Also look at
>
> set cmd1 "puts one ;# {"
> set cmd2 "puts two"
> set cmd [concat $cmd1 \;\n\; $cmd2]
> eval $cmd
>
> Is that the way to go or can it be tricked to behave wrong?

I'm not sure I understand that question. What you did there is a hack, not a
true solution to the problem. The true solution is to realize eval doesn't
work the way you think it does when given a list. eval uses concat to make a
string, which is sent to the interpreter. It is then treated as any other
string. If you want to create a list of commands to be executed, go ahead
and create the list, but join the list with newlines before sending it to
eval, or use a loop to eval each element in the list separately.

Harald Kirsch

May 20, 1999, 3:00:00 AM
"Bryan Oakley" <oak...@channelpoint.com> writes:

> Harald Kirsch <k...@iitb.fhg.de> wrote in message

> > The very root of the problem stated is that commands are not lists,
>
> Not true. A command is a list, but a script is not.

True, indeed what I had in mind was in fact `a script is not a list' or
better: eval has a different idea of how to break a string into elements
than tcl's list functions.

> >
> > What is the GENERAL way to concat two (or n) commands such that
> > whenever consecutive evaluation with eval is possible, also the
> > concatenation can be evaluated, with the same overall result, if
> > possible. (The last condition may require that the single commands do
> > not to contain (parts of) control structures.)
>
> I would say the right thing is to use join to make sure each command is
> separated by a newline:
>
> set cmd [join [list $cmd1 $cmd2] \n]
> eval $cmd
>

Every other day I learn something new about tcl. Can we somehow
rigorously prove that this is the ``correct'' way to computationally
build up scripts from single commands? Something like:

Given two commands stored in cmd1 and cmd2 such that

eval $cmd1 (1)
eval $cmd2 (2)

does not produce a tcl-error (like `extra chars after close quote' and
the like) then

eval [join [list $cmd1 $cmd2] \n] (3)

is equivalent to (1) followed by (2) in that it, well, `does the same'.
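
The claim can at least be spot-checked for a given pair of commands. The sketch below is a test of one case, not a proof; the record helper is hypothetical and exists only to capture what gets executed.

```tcl
# Hypothetical helper: remember which commands actually ran.
set log {}
proc record {word} { lappend ::log $word }

set cmd1 "record one ;# trailing comment"
set cmd2 "record two"

# (1) and (2): evaluate the commands one after the other.
eval $cmd1
eval $cmd2
set sequential $log

# (3): evaluate the newline-joined script.
set log {}
eval [join [list $cmd1 $cmd2] \n]
set joined $log
```

Both sequential and joined end up as the list "one two" here, because the newline terminates the comment at the end of cmd1.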

> > Also look at
> >
> > set cmd1 "puts one ;# {"
> > set cmd2 "puts two"
> > set cmd [concat $cmd1 \;\n\; $cmd2]
> > eval $cmd
> >
> > Is that the way to go or can it be tricked to behave wrong?
>
> I'm not sure I understand that question. What you did there is a hack, not a
> true solution to the problem.

I agree, and the whole intention of my posting was to learn what the true
solution is :-) Yours seems to be correct, at least insofar as I cannot
come up with a pair of commands which break it.

> The true solution is to realize eval doesn't
> work the way you think it does when given a list.

Of course it does; it is only that you don't know what I think :-)

Thank you,
Harald Kirsch

--
-------------------------------------------------+------------------
Harald Kirsch, k...@iitb.fhg.de, +49 721 6091 369 |
FhG/IITB, Fraunhoferstr.1, 76131 Karlsruhe |

Paul Duffin

May 20, 1999, 3:00:00 AM
Alexandre Ferrieux wrote:
>
> scriptic...@auto.genned.post wrote:
> >
> > Synopsis: concat support for list objects
> >
> > As pointed out on c.l.tcl, concat actually converts everything to
> > strings
> > and returns a string, as opposed to what the docs say. This features
> > enhancement involves situations where concat was passed only objects of
> > the list type. In this case, it concats them together into one list,
> > and returns an object of type list.
>
> Sorry but this 'feature enhancement' actually breaks the semantics :)
>
> Indeed, the current [concat $a $b] (just as "$a $b"), beside being
> ill-documented, actually does a good job of preserving \n inside the
> arguments, which is important for eval/uplevel (which use [concat], yes,
> and this is even documented). Now if you transform [concat] in what I
> called [lconcat], you will clearly lose those \n. Worse, this behavior
> will look nondeterministic because it will happen only if (by mistake)
> someone has previously made a list operation on the code "puts a\nputs
> b"...
>

Actually Alex it is possible. What you have to do is concatenate the
string reps together and use that as the string representation of the
new list object, then you convert all objects to lists and concatenate
the internal representation together. If any of the objects cannot be
converted to a list then you can't create a list object so you simply
return a string object.
e.g.
% set a {
1 2
3 4
}

1 2
3 4

% set b {
5 6
7 8
}

5 6
7 8

% concat $a $b
1 2
3 4 5 6
7 8

would still work but the resulting object would be a list and not a
string.
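
The internal representation cannot be inspected portably from a script, so the following is only a sketch of the two observable properties such a scheme would have to keep: the concatenated string is unchanged, and the same value is usable as a list.

```tcl
set a "\n1 2\n3 4\n"
set b "\n5 6\n7 8\n"

# String view: outer whitespace is trimmed, interior newlines survive.
set joined [concat $a $b]

# List view of the very same value: eight elements.
set count [llength $joined]
```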

--
Paul Duffin
DT/6000 Development Email: pdu...@hursley.ibm.com
IBM UK Laboratories Ltd., Hursley Park nr. Winchester
Internal: 7-246880 International: +44 1962-816880

Alexandre Ferrieux

May 20, 1999, 3:00:00 AM
to Jeffrey Hobbs
Jeffrey Hobbs wrote:
>
> Alexandre Ferrieux wrote:
> > scriptic...@auto.genned.post wrote:
> > > Synopsis: concat support for list objects
> > >
> > > As pointed out on c.l.tcl, concat actually converts everything to
> > > strings
> > > and returns a string, as opposed to what the docs say. This features
> > > enhancement involves situations where concat was passed only objects of
> > > the list type. In this case, it concats them together into one list,
> > > and returns an object of type list.
> >
> > Sorry but this 'feature enhancement' actually breaks the semantics :)
>
> You are correct that it breaks semantics, but not quite exact on how bad.
> :)
> Actually, embedded \n's would be maintained (as if it is really already
> a list with a \n in it, you still get the \n out, but if you convert
> the string to a list, and then get the string back out, you have lost
> your \n anyway, either way)

Sorry but the above is obfuscated by the absence of a definition of 'in'
and 'out' and 'either way' (which two ways???).

What remains, is that in the dual representation, some types have a
bijective conversion (e.g. integers, and most floats), some types don't
(e.g. lists). In this latter case, the exact whitespace around elements
is of course a war casualty. Now since the dual repr can keep both repr
until one of them is updated, in most cases a list operation by itself
won't hurt. However, in *your* case, you build a new object from scratch
based on the two lists, hence all the separating spaces in the incoming
strings will be lost in the result (which will maybe, as you mentioned,
never be converted back to a string anyway).

IOW, keeping in mind that [concat] lurks behind [eval/uplevel], you
support this case:

eval $cmd $args

but not this case:

eval $cmd1 "\n" $cmd2

I agree that they apply in fact to very different situations (I have
never made out why the second use was here; this kind of semantic
redundancy (with vanilla string concatenation) stinks of Perl...)

> The whitespaced items would be surrounded
> by {}s though in a string rep.

I'm not talking about *those* internal spaces, but about the
element-separators.
In the example above, the list repr of "\n" is the empty list :)

> The point in the semantics that is totally
> left out is that leading and trailing whitespace should be eliminated.

BTW, do you have any idea *why* it is defined this way? A pre-bytecode
historical remnant?

> In any case, you can build cases with whitespace that break on this
> patch, so forget it and wait for lconcat (which isn't much more than
> the code in this patch).

Sure. And to be consistent with what you said yesterday about patchless
submissions, I suggest that you go ahead and add the three missing lines
:)

-Alex

Alexandre Ferrieux

May 20, 1999, 3:00:00 AM
to Jeffrey Hobbs

but not this case:

eval $cmd1 "\n" $cmd2

I agree that they apply in fact to very different situations (for me,
the first happens 99.99% of the time)

Alexandre Ferrieux

May 20, 1999, 3:00:00 AM
Paul Duffin wrote:

>
> Alexandre Ferrieux wrote:
> >
> > Now if you transform [concat] in what I
> > called [lconcat], you will clearly lose those \n.
>
> Actually Alex it is possible. What you have to do is concatenate the
> string reps together and use that as the string representation of the
> new list object, then you convert all objects to lists and concatenate
> the internal representation together. If any of the objects cannot be
> converted to a list then you can't create a list object so you simply
> return a string object.

Paul, this is very clever !!!
This 'parallel concatenation' of both repr, if done only lazily (i.e.
only when the list repr of all incoming args have already been
computed), seems to bring the best of both worlds !

Hmmm... Wait a minute: it still costs in memory, the size of a resulting
big string that may be unwanted after all. OTOH, I agree that it will
only happen when all the incoming parameters' string repr are also still
around, meaning that the program is not pure-lists after all. Okay :)

Jeff or Paul, the patch ?

-Alex

Jeffrey Hobbs

May 20, 1999, 3:00:00 AM
to Harald Kirsch
Harald Kirsch wrote:
> In article <3742D8...@cnet.francetelecom.fr> Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> writes:
> > Indeed, the current [concat $a $b] (just as "$a $b"), beside being
> > ill-documented, actually does a good job of preserving \n inside the
> > arguments, which is important for eval/uplevel (which use [concat], yes,
....

> The very root of the problem stated is that commands are not lists,
> i.e. if there are two variables cmd1 and cmd2 holding commands which
> could be evaluated like
....

> set cmd1 "puts one ;# "
> set cmd2 "puts two"
> set cmd [concat $cmd1 \; $cmd2]
> eval $cmd
> to see that it does not work.
>
> What is the GENERAL way to concat two (or n) commands such that

concat has never been the proper way to handle eval/uplevel as speculated
above (Alex, you're not doing that, right?) for exactly the reason above.
This is actually what messed me up in making my erroneous patch, as I never
used concat in this manner (occasionally for putting strings together, but
mostly for putting lists together). To get the right effect above, you
should append strings on, with \n's in between. This is what I do in tkcon
when ripping apart potential commands and putting them back together (with
a little more to it than that).

Actually though, going back to the semantics of the patch, I'm not sure
that the patch was wrong at all. Let's walk through carefully...

concat operates on objects as pure strings. It puts them together with
a space between them, stripping off surrounding whitespace. It differs
from "join" in that join actually takes one list of items, and allows
you to put them together with any chars (space default), but never strips
whitespace (although each item in the one list is still taken as a string).
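
A minimal sketch of that difference, using made-up sample strings:

```tcl
set x "  a b  "
set y "\tc\n"

# concat trims each argument, then joins with single spaces.
set viaConcat [concat $x $y]

# join keeps every character of every element and only adds separators.
set viaJoin [join [list $x $y] " "]
```

Here viaConcat is "a b c", while viaJoin still carries all of the original leading and trailing whitespace of both strings.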

OK, let's say we are passed only lists and have the patch I proposed.
If the items are already in list form (which the patch checks), then
any problem relating to loss of whitespace inside the item is not due
to concat, but due to the conversion to a list obj (by something like
llength), and is outside the realm of the concat patch (read: not our
problem). So where I thought the real problem lay was that I didn't
strip whitespace from the ends, as I treat it all as lists.

Thinking further on that, I realize that it shouldn't be a problem.
A list in string form NEVER has leading or trailing whitespace. If
such a char would be there, the item is {}'d!
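
For a string form that is generated from a list, this is easy to see in a short sketch. It says nothing about a value that was converted to a list but still carries its original string, which is the case debated elsewhere in this thread.

```tcl
# An element with surrounding spaces is braced when the list is
# turned into a string, so the string itself has no outer whitespace.
set l [list " padded " plain]
set trimmed [string trim $l]
```

The string form of l is "{ padded } plain", and trimming it changes nothing.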

Have I erred again in my logic, or was I right with the original patch?


Jeffrey Hobbs

May 20, 1999, 3:00:00 AM
to Bob Techentin
Bob Techentin wrote:
> Jeffrey Hobbs wrote:
> > Alexandre Ferrieux wrote:
> > > scriptic...@auto.genned.post wrote:
> > > > Synopsis: concat support for list objects
> > > >
> > > > As pointed out on c.l.tcl, concat actually converts everything to
> > > > strings and returns a string, as opposed to what the docs say.
...

> > In any case, you can build cases with whitespace that break on this
> > patch, so forget it and wait for lconcat (which isn't much more than
> > the code in this patch).
>
> I hadn't realized that concat had this behavior.

I had never really paid attention to that either. However, I believe that
the patch is actually "correct", as there never is trailing or leading
whitespace in the string form of a list object (see response to Kirsch:
"Re: Commands are not lists" for my detailed walkthrough). In saying the
above, I hadn't really thought about that fact, and I can't seem to build
a case that fails (but my brain might be on the blip again...)

> Just FYI, I took the initiative yesterday and threw together a staw-man
> implementation framework for a list extension package on the Tcler's
> Wiki. See
> http://216.71.55.6/cgi-bin/wikit/InternalOrganizationOfTheListExtension

Ooo, nice. It's nice to see that filtered down a bit finally. I'm not
sure that all is covered, but let me go over a few points on that page:

What is the difference between
list::append -in list item item ...
and list::insert $list end item item ...

Really the append just simplifies the insert by letting us pass in a
varname, so we don't need to extend it in this way.

With the following:
::list::set ?-in? list index value ?index value ...?
::list::element list index ?index index ...?
are you attempting to treat the list as a 1D numeric-indexed array?
Interesting...

What does this really mean:
::list::range ?-indices? list first last
"Like the core lrange function. Can also return the enumerated indices
from the list."

You are passing in indices in the first place... You want to return
the numbers between first and last instead of the elements? Isn't
that what a for loop is for?

Also, wherever you consider adding a -in, you should really have, and
encourage the use of, --. This is important in the object aspect of
things, and would probably improve speed. The point is that most of
those take variable arguments. That means you need to start checking
the first ones as strings until we know there aren't any switches,
whereupon we know the next one is our list (or perhaps the list's
varname). Where it is a list, using the -- will avoid us ever
converting it to a string.

To be honest though, I'm not all that convinced of the usefulness of
-in for every method of a new list command. You are putting the
burden of varname checking on all the list commands, and the
benefits (saving you from writing "set list [listop $list ...]") might
not outweigh the extra code required internally to do it in place.
Then again, I haven't done the coding for it...


Bob Techentin

May 20, 1999, 3:00:00 AM
Jeffrey Hobbs wrote:
>
> What is the difference between
> list::append -in list item item ...
> and list::insert $list end item item ...
>
> Really the append just simplifies the insert by letting us pass in a
> varname, so we don't need to extend it in this way.

The two commands "::list::append mylist" and "::list::insert mylist end"
are equivalent. As I had written them, both support the varname syntax
with the "-in" switch.


> With the following:
> ::list::set ?-in? list index value ?index value ...?
> ::list::element list index ?index index ...?
> are you atttempting to treat the list as a 1D numeric-indexed array?
> Interesting...

There was a request for a "generalized lindex allowing selection of
more than one element..." The ::list::set operation seemed a logical
complement.


> What does this really mean:
> ::list::range ?-indices? list first last
> "Like the core lrange function. Can also return the enumerated indices
> from the list."
>
> You are passing in indices in the first place... You want to return
> the numbers between first and last instead of the elements? Isn't
> that what a for loop is for?

I may have gone off a little on the deep end here. There were requests
for operations ("apply" and "remove") on elements of the list. It is
only natural to link these operations to the "search" operation. And
while I was reading the "regexp" man page, I was inspired by the
-indices switch. My intention was to allow:

::list::remove -in myList [::list::search -glob $myList T*]

Now that I look at this syntax more carefully, I realize that "search"
would return a list of indices, but the functions I've documented accept
variable numbers of arguments. Perhaps those should be changed to
accept a list of one or more indices. This might also fix up some of
the problems you cite below.


> Also, whereever you consider adding a -in, you should really have, and
> encourage the use of, --. This is important in the object aspect of
> things, and would probably improve speed. The point is that most of
> those take variable arguments. That means you need to start checking
> the first ones as strings until we know there aren't any switches,
> whereupon we know the next one is our list (or perhaps the list's
> varname). Where it is a list, using the -- will avoid us ever
> converting it to a string.

Good points. I'll add -- when I get a chance. (Unless you would like
to. :-)


> To be honest though, I'm not all that convinced of the usefulness of
> -in for every method of a new list command. You are putting the
> burden of varname checking on all the lists commands, and the extra
> code required internally to do it in place might not outweight any
> benefits (saving you from writing "set list [listop $list ...]").
> Then again, I haven't done the coding for it...

I considered an alternative implementation that would use different
command names to operate on lists "in place". Something like:

set myList [::list::replace $myList 3 5 "newData"]
::list::replace-inplace myList 3 5 "newData"

It would be more efficient, but it just doesn't feel right. Are there
other approaches that might be more efficient than the "-in" switch, but
more esthetic than duplicate procedure names? How about replacing the
variable length argument lists with required lists of elements? Change,
for example,

::list::replace ?-in? list first last ?element element ...?

which has an unknown number of arguments, into a form in which all
arguments (except the switches) are required. The variable number of
elements is replaced by a required list of zero or more elements,
something like:

::list::replace ?-in? list first last {?element element ...?}
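
As a rough sketch of how that fixed-arity form could be written in plain Tcl: the ::listx namespace and everything in it are hypothetical, chosen only to illustrate the idea, and this is not the code on the Wiki page.

```tcl
namespace eval ::listx {}

# Hypothetical: ::listx::replace ?-in? list first last elements
# With -in, "list" is a variable name and is updated in place.
proc ::listx::replace {args} {
    set inPlace 0
    if {[string equal [lindex $args 0] "-in"]} {
        set inPlace 1
        set args [lrange $args 1 end]
    }
    if {[llength $args] != 4} {
        error "usage: ::listx::replace ?-in? list first last elements"
    }
    foreach {target first last elements} $args break
    if {$inPlace} {
        upvar 1 $target var
        set var [eval [list lreplace $var $first $last] $elements]
        return $var
    }
    return [eval [list lreplace $target $first $last] $elements]
}

set byValue [::listx::replace {a b c d e f} 3 5 {newData}]
set myList {a b c d e f}
::listx::replace -in myList 3 5 {newData}
```

Because the argument count is fixed once the switch is removed, the proc can tell a switch from a list without guessing.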

Bob

Alexandre Ferrieux

May 20, 1999, 3:00:00 AM
to Jeffrey Hobbs
Jeffrey Hobbs wrote:
>
> concat has never been the proper was to handle eval/uplevel as speculated
> above (Alex, you're not doing that, right?)

No, as said in a nearby post :)

> Actually though, going back to the semantics of the patch, I'm not sure
> that the patch was wrong at all. Let's walk through carefully...

I still think it is; let's do that.

> OK, let's say we are passed only lists and have the patch I proposed.
> If the items are already in list form (which the patch checks), then
> any problem relating to loss of whitespace inside the item is not due
> to concat, but due to the conversion to a list obj (by something like
> llength), and is outside the realm of the concat patch (read: not our
> problem).

Wrong. The incoming args may have *both* string and list repr !!!
What people call a 'conversion' is actually a lazy computation of an
alternate representation. Immediately after it, the object holds both a
string and a (something). Then, only when one of the two reprs is
updated, the other is discarded and set to NULL.

Your patch checks for the 'Tcl type' of the object, which is indeed the
type of the internal repr. If it happens to be a freshly computed list
whose string repr is still around, you lose :)
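
The script-visible half of this can be shown with a short sketch: a list operation on a value does not disturb its original string.

```tcl
set a "puts a\n"

# llength needs a list view of $a; the value parses as two elements.
set n [llength $a]

# The original string, trailing newline included, is still what
# string operations see afterwards.
set len [string length $a]
```

Here n is 2 and len is 7, so the trailing newline is still part of the string even though the value has been used as a list.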

An example that produces this situation is the following:

set a "puts a\n"
set b "puts b\n"
set c [lindex $a 0] ;# yes it is 'puts' so what ?
set d [lindex $b 0] ;# same for $b: the patch fires only if every arg is a list
;# at this point $a stands on both feet ("puts a\n", [list puts a])
eval $a $b
### Error: bad argument "b": should be "nonewline"
### means it has tried to eval 'puts a puts b'

Hence the very clever middle-ground suggested by Paul nearby. Can you
please comment there ?

-Alex

Jeffrey Hobbs

May 20, 1999, 3:00:00 AM
to Harald Kirsch
Harald Kirsch wrote:

> "Bryan Oakley" <oak...@channelpoint.com> writes:
> > I would say the right thing is to use join to make sure each command is
> > separated by a newline:
> > set cmd [join [list $cmd1 $cmd2] \n]
> > eval $cmd

> Every other day I learn something new about tcl. Can we somehow
> rigurously prove that this is the ``correct'' way to computationally
> build up scripts from single commands? Something like:

Something like that is mostly correct. If you really want to build
up commands on the fly, consider looking at tkConCmd(Sep|Split), which
rips apart a string and puts it back together into distinct commands.
The use of info complete is important. However, it all depends on what
your input is, and what you want back out. I have seen other similar
implementations which follow the same lines. I mention tkcon though
because you wanted "rigorously proven" correctness, and that has
3 years of command line interpretation behind it, from Tcl7 to 8.1
and including the patch I mentioned.


Jeffrey Hobbs

May 20, 1999, 3:00:00 AM
to
Alexandre Ferrieux wrote:

> Jeffrey Hobbs wrote:
> > Actually, embedded \n's would be maintained (as if it is really already
> > a list with a \n in it, you still get the \n out, but if you convert
> > the string to a list, and then get the string back out, you have lost
> > your \n anyway, either way)
>
> Sorry but the above is obfuscated by the absence of a definition of 'in'
> and 'out' and 'either way' (which two ways???).

The "two ways" refer to if you are dealing with a list object
or a non-list object. Since the patch only acted when ALL the
incoming objects were list objects, it is still valid. If the
\n is still in the list after conversion, it will stay. If you
lost the \n in conversion, it wasn't due to the patch, and you
won't get it magically back in concat somehow.

> What remains, is that in the dual representation, some types have a
> bijective conversion (e.g. integers, and most floats), some types don't
> (e.g. lists). In this latter case, the exact whitespace around elements
> is of course a war casualty. Now since the dual repr can keep both repr
> until one of them is updated, in most cases a list operation by itself
> won't hurt. However, in *your* case, you build a new object from scratch
> based on the two lists, hence all the separating spaces in the incoming
> strings will be lost in the result (which will maybe, as you mentioned,
> never be converted back to a string anyway).

Ah, be careful with the wording here. The last sentence is
a little off what the patch does. First of all, the patch
ONLY works when ALL incoming objects are ALREADY lists (they
could all be in valid list format, but I only care if they
have already been converted - which means any ORIGINAL string
rep would be lost, and we only get the string rep that the
list obj would give us anyway). Thus the new LIST object that
I build and return does not affect "incoming strings" as you said,
because there weren't any.

> IOW, keeping in mind that [concat] lurks behind [eval/uplevel], you
> support this case:
> eval $cmd $args
> but not this case:
> eval $cmd1 "\n" $cmd2

Not true. In case two, the "\n" is a string, and my patch
will not take effect. If it were a list, the original version
of concat returns the same as the patched version.

> > The whitespaced items would be surrounded
> > by {}s though in a string rep.
>
> I'm not talking about *those* internal spaces, but about the
> element-separators.
> In the example above, the list repr of "\n" is the empty list :)

That would be true if you passed "\n" and I converted it to
a list, but I don't. However, if you pass [list \n], that is
something different, and my patch is true to the original.

> > The point in the semantics that is totally
> > left out is that leading and trailing whitespace should be eliminated.
>
> BTW, do you have an idea *why* it is defined this way ? A pre-bytecode
> historical remains ?

Does hellifiknow answer that for you? Actually, I assume it has good
historical reasons and is necessary if you consider some of the other
commands that use concat, but have always been intended to work in
the pure string domain. Interestingly, there is a point in the Tcl
test code where the result from Tcl_ConcatObj is expected as a list
object, so this minor infraction also confused someone at Scriptics.
However, the Tcl_ConcatObj code, unlike the "concat" code, does
properly document the function as working entirely within the string
domain and returning a string object.


Alexandre Ferrieux

May 20, 1999, 3:00:00 AM
to Jeffrey Hobbs
Jeffrey Hobbs wrote:
>
> > What remains, is that in the dual representation, some types have a
> > bijective conversion (e.g. integers, and most floats), some types don't
> > (e.g. lists). In this latter case, the exact whitespace around elements
> > is of course a war casualty. Now since the dual repr can keep both repr
> > until one of them is updated, in most cases a list operation by itself
> > won't hurt. However, in *your* case, you build a new object from scratch
> > based on the two lists, hence all the separating spaces in the incoming
> > strings will be lost in the result (which will maybe, as you mentioned,
> > never be converted back to a string anyway).
>
> Ah, be careful with the wording here. The last sentence is
> a little off what the patch does. First of all, the patch
> ONLY works when ALL incoming objects are ALREADY lists (they
> could all be in valid list format, but I only care if they
> have already been converted

Yes, we agree on that. This laziness is key to performance.

> which means any ORIGINAL string
> rep would be lost,

Wrong. A Tcl_Obj has two feet: the left "string" and the right
"internal".
If, for example, something is born a string it stands on left foot (i.e.
internal==NULL).
If somebody asks e.g. [lindex] on it, it will compute its internal
representation, hence right foot hits the ground. But this does not
bring left foot up. So the object at this time is a happy two-feet
individual. Then (if) at a later time somebody updates the internal
repr, e.g. by an [lappend] to the only var referencing it, the left foot
will go up of course (string==NULL), but not before.

Sorry if this is basically the same as a previous post, but I believe
the analogy helps illustrate the point.
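
To make the two-feet picture concrete, here is a minimal sketch in C. It is a toy model only: ToyObj and the toy_* functions are hypothetical names invented for illustration, not the real Tcl_Obj or any Tcl API, and an integer stands in for the list internal rep to keep it short.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: this is NOT the real Tcl_Obj from tcl.h. */
typedef struct {
    char *bytes;        /* left foot: string rep, or NULL when invalidated */
    int   hasInternal;  /* right foot: nonzero once the internal rep exists */
    long  value;        /* the internal rep itself (an integer here) */
} ToyObj;

/* Born a string: only the left foot is on the ground. */
ToyObj *toy_new_string(const char *s) {
    ToyObj *o = malloc(sizeof *o);
    o->bytes = malloc(strlen(s) + 1);
    strcpy(o->bytes, s);
    o->hasInternal = 0;
    o->value = 0;
    return o;
}

/* Reading as an integer computes the internal rep lazily, keeping the string. */
long toy_get_long(ToyObj *o) {
    if (!o->hasInternal) {
        o->value = strtol(o->bytes, NULL, 10);
        o->hasInternal = 1;
    }
    return o->value;
}

/* Updating the internal rep is what finally discards the string rep. */
void toy_incr(ToyObj *o, long delta) {
    (void) toy_get_long(o);
    o->value += delta;
    free(o->bytes);
    o->bytes = NULL;
}

/* Asking for the string again regenerates it from the internal rep. */
const char *toy_get_string(ToyObj *o) {
    if (o->bytes == NULL) {
        o->bytes = malloc(32);
        snprintf(o->bytes, 32, "%ld", o->value);
    }
    return o->bytes;
}
```

With an object created from " 42 ", calling toy_get_long leaves the padded string in place (both feet down); only toy_incr drops it, after which the regenerated string no longer has the original whitespace.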

> > IOW, keeping in mind that [concat] lurks behind [eval/uplevel], you
> > support this case:
> > eval $cmd $args
> > but not this case:
> > eval $cmd1 "\n" $cmd2
>
> Not true. In case two, the "\n" is a string, and my patch
> will not take effect. If it were a list, the original version
> of concat returns the same as the patched version.

Ooops. Right; what I meant was:

set cmd1 [list puts a]
set cmd2 [list puts b]
set nl "\n"
set ign [lindex $nl 0]
eval $cmd1 $nl $cmd2

Okay ?

> > > The point in the semantics that is totally
> > > left out is that leading and trailing whitespace should be eliminated.
> >
> > BTW, do you have an idea *why* it is defined this way ? A pre-bytecode
> > historical remains ?
>
> Does hellifiknow answer that for you?

Yup :)

-Alex

Jeffrey Hobbs

May 20, 1999, 3:00:00 AM
to Alexandre Ferrieux
Alexandre Ferrieux wrote:
> Jeffrey Hobbs wrote:
> > OK, let's say we are passed only lists and have the patch I proposed.
> > If the items are already in list form (which the patch checks), then
> > any problem relating to loss of whitespace inside the item is not due
> > to concat, but due to the conversion to a list obj (by something like
> > llength), and is outside the realm of the concat patch (read: not our
> > problem).
>
> Wrong. The incoming args may have *both* string and list repr !!!
> What people call a 'conversion' is actually a lazy computation of an
....

> Your patch checks for the 'Tcl type' of the object, which is indeed the
> type of the internal repr. If it happens to be a freshly computed list
> whose string repr is still around, you lose :)

Ah yes, that is correct. I admit defeat. However, then the patch has
an easy fix, to ensure only true lists are taken. Add:
(objv[i]->bytes != NULL)
to the if that checks against the type as well. This will ensure
that the operation only occurs on objects that are all of the list
type and none of which have a string rep. Of course, interactively
this doesn't work because the string rep is requested a lot then.
However, in scripts this would work, but not ideally (ideally being
that we know the string rep originated from the list rep, and not the
other way around). Then there is...

> Hence the very clever middle-ground suggested by Paul nearby. Can you
> please comment there ?

That was again (from Paul Duffin):

What you have to do is concatenate the
string reps together and use that as the string representation of the
new list object, then you convert all objects to lists and concatenate
the internal representation together. If any of the objects cannot be
converted to a list then you can't create a list object so you simply
return a string object.

It is an interesting suggestion, but I don't think the time costs from
the extra intelligence overweigh just creating a proper lconcat sometime
in the future. This also goes along with the fact that Tcl_ConcatObj is
doc'ed to work in the string domain. Now we just have to go back to
your original point, and fix the docs in concat.n.


Bruce S. O. Adams

May 20, 1999, 3:00:00 AM
to

Alexandre Ferrieux wrote:

I don't think you have a leg to stand on :-)


Jean-Claude Wippler

May 20, 1999, 3:00:00 AM
to
"Bruce S. O. Adams" wrote:
>
> Alexandre Ferrieux wrote:
[in clarification of a comment by Jeff Hobbs]

> > Wrong. A Tcl_Obj has two feet: the left "string" and the right
> > "internal".
> > If, for example, something is born a string it stands on left foot (i.e.
> > internal==NULL).
> > If somebody asks e.g. [lindex] on it, it will compute its internal
> > representation, hence right foot hits the ground. But this does not
> > bring left foot up. So the object at this time is a happy two-feet
> > individual. Then (if) at a later time somebody updates the internal
> > repr, e.g. by an [lappend] to the only var referencing it, the left foot
> > will go up of course (string==NULL), but not before.
> >
> > Sorry if this is basically the same as a previous post, but I believe
> > the analogy helps illustrate the point.
>
> I don't think you have a leg to stand on :-)

Could this brilliant description be incorporated into the docs
somewhere, please? It is the shortest, clearest, *and* funniest
description of dual objects I have ever seen.

Thanks, Alex!

-- Jean-Claude


Harald Kirsch

May 20, 1999, 3:00:00 AM
to
In article <3744148B...@icn.siemens.de> Jeffrey Hobbs <Jeffre...@icn.siemens.de> writes:
[about how to correctly assemble several commands into a script]

> I mention tkcon though
> because you wanted a "rigourously prove"n correctness, and that has
> 3 years of command line interpretation behind it, from Tcl7 to 8.1
> and including the patch I mentioned.

Not exactly the rigurousness I was looking for, but certainly not bad.

Paul Duffin

May 21, 1999, 3:00:00 AM
to
Jeffrey Hobbs wrote:
>
> Alexandre Ferrieux wrote:
>
> > Hence the very clever middle-ground suggested by Paul nearby. Can you
> > please comment there ?
>
> That was again (from Paul Duffin):
>
> What you have to do is concatenate the
> string reps together and use that as the string representation of the
> new list object, then you convert all objects to lists and concatenate
> the internal representation together. If any of the objects cannot be
> converted to a list then you can't create a list object so you simply
> return a string object.
>
> It is an interesting suggestion, but I don't think the time costs from
> the extra intelligence overweigh just creating a proper lconcat sometime
> in the future. This also goes along with the fact that Tcl_ConcatObj is
> doc'ed to work in the string domain. Now we just have to go back to
> your original point, and fix the docs in concat.n.
>

Creating the string rep and the internal list rep at the same time is
obviously more expensive than creating either one on its own but it
does mean that the <list> objects which are being concatenated do not
lose their own internal representation if they have one and it is
fully backwardly compatible so those people who mistakenly used it
assuming it worked on lists will benefit as soon as the change is
made.

As for the costs, the above change only adds the costs of merging the
internal list reps together (the string reps are already merged) and
this cost is very small compared to the cost of recreating the
internal list rep from the string rep. Obviously if you do not need
the list representation then you could lose out but not by much.

If I have time (about 30 minutes should do it) I will try and put
together an implementation of this idea.

Alexandre Ferrieux

May 21, 1999, 3:00:00 AM
to
Paul Duffin wrote:
>
> As for the costs, the above change only adds the costs of merging the
> internal list reps together (the string reps are already merged) and
> this cost is very small compared to the cost of recreating the
> internal list rep from the string rep. Obviously if you do not need
> the list representation then you could lose out but not by much.

True, you don't lose much wrt the current [concat].
Now I'm wondering if we will get the full [lconcat] performance with
this otherwise smart patch.
The reason is the no man's land between 'all have string reprs' and 'all
have list reprs'. If we note L an argument with only the list repr, S
string and LS both, what will happen if we get:

L L L LS L L L ?

Clearly the Ses of all the single Ls must be computed in order to build
the backwards-compatible string result (while if all were Ls we could
forget about strings). Yes but that is costly. And the problem is, it is
triggered by that tiny isolated S in one of the args, that may come from
a debugging statement nearby or whatever... Tough; looks like there is
still room for [lconcat] after all... or is there ?

Ideas ?

-Alex

Paul Duffin

May 21, 1999, 3:00:00 AM
to
Alexandre Ferrieux wrote:
>
> Paul Duffin wrote:
> >
> > As for the costs, the above change only adds the costs of merging the
> > internal list reps together (the string reps are already merged) and
> > this cost is very small compared to the cost of recreating the
> > internal list rep from the string rep. Obviously if you do not need
> > the list representation then you could lose out but not by much.
>
> True, you don't lose much wrt the current [concat].
> Now I'm wondering if we will get the full [lconcat] performance with
> this otherwise smart patch.

It will not be possible to get lconcat performance simply because you
have to manipulate the strings.

> The reason is the no man's land between 'all have string reprs' and 'all
> have list reprs'. If we note L an argument with only the list repr, S
> string and LS both, what will happen if we get:
>
> L L L LS L L L ?
>
> Clearly the Ses of all the single Ls must be computed in order to build
> the backwards-compatible string result (while if all were Ls we could
> forget about strings). Yes but that is costly. And the problem is, it is
> triggered by that tiny isolated S in one of the args, that may come from
> a debugging statement nearby or whatever... Tough; looks like there is
> still room for [lconcat] after all... or is there ?
>

There is, I was just making the point that there is some value in
fixing up concat to improve its performance simply because it is used
so much already.

Jeffrey Hobbs

May 21, 1999, 3:00:00 AM
to Paul Duffin
Paul Duffin wrote:

> Jeffrey Hobbs wrote:
> > That was again (from Paul Duffin):
> > What you have to do is concatenate the
> > string reps together and use that as the string representation of the
> > new list object, then you convert all objects to lists and concatenate
> > the internal representation together. If any of the objects cannot be
> > converted to a list then you can't create a list object so you simply
> > return a string object.
> >
> > It is an interesting suggestion, but I don't think the time costs from
> > the extra intelligence overweigh just creating a proper lconcat sometime

> As for the costs, the above change only adds the costs of merging the
> internal list reps together (the string reps are already merged) and
> this cost is very small compared to the cost of recreating the
> internal list rep from the string rep. Obviously if you do not need
> the list representation then you could lose out but not by much.

Isn't it a little more than that? If you pass in all objs that are
strings without list form, but all valid lists, then you will do the
traditional concat, and then subsequently make a list representation
out of each string obj, appending each of these to the result list.

If you pass in all lists without valid string reps, then you
don't just lconcat these lists together for the result, but still
create a string form, which may never be used.

IOW, it seems that your proposal, for all input that has string of a
valid list form, will cause results that are to have a list and string
form of each object calculated at the end. This allows all semantics to
be maintained, but if you let it be for now, you do all the string stuff,
and if it is used as a list, then you will just convert the end object
from string into list. Hmmm, in the end, the speed differences over the
whole app might not be much different from the original, but I think that
a true lconcat is the real way to improve efficiency.

> If I have time (about 30 minutes should do it) I will try and put
> together an implementation of this idea.

For all the finger huffing and puffing on this topic, we could have
definitely implemented a whole new elist command...


Paul Duffin

May 21, 1999, 3:00:00 AM
to Jeffrey Hobbs
Jeffrey Hobbs wrote:
>
> Paul Duffin wrote:
> > Jeffrey Hobbs wrote:
> > > That was again (from Paul Duffin):
> > > What you have to do is concatenate the
> > > string reps together and use that as the string representation of the
> > > new list object, then you convert all objects to lists and concatenate
> > > the internal representation together. If any of the objects cannot be
> > > converted to a list then you can't create a list object so you simply
> > > return a string object.
> > >
> > > It is an interesting suggestion, but I don't think the time costs from
> > > the extra intelligence overweigh just creating a proper lconcat sometime
>
> > As for the costs, the above change only adds the costs of merging the
> > internal list reps together (the string reps are already merged) and
> > this cost is very small compared to the cost of recreating the
> > internal list rep from the string rep. Obviously if you do not need
> > the list representation then you could lose out but not by much.
>
> Isn't it a little more than that? If you pass in all objs that are
> strings without list form, but all valid lists, then you will do the
> traditional concat, and then subsequently make a list representation
> out of each string obj, appending each of these to the result list.
>
> If you pass in all lists without valid string reps, then you
> don't just lconcat these lists together for the result, but still
> create a string form, which may never be used.
>

Both of these could easily be addressed by checking before doing
anything whether all args are lists, or all are strings and only
create both reps if some args have both. You could go even further by
checking whether the arguments which are strings are perfect lists
ignoring preceding and trailing whitespaces. (By perfect I mean
string -> list -> string results in the same string). If they are then
you could throw away the string rep and just treat them as lists.

Obviously there is a point where the improvements don't justify the
cost for implementation.

The main reason for this proposal is to try and preserve the internal
representations of the Tcl_Obj's. Tcl_Obj's scale so much better than
strings in both performance (caching) and memory usage (sharing).

> IOW, it seems that your proposal, for all input that has string of a
> valid list form, will cause results that are to have a list and string
> form of each object calculated at the end. This allows all semantics to
> be maintained, but if you let it be for now, you do all the string stuff,
> and if it is used as a list, then you will just convert the end object
> from string into list. Hmmm, in the end, the speed differences over the
> whole app might not be much different from the original, but I think that

As with most performance things it depends on the application.
Applications which do a lot of concatenating of lists, especially if
they are very large will see a drastic speed up.

> a true lconcat is the real way to improve efficiency.
>

Of course it is but improving concat will have more immediate results.

> > If I have time (about 30 minutes should do it) I will try and put
> > together an implementation of this idea.
>
> For all the finger huffing and puffing on this topic, we could have
> definitely implemented a whole new elist command...
>

It is better that we get it right than get it quick. (One feature in our
device driver was added in a week (not by me !!) and we have been fixing
it for the last 6 years).

Alexandre Ferrieux

May 21, 1999, 3:00:00 AM
to

So... Do you volunteer to write *the* lconcat ?

-Alex

Donal K. Fellows

May 21, 1999, 3:00:00 AM
to
In article <KIR.99Ma...@Gauss.iitb.fhg.de>,
Harald Kirsch <k...@iitb.fhg.de> wrote:
> Try
>
> set cmd1 "puts one ;# "
> set cmd2 "puts two"
> set cmd [concat $cmd1 \; $cmd2]
> eval $cmd
>
> to see that it does not work.

$cmd1 is not a well-formatted list and is thus outside the sphere of
reference. Well-formatted lists have no unquoted metacharacters at
all. Ever. Anything with a list-rep must be a well-formatted list.

> What is the GENERAL way to concat two (or n) commands such that
> whenever consecutive evaluation with eval is possible, also the
> concatenation can be evaluated, with the same overall result, if
> possible. (The last condition may require that the single commands do
> not to contain (parts of) control structures.)

The requirement is that:
eval [list $word(0,0) $word(0,1) ... $word(0,n0)] \
[list $word(1,0) $word(1,1) ... $word(1,n1)] \
[list $word(2,0) $word(2,1) ... $word(2,n2)]
be the same as:
eval [concat [list $word(0,0) $word(0,1) ... $word(0,n0)] \
[list $word(1,0) $word(1,1) ... $word(1,n1)] \
[list $word(2,0) $word(2,1) ... $word(2,n2)]]
be the same as:
eval [list $word(0,0) $word(0,1) ... $word(0,n0) \
$word(1,0) $word(1,1) ... $word(1,n1) \
$word(2,0) $word(2,1) ... $word(2,n2)]

Note that none of this makes any reference whatsoever to
metacharacters of any form. This is because they are never ever
unquoted in well-formed lists.

The general way to concatenate two *complete* commands is to use
something like [join [list $cmd1 $cmd2] "\n"] Note that joining
incomplete commands is very difficult. Or maybe harder...

> Also look at
>
> set cmd1 "puts one ;# {"
> set cmd2 "puts two"
> set cmd [concat $cmd1 \;\n\; $cmd2]
> eval $cmd
>
> Is that the way to go or can it be tricked to behave wrong?

Yet again, $cmd1 is not a well-formatted list and is thus outside the
sphere of reference. Your adding a nasty comment is just being evil
for gratuitousness's sake. (The newline separator will work, BTW.
But that has nothing to do with lists...)

Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
-- The small advantage of not having California being part of my country would
be overweighed by having California as a heavily-armed rabid weasel on our
borders. -- David Parsons <o r c @ p e l l . p o r t l a n d . o r . u s>

Donal K. Fellows

May 24, 1999, 3:00:00 AM
to
In article <374430...@cnet.francetelecom.fr>,
Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> wrote:
> Ooops. Right; what I meant was:
>
> set cmd1 [list puts a]
> set cmd2 [list puts b]
> set nl "\n"
> set ign [lindex $nl 0]
> eval $cmd1 $nl $cmd2
>
> Okay ?

OK, but that means that this code needs to be documented in the
changelog as being "POTENTIAL INCOMPATIBILITY" and not that the code
itself is actually bad, since the above is not something that normal
code will do. It may break very weird scripts, but it will instead
make everyone else's code go faster.

(Unless you can somehow come up with a way of making the contents of
$nl into a list on the sly. And that really depends on the degree of
constant sharing implemented by the compiler. I don't know...)

Donal K. Fellows

May 24, 1999, 3:00:00 AM
to
In article <3743D1...@cnet.francetelecom.fr>,
Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> wrote:
> IOW, keeping in mind that [concat] lurks behind [eval/uplevel], you
> support this case:
>
> eval $cmd $args
>
> but not this case:
>
> eval $cmd1 "\n" $cmd2
>
> I agree that they apply in fact to very different situations (I have
> never made out why the second use was here; this kind of semantic
> redundancy (with vanilla string concatenation) stinks of Perl...)

Umm. You are actually wrong, Alex. Yes, [concat] does lurk behind
[eval] but you have to remember that [concat] always stripped
whitespace before concatenating the rest, and newlines *are*
whitespace.

% concat "a b " "\n" " c d"
a b c d
% eval "puts a" "\n" "puts b"
bad argument "b": should be "nonewline"

IOW, Jeff's patch really is behaviour preserving! It's just that a
lot of people got the behaviour wrong (me included...)
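
The trimming behaviour in the session above can be modelled with a few lines of C. This is only a sketch under stated assumptions: toy_concat is a hypothetical name, not the real Tcl_Concat, and any special cases the real implementation handles are ignored. Each argument has its leading and trailing whitespace stripped, all-whitespace arguments vanish, and the survivors are joined with single spaces.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of string-domain concat; NOT the real Tcl_Concat. */
char *toy_concat(int argc, const char *argv[]) {
    size_t total = 1;
    for (int i = 0; i < argc; i++) {
        total += strlen(argv[i]) + 1;   /* room for each arg plus a separator */
    }
    char *result = malloc(total);
    char *p = result;
    for (int i = 0; i < argc; i++) {
        const char *start = argv[i];
        const char *end = start + strlen(start);
        /* Strip leading and trailing whitespace; newlines count too. */
        while (start < end && isspace((unsigned char) *start)) start++;
        while (end > start && isspace((unsigned char) end[-1])) end--;
        if (start == end) {
            continue;                   /* an all-whitespace argument vanishes */
        }
        if (p != result) {
            *p++ = ' ';                 /* single space between survivors */
        }
        memcpy(p, start, (size_t) (end - start));
        p += end - start;
    }
    *p = '\0';
    return result;
}
```

Called with "a b ", "\n" and " c d" this model yields "a b c d". Whitespace inside an argument is never touched; only the edges are trimmed.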

Paul Duffin

May 24, 1999, 3:00:00 AM
to
Donal K. Fellows wrote:
>
> In article <KIR.99Ma...@Gauss.iitb.fhg.de>,
> Harald Kirsch <k...@iitb.fhg.de> wrote:
> > Try
> >
> > set cmd1 "puts one ;# "
> > set cmd2 "puts two"
> > set cmd [concat $cmd1 \; $cmd2]
> > eval $cmd
> >
> > to see that it does not work.
>
> $cmd1 is not a well-formatted list and is thus outside the sphere of
> reference. Well-formatted lists have no unquoted metacharacters at
> all. Ever. Anything with a list-rep must be a well-formatted list.
>

As the only metacharacter in $cmd1 that I can see is ";" I presume that
is what you are talking about. Your statement about metacharacters is
not true about ; as the following code illustrates.

set a "; ; ;"
lindex $a 0

The last statement returns a ";" not an error therefore a is a well
formed list. You cannot however construct a list which contains an
unquoted single semicolon.

set a ";"
lappend b $a $a

The last statement returns {;} {;}. I do not know why ; should be
treated in this way as it is not a list meta character but is a command
meta character.

> > What is the GENERAL way to concat two (or n) commands such that
> > whenever consecutive evaluation with eval is possible, also the
> > concatenation can be evaluated, with the same overall result, if
> > possible. (The last condition may require that the single commands do
> > not to contain (parts of) control structures.)
>
> The requirement is that:
> eval [list $word(0,0) $word(0,1) ... $word(0,n0)] \
> [list $word(1,0) $word(1,1) ... $word(1,n1)] \
> [list $word(2,0) $word(2,1) ... $word(2,n2)]
> be the same as:
> eval [concat [list $word(0,0) $word(0,1) ... $word(0,n0)] \
> [list $word(1,0) $word(1,1) ... $word(1,n1)] \
> [list $word(2,0) $word(2,1) ... $word(2,n2)]]
> be the same as:
> eval [list $word(0,0) $word(0,1) ... $word(0,n0) \
> $word(1,0) $word(1,1) ... $word(1,n1) \
> $word(2,0) $word(2,1) ... $word(2,n2)]
>

As illustrated above the possible string representations (which is what
eval uses) of constructed lists are a subset of the possible string
representations of well formed lists (as defined by the conversion from
string to list defined in SetListFromAny and invoked by the use of
lindex on a string). This means that the above statements do not
describe the requirements.

> Note that none of this makes any reference whatsoever to
> metacharacters of any form. This is because they are never ever
> unquoted in well-formed lists.
>
> The general way to concatenate two *complete* commands is to use
> something like [join [list $cmd1 $cmd2] "\n"] Note that joining
> incomplete commands is very difficult. Or maybe harder...
>

The following statements are identical in behaviour

set a [join [list $cmd1 $cmd2] "\n"]
set a "$cmd1\n$cmd2"

and almost the same as the following which does remove some unneeded
white space.

set a [concat $cmd1 "\n" $cmd2]

> > Also look at
> >
> > set cmd1 "puts one ;# {"
> > set cmd2 "puts two"
> > set cmd [concat $cmd1 \;\n\; $cmd2]
> > eval $cmd
> >
> > Is that the way to go or can it be tricked to behave wrong?
>
> Yet again, $cmd1 is not a well-formatted list and is thus outside the
> sphere of reference. Your adding a nasty comment is just being evil
> for gratuitousness's sake. (The newline separator will work, BTW.
> But that has nothing to do with lists...)
>

Neither does [concat] ...

Paul Duffin

May 24, 1999, 3:00:00 AM
to
Alexandre Ferrieux wrote:
>
> Paul Duffin wrote:
> >
> > Alexandre Ferrieux wrote:
> > >
> > > Now if you transform [concat] in what I
> > > called [lconcat], you will clearly lose those \n.
> >
> > Actually Alex it is possible. What you have to do is concatenate the
> > string reps together and use that as the string representation of the
> > new list object, then you convert all objects to lists and concatenate
> > the internal representation together. If any of the objects cannot be
> > converted to a list then you can't create a list object so you simply
> > return a string object.
>
> Paul, this is very clever !!!
> This 'parallel concatenation' of both repr, if done only lazily (i.e.
> only when the list repr of all incoming args have already been
> computed), seems to bring the best of both worlds !
>
> Hmmm... Wait a minute: it still costs in memory, the size of a resulting
> big string that may be unwanted after all. OTOH, I agree that it will
> only happen when all the incoming parameters' string repr are also still
> around, meaning that the program is not pure-lists after all. Okay :)
>
> Jeff or Paul, the patch ?
>

I have just thought of another solution which gives you optimal
performance from [concat] whether or not you end up using the
result as a string or a list.

What happens is that [concat] simply creates a new object of type
"concat" which contains an array of pointers to the objects passed
to concat. At this point neither the string representation, nor the
list representation has been calculated.

If the result is used as a list then the concat type is converted to
a list using the objects stored in the concat structure.

If the result is used as a string then the string representation is
updated from the string representations of the objects.

If either of the formats is never used then that representation
is never calculated.

I have tested this and it is just as fast as concat when playing with
strings and nearly as fast as lconcat when playing with lists.

There is one problem with this at the moment as it stands and that is
if you use it as both a list and a string in that order then the
conversion to list from concat loses the white space information which
is needed when treating it as a string. The solution is to let all
list functions work with concat objects as well.

This latter requirement is exactly the sort of thing that Feather could
support with its interface mechanism.
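
As a rough illustration of the deferred "concat" object described above, here is a simplified C sketch. All names (ToyList, ToyConcat, toy_concat_*) are hypothetical and nothing here is the real Tcl object system. To stay short, each argument is modelled as an array of element strings that need no quoting, so the string form is rebuilt from elements rather than from the arguments' own string reps; the whitespace problem mentioned above is therefore not modelled.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model: each argument is a borrowed array of element strings. */
typedef struct {
    int elemc;
    const char **elemv;
} ToyList;

typedef struct {
    int partc;
    const ToyList *partv;  /* the arguments handed to concat, untouched */
    char *bytes;           /* string form, built only on first request */
    const char **elemv;    /* flat list form, built only on first request */
    int elemc;
} ToyConcat;

/* Creating the object does no string or list work at all. */
ToyConcat toy_concat_new(int partc, const ToyList *partv) {
    ToyConcat c = { partc, partv, NULL, NULL, 0 };
    return c;
}

/* Used as a list: flatten the element pointers, no string work. */
const char **toy_concat_elems(ToyConcat *c, int *elemcPtr) {
    if (c->elemv == NULL) {
        int n = 0, k = 0;
        for (int i = 0; i < c->partc; i++) n += c->partv[i].elemc;
        c->elemv = malloc((n ? n : 1) * sizeof *c->elemv);
        for (int i = 0; i < c->partc; i++)
            for (int j = 0; j < c->partv[i].elemc; j++)
                c->elemv[k++] = c->partv[i].elemv[j];
        c->elemc = n;
    }
    *elemcPtr = c->elemc;
    return c->elemv;
}

/* Used as a string: join on demand with single spaces. */
const char *toy_concat_string(ToyConcat *c) {
    if (c->bytes == NULL) {
        size_t len = 1;
        int first = 1;
        for (int i = 0; i < c->partc; i++)
            for (int j = 0; j < c->partv[i].elemc; j++)
                len += strlen(c->partv[i].elemv[j]) + 1;
        c->bytes = malloc(len);
        c->bytes[0] = '\0';
        for (int i = 0; i < c->partc; i++)
            for (int j = 0; j < c->partv[i].elemc; j++) {
                if (!first) strcat(c->bytes, " ");
                strcat(c->bytes, c->partv[i].elemv[j]);
                first = 0;
            }
    }
    return c->bytes;
}
```

The point of the design is that a caller who only iterates over the elements never pays for the string, and a caller who only evaluates the string never pays for the flattened array.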

Jeffrey Hobbs

May 25, 1999, 3:00:00 AM
to Alexandre Ferrieux

Um... shhhh, don't tell anyone, but I already did. I'm working on a
prototype elist. The full functionality is still not finished though.

However, after all the ranting and raving, I think the patch I originally
proposed, along with the patch to the L vs LS problem you pointed out
(checking || on bytes != NULL) should still actually be considered for
concat. While only adding a few cycles, it guarantees better behavior
on pure list input. I still think Paul's more complex L and LS handling,
while clever, isn't worth the cycle burn (you end up with everything LS).
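
For reference, the combined test (list type, and no string rep) might look roughly like the sketch below. It uses toy stand-in types so that it compiles on its own; ToyType, ToyObj and toy_all_pure_lists are hypothetical names, whereas the real patch would test objv[i]->typePtr and objv[i]->bytes on Tcl_Obj.

```c
#include <stddef.h>

/* Toy stand-ins only; NOT the real Tcl_Obj or tclListType. */
typedef enum { TOY_NONE, TOY_LIST } ToyType;

typedef struct {
    ToyType type;       /* type of the internal rep, if any */
    const char *bytes;  /* string rep, or NULL if there is none */
} ToyObj;

/*
 * Returns 1 only if every argument is "pure L": it has a list
 * internal rep and no string rep. A single LS or S argument makes
 * the caller fall back to the ordinary string concat.
 */
int toy_all_pure_lists(int objc, ToyObj *const objv[]) {
    int i;
    for (i = 0; i < objc; i++) {
        if (objv[i]->type != TOY_LIST || objv[i]->bytes != NULL) {
            break;
        }
    }
    return i == objc;
}
```

With this test, L L L takes the list path, while L LS L and L S both fall through to the string path, so a surviving original string rep can never be silently replaced.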


Alexandre Ferrieux

May 25, 1999, 3:00:00 AM
to
Donal K. Fellows wrote:
>
> In article <3743D1...@cnet.francetelecom.fr>,
> Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> wrote:
> > IOW, keeping in mind that [concat] lurks behind [eval/uplevel], you
> > support this case:
> >
> > eval $cmd $args
> >
> > but not this case:
> >
> > eval $cmd1 "\n" $cmd2
> >
> > I agree that they apply in fact to very different situations (I have
> > never made out why the second use was here; this kind of semantic
> > redundancy (with vanilla string concatenation) stinks of Perl...)
>
> Umm. You are actually wrong, Alex. Yes, [concat] does lurk behind
> [eval] but you have to remember that [concat] always stripped
> whitespace before concatenating the rest, and newlines *are*
> whitespace.
>
> % concat "a b " "\n" " c d"
> a b c d
> % eval "puts a" "\n" "puts b"
> bad argument "b": should be "nonewline"

Okay, I oversimplified the case, but still

% concat "a b" "c\nd" "e f"
a b c
d e f

> IOW, Jeff's patch really is behaviour preserving!

Nope, see above :)
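To make the difference concrete, here is a rough model in C of the two behaviours, restricted to plain words (no braces, quotes or backslashes, which real Tcl lists must also handle). The function names are made up for this sketch and are not Tcl API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Append n bytes of src to the NUL-terminated buffer out (capacity cap). */
static void append(char *out, size_t cap, const char *src, size_t n) {
    size_t len = strlen(out);
    assert(len + n < cap);  /* sketch: the caller supplies a big enough buffer */
    memcpy(out + len, src, n);
    out[len + n] = '\0';
}

/* String-domain concat: trim each argument, drop empty ones, join with
 * one space. Whitespace inside an argument is copied untouched. */
void concat_string_domain(int argc, const char *const argv[],
                          char *out, size_t cap) {
    out[0] = '\0';
    for (int i = 0; i < argc; i++) {
        const char *s = argv[i];
        size_t n = strlen(s);
        while (n > 0 && isspace((unsigned char)s[0])) { s++; n--; }
        while (n > 0 && isspace((unsigned char)s[n - 1])) { n--; }
        if (n == 0) continue;
        if (out[0] != '\0') append(out, cap, " ", 1);
        append(out, cap, s, n);
    }
}

/* List-domain concat for plain words: split every argument into elements
 * and re-join them with single spaces, as a regenerated list string would. */
void concat_list_domain(int argc, const char *const argv[],
                        char *out, size_t cap) {
    out[0] = '\0';
    for (int i = 0; i < argc; i++) {
        const char *s = argv[i];
        while (*s != '\0') {
            while (isspace((unsigned char)*s)) s++;
            const char *start = s;
            while (*s != '\0' && !isspace((unsigned char)*s)) s++;
            if (s > start) {
                if (out[0] != '\0') append(out, cap, " ", 1);
                append(out, cap, start, (size_t)(s - start));
            }
        }
    }
}
```

Fed the three arguments "a b", "c\nd" and "e f", the first function keeps the newline between c and d while the second flattens it to a space, which is the information a pure-L shortcut would lose when one argument carries a meaningful string.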

BTW, Jeff and I now agree about the semantic slip of his 'pure L'
version. The remaining open options are:

- stick to an unmodified [concat], and add a maximally fast
[lconcat] (obvious implementation)

- modify [concat] in a more subtle way, like Paul suggested, so
that the "pure L" case is nearly as fast as lconcat (only
overhead: testing for the case !).

Now after sleeping it over, I think the second gets my vote because the
L L L LS L L L case should be statistically dominated by pure-L cases
(and the LS is likely to come from a debugging puts, and debugging slows
down anyway!), and hence the patch would mean a perf gain for *many*
(maybe >99%?) existing uses of [concat], at only a slight perf cost wrt
pure [lconcat]. Thanks Paul for driving this insight into my mind.

-Alex

Alexandre Ferrieux

May 25, 1999, 3:00:00 AM

I agree in spirit (aim for a [concat] that flies with pure-Ls).
But how specifically do you intend to handle the L L LS L L case ?
Unless I miss something obvious, it seems that as soon as one S is there,
all the other Ses must be computed, simply because when facing an LS,
there is no way to distinguish an S that comes *from* the L, from an S
from which the L derives; of course we want to preserve only the second
variety, since it's the only one that actually contains more info than
the list. Short of a way to detect the variety at hand, the semantics
command us to stringify the N-1 others too, and parallel-concat them...
Tough !

-Alex

Jeffrey Hobbs

May 25, 1999, 3:00:00 AM
to Alexandre Ferrieux

True, that is exactly the problem. My solution works quickly and
effectively in the pure-L case, and just leaves it to the original in a
mixed case. That would mean the L would have to be computed by some later
use. Paul proposed maintaining a full mixed case scenario, where concat
always puts out LS (but all objs become LS in the interim). I just don't
see the end advantage in it, as opposed to enabling concat for pure-L
(which maintains 100% semantic integrity at the Tcl level) and
instituting an lconcat later (which would force everything into the L
domain, without regard for S).

In working through code here, I found that the pure-L case actually gets
hit quite often in real scripts. It doesn't show up at all interactively,
because each time the value of a list is returned to you, the string is
computed.


Alexandre Ferrieux

May 25, 1999, 3:00:00 AM

Unfortunately by private e-mail Donal has just told me (us) that
*constant* args actually are always LS (or at least S), and never ever
become "pure" L. So this nullifies the progress made on the pure-L front
in cases like:

eval somecmd $args ;# "somecmd" is born S, and possibly ages as LS,
                   ;# but never L...

However there may be some light, though it means maybe harder core-patch
work: Paul's (other) idea of a 'causality' flag in all Tcl_Objs, that
says which repr comes from the other (IOW which has more info), in our
case L>S or L<S.

Come to think of it, to help us in the constant case above, we'd need
slightly more: namely, whether the S really has more info (and hence
cannot be regenerated as is by the L). Ouch - this is becoming ugly; in
the case above, even if for some reason the constant arg becomes LS
(actually L<S), in fact we know that it is "L=S", meaning that the two
repr have an exact same amount of info, hence we can (in our case) do as
if it were L, the favourable case. On the other hand, if it were

eval "somenoargcmd\n" $args ;# Why would I want to do that ? Shhh...

then obviously it would be a strict L<S, hence forcing an LS conversion
to all others. The problem is the cost of the test:

"Is this L<S actually an L=S ?"

Ideas ?

-Alex

Donal K. Fellows

May 25, 1999, 3:00:00 AM
In article <374A85...@cnet.francetelecom.fr>,

Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> wrote:
> I agree in spirit (aim for a [concat] that flies with pure-Ls).
> But how specifically do you intend to handle the L L LS L L case ?

Degrade to the standard behaviour, I imagine. Nothing else is
semantically-preserving. That's why you have [lconcat] as well.

Remember, generating S from L can be *very* expensive indeed. Any
reasonable chance to avoid it is a good thing (and this is a pretty
good technique, as it doesn't increase the big-O of the command and is
actually a pretty fast thing to check for.)

Donal K. Fellows

May 26, 1999, 3:00:00 AM
In article <374AB3...@cnet.francetelecom.fr>,

Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> wrote:
> Come to think of it, to help us in the constant case above, we'd need
> slightly more: namely, whether the S really has more info (and hence
> cannot be regenerated as is by the L). Ouch - this is becoming ugly; in
> the case above, even if for some reason the constant arg becomes LS
> (actually L<S), in fact we know that it is "L=S", meaning that the two
> repr have an exact same amount of info, hence we can (in our case) do as
> if it were L, the favourable case. On the other hand, if it were
>
> eval "somenoargcmd\n" $args ;# Why would I want to do that ? Shhh...
>
> then obviously it would be a strict L<S, hence forcing an LS conversion
> to all others. The problem is the cost of the test:
>
> "Is this L<S actually an L=S ?"
>
> Ideas ?

Don't change [concat] at all, but just its documentation. Introduce
[lconcat] as a strict list-domain operation. Make [eval] deal with
single list-only arguments efficiently (through Tcl_EvalObjv) -
something that is now a significant gain due to the introduction of
[lconcat].

This scheme has the advantage of breaking nothing in the core, and
maintaining the current semantics of the object system. It is also
easier to code, document, test, and maintain than anything more
complex, especially since all new behaviour is encapsulated within a
new command or utterly (100%) safe.

Paul Duffin

May 26, 1999, 3:00:00 AM
Donal K. Fellows wrote:
>
> In article <374A85...@cnet.francetelecom.fr>,

> Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> wrote:
> > I agree in spirit (aim for a [concat] that flies with pure-Ls).
> > But how specifically do you intend to handle the L L LS L L case ?
>
> Degrade to the standard behaviour, I imagine. Nothing else is
> semantically-preserving. That's why you have [lconcat] as well.
>

My "concat" object type is semantically preserving ;-) and it is
actually quite easy to do thanks to Alex turning it on its head.

> Remember, generating S from L can be *very* expensive indeed. Any
> reasonable chance to avoid it is a good thing (and this is a pretty
> good technique, as it doesn't increase the big-O of the command and is
> actually a pretty fast thing to check for.)
>

True, generating S from L is not simply a matter of appending the
strings one after the other, it involves scanning every string
representation of the underlying objects to see what needs to be
quoted, creating enough space for it and then going back and doing
the quoting. It could be made faster if the UpdateStringOfList
function checked to see whether an element was a list first because it
would not need to scan it, it would simply have to wrap {} around it.

Donal K. Fellows

May 27, 1999, 3:00:00 AM
In article <374BBF...@mailserver.hursley.ibm.com>,

Paul Duffin <pdu...@mailserver.hursley.ibm.com> wrote:
> True, generating S from L is not simply a matter of appending the
> strings one after the other, it involves scanning every string
> representation of the underlying objects to see what needs to be
> quoted, creating enough space for it and then going back and doing
> the quoting. It could be made faster if the UpdateStringOfList
> function checked to see whether an element was a list first because it
> would not need to scan it, it would simply have to wrap {} around it.

Not that simple. With nested generated lists (yes, I have real code
in a real application that does this) conversion to strings is very
expensive, since it can mean the generation of strings for thousands
of separate Tcl_Objs. S->L (and L->S too) are in practise very
expensive operations which are best avoided if possible. It isn't
always possible, of course, but that doesn't mean you should go round
doing it gratuitously. :^)

Paul Duffin

May 27, 1999, 3:00:00 AM
Donal K. Fellows wrote:
>
> In article <374BBF...@mailserver.hursley.ibm.com>,
> Paul Duffin <pdu...@mailserver.hursley.ibm.com> wrote:
> > True, generating S from L is not simply a matter of appending the
> > strings one after the other, it involves scanning every string
> > representation of the underlying objects to see what needs to be
> > quoted, creating enough space for it and then going back and doing
> > the quoting. It could be made faster if the UpdateStringOfList
> > function checked to see whether an element was a list first because it
> > would not need to scan it, it would simply have to wrap {} around it.
>
> Not that simple. With nested generated lists (yes, I have real code

You are right, if you have a list with only one element and that element
does not start with a { then you do not have to wrap it at all,
otherwise you just wrap {} around the string representation. I have
actually got a patch which does this among other things.

> in a real application that does this) conversion to strings is very
> expensive, since it can mean the generation of strings for thousands
> of separate Tcl_Objs. S->L (and L->S too) are in practise very
> expensive operations which are best avoided if possible. It isn't
> always possible, of course, but that doesn't mean you should go round
> doing it gratuitously. :^)
>

True. If you have a hierarchy of lists (a tree) with N nodes in it, you
have to create N strings, and each string is the sum of all of the
strings in the children beneath it.
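A small model of that cost, as a sketch under simplifying assumptions: leaves are plain words, every sublist element is wrapped in {} (ignoring the single-element shortcut mentioned earlier), and no other quoting is needed. Node, Cost and stringify_cost are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tree node: a leaf word, or a list of child nodes. */
typedef struct Node {
    const char *word;                /* non-NULL for a leaf */
    const struct Node *const *kids;  /* children when word == NULL */
    int nkids;
} Node;

typedef struct Cost {
    long strings;  /* how many string reps were generated */
    long bytes;    /* total characters written across all of them */
} Cost;

/* Returns the length of the string rep of n, tallying the work in cost. */
size_t stringify_cost(const Node *n, Cost *cost) {
    size_t len = 0;
    assert(n != NULL && cost != NULL);
    if (n->word != NULL) {
        len = strlen(n->word);
    } else {
        for (int i = 0; i < n->nkids; i++) {
            const Node *kid = n->kids[i];
            size_t kidLen = stringify_cost(kid, cost);
            if (i > 0) len += 1;              /* separating space */
            len += kidLen;
            if (kid->word == NULL) len += 2;  /* {} around a sublist */
        }
    }
    cost->strings += 1;       /* one string per node, N in total */
    cost->bytes += (long)len; /* each level re-copies its children */
    return len;
}
```

For the tree {a b} {c d}, this counts 7 strings and 21 characters written to produce an 11-character result, because each leaf is copied once for every list that encloses it.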
