List package

12 views
Skip to first unread message

Robert Seeger

unread,
Mar 4, 1999, 3:00:00 AM3/4/99
to
Giventhe recent discussion of standard packages that may or may not be
included with Tcl, and how a list package would be one of the most
wanted, I'm working on putting together one for my own use. Of course,
if I can get it to work well, I'll release it to the public. Before I do
this, however, I'll like some input. I have a list of functions that I
think would be useful in a list package, which I've shown below. I'd
like to know what people think of these functions, and if I've missed
any that most people would want. Feel free to make any comments you
would like.

Also, keep in mind that these would all be in the ::list namespace, so
they won't pollute the global. Many of the items I have decided to use
are stolen from various other packages. If you see something that looks
like it was stolen from one of your ideas, I apologize. Actual credits
will be in whatever I release. This is just supposed to be a "what do
you think" type list. . .


> ## Procedures that modify the original list
>
> # Add item(s) to the end of the list
> append listVar item1 ?item2? . . .
>
> # Add item(s) to the beginning of the list
> preppend listVar item1 ?item2? . . .
>
> # Remove an item from the list. If -index is specified, then the element
> # is considered to be the index of the element of the list, and the item
> # located at that index is removed. Otherwise, it is considered to be a
> # string, and any occurances of that thring are removed
> remove ?-index? ?--? listVar element
>
> # Replace an item in the list with a new item. If -index is specified,
> # then the element is considered to be the index of the element of the
> # list, and the item at that index is replaced with the new element.
> # Otherwise, it is considered to be a string, and any occurances
> # of that thring are replaced with the new element
> replace ?-index? ?--? listVar element newelement
>
> # Insert an item into the list so that it will be located at the
> # specified index. If the index is 0, this is the same as preppend. If
> # the index is end, or greater than the value of the length of the list,
> # this is the same as append.
> insert listVar index element
>
> ## Procedures that do not modify the original list
>
> # Apply command to each item to the list. Return a list of the results
> apply list command
>
> # Return the reverse of the list
> reverse list
>
> # Remove duplicates from the list, and return the result
> unique list
>
> # Return a list containing all items in both lists, with duplicates removed
> union list1 list2
>
> # Return a list of items that are common to both lists
> interesct list1 list2
>
> # Return a list of items in the first list that are not in the second list
> subtract list1 list2
>
> # Return the list with words matching the pattern removed
> # mode can be -exact, -glob, or -regexp. Default is -glob.
> filter ?mode? list pattern
>
> # Return the list with words not matching the pattern removed.
> # mode can be -exact, -glob, or -regexp. Default is -glob.
> match ?mode? list pattern
>


Thanks for any help,
Rob Seeger
--
Whenever you're having a bad day, and it seems like everyone's
trying to piss you off, remember this - it takes 43 muscles to
frown, but only 4 to pull the trigger of a decent sniper rifle.
- Sorry, I stole this .sig, but I love it. . .

jeff_...@my-dejanews.com

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to rsee...@nycap.rr.com
In article <36DEBCA5...@nycap.rr.com>,

Robert Seeger <rsee...@nycap.rr.com> wrote:
> this, however, I'll like some input. I have a list of functions that I
> think would be useful in a list package, which I've shown below. I'd
....

> Also, keep in mind that these would all be in the ::list namespace, so

> > # Add item(s) to the end of the list


> > append listVar item1 ?item2? . . .

Hmmm, is it really a good idea to bother including what are already standard
commands? (this is lappend). We wouldn't want to confuse people thinking
that this is the one to use, which causes slower processing, which in turn
makes some think Tcl is slow... Of course, when it is just an alias, I don't
know how much slower it would really be.

> > # Add item(s) to the beginning of the list
> > preppend listVar item1 ?item2? . . .

That would be 'prepend', right?

> > # Remove an item from the list. If -index is specified, then the element
> > # is considered to be the index of the element of the list, and the item
> > # located at that index is removed. Otherwise, it is considered to be a
> > # string, and any occurances of that thring are removed
> > remove ?-index? ?--? listVar element

How about also allowing a -pattern switch, so element can be interp'd
as a regexp pattern?

jeff.hobbs @SPAM acm.org
jeffrey.hobbs @SPAM icn.siemens.de

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

Robert Seeger

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to

jeff_...@my-dejanews.com wrote:
>
> In article <36DEBCA5...@nycap.rr.com>,
> Robert Seeger <rsee...@nycap.rr.com> wrote:
> > this, however, I'll like some input. I have a list of functions that I
> > think would be useful in a list package, which I've shown below. I'd
> ....
> > Also, keep in mind that these would all be in the ::list namespace, so
>
> > > # Add item(s) to the end of the list
> > > append listVar item1 ?item2? . . .
>
> Hmmm, is it really a good idea to bother including what are already standard
> commands? (this is lappend). We wouldn't want to confuse people thinking
> that this is the one to use, which causes slower processing, which in turn
> makes some think Tcl is slow... Of course, when it is just an alias, I don't
> know how much slower it would really be.

This was really just thrown in there to go with prepend (for symmetry, I
guess). If you think it would confuse people, it's just as easily taken
out. I would tend to use it where speed did not matter, and it made the
code look cleaner to me. However, I can see where others might use it in
places that would cause slowdown.


> > > # Add item(s) to the beginning of the list
> > > preppend listVar item1 ?item2? . . .
>
> That would be 'prepend', right?

Heh. Good point. I hate when I make silly mistakes. ;-)


> > > # Remove an item from the list. If -index is specified, then the element
> > > # is considered to be the index of the element of the list, and the item
> > > # located at that index is removed. Otherwise, it is considered to be a
> > > # string, and any occurances of that thring are removed
> > > remove ?-index? ?--? listVar element
>
> How about also allowing a -pattern switch, so element can be interp'd
> as a regexp pattern?

My suggestion would be to use the filter procedure instead for something
like this. The remove proc, in my opinion, is intended for quick
removals of a element. However, as adding the ability to remove by
regexp (and glob?) would not add significant slowdown, I see no reason
not to add it.


> jeff.hobbs @SPAM acm.org
> jeffrey.hobbs @SPAM icn.siemens.de
>
> -----------== Posted via Deja News, The Discussion Network ==----------
> http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own


Thanks for the comments Jeff. It occurs to me, from your point about
append, and it being slow, that I should add one other thought. I was
hoping, at some point in the future, that the code for this package
would be converted to C code. That way, if people wanted to compile an
extention, they could get the C code, and gain the extra speed benefits.
If they did not with to do any compiling (people on Windows without
compilers), they could just use the pure Tcl version. Thoughts on this?


Thanks again,

Bob Techentin

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to
Robert Seeger wrote:
>
> Given the recent discussion of standard packages that may or may not be
> included with Tcl, ...

>
> Also, keep in mind that these would all be in the ::list namespace, so
> they won't pollute the global.

These goals seem orthogonal. If you want to create a complete
list package in the ::list namespace, then you should probably
include all functionality - even duplicating the core lrange, linsert
etc. Then you will have a complete package, and authors using your
package can be consistent.

If, on the other hand, you want to create a "standard" package,
which might some day be included with the core Tcl, then (IMHO)
your goal should be a set of functions complimentary to the core.
If you might some day rename ::list::unique to 'lunique', then you
probably should not include ::list::range with slightly different
semantics from 'lrange'.


> >
> > # Return a list containing all items in both lists, with duplicates removed
> > union list1 list2

I'm not clear on this. What happens if either list1 or
list2 contain duplicates?


> > # Return the list with words not matching the pattern removed.
> > # mode can be -exact, -glob, or -regexp. Default is -glob.
> > match ?mode? list pattern
> >

Perhaps a description that eliminates the double negative would
be clearer. Like "Returns a list of words matching the pattern."

You might also look at Andreas Kupries' list functions on
http://www.oche.de/~akupries/soft/pool/f878.htm which include
'select' and 'lengthOfLongestEntry' functions.

Bob

--
Bob Techentin techenti...@mayo.edu
Mayo Foundation (507) 284-2702
Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html

Robert Seeger

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to

Bob Techentin wrote:
>
> Robert Seeger wrote:
> >
> > Given the recent discussion of standard packages that may or may not be
> > included with Tcl, ...
> >
> > Also, keep in mind that these would all be in the ::list namespace, so
> > they won't pollute the global.
>
> These goals seem orthogonal. If you want to create a complete
> list package in the ::list namespace, then you should probably
> include all functionality - even duplicating the core lrange, linsert
> etc. Then you will have a complete package, and authors using your
> package can be consistent.
>
> If, on the other hand, you want to create a "standard" package,
> which might some day be included with the core Tcl, then (IMHO)
> your goal should be a set of functions complimentary to the core.
> If you might some day rename ::list::unique to 'lunique', then you
> probably should not include ::list::range with slightly different
> semantics from 'lrange'.

Personally, I hate the l<name> commands. I'd like all list commands to
be placed in either

1) a seperate namespace (::list)
2) a 'list <subcommand>' ensemble

Admittedly, both would break a large chunk of existing code. However,
any new list commands, I'd like to keep in a seperate namespace. It just
makes code look cleaner, and more understandable.

> > >
> > > # Return a list containing all items in both lists, with duplicates removed
> > > union list1 list2
>
> I'm not clear on this. What happens if either list1 or
> list2 contain duplicates?

Since it's a set-like operation, I'd be inclined to say that the
resulting list would have no duplicate entries. Thus

::list::union {1 1 2 3 3 4 5 5} {1 2 3 3 4 4}

would return {1 2 3 4}.


> > > # Return the list with words not matching the pattern removed.
> > > # mode can be -exact, -glob, or -regexp. Default is -glob.
> > > match ?mode? list pattern
> > >
>
> Perhaps a description that eliminates the double negative would
> be clearer. Like "Returns a list of words matching the pattern."

Good point. I have a rather 'ugly' way of thinking and, thus, stating my
thoughts.

> You might also look at Andreas Kupries' list functions on
> http://www.oche.de/~akupries/soft/pool/f878.htm which include
> 'select' and 'lengthOfLongestEntry' functions.

Actually, much of the ideas for what I can up with were taken from
Andreas. His Pool package is great. With both Pool and TclX, almost all
the bases are covered as far as list operations. Not all, but most. . .

> Bob
>
> --
> Bob Techentin techenti...@mayo.edu
> Mayo Foundation (507) 284-2702
> Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html

Thanks for the comments,

Bob Techentin

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to
Robert Seeger wrote:

> Personally, I hate the l<name> commands. I'd like all list commands to
> be placed in either
>
> 1) a seperate namespace (::list)
> 2) a 'list <subcommand>' ensemble
>
> Admittedly, both would break a large chunk of existing code.

Then I'd vote for ::list, which would not break
existing code. I would also suggest additional
functions 'list' (or create), 'length', 'search',
'index', and 'sort'.

> > > > # Return a list containing all items in both lists, with duplicates removed
> > > > union list1 list2
> >
> > I'm not clear on this. What happens if either list1 or
> > list2 contain duplicates?
>
> Since it's a set-like operation, I'd be inclined to say that the
> resulting list would have no duplicate entries.

Well, how about a separate package of ::set
functions, and perhaps even another for ::queue
operations?

Bruce S. O. Adams

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to

jeff_...@my-dejanews.com wrote:

> In article <36DEBCA5...@nycap.rr.com>,
> Robert Seeger <rsee...@nycap.rr.com> wrote:
> > this, however, I'll like some input. I have a list of functions that I
> > think would be useful in a list package, which I've shown below. I'd
> ....

> > Also, keep in mind that these would all be in the ::list namespace, so
>

> > > # Add item(s) to the end of the list
> > > append listVar item1 ?item2? . . .
>
> Hmmm, is it really a good idea to bother including what are already standard
> commands? (this is lappend). We wouldn't want to confuse people thinking
> that this is the one to use, which causes slower processing, which in turn
> makes some think Tcl is slow... Of course, when it is just an alias, I don't
> know how much slower it would really be.
>

Ha! I knew someone would be suckered the way I almost was. The essential
feature
is that the lists will be passed by reference instead of expanded in every
call. I had to
examine the idea a bit to notice this too :-)

>
> > > # Add item(s) to the beginning of the list
> > > preppend listVar item1 ?item2? . . .
>
> That would be 'prepend', right?
>

> > > # Remove an item from the list. If -index is specified, then the element
> > > # is considered to be the index of the element of the list, and the item
> > > # located at that index is removed. Otherwise, it is considered to be a
> > > # string, and any occurances of that thring are removed
> > > remove ?-index? ?--? listVar element
>
> How about also allowing a -pattern switch, so element can be interp'd
> as a regexp pattern?
>

> jeff.hobbs @SPAM acm.org
> jeffrey.hobbs @SPAM icn.siemens.de
>
> -----------== Posted via Deja News, The Discussion Network ==----------
> http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

Seconded but using the standard from of -regexp -glob -exact (and something for
regexp -nocase).
Regards,
Bruce A.


Bruce S. O. Adams

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to

Bob Techentin wrote:

> Robert Seeger wrote:
>
> > Personally, I hate the l<name> commands. I'd like all list commands to
> > be placed in either
> >
> > 1) a seperate namespace (::list)
> > 2) a 'list <subcommand>' ensemble
> >
> > Admittedly, both would break a large chunk of existing code.
>
> Then I'd vote for ::list, which would not break
> existing code. I would also suggest additional
> functions 'list' (or create), 'length', 'search',
> 'index', and 'sort'.
>
> > > > > # Return a list containing all items in both lists, with duplicates removed
> > > > > union list1 list2
> > >
> > > I'm not clear on this. What happens if either list1 or
> > > list2 contain duplicates?
> >
> > Since it's a set-like operation, I'd be inclined to say that the
> > resulting list would have no duplicate entries.
>
> Well, how about a separate package of ::set
> functions, and perhaps even another for ::queue
> operations?
>
> Bob
>

Seconded.


Bruce S. O. Adams

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to
> [oops]

> append, and it being slow, that I should add one other thought. I was
> hoping, at some point in the future, that the code for this package
> would be converted to C code. That way, if people wanted to compile an
> extention, they could get the C code, and gain the extra speed benefits.
> If they did not with to do any compiling (people on Windows without
> compilers), they could just use the pure Tcl version. Thoughts on this?
>
> Thanks again,
> Rob Seeger
> --
> Whenever you're having a bad day, and it seems like everyone's
> trying to piss you off, remember this - it takes 43 muscles to
> frown, but only 4 to pull the trigger of a decent sniper rifle.
> - Sorry, I stole this .sig, but I love it. . .

In my opinion, wherever possible all 'standard' packages should have both a pure
TCL and
and compiled version available. I would go with the TCL version as a 'prototype'
for
a C based version. Being able to pass lists by reference would inprove
performance on a whole
range of applications.
Regards,
Bruce A.

Andreas Kupries

unread,
Mar 5, 1999, 3:00:00 AM3/5/99
to
Robert Seeger <rsee...@nycap.rr.com> writes:

> Giventhe recent discussion of standard packages that may or may not be
> included with Tcl, and how a list package would be one of the most
> wanted, I'm working on putting together one for my own use. Of course,
> if I can get it to work well, I'll release it to the public. Before I do
> this, however, I'll like some input. I have a list of functions that I
> think would be useful in a list package, which I've shown below. I'd
> like to know what people think of these functions, and if I've missed
> any that most people would want. Feel free to make any comments you
> would like.


At
http://purl.org/thecliff/tcl/wiki/List

the readers of this newsgroup will find several pages containing
information about existing extensions, the list operations they
provide and a list :-) condensing that information into a set of
operations for a possible standard list extension.

As an example follows the comparison chart to be found at the given
location.

----------

Purpose: Comparison of available extensions with respect to the
provided list functionality.

Entries are prefixed with a shorthand for the extension containing the
command. Similar or equal functionality is grouped together wherever
possible.

* AT = AsserTcl /''done''/
* EL = ExtraL /''done''/
* JS = jstools /''done''/
* JT = jultaf /''done''/
* PB = Pool_Base /''done''/
* TclX = TclX /''done''/

----
(EL) oneof element list
(TclX) lcontain list element
Determine if the element is a list element of list. If the
element is contained in the list, 1 is returned, otherwise,
0 is returned.

(EL) lunion list list ...
returns the union of the lists

(TclX) union lista listb
Procedure to return the logical union of the two specified lists.
Any duplicate elements are removed.

(EL) lcommon list list ...
returns the common elements of the lists.

(TclX) intersect lista listb
Procedure to return the logical intersection of two lists.
The returned list will be sorted.

(EL) leor list1 list2
returns the elements that are not shared between both lists

(TclX) intersect3 lista listb
Procedure to intersects two lists, returning a list containing
three lists: The first list returned is everything in lista that
wasn't in listb. The second list contains the intersection of
the two lists, and the third list contains all the elements
that were in listb but weren't in lista. The returned lists
will be sorted.

(TclX) lempty list
Determine if the specified list is empty. If empty, 1 is
returned, otherwise, 0 is returned. This command is an
alternative to comparing a list to an empty string, however
it checks for a string of all whitespaces, which is an empty
list.
----
(JT) Juf::Sequence::assign LIST ?NAME ...?
Sets value of the variables specified by the NAME arguments to
that of the existing elements of LIST. Returns remaining list
elements. If the number of variables exceeds the list length,
the remaining variables will be removed.

(PB) ::pool::list::assign varList list
By Brent Welch. Assigns a set of variables from a list of values.
If there are more values than variables, they are ignored. If
there are fewer values than variables, the variables get the empty
string.

(TclX) lassign list var ?var...?
Assign successive elements of a list to specified variables.
If there are more variable names than fields, the remaining
variables are set to the empty string. If there are more
elements than variables, a list of the unassigned elements
is returned.

For example,

lassign {dave 100 200 {Dave Foo}} name uid gid longName

Assigns name to ``dave'', uid to ``100'', gid to ``200'', and
longName to ``Dave Foo''.
----
(EL) lfind mode list pattern
(TclX) lmatch ?mode? list pattern
Search the elements of list, returning a list of all elements
matching pattern. If none match, an empty list is returned.

The mode argument indicates how the elements of the list are
to be matched against pattern and it must have one of the
following values:

-exact The list element must contain exactly the same string
as pattern.

-glob Pattern is a glob-style pattern which is matched against
each list element using the same rules as the string
match command.

-regexp Pattern is treated as a regular expression and matched
against each list element using the same rules as the
regexp command.

If mode is omitted then it defaults to -glob.

(JT) Juf::Sequence::match PATTERN LIST
Returns a new list with all elements of LIST matching PATTERN
as with string match. == glob '''!'''

(PB) ::pool::list::match list pattern
All words not contained in list pattern are removed from list.
In set-notation: result = intersect (list, pattern). This is
not completely true, duplicate entries in 'list' remain in the
result, given that they appear at all.

(PB) ::pool::list::filter list pattern
All words contained in the list pattern are removed from list.
In set-notation: result = list - pattern. Returns the set
difference of list and pattern. '''Negative match'''.

----
(EL) lremdup ?-sorted? list ?var?
returns a list in which all duplactes are removed. with the
-sorted option the command will usually be a lot faster,
but $list must be sorted with lsort; The optional $var gives
the name of a variable in which the removed items will be stored.

(TclX) lrmdups list
Procedure to remove duplicate elements from a list. The returned
list will be sorted.

(PB) ::pool::list::uniq list
Removes duplicate entries from list. Returns the modified list.

----
(EL) laddnew listName ?item? ...
adds the items to the list if not already there

(JT) Juf::Sequence::append ?OPTION ...? NAME ?VALUE ...?
Works like the Tcl builtin lappend, but considers these options:

-nonempty Append only non-empty values.
-- Marks the end of the options. The argument
following this one will be treated as NAME
even if it starts with a -.

(TclX) lvarcat var string ?string...?
This command treats each string argument as a list and concatenates
them to the end of the contents of var, forming a a single list.
The list is stored back into var and also returned as the result.
if var does not exist, it is created.
----
(PB) ::pool::list::pop listVar
Removes the last element of the list contained in variable listVar.
Returns the last element of the list, or {} in case of an empty list.

(JT) Juf::Sequence::pop NAME ?COUNT?
Removes COUNT element from the end of the list stored in the
variable NAME and returns the last element removed. COUNT
defaults to 1.

(EL) lpop listName ?pos?
returns the last element from a list, thereby removing it from
the list. If pos is given it will return the pos element of the
list.

(JT) Juf::Sequence::shift NAME ?COUNT?
Removes COUNT element from the list stored in the variable NAME
and returns the last element removed. COUNT defaults to 1.

(EL) lshift listName
returns the first element from a list, thereby removing it from
the list.

(PB) ::pool::list::shift listVar
The list stored in the variable listVar is shifted down by one.
Returns the first element of the list stored in listVar, or {}
for an empty list. The latter is not a sure signal, as the list
may contain empty elements.

(PB) ::pool::list::remove listVar position
Removes the item at position from the list stored in variable
listVar.

(PB) ::pool::list::exchange listVar position newItem
Removes the item at position from the list stored in variable
listVar and inserts newItem in its place. Returns the changed list.

(TclX) lvarpop var ?indexExpr? ?string?
The lvarpop command pops (deletes) the element indexed by the
expression indexExpr from the list contained in the variable var.
If index is omitted, then 0 is assumed. If string, is specified,
then the deleted element is replaced by string. The replaced or
deleted element is returned. Thus ``lvarpop argv 0'' returns the
first element of argv, setting argv to contain the remainder of the
string.

If the expression indexExpr starts with the string end, then end
is replaced with the index of the last element in the list. If the
expression starts with len, then len is replaced with the length
of the list.
----
(PB) ::pool::list::push listvar args
The same as 'lappend', provided for symmetry only.

(EL) lpush listName ?item? ?position?
opposite of lpop.

(EL) lunshift listName ?item?
opposite of lshift: prepends ?item? to the list.

(PB) ::pool::list::prepend listVar newElement
(PB) ::pool::list::unshift listVar newElement
The list stored in the variable listVar is shifted up by one.
newElement is inserted afterward into the now open head position.

(TclX) lvarpush var string ?indexExpr?
The lvarpush command pushes (inserts) string as an element in the
list contained in the variable var. The element is inserted before
position indexExpr in the list. If index is omitted, then 0 is
assumed. If var does not exists, it is created.

If the expression indexExpr starts with the string end, then end
is replaced with the index of the last element in the list. If
the expression starts with len, then len is replaced with the
length of the list. Note the a value of end means insert the
string before the last element.
----
(PB) ::pool::list::head list
Returns the first element of list.

(PB) ::pool::list::last list
Returns the last element of list.

(PB) ::pool::list::prev list
Returns everything before the last element of list.

(PB) ::pool::list::tail list
Returns everything behind the first element of list.

----
(EL) remove listName ?item? ...
removes the items from the list

(PB) ::pool::list::delete listVar value
By Brent Welch. Deletes an item from the list stored in
listVar, by value. Returns 1 if the item was present, else 0.

----
(EL) lsub list ?-exclude? index_list
create a sublist from a set of indices. When -exclude is specified,
the elements of which the indexes are not in the list will be given.
eg.:

% lsub {Ape Ball Field {Antwerp city} Egg} {0 3}
Ape {Antwerp city}
% lsub {Ape Ball Field {Antwerp city} Egg} -exclude {0 3}
Ball Field Egg

(PB) ::pool::list::select list indices
Idea from a thread in c.l.t.
General permutation / selection of list elements. Takes the elements
of list whose indices were given to the command and returns a new list
containing them, in the specified order.

----

(EL) lcor
gives the positions of the elements in list in the reference list.
If an element is not found in the reference list, it returns -1.
Elements are matched only once. eg.:

% lcor {a b c d e f} {d b}
3 1
% lcor {a b c d e f} {b d d}
1 3 -1

(EL) llremove ?-sorted? list1 list2 ?var?
returns a list with all items in list1 that are not in list2. with
the -sorted option the command will usually be a lot faster, but
both given lists must be sorted with lsort; The optional $var give
the name of a variable in which the removed items will be stored.

(EL) lmerge ?list1? ?list2? ??spacing??
merges two lists into one eg.:

% lmerge {a b c} {1 2 3}
a 1 b 2 c 3
% lmerge {a b c d} {1 2} 2
a b 1 c d 2

(EL) lunmerge ?list? ?spacing? ?var?
unmerges items from a list to the result; the remaining items are
stored in the given variable ?var? eg.:

% lunmerge {a 1 b 2 c 3}
a b c
% lunmerge {a b 1 c d 2} 2 var
a b c d
% set var
1 2

(EL) lset listName ?item? ?indexlist?
sets all elements of the list at the given indices to value ?item?

(EL) larrayset array varlist valuelist
sets the values of valuelist to the respective elements in varlist
for the given array.

(EL) lregsub ?switches? exp list subSpec
does a regsub for each element in the list, and returns the resulting
list. eg.:

% lregsub {c$} {afdsg asdc sfgh {dfgh shgfc} dfhg} {!}
afdsg asd! sfgh {dfgh shgf!} dfhg
% lregsub {^([^.]+)\.([^.]+)$} {start.sh help.ps h.sh} {\2 \1}
{sh start} {ps help} {sh h}

(PB) ::pool::list::reverse list
Returns the reversed list.

(PB) ::pool::list::projection list column
Treats list as list of lists and extracts the column'th element of
each list item. If list is seen as matrix, then the procedure
returns the data of the specified column.

(PB) ::pool::list::apply cmd list
Applies cmd to all entries of list and concatenates the
individual results into a single list. The cmd must accept exactly
one argument, which will be the current element.

(PB) ::pool::list::lengthOfLongestEntry (list)
Determines the length of the longest entry contained in the list.

(AT) lall item_list list expr
List universal quantifier: evaluates expr for all items or
item sequences, and returns 1 if the expresson is true over
the whole list, and 0 otherwise

(AT) lexit item_list list expr
List existential quantifier: evaluates expr for all items or
item sequences, and returns 1 if the expresson is true for any
item in the list, and 0 otherwise

(JS) j:longest_match
find the longest common initial string in a list

--
Sincerely,
Andreas Kupries <a.ku...@westend.com>
<http://www.westend.com/~kupries/>
-------------------------------------------------------------------------------

Bob Techentin

unread,
Mar 8, 1999, 3:00:00 AM3/8/99
to
Andreas Kupries wrote about:
>
> http://purl.org/thecliff/tcl/wiki/List
>

Andreas,

This is a delightfully comprehensive list of functionality.
(Pun intended.)

I'm not sure, however, about one of the basic assumptions.
You (and others) have distinguished between list functions
that operate on lists "in place" or "by var name". I believe
the implication is that commands operating on lists-by-name
would be more efficient. For example, the command syntax for
lreplace is:

lreplace list first last ?element element ...?

where the list argument can be $myList, a literal string, or
the residual value of some function. The "new list" is returned
from the function.

A more efficient approach would be to refer to the list by
'varName', as is done for lappend. This way you don't have
to expand $myList, and you don't have to copy the entire
list. Especially for cases where you are changing the
list in place, like this:

set myList [lreplace $myList $index $index $newValue]

Yup. Would be much more efficient. Or would it?

Granted, most of what I know about the bytecode compiler is from
the "marketing literature," but isn't it possible that this
is already optimized? The Tcl_List*** functions all use Tcl_Objs,
and all the objects are reference counted. I can imagine an
implementation where the bytecode compiler does not expand
$myList, but simply passes the object reference to lreplace(),
and the new list consists mostly of reference copies of the
original list elements.

There would definitely be a loss of generality, as
you would need to assign literal lists and [return values]
to list variables before you could use these new list
package functions.

Is there a real, significant performance benefit to be had
here? My little test code shows that lreplace and lreplace with
copies of lists run at about the same speed, but if you slip in
an [eval] to force list evaluation, it runs 10x slower. It also
shows that building a list with 'lappend' is still much
faster than using concat.

Does anybody know how the compiler handles list functions?

Bob

--
Bob Techentin techenti...@mayo.edu
Mayo Foundation (507) 284-2702
Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html

set listLimit 1000

puts "building a list with lappend"
set myList ""
puts [time {
for {set i 0} {$i<$listLimit} {incr i} {
lappend myList "Thisisaninterestinglistelement:$i"
}
}]

puts "building a list with concat"
set myList ""
puts [time {
for {set i 0} {$i<$listLimit} {incr i} {
set myList [concat $myList "Thisisaninterestinglistelement:$i"]
}
}]


puts "basic lreplace"
puts [time {
for {set i 0} {$i<$listLimit} {incr i} {
set myList [lreplace $myList $i $i "newtext$i"]
}
}]

set myList ""
for {set i 0} {$i<$listLimit} {incr i} {
lappend myList "Thisisanotherinterestinglistelement:$i"
}

puts "lreplace with list copies"
puts [time {
for {set i 0} {$i<$listLimit} {incr i} {
set newList [lreplace $myList $i $i "newtext$i"]
set myList $newList
}
}]


set myList ""
for {set i 0} {$i<$listLimit} {incr i} {
lappend myList "Thisisanotherinterestinglistelement:$i"
}

puts "lreplace with eval list"
puts [time {
for {set i 0} {$i<$listLimit} {incr i} {
set myList [lreplace [eval list $myList] $i $i "newtext$i"]
}
}]

jeff_...@my-dejanews.com

unread,
Mar 8, 1999, 3:00:00 AM3/8/99
to techenti...@mayo.edu
In article <36E3E240...@mayo.edu>,

Bob Techentin <techenti...@mayo.edu> wrote:
> Andreas Kupries wrote about:
> > http://purl.org/thecliff/tcl/wiki/List

> A more efficient approach would be to refer to the list by


> 'varName', as is done for lappend. This way you don't have
> to expand $myList, and you don't have to copy the entire

It depends on the situation and the core commands that help your
extended list functionality. Mostly though, the differences are
a matter of convenience. Sometimes people want to change the
current list var, so setting it to an op on itself makes less
sense than just one op on the list:
## Version a
set myList [somereplace $myList 3 5]
## Version b
otherreplace myList 3 5

Minor cleanliness difference. However, it also depends on how you
use the result. In many situations, for example an lsort on a list
to foreach it, make life easier with:
foreach elem [lsort $list] { ... }
of course, if you return the list anyway, then
foreach elem [newsort list] { ... }
works just as good, if you wanted to change the list.

Andreas Kupries

unread,
Mar 8, 1999, 3:00:00 AM3/8/99
to

Bob Techentin <techenti...@mayo.edu> writes:

> > > I'm not clear on this. What happens if either list1 or
> > > list2 contain duplicates?

> > Since it's a set-like operation, I'd be inclined to say that the
> > resulting list would have no duplicate entries.

> Well, how about a separate package of ::set
> functions, and perhaps even another for ::queue
> operations?

I agree too, after some internal fighting with myself. My main
argument against was the duplication of some functionality, but is not
valid anymore the moment the transition to C is done, as we are then
able to use better/optimized representations (Tcl_Obj-Types) for the
different domains.

Oh, don't forget ::stack's.

Andreas Kupries

unread,
Mar 8, 1999, 3:00:00 AM3/8/99
to

Bob Techentin <techenti...@mayo.edu> writes:

> Andreas Kupries wrote about:
> >
> > http://purl.org/thecliff/tcl/wiki/List

> Andreas,

> This is a delightfully comprehensive list of functionality.
> (Pun intended.)

Many thanks for the praise. [*]

> I'm not sure, however, about one of the basic assumptions.
> You (and others) have distinguished between list functions
> that operate on lists "in place" or "by var name". I believe
> the implication is that commands operating on lists-by-name
> would be more efficient.

After the reading the rest of the text I believe that were was a
misunderstanding. When I say "in place" I mean "by varname", as I
don't have to expand the list, and to copy around. The modification is
done in the variable, in place.

> For example, the command syntax for
> lreplace is:

> lreplace list first last ?element element ...?

> where the list argument can be $myList, a literal string, or
> the residual value of some function. The "new list" is returned
> from the function.

> A more efficient approach would be to refer to the list by

> 'varName', as is done for lappend. This way you don't have
> to expand $myList, and you don't have to copy the entire

> list. Especially for cases where you are changing the
> list in place, like this:

> set myList [lreplace $myList $index $index $newValue]

> Yup. Would be much more efficient. Or would it?

> Granted, most of what I know about the bytecode compiler is from
> the "marketing literature," but isn't it possible that this
> is already optimized? The Tcl_List*** functions all use Tcl_Objs,
> and all the objects are reference counted. I can imagine an
> implementation where the bytecode compiler does not expand
> $myList, but simply passes the object reference to lreplace(),
> and the new list consists mostly of reference copies of the
> original list elements.

Thinking about it I would say that the argument of 'in place/by var'
vs. 'by value/copying around' mostly applies to version before 8.x, as
they really had to copy a string around while doing the various
manipulations, and 'in place/by var' was vastly more efficient.

It is my understanding that 8.x just copies Tcl_Obj-references, which
are much smaller, thus equalizing [x] both types of operation, at
least with respect to speed. With respect to memory, and depending on
the exact operation 'by value' may still need more (temporary) memory
than the 'by var/in place' variant. The affected (and copied) part is
the array of Tcl_Obj-references inside of list-objects. 'by var/in
place' allows direct manipulation of that array. Hm, in case of a
resize making the list bigger we still have to duplicate that array
for a short time (realloc does it). Well, I said that it is dependent
on the operation.

[x] Reading my explanations to your speed tests I should better say
that the difference in speed is certainly less than before, but still
existing.


> There would definitely be a loss of generality, as
> you would need to assign literal lists and [return values]
> to list variables before you could use these new list
> package functions.

Yes. Do you think that we should implement both variants ?



> Is there a real, significant performance benefit to be had
> here? My little test code shows that lreplace and lreplace with
> copies of lists run at about the same speed, but if you slip in
> an [eval] to force list evaluation, it runs 10x slower. It also

Which version ? I think the eval slows the system down so much because
it forces the system to:

* generate the string representation of the list,

* parse that back into arguments (basically a list!) and bytecode,

* and then executes the bytecode.


> shows that building a list with 'lappend' is still much
> faster than using concat.

Concat creates a new list-object, copies the pointers from the old
and returns its result. This is then used to replace the old Tcl_Obj.

Lappend just resizes the internal array of Tcl_Obj-pointers in the
list-object stored in the specified variable and then adds the new
objects. The resize may cause 'realloc' to copy the array, but it
happens much less often. And there is no overhead for the creation of
a new object.


> Does anybody know how the compiler handles list functions?

AFAIK list operations are not inlined.


[*] I hope that you start to contribute at that place too. Especially
comparing different implementations of a function with respect to
memory and speed is something that would fit in nicely.

Andreas Kupries

unread,
Mar 8, 1999, 3:00:00 AM3/8/99
to

Robert Seeger <rsee...@nycap.rr.com> writes:

> Personally, I hate the l<name> commands. I'd like all list commands to
> be placed in either

> 1) a seperate namespace (::list)
> 2) a 'list <subcommand>' ensemble

> Admittedly, both would break a large chunk of existing code. However,
> any new list commands, I'd like to keep in a seperate namespace. It just
> makes code look cleaner, and more understandable.

Not necessarily. Adding some aliases mapping the old names to the
correct new ones should fix that particular problem.

> Bob Techentin wrote:

> > You might also look at Andreas Kupries' list functions on
> > http://www.oche.de/~akupries/soft/pool/f878.htm which include
> > 'select' and 'lengthOfLongestEntry' functions.

> Actually, much of the ideas for what I can up with were taken from
> Andreas. His Pool package is great. With both Pool and TclX, almost all
> the bases are covered as far as list operations. Not all, but most. . .

Hey, there are people using it :-). Nice to know. And my thanks for
the recommendation.

Bob Techentin

unread,
Mar 9, 1999, 3:00:00 AM3/9/99
to
Oops. I didn't write what I meant.


> > You (and others) have distinguished between list functions
> > that operate on lists "in place" or "by var name".

I really did mean what you said. :-) A command which operates
on a list "in place" or "by varname" (like lappend) would require
the name of a list variable. A command which operates on a
list "by value" or by "copying around" would expand the list.

I think my previous post was a little confusing. Instead of
stating an thesis and disproving it, I'll try the direct
route.

<IMHO>

I do not believe you will get a significant performance
boost from "in place" lists. It looks to me like Tcl 8.x
bytecode compiler exploits Tcl objects and does not expand
the list variables. (I consider 3-4x significant)

I do believe that you lose a lot of generality with "in place"
lists. You MUST supply a list variable name. So you cannot
use "in place" list functions directly on hard-coded lists
or the return value from functions.

</IMHO>

> Do you think that we should implement both variants ?

Nope.

Paul Duffin

unread,
Mar 10, 1999, 3:00:00 AM3/10/99
to
Bob Techentin wrote:
>
> Oops. I didn't write what I meant.
>
> > > You (and others) have distinguished between list functions
> > > that operate on lists "in place" or "by var name".
>
> I really did mean what you said. :-) A command which operates
> on a list "in place" or "by varname" (like lappend) would require
> the name of a list variable. A command which operates on a
> list "by value" or by "copying around" would expand the list.
>
> I think my previous post was a little confusing. Instead of
> stating an thesis and disproving it, I'll try the direct
> route.
>
> <IMHO>
>
> I do not believe you will get a significant performance
> boost from "in place" lists. It looks to me like Tcl 8.x
> bytecode compiler exploits Tcl objects and does not expand
> the list variables. (I consider 3-4x significant)
>

Lets do some reference counting for current implementation of lreplace.

set a [list 1 2 3 4]
# 1 reference to list (from variable a).

set a [lreplace $a 1 1]
# While inside lreplace there are 2 references to list (1 from variable
# a and one from the argument to lreplace). Therefore the list (but
# not the elements) is duplicated.

# After setting the variable a to the new list and resetting the interp
# result there is 1 reference to the new list and 0 references to the
# old list so the old list is freed. Therefore the duplication was
# wasted.

And now lets do some for in-place implementation.

set a [list 1 2 3 4]
# 1 reference to list (from variable a).

lreplaceinplace a 1 1
# While inside lreplace there is still only 1 reference to the object
# from the variable so it can be modified directly without duplication.

So you do gain from using "in-place" but I agree that you need both.
What you need to come up with is a consistent naming scheme so it is
easy to see which command is in-place and which is not.

> I do believe that you lose a lot of generality with "in place"
> lists. You MUST supply a list variable name. So you cannot
> use "in place" list functions directly on hard-coded lists
> or the return value from functions.
>
> </IMHO>
>
> > Do you think that we should implement both variants ?
>
> Nope.
>

--
Paul Duffin
DT/6000 Development Email: pdu...@hursley.ibm.com
IBM UK Laboratories Ltd., Hursley Park nr. Winchester
Internal: 7-246880 International: +44 1962-816880

Bruce S. O. Adams

unread,
Mar 10, 1999, 3:00:00 AM3/10/99
to

Paul Duffin wrote:

[snip - sorry]

Surely a consistant 'naming' scheme for the variables at least, already exists
in the form of "$list" (for pass by value) and "list" (for pass by
reference)
if you'll excuse my use of the equivalent terms from C/C++.
Are we advocating two sets of methods, one of each addressing mode
(assembly speak for method of argument passing)? This kind of thing is useful
for writing optimised code but doesn't it tend to unecessarily complicate the
language?
Presumably when code optimisation is an issue the normal practice is to recode
the procedure / procedures in question in C. Do we really need the capability
to
write optimised pure TCL? If we do (of which I am not certain), I would
suggest
that such features be provided in a single coherent package which would provide

the following kinds of facilities:

address modes - pass by reference
pass by value
to be usable in all places in which an
unecessary
evaluation might otherwise occur

inline procs - expand the code in the calling procedure to
remove
the overhead of procedure calls

constant epressions - e.g. const set a [ 1 && 2 && 4 && 8 ]
evaluated once only by the
interpreter at the start rather than
when the set statement itself
is reached.

I am in two minds about this kind of thing. On the one hand, it looks like a
fun project. On the
other hand these kind of optimisations are exactly the kind of thing that a
really well implemented
byte-code compiler (how many TCL compiler projects are there out there? I know
of at least one - Ice)
should be able to spot for us, so we don't have to worry about them.
Regards,
Bruce A.

------------------------
Any good jobs going?
------------------------


Laurent Duperval

unread,
Mar 10, 1999, 3:00:00 AM3/10/99
to
In article <36E67042...@rmc-ltd.com>,

"Bruce S. O. Adams" <bruce...@rmc-ltd.com> wrote:
>
> Surely a consistant 'naming' scheme for the variables at least, already exists
> in the form of "$list" (for pass by value) and "list" (for pass by
> reference)
> if you'll excuse my use of the equivalent terms from C/C++.

Hmmm... But does the parser realize this? I don't think so. The parser does
its thing and then calls the appropriate procs to Do Something Clever(tm).
It's the procs on the receiveing end that know if the first argument is a
list or the name of a variable. How are you going to treat something like
this (which I use quite often):

set list [list a b c d]
set l $list
lreplace $l 1 foo ; # or whateever

L

--
Use The Penguin, Luke!

Laurent Duperval

Bruce S. O. Adams

unread,
Mar 10, 1999, 3:00:00 AM3/10/99
to

Laurent Duperval wrote:

> In article <36E67042...@rmc-ltd.com>,
> "Bruce S. O. Adams" <bruce...@rmc-ltd.com> wrote:
> >
> > Surely a consistant 'naming' scheme for the variables at least, already exists
> > in the form of "$list" (for pass by value) and "list" (for pass by
> > reference)
> > if you'll excuse my use of the equivalent terms from C/C++.
>
> Hmmm... But does the parser realize this? I don't think so. The parser does
> its thing and then calls the appropriate procs to Do Something Clever(tm).
> It's the procs on the receiveing end that know if the first argument is a
> list or the name of a variable. How are you going to treat something like
> this (which I use quite often):
>
> set list [list a b c d]
> set l $list

This is just an extra assignment. Which could be handled either by copying
the list object, making a reference to it (provided it doesn't change) or evaluating

the $list first.


>
> lreplace $l 1 foo ; # or whateever
>
> L
>

The question is when does substitution actually take place. If it is done by the
TCL 'parser' (I would only apply this term loosely, to something so 'simple') then
lots of things will happen inefficiently. If I were implementing things, I would
try
and leave evaluation as late as possible. i.e. variable substituation would only
occur immediately before the variable is to be used. This is why we need to use
eval when we have a single $var which doesn't necessarily expand to a single item.
e.g.

proc foo { a b c } {
puts "a=$a b=$b c=$c"
}

set args [ list $a $b $c ]

eval foo $args ---------> |exec| foo $a $b $c

foo $args ---------> wrong number of arguments error

I wonder, did things still work the 'logical' way before the byte compiler?
It is only a small step from expanding variables just in time, to not bothering
when it isn't necessary e.g. for lists. This is doubly true as the C side of TCL
already knows a lot about lists via the Tcl_ListObj API.
Regards,
Bruce A.


Bryan Oakley

unread,
Mar 10, 1999, 3:00:00 AM3/10/99
to
"Bruce S. O. Adams" wrote:
>
> Surely a consistant 'naming' scheme for the variables at least, already exists
> in the form of "$list" (for pass by value) and "list" (for pass by
> reference)

But any command will never see the distinction. Remember that the parser
does one pass and then sends the results to a command. So, for a
statement like "foo $bar", foo will never see the dollar sign.

Perhaps a simple solution is to create a single set of commands, but
provide a switch to designate a variable to operatore on:

list::append -variable someList blah

This would cause "blah" to be appended to a list variable named
"someList".

I'm not necessarily saying this is good. Just exploring the
possibilities.

--
Bryan Oakley mailto:oak...@channelpoint.com
ChannelPoint, Inc. http://purl.oclc.org/net/oakley

Laurent Duperval

unread,
Mar 10, 1999, 3:00:00 AM3/10/99
to
In article <36E6AFB2...@rmc-ltd.com>,

"Bruce S. O. Adams" <bruce...@rmc-ltd.com> wrote:
> The question is when does substitution actually take place. If it is done by
the
> TCL 'parser' (I would only apply this term loosely, to something so 'simple')
then
> lots of things will happen inefficiently. If I were implementing things, I
would
> try
> and leave evaluation as late as possible. i.e. variable substituation would
only
> occur immediately before the variable is to be used. This is why we need to
use
> eval when we have a single $var which doesn't necessarily expand to a single
item.
> e.g.
>
> proc foo { a b c } {
> puts "a=$a b=$b c=$c"
> }
>
> set args [ list $a $b $c ]
>

This creates 1 element, which is a list containing 3 elements. Any other
interpretation, IMO, is wrong.

So what if, instead, I have:

set args "a b c"

How would you expect

foo $a

to behave?

> eval foo $args ---------> |exec| foo $a $b $c
>
> foo $args ---------> wrong number of arguments error
>
> I wonder, did things still work the 'logical' way before the byte compiler?

If by logical, you mean that a list is one element instead of a number of
elements, I believe so, yes. But then, old Tcl hands could give a better
response, since I'm too lazy to go thru the CHANGES file to see for sure.

Bob Techentin

unread,
Mar 10, 1999, 3:00:00 AM3/10/99
to
Paul Duffin wrote:
>
> Bob Techentin wrote:
> >
> >
> > I do not believe you will get a significant performance
> > boost from "in place" lists. It looks to me like Tcl 8.x
> > bytecode compiler exploits Tcl objects and does not expand
> > the list variables. (I consider 3-4x significant)
> >
>
> Lets do some reference counting for current implementation of lreplace.
>

[snip]

I agree that you will probably save some reference counting.
But I'm still of the opinion that it won't make a big difference
in overall list command speed.

But I'm still not sure. I read the source, and lreplace does
in fact make an internal list object, and it increments and
decrements reference counts to the list elements.

On the way to figuring this out, I ran across an interesting little
tidbit in tclExecute.c. If you set the variable tcl_traceExec, you
can get the bytcode compiler to spit out a trace of the commands it
is executing. This trace convinced me that $myList does not
get expanded before calling lreplace. Try

set tcl_traceExec 3
proc compileThisAndRunIt { } {
set myList [list 1 2 3 4]
set myList [lreplace $myList 2 2]
}
compileThisAndRunIt


I guess the only real way to find out if there is any real
performance boost from "in place" list functions is to write
one. *sigh*

Bob

Paul Duffin

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to
Bryan Oakley wrote:
>
> "Bruce S. O. Adams" wrote:
> >
> > Surely a consistant 'naming' scheme for the variables at least, already exists
> > in the form of "$list" (for pass by value) and "list" (for pass by
> > reference)
>
> But any command will never see the distinction. Remember that the parser
> does one pass and then sends the results to a command. So, for a
> statement like "foo $bar", foo will never see the dollar sign.
>
> Perhaps a simple solution is to create a single set of commands, but
> provide a switch to designate a variable to operatore on:
>
> list::append -variable someList blah
>
> This would cause "blah" to be appended to a list variable named
> "someList".
>
> I'm not necessarily saying this is good. Just exploring the
> possibilities.
>

That is what I was talking about, a consistent way of specifying whether
the object to be modified is a variable or a list object. This could
either be done by having two different commands or one with different
options.

e.g.
list::append list fred # Would create the list {list fred}
list::append -variable list fred # Would append fred to the list in
# variable list.

or

list::append list fred
list::appendv list fred

Bruce S. O. Adams

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to

Laurent Duperval wrote:

As far as the interpreter goes, I guess it has to give args its string
interpretation by default
and reinterpret it as a list later. It depends on how it is implemented. It is
possible that, since
this kind of thing is used so frequently in defining lists, the list interpretation
is the default or perhaps
even both. Has anyone got a spec for the byte compiler side of things lying
around?
This does suggest to me however, that if you _know_ when you are coding that a
variable is
going to be a list, you should initialize it as one. i.e.

set myList [list]

is better than either of the following:

set myList ""
set myList {}

If this is true then it should go in the style guide. Anyone care to comment?

>
> > eval foo $args ---------> |exec| foo $a $b $c
> >
> > foo $args ---------> wrong number of arguments error
> >
> > I wonder, did things still work the 'logical' way before the byte compiler?
>
> If by logical, you mean that a list is one element instead of a number of
> elements, I believe so, yes. But then, old Tcl hands could give a better
> response, since I'm too lazy to go thru the CHANGES file to see for sure.
>
> L
>
> --
> Use The Penguin, Luke!
>

I've been itching to ask this for ages, Use it for what? (ruder responses withheld
:-)

>
> Laurent Duperval
>
> -----------== Posted via Deja News, The Discussion Network ==----------
> http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

Regards,
Bruce A.


Laurent Duperval

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to
In article <36E79BF9...@rmc-ltd.com>,

"Bruce S. O. Adams" <bruce...@rmc-ltd.com> wrote:
> > So what if, instead, I have:
> >
> > set args "a b c"
> >
> > How would you expect
> >
> > foo $a
> >
> > to behave?
> >
>
> As far as the interpreter goes, I guess it has to give args its string
> interpretation by default
> and reinterpret it as a list later.

So you're doing special-case interpreting, which is what the parser tries to
avoid as much as possible. Personnaly, I prefer it that way.

I've the feeling that you're so used to programming in C that you're trying to
recreate pointers in Tcl. Reminds me of the old days when I had constant
problems with scanf(). :-)


> set myList [list]
>
> is better than either of the following:
>
> set myList ""
> set myList {}
>

I understand the former but why the latter? You're calling a procedure for
basically nothing. It's like doing:

cat > file
^D

instead of

touch file

to create an empty file. The results are the smae but you're wasting time and
energy for nothing.


> > Use The Penguin, Luke!
> >
>
> I've been itching to ask this for ages, Use it for what? (ruder responses
withheld
> :-)
>

Depends on what your bag is. :-)

L

--
Use The Penguin, Luke!

Laurent Duperval

Laurent Duperval

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to
In article <36E78F...@mailserver.hursley.ibm.com>,

Paul Duffin <pdu...@mailserver.hursley.ibm.com> wrote:
> That is what I was talking about, a consistent way of specifying whether
> the object to be modified is a variable or a list object. This could
> either be done by having two different commands or one with different
> options.
>
> e.g.
> list::append list fred # Would create the list {list fred}
> list::append -variable list fred # Would append fred to the list in
> # variable list.
>

This is ok.

> or
>
> list::append list fred
> list::appendv list fred
>

UGh!

deletev, rangev, etc., ad nauseum.

Donal K. Fellows

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to
In article <36E65F...@mailserver.hursley.ibm.com>,

Paul Duffin <pdu...@mailserver.hursley.ibm.com> wrote:
> Lets do some reference counting for current implementation of lreplace.
>
> set a [list 1 2 3 4]
> # 1 reference to list (from variable a).
>
> set a [lreplace $a 1 1]
> # While inside lreplace there are 2 references to list (1 from variable
> # a and one from the argument to lreplace). Therefore the list (but
> # not the elements) is duplicated.
>
> # After setting the variable a to the new list and resetting the interp
> # result there is 1 reference to the new list and 0 references to the
> # old list so the old list is freed. Therefore the duplication was
> # wasted.
>
> And now lets do some for in-place implementation.
>
> set a [list 1 2 3 4]
> # 1 reference to list (from variable a).
>
> lreplaceinplace a 1 1
> # While inside lreplace there is still only 1 reference to the object
> # from the variable so it can be modified directly without duplication.
>
> So you do gain from using "in-place" but I agree that you need both.
> What you need to come up with is a consistent naming scheme so it is
> easy to see which command is in-place and which is not.

I thought a bit about this a while back (months ago, while in the
middle of one of those Tcl-performance discussions,) and came up with
the following idea:

Your assessment of the reference-counting of the behaviour of lreplace
is correct, and when you are dealing with long lists, this additional
copy operation can bite rather much. However, a more efficient
version is possible without the introduction of a new version of
[lreplace] that takes a variable name as an argument. All it takes is
the introduction of a [getunset] command whose semantics are as
follows:

The command [getunset foo] returns the contents of the variable
foo, and at the same time unsets/empties that variable.

Like that, we can substitute [lreplaceinplace a 1 1] with:

set a [lreplace [getunset a] 1 1]

The performance will be virtually identical. A similar effect can be
done by the use of the construct illustrated below (which is in some
ways equivalent to [getunset] in this context.) Those functional
programmers among you might recognise something similar to the K
combinator...

bash$ tclsh8.0
% proc second {a b} { set a }
% for {set i 0} {$i<100000} {incr i} {
lappend a $i; lappend b $i
}
% proc t {} {
global a b
puts [time {set a [lreplace $a 40 80]}]
puts [time {set b [lreplace [second $b [set b {}]] 40 80]}]
}
% t
44030 microseconds per iteration
2952 microseconds per iteration
% unset a b
% for {set i 0} {$i<100000} {incr i} {
lappend a $i; lappend b $i
}
% llength $a
100000
% llength $b
100000
% t
41253 microseconds per iteration
2789 microseconds per iteration
% llength $a
99959
% llength $b
99959

Note that the second technique (with the [second] procedure) is nearly
15 times faster than the traditional technique when working on a list
of 10^5 elements. This is purely due to the lack of a copy, since the
code itself is considerably more complex (including an extra procedure
call.) I'll happily admit to being surprised just how much faster
this technique is... :^)

Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
Department of Computer Science, University of Manchester, U.K. +44-161-275-6137
--
"And remember, evidence is nothing." - Stacy Strock <sp...@adisfwb.com>

Donal K. Fellows

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to
In article <7c8oub$1h4$1...@m1.cs.man.ac.uk>,

Donal K. Fellows <fell...@cs.man.ac.uk> wrote:
> % proc second {a b} { set a }

The pedants among you will have noticed that this procedure should
have been called "first" and not "second"...

Bruce S. O. Adams

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to

Laurent Duperval wrote:

> In article <36E79BF9...@rmc-ltd.com>,
> "Bruce S. O. Adams" <bruce...@rmc-ltd.com> wrote:
> > > So what if, instead, I have:
> > >
> > > set args "a b c"
> > >
> > > How would you expect
> > >
> > > foo $a
> > >
> > > to behave?
> > >
> >
> > As far as the interpreter goes, I guess it has to give args its string
> > interpretation by default
> > and reinterpret it as a list later.
>
> So you're doing special-case interpreting, which is what the parser tries to
> avoid as much as possible. Personnaly, I prefer it that way.
>

I am trying to guess how what is given to the byte-code virtual machine. Assuming

the set command creates a Tcl_Obj which internal representations are populated?
All those that are possible? just the string version read from the source file? or
does
it try to be as intelligent as possible (in my opinion a good idea, where it can
be done
reliably) and use the information available to guess (or rather know) which
interpretation
is likely to be used. Hence:

set foo [list]

Would mean that the Tcl_Obj representation of foo would be set up as a list,
should such
such a distinction be necessary. Whereas the situation for the quotes and braces
cases
is less clear.

>
> I've the feeling that you're so used to programming in C that you're trying to
> recreate pointers in Tcl. Reminds me of the old days when I had constant
> problems with scanf(). :-)
>

Pointer arithmetic tends to be a little suspect. I avoid it where possible, that
said I have
an assembly language & electronic engineering background (as well as AI) so I like
to
know how things work and use that knowledge to advantage where possible. :-)

>
> > set myList [list]
> >
> > is better than either of the following:
> >
> > set myList ""
> > set myList {}
> >
>
> I understand the former but why the latter? You're calling a procedure for
> basically nothing. It's like doing:
>
> cat > file
> ^D
>
> instead of
>
> touch file
>
> to create an empty file. The results are the smae but you're wasting time and
> energy for nothing.

This is the question, which someone who knows about the byte compiler will
have to answer.

>
> > > Use The Penguin, Luke!
> > >
> >
> > I've been itching to ask this for ages, Use it for what? (ruder responses
> withheld
> > :-)
> >
>
> Depends on what your bag is. :-)
>
> L
>

Penguin skin bags? I'll get the RSPCA on you :-)

BA

Bruce S. O. Adams

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to

Laurent Duperval wrote:

> In article <36E78F...@mailserver.hursley.ibm.com>,


> Paul Duffin <pdu...@mailserver.hursley.ibm.com> wrote:
> > That is what I was talking about, a consistent way of specifying whether
> > the object to be modified is a variable or a list object. This could
> > either be done by having two different commands or one with different
> > options.
> >
> > e.g.
> > list::append list fred # Would create the list {list fred}
> > list::append -variable list fred # Would append fred to the list in
> > # variable list.
> >
>
> This is ok.
>
> > or
> >
> > list::append list fred
> > list::appendv list fred
> >
>
> UGh!
>
> deletev, rangev, etc., ad nauseum.
>
> L
>
> --
> Use The Penguin, Luke!
>
> Laurent Duperval
>
> -----------== Posted via Deja News, The Discussion Network ==----------
> http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

Wouldn't this kind of thing be more appropriately done as an operator.
Of course we don't want to go too perly :-)


Bryan Oakley

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to
Paul Duffin wrote:
> That is what I was talking about, a consistent way of specifying whether
> the object to be modified is a variable or a list object. This could
> either be done by having two different commands or one with different
> options.
>
> e.g.
> list::append list fred # Would create the list {list fred}
> list::append -variable list fred # Would append fred to the list in
> # variable list.
>
> or
>
> list::append list fred
> list::appendv list fred

I much prefer the latter, if anyone is counting votes.

Bruce S. O. Adams

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to

"Donal K. Fellows" wrote:

> In article <36E65F...@mailserver.hursley.ibm.com>,

> % proc second {a b} { set a }

> Donal.
> --
> Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
> Department of Computer Science, University of Manchester, U.K. +44-161-275-6137
> --
> "And remember, evidence is nothing." - Stacy Strock <sp...@adisfwb.com>

This is very impressive. I would suggest it gets dumped in a FAQ somewhere
(perhaps under
a section on performance optimisation. TDDPer's take note :-).
What does this suggest about the byte-compiler implementation to you? It seems
to me
that in the first case $a is evaluated to the full list (i.e. as you say copied)
whilst in the
second case $b isn't, becuase it is an argument to a function. i.e.

In function calls $ prefixed variables are not expanded (i.e. passed by value),
they are
passed by reference.

Is this observation correct? It makes sense from an implementation point of view.
Regards,
Bruce A.

Bob Techentin

unread,
Mar 11, 1999, 3:00:00 AM3/11/99
to
"Donal K. Fellows" wrote:
>
> Note that the second technique (with the [second] procedure) is nearly
> 15 times faster than the traditional technique when working on a list
> of 10^5 elements.

Wow! That's really cool. I take back everything I said about
not believing that "in place" list operations would be faster.

Heribert Dahms

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
In <7c8n5u$4a3$1...@nnrp1.dejanews.com> ldup...@my-dejanews.com writes:

: It's like doing:


:
: cat > file
: ^D
:
: instead of
:
: touch file
:
: to create an empty file. The results are the smae but you're wasting time and
: energy for nothing.

Warning! touch doesn't empty an existing file!
I use
cp /dev/null file
or in some shells
>file

BTW, I use "make clean" instead of "touch *.c" to preserve modification date.


Bye, Heribert (da...@ifk20.mach.uni-karlsruhe.de)

Philippe Le Foll

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
Bryan Oakley wrote:
>
> Paul Duffin wrote:
> > That is what I was talking about, a consistent way of specifying whether
> > the object to be modified is a variable or a list object. This could
> > either be done by having two different commands or one with different
> > options.
> >
> > e.g.
> > list::append list fred # Would create the list {list fred}
> > list::append -variable list fred # Would append fred to the list in
> > # variable list.
> >
> > or
> >
> > list::append list fred
> > list::appendv list fred
>

Why not using "append list fred" for both, when you enter in C mode
you can check if object fred is a list, if it is just take it as an
object
in which case you dont have to transform it in a string. If the internal
rep is not valid in this case update internal rep from external rep.
TCL-8 provides a C routine that do this automatically: Tcl_ConvertToType

Philippe
--
La vitesse peut tuer: Utilisez Windows (o^o)
======================================oOO==(~)==OOo========
phi...@fridu.com, Philippe Le Foll Fridu (+33/0)609.794.781

Paul Duffin

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
Bruce S. O. Adams wrote:
>
> This is the question, which someone who knows about the byte compiler will
> have to answer.
>

I gave cut out the question because it was lost in the middle of the
replying and snipping, either that or I haven't got the whole discussion
yet. Anyway what Bruce wants to know is how the byte compiler and
Tcl_Objs work and how they affect the structure of your Tcl programs.

1) Generally Tcl internally passes Tcl_Objs around as much as possible,

2) They are only duplicated if they need to be modified while they are
shared. (copy-on-write)

3) The string representations are only created when necessary.

4) Eval loses all Tcl_Obj information.

5) Common literals are all resolved to the same Tcl_Obj. (I think this
takes place within procedures). This was affected by some problems
in early version of Tcl 8.0 when shared Tcl_Objs had their string
representation modified but it was a symptom rather than the cause.

Here goes with some simple examples.

set a {alpha beta gamma}

a new string Tcl_Obj (1) is created and used as 'a's value.

lindex $a 1

the Tcl_Obj (1) is turned into a list, this involves the
creation of three more string Tcl_Obj (2,3,4), one for each
element.

set b {alpha beta gamma}

Tcl_Obj (1) is reused as the value of 'b'.

set c [list alpha beta gamma]

three Tcl_Objs (5,6,7) are created, one for each parameter to
list. The list command creates a list Tcl_Obj (8) with no
string representation.

puts $c

Tcl_Obj (8) now has its string representation updated because
it is needed by puts.

lindex $b 1

the Tcl_Obj (1) is already a list, this just accesses one of its
elements.

proc foo1 {} {
set a "hello"
set b "hello"
set c "hello"
}

is as efficient as

proc foo1 {} {
set a "hello"
set b $a
set c $b
}

once it has been compiled, but it does take more time and more effort
to compile.

Donal K. Fellows

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
In article <36E7EF01...@rmc-ltd.com>,

Bruce S. O. Adams <bruce...@rmc-ltd.com> wrote:
> Laurent Duperval wrote:
>> "Bruce S. O. Adams" <bruce...@rmc-ltd.com> wrote:
[attribution lost?]

>>>> So what if, instead, I have:
>>>>
>>>> set args "a b c"
>>>>
>>>> How would you expect
>>>>
>>>> foo $a
>>>>
>>>> to behave?

Assuming you mean the $a and the args to be the same variable (:^)
then the command foo will be called with a single argument whose
string format will be "a b c". It's non-string representation is
immaterial, and in any case could be anything.

>>> As far as the interpreter goes, I guess it has to give args its
>>> string interpretation by default and reinterpret it as a list
>>> later.

No. You are assuming that a value's string and list interpretations
are mutually exclusive. This is not the case.

>> So you're doing special-case interpreting, which is what the parser
>> tries to avoid as much as possible. Personnaly, I prefer it that
>> way.
>
> I am trying to guess how what is given to the byte-code virtual
> machine. Assuming the set command creates a Tcl_Obj which internal
> representations are populated?

Not necessarily. The compiler might initially populate the
internalRep for numeric values (I've not checked that,) but won't for
a list, since the list-ness is purely due to the interpretation of the
string in question. However, if the object in question really is a
list, then the second time through the execution, it will already have
a list interpretation.

> All those that are possible? just
> the string version read from the source file? or does it try to be
> as intelligent as possible (in my opinion a good idea, where it can
> be done reliably) and use the information available to guess (or
> rather know) which interpretation is likely to be used. Hence:
>
> set foo [list]
>
> Would mean that the Tcl_Obj representation of foo would be set up as
> a list, should such such a distinction be necessary. Whereas the
> situation for the quotes and braces cases is less clear.

Yes, but the distinction is not necessary. And yes, the behaviour
with quotes and braces is less clear initially, but will become
clearer with execution.

>>> set myList [list]
>>>
>>> is better than either of the following:
>>>
>>> set myList ""
>>> set myList {}
>>
>> I understand the former but why the latter? You're calling a procedure for

>> basically nothing. It's like doing:


>>
>> cat > file
>> ^D
>>
>> instead of
>>
>> touch file
>>
>> to create an empty file. The results are the smae but you're
>> wasting time and energy for nothing.
>

> This is the question, which someone who knows about the byte compiler will
> have to answer.

You could do all that, but there is basically no point, since the
conversion of an empty string into an empty list is a really cheap
operation, only performing one malloc() which would need to be done
anyway (to supply the structure that implements the list's internal
rep.)

Paul Duffin

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
Bruce S. O. Adams wrote:
>
> > Note that the second technique (with the [second] procedure) is nearly
> > 15 times faster than the traditional technique when working on a list
> > of 10^5 elements. This is purely due to the lack of a copy, since the
> > code itself is considerably more complex (including an extra procedure
> > call.) I'll happily admit to being surprised just how much faster
> > this technique is... :^)
> >
> > Donal.
> > --
> > Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
> > Department of Computer Science, University of Manchester, U.K. +44-161-275-6137
> > --
> > "And remember, evidence is nothing." - Stacy Strock <sp...@adisfwb.com>
>
> This is very impressive. I would suggest it gets dumped in a FAQ somewhere
> (perhaps under
> a section on performance optimisation. TDDPer's take note :-).
> What does this suggest about the byte-compiler implementation to you? It seems
> to me
> that in the first case $a is evaluated to the full list (i.e. as you say copied)
> whilst in the
> second case $b isn't, becuase it is an argument to a function. i.e.
>
> In function calls $ prefixed variables are not expanded (i.e. passed by value),
> they are
> passed by reference.
>
> Is this observation correct? It makes sense from an implementation point of view.
> Regards,
> Bruce A.

No, it is actually much simpler than that, there is no special treatment
of function arguments.

Tcl passes all values around internally by passing pointers to Tcl_Objs
they are only duplicated if necessary, i.e. when they need to be
modified and they are shared.

e.g.

set a "hello"

Tcl_Obj (1) is created.

proc fred {b} {

'b' references Tcl_Obj (1) which now has a reference count of 2.
1 from 'a' and 1 from 'b'.

append b " fred"

a duplicate (2) of Tcl_Obj (1) is created because Tcl_Obj (1)
is shared and the string is appended to that Tcl_Obj (2).

}

fred $a

Paul Duffin

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
Bruce S. O. Adams wrote:
>
> Laurent Duperval wrote:
>
> > In article <36E78F...@mailserver.hursley.ibm.com>,

> > Paul Duffin <pdu...@mailserver.hursley.ibm.com> wrote:
> > > That is what I was talking about, a consistent way of specifying whether
> > > the object to be modified is a variable or a list object. This could
> > > either be done by having two different commands or one with different
> > > options.
> > >
> > > e.g.
> > > list::append list fred # Would create the list {list fred}
> > > list::append -variable list fred # Would append fred to the list in
> > > # variable list.
> > >
> >
> > This is ok.

> >
> > > or
> > >
> > > list::append list fred
> > > list::appendv list fred
> > >
> >
> > UGh!
> >
> > deletev, rangev, etc., ad nauseum.
> >
> > L
> >
> > --
> > Use The Penguin, Luke!
> >
> > Laurent Duperval
> >
> > -----------== Posted via Deja News, The Discussion Network ==----------
> > http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
>
> Wouldn't this kind of thing be more appropriately done as an operator.
> Of course we don't want to go too perly :-)

NO NO NO NO NO

The Tcl way of doing this is through options or having different
commands. As for whether it should be options or commands there are
arguments for and against.

1) Options require more processing of arguments.
2) Options mean that the - character has special meaning so you need
to override this with something like --.
3) It allows for adding future options without breaking existing code.

1) Simple commands require simpler argument processing.
2) Difficult to expand in the future without breaking existing code.
3) Number of commands increases exponentially with each new difference.
4) Pollutes the command namespace.

On balance I prefer using Options.

Donal K. Fellows

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
In article <36E6F7B2...@mayo.edu>,

Bob Techentin <techenti...@mayo.edu> wrote:
> I guess the only real way to find out if there is any real
> performance boost from "in place" list functions is to write
> one. *sigh*

But it is fairly easy to simulate one. And you do get a hefty
performance boost with long lists provided the reference in the
variable is the only one to the list being operated on. Otherwise,
you get no gain at all (naturally.)

Bruce S. O. Adams

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to

Philippe Le Foll wrote:

> Bryan Oakley wrote:


> >
> > Paul Duffin wrote:
> > > That is what I was talking about, a consistent way of specifying whether
> > > the object to be modified is a variable or a list object. This could
> > > either be done by having two different commands or one with different
> > > options.
> > >
> > > e.g.
> > > list::append list fred # Would create the list {list fred}
> > > list::append -variable list fred # Would append fred to the list in
> > > # variable list.
> > >

> > > or
> > >
> > > list::append list fred
> > > list::appendv list fred
> >
>

> Why not using "append list fred" for both, when you enter in C mode
> you can check if object fred is a list, if it is just take it as an
> object
> in which case you dont have to transform it in a string. If the internal
> rep is not valid in this case update internal rep from external rep.
> TCL-8 provides a C routine that do this automatically: Tcl_ConvertToType
>
> Philippe
> --
> La vitesse peut tuer: Utilisez Windows (o^o)
> ======================================oOO==(~)==OOo========
> phi...@fridu.com, Philippe Le Foll Fridu (+33/0)609.794.781

This seems to be a sensible suggestion. This would need to be approached very
carefully
though as it could cause confusion. Another issue is what if the object has a
valid interpretation
as both a list and a variable name. Remember, Tcl is not a strongly typed language
(though an _optional_
extension to make it behave as one is a good idea). e.g.

set VariableName "Fred Bloggs"
set $VariableName "Value of fred"

list::append $VariableName

vs.

set VariableName [list Fred Bloggs]

list::append $VariableName

Fortunately in this case, we have the "list" function to distinguish them. For
this reason I would again
advocate using list to make list variables as opposed to strings or quotes, even
empty ones.
i.e.
set emptyList [list]
Regards,
Bruce A.


Doug Hughes

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
> Donal.
> --
> Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
> Department of Computer Science, University of Manchester, U.K. +44-161-275-6137
> --
> "And remember, evidence is nothing." - Stacy Strock <sp...@adisfwb.com>

The interesting question then becomes, at what point (number of list elements)
does the call of the additional procedure become less efficient than using
lreplace directly? I'm guessing, that for a small list of some X elements
(50, 100, 400?) that the overhead of the additional procedure call might
make it slower. I could be totally wrong though. Have you put your test
through a loop to determine if there is some 'fulcrum' around which the
"second" method becomes more efficient? Or is it always more efficient?

nice work, btw.

--
____________________________________________________________________________
Doug Hughes Engineering Network Services
System/Net Admin Auburn University
do...@eng.auburn.edu

Paul Duffin

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
Bruce S. O. Adams wrote:
>
> snip

I thought that as we were discussing this sort of thing I would let you
know what I am currently working on.

Feather

A set of MUTABLE Tcl object types which will eventually include the
following.

map Tcl hash table (array)
vector Tcl list
string Tcl string (much less copying)
struct C structures
chain Linked list, very efficient manipulation (no indexing)

Plus generic methods for accessing these types, along the lines of the
STL for C++.

These should really improve the ability of Tcl to manipulate complex
data structures.

Also in plan but requiring some 'minor' changes to the core are the
following.

proper lambda functions
curried functions

Bruce S. O. Adams

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to

Paul Duffin wrote:

> Bruce S. O. Adams wrote:
> >
> > snip
>
> I thought that as we were discussing this sort of thing I would let you
> know what I am currently working on.
>
> Feather
>
> A set of MUTABLE Tcl object types which will eventually include the
> following.
>
> map Tcl hash table (array)

Something, worth noting. A collegue of mine recently ran into a hard coded

limit on Tcl hash arrays. Something, about doubling the size every time
buckets
were filled to more than three items. Worth, making things more flexible
if you
can :-).

>
> vector Tcl list
> string Tcl string (much less copying)

Please elaborate.


>
> struct C structures

Yes please :-). This is a sorely lacking area in pure TCL.


>
> chain Linked list, very efficient manipulation (no indexing)
>

Surely you can have indexing, just very inefficiently :-)

>
> Plus generic methods for accessing these types, along the lines of the
> STL for C++.
>
> These should really improve the ability of Tcl to manipulate complex
> data structures.
>
> Also in plan but requiring some 'minor' changes to the core are the
> following.
>
> proper lambda functions
> curried functions
>
> --
> Paul Duffin
> DT/6000 Development Email: pdu...@hursley.ibm.com
> IBM UK Laboratories Ltd., Hursley Park nr. Winchester
> Internal: 7-246880 International: +44 1962-816880

Cool. Any idea when it will be finished? What kind of license (please say
free :-)?
What is the significance of mutable in the above. Its obviously important
as its
capitalised. I can't see any relation to the C++ keyword though.
Regards,
Bruce A.

P.S. I'm sure you'll let us all know if require assistance, unlikely though
it is :-)


Alexandre Ferrieux

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
Paul Duffin wrote:
>
> I thought that as we were discussing this sort of thing I would let you
> know what I am currently working on.
>
> Feather
>
> A set of MUTABLE Tcl object types which will eventually include the
> following...

Just wanted to repeat what I've already told in private e-mail: I
believe that *this* is of utmost importance to Tcl's future. If all I've
seen in discussions with Paul comes to the core pretty soon, it may be
a serious argument to rethink comparisons with Python (for those who
care to make a choice).

-A

Mike Tiller

unread,
Mar 12, 1999, 3:00:00 AM3/12/99
to
Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> writes:

I agree this is very important.

I was hoping the discussion about a "standard tcl library" would
continue. Unfortunately, the interest seems to have chilled a bit. I
think a package like "Feather" would be an ideal candidate for such a
library.

Just imagine being able to do a complete build from source and get all
this data structure support built in. In my mind, a data structures
package (with a good C language API) is the single most important
thing you could put in a standard library. It would allow extension
writers to build on these data structures. Packages like BLT would no
longer have to support a "vector" primitive. Hierarchy widgets could
be passed a "tree" structure.

I haven't heard any comments from Scriptics on the subject of a
"standard Tcl library". To summarize, even if the only thing in a
"standard Tcl library" was better data structure suppport it would be
well worth it!

My $0.02

> -A

--
Michael Tiller
Ford Motor Company

Heribert Dahms

unread,
Mar 13, 1999, 3:00:00 AM3/13/99
to
In <36E93A53...@rmc-ltd.com> bruce...@rmc-ltd.com writes:

: Something, worth noting. A collegue of mine recently ran into a hard coded


:
: limit on Tcl hash arrays. Something, about doubling the size every time
: buckets
: were filled to more than three items. Worth, making things more flexible
: if you
: can :-).

Hard coded limit? No, in the contrary, it's as flexible as possible given
the speed/memory size tradeoff. You start with a small array, and if
your keys hash rather uniformly, the hash table well be kept balanced.
A million entries only require 10 times doubling, on the average!
Setup of an array with a lot of keys may be slow, but after that access
is very fast.


Bye, Heribert (da...@ifk20.mach.uni-karlsruhe.de)

lvi...@cas.org

unread,
Mar 13, 1999, 3:00:00 AM3/13/99
to

According to Paul Duffin <pdu...@mailserver.hursley.ibm.com>:
:Bruce S. O. Adams wrote:
:>
:> This is the question, which someone who knows about the byte compiler will
:> have to answer.
:>
:
:I gave cut out the question because it was lost in the middle of the

:yet. Anyway what Bruce wants to know is how the byte compiler and

:Tcl_Objs work and how they affect the structure of your Tcl programs.

And in particular it is my rememberance of the thread (which has disappeared
from my system by now) that the issue was:

When the byte compiler is building the bytes for a line such as

myownproc $list other args

does it expand the list, then byte compile, or does it byte compile first
and leave the list expansion for possible exansion later.

And it's my understanding that the list is expanded before byte compilation
occurs.

--
<URL: mailto:lvi...@cas.org> Quote: Saving the world before bedtime.
<*> O- <URL: http://www.purl.org/NET/lvirden/>
Unless explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.

Paul Duffin

unread,
Mar 15, 1999, 3:00:00 AM3/15/99
to
lvi...@cas.org wrote:
>
> According to Paul Duffin <pdu...@mailserver.hursley.ibm.com>:
> :Bruce S. O. Adams wrote:
> :>
> :> This is the question, which someone who knows about the byte compiler will
> :> have to answer.
> :>
> :
> :I gave cut out the question because it was lost in the middle of the
>
> :yet. Anyway what Bruce wants to know is how the byte compiler and
> :Tcl_Objs work and how they affect the structure of your Tcl programs.
>
> And in particular it is my rememberance of the thread (which has disappeared
> from my system by now) that the issue was:
>
> When the byte compiler is building the bytes for a line such as
>
> myownproc $list other args
>
> does it expand the list, then byte compile, or does it byte compile first
> and leave the list expansion for possible exansion later.
>
> And it's my understanding that the list is expanded before byte compilation
> occurs.
>

When it byte compiles the line it treats '$line' as a variable,
'myownproc' as a command and 'other' and 'args' as simple word
arguments. e.g. you end up with something like the following.

PUSH (myownproc)
LOAD_SCALAR (<index>) or PUSH (list), LOAD_SCALAR_STK ()
PUSH (other)
PUSH (args)
INVOKE_STK (4)

The stack is basically an array of Tcl_Obj * which grows upwardly in
memory. The difference between the two different LOAD instructions
depend on whether the variable is part of a proc or global. If it is in
the proc then it gets added to an indexed table of variables so it is
very fast, otherwise it gets looked up by name.

Bob Techentin

unread,
Mar 15, 1999, 3:00:00 AM3/15/99
to
Doug Hughes wrote:

> The interesting question then becomes, at what point (number of list elements)
> does the call of the additional procedure become less efficient than using
> lreplace directly?

Well, running a loop and a little timer answers that question for me.
Using Donal's test case (removing elements 40-80), on an HP 715/100,
it looks like break-even is at about 300 elements.

list size relative speed of Donal's replace-in-place
--------------------------------------------------------------------
100 0.66
200 0.77
300 1.0
400 1.1
500 1.3
1000 1.6
2000 2.3
5000 6.8
10000 11
100000 23

So I'd say that you could gain some benefit from this technique for
lists that you know will be over 1000 long.

Donal K. Fellows

unread,
Mar 15, 1999, 3:00:00 AM3/15/99
to
In article <36E901...@mailserver.hursley.ibm.com>,

Paul Duffin <pdu...@mailserver.hursley.ibm.com> wrote:
> No, it is actually much simpler than that, there is no special treatment
> of function arguments.

But you must remember that all program literals have an extra
reference to them from the block of bytecode that the bytecode
evaluator is currently executing.

> Tcl passes all values around internally by passing pointers to Tcl_Objs
> they are only duplicated if necessary, i.e. when they need to be
> modified and they are shared.
>
> e.g.
>
> set a "hello"
>
> Tcl_Obj (1) is created.

a now contains a reference to an object, but that object also has
another reference to it in (the bytecode-form of) the original program
text.

> proc fred {b} {
>
> 'b' references Tcl_Obj (1) which now has a reference count of 2.
> 1 from 'a' and 1 from 'b'.

No. The reference count will be three.

> append b " fred"
>
> a duplicate (2) of Tcl_Obj (1) is created because Tcl_Obj (1)
> is shared and the string is appended to that Tcl_Obj (2).

Correct, except that the original will still have two references to it.

> }
>
> fred $a

It takes careful thought to wrap your head around references. I've
just got loads of practise from working at a low level in Tcl8 and
other (private) library code. I know more about this than I want.

Donal K. Fellows

unread,
Mar 15, 1999, 3:00:00 AM3/15/99
to
In article <1999Mar1...@netman.eng.auburn.edu>,

Doug Hughes <do...@eng.auburn.edu> wrote:
> The interesting question then becomes, at what point (number of list
> elements) does the call of the additional procedure become less
> efficient than using lreplace directly? I'm guessing, that for a

> small list of some X elements (50, 100, 400?) that the overhead of
> the additional procedure call might make it slower. I could be
> totally wrong though. Have you put your test through a loop to
> determine if there is some 'fulcrum' around which the "second"
> method becomes more efficient? Or is it always more efficient?

I hadn't checked, but with the aid of a little more code I remedy this.

proc first {a b} { set a }
proc t {{n 100000}} {
for {set i 0} {$i<$n} {incr i} {


lappend a $i; lappend b $i
}

puts [time {set a [lreplace $a 40 80]}]

puts [time {set b [lreplace [first $b [set b {}]] 40 80]}]
}

I find that the cusp point (for the particular replacement I'm
performing) seems to be about 90 elements. Below that, I'm running
into problems with errors due to the granularity of gettimeofday() on
Solaris, and it would take much more work with the code to make a test
that ran multiple times to get a more accurate figure...

At no point did I measure the first code as being slower than the
second. This works particularly well because the procedure [first] is
coded to be maximally efficient (only an empty procedure will have a
shorter encoding IIRC) and I'm not actually bothering to fiddle around
with [unset] which would have scuppered some of the performance quite
a bit.

> nice work, btw.

:^)

All it takes is a bit of thought about exactly where the references
are and what the exact evaluation order of Tcl is. I'm still
impressed at how big the effect is. I didn't expect it to be that big.

Donal K. Fellows

unread,
Mar 15, 1999, 3:00:00 AM3/15/99
to
In article <36E7F425...@rmc-ltd.com>,

Bruce S. O. Adams <bruce...@rmc-ltd.com> wrote:
> What does this suggest about the byte-compiler implementation to
> you? It seems to me that in the first case $a is evaluated to the
> full list (i.e. as you say copied) whilst in the second case $b
> isn't, becuase it is an argument to a function. i.e.
>
> In function calls $ prefixed variables are not expanded
> (i.e. passed by value), they are passed by reference.
>
> Is this observation correct? It makes sense from an implementation
> point of view.

This view is sort-of correct; the value in the variable is passed in
by reference (and is actually the case with all other values passed in
too.) This is because variables themselves only ever contain
references, and why calling values Tcl_Obj* really makes sense at an
implementation level. You just can't pretend that they are objects in
a "C++ at the scripting level" sense.