I have been working on Tcl for about 6 months and find it to be an
efficient product, but there are always going to be things that would
make it better. ;)
Is there a simple way (single command) to count the number of matching
elements in a list. I have been using a short process to sort the
elements and then run through the list looking for a change in the
indexed element being evaluated. Simple, but there must be an easier
way to do this.
Sample Data (sorted):
dave dave dave dave mary mary jane jane jane ted
Looking for this type of output (I know one command will probably not
do all of this) But there has got to be a more efficient method for
getting a list of elements and their quantity within the list):
Dave - 4
mary - 2
jane - 3
ted - 1
Thanks in advance!
Dave
P.S. I have considered using arrays, but (don't laugh) I have not have
much success grasping the concepts and utilizations as outlined within
the various books I have read on the subject of Arrays. It seems like
Lists are more powerful (at least easier to work with).
Sent via Deja.com http://www.deja.com/
Before you buy.
proc listCount {ll} {
foreach name $ll {
if {[info exists count($name)]} {
incr count($name)
} else {
set count($name) 1
}
}
foreach name [array names count] {
puts "$name $count($name)"
}
}
Then, given an unordered list as input, listCount will
print (one per line) the item names and the count of appearances.
Does this do it?
Miguel
In article <861rp4$5or$1...@nnrp1.deja.com>,
[snip]
>
> P.S. I have considered using arrays, but (don't laugh) I have not have
> much success grasping the concepts and utilizations as outlined within
> the various books I have read on the subject of Arrays. It seems like
> Lists are more powerful (at least easier to work with).
Your intuition is correct! Use an array. Don't even bother sorting the
list. Like this:
set listData [list dave mary dave mary ted mary jane dave jane jane
ted]
if { [info exists counter] } {unset counter}
foreach item $listData {
if { [info exists counter($item)] } {
incr counter($item)
} else {
set counter($item) 1
}
}
array get counter
Bob
--
Bob Techentin techenti...@mayo.edu
Mayo Foundation (507) 284-2702
Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html
Takes out the whip...
> I have been working on Tcl for about 6 months and find it to be an
> efficient product, but there are always going to be things that would
> make it better. ;)
>
> Is there a simple way (single command) to count the number of matching
> elements in a list. I have been using a short process to sort the
> elements and then run through the list looking for a change in the
> indexed element being evaluated. Simple, but there must be an easier
> way to do this.
>
>
<snip>
>
> P.S. I have considered using arrays, but (don't laugh) I have not have
> much success grasping the concepts and utilizations as outlined within
> the various books I have read on the subject of Arrays. It seems like
> Lists are more powerful (at least easier to work with).
>
They are? I dunno, I use arrays much more often than lists. Any way, here's
how I would do it, which probably show my ignorance. There will be some
problems if your list contains items with whitespace in them and said list
is not properly constructed.
# Doesn't need to be sorted
set aList [list dave dave dave dave mary mary jane jane jane ted]
foreach item $aList {
if [info exists arrayName($item)] {
incr arrayName($item)
} else {
set arrayName($item) 1
}
}
Then:
array names arrayName
contains a list of values found in your list so
foreach ix [array names arrayName] {
puts "There are $arrayName($ix) items of type $ix"
}
L
--
Penguin Power! Nothing I say reflects the views of my employer
Laurent Duperval mailto:laurent....@cgi.ca
CGI - FWFM Project Phone: (514) 350-3368
Does
% set sample_list {dave dave dave dave mary mary jane jane jane ted}
dave dave dave dave mary mary jane jane jane ted
% proc count_members list {
# Incidentally, there's no requirement that $list be sorted.
foreach member $list {
if [info exists count($member)] {
incr count($member)
} else {
set count($member) 1
}
}
parray count
}
% count_members $sample_list
count(dave) = 4
count(jane) = 3
count(mary) = 2
count(ted) = 1
help?
--
Cameron Laird <cla...@NeoSoft.com>
Business: http://www.Phaseit.net
Personal: http://starbase.neosoft.com/~claird/home.html
THANKS EVERYONE FOR THE PROMPT RESPONSES!
What you all provided was educational! Here is some thing to think
about... in this problem
This script basically looks at a log file and prepars a report in a web
browser for the user to review the successful & failed access attempts
to the system. the functional part of this program is in the
following process: (sorry for the extra HTML calls, but this spits out
the table line by line as the program runs to the Client browser..)
-------------------- SNIP FROM SCRIPT -------------------------
proc count_names {names search_type} {
set names [lsort $names]
set temp_names ""
set counter 1
puts "<table>"
puts "<th align=\"center\" colspan=\"2\"><h2>Report of
$search_type</th>"
puts "<tr><td width=\"50%\"><h4>LOGIN NAME USED</font></td><td
width=\"50%\"><h4>QTY USED</td></tr>"
for {set i 0} {$i < [llength $names]} {incr i} {
set element [lindex $names $i]
if {![string match *$element* $temp_names]} {
puts "<tr><td width=\"50%\">[lindex $names [expr $i -
1] ]</td><td width=\"50">$counter</td></tr>"
set temp_names ""
lappend temp_names $element
set counter 0
}
if {[string match *$element* $temp_names]} {
incr counter
}
}
puts "</table></p>"
}
-------------------- END SNIP ----------------
The thing about the above process is that it does not know anything
about the number, name or type of data that it is intrepreting. It
basically sorts the data and then starts a count up through the then of
the list looking for similar elements, counting them, then reseting the
counter for the next non-similar element.
I forget to emphasize the fact that I have NO IDEA what type of data
will exist in the list --- hence making the count(dave), count (mary),
count(jane).... difficult, although I guess not impossible with some
cleaver dual arrays tracking/reporting?
Any other ideas given this new information. BTW, this code IS WORKING
correctly. I just hate doing long winded code when a more elegant
solution may exist (except PERL). ;)
Sorry for the lengthy message, I appreciate your help.
Dave
BTW, I have tried to use cgi.tcl for some of my original code with this
to better format/simplify the table. I was NOT able to get a variable
to be processed within the "cgi_tr/th" commands - it intrepreted the
$variable as a "$variable" not the value of the "variable". Anyone
have any ideas on how to get that to work correctly - maybe I
overlooked something.
If I understand you at all, this is an example of
why associative arrays are a Good Thing. You can
do calculations with
count($datum)
whatever kind of screwy value $datum holds.
'Nother example of the Tao of Tcl: reduce
for {set i 0} {$i < [llength $names]} {incr i} {
set element [lindex $names $i]
...
to
foreach element $names {
...
You'll be happier--or at least have the freedom to
turn your attention elsewhere.
And, working in the spirit of "There's more than one way to do it"
handily borrowed from the perl community, does the following work but
not instruct quite as much? :^)
proc count_members list {
foreach x $list {
if {[catch {incr count($x)}]} {[set count($x) 1]}
}
parray count
}
It should also be faster in the case where you have a lot of repetitions
of relatively few values, following the standard recommendation to
optimise for the normal case.
Donal.
--
Donal K. Fellows (at home)
--
FOOLED you! Absorb EGO SHATTERING impulse rays, polyester poltroon!!
I think you got carried away with grouping.
The action of the [if] should be just
{set count($x) 1}
The two one-liner-ish variants that interest
me less are
proc count_members list {
foreach x $list {
set count($x) [expr {[info exists count($x)] ? $count($x) + 1 : 1}]
}
parray count
}
and
proc count_members list {
foreach x $list {
lappend list($x) {}
}
foreach name [array names list] {
puts "count($x) = [llength $list($x)]"
}
}
I suspect there's also an esoteric coding based
on a lambda calculation that really will fit in
one (logical) line, but I haven't reached it yet.
Too lazy even to get the code right...
> foreach x $list {
> expr {[catch {incr count($x)}] || [set count($x) 1]}
&& (D'oh!)
> }
--
| Don Porter, D.Sc. Mathematical and Computational Sciences Division |
| donald...@nist.gov Information Technology Laboratory |
| http://math.nist.gov/mcsd/Staff/DPorter/ NIST |
|______________________________________________________________________|
Too lazy to try right now, but how would it compare with
foreach x $list {
expr {[catch {incr count($x)}] || [set count($x) 1]}
}
?
Don Porter wrote:
>
> Too lazy to try right now, but how would it compare with
>
> foreach x $list {
> expr {[catch {incr count($x)}] || [set count($x) 1]}
> }
>
> ?
I like it, but should be:
foreach x $list {
expr {[catch {incr count($x)}] && [set count($x) 1]}
}
:)
--
Jorge Pacios
jpa...@gmv.es
Two questions, relating to the docs:
(1) If I understand it correctly, your proposal works because
expr {a && b} will not evaluate b whenever a is zero; right?
BUT: if this is true, it is not in the docs! The manual for
expr does not mention this feature; I think it should ...
(2) How come parray is not documented in the man pages of tcl?
I found out about it from your example; thanks.
Miguel
> (1) If I understand it correctly, your proposal works because
> expr {a && b} will not evaluate b whenever a is zero; right?
> BUT: if this is true, it is not in the docs! The manual for
> expr does not mention this feature; I think it should ...
>
SORRY! Reread the man page for expr more carefully; the
feature is documented, and I did make an ass of myself ...
It's documented. From expr(n):
The &&, ||, and ?: operators have ``lazy evaluation'',
just as in C, which means that operands are not evaluated
if they are not needed to determine the outcome. For
example, in the command
expr {$v ? [a] : [b]}
only one of [a] or [b] will actually be evaluated, depend-
ing on the value of $v. Note, however, that this is only
true if the entire expression is enclosed in braces; oth-
erwise the Tcl parser will evaluate both [a] and [b]
before invoking the expr command.
>(2) How come parray is not documented in the man pages of tcl?
> I found out about it from your example; thanks.
See library(n).
I'll bite. Edit/Copy/Paste/Viloa! It looks to me like Donal wins. :-)
I hadn't thought of exploiting [catch] in that manner. Thanks for the
tip.
Although there is a little variance between Tcl 8.0.4 and 8.2.0, the
timings look basically like this (your absolute milage may vary):
Cameron Laird (and others) 36
Donal Fellows 25
Don Porter 30
Bob
--
Bob Techentin techenti...@mayo.edu
Mayo Foundation (507) 284-2702
Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html
proc count_members1 list {
foreach member $list {
if [info exists count($member)] {
incr count($member)
} else {
set count($member) 1
}
}
}
proc count_members2 list {
foreach x $list {
if {[catch {incr count($x)}]} {set count($x) 1}
}
}
proc count_members3 list {
foreach x $list {
expr {[catch {incr count($x)}] && [set count($x) 1]}
}
}
# build a list of 10,000 items
set items [list john paul jones mary]
for {set i 0} {$i<10000} {incr i} {
lappend data [lindex $items [expr {int(rand()*[llength $items])}]]
}
foreach proc [info proc count_members*] {
puts ""
puts "$proc"
puts [time {$proc $data} 10]
}
Well, not quite as fast as it should (hence Donal's remark about
optimizing only the normal case): a caught error costs many cycles
building ::errorInfo. There's that pending RFE about a [catch -fast]
that omits the ::errorInfo part, yielding the same speed for positive
and negative catches. Jeff, 8.4 ?
At the same time, the very frequent idiom above could use smarter core
support. Eons ago I suggested [array setdefault count 0;foreach x $list
{incr count($x)}], which is actually doable in pure Tcl with a read
trace, but clearly we're talking about making the silicon scream [*]
here, so a C implementation is due. Nobody seemed to care...
[*] The overhead of a Tcl-level read trace hits even for normal
(existing) elements of the array, which completely dominates the slight
gain over [catch] or [info exists[...
> foreach x $list {
> lappend list($x) {}
> }
> foreach name [array names list] {
> puts "count($x) = [llength $list($x)]"
> }
Actually I don't know the figures, but intuitively I suspect this one to
be even faster than [catch] if the individual counts stay reasonably
bound. Also, maybe using single characters instead of list elements
could save some memory (I don't know for CPU cycles). I often use this
trick in Bourne Shell to avoid forking an 'expr':
n=""
while :
do
...
n=$n.
case $n in
.............) break;;
esac
done
Granted, this unary coding is an awful hack. It's time to admit it, I
love them :^)
-Alex
Reread TFM.
From expr.n manpage:
The &&, ||, and ?: operators have ``lazy evaluation'', just
as in C, which means that operands are not evaluated if they
are not needed to determine the outcome.
> (2) How come parray is not documented in the man pages of tcl?
> I found out about it from your example; thanks.
Maybe switch on the light while you read it :)
From library.n manpage:
parray arrayName
Prints on standard output the names and values of all
the elements in the array arrayName. ArrayName must be
an array accessible to the caller of parray. It may be
either local or global.
-Alex
count_members1
178418 microseconds per iteration
count_members2
50098 microseconds per iteration
count_members3
47266 microseconds per iteration
count_members4
46289 microseconds per iteration
======================================================================
#!/usr/local/bin/neotcl8.0
proc count_members1 list {
foreach member $list {
if [info exists count($member)] {
incr count($member)
} else {
set count($member) 1
}
}
}
proc count_members2 list {
foreach x $list {
if {[catch {incr count($x)}]} {set count($x) 1}
}
}
proc count_members3 list {
foreach x $list {
expr {[catch {incr count($x)}] && [set count($x) 1]}
}
}
proc count_members4 list {
foreach x $list {
lappend ulist($x) {}
}
foreach name [array names ulist] {
set count($name) [llength $ulist($x)]
}
}
# build a list of 10,000 items
set items [list john paul jones mary]
for {set i 0} {$i<10000} {incr i} {
lappend data [lindex $items [expr {int(rand()*[llength $items])}]]
}
puts "[info patchlevel] over $tcl_platform(os) $tcl_platform(osVersion)."
foreach proc [info proc count_members*] {
puts ""
puts "$proc"
puts [time {$proc $data} 10]
}
Should there be symlinks or something so that if someone does a
man -s n parray
they get the library man page?
--
<URL: mailto:lvi...@cas.org> <URL: http://www.usenix.org/events/tcl2k/>
<*> O- <URL: http://www.purl.org/NET/lvirden/> Tcl2K - Austin, Texas, US
Unless explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.
Yes.
count_members1
150252 microseconds per iteration
count_members2
79474 microseconds per iteration
count_members3
93077 microseconds per iteration
count_members4
83105 microseconds per iteration
count_members5
72036 microseconds per iteration
========================================================================
8.0.4 over Linux 2.2.5-15.
count_members1
302323 microseconds per iteration
count_members2
67778 microseconds per iteration
count_members3
71509 microseconds per iteration
count_members4
75355 microseconds per iteration
count_members5
62537 microseconds per iteration
========================================================================
What's count_members5?
Hint, three edits to count_members4, not counting the proc name:
delete one character, replace two characters with one character, and
replace one character with seven characters.
-- rec --
In article <865o8m$e36$1...@bob.news.rcn.net>, Don Porter
<d...@clover.cam.nist.gov> writes
>Donal K. Fellows <do...@ugglan.demon.co.uk> wrote:
>> foreach x $list {
>> if {[catch {incr count($x)}]} {set count($x) 1}
>> }
>Too lazy to try right now, but how would it compare with
> foreach x $list {
> expr {[catch {incr count($x)}] && [set count($x) 1]}
> }
Should compile to (approximately) the same thing. They're not quite the
same (a different result gets thrown away) but the difference should be
minimal. And if I was on a machine with a less rubbish console, I'd be
able to test this... :^(
Hmmm. I guess you mean replacing list-unary by string-unary coding as I
suggested :)
So it should look like:
proc count_members5 list {
foreach x $list {
append ulist($x) .
}
foreach name [array names ulist] {
set count($name) [string length $ulist($x)]
}
}
Did I get it right ?
-Alex
Maybe, but you've makde a simple blunder with which which string
you're taking the length of...
Interestingly, there is both an upside and a downside to this version.
Upside: it seems to be a bit faster than all the others.
Downside: memory usage is linear in the size of the input list.
Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
-- The small advantage of not having California being part of my country would
be overweighed by having California as a heavily-armed rabid weasel on our
borders. -- David Parsons <o r c @ p e l l . p o r t l a n d . o r . u s>
I think that it is a bad idea to explain the behaviour of this in terms
of how C does it as it will have the tendency to limit the audience to
only those people who already know C (which everyone should of course
except for someone who will probably not remain nameless for long !!!).
Ooops. Of course: set count($name) [string length $ulist($name)].
Thanks.
> Interestingly, there is both an upside and a downside to this version.
> Upside: it seems to be a bit faster than all the others.
> Downside: memory usage is linear in the size of the input list.
Let's be more precise: it all depends on the actual grainsize for lists
and strings. As it turns out, today lists' grainsize is pretty large (I
seem to remember sthg like room for 500 elements), while strings' must
be smaller (I don't know the figures though). Hence for small counts the
list-unary *seems* O(1) in memory, while it is actually O(N) and with a
n even higher slope than strings: string-unary asymptotically costs one
extra byte per unit, while list-unary obviously costs one list slot,
which is 4 bytes !
-Alex
Yes on the spirit. However, in this specific case, the explanation is
fully clear without following the reference to C ! Moreover, in any case
the metalevel notion that 'it is documented' is readily understandable
as soon as you can read English.
-Alex
> In article <38880B...@cnet.francetelecom.fr>,
> Alexandre Ferrieux <alexandre...@cnet.francetelecom.fr> wrote:
> > So it should look like:
> > proc count_members5 list {
> > foreach x $list {append ulist($x) .}
> > foreach name [array names ulist] {
> > set count($name) [string length $ulist($x)]
> > }
> > }
> > Did I get it right ?
Exactement, aside from the typo, you even picked the same character to
append.
-- rec --
As you can see from the compile traces (from 8.0.4) the two are very
similar. However, the version with the [if] is shorter because it
doesn't try to calculate a logical value that is then thrown away.
More optimisation is possible, obviously. But the two are very close.
I've marked with "|" the lines where there are differences that matter.
#### IF VERSION ####
(0) loadScalar1 0 # var "list"
(2) storeScalar1 1 # temp var 1
(4) pop
(5) foreach_start4 0
(10) foreach_step4 0
(15) jumpFalse1 45 # pc 60
Command 2: "if {[catch {incr count($x)}]} {set count($x) 1}"
Command 3: "catch {incr count($x)}"
(17) beginCatch4 1
Command 4: "incr count($x)"
(22) push1 0 # "count("
(24) loadScalar1 3 # var "x"
(26) push1 1 # ")"
(28) concat1 3
(30) incrStkImm 1
(32) pop
(33) push1 2 # "0"
(35) jump1 3 # pc 38
(37) pushReturnCode
(38) endCatch
| (39) tryCvtToNumeric
(40) jumpFalse1 15 # pc 55
Command 5: "set count($x) 1"
(42) push1 0 # "count("
(44) loadScalar1 3 # var "x"
(46) push1 1 # ")"
(48) concat1 3
(50) push1 3 # "1"
(52) storeStk
| (53) jump1 4 # pc 57
| (55) push1 4 # ""
(57) pop
(58) jump1 -48 # pc 10
(60) push1 4 # ""
(62) done
#### EXPR VERSION ####
(0) loadScalar1 0 # var "list"
(2) storeScalar1 1 # temp var 1
(4) pop
(5) foreach_start4 0
(10) foreach_step4 0
(15) jumpFalse1 50 # pc 65
Command 2: "expr {[catch {incr count($x)}] && [set count($x) 1]}"
Command 3: "catch {incr count($x)}"
(17) beginCatch4 1
Command 4: "incr count($x)"
(22) push1 0 # "count("
(24) loadScalar1 3 # var "x"
(26) push1 1 # ")"
(28) concat1 3
(30) incrStkImm 1
(32) pop
(33) push1 2 # "0"
(35) jump1 3 # pc 38
(37) pushReturnCode
(38) endCatch
| (39) jumpTrue1 6 # pc 45
| (41) push1 2 # "0"
| (43) jump1 4 # pc 47
| (45) push1 3 # "1"
| (47) dup
(48) jumpFalse1 14 # pc 62
Command 5: "set count($x) 1"
(50) push1 0 # "count("
(52) loadScalar1 3 # var "x"
(54) push1 1 # ")"
(56) concat1 3
(58) push1 3 # "1"
(60) storeStk
| (61) land
(62) pop
(63) jump1 -53 # pc 10
(65) push1 4 # ""
(67) done