a_list.count(a_callable) ?

Ping

unread,

Jun 14, 2007, 3:53:38 PM6/14/07

to

Hi,

I'm wondering if it is useful to extend the count() method of a list
to accept a callable object? What it does should be quite intuitive:
count the number of items that the callable returns True or anything
logically equivalent (non-empty sequence, non-zero number, etc).

This would return the same result as len(filter(a_callable, a_list)),
but without constructing an intermediate list which is thrown away
after len() is done.

This would also be equivalent to
n = 0
for i in a_list:
if a_callable(i): n += 1
but with much shorter and easier-to-read code. It would also run
faster.

This is my first post and please bear with me if I'm not posting it in
the right way.

Regards,
Ping

Dustan

unread,

Jun 14, 2007, 4:37:59 PM6/14/07

to

On Jun 14, 2:53 pm, Ping <ping.nsr....@gmail.com> wrote:
> Hi,
>
> I'm wondering if it is useful to extend the count() method of a list
> to accept a callable object? What it does should be quite intuitive:
> count the number of items that the callable returns True or anything
> logically equivalent (non-empty sequence, non-zero number, etc).
>
> This would return the same result as len(filter(a_callable, a_list)),

map and filter are basically obsolete after the introduction of list
comprehensions; your expression is equivalent to:
len([i for i in a_list if a_callable(i)])

Which can then be converted into a generator expression (round
brackets instead of square brackets) to avoid the intermediate list:
len((i for i in a_list if a_callable(i)))

Or syntactically equivalent (avoiding lispy brackets):
len(i for i in a_list if a_callable(i))

Dustan

unread,

Jun 14, 2007, 4:51:35 PM6/14/07

to

On Jun 14, 3:37 pm, Dustan <DustanGro...@gmail.com> wrote:
> map and filter are basically obsolete after the introduction of list
> comprehensions

It is probably worth noting that list comprehensions do not require
that you write a new function; they take any expression where
appropriate. For more information on list comprehensions, see:

http://docs.python.org/tut/node7.html#SECTION007140000000000000000

Generator expressions are the same, except syntactically they have
round brackets instead of square, and they return a generator instead
of a list, which allows for lazy evaluation.

Dustan

unread,

Jun 14, 2007, 5:06:24 PM6/14/07

to

On Jun 14, 3:37 pm, Dustan <DustanGro...@gmail.com> wrote:

> Which can then be converted into a generator expression (round
> brackets instead of square brackets) to avoid the intermediate list:
> len((i for i in a_list if a_callable(i)))

Sorry for the excess of posts everybody.

I just realized that the generator expression would not work. I'm not
sure how else could be implemented efficiently and without using up
memory besides by accumulating the count as your earlier example shows.

Carsten Haese

unread,

Jun 14, 2007, 5:18:22 PM6/14/07

to pytho...@python.org

sum(1 for i in a_list if a_callable(i))

--
Carsten Haese
http://informixdb.sourceforge.net

Carsten Haese

unread,

Jun 14, 2007, 5:34:37 PM6/14/07

to pytho...@python.org

On Thu, 2007-06-14 at 12:53 -0700, Ping wrote:
> Hi,
>
> I'm wondering if it is useful to extend the count() method of a list
> to accept a callable object? What it does should be quite intuitive:
> count the number of items that the callable returns True or anything
> logically equivalent (non-empty sequence, non-zero number, etc).
>
> This would return the same result as len(filter(a_callable, a_list)),
> but without constructing an intermediate list which is thrown away
> after len() is done.
>
> This would also be equivalent to
> n = 0
> for i in a_list:
> if a_callable(i): n += 1
> but with much shorter and easier-to-read code. It would also run
> faster.

As an alternative to the generator-sum approach I posted in reply to
Dustan's suggestion, you could (ab)use the fact that count() uses
equality testing and do something like this:

>>> class Oddness(object):
... def __eq__(self, other): return other%2==1
...
>>> [1,2,3,4,5].count(Oddness())
3

I don't know which approach is faster. Feel free to compare the two.

HTH,

Ping

unread,

Jun 15, 2007, 10:15:37 AM6/15/07

to

>
> sum(1 for i in a_list if a_callable(i))
>
> --
> Carsten Haesehttp://informixdb.sourceforge.net

This works nicely but not very intuitive or readable to me.

First of all, the generator expression makes sense only to
trained eyes. Secondly, using sum(1 ...) to mean count()
isn't very intuitive either.

I would still prefer an expression like a_list.count(a_callable),
which is short, clean, and easy to understand. :) However,
it does produce ambiguities if a_list is a list of callables.
Should the count() method match values or check return values
of a_callable? There are several possible designs but I'm not
sure which is better.

Ping

Dustan

unread,

Jun 15, 2007, 11:17:25 AM6/15/07

to

On Jun 15, 9:15 am, Ping <ping.nsr....@gmail.com> wrote:
> > sum(1 for i in a_list if a_callable(i))
>
> > --
> > Carsten Haesehttp://informixdb.sourceforge.net
>
> This works nicely but not very intuitive or readable to me.
>
> First of all, the generator expression makes sense only to
> trained eyes. Secondly, using sum(1 ...) to mean count()
> isn't very intuitive either.

Then wrap it in a function:
def count(a_list, a_function):
return sum(1 for i in a_list if a_function(i))

And call the function. You can also give it a different name (although
I can't think of a concise name that would express it any better).

> I would still prefer an expression like a_list.count(a_callable),
> which is short, clean, and easy to understand. :) However,
> it does produce ambiguities if a_list is a list of callables.
> Should the count() method match values or check return values
> of a_callable? There are several possible designs but I'm not
> sure which is better.

Indeed, the ambiguity in that situation would be a reason *not* to
introduce such behavior, especially since it would break older
programs that don't recognize that behavior. Just stick with writing
the function, as shown above.

Carsten Haese

unread,

Jun 15, 2007, 12:33:16 PM6/15/07

to pytho...@python.org

On Fri, 2007-06-15 at 14:15 +0000, Ping wrote:
> using sum(1 ...) to mean count() isn't very intuitive

I find it very intuitive, but then again, my years of studying Math may
have skewed my intuition.

> I would still prefer an expression like a_list.count(a_callable),
> which is short, clean, and easy to understand.

Did you see my alternative example on this thread? It allows you to use
list.count in almost exactly that way, except that instead of passing
the callable directly, you pass an object that defers to your callable
in its __eq__ method.

HTH,

Ping

unread,

Jun 15, 2007, 1:52:55 PM6/15/07

to

On 6 15 , 11 17 , Dustan <DustanGro...@gmail.com> wrote:
> On Jun 15, 9:15 am, Ping <ping.nsr....@gmail.com> wrote:
>
> > > sum(1 for i in a_list if a_callable(i))
>
> > > --
> > > Carsten Haesehttp://informixdb.sourceforge.net
>
> > This works nicely but not very intuitive or readable to me.
>
> > First of all, the generator expression makes sense only to
> > trained eyes. Secondly, using sum(1 ...) to mean count()
> > isn't very intuitive either.
>
> Then wrap it in a function:
> def count(a_list, a_function):
> return sum(1 for i in a_list if a_function(i))
>
> And call the function. You can also give it a different name (although
> I can't think of a concise name that would express it any better).
>

Hmm... This sounds like the best idea so far. It is efficient both
in memory and time while exposes an easy-to-understand name.
I would name the function count_items though.

n = count_items(a_list, lambda x: x > 3) # very readable :)

cheers,
Ping

Ping

unread,

Jun 15, 2007, 1:55:11 PM6/15/07

to

Yes, I read it. This works, but it may make lots of dummy classes
with the special __eq__ method if I have many criteria to use for
counting. I find the function design mentioned by Dustan most
attractive. :)

cheers,
Ping

BJörn Lindqvist

unread,

Jun 15, 2007, 2:06:00 PM6/15/07

to Ping, pytho...@python.org

Maybe you could extend count() analogous to how sort() works:

# L is a list of Person objects, each Person has a name attribute
L.sort(key = attrgetter("name"))

# How many olle are there?
print L.count("olle", key = attrgetter("name"))

# And index could be extended in the same way!
# Whom has id 1234?
print L.index(1234, key = attrgetter("id")).name

All of these could be solved by giving Person an __eq__() method, but
it fails when you need to search, sort or count on a different key.

--
mvh Björn

Carsten Haese

unread,

Jun 15, 2007, 2:39:55 PM6/15/07

to pytho...@python.org

No, it won't. You only need one class that is given the criterion at
instantiation time. Something like this:

class WhereTrue(object):
def __init__(self, func):
self.func = func
def __eq__(self, other):
return self.func(other)

list1.count(WhereTrue(callable1))
list2.count(WhereTrue(callable2))

Carsten Haese

unread,

Jun 15, 2007, 2:51:27 PM6/15/07

to pytho...@python.org

On Fri, 2007-06-15 at 14:39 -0400, Carsten Haese wrote:
> class WhereTrue(object):
> def __init__(self, func):
> self.func = func
> def __eq__(self, other):
> return self.func(other)
>
> list1.count(WhereTrue(callable1))
> list2.count(WhereTrue(callable2))

P.S: Note, however, that this will only work if the list elements don't
define __eq__ methods themselves. If they do, their __eq__ methods get
called instead of WhereTrue's __eq__ method, leading to incorrect
results.

Dustan

unread,

Jun 15, 2007, 5:17:24 PM6/15/07

to

Although that particular example would be more efficient inlined,
because of the additional time spent creating the lambda function in
your count_items call.

sum(1 for i in a_list if i > 3)

Ping

unread,

Jun 16, 2007, 4:12:07 AM6/16/07

to

On 6 16 , 2 06 , "BJörn Lindqvist" <bjou...@gmail.com> wrote:
>
> Maybe you could extend count() analogous to how sort() works:
>
> # L is a list of Person objects, each Person has a name attribute
> L.sort(key = attrgetter("name"))
>
> # How many olle are there?
> print L.count("olle", key = attrgetter("name"))
>
> # And index could be extended in the same way!
> # Whom has id 1234?
> print L.index(1234, key = attrgetter("id")).name
>
> All of these could be solved by giving Person an __eq__() method, but
> it fails when you need to search, sort or count on a different key.
>
> --
> mvh Björn

Wow! This jumps out of my screen! I like it very much.
How to get the extension into the language?

cheers,
Ping

p.s. By the way, I guess you meant
print L[L.index(1234, key = attrgetter("id"))].name
in the index example.

Wildemar Wildenburger

unread,

Jun 16, 2007, 1:04:38 PM6/16/07

to pytho...@python.org

Ping wrote:
> On 6 16 , 2 06 , "BJörn Lindqvist" <bjou...@gmail.com> wrote:
>
>> Maybe you could extend count() analogous to how sort() works:
>>
>

> Wow! This jumps out of my screen! I like it very much.
> How to get the extension into the language?
>

Well, you subclass list and extend/override the method.

class SmartCountingList(list):
def count(self, item, func=lambda x: x):
return len([item for item in self if func(item) is True])

... or whatever (this is dummy code, not tested and probably riddled
with stupid errors --- so take this as a pseudocode example)

/W

Dustan

unread,

Jun 16, 2007, 4:37:01 PM6/16/07

to

On Jun 16, 12:04 pm, Wildemar Wildenburger <wilde...@freakmail.de>
wrote:

> class SmartCountingList(list):
> def count(self, item, func=lambda x: x):
> return len([item for item in self if func(item) is True])

A less bug-prone and (I would think) speedier example, although still
untested:

class SmartCountingList(list):
def count(self, item, func=lambda x: x):

return sum(1 for i in self if func(item)==item)

Then, you would call it as follows:
a_list.count(True, a_function)

Dustan

unread,

Jun 16, 2007, 8:28:31 PM6/16/07

to

On Jun 16, 3:37 pm, Dustan <DustanGro...@gmail.com> wrote:
> class SmartCountingList(list):
> def count(self, item, func=lambda x: x):

> return sum(1 for i in self if func(item)==item)
>
> Then, you would call it as follows:
> a_list.count(True, a_function)

I need to learn to think things through before hitting the send button
(or test my examples); none of the mistakes I've made on this thread
have been from ignorance.

If a_function returns a true value other than True or the number 1
(which are technically the same), it is not 'equal' to True. Either
the function would return True, or the count method could be written
differently:

class SmartCountingList(list):
def count(self, item, is_func=False):
if is_func:
# item being a function:
return sum(1 for i in self if item(i))
else:
return super(SmartCountingList, self).count(item)

And just to prove that it works:

>>> s = SmartCountingList((1,2,3))
>>> s
[1, 2, 3]
>>> s.count(1)
1
>>> s.count(2)
1
>>> s.count(3)
1
>>> s.count(4)
0
>>> s.count(lambda n: n<3, True)
2

Evan Klitzke

unread,

Jun 16, 2007, 8:38:06 PM6/16/07

to pytho...@python.org

On 6/16/07, Dustan <Dustan...@gmail.com> wrote:
> On Jun 16, 3:37 pm, Dustan <DustanGro...@gmail.com> wrote:
> > class SmartCountingList(list):
> > def count(self, item, func=lambda x: x):
> > return sum(1 for i in self if func(item)==item)
> >
> > Then, you would call it as follows:
> > a_list.count(True, a_function)
>
> I need to learn to think things through before hitting the send button
> (or test my examples); none of the mistakes I've made on this thread
> have been from ignorance.
>
> If a_function returns a true value other than True or the number 1
> (which are technically the same), it is not 'equal' to True. Either

If you're _really_ pedantic, 1 and True are _not_ the same, and this
can be an important distinction in some situations.

>>> 1 == True
True
>>> 1 is True
False

--
Evan Klitzke <ev...@yelp.com>

Steven D'Aprano

unread,

Jun 16, 2007, 9:04:55 PM6/16/07

to

Did you intend for the method to count the number of items where
func(item) is item instead of true?

Personally, I don't see any advantage to making this a subclass. I think a
bare function would be far, far more sensible, since you could then apply
it to any sequence or iterable.

Here's a version that should work with any iterable. If the iterable is a
short enough sequence, it will use filter to build an intermediate list.
If it is too long, or if it can't predict how long the intermediate list
will be, it falls back to code that doesn't build an intermediate list.

(The cut-off length I have used is a wild guess. Somebody who cares more
than me can time the various strategies tactics and work out the "best"
length to switch from one strategy to another. Don't forget to try it in
different versions of Python.)

def count_items(iterable, func=None, _cutoff=100003):
"""Count the number of items of iterable where func(item) is a
true value. Equivalent to len(filter(func, seq)).

If func is None or not given, counts the number of true items.

Will not work for iterators that do not terminate.
"""
try:
# Counting items in pure Python code is only
# worthwhile if building a temporary list using
# filter would take a long time.
use_filter = len(iterable) < _cutoff
except TypeError:
# iterable has no len(), so play it safe and
# avoid calling filter.
use_filter = False
if use_filter:
# Take advantage of the fast filter built-in.
return len(filter(func, iterable))
else:
n = 0
if func is None:
for item in iterable:
if item: n += 1
else:
for item in iterable:
if func(item): n += 1
return n

--
Steven.

Ping

unread,

Jun 17, 2007, 11:42:47 AM6/17/07

to

Somehow I did not see my post sent about 10 hours ago.
I'm posting it again. I apologize if it showed up twice.

After seeing all the ideas tossed around, now I like
the proposal made by BJörn Lindqvist the best, and I
extend it further to match the sort() method:
L.count(value, cmp=None, key=None)
With this signature, the callable case is simply
L.count(True, cmp=a_callable),
although now a_callable must return True instead
of anything logical equivalent. I can live with that.

I made an implementation with subclassing and
Carsten Haese's sum(1 ...) method, see below.
It works fine for me. It would be great to see
it supported by the built-in list. :)

cheers,
Ping

$ cat slist.py
#!/usr/bin/env python

from operator import *

class slist (list):
def count(self, value, cmp=None, key=None):
if not cmp and not key: return list.count(self, value)
if not cmp: cmp = eq
if not key: # cmp given, no key
return sum(1 for i in self if cmp(i, value))
# both cmp and key are given
return sum(1 for i in self if cmp(key(i), value))

class Person:
def __init__(self, first_name, last_name, age, gender):
self.first_name, self.last_name, self.age, self.gender
= \
first_name, last_name, age, gender

a = slist([3, 5, 7, 3])
print "a =", a
print "a has", a.count(3), "3's and", a.count(4), "4's."
print "a has", a.count(4, cmp=gt), "elements > 4 and", \
a.count(5, cmp=le), "elements <= 5."

b = slist( [ Person("John", "Smith", 30, 'm'), Person("Claire", "Doe",
23, 'f'), \
Person("John", "Black", 43, 'm'), Person("Anne", "Jolie", 50,
'f') ] )
print "b has", b.count("John", key=attrgetter("first_name")), \
"elements with first_name == John."
print "b has", b.count(25, cmp=le, key=attrgetter("age")), \
"elements with age <= 25."

$ ./slist.py
a = [3, 5, 7, 3]
a has 2 3's and 0 4's.
a has 2 elements > 4 and 3 elements <= 5.
b has 2 elements with first_name == John.
b has 2 elements with age <= 30.

Ping

unread,

Jun 17, 2007, 12:49:23 PM6/17/07

to

> print "b has", b.count(25, cmp=le, key=attrgetter("age")), \
> "elements with age <= 25."

> [deleted]

> b has 2 elements with age <= 30.

Oops, I mixed up when copying and pasting at different times... :p
The output was of course

b has 1 elements with age <= 25.

Ping

Antoon Pardon

unread,

Jun 18, 2007, 6:44:14 AM6/18/07

to

If you want to check return values, I would thing your callable argument
should make the call. Something like:

def returns_less_than_three(func):
return func() < 3

ls.count(returns_less_than_three)

Checking the return values implictly, would make it vey hard if not
impossible to check against the callables themselves if you want to.

--
Antoon Pardon

Ping

unread,

Jun 18, 2007, 1:21:28 PM6/18/07

to

Hi,

I patched Objects/listobject.c to support
L.count(value, cmp=None, key=None).
I tested it with the same script above by replacing slist
with built-in list. It worked correctly with this small
test. The patch is below (126 lines, I hope that's not
too big to be pasted here). This is the first time that
I modified CPython source, and I may very well make
mistakes in (lack of) reference counting or other things.
Comments and corrections are much appreciated!

Regards,
Ping

--- Objects/listobject.c.orig Sun Oct 29 05:39:10 2006
+++ Objects/listobject.c Tue Jun 19 01:04:30 2007
@@ -919,12 +919,12 @@

/* Comparison function. Takes care of calling a user-supplied
* comparison function (any callable Python object), which must not
be
- * NULL (use the ISLT macro if you don't know, or call
PyObject_RichCompareBool
- * with Py_LT if you know it's NULL).
- * Returns -1 on error, 1 if x < y, 0 if x >= y.
+ * NULL.
+ * Returns -9 on error, otherwise return the result of the user-
supplied
+ * comparison.
*/
static int
-islt(PyObject *x, PyObject *y, PyObject *compare)
+custom_compare(PyObject *x, PyObject *y, PyObject *compare)
{
PyObject *res;
PyObject *args;
@@ -936,7 +936,7 @@
*/
args = PyTuple_New(2);
if (args == NULL)
- return -1;
+ return -9;
Py_INCREF(x);
Py_INCREF(y);
PyTuple_SET_ITEM(args, 0, x);
@@ -944,16 +944,28 @@
res = PyObject_Call(compare, args, NULL);
Py_DECREF(args);
if (res == NULL)
- return -1;
+ return -9;
if (!PyInt_Check(res)) {
Py_DECREF(res);
PyErr_SetString(PyExc_TypeError,
"comparison function must return
int");
- return -1;
+ return -9;
}
i = PyInt_AsLong(res);
Py_DECREF(res);
- return i < 0;
+ return i;
+}
+
+/* "less-than" Comparison function. Calls custom_compare to do the
+ * actual comparison.
+ * Returns -1 on error, 1 if x < y, 0 if x >= y.
+ */
+static int
+islt(PyObject *x, PyObject *y, PyObject *compare)
+{
+ int res = custom_compare(x, y, compare);
+ if (res == -9) return -1;
+ return res < 0;
}

/* If COMPARE is NULL, calls PyObject_RichCompareBool with Py_LT,
else calls
@@ -2232,16 +2244,44 @@
}

static PyObject *
-listcount(PyListObject *self, PyObject *v)
+listcount(PyListObject *self, PyObject * args, PyObject *kwds)
{
+ PyObject *v = NULL; /* value for counting */
+ PyObject *compare = NULL;
+ PyObject *keyfunc = NULL;
+ static char *kwlist[] = {"value", "cmp", "key", 0};
+ PyObject *item;
Py_ssize_t count = 0;
Py_ssize_t i;
+ int cmp;
+
+ assert(self != NULL);
+ assert (PyList_Check(self));
+ if (args != NULL) {
+ if (!PyArg_ParseTupleAndKeywords(args, kwds, "O|
OO:count",
+ kwlist, &v, &compare, &keyfunc))
+ return NULL;
+ }
+ if (compare == Py_None)
+ compare = NULL;
+ if (keyfunc == Py_None)
+ keyfunc = NULL;

for (i = 0; i < self->ob_size; i++) {
- int cmp = PyObject_RichCompareBool(self->ob_item[i],
v, Py_EQ);
+ item = self->ob_item[i];
+ if (keyfunc != NULL) {
+ item = PyObject_CallFunctionObjArgs(keyfunc,
item,
+ NULL);
+ }
+
+ if (compare != NULL) {
+ cmp = custom_compare(item, v, compare);
+ } else {
+ cmp = PyObject_RichCompareBool(item, v,
Py_EQ);
+ }
if (cmp > 0)
count++;
- else if (cmp < 0)
+ else if (cmp == -9)
return NULL;
}
return PyInt_FromSsize_t(count);
@@ -2404,7 +2444,7 @@
PyDoc_STRVAR(index_doc,
"L.index(value, [start, [stop]]) -> integer -- return first index of
value");
PyDoc_STRVAR(count_doc,
-"L.count(value) -> integer -- return number of occurrences of
value");
+"L.count(value, cmp=None, key=None) -> integer -- return number of
occurrences of value [in the key] [with the cmp function].");
PyDoc_STRVAR(reverse_doc,
"L.reverse() -- reverse *IN PLACE*");
PyDoc_STRVAR(sort_doc,
@@ -2422,7 +2462,7 @@
{"pop", (PyCFunction)listpop, METH_VARARGS,
pop_doc},
{"remove", (PyCFunction)listremove, METH_O, remove_doc},
{"index", (PyCFunction)listindex, METH_VARARGS,
index_doc},
- {"count", (PyCFunction)listcount, METH_O, count_doc},
+ {"count", (PyCFunction)listcount, METH_VARARGS |
METH_KEYWORDS, count_doc},
{"reverse", (PyCFunction)listreverse, METH_NOARGS,
reverse_doc},
{"sort", (PyCFunction)listsort, METH_VARARGS |
METH_KEYWORDS, sort_doc},
{NULL, NULL} /* sentinel */

BJörn Lindqvist

unread,

Jun 18, 2007, 3:17:22 PM6/18/07

to Ping, pytho...@python.org

> I patched Objects/listobject.c to support
> L.count(value, cmp=None, key=None).
> I tested it with the same script above by replacing slist
> with built-in list. It worked correctly with this small
> test. The patch is below (126 lines, I hope that's not

Great! If you want this change included in Python, you should post it
on SourceForge's patch tracker at
http://sourceforge.net/tracker/?group_id=5470&atid=305470. Optionally,
you can also ask if people like the patch on python-dev. But IMHO, the
odds of this patch being accepted are slim (borrowing from the example
in the last thread):

persons.count("olle", key = attergetter("name"))

is longer and just barely more readable than

sum(1 for x in persons if x.name == "olle"))

--
mvh Björn

John Machin

unread,

Jun 18, 2007, 6:25:00 PM6/18/07

to

On Jun 19, 5:17 am, "BJörn Lindqvist" <bjou...@gmail.com> wrote:
>
> persons.count("olle", key = attergetter("name"))
>
> is longer and just barely more readable than
>
> sum(1 for x in persons if x.name == "olle"))
>

The OP's proposal seems to have a very narrow focus, whereas the
generator approach can handle a much wider range of queries:

sum(1 for x in persons if x.name == "olle" and x.country == "se"))
sum(x.salary for x in persons if x.name == "olle"))

By the time one has looked up the syntax for the augmented count
method, remembered the existence of something like "attergetter" [sic]
and nutted out its spelling and usage, somebody else using generators
would have the job and gone to lunch :-)

YAGNI. Case dismissed.