Best way to extract an item from a set of len 1

Tim Chase

Jan 25, 2006, 9:48:38 AM
to pytho...@python.org
When you have a set, known to be of length one, is there a "best"
("most pythonic") way to retrieve that one item?

# given that I've got Python2.3.[45] on hand,
# hack the following two lines to get a "set" object
>>> import sets
>>> set = sets.Set

>>> s = set(['test'])
>>> len(s)
1
>>> s[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unindexable object

(which is kinda expected, given that it's unordered...an index
doesn't make much sense)

To get the item, I had to resort to methods that fall short of
the elegance I've come to expect from Python:

>>> item = [x for x in s][0]

or the more convoluted two-step

>>> item = s.pop()
>>> s.add(item)

or even worse, intruding into private members

>>> item = s._data.keys()[0]

Is any of these more "pythonic" than the others? Is there a more
elegant 2.3.x solution? If one upgrades to 2.4+, is there
something even more elegant? I suppose I was looking for
something like

>>> item = s.aslist()[0]

which feels a little more pythonic (IMHO). Is one solution
preferred for speed over others (as this is happening in a fairly
deeply nested loop)?

Any tips, preferences, input, suggestions, pointers to obvious
things I've missed, or the like?

Thanks,

-tkc

Fredrik Lundh

Jan 25, 2006, 10:05:31 AM
to pytho...@python.org
Tim Chase wrote:

> I suppose I was looking for something like
>
> >>> item = s.aslist()[0]
>
> which feels a little more pythonic (IMHO). Is one solution
> preferred for speed over others (as this is happening in a fairly
> deeply nested loop)?

the obvious solution is

item = list(s)[0]

but that seems to be nearly twice as slow as [x for x in s][0]
under 2.4. hmm.

here's a faster variant:

item = iter(s).next()

but at least on my machine, your two-step solution

item = s.pop(); s.add(item)

seems to be even faster.

</F>

Peter Otten

Jan 25, 2006, 10:09:17 AM
Tim Chase wrote:

> When you have a set, known to be of length one, is there a "best"
> ("most pythonic") way to retrieve that one item?

>>> s = set(["one-and-only"])
>>> item, = s
>>> item
'one-and-only'

This works for any iterable and guarantees that it contains exactly one
item. The comma may easily be missed, though.
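
A minimal sketch of that guarantee, assuming a current Python 3 interpreter (where `set` is a builtin and no `sets` import is needed). The helper name `only` is hypothetical and does not come from this thread. One-element unpacking raises `ValueError` when the iterable holds zero items or more than one:

```python
def only(iterable):
    """Return the single item of iterable, or raise ValueError."""
    # One-element unpacking: fails unless there is exactly one item.
    item, = iterable
    return item


print(only({"one-and-only"}))

# Both an empty set and a two-item set are rejected.
for bad in (set(), {1, 2}):
    try:
        only(bad)
    except ValueError as exc:
        print("rejected:", exc)
```

Wrapping the unpacking in a named function is one way to address the "comma may easily be missed" concern, at the cost of a function call.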

Peter

Fuzzyman

Jan 25, 2006, 10:11:40 AM
That's cute. :-)

Fuzzyman

Rene Pijlman

Jan 25, 2006, 10:13:35 AM
Tim Chase:

>When you have a set, known to be of length one, is there a "best"
>("most pythonic") way to retrieve that one item?

e = s.copy().pop() #:-)

--
René Pijlman

What do you want to become? http://www.carrieretijger.nl

Alex Martelli

Jan 25, 2006, 10:13:47 AM
Tim Chase <pytho...@tim.thechases.com> wrote:
...

> To get the item, i had to resort to methods that feel less than
> the elegance I've come to expect from python:
>
> >>> item = [x for x in s][0]

A shorter, clearer expression of the same idea:

item = list(s)[0]

or

item = list(s).pop()

> or the more convoluted two-step
>
> >>> item = s.pop()
> >>> s.add(item)

which in turn suggests

item = set(s).pop()

Similar ideas include iter(s).next() and s.copy().pop().

Basically: s has no way to get the item non-destructively, so, either
make a copy (and use the destructive-get 'pop' on the copy) or build
from s a type which DOES have ways to get the item (iterator, list, etc)
be they destructive or not. As for speed, measuring is the only way,
and timeit is your friend. As for elegance, the most concise readable
form is "set(s).pop()" and that's what I would use.
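
The two strategies described above can be sketched side by side. This assumes a current Python 3 interpreter, where the `iter(s).next()` spelling used in this thread becomes `next(iter(s))`; the variable names are illustrative only.

```python
s = {"test"}

# Strategy 1: copy, then use the destructive pop() on the copy.
via_set_copy = set(s).pop()
via_copy_method = s.copy().pop()

# Strategy 2: build something from s that can hand out the item directly.
via_list = list(s)[0]
via_iterator = next(iter(s))

# Every approach yields the same item and leaves s untouched.
results = [via_set_copy, via_copy_method, via_list, via_iterator]
print(results)
print(s)
```

None of these checks that `s` really has exactly one item; with a larger set they would silently return an arbitrary element.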


Alex


Rene Pijlman

Jan 25, 2006, 10:21:42 AM
Peter Otten:

>>>> s = set(["one-and-only"])
>>>> item, = s
>>>> item
>'one-and-only'
>
>This works for any iterable and guarantees that it contains exactly one
>item.

Nice!

>The comma may easily be missed, though.

You could write:

(item,) = s

But I'm not sure if this introduces additional overhead.

Alex Martelli

Jan 25, 2006, 10:35:35 AM
Fredrik Lundh <fre...@pythonware.com> wrote:
...

> the obvious solution is
>
> item = list(s)[0]
>
> but that seems to be nearly twice as slow as [x for x in s][0]
> under 2.4. hmm.

Funny, and true on my laptop too:

helen:~ alex$ python -mtimeit -s's=set([23])' 'x=list(s)[0]'
100000 loops, best of 3: 2.55 usec per loop
helen:~ alex$ python -mtimeit -s's=set([23])' 'x=[x for x in s][0]'
100000 loops, best of 3: 1.48 usec per loop
helen:~ alex$ python -mtimeit -s's=set([23])' '[x for x in s]'
1000000 loops, best of 3: 1.36 usec per loop

Exploiting the design defect whereby a list comprehension leaves its
loop variable bound can shave another few percent off it, as shown.

> here's a faster variant:
>
> item = iter(s).next()

Not all that fast here:

helen:~ alex$ python -mtimeit -s's=set([23])' 'x=iter(s).next()'
100000 loops, best of 3: 1.71 usec per loop

> but at least on my machine, your two-step solution
>
> item = s.pop(); s.add(item)
> seems to be even faster.

Not really, here:

helen:~ alex$ python -mtimeit -s's=set([23])' 'x=s.pop();s.add(x)'
100000 loops, best of 3: 1.49 usec per loop

No joy from several variations on transform-and-pop:

helen:~ alex$ python -mtimeit -s's=set([23])' 'x=set(s).pop()'
100000 loops, best of 3: 2.21 usec per loop
helen:~ alex$ python -mtimeit -s's=set([23])' 'x=list(s).pop()'
100000 loops, best of 3: 3.2 usec per loop
helen:~ alex$ python -mtimeit -s's=set([23])' 'x=tuple(s)[0]'
100000 loops, best of 3: 1.79 usec per loop


Fastest I've found is unpacking-assignment:

helen:~ alex$ python -mtimeit -s's=set([23])' 'x,=s'
1000000 loops, best of 3: 0.664 usec per loop
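
For anyone who wants to repeat these measurements, here is a rough sketch using the `timeit` module from a script rather than the command line. It assumes Python 3, so `iter(s).next()` is written `next(iter(s))` and the comprehension uses its own variable `y`; the names `CANDIDATES` and `measure` are made up for this example. Numbers will differ by machine and interpreter version, so the ranking above may not carry over.

```python
import timeit

# Statements taken from this thread, adapted to Python 3.
CANDIDATES = [
    "x = list(s)[0]",
    "x = [y for y in s][0]",
    "x = next(iter(s))",
    "x = s.pop(); s.add(x)",
    "x = set(s).pop()",
    "x = tuple(s)[0]",
    "x, = s",
]


def measure(stmt, number=100_000, repeat=3):
    """Return the best per-loop time in microseconds for stmt."""
    timer = timeit.Timer(stmt, setup="s = set([23])")
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


timings = {stmt: measure(stmt) for stmt in CANDIDATES}

# Print the candidates from fastest to slowest on this machine.
for stmt, usec in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{usec:8.3f} usec  {stmt}")
```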


Alex

Fredrik Lundh

Jan 25, 2006, 10:39:03 AM
to pytho...@python.org
Peter Otten wrote:

you can make this a bit more obvious:

>>> [item] = s

this is almost twice as fast as the fastest alternative from my previous
post.

</F>

Alex Martelli

Jan 26, 2006, 12:20:02 AM
Rene Pijlman <reply.in.th...@my.address.is.invalid> wrote:
> Peter Otten:
> >>>> s = set(["one-and-only"])
> >>>> item, = s
...

> >The comma may easily be missed, though.
>
> You could write:
>
> (item,) = s
>
> But I'm not sure if this introduces additional overhead.

Naah...:

helen:~ alex$ python -mtimeit -s's=set([23])' 'x,=s'
1000000 loops, best of 3: 0.689 usec per loop
helen:~ alex$ python -mtimeit -s's=set([23])' '(x,)=s'
1000000 loops, best of 3: 0.652 usec per loop
helen:~ alex$ python -mtimeit -s's=set([23])' '[x]=s'
1000000 loops, best of 3: 0.651 usec per loop

...much of a muchness.


Alex

Peter Otten

Jan 26, 2006, 3:11:41 AM
Alex Martelli wrote:

And that is no coincidence. All three variants are compiled to the same
bytecode:

>>> import dis
>>> def a(): x, = s
...
>>> def b(): (x,) = s
...
>>> def c(): [x] = s
...
>>> dis.dis(a)
  1           0 LOAD_GLOBAL              0 (s)
              3 UNPACK_SEQUENCE          1
              6 STORE_FAST               0 (x)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
>>> dis.dis(b)
  1           0 LOAD_GLOBAL              0 (s)
              3 UNPACK_SEQUENCE          1
              6 STORE_FAST               0 (x)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
>>> dis.dis(c)
  1           0 LOAD_GLOBAL              0 (s)
              3 UNPACK_SEQUENCE          1
              6 STORE_FAST               0 (x)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
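
The same comparison can be done programmatically instead of by eye. This is a sketch assuming Python 3.4 or later, where `dis.get_instructions` is available; the function names and the `opcodes` helper are invented for the example. The exact opcodes vary between interpreter versions, so the code only checks that the three spellings agree with each other and that an `UNPACK_SEQUENCE` is present.

```python
import dis

s = {"one-and-only"}


def unpack_bare():
    x, = s
    return x


def unpack_parens():
    (x,) = s
    return x


def unpack_brackets():
    [x] = s
    return x


def opcodes(func):
    """List (opname, argval) pairs, ignoring offsets and line numbers."""
    return [(ins.opname, ins.argval) for ins in dis.get_instructions(func)]


same = opcodes(unpack_bare) == opcodes(unpack_parens) == opcodes(unpack_brackets)
print("identical bytecode:", same)
```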

Peter

Christophe

Jan 26, 2006, 12:13:54 PM
Alex Martelli wrote:

> Fredrik Lundh <fre...@pythonware.com> wrote:
> ...
>
>>the obvious solution is
>>
>> item = list(s)[0]
>>
>>but that seems to be nearly twice as slow as [x for x in s][0]
>>under 2.4. hmm.
>
>
> Funny, and true on my laptop too:
>
> helen:~ alex$ python -mtimeit -s's=set([23])' 'x=list(s)[0]'
> 100000 loops, best of 3: 2.55 usec per loop

That's probably because of the name lookup needed for "list"

> helen:~ alex$ python -mtimeit -s's=set([23])' 'x=[x for x in s][0]'
> 100000 loops, best of 3: 1.48 usec per loop
> helen:~ alex$ python -mtimeit -s's=set([23])' '[x for x in s]'
> 1000000 loops, best of 3: 1.36 usec per loop
>
> Exploiting the design defect whereby a LC leaves variables bound can
> shave another few percents off it, as shown.
>
>
>>here's a faster variant:
>>
>> item = iter(s).next()
>
>
> Not all that fast here:
>
> helen:~ alex$ python -mtimeit -s's=set([23])' 'x=iter(s).next()'
> 100000 loops, best of 3: 1.71 usec per loop

That's probably because of the 2 name lookups needed :) "iter" and "next"
