[Python-Dev] performance of {} versus dict()


Chris Withers

Nov 14, 2012, 4:12:05 AM
to <python-dev@python.org>
Hi All,

A colleague pointed me at Doug's excellent article here:

http://www.doughellmann.com/articles/misc/dict-performance/index.html

...which made me a little sad, I suspect I'm not the only one who finds:

a_dict = dict(
    x = 1,
    y = 2,
    z = 3,
    ...
)

...easier to read than:

a_dict = {
    'x':1,
    'y':2,
    'z':3,
    ...
}

What can we do to speed up the former case?

Here's a comparison for different versions of CPython:

$ python2.5 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 2.96 2.49 2.47 2.42 2.42
1000000 loops, best of 5: 2.42 usec per loop
$ python2.5 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
raw times: 1.69 1.71 1.68 1.68 1.68
1000000 loops, best of 5: 1.68 usec per loop

$ python2.6 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 2.41 2.41 2.42 2.44 2.41
1000000 loops, best of 5: 2.41 usec per loop
$ python2.6 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
raw times: 1.51 1.51 1.52 1.51 1.51
1000000 loops, best of 5: 1.51 usec per loop

$ python2.7 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 2.32 2.31 2.31 2.32 2.31
1000000 loops, best of 5: 2.31 usec per loop
$ python2.7 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
raw times: 1.49 1.49 1.77 1.76 1.55
1000000 loops, best of 5: 1.49 usec per loop

So, not the 6 times headline figure that Doug quotes, but certainly a
difference. Can someone with Python 3 handy compare there too?
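
For anyone who would rather script the comparison than retype the command lines, the same measurement can be sketched with the timeit module. The helper name best_usec and the number/repeat values are choices made for this illustration only; absolute figures will differ by machine and interpreter.

```python
import timeit

KEYWORD_FORM = "dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)"
LITERAL_FORM = "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"


def best_usec(stmt, number=100000, repeat=5):
    """Best per-loop time in microseconds (hypothetical helper)."""
    raw = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(raw) / number * 1e6


if __name__ == "__main__":
    for label, stmt in (("dict()", KEYWORD_FORM), ("literal", LITERAL_FORM)):
        print("%-8s %.3f usec per loop" % (label, best_usec(stmt)))
```

Taking the minimum of the repeats mirrors the "best of 5" figure that the timeit command line reports.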

cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk

Ulrich Eckhardt

Nov 14, 2012, 4:30:27 AM
to pytho...@python.org
On 14.11.2012 10:12, Chris Withers wrote:
> Can someone with Python 3 handy compare there too?

C:\Python27\python -m timeit -n 1000000 -r 5 -v "dict(a=1, b=2, c=3, d=4, e=5, f=6, g=7)"
raw times: 0.918 0.924 0.922 0.928 0.926
1000000 loops, best of 5: 0.918 usec per loop

C:\Python27\python -m timeit -n 1000000 -r 5 -v "{'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7}"
raw times: 0.48 0.49 0.482 0.496 0.497
1000000 loops, best of 5: 0.48 usec per loop

C:\Python32\python -m timeit -n 1000000 -r 5 -v "dict(a=1, b=2, c=3, d=4, e=5, f=6, g=7)"
raw times: 0.898 0.891 0.892 0.899 0.891
1000000 loops, best of 5: 0.891 usec per loop

C:\Python32\python -m timeit -n 1000000 -r 5 -v "{'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7}"
raw times: 0.444 0.463 0.461 0.464 0.461
1000000 loops, best of 5: 0.444 usec per loop

C:\Python32_64\python -m timeit -n 1000000 -r 5 -v "dict(a=1, b=2, c=3, d=4, e=5, f=6, g=7)"
raw times: 0.908 0.923 0.927 0.912 0.923
1000000 loops, best of 5: 0.908 usec per loop

C:\Python32_64\python -m timeit -n 1000000 -r 5 -v "{'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7}"
raw times: 0.484 0.446 0.501 0.45 0.442
1000000 loops, best of 5: 0.442 usec per loop

C:\Python33_64\python -m timeit -n 1000000 -r 5 -v "dict(a=1, b=2, c=3, d=4, e=5, f=6, g=7)"
raw times: 1.02 1.07 1.03 1.11 1.07
1000000 loops, best of 5: 1.02 usec per loop

C:\Python33_64\python -m timeit -n 1000000 -r 5 -v "{'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7}"
raw times: 0.444 0.449 0.455 0.452 0.46
1000000 loops, best of 5: 0.444 usec per loop


Tested on Win7/64 bit. Python 2.7 is the 32-bit version, 3.2 is
installed as 32-bit and 64-bit versions and 3.3 only as 64-bit version.
In any case, the difference is even a bit larger than what you
measured, and all versions seem to perform roughly the same.

Uli




Chris Withers

Nov 14, 2012, 5:00:59 AM
to Merlijn van Deen, <python-dev@python.org>
On 14/11/2012 09:58, Merlijn van Deen wrote:
> On 14 November 2012 10:12, Chris Withers <ch...@simplistix.co.uk> wrote:
>> ...which made me a little sad
>
> Why did it make you sad? dict() takes 0.2µs, {} takes 0.04µs. In other
> words: you can run dict() _five million_ times per second, and {}
> twenty-five million times per second. That is 'a lot' and 'a lot'. It
> also means you are unlikely to notice the difference in real-world
> code. Just use the one you feel is clearer in the situation, and don't
> worry about micro-optimization.

I'm inclined to agree, but it makes me sad for two reasons:

- it's something that people get hung up on, for better or worse. (if it
wasn't, Doug wouldn't have written his article)

- it can make a difference, for example setting up a dict with many keys
at the core of a tight loop.

Without looking at implementation, they should logically perform the same...

mar...@v.loewis.de

Nov 14, 2012, 5:11:39 AM
to pytho...@python.org

Quoting Chris Withers <ch...@simplistix.co.uk>:

> a_dict = dict(
> x = 1,
> y = 2,
> z = 3,
> ...
> )

> What can we do to speed up the former case?

It should be possible to special-case it. Rather than creating
a new dictionary from scratch, one could try to have the new dictionary
the same size as the original one, and copy all entries.

I also wonder whether the PyArg_ValidateKeywordArguments call is really
necessary: if this is not a proper keyword dictionary, dict creation
could still proceed in a reasonable way.

I don't know how much this would gain, though. You still have to
create two dictionary objects. For a better speedup, try

def xdict(**kwds):
    return kwds

(possibly written in C for even more speed)
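
A minimal sketch of the idea, assuming nothing beyond the two-line function above: the interpreter already packs the keyword arguments into a new dict, so returning that dict avoids building a second one.

```python
def xdict(**kwds):
    # kwds is a fresh dict built by the call machinery for this call,
    # so it can be handed back directly instead of being copied.
    return kwds


prefs = xdict(x=1, y=2, z=3)
same_as_builtin = (prefs == dict(x=1, y=2, z=3))
print(same_as_builtin)
```

Each call yields an independent dict, so callers can mutate the result without affecting one another.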

Regards,
Martin

Antoine Pitrou

Nov 14, 2012, 5:48:09 AM
to pytho...@python.org
On Wed, 14 Nov 2012 10:00:59 +0000,
Chris Withers <ch...@simplistix.co.uk> wrote:

> On 14/11/2012 09:58, Merlijn van Deen wrote:
> > On 14 November 2012 10:12, Chris Withers <ch...@simplistix.co.uk>
> > wrote:
> >> ...which made me a little sad
> >
> > Why did it make you sad? dict() takes 0.2µs, {} takes 0.04µs. In
> > other words: you can run dict() _five million_ times per second,
> > and {} twenty-five million times per second. That is 'a lot' and 'a
> > lot'. It also means you are unlikely to notice the difference in
> > real-world code. Just use the one you feel is clearer in the
> > situation, and don't worry about micro-optimization.
>
> I'm inclined to agree, but it makes me sad for two reasons:
>
> - it's something that people get hung up on, for better or worse. (if
> it wasn't, Doug wouldn't have written his article)
>
> - it can make a difference, for example setting up a dict with many
> keys at the core of a tight loop.

Well, please post examples of *real-world* use cases where it makes a
difference.
Otherwise, you're asking us to add hacks to the implementation just to
make you feel good, which is quite unacceptable.

Regards

Antoine.

Steven D'Aprano

Nov 14, 2012, 6:21:03 AM
to pytho...@python.org
On 14/11/12 21:00, Chris Withers wrote:
> On 14/11/2012 09:58, Merlijn van Deen wrote:
>> On 14 November 2012 10:12, Chris Withers <ch...@simplistix.co.uk> wrote:
>>> ...which made me a little sad
>>
>> Why did it make you sad? dict() takes 0.2µs, {} takes 0.04µs. In other
>> words: you can run dict() _five million_ times per second, and {}
>> twenty-five million times per second. That is 'a lot' and 'a lot'. It
>> also means you are unlikely to notice the difference in real-world
>> code. Just use the one you feel is clearer in the situation, and don't
>> worry about micro-optimization.
>
> I'm inclined to agree, but it makes me sad for two reasons:
>
> - it's something that people get hung up on, for better or worse. (if it
> wasn't, Doug wouldn't have written his article)

People get hung up on all sorts of things. I would hate to think we would
complicate the implementation to pander to pointless micro-optimization.
I'm sure that there are many far more important things than this.


> - it can make a difference, for example setting up a dict with many keys
> at the core of a tight loop.

Ah yes, the semi-mythical "tight loop".

I've never come across one of these tight loops that creates a dict with
many keys and then apparently fails to actually use it
for anything useful. For if it did, surely the actual work done with the
dict is going to outweigh the setup cost for all but the most trivial
applications. I find it hard to get uptight about a small inefficiency
in trivial applications that don't do much.

Show me a non-toy use-case where creating dicts is an actual bottleneck,
and I'll revise my position.


> Without looking at implementation, they should logically perform the same...

I disagree. Calling dict() has to do a name lookup, and then a function
call. That alone is almost 2.5 times as expensive as creating a dict
literal on my machine:

[steve@ando ~]$ python3.3 -m timeit "d = {}"
10000000 loops, best of 3: 0.17 usec per loop
[steve@ando ~]$ python3.3 -m timeit "d = dict()"
1000000 loops, best of 3: 0.416 usec per loop

Then you have the function call itself, which engages the argument parsing
mechanism, which does more work than dict literal syntax. For example, it
checks for duplicate keyword arguments, while dict literals happily accept
duplicate keys.

It's hardly a surprise that dict() is slower than {}.
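
The difference in duplicate handling can be sketched as follows. This is an illustration run against Python 3 semantics as I understand them, not a statement about every interpreter version; the variable names are made up for the example.

```python
# A dict literal silently keeps the last value for a repeated key.
literal = {'a': 1, 'a': 2}
print(literal)

# The call machinery refuses a keyword that arrives twice, here once
# spelled out and once through ** unpacking.
try:
    dict(a=1, **{'a': 2})
except TypeError as exc:
    duplicate_keyword_error = exc
else:
    duplicate_keyword_error = None

print(type(duplicate_keyword_error).__name__)
```

That extra bookkeeping is part of the work the constructor does and the literal skips.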

--
Steven

Chris Angelico

Nov 14, 2012, 8:18:38 AM
to pytho...@python.org
On Wed, Nov 14, 2012 at 8:12 PM, Chris Withers <ch...@simplistix.co.uk> wrote:
> I suspect I'm not the only one who finds:
>
> a_dict = dict(
> x = 1,
> y = 2,
> z = 3,
> ...
> )
>
> ...easier to read than:
>
> a_dict = {
> 'x':1,
> 'y':2,
> 'z':3,
> ...
> }
>
> What can we do to speed up the former case?

Perhaps an alternative question: What can be done to make the latter
less unpalatable? I personally prefer dict literal syntax to a dict
constructor call, but no doubt there are a number of people who feel
as you do. In what way(s) do you find the literal syntax less
readable, and can some simple (and backward-compatible) enhancements
help that?

I've seen criticisms (though I don't recall where) of Python,
comparing it to JavaScript/ECMAScript, that complain of the need to
quote the keys. IMO this is a worthwhile downside, as it allows you to
use variables as the keys, rather than requiring (effectively) literal
strings. But it does make a dict literal that much more "noisy" than
the constructor.
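
A small sketch of that trade-off, with a hypothetical variable named field: in a literal an unquoted name is evaluated, while in the constructor call it is taken as the key string itself.

```python
field = 'colour'  # hypothetical variable holding a key name

# In a literal, the key is whatever the expression evaluates to.
from_literal = {field: 'red'}

# In a keyword call, the name itself becomes the key.
from_keywords = dict(field='red')

print(from_literal, from_keywords)
```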

ChrisA

Benjamin Peterson

Nov 14, 2012, 8:43:18 AM
to mar...@v.loewis.de, pytho...@python.org
2012/11/14 <mar...@v.loewis.de>:
>
> Quoting Chris Withers <ch...@simplistix.co.uk>:
>
>
>> a_dict = dict(
>> x = 1,
>> y = 2,
>> z = 3,
>> ...
>> )
>
>
>> What can we do to speed up the former case?
>
>
> It should be possible to special-case it. Rather than creating
> a new dictionary from scratch, one could try to have the new dictionary
> the same size as the original one, and copy all entries.
>
> I also wonder whether the PyArg_ValidateKeywordArguments call is really
> necessary: if this is not a proper keyword dictionary, dict creation
> could still proceed in a reasonable way.

In the common case PyArg_ValidateKeywordArguments should be a simple check.



--
Regards,
Benjamin

Stefan Behnel

Nov 14, 2012, 9:35:37 AM
to pytho...@python.org
Chris Angelico, 14.11.2012 14:18:
If that bothers you in a specific case, I recommend using the constructor
instead of a literal.

Stefan

Mark Adam

Nov 14, 2012, 11:12:54 AM
to <python-dev@python.org>
On Wed, Nov 14, 2012 at 3:12 AM, Chris Withers <ch...@simplistix.co.uk> wrote:
> Hi All,
>
> A colleague pointed me at Doug's excellent article here:
> ...which made me a little sad, I suspect I'm not the only one who finds:
>
> a_dict = dict(
> x = 1,
> y = 2,
> z = 3,
> ...
> )
>
> ...easier to read than:
>
> a_dict = {
> 'x':1,
> 'y':2,
> 'z':3,
> ...
> }

Hey, it makes me a little sad that dict breaks convention by allowing
the use of unquoted characters (which everywhere else looks like
variable names) just for a silly typing optimization.

mark

Oleg Broytman

Nov 14, 2012, 11:20:55 AM
to pytho...@python.org
On Wed, Nov 14, 2012 at 10:12:54AM -0600, Mark Adam <dreamin...@gmail.com> wrote:
> On Wed, Nov 14, 2012 at 3:12 AM, Chris Withers <ch...@simplistix.co.uk> wrote:
> > Hi All,
> >
> > A colleague pointed me at Doug's excellent article here:
> > ...which made me a little sad, I suspect I'm not the only one who finds:
> >
> > a_dict = dict(
> > x = 1,
> > y = 2,
> > z = 3,
> > ...
> > )
> >
> > ...easier to read than:
> >
> > a_dict = {
> > 'x':1,
> > 'y':2,
> > 'z':3,
> > ...
> > }
>
> Hey, it makes me a little sad that dict breaks convention by allowing
> the use of unquoted characters (which everywhere else looks like
> variable names) just for a silly typing optimization.

It doesn't. It's a call (a function call or a class instantiation)
and it's not dict-specific: function(a=1, b=None)...

Oleg.
--
Oleg Broytman http://phdru.name/ p...@phdru.name
Programmers don't die, they just GOSUB without RETURN.

Serhiy Storchaka

Nov 14, 2012, 11:23:16 AM
to pytho...@python.org
On 14.11.12 11:12, Chris Withers wrote:
> ....which made me a little sad, I suspect I'm not the only one who finds:
>
> a_dict = dict(
> x = 1,
> y = 2,
> z = 3,
> ...
> )
>
> ....easier to read than:
>
> a_dict = {
> 'x':1,
> 'y':2,
> 'z':3,
> ...
> }

PEP 8 recommends:

a_dict = dict(
    x=1,
    y=2,
    z=3,
    ...
)

and

a_dict = {
    'x': 1,
    'y': 2,
    'z': 3,
    ...
}


Richard Oudkerk

Nov 14, 2012, 11:42:39 AM
to pytho...@python.org
On 14/11/2012 4:23pm, Serhiy Storchaka wrote:
> PEP 8 recommends:
>
> a_dict = dict(
> x=1,
> y=2,
> z=3,
> ...
> )
>
> and
>
> a_dict = {
> 'x': 1,
> 'y': 2,
> 'z': 3,
> ...
> }

In which section? I can't see such a recommendation.

--
Richard

Brian Curtin

Nov 14, 2012, 12:00:47 PM
to Mark Adam, <python-dev@python.org>
On Wed, Nov 14, 2012 at 10:12 AM, Mark Adam <dreamin...@gmail.com> wrote:
> On Wed, Nov 14, 2012 at 3:12 AM, Chris Withers <ch...@simplistix.co.uk> wrote:
>> Hi All,
>>
>> A colleague pointed me at Doug's excellent article here:
>> ...which made me a little sad, I suspect I'm not the only one who finds:
>>
>> a_dict = dict(
>> x = 1,
>> y = 2,
>> z = 3,
>> ...
>> )
>>
>> ...easier to read than:
>>
>> a_dict = {
>> 'x':1,
>> 'y':2,
>> 'z':3,
>> ...
>> }
>
> Hey, it makes me a little sad that dict breaks convention by allowing
> the use of unquoted characters (which everywhere else looks like
> variable names) just for a silly typing optimization.

What convention and typing optimization is this? I hope you aren't
suggesting it should be dict("x"=1) or dict("x":1)?

Xavier Morel

Nov 14, 2012, 12:02:43 PM
to python-dev Dev

On 2012-11-14, at 17:42 , Richard Oudkerk wrote:

> On 14/11/2012 4:23pm, Serhiy Storchaka wrote:
>> PEP 8 recommends:
>>
>> a_dict = dict(
>> x=1,
>> y=2,
>> z=3,
>> ...
>> )
>>
>> and
>>
>> a_dict = {
>> 'x': 1,
>> 'y': 2,
>> 'z': 3,
>> ...
>> }
>
> In which section? I can't see such a recommendation.

Whitespace in Expressions and Statements > Other Recommendations

3rd bullet:


Don't use spaces around the = sign when used to indicate a keyword argument or a default parameter value.

Yes:

def complex(real, imag=0.0):
    return magic(r=real, i=imag)

No:

def complex(real, imag = 0.0):
    return magic(r = real, i = imag)

Mark Adam

Nov 14, 2012, 12:08:32 PM
to Xavier Morel, python-dev Dev
That's not a recommendation to use the **kwargs style.

mark

Mark Adam

Nov 14, 2012, 12:10:15 PM
to Brian Curtin, <python-dev@python.org>
On Wed, Nov 14, 2012 at 11:00 AM, Brian Curtin <br...@python.org> wrote:
> On Wed, Nov 14, 2012 at 10:12 AM, Mark Adam <dreamin...@gmail.com> wrote:
>> On Wed, Nov 14, 2012 at 3:12 AM, Chris Withers <ch...@simplistix.co.uk> wrote:
>>> Hi All,
>>>
>>> A colleague pointed me at Doug's excellent article here:
>>> ...which made me a little sad, I suspect I'm not the only one who finds:
>>>
>>> a_dict = dict(
>>> x = 1,
>>> y = 2,
>>> z = 3,
>>> ...
>>> )
>>>
>>> ...easier to read than:
>>>
>>> a_dict = {
>>> 'x':1,
>>> 'y':2,
>>> 'z':3,
>>> ...
>>> }
>>
>> Hey, it makes me a little sad that dict breaks convention by allowing
>> the use of unquoted characters (which everywhere else looks like
>> variable names) just for a silly typing optimization.
>
> What convention and typing optimization is this? I hope you aren't
> suggesting it should be dict("x"=1) or dict("x":1)?

Try the canonical {'x':1}. Only dict allows the special
initialization above. Other collections require an iterable. I'm guessing
**kwargs initialization was only used because it is so simple to
implement, but that's not necessarily a heuristic for good language design.

mark

R. David Murray

Nov 14, 2012, 12:27:46 PM
to pytho...@python.org
Maybe it's not good design, but I'll bet you that if it didn't do that,
there would be lots of instances of this scattered around various
codebases:

def makedict(**kw):
    return kw

--David

Mark Adam

Nov 14, 2012, 12:57:48 PM
to R. David Murray, pytho...@python.org
On Wed, Nov 14, 2012 at 11:27 AM, R. David Murray <rdmu...@bitdance.com> wrote:
> Maybe it's not good design, but I'll bet you that if it didn't do that,
> there would be lots of instances of this scattered around various
> codebases:
>
> def makedict(**kw):
>     return kw

Now that's a good solution and probably solves the OP speed problem.

mark

Richard Oudkerk

Nov 14, 2012, 1:01:53 PM
to pytho...@python.org
On 14/11/2012 5:02pm, Xavier Morel wrote:
>> In which section? I can't see such a recommendation.
>
> Whitespace in Expressions and Statements > Other Recommendations
>
> 3rd bullet:
>
> —
> Don't use spaces around the = sign when used to indicate a keyword argument or a default parameter value.

Oops, I did not even notice that difference.

I thought Serhiy was talking about indenting the closing ')' and '}'.

--
Richard

Xavier Morel

Nov 14, 2012, 1:08:44 PM
to python-dev Dev
On 2012-11-14, at 18:08 , Mark Adam wrote:
>
> That's not a recommendation to use the **kwargs style.

And nobody said it was. It's a recommendation to not put spaces around
the equals sign when using keyword arguments which is the correction
Serhiy applied to the original code (along with adding a space after the
colon in the literal dict, also a PEP8 recommendation).

Xavier Morel

Nov 14, 2012, 1:12:25 PM
to python-dev Dev
On 2012-11-14, at 18:10 , Mark Adam wrote:
>
> Try the canonical {'x':1}. Only dict allows the special
> initialization above. Other collections require an iterable.

Other collections don't have a choice, because it would often be
ambiguous. Dicts do not have that issue.

> I'm guessing
> **kwargs initialization was only used because it is so simple to
> implement, but that's not necessarily a heuristic for good language design.

In this case it very much is, it permits easily merging two dicts in a
single expression or cloning-with-replacement. It also mirrors the
signature of dict.update which I think is a Good Thing.

Mark Adam

Nov 14, 2012, 1:54:32 PM
to Xavier Morel, python-dev Dev
On Wed, Nov 14, 2012 at 12:12 PM, Xavier Morel <catc...@masklinn.net> wrote:
> On 2012-11-14, at 18:10 , Mark Adam wrote:
>>
>> Try the canonical {'x':1}. Only dict allows the special
>> initialization above. Other collections require an iterable.
>
> Other collections don't have a choice, because it would often be
> ambiguous. Dicts do not have that issue.

mkay....

>> I'm guessing
>> **kwargs initialization was only used because it is so simple to
>> implement, but that's not necessarily a heuristic for good language design.
>
> In this case it very much is, it permits easily merging two dicts in a
> single expression or cloning-with-replacement. It also mirrors the
> signature of dict.update which I think is a Good Thing.

Merging of two dicts is done with dict.update. How do you do it on
initialization? This doesn't make sense.

mark

Xavier Morel

Nov 14, 2012, 2:37:02 PM
to python-dev Dev
On 2012-11-14, at 19:54 , Mark Adam wrote:
>
> Merging of two dicts is done with dict.update.

No, dict.update merges one dict (or two) into a third one.

> How do you do it on
> initialization? This doesn't make sense.

dict(d1, **d2)

Oleg Broytman

Nov 14, 2012, 8:25:00 AM
to pytho...@python.org
On the other hand it's more powerful. You can write {'class': 'foo'}
but cannot dict(class='bar'). {1: '1'} but not dict(1='1').
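
Both restrictions are syntax-level, which a short check with compile() can illustrate. The helper name compiles is hypothetical and exists only for this sketch.

```python
def compiles(source):
    """Return True if source parses as an expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Any hashable object can be a key in a literal, reserved words included.
literal = {'class': 'foo', 1: '1'}

# Keyword arguments must be plain, non-reserved identifiers.
print(compiles("dict(class='bar')"))
print(compiles("dict(1='1')"))
print(compiles("dict(klass='bar')"))
```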

Oleg.

Mark Adam

Nov 14, 2012, 3:53:11 PM
to Xavier Morel, python-dev Dev
On Wed, Nov 14, 2012 at 1:37 PM, Xavier Morel <catc...@masklinn.net> wrote:
> On 2012-11-14, at 19:54 , Mark Adam wrote:
>>
>> Merging of two dicts is done with dict.update.
>
> No, dict.update merges one dict (or two) into a third one.

No. I think you need to read the docs.

>> How do you do it on
>> initialization? This doesn't make sense.
>
> dict(d1, **d2)

That's not valid syntax is it?

mark

Antoine Pitrou

Nov 14, 2012, 3:58:30 PM
to pytho...@python.org
On Wed, 14 Nov 2012 14:53:11 -0600
Mark Adam <dreamin...@gmail.com> wrote:
> On Wed, Nov 14, 2012 at 1:37 PM, Xavier Morel <catc...@masklinn.net> wrote:
> > On 2012-11-14, at 19:54 , Mark Adam wrote:
> >>
> >> Merging of two dicts is done with dict.update.
> >
> > No, dict.update merges one dict (or two) into a third one.
>
> No. I think you need to read the docs.
>
> >> How do you do it on
> >> initialization? This doesn't make sense.
> >
> > dict(d1, **d2)
>
> That's not valid syntax is it?

Why don't you try it for yourself:

>>> d1 = {1:2}
>>> d2 = {3:4}
>>> dict(d1, **d2)
{1: 2, 3: 4}

MRAB

Nov 14, 2012, 4:20:33 PM
to python-dev
On 2012-11-14 20:53, Mark Adam wrote:
> On Wed, Nov 14, 2012 at 1:37 PM, Xavier Morel <catc...@masklinn.net> wrote:
>> On 2012-11-14, at 19:54 , Mark Adam wrote:
>>>
>>> Merging of two dicts is done with dict.update.
>>
>> No, dict.update merges one dict (or two) into a third one.
>
> No. I think you need to read the docs.
>
>>> How do you do it on
>>> initialization? This doesn't make sense.
>>
>> dict(d1, **d2)
>
> That's not valid syntax is it?
>
No.

You can have dict(d1) and dict(**d2), but not dict(d1, **d2).

MRAB

Nov 14, 2012, 4:24:14 PM
to python-dev
On 2012-11-14 21:20, MRAB wrote:
> On 2012-11-14 20:53, Mark Adam wrote:
>> On Wed, Nov 14, 2012 at 1:37 PM, Xavier Morel <catc...@masklinn.net> wrote:
>>> On 2012-11-14, at 19:54 , Mark Adam wrote:
>>>>
>>>> Merging of two dicts is done with dict.update.
>>>
>>> No, dict.update merges one dict (or two) into a third one.
>>
>> No. I think you need to read the docs.
>>
>>>> How do you do it on
>>>> initialization? This doesn't make sense.
>>>
>>> dict(d1, **d2)
>>
>> That's not valid syntax is it?
>>
> No.
>
> You can have dict(d1) and dict(**d2), but not dict(d1, **d2).
>
Oops, wrong! :-( (I see now where I went wrong...)

Brian Curtin

Nov 14, 2012, 4:24:20 PM
to python-dev
On Wed, Nov 14, 2012 at 3:20 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
> On 2012-11-14 20:53, Mark Adam wrote:
>>
>> On Wed, Nov 14, 2012 at 1:37 PM, Xavier Morel <catc...@masklinn.net>
>> wrote:
>>>
>>> On 2012-11-14, at 19:54 , Mark Adam wrote:
>>>>
>>>>
>>>> Merging of two dicts is done with dict.update.
>>>
>>>
>>> No, dict.update merges one dict (or two) into a third one.
>>
>>
>> No. I think you need to read the docs.
>>
>>>> How do you do it on
>>>> initialization? This doesn't make sense.
>>>
>>>
>>> dict(d1, **d2)
>>
>>
>> That's not valid syntax is it?
>>
> No.
>
> You can have dict(d1) and dict(**d2), but not dict(d1, **d2).

Yes you can.

Ethan Furman

Nov 14, 2012, 4:27:15 PM
to python-dev
MRAB wrote:
> On 2012-11-14 20:53, Mark Adam wrote:
>> On Wed, Nov 14, 2012 at 1:37 PM, Xavier Morel <catc...@masklinn.net>
>> wrote:
>>> On 2012-11-14, at 19:54 , Mark Adam wrote:
>>>>
>>>> Merging of two dicts is done with dict.update.
>>>
>>> No, dict.update merges one dict (or two) into a third one.
>>
>> No. I think you need to read the docs.
>>
>>>> How do you do it on
>>>> initialization? This doesn't make sense.
>>>
>>> dict(d1, **d2)
>>
>> That's not valid syntax is it?
>>
> No.
>
> You can have dict(d1) and dict(**d2), but not dict(d1, **d2).

To (mis-)quote Antoine:
>--> d1 = {1:2}
>--> d2 = {'3':4}
>--> dict(d1, **d2)
> {1: 2, '3': 4}

Apparently it is valid syntax. Just make sure your keys for the ** operator are valid strings. :)

~Ethan~

Brandon W Maister

Nov 14, 2012, 4:40:37 PM
to Ethan Furman, python-dev

To (mis-)quote Antoine:
>--> d1 = {1:2}
>--> d2 = {'3':4}
>--> dict(d1, **d2)
> {1: 2, '3': 4}

Apparently it is valid syntax. Just make sure your keys for the ** operator are valid strings. :)


or not:

>>> dict(**{'not a valid identifier': True, 1: True})
{1: True, 'not a valid identifier': True}
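
The sessions quoted in this thread date from the Python 2 era. As far as I know, CPython 3 is stricter about non-string keys passed through **, so the sketch below is worth re-running on whatever interpreter you care about rather than taking as given.

```python
d1 = {1: 2}

# String keys pass through ** even when they are not valid identifiers.
merged = dict(d1, **{'not a valid identifier': True})
print(merged)

# Non-string keys are rejected by the call machinery on Python 3.
try:
    dict(d1, **{3: 4})
except TypeError as exc:
    non_string_error = exc
else:
    non_string_error = None

print(type(non_string_error).__name__)
```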

brandon

Brian Curtin

Nov 14, 2012, 4:43:46 PM
to Brandon W Maister, python-dev
Just because the string says it's not valid doesn't mean it's not valid.

Anyway, can this thread go to python-ideas or python-list now?

Xavier Morel

Nov 14, 2012, 5:01:32 PM
to python-dev Dev
On 2012-11-14, at 21:53 , Mark Adam wrote:

> On Wed, Nov 14, 2012 at 1:37 PM, Xavier Morel <catc...@masklinn.net> wrote:
>> On 2012-11-14, at 19:54 , Mark Adam wrote:
>>>
>>> Merging of two dicts is done with dict.update.
>>
>> No, dict.update merges one dict (or two) into a third one.
>
> No. I think you need to read the docs.

I know what the docs say. dict.update requires an existing dict and (as
mutator methods usually do in Python) doesn't return anything. Thus it
merges a dict (or two) into a third one (the subject of the call).

>>> How do you do it on
>>> initialization? This doesn't make sense.
>>
>> dict(d1, **d2)
>
> That's not valid syntax is it?

Of course it is, why would it not be?
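
The distinction between the two spellings can be sketched like this, with d1 and d2 as throwaway example dicts (string keys are used so that the ** form is uncontroversial):

```python
d1 = {'a': 1}
d2 = {'b': 2}

# The constructor builds a new dict and leaves both inputs alone.
merged = dict(d1, **d2)
d1_after_constructor = dict(d1)  # snapshot taken before d1 is mutated below

# update() mutates the dict it is called on and returns None.
result = d1.update(d2)

print(merged, d1_after_constructor, result, d1)
```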

Greg Ewing

Nov 14, 2012, 4:40:48 PM
to pytho...@python.org
Chris Angelico wrote:
> Perhaps an alternative question: What can be done to make the latter
> less unpalatable?

* We could introduce a new syntax such as {a = 1, b = 2}.

* If the compiler were allowed to recognise builtins, it could
turn dict(a = 1, b = 2) into {'a':1, 'b':2} automatically.

--
Greg

Chris Withers

Nov 14, 2012, 5:37:46 PM
to mar...@v.loewis.de, Doug Hellmann, pytho...@python.org
On 14/11/2012 10:11, mar...@v.loewis.de wrote:
>
> Quoting Chris Withers <ch...@simplistix.co.uk>:
>
>> a_dict = dict(
>> x = 1,
>> y = 2,
>> z = 3,
>> ...
>> )
>
>> What can we do to speed up the former case?
>
> It should be possible to special-case it. Rather than creating
> a new dictionary from scratch, one could try to have the new dictionary
> the same size as the original one, and copy all entries.

Indeed, Doug, what are your views on this? Also, did you have a
real-world example where this speed difference was causing you a problem?

> I don't know how much this would gain, though. You still have to
> create two dictionary objects. For a better speedup, try
>
> def xdict(**kwds):
>     return kwds

Hah, good call, this trumps both of the other options:

$ python2.7 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
raw times: 1.45 1.45 1.44 1.45 1.45
1000000 loops, best of 5: 1.44 usec per loop
$ python2.6 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 2.37 2.36 2.36 2.37 2.37
1000000 loops, best of 5: 2.36 usec per loop
$ python2.6 -m timeit -n 1000000 -r 5 -v 'def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 0.548 0.533 0.55 0.577 0.539
1000000 loops, best of 5: 0.533 usec per loop

For the naive observer (ie: me!), why is that?

Chris


Chris Withers

Nov 14, 2012, 5:41:28 PM
to Greg Ewing, pytho...@python.org
On 14/11/2012 21:40, Greg Ewing wrote:
> * If the compiler were allowed to recognise builtins, it could
> turn dict(a = 1, b = 2) into {'a':1, 'b':2} automatically.

That would be my naive suggestion, I am prepared to be shot down in
flames ;-)

Would be even more awesome if it could end up with the magical
performance of "def md(**kw): return kw"...

Chris


Chris Withers

Nov 14, 2012, 5:43:52 PM
to mar...@v.loewis.de, Doug Hellmann, pytho...@python.org
On 14/11/2012 22:37, Chris Withers wrote:
> On 14/11/2012 10:11, mar...@v.loewis.de wrote:
>> def xdict(**kwds):
>>     return kwds
>
> Hah, good call, this trumps both of the other options:
>
> $ python2.7 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
> raw times: 1.45 1.45 1.44 1.45 1.45
> 1000000 loops, best of 5: 1.44 usec per loop
> $ python2.6 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
> raw times: 2.37 2.36 2.36 2.37 2.37
> 1000000 loops, best of 5: 2.36 usec per loop
> $ python2.6 -m timeit -n 1000000 -r 5 -v 'def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
> raw times: 0.548 0.533 0.55 0.577 0.539
> 1000000 loops, best of 5: 0.533 usec per loop

Before anyone shoots me, yes, wrong python for two of them:

$ python2.7 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
raw times: 1.49 1.49 1.5 1.49 1.48
1000000 loops, best of 5: 1.48 usec per loop

$ python2.7 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 2.35 2.36 2.41 2.42 2.35
1000000 loops, best of 5: 2.35 usec per loop

$ python2.7 -m timeit -n 1000000 -r 5 -v 'def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 0.507 0.515 0.516 0.529 0.524
1000000 loops, best of 5: 0.507 usec per loop

MRAB

Nov 14, 2012, 5:51:37 PM
to python-dev
On 2012-11-14 21:40, Greg Ewing wrote:
> Chris Angelico wrote:
>> Perhaps an alternative question: What can be done to make the latter
>> less unpalatable?
>
> * We could introduce a new syntax such as {a = 1, b = 2}.
>
> * If the compiler were allowed to recognise builtins, it could
> turn dict(a = 1, b = 2) into {'a':1, 'b':2} automatically.
>
That would be a transformation of the AST, although it assumes that
'dict' hasn't been rebound.
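
Why the rebinding caveat matters can be sketched with a throwaway namespace. The names source, namespace and make are hypothetical; the point is only that the name dict is looked up each time the call runs.

```python
source = "def make():\n    return dict(a=1, b=2)\n"

# Compile make() into its own global namespace.
namespace = {}
exec(source, namespace)

before = namespace["make"]()

# Rebind 'dict' in that namespace after make() was compiled. The call
# inside make() now resolves to the replacement, so rewriting it into
# {'a': 1, 'b': 2} at compile time would have changed behaviour.
namespace["dict"] = lambda **kw: sorted(kw)

after = namespace["make"]()
print(before, after)
```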

Should there be the option of a warning if a builtin is rebound? Or the
option of the transformation plus a warning if the builtin is rebound?

Donald Stufft

Nov 14, 2012, 6:00:58 PM
to python-dev
$ pypy -m timeit 'dict()'
1000000000 loops, best of 3: 0.000811 usec per loop

$ pypy -m timeit '{}'    
1000000000 loops, best of 3: 0.000809 usec per loop

$ pypy -m timeit 'def md(**kw): return kw; md()'
100000000 loops, best of 3: 0.0182 usec per loop

$ pypy -m timeit -s 'def md(**kw): return kw' 'md()'
1000000000 loops, best of 3: 0.00136 usec per loop

If the difference between dict() and {} is hurting your code, why are
you still using CPython?

Xavier Morel

Nov 14, 2012, 6:03:45 PM
to python-dev Dev
The last one is kind-of weird, it seems to be greatly advantaged by the local lookup:

> python2.7 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
raw times: 0.676 0.683 0.682 0.698 0.691
1000000 loops, best of 5: 0.676 usec per loop
> python2.7 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 1.64 1.66 1.4 1.44 1.44
1000000 loops, best of 5: 1.4 usec per loop
> python2.7 -m timeit -n 1000000 -r 5 -v 'def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 0.188 0.203 0.201 0.195 0.202
1000000 loops, best of 5: 0.188 usec per loop
> python2.7 -m timeit -n 1000000 -r 5 -v -s 'def md(**kw): return kw' 'md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
raw times: 0.871 0.864 0.863 0.889 0.871
1000000 loops, best of 5: 0.863 usec per loop

Steven D'Aprano

Nov 14, 2012, 6:36:24 PM
to pytho...@python.org
On 15/11/12 05:54, Mark Adam wrote:

> Merging of two dicts is done with dict.update. How do you do it on
> initialization? This doesn't make sense.

Frequently.

my_prefs = dict(default_prefs, setting=True, another_setting=False)


Notice that I'm not merging one dict into another, but merging two dicts
into a third.

(Well, technically, one of the two comes from keyword arguments rather
than an actual dict, but the principle is the same.)

The Python 1.5 alternative was:

my_prefs = {}
my_prefs.update(default_prefs)
my_prefs['setting'] = True
my_prefs['another_setting'] = False


Blah, I'm so glad I don't have to write Python 1.5 code any more. Even
using copy only saves a line:

my_prefs = default_prefs.copy()
my_prefs['setting'] = True
my_prefs['another_setting'] = False
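
The three spellings above can be compared side by side. This is a minimal sketch, assuming made-up contents for default_prefs (the message does not show them); it only checks that each spelling produces the same new dict and leaves the defaults untouched.

```python
# Hypothetical defaults; the original message does not show their contents.
default_prefs = {'setting': False, 'colour': 'blue'}

# 1. Constructor form: copy the defaults and apply keyword overrides in one call.
prefs_ctor = dict(default_prefs, setting=True, another_setting=False)

# 2. Empty dict, then update() and item assignment (the "Python 1.5" spelling).
prefs_update = {}
prefs_update.update(default_prefs)
prefs_update['setting'] = True
prefs_update['another_setting'] = False

# 3. copy(), then item assignment.
prefs_copy = default_prefs.copy()
prefs_copy['setting'] = True
prefs_copy['another_setting'] = False

# All three build a new dict with the same contents...
assert prefs_ctor == prefs_update == prefs_copy
# ...and none of them modifies the defaults.
assert default_prefs == {'setting': False, 'colour': 'blue'}
```

The constructor form only works when the overriding keys are valid keyword names, which is the case for string settings like these.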




--
Steven

Lukas Lueg

Nov 14, 2012, 6:38:38 PM
to Python Dev
Notice that {'x':1} and dict(x=1) are different beasts: the first one compiles directly to BUILD_MAP. The second one looks up the name 'dict' at run time (globals first, then builtins) and calls the constructor. The two are not the same.
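
The difference can be inspected with the standard dis module. This is a sketch rather than a definitive account: exact opcodes vary between CPython versions, so it checks only which names each expression refers to, and the replacement callable at the end is a hypothetical stand-in.

```python
import dis

display_code = compile("{'x': 1}", "<display>", "eval")
call_code = compile("dict(x=1)", "<call>", "eval")

# The display never refers to the name 'dict'; the call has to look it up.
print(display_code.co_names)
print(call_code.co_names)

# Opcode names differ between CPython versions, so inspect rather than assume.
dis.dis(display_code)
dis.dis(call_code)

# Because the name is resolved at run time, a rebound 'dict' changes the result.
# The lambda is a made-up replacement that just returns the sorted keyword names.
shadowed = eval(call_code, {'dict': lambda **kw: sorted(kw)})
print(shadowed)
```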



Chris Angelico

Nov 14, 2012, 6:40:20 PM
to pytho...@python.org
On Thu, Nov 15, 2012 at 10:36 AM, Steven D'Aprano <st...@pearwood.info> wrote:
> On 15/11/12 05:54, Mark Adam wrote:
>
>> Merging of two dicts is done with dict.update. How do you do it on
>> initialization? This doesn't make sense.
>
>
> Frequently.
>
> my_prefs = dict(default_prefs, setting=True, another_setting=False)
>
>
> Notice that I'm not merging one dict into another, but merging two dicts
> into a third.

Side point: Wouldn't it be quite logical to support dict addition?

>>> {"a":1}+{"b":2}
Traceback (most recent call last):
  File "<pyshell#59>", line 1, in <module>
    {"a":1}+{"b":2}
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'

It would make sense for this to result in {"a":1,"b":2}.

ChrisA

Terry Reedy

Nov 14, 2012, 6:47:50 PM
to pytho...@python.org
On 11/14/2012 4:12 AM, Chris Withers wrote:
To somewhat paraphrase: '''
I prefer 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)' to
"{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}".
I am sad that the former takes roughly twice as long to run (in 2.7).
Is the difference about the same in 3.x?
What can we do to speed up the former case?
'''

My responses, trying not to duplicate others.
1. Visual preference depends on the viewer. I prefer the dict display,
perhaps because I am more accustomed to it.

2. The two types of expressions have overlapping but distinct use cases.
The differences include that dict can be wrapped or replaced, while
displays cannot.

3. a) 3.x has dict comprehensions. How do they stack up? b) If one were
really initializing multiple dicts with the same starting items, and one
were really concerned about speed, should one calculate the common base
dict just once and then copy? Win7 64 with 3.3.0:

>>> repeat("dict(a=1, b=2, c=3, d=4, e=5)")
[0.6200045004915467, 0.6212762582470646, 0.6114683222573376]
>>> repeat("{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}")
[0.27170026972208233, 0.2594874604131968, 0.25977058559879584]
>>> repeat("d.copy()", "d={'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}")
[0.25768296004457625, 0.243041299387869, 0.2421860830290825]

>>> repeat("{str(i):i for i in range(10)}")
[4.914327732926495, 4.874041570524014, 4.871596119002334]
>>> repeat("{'0':0, '1':1, '2':2, '3':3, '4':4, '5':5, '6':6, '7':7, '8':8, '9':9}")
[0.5207065648769458, 0.5000415004344632, 0.49980294978922757]
>>> repeat("d.copy()", "d={'0':0, '1':1, '2':2, '3':3, '4':4, '5':5, '6':6, '7':7, '8':8, '9':9}")
[0.571671864980317, 0.5516194699132484, 0.5514937389677925]


Assuming no overlooked errors in the above...
a) Dict comprehensions are much slower than calls, which makes the calls
look good by comparison. b) Copying is not worthwhile.
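
The measurements above can be rerun as a script. This is a minimal sketch, assuming that the repeat used in the session is timeit.repeat (the message does not show the import); the loop count is reduced so it finishes quickly, so absolute figures will not match the ones quoted.

```python
from timeit import repeat

setup = "d = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}"
statements = {
    'call':    "dict(a=1, b=2, c=3, d=4, e=5)",
    'display': "{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}",
    'copy':    "d.copy()",
}

# number is kept small so the sketch runs fast; raise it for stabler figures.
results = {name: repeat(stmt, setup, repeat=3, number=100000)
           for name, stmt in statements.items()}

for name, times in results.items():
    # Report the minimum of the three runs, as "python -m timeit" does.
    print("%-8s best of 3: %.4f s per 100000 loops" % (name, min(times)))
```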

4. There are about 3000 issues on the tracker. Nearly all are worth more
attention than this ;-).

--
Terry Jan Reedy

Mark Adam

Nov 14, 2012, 6:52:17 PM
to Chris Angelico, pytho...@python.org
On Wed, Nov 14, 2012 at 5:40 PM, Chris Angelico <ros...@gmail.com> wrote:
> On Thu, Nov 15, 2012 at 10:36 AM, Steven D'Aprano <st...@pearwood.info> wrote:
>> On 15/11/12 05:54, Mark Adam wrote:
>> Notice that I'm not merging one dict into another, but merging two dicts
>> into a third.
>
> Side point: Wouldn't it be quite logical to support dict addition?
>

Yes, but then you'd be in my old argument that dict should inherit from set.

Stephen J. Turnbull

Nov 14, 2012, 9:28:45 PM
to Chris Angelico, pytho...@python.org
Chris Angelico writes:

> >>> {"a":1}+{"b":2}

> It would make sense for this to result in {"a":1,"b":2}.

The test is not "does this sometimes make sense?" It's "does this
ever result in nonsense, and if so, do we care?"

Here, addition is usually commutative. Should {'a':1}+{'a':2} be the
same as, or different from, {'a':2}+{'a':1}, or should it be an error?

Chris Angelico

Nov 14, 2012, 9:35:46 PM
to pytho...@python.org
On Thu, Nov 15, 2012 at 1:28 PM, Stephen J. Turnbull <ste...@xemacs.org> wrote:
> Chris Angelico writes:
>
> > >>> {"a":1}+{"b":2}
>
> > It would make sense for this to result in {"a":1,"b":2}.
>
> The test is not "does this sometimes make sense?" It's "does this
> ever result in nonsense, and if so, do we care?"
>
> Here, addition is usually commutative. Should {'a':1}+{'a':2} be the
> same as, or different from, {'a':2}+{'a':1}, or should it be an error?

>>> "a"+"b"
'ab'
>>> "b"+"a"
'ba'

I would say that the two dictionary examples are equally allowed to
give different results - that they should be equivalent to (shallow)
copy followed by update(), but possibly more efficiently.
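
The "copy followed by update()" semantics can be sketched with a subclass. AddableDict is a hypothetical name used only for illustration, not an existing or proposed class; the sketch also shows that the result depends on operand order when keys collide.

```python
class AddableDict(dict):
    """Hypothetical dict subclass where '+' is copy followed by update()."""

    def __add__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        result = AddableDict(self)   # shallow copy of the left operand
        result.update(other)         # right operand wins on key collisions
        return result


left = AddableDict({'a': 1})
right = AddableDict({'a': 2})

print(AddableDict({'a': 1}) + {'b': 2})   # {'a': 1, 'b': 2}
print(left + right)                       # {'a': 2}
print(right + left)                       # {'a': 1}
```

Neither operand is modified, and the last two lines differ only in which operand supplies the surviving value.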

ChrisA

Mark Adam

Nov 14, 2012, 10:24:07 PM
to Stephen J. Turnbull, pytho...@python.org
On Wed, Nov 14, 2012 at 8:28 PM, Stephen J. Turnbull <ste...@xemacs.org> wrote:
> Chris Angelico writes:
>
> > >>> {"a":1}+{"b":2}
>
> > It would make sense for this to result in {"a":1,"b":2}.
>
> The test is not "does this sometimes make sense?" It's "does this
> ever result in nonsense, and if so, do we care?"
>
> Here, addition is usually commutative. Should {'a':1}+{'a':2} be the
> same as, or different from, {'a':2}+{'a':1}, or should it be an error?

Easy: dict should have a (user substitutable) collision function that
is called in these cases. This would allow significant functionality
with practically no cost. In addition, it could be implemented in
such a way as to offer significant speedups (when using dict.update
for example) over any possible hand-written substitutes (since it's
only run on key collisions and otherwise uses an underlying loop coded
in C).
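
What such a collision function might look like can be sketched in pure Python. The names merge_with, keep_sum and refuse are hypothetical, and a Python-level loop like this one gives none of the speed benefit described above; it only illustrates the calling convention.

```python
def merge_with(collide, first, second):
    """Return a new dict; collide(key, old, new) resolves duplicate keys."""
    result = dict(first)
    for key, value in second.items():
        if key in result:
            # Only keys present in both inputs reach the collision function.
            result[key] = collide(key, result[key], value)
        else:
            result[key] = value
    return result


def keep_sum(key, old, new):
    # One possible policy: combine the colliding values.
    return old + new


def refuse(key, old, new):
    # Another policy: treat any collision as an error.
    raise KeyError("duplicate key: %r" % (key,))


merged = merge_with(keep_sum, {'a': 1, 'b': 1}, {'a': 2})
assert merged == {'a': 3, 'b': 1}

try:
    merge_with(refuse, {'a': 1}, {'a': 2})
except KeyError as exc:
    print("collision rejected:", exc)
```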

mark

Stephen J. Turnbull

Nov 14, 2012, 10:48:22 PM
to Chris Angelico, pytho...@python.org
Chris Angelico writes:
> On Thu, Nov 15, 2012 at 1:28 PM, Stephen J. Turnbull <ste...@xemacs.org> wrote:
> > Chris Angelico writes:
> >
> > > >>> {"a":1}+{"b":2}
> >
> > > It would make sense for this to result in {"a":1,"b":2}.
> >
> > The test is not "does this sometimes make sense?" It's "does this
> > ever result in nonsense, and if so, do we care?"
> >
> > Here, addition is usually commutative. Should {'a':1}+{'a':2} be the
> > same as, or different from, {'a':2}+{'a':1}, or should it be an error?
>
> >>> "a"+"b"
> 'ab'
> >>> "b"+"a"
> 'ba'
>
> I would say that the two dictionary examples are equally allowed to
> give different results - that they should be equivalent to (shallow)
> copy followed by update(), but possibly more efficiently.

I wouldn't. A string is a sequence of uninterpreted letters, and
necessarily ordered. In fact, that's about all you can say about
strings in general. I would prefer that concatenation be expressed by
juxtaposition, but that's troublesome for machine parsing (especially
error recovery). My intuition is elastic enough to admit exceptional
cases where the essential ordered nature of the objects being "added"
is more important than the customary interpretation of the operator
symbol, so interpreting string addition as concatenation doesn't
bother me. Furthermore, in string addition both operands affect the
result in proportion to their content, though differently.

Dictionaries aren't ordered, and their "elements" have structure
(key-value pairs). It would definitely bother me if dictionary
addition weren't commutative, and it's worse that an operand affects
the outcome in an all-or-nothing way.

Also, "update" is more appropriately expressed by an extended
assignment operator. Defining "+" in terms of "+=" as you propose
just doesn't seem right to me.

Stephen J. Turnbull

Nov 14, 2012, 11:11:24 PM
to Mark Adam, pytho...@python.org
Mark Adam writes:

> Easy: dict should have a (user substitutable) collision function that
> is called in these cases.

"I smell overengineering."

> This would allow significant functionality with practically no
> cost.

We already have that functionality if we want it; just define an
appropriate mapping class.

I don't need or want it, so I can ignore it, but I suspect to get
anywhere with this proposal you're going to need to show that this
"significant functionality" needs to be in syntax.

mar...@v.loewis.de

Nov 14, 2012, 11:19:31 PM
to pytho...@python.org

Quoting Chris Withers <ch...@simplistix.co.uk>:

> On 14/11/2012 21:40, Greg Ewing wrote:
>> * If the compiler were allowed to recognise builtins, it could
>> turn dict(a = 1, b = 2) into {'a':1, 'b':2} automatically.
>
> That would be my naive suggestion, I am prepared to be shot down in
> flames ;-)

In general, special-casing builtins in the compiler is not possible
in Python. You cannot know statically that 'dict' really refers to
the builtin. Something may shadow the name at run-time, making dict
refer to some other callable.
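
A small sketch of the shadowing problem, assuming a made-up function make_point and a made-up replacement callable: the function is compiled once, yet what it returns changes when the name dict is rebound afterwards.

```python
def make_point():
    # The name 'dict' is resolved every time this line runs.
    return dict(x=1, y=2)


before = make_point()

# Shadow the builtin with a module-level global after make_point was compiled.
# The replacement is a hypothetical stand-in that just returns the sorted keys.
globals()['dict'] = lambda **kw: sorted(kw)
after = make_point()

# Drop the shadowing global; lookups fall through to the builtin again.
del globals()['dict']
restored = make_point()

print(before, after, restored)
```

A compiler that had rewritten the call into a dict display would have returned the same value all three times.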

Regards,
Martin

mar...@v.loewis.de

Nov 14, 2012, 11:27:23 PM
to pytho...@python.org

Quoting Chris Withers <ch...@simplistix.co.uk>:

> $ python2.7 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
> raw times: 1.49 1.49 1.5 1.49 1.48
> 1000000 loops, best of 5: 1.48 usec per loop
>
> $ python2.7 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
> raw times: 2.35 2.36 2.41 2.42 2.35
> 1000000 loops, best of 5: 2.35 usec per loop
>
> $ python2.7 -m timeit -n 1000000 -r 5 -v 'def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
> raw times: 0.507 0.515 0.516 0.529 0.524
> 1000000 loops, best of 5: 0.507 usec per loop
>
> For the naive observer (ie: me!), why is that?

It's faster than calling dict() because the dict code will
create a second dictionary, and discard the keywords dictionary.

It's (probably) faster than the dictionary display, because
the {} byte code adds the items to the dictionary one at a time, whereas
the keywords dictionary is built in a single step (taking
all keys and values from the evaluation stack).

Regards,
Martin

mar...@v.loewis.de

Nov 15, 2012, 12:04:57 AM
to pytho...@python.org

Quoting Chris Angelico <ros...@gmail.com>:

> On Thu, Nov 15, 2012 at 1:28 PM, Stephen J. Turnbull
> <ste...@xemacs.org> wrote:
>> Chris Angelico writes:
>>
>> > >>> {"a":1}+{"b":2}
>>
>> > It would make sense for this to result in {"a":1,"b":2}.
>>
>> The test is not "does this sometimes make sense?" It's "does this
>> ever result in nonsense, and if so, do we care?"
>>
>> Here, addition is usually commutative. Should {'a':1}+{'a':2} be the
>> same as, or different from, {'a':2}+{'a':1}, or should it be an error?
>
> >>> "a"+"b"
> 'ab'
> >>> "b"+"a"
> 'ba'
>
> I would say that the two dictionary examples are equally allowed to
> give different results - that they should be equivalent to (shallow)
> copy followed by update(), but possibly more efficiently.

Can this be moved to python-ideas, please?

Regards,
Martin

Stefan Behnel

Nov 15, 2012, 1:32:41 AM
to pytho...@python.org
Donald Stufft, 15.11.2012 00:00:
> $ pypy -m timeit 'dict()'
> 1000000000 loops, best of 3: 0.000811 usec per loop
>
> $ pypy -m timeit '{}'
> 1000000000 loops, best of 3: 0.000809 usec per loop
>
> $ pypy -m timeit 'def md(**kw): return kw; md()'
> 100000000 loops, best of 3: 0.0182 usec per loop
>
> $ pypy -m timeit -s 'def md(**kw): return kw' 'md()'
> 1000000000 loops, best of 3: 0.00136 usec per loop

Yep, I really like the fact that optimisers can fold stupid benchmarks into
no-ops. I wonder why it fails so badly in the latter two cases, though. You
should bring that to the attention of the PyPy developers, they might want
to fix it.

Stefan



Chris Withers

Nov 15, 2012, 2:14:42 AM
to Stefan Behnel, pytho...@python.org
On 15/11/2012 06:32, Stefan Behnel wrote:
> Donald Stufft, 15.11.2012 00:00:
>> $ pypy -m timeit 'dict()'
>> 1000000000 loops, best of 3: 0.000811 usec per loop
>>
>> $ pypy -m timeit '{}'
>> 1000000000 loops, best of 3: 0.000809 usec per loop
>>
>> $ pypy -m timeit 'def md(**kw): return kw; md()'
>> 100000000 loops, best of 3: 0.0182 usec per loop
>>
>> $ pypy -m timeit -s 'def md(**kw): return kw' 'md()'
>> 1000000000 loops, best of 3: 0.00136 usec per loop
>
> Yep, I really like the fact that optimisers can fold stupid benchmarks into
> no-ops. I wonder why it fails so badly in the latter two cases, though. You
> should bring that to the attention of the PyPy developers, they might want
> to fix it.

Agreed, but Donald, please try with a bunch of keys rather than an empty
dict...

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk

Stefan Behnel

Nov 15, 2012, 2:22:57 AM
to pytho...@python.org
Chris Withers, 15.11.2012 08:14:
> On 15/11/2012 06:32, Stefan Behnel wrote:
>> Donald Stufft, 15.11.2012 00:00:
>>> $ pypy -m timeit 'dict()'
>>> 1000000000 loops, best of 3: 0.000811 usec per loop
>>>
>>> $ pypy -m timeit '{}'
>>> 1000000000 loops, best of 3: 0.000809 usec per loop
>>>
>>> $ pypy -m timeit 'def md(**kw): return kw; md()'
>>> 100000000 loops, best of 3: 0.0182 usec per loop
>>>
>>> $ pypy -m timeit -s 'def md(**kw): return kw' 'md()'
>>> 1000000000 loops, best of 3: 0.00136 usec per loop
>>
>> Yep, I really like the fact that optimisers can fold stupid benchmarks into
>> no-ops. I wonder why it fails so badly in the latter two cases, though. You
>> should bring that to the attention of the PyPy developers, they might want
>> to fix it.
>
> Agreed, but Donald, please try with a bunch of keys rather than an empty
> dict...

Right. If that makes a difference, it's another bug.

Stefan

Serhiy Storchaka

Nov 15, 2012, 2:24:21 AM
to pytho...@python.org
On 15.11.12 01:47, Terry Reedy wrote:
> 4. There are about 3000 issues on the tracker. Nearly all are worth more
> attention than this ;-).

This is the best conclusion of this thread.

Łukasz Rekucki

Nov 15, 2012, 4:22:50 AM
to pytho...@python.org
Hi,

I posted this (by accident) off the list:

> On 2012-11-14, at 23:43 , Chris Withers wrote:
>
>> On 14/11/2012 22:37, Chris Withers wrote:
>>> On 14/11/2012 10:11, mar...@v.loewis.de wrote:
>>>> def xdict(**kwds):
>>>>     return kwds
>>>
>>> Hah, good call, this trumps both of the other options:
>>>

>>> $ python2.6 -m timeit -n 1000000 -r 5 -v 'def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
>>> raw times: 0.548 0.533 0.55 0.577 0.539
>>> 1000000 loops, best of 5: 0.533 usec per loop

No, this just doesn't execute the right code:

>>> def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)
...
>>> import dis
>>> dis.dis(md)
  1           0 LOAD_FAST                0 (kw)
              3 RETURN_VALUE
              4 LOAD_GLOBAL              0 (md)
              7 LOAD_CONST               1 ('a')
             10 LOAD_CONST               2 (1)
             13 LOAD_CONST               3 ('b')
             16 LOAD_CONST               4 (2)
             19 LOAD_CONST               5 ('c')
             22 LOAD_CONST               6 (3)
             25 LOAD_CONST               7 ('d')
             28 LOAD_CONST               8 (4)
             31 LOAD_CONST               9 ('e')
             34 LOAD_CONST              10 (5)
             37 LOAD_CONST              11 ('f')
             40 LOAD_CONST              12 (6)
             43 LOAD_CONST              13 ('g')
             46 LOAD_CONST              14 (7)
             49 CALL_FUNCTION         1792
             52 POP_TOP
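
The same point can be checked without reading bytecode, which changes between CPython versions. This is a sketch using the standard ast and timeit modules: in the one-line statement the call to md parses as part of md's own body, after the return, so the timed loop only redefines the function. Moving the definition into the setup times the actual call.

```python
import ast
import timeit

one_liner = "def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)"

tree = ast.parse(one_liner)
top_level = [type(node).__name__ for node in tree.body]
inside_md = [type(node).__name__ for node in tree.body[0].body]
print(top_level)   # ['FunctionDef']
print(inside_md)   # ['Return', 'Expr']

# Timing the one-liner measures only the cost of executing the 'def'.
define_only = min(timeit.repeat(one_liner, repeat=3, number=10000))

# Defining md in the setup and timing just the call measures the intended work.
setup = "def md(**kw): return kw"
stmt = "md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)"
real_call = min(timeit.repeat(stmt, setup, repeat=3, number=10000))

print("def only: %.4f s, real call: %.4f s" % (define_only, real_call))
```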

Also:

Python 3.2.3 (default, Apr 11 2012, 07:12:16) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> dict({1: "foo"}, **{frozenset([2]): "bar"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings

While:

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> dict({1: "foo"}, **{2: "bar"})
{1: 'foo', 2: 'bar'}
>>> dict({1: "foo"}, **{frozenset([2]): "bar"})
{1: 'foo', frozenset([2]): 'bar'}

If you're worrying about global lookup, you should stop (in this case):

$ py -3.3 -m timeit -n 1000000 -r 5 -v -s "def xdict(): return dict()" "xdict()"
raw times: 0.477 0.47 0.468 0.473 0.469
1000000 loops, best of 5: 0.468 usec per loop

$ py -3.3 -m timeit -n 1000000 -r 5 -v -s "def xdict(dict=dict): return dict()" "xdict()"
raw times: 0.451 0.45 0.451 0.45 0.449
1000000 loops, best of 5: 0.449 usec per loop

$ py -3.3 -m timeit -n 1000000 -r 5 -v -s "def xdict(dict=lambda **kw: kw): return dict()" "xdict()"
raw times: 0.433 0.434 0.435 0.435 0.431
1000000 loops, best of 5: 0.431 usec per loop

$ py -3.3 -m timeit -n 1000000 -r 5 -v -s "def xdict(dict=dict): return {}" "xdict()"
raw times: 0.276 0.279 0.279 0.277 0.275
1000000 loops, best of 5: 0.275 usec per loop

And using non-empty dicts doesn't change much and the first one is
roughly the sum of the latter two (as expected):

C:\Users\lrekucki>py -3.3 -m timeit -n 1000000 -r 5 -v -s "def xdict(dict=dict): return dict(a=1, b=2, c=3, d=4, e=5, f=6)" "xdict()"
raw times: 1.72 1.71 1.71 1.71 1.71
1000000 loops, best of 5: 1.71 usec per loop

C:\Users\lrekucki>py -3.3 -m timeit -n 1000000 -r 5 -v -s "def xdict(dict=lambda **kw: kw): return dict(a=1, b=2, c=3, d=4, e=5, f=6)" "xdict()"
raw times: 1.01 1.01 1.01 1.01 1.01
1000000 loops, best of 5: 1.01 usec per loop

C:\Users\lrekucki>py -3.3 -m timeit -n 1000000 -r 5 -v -s "def xdict(dict=dict): return {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}" "xdict()"
raw times: 0.744 0.736 0.735 0.733 0.733
1000000 loops, best of 5: 0.733 usec per loop


I hope that this helps move python-dev's focus to some more useful discussion.

--
Łukasz Rekucki

Greg Ewing

Nov 15, 2012, 5:48:27 AM
to pytho...@python.org
mar...@v.loewis.de wrote:
> It's faster than calling dict() because the dict code will
> create a second dictionary, and discard the keywords dictionary.

Perhaps in the case where dict() is called with keyword
args only, it could just return the passed-in keyword
dictionary instead of creating another one?

--
Greg

Stefan Behnel

Nov 15, 2012, 9:58:46 AM
to pytho...@python.org
Greg Ewing, 15.11.2012 11:48:
> mar...@v.loewis.de wrote:
>> It's faster than calling dict() because the dict code will
>> create a second dictionary, and discard the keywords dictionary.
>
> Perhaps in the case where dict() is called with keyword
> args only, it could just return the passed-in keyword
> dictionary instead of creating another one?

This should work as long as this still creates a copy of d at some point:

d = {...}
dict(**d)

Stefan

Alex Gaynor

Nov 15, 2012, 11:18:42 AM
to pytho...@python.org
Stefan Behnel <stefan_ml <at> behnel.de> writes:
> Right. If that makes a difference, it's another bug.
>
> Stefan
>
>


It's fixed, with, I will note, fewer lines of code than many messages in this
thread:
https://bitbucket.org/pypy/pypy/changeset/c30cb1dcb7a9adc32548fd14274e4995

Alex

Terry Reedy

Nov 15, 2012, 11:21:53 AM
to pytho...@python.org
On 11/15/2012 9:58 AM, Stefan Behnel wrote:
> Greg Ewing, 15.11.2012 11:48:
>> mar...@v.loewis.de wrote:
>>> It's faster than calling dict() because the dict code will
>>> create a second dictionary, and discard the keywords dictionary.
>>
>> Perhaps in the case where dict() is called with keyword
>> args only, it could just return the passed-in keyword
>> dictionary instead of creating another one?
>
> This should work as long as this still creates a copy of d at some point:
>
> d = {...}
> dict(**d)

I was thinking that CPython could check the ref count of the input
keyword dict to determine whether it is newly created and can be
returned or is pre-existing and must be copied. But it seems not so.

>>> def d(**x): return sys.getrefcount(x)

>>> import sys
>>> d(a = 3)
2
>>> d(**{'a': 3})
2
>>> b = {'a': 3}
>>> d(**b)
2

I was expecting 3 for the last one.

--
Terry Jan Reedy

Richard Oudkerk

Nov 15, 2012, 11:45:42 AM
to pytho...@python.org
On 15/11/2012 4:21pm, Terry Reedy wrote:
> I was thinking that CPython could check the ref count of the input
> keyword dict to determine whether it is newly created and can be
> returned or is pre-existing and must be copied. But it seems not so.
>
> >>> def d(**x): return sys.getrefcount(x)
>
> >>> import sys
> >>> d(a = 3)
> 2
> >>> d(**{'a': 3})
> 2
> >>> b = {'a': 3}
> >>> d(**b)
> 2
>
> I was expecting 3 for the last one.

Isn't it always newly created?

>>> def f(**x): return x
...
>>> b = {'a':3}
>>> b is f(**b)
False

--
Richard

Greg Ewing

Nov 15, 2012, 5:39:44 PM
to pytho...@python.org
Stefan Behnel wrote:
> This should work as long as this still creates a copy of d at some point:
>
> d = {...}
> dict(**d)

It will -- the implementation of the function call opcode always
creates a new keyword dict for passing to the called function.
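
A quick check of that behaviour, as a sketch with a made-up function f: both a Python-level callee and the dict constructor end up with a dict that is distinct from the one being unpacked, so changing the result never touches the original.

```python
def f(**kw):
    return kw


d = {'a': 3}

received = f(**d)     # the callee gets a fresh dict, not d itself
copied = dict(**d)    # the dict constructor likewise returns a distinct object

received['b'] = 4
copied['c'] = 5

print(received is d, copied is d)   # False False
print(d)                            # {'a': 3}
```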

--
Greg

Doug Hellmann

Nov 15, 2012, 6:17:19 PM
to Chris Withers, mar...@v.loewis.de, pytho...@python.org

On Nov 14, 2012, at 5:37 PM, Chris Withers wrote:

> On 14/11/2012 10:11, mar...@v.loewis.de wrote:
>>
>> Quoting Chris Withers <ch...@simplistix.co.uk>:
>>
>>> a_dict = dict(
>>> x = 1,
>>> y = 2,
>>> z = 3,
>>> ...
>>> )
>>
>>> What can we do to speed up the former case?
>>
>> It should be possible to special-case it. Rather than creating
>> a new dictionary from scratch, one could try to have the new dictionary
>> the same size as the original one, and copy all entries.
>
> Indeed, Doug, what are your views on this? Also, did you have a real-world example where this speed difference was causing you a problem?

No, not particularly. I noticed people using dict() and wondered what impact it might have in a general case.

>
>> I don't know how much this would gain, though. You still have to
>> create two dictionary objects. For a better speedup, try
>>
>> def xdict(**kwds):
>>     return kwds
>
> Hah, good call, this trumps both of the other options:
>
> $ python2.7 -m timeit -n 1000000 -r 5 -v "{'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,'g':7}"
> raw times: 1.45 1.45 1.44 1.45 1.45
> 1000000 loops, best of 5: 1.44 usec per loop
> $ python2.6 -m timeit -n 1000000 -r 5 -v 'dict(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
> raw times: 2.37 2.36 2.36 2.37 2.37
> 1000000 loops, best of 5: 2.36 usec per loop
> $ python2.6 -m timeit -n 1000000 -r 5 -v 'def md(**kw): return kw; md(a=1,b=2,c=3,d=4,e=5,f=6,g=7)'
> raw times: 0.548 0.533 0.55 0.577 0.539
> 1000000 loops, best of 5: 0.533 usec per loop
>
> For the naive observer (ie: me!), why is that?
>
> Chris
>
> --
> Simplistix - Content Management, Batch Processing & Python Consulting
> - http://www.simplistix.co.uk
