On 09.11.2012 23:54, Michael Bayer wrote:
> NamedTuple is a tough one - because with our result sets we need to
> create a new NamedTuple for every call to execute(), meaning it has
> to be performant not just on creating new instances of the tuple,
> but on creating new tuple types as well.
>
> If you look at the source to NamedTuple, it is going through some
> very elaborate hoops ...
Yes, that's true. Of course this is done for good reasons, namely to
give you all the goodness of zero overhead per instance in terms of
memory and creation time, a telling type name etc. Raymond Hettinger
explains the design principles very well at
http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2011-fun-with-python-s-newer-tools-4901215
(minutes 11 to 27).
But you're right, it also comes at a cost, namely the creation time for
the type itself. A quick test with timeit showed that this overhead is
only amortized once you create about 175 instances or more. The memory
advantage is of course always there, but it doesn't matter much for
smaller datasets either, and it also depends on how large your data
values are compared to the names of the columns. There is also not much
benefit in creating a custom type name for the tuple, since query
results usually don't have an obvious name anyway.
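A rough sketch of how such a break-even measurement could be set up is
shown here. It is not the test mentioned above: LabeledTuple is a
hypothetical stand-in for a "one shared type, labels stored per instance"
design, the column labels and row values are made up, and the resulting
number will vary with interpreter version and machine.

```python
import timeit
from collections import namedtuple

FIELDS = ("id", "name", "email")    # hypothetical column labels
ROW = (1, "ed", "ed@example.com")   # hypothetical row values


class LabeledTuple(tuple):
    """Hypothetical stand-in: a single type, labels attached per instance."""

    def __new__(cls, values, labels):
        self = tuple.__new__(cls, values)
        # Every instance pays for its own __dict__ holding the labels.
        self.__dict__.update(zip(labels, values))
        self._labels = labels
        return self


def time_per_call(func, number):
    """Average wall-clock seconds for one call of func."""
    return timeit.timeit(func, number=number) / number


# One-off cost: building a new namedtuple type (once per result set).
type_cost = time_per_call(lambda: namedtuple("Row", FIELDS), 200)

# Per-row cost for each approach.
Row = namedtuple("Row", FIELDS)
nt_instance_cost = time_per_call(lambda: Row(*ROW), 20000)
lt_instance_cost = time_per_call(lambda: LabeledTuple(ROW, FIELDS), 20000)

saving = lt_instance_cost - nt_instance_cost
if saving > 0:
    print("break-even after about %d instances" % (type_cost / saving))
else:
    print("no per-instance saving measured on this interpreter")
```

The break-even point is simply the one-off type creation cost divided by
the time saved per instance, so it only exists if the namedtuple
instances really are cheaper to create.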
So maybe it's better to keep the current implementation and just make it
a bit more similar to Python's named tuples, e.g. by renaming _labels to
_fields and adding _asdict. By the way, the leading underscore is only
there to minimize the possibility of name clashes with tuple fields; it
does not indicate that these are private attributes. As another aside,
the _asdict method should not return a normal dict, but an OrderedDict,
which is also available from collections nowadays.
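For reference, this is roughly what the standard library interface looks
like. In collections.namedtuple the attributes are spelled _fields and
_asdict (no second underscore). The Row type and its values are made up
for illustration.

```python
from collections import namedtuple

# Hypothetical two-column row type.
Row = namedtuple("Row", ["id", "name"])
row = Row(id=1, name="ed")

# Public API despite the underscore, which only avoids field name clashes.
fields = row._fields     # tuple of field names, in declaration order
as_dict = row._asdict()  # mapping of field name to value, in field order

print(fields)
print(list(as_dict.items()))
```

Note that _asdict returns an OrderedDict on Python 2.7 and 3.1 to 3.7; from
Python 3.8 on it returns a plain dict, which keeps insertion order anyway.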
-- Christoph