"Tuples are for heterogeneous data, lists are for homogeneous data."

Arthur

Mar 11, 2003, 9:59:34 PM
The subject line is a Guido quote from today's dev list.

Is the point Guido is making performance related, or is it something else? I
have used lists extensively for heterogeneous data, and wonder what it is I
am losing by so doing.

Art


Erik Max Francis

Mar 11, 2003, 10:16:33 PM
Arthur wrote:

Nothing, it's just a style guideline. I'm aware of no optimizations
that would provide a penalty for not using lists/tuples in this way.

I tend to use tuples for either data that have a fixed number of
elements (like coordinates, if I don't represent them with separate
attributes), or for data that have a fixed number of (or easily found
pattern of) elements but contain heterogeneous data (like the first is
an int, the second is a string, the third is an object of some class).
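
A minimal sketch of the two uses described above, for illustration only. The
names point, entry, item_id, name and tags are hypothetical and do not come
from the post.

```python
# A coordinate pair: the number of elements is fixed by what the data means.
point = (3.0, 4.0)
x, y = point
distance = (x ** 2 + y ** 2) ** 0.5

# A fixed-shape record with mixed element types: an int, a string, and
# another object (a list stands in here for "an object of some class").
entry = (42, "widget", ["spare", "parts"])
item_id, name, tags = entry

print(distance, item_id, name, tags)
```

Unpacking works in both cases because the position of each element carries
its meaning.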

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE
/ \ Who, my friend, can scale Heaven?
\__/ _The Epic of Gilgamesh_
EmPy / http://www.alcyone.com/pyos/empy/
A templating system for Python.

Jack Diederich

Mar 11, 2003, 10:14:20 PM
On Tue, Mar 11, 2003 at 09:59:34PM -0500, Arthur wrote:
> The subject line is a Guido quote from today's dev list.
>
> Is the point Guido is making performance related, or is it something else? I
> have used lists extensively for heterogeneous data, and wonder what it is I
> am losing by so doing.

I read the same thing, and was a little confused too.
He may mean tuples are meant to be typed, and lists generated.
So you end up with a canonical tuple of Foos because you typed out a tuple
of Foos. Lists tend to accumulate more random members.

I don't pretend to channel Guido, that is just my guess.

There was a post (maybe in the byte-code python-dev thread?) that lists
are actually faster for access than tuples. This is because lists are
more common so the common case is optimized. I would assume that tuples are
cheaper to instantiate.

-jackdied

Jack Diederich

Mar 11, 2003, 10:18:29 PM
OK, I answered the question as if it was ass-backwards.

So I'll change my answer to 'No Idea'


Tim Peters

Mar 11, 2003, 10:11:01 PM
[Arthur]

> The subject line is a Guido quote from today's dev list.
>
> Is the point Guido is making performance related, or is it something else?

Something else. It's a rule of thumb Guido takes too seriously on occasion,
except when it's convenient to ignore it in his own code <wink>.

> I have used lists extensively for heterogeneous data, and wonder what it
> is I am losing by so doing.

Beyond the possibility of using them as dict keys, nothing objective I know
of. I also use tuples extensively for homogeneous data, when there are so
many of 'em that memory footprint is a concern (a tuple of len N consumes
less memory than a list of len N containing the same objects, because a list
object contains extra stuff to cater to the *possibility* that the list may
grow or shrink later; a tuple object doesn't have to worry about that).
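
Both points can be checked with a short sketch. This assumes CPython and a
version that provides sys.getsizeof; the exact byte counts vary by version
and platform, so none are shown. The names items and grid are hypothetical.

```python
import sys

items = range(1000)
as_list = list(items)
as_tuple = tuple(items)

# getsizeof measures the container only, not the objects it refers to,
# so this compares the list overhead against the tuple overhead.
list_bytes = sys.getsizeof(as_list)
tuple_bytes = sys.getsizeof(as_tuple)
print(list_bytes, tuple_bytes)

# A tuple of hashable elements can be a dict key; a list cannot.
grid = {(0, 0): "origin"}
list_key_error = None
try:
    {[0, 0]: "origin"}
except TypeError as exc:
    list_key_error = exc
print(grid[(0, 0)], list_key_error)
```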


Raymond Hettinger

Mar 11, 2003, 11:06:08 PM

"Erik Max Francis" <m...@alcyone.com> wrote in message
news:3E6EA691...@alcyone.com...

> Arthur wrote:
>
> > The subject line is a Guido quote from today's dev list.
> >
> > Is the point Guido is making performance related, or is it something
> > else? I have used lists extensively for heterogeneous data, and wonder
> > what it is I am losing by so doing.
>
> Nothing, it's just a style guideline. I'm aware of no optimizations
> that would provide a penalty for not using lists/tuples in this way.

It's a bit more than a style guideline; it reflects Guido's design intent
and shows up in design decisions throughout Python. That is why
tuples don't have index() or count() methods.


Raymond Hettinger


Lulu of the Lotus-Eaters

Mar 12, 2003, 1:38:03 AM
Arthur <ajsi...@optonline.net> wrote previously:

I think perhaps you are thinking of "homogeneous" in too narrow a sense.

Data that is homogeneous in a Pythonic way isn't necessarily all
integers, or all strings. Rather, a list is a bunch of things that you
might loop over, treating each element in the "same way." Now this same
way might not be entirely the *same*: for example, if you call a method
on each list element, the method might wind up dispatching to different
code depending on the type of the element. Or you might even have some
branches within the loop body (if's, try's, etc)... but even then,
there's something "the same" involved in that every element is a
candidate for the same branch conditions, albeit different ones choose
different paths.

Tuples, in contrast, are more often used as *records*, i.e. related
information about a common thing. The fields of a record need not be
different data *types*, they just represent different information. For
example, age, weight, and SSN might all be stored as integers, but the
-meaning- of those three positions in a tuple is different.
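
As a rough illustration of the distinction drawn above (the data and the
names readings, person, age, weight and ssn are made up for the example):

```python
# List: every element goes through the same loop body, even though
# individual elements may take different branches inside it.
readings = [3, "7", 2.5, "n/a"]
total = 0.0
skipped = []
for value in readings:
    try:
        total += float(value)
    except ValueError:
        skipped.append(value)

# Tuple as record: all three fields are ints, but each position
# means something different.
person = (34, 70, 123456789)
age, weight, ssn = person

print(total, skipped, age, weight, ssn)
```

The list is homogeneous in the sense that each element is a candidate for the
same treatment; the tuple is heterogeneous in meaning even though every field
has the same type.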

Yours, Lulu...

--
mertz@ | The specter of free information is haunting the `Net! All the
gnosis | powers of IP- and crypto-tyranny have entered into an unholy
.cx | alliance...ideas have nothing to lose but their chains. Unite
| against "intellectual property" and anti-privacy regimes!
-------------------------------------------------------------------------


Donn Cave

Mar 12, 2003, 12:28:39 PM
Quoth Lulu of the Lotus-Eaters <me...@gnosis.cx>:

| Arthur <ajsi...@optonline.net> wrote previously:
|> The subject line is a Guido quote from today's dev list. Is the point
|> Guido is making performance related, or is it something else? I have
|> used lists extensively for heterogeneous data, and wonder what it is I
|> am losing by so doing.
|
| I think perhaps you are thinking of "homogeneous" in too narrow a sense.
|
| Data that is homogeneous in a Pythonic way isn't necessarily all
| integers, or all strings. Rather, a list is a bunch of things that you
| might loop over, treating each element in the "same way." Now this same
| way might not be entirely the *same*: for example, if you call a method
| on each list element, the method might wind up dispatching to different
| code depending on the type of the element. Or you might even have some
| branches within the loop body (if's, try's, etc)... but even then,
| there's something "the same" involved in that every element is a
| candidate for the same branch conditions, albeit different ones choose
| different paths.
|
| Tuples, in contrast, are more often used as *records*, i.e. related
| information about a common thing. The fields of a record need not be
| different data *types*, they just represent different information. For
| example, age, weight, and SSN might all be stored as integers, but the
| -meaning- of those three positions in a tuple is different.

I wonder if it would be useful to think about this in terms of more
specific types implemented as sequences. The 9-tuple result of
time.localtime(t), list of strings in sys.argv, etc. are each in
a practical sense distinct data structures, and though not distinct
types at the Python implementation level they still have their own
rules and applications.

If you look at it that way, you can observe that sys.argv[1:] is
of the same type as sys.argv[:], but tm[1:] is not of the same
type as tm[:]. That is, the result of time.localtime() is not
interchangeable in any sense with a sub-slice of itself, and this
is true in general of tuples used as records. But lists used as
sequential storage structures are homogeneous in the sense that
they can be sliced and spliced into the same kinds of lists.
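
The slicing observation can be sketched as follows. The argv list is a
hypothetical stand-in for sys.argv so the example is deterministic, and
time.asctime() is used here only as a convenient function that expects a
full time record.

```python
import time

# A slice of an argument list is still a usable argument list.
argv = ["prog", "-v", "input.txt"]
rest = argv[1:]

# A time record has nine fields; time functions accept it whole.
tm = time.localtime(0)
formatted = time.asctime(tm)

# A slice of the record is only a plain 8-element tuple, which the
# time functions refuse to treat as a time record.
tail = tm[1:]
try:
    time.asctime(tail)
    tail_accepted = True
except TypeError:
    tail_accepted = False

print(rest, len(tm), len(tail), tail_accepted)
```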

Donn Cave, do...@u.washington.edu

ajsi...@optonline.net

Mar 12, 2003, 12:53:54 PM
Lulu writes -

>I think perhaps you are thinking of "homogeneous" in too narrow a sense.

>Data that is homogeneous in a Pythonic way isn't necessarily all
>integers, or all strings. Rather, a list is a bunch of things that you
>might loop over, treating each element in the "same way."

> etc.

That explanation helps. I, a bit ironically, am perhaps thinking in a more technical sense. The objects in a list that I am calling "heterogeneous" might all descend from a common base class and all contain a method of the same name, which I might call while iterating over the list.

In Java I believe I had to do all kinds of casting shenanigans to accomplish this, which is why I think of them as "heterogeneous" in some sense. More naturally, the objects do seem to me to be homogeneous, which is why they work in a list. That is a tautology in some sense, but it also coincides with your point, I think.
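
A small sketch of the situation described here, with hypothetical class names
(Shape, Square, Rectangle) chosen only for illustration:

```python
class Shape:
    """Common base class; subclasses supply their own area()."""

    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


# Different concrete classes, but every element answers the same call,
# so no casts are needed and the list is homogeneous in the sense above.
shapes = [Square(2), Rectangle(3, 4)]
areas = [shape.area() for shape in shapes]
print(areas)
```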

Art

