Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
lists as an efficient implementation of large two-dimensional arrays(!)
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  4 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Mitchell L Model  
View profile  
 More options Feb 2 2010, 3:14 pm
Newsgroups: comp.lang.python
From: Mitchell L Model <MLM...@Comcast.net>
Date: Tue, 2 Feb 2010 15:14:27 -0500
Local: Tues, Feb 2 2010 3:14 pm
Subject: lists as an efficient implementation of large two-dimensional arrays(!)
An instructive lesson in YAGNI ("you aren't going to need it"),  
premature optimization, and not making assumptions about Python data  
structure implementations.

I need a 1000 x 1000 two-dimensional array of objects. (Since they are  
instances of application classes it appears that the array module is  
useless; likewise, since I am using Python 3.1, so among other things,  
I can't use numpy or its relatives.) The usage pattern is that the  
array is first completely filled with objects. Later, objects are  
sometimes accessed individually by row and column and often the entire  
array is iterated over.

Worried (unnecessarily, as it turns out) by the prospect of 1,000,000  
element list I started by constructing a dictionary with the keys 1  
through 1000, each of which had as its value another dictionary with  
the keys 1 through 1000. Actual values were the values of the second  
level dictionary.

Using numbers to fill the array to minimize the effect of creating my  
more complex objects, and running Python 3.1.1 on an 8-core Mac Pro  
with 8Gb memory, I tried the following

#create and fill the array:
t1 = time.time()
d2 = {}
for j in range(1000):
     d2[j] = dict()
     for k in range(1000):
         d2[j][k] = k
print( round(time.time() - t1, 2))

0.41

# access each element of the array:
t1 = time.time()
for j in range(1000):
     for k in range(1000):
         elt = d2[j][k]
print( round(time.time() - t1, 2))

0.55

  My program was too slow, so I started investigating whether I could  
improve on the two-level dictionary, which got used a lot. To get  
another baseline I tried a pure 1,000,000-element list, expecting the  
times to be be horrendous, but look!

# fill a list using append
t1 = time.time()
lst = []
for n in range(1000000):
     lst.append(n)
print( round(time.time() - t1, 2))

0.26

# access every element of a list
t1 = time.time()
for n in range(1000000):
     elt = lst[n]
print( round(time.time() - t1, 2))

0.25

What a shock! I could save half the execution time and all my clever  
work and awkward double-layer dictionary expressions by just using a  
list!

Even better, look what happens using a comprehension to create the  
list instead of a loop with list.append:

t1 = time.time()
lst = [n for n in range(1000000)]
print( round(time.time() - t1, 2))

0.11

Half again to create the list.

Iterating over the whole list is easier and faster than iterating over  
the double-level dictionary, in particular because it doesn't involve  
a two-level loop. But what about individual access given a row and a  
column?

t1 = time.time()
for j in range(1000):
     for k in range(1000):
         elt = lst[j * 1000 + k]
print( round(time.time() - t1, 2))

0.45

This is the same as for the dictionary.

I tried a two-level list and a few other things but still haven't  
found anything that works better than a single long list -- just like  
2-D arrays are coded in old-style languages, with indices computed as  
offsets from the beginning of the linear sequence of all the values.  
What's amazing is that creating and accessing 1,000,000-element list  
in Python is so efficient. The usual moral obtains: start simple,  
analyze problems (functional or performance) as they arise, decide  
whether they are worth the cost of change, then change in very limited  
ways. And of course abstract and modularize so that, in my case, for  
example, none of the program's code would be affected by the change  
from a two-level dictionary representation to using a single long list.

I realize there are many directions an analysis like this can follow,  
and many factors affecting it, including patterns of use. I just  
wanted to demonstrate the basics for a situation that I just  
encountered. In particular, if the array was sparse, rather than  
completely full, the two-level dictionary implementation would be the  
natural representation.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Gerald Britton  
View profile  
 More options Feb 2 2010, 3:36 pm
Newsgroups: comp.lang.python
From: Gerald Britton <gerald.brit...@gmail.com>
Date: Tue, 2 Feb 2010 15:36:45 -0500
Local: Tues, Feb 2 2010 3:36 pm
Subject: Re: lists as an efficient implementation of large two-dimensional arrays(!)
Did you try it with an array object using the array module?

On Tue, Feb 2, 2010 at 3:14 PM, Mitchell L Model <MLM...@comcast.net> wrote:

--
Gerald Britton

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
exar...@twistedmatrix.com  
View profile  
 More options Feb 2 2010, 3:57 pm
Newsgroups: comp.lang.python
From: exar...@twistedmatrix.com
Date: Tue, 02 Feb 2010 20:57:11 -0000
Local: Tues, Feb 2 2010 3:57 pm
Subject: Re: lists as an efficient implementation of large two-dimensional arrays(!)
On 08:36 pm, gerald.brit...@gmail.com wrote:

>On Tue, Feb 2, 2010 at 3:14 PM, Mitchell L Model <MLM...@comcast.net>
>wrote:
>>I need a 1000 x 1000 two-dimensional array of objects. (Since they are
>>instances of application classes it appears that the array module is
>>useless;
>Did you try it with an array object using the array module?

Jean-Paul

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Terry Reedy  
View profile  
 More options Feb 2 2010, 5:54 pm
Newsgroups: comp.lang.python
From: Terry Reedy <tjre...@udel.edu>
Date: Tue, 02 Feb 2010 17:54:51 -0500
Local: Tues, Feb 2 2010 5:54 pm
Subject: Re: lists as an efficient implementation of large two-dimensional arrays(!)
On 2/2/2010 3:14 PM, Mitchell L Model wrote:

> I need a 1000 x 1000 two-dimensional array of objects.

I would just use 1000 element list, with each element being a 1000
element list or array (if possible). Then l2d[i][j] works fine.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »