Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Fast Data Comparison (dict v. list. v string)
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  4 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Scott Brady Drummonds  
View profile  
 More options Apr 14 2004, 5:10 pm
Newsgroups: comp.lang.python
From: "Scott Brady Drummonds" <scott.b.drummonds.nos...@intel.com>
Date: Wed, 14 Apr 2004 14:01:44 -0700
Local: Wed, Apr 14 2004 5:01 pm
Subject: Fast Data Comparison (dict v. list. v string)
Hi, everyone,

I'm a relative novice at Python and am working on some optimizations in my
first Python project.  At its core, this tool I'm writing needs to perform
many comparisons to fixed-size lists of fixed-length strings.

Currently, my implementation uses dictionaries to store each string.  I
maintain a separate "key-mapping" dictionary that maps keys from one
dictionary to the other.  Then, all I have to do to compare the two
dictionaries is as follows:
  for metaKey in keyMap.keys():
    if dict1[metaKey] != dict2[keyMap[metaKey]]:
      #  Do some processing

Since the sizes of the dictionaries never change, I tried implementing this
using lists.  The solution looks something like this (and assumes that a
pre-processing phase has sorted the contents of each list so their indexes
are the same):
  for i in len(list1):
    if list1[i] != list2[i]:
      #  Do some processing

As it turns out, this implementation appears to be about 33% faster than the
dictionary-based one.  Now, assuming that the datum being stored at each
index can fit into one character, I could do a string-based implementation
like this:
  for i in len(string1):
    if string1[i] != string[i]:
      #  Do some processing

This last solution actually runs about the same as the dictionary, which
takes 50% longer than the list implementation.

Now, my questions are:
1)  Does anyone have another suggestion as to how I can organize these data
so that I can compare many elements many times?
2)  Is there a string comparison operator that will return which indexes
have different values?  Maybe it would be faster than the iterative
comparison approach for the third implementation.
3)  Since my data are changing between the successive executions of the code
snippets above, I need a way of having the other parts of the program update
it.  But, it appears that strings are constant as I can't assign individual
characters with "string1[i] = '0'".  Is there any way around this?

Thanks!
Scott

--
Remove ".nospam" from the user ID in my e-mail to reply via e-mail.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Larry Bates  
View profile  
 More options Apr 15 2004, 10:03 am
Newsgroups: comp.lang.python
From: "Larry Bates" <lba...@swamisoft.com>
Date: Thu, 15 Apr 2004 09:03:47 -0500
Local: Thurs, Apr 15 2004 10:03 am
Subject: Re: Fast Data Comparison (dict v. list. v string)
Scott,

You should take a close look at list comprehensions
and possibly sets (new in V2.3).  I'm not all that
familiar with sets, but your problem sounds like
they might come in handy.

Example list comprehension:

list1notinlist2=[x for x in list1 if x not in list2]
for item in list1notinlist:
    # Do some processing

Note: In Python if you find yourself using [i]
indexes into lists, there is "almost" always a
better way (at least that is my experience).

Larry Bates
Syscon, Inc.

"Scott Brady Drummonds" <scott.b.drummonds.nos...@intel.com> wrote in
message news:c5k8rp$ql6$1@news01.intel.com...


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Scott Brady Drummonds  
View profile  
 More options Apr 15 2004, 12:40 pm
Newsgroups: comp.lang.python
From: "Scott Brady Drummonds" <scott.b.drummonds.nos...@intel.com>
Date: Thu, 15 Apr 2004 09:38:20 -0700
Local: Thurs, Apr 15 2004 12:38 pm
Subject: Re: Fast Data Comparison (dict v. list. v string)

"Larry Bates" <lba...@swamisoft.com> wrote in message

news:hIWdnXU6rJ7fC-PdRVn-iQ@comcast.com...

> Scott,

> You should take a close look at list comprehensions
> and possibly sets (new in V2.3).  I'm not all that
> familiar with sets, but your problem sounds like
> they might come in handy.

Excellent suggestion!  Thanks so much for the idea.

Scott


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
george young  
View profile  
 More options Apr 16 2004, 11:08 am
Newsgroups: comp.lang.python
From: g...@ll.mit.edu (george young)
Date: 16 Apr 2004 08:08:57 -0700
Local: Fri, Apr 16 2004 11:08 am
Subject: Re: Fast Data Comparison (dict v. list. v string)
"Scott Brady Drummonds" <scott.b.drummonds.nos...@intel.com> wrote in message <news:c5k8rp$ql6$1@news01.intel.com>...

You might also take a look at psyco: http://psyco.sourceforge.net.  It is
very easy to use, non-intrusive, and claims large performance improvements for
code like this.  I have hopes that things like psyco can let us keep code
in the simplest and *clearest* form, and let external optimization mechanisms
make it fast enough when needed.

-- George Young


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »