Gmail Calendar Documents Reader Web more »
Recently Visited Groups | Help | Sign in
Google Groups Home
Why is my code faster with append() in a loop than with a large list?
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  14 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
David  
View profile  
 More options Jul 5, 8:48 pm
Newsgroups: comp.lang.python
From: David <da...@pythontoo.com>
Date: Sun, 05 Jul 2009 20:48:39 -0400
Local: Sun, Jul 5 2009 8:48 pm
Subject: Re: Why is my code faster with append() in a loop than with a large list?
Xavier Ho wrote:
> (Here's a short version of the long version below if you don't want to
> read:)

> Why is version B of the code faster than version A? (Only three lines
> different)

> Version A: http://pastebin.com/f14561243
> Version B: http://pastebin.com/f1f657afc

I don't know but here is the diff for someone that may;

1c1,2
< # This one only took 8 seconds on my machine. Wow?
---
 >
 > # This one took hours to compute, overnight.
28c29
<     powers = [0, 0]
---
 >     powers = [0] * (num + 1)
32c33
<             powers[factor-1] += 1
---
 >             powers[factor] += 1
35d35
<             powers.append(0)
55c55
<     n += 1
\ No newline at end of file
---
 >     n += 1

--
Powered by Gentoo GNU/Linux
http://linuxcrazy.com


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
MRAB  
View profile  
 More options Jul 5, 8:54 pm
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Mon, 06 Jul 2009 01:54:39 +0100
Local: Sun, Jul 5 2009 8:54 pm
Subject: Re: Why is my code faster with append() in a loop than with a large list?

In your version you're creating a list of (num + 1) elements, but in the
other version the list is only as long as the largest factor.

For example, for num=28, your version creates a list 29 elements long,
but the other version creates one only 7 elements long. Also, 'the time
needed to append an item to the list is "amortized constant"' (quoted
from http://effbot.org/zone/python-list.htm).

This means that your little speed test isn't representation of what's
actually happening.


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Vilya Harvey  
View profile  
 More options Jul 6, 5:21 am
Newsgroups: comp.lang.python
From: Vilya Harvey <vilya.har...@gmail.com>
Date: Mon, 6 Jul 2009 10:21:07 +0100
Local: Mon, Jul 6 2009 5:21 am
Subject: Re: Why is my code faster with append() in a loop than with a large list?
2009/7/6 Xavier Ho <cont...@xavierho.com>:

> Why is version B of the code faster than version A? (Only three lines
> different)

Here's a guess:

As the number you're testing gets larger, version A is creating very
big list. I'm not sure exactly how much overhead each list entry has
in python, but I guess it's at least 8 bytes: a 32-bit reference for
each list entry, and 32 bits to hold the int value (assuming a 32-bit
version of python). The solution you're looking for is a large 8 digit
number; let's say 80,000,000, for the sake of easy calculation. That
means, as you get close to the solution, you'll be trying to allocate
almost 640 Mb of memory for every number you're checking. That's going
to make the garbage collector work extremely hard. Also, depending on
how much memory your computer has free, you'll probably start hitting
virtual memory too, which will slow you down even further. Finally,
the reduce step has to process all 80,000,000 elements which is
clearly going to take a while.

Version b creates a list which is only as long as the largest prime
factor, so at worst the list size will be approx. sqrt(80,000,000),
which is approx. 8900 elements or approx. 72 Kb or memory - a much
more manageable size.

Hope that helps,

Vil.


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Dave Angel  
View profile  
 More options Jul 6, 6:09 am
Newsgroups: comp.lang.python
From: Dave Angel <da...@dejaviewphoto.com>
Date: Mon, 06 Jul 2009 06:09:46 -0400
Local: Mon, Jul 6 2009 6:09 am
Subject: Re: Why is my code faster with append() in a loop than with a large list?

Just by inspection, it would seem the bottleneck in your first version
is that you return a huge list of nearly all zeroes, from factorize().  
This slows down countDivisors() a great deal.

It would probably save some time to not bother storing the zeroes in the
list at all.  And it should help if you were to step through a list of
primes, rather than trying every possible int.  Or at least constrain
yourself to odd numbers (after the initial case of 2).

DaveA


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
MRAB  
View profile  
 More options Jul 6, 10:22 am
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Mon, 06 Jul 2009 15:22:23 +0100
Local: Mon, Jul 6 2009 10:22 am
Subject: Re: Why is my code faster with append() in a loop than with a large list?
Dave Angel wrote:

[snip]
> It would probably save some time to not bother storing the zeroes in the
> list at all.  And it should help if you were to step through a list of
> primes, rather than trying every possible int.  Or at least constrain
> yourself to odd numbers (after the initial case of 2).

Or stop looking for more factors when you've passed the square root of
num. I don't know what effect there'll be on the time if you recalculate
the square root when num changes (expensive calculation vs smaller
search space).

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Piet van Oostrum  
View profile  
 More options Jul 6, 11:15 am
Newsgroups: comp.lang.python
From: Piet van Oostrum <p...@cs.uu.nl>
Date: Mon, 06 Jul 2009 17:15:47 +0200
Local: Mon, Jul 6 2009 11:15 am
Subject: Re: Why is my code faster with append() in a loop than with a large list?

>>>>> Dave Angel <da...@dejaviewphoto.com> (DA) wrote:
>DA> It would probably save some time to not bother storing the zeroes in the
>DA> list at all.  And it should help if you were to step through a list of
>DA> primes, rather than trying every possible int.  Or at least constrain
>DA> yourself to odd numbers (after the initial case of 2).

The first and the last save a constant factor (slightly less than 2):

def factorise(num):
    """Returns a list of prime factor powers. For example:
        factorise(6) will return
        [2, 2] (the powers are returned one higher than the actual value)
        as in, 2^1 * 3^1 = 6."""
    powers = []
    factor = 2
    while num > 1:
        power = 0
        while num % factor == 0:
            power += 1
            num /= factor
        if power > 0:
            powers.append(power+1)
        factor += 1
    return powers

...
    return reduce(mul, powers)

or to skip the odd factors:

def factorise(num):
    """Returns a list of prime factor powers. For example:
        factorise(6) will return
        [2, 2] (the powers are returned one higher than the actual value)
        as in, 2^1 * 3^1 = 6."""
    powers = []
    factor = 2
    while num > 1:
        power = 0
        while num % factor == 0:
            power += 1
            num /= factor
        if power > 0:
            powers.append(power+1)
        factor = 3 if factor == 2 else factor + 2
    return powers

This can be slightly optimised by taking factor 2 out of the loop.

def factorise(num):
    """Returns a list of prime factor powers. For example:
        factorise(6) will return
        [2, 2] (the powers are returned one higher than the actual value)
        as in, 2^1 * 3^1 = 6."""
    powers = []
    power = 0
    while num % 2 == 0:
        power += 1
        num /= 2
    if power > 0:
        powers.append(power+1)
    factor = 3
    while num > 1:
        power = 0
        while num % factor == 0:
            power += 1
            num /= factor
        if power > 0:
            powers.append(power+1)
        factor += 2
    return powers

To restrict the search to primes you would have to use a
sieve of Eratosthenes or something similar.
My first attempt (with a sieve from
http://code.activestate.com/recipes/117119/) only gave a speed decrease!!
But this had the sieve recreated for every triangle number. A global
sieve that is reused at each triangle number is better. But the speed
increase relative to the odd factors only is not dramatical.

# Based upon http://code.activestate.com/recipes/117119/

D = {9: 6} # contains composite numbers
Dlist = [2, 3] # list of already generated primes

def sieve():
    '''generator that yields all prime numbers'''
    global D
    global Dlist
    for p in Dlist:
        yield p
    q = Dlist[-1]+2
    while True:
        if q in D:
            p = D[q]
            x = q + p
            while x in D: x += p
            D[x] = p
        else:
            Dlist.append(q)
            yield q
            D[q*q] = 2*q
        q += 2

def factorise(num):
    """Returns a list of prime factor powers. For example:
        factorise(6) will return
        [2, 2] (the powers are returned one higher than the actual value)
        as in, 2^1 * 3^1 = 6."""
    powers = []
    power = 0
    for factor in sieve():
        power = 0
        while num % factor == 0:
            power += 1
            num /= factor
        if power > 0:
            # if you really want the factors then append((factor, power))
            powers.append(power+1)
        if num == 1:
            break
    return powers

--
Piet van Oostrum <p...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Piet van Oostrum  
View profile  
 More options Jul 6, 11:51 am
Newsgroups: comp.lang.python
From: Piet van Oostrum <p...@cs.uu.nl>
Date: Mon, 06 Jul 2009 17:51:02 +0200
Local: Mon, Jul 6 2009 11:51 am
Subject: Re: Why is my code faster with append() in a loop than with a large list?
Sorry, there was an error in the sieve in my last example. Here is a
corrected version:

D = {9: 6} # contains composite numbers
Dlist = [2, 3] # list of already generated primes
def sieve():
    '''generator that yields all prime numbers'''
    global D
    global Dlist
    for q in Dlist:
        yield q
    while True:
        q += 2
        p = D.pop(q, 0)
        if p:
            x = q + p
            while x in D: x += p
            D[x] = p
        else:
            Dlist.append(q)
            D[q*q] = 2*q
            yield q

--
Piet van Oostrum <p...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Scott David Daniels  
View profile  
 More options Jul 6, 12:53 pm
Newsgroups: comp.lang.python
From: Scott David Daniels <Scott.Dani...@Acm.Org>
Date: Mon, 06 Jul 2009 09:53:34 -0700
Local: Mon, Jul 6 2009 12:53 pm
Subject: Re: Why is my code faster with append() in a loop than with a large list?
Piet van Oostrum wrote:
>>>>>> Dave Angel <da...@dejaviewphoto.com> (DA) wrote:

>> DA> It would probably save some time to not bother storing the zeroes in the
>> DA> list at all.  And it should help if you were to step through a list of
>> DA> primes, rather than trying every possible int.  Or at least constrain
>> DA> yourself to odd numbers (after the initial case of 2).

> ...
> # Based upon http://code.activestate.com/recipes/117119/

> D = {9: 6} # contains composite numbers

XXX Dlist = [2, 3] # list of already generated primes
   Elist = [(2, 4), (3, 9)] # list of primes and their squares

XXX def sieve():
XXX   '''generator that yields all prime numbers'''
XXX   global D
XXX   global Dlist
  def sieve2():
      '''generator that yields all primes and their squares'''
      # No need for global declarations, we alter, not replace
XXX   for p in Dlist:
XXX       yield p
XXX   q = Dlist[-1]+2

       for pair in Elist:
           yield pair
       q = pair[0] + 2

>     while True:
>         if q in D:
>             p = D[q]
>             x = q + p
>             while x in D: x += p
>             D[x] = p
>         else:

XXX           Dlist.append(q)
XXX           yield q
XXX           D[q*q] = 2*q
               square = q * q
               pair = q, square
               Elist.append(pair)
               yield pair
               D[square] = 2 * q
>         q += 2

> def factorise(num):
>     """Returns a list of prime factor powers. For example:
>         factorise(6) will return
>         [2, 2] (the powers are returned one higher than the actual value)
>         as in, 2^1 * 3^1 = 6."""
>     powers = []
>     power = 0

XXX   for factor in sieve():
       for factor, limit in sieve2():
>         power = 0
>         while num % factor == 0:
>             power += 1
>             num /= factor

XXX       if power > 0:
           if power: # good enough here, and faster
>             # if you really want the factors then append((factor, power))
>             powers.append(power+1)

XXX       if num == 1:
XXX           break
XXX   return powers
           if num < limit:
               if num > 1:
                   # if you really want the factors then append((num, 1))
                   powers.append(2)
               return powers

OK, that's a straightforward speedup, _but_:
      factorize(6) == [2, 2] == factorize(10) ==  factorize(15)
So I am not sure exactly what you are calculating.

--Scott David Daniels
Scott.Dani...@Acm.Org


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Dave Angel  
View profile  
 More options Jul 6, 12:49 pm
Newsgroups: comp.lang.python
From: Dave Angel <da...@ieee.org>
Date: Mon, 06 Jul 2009 12:49:49 -0400
Local: Mon, Jul 6 2009 12:49 pm
Subject: Re: Re: Why is my code faster with append() in a loop than with a large list?

But if I remember the code, it stopped when the quotient is one, which
is usually sooner than the square root.  And no need to precalculate the
square root.

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Dave Angel  
View profile  
 More options Jul 6, 1:13 pm
Newsgroups: comp.lang.python
From: Dave Angel <da...@ieee.org>
Date: Mon, 06 Jul 2009 13:13:48 -0400
Local: Mon, Jul 6 2009 1:13 pm
Subject: Re: Why is my code faster with append() in a loop than with a large list?

The OP only needed the number of factors, not the actual factors.  So
the zeroes in the list are unneeded.  6, 10, and 15 each have 4 factors.

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
MRAB  
View profile  
 More options Jul 6, 1:20 pm
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Mon, 06 Jul 2009 18:20:59 +0100
Local: Mon, Jul 6 2009 1:20 pm
Subject: Re: Why is my code faster with append() in a loop than with a large list?

If the number is a large prime then the code will try all the numbers up
to that, eg if num == 1000003 then it'll try 2..1000003 even though it
could've given up after 1000.

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Dave Angel  
View profile  
 More options Jul 6, 2:28 pm
Newsgroups: comp.lang.python
From: Dave Angel <da...@ieee.org>
Date: Mon, 06 Jul 2009 14:28:05 -0400
Local: Mon, Jul 6 2009 2:28 pm
Subject: Re: Re: Why is my code faster with append() in a loop than with a large list?

That's one reason I suggested (and Piet implemented) a sieve.  You can
stop dividing when the square of the next prime exceeds your quotient.

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Piet van Oostrum  
View profile  
 More options Jul 6, 6:51 pm
Newsgroups: comp.lang.python
From: Piet van Oostrum <p...@cs.uu.nl>
Date: Tue, 07 Jul 2009 00:51:24 +0200
Local: Mon, Jul 6 2009 6:51 pm
Subject: Re: Why is my code faster with append() in a loop than with a large list?

>>>>> Scott David Daniels <Scott.Dani...@Acm.Org> (SDD) wrote:
>SDD>      # No need for global declarations, we alter, not replace

Yes, I know, but I find it neater to do the global declarations, if only
for documentation. And they don't affect the run time, only the compile
time.
--
Piet van Oostrum <p...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Piet van Oostrum  
View profile  
 More options Jul 6, 6:58 pm
Newsgroups: comp.lang.python
From: Piet van Oostrum <p...@cs.uu.nl>
Date: Tue, 07 Jul 2009 00:58:00 +0200
Subject: Re: Why is my code faster with append() in a loop than with a large list?

That's right. I thought of doing the sqrt trick by testing for num <
factor, i.e."

        if num < factor:
            break
but found out that it is useless, as for each factor found, num is
divided by that factor as many times as possible. Therefore when the
largest prime factor of num has been found, num ends up being 1 and the
loop stops.
--
Piet van Oostrum <p...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2009 Google