Performance on local constants?
William McBrine  
Newsgroups: comp.lang.python
From: William McBrine <wmcbr...@users.sf.net>
Date: Sat, 22 Dec 2007 10:53:39 GMT
Local: Sat, Dec 22 2007 5:53 am
Subject: Performance on local constants?
Hi all,

I'm pretty new to Python (a little over a month). I was wondering -- is
something like this:

s = re.compile('whatever')

def t(whatnot):
    return s.search(whatnot)

for i in xrange(1000):
    print t(something[i])

significantly faster than something like this:

def t(whatnot):
    s = re.compile('whatever')
    return s.search(whatnot)

for i in xrange(1000):
    result = t(something[i])

? Or is Python clever enough to see that the value of s will be the same
on every call, and thus only compile it once?

--
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 -- pass it on


 
Paddy
Newsgroups: comp.lang.python
From: Paddy <paddy3...@googlemail.com>
Date: Sat, 22 Dec 2007 03:56:43 -0800 (PST)
Local: Sat, Dec 22 2007 6:56 am
Subject: Re: Performance on local constants?
On Dec 22, 10:53 am, William McBrine <wmcbr...@users.sf.net> wrote:

Python RE's do have a cache but telling it to compile multiple times
is going to take time.

Best to do as the docs say and compile your RE's once before use if
you can.

The timeit module: http://www.diveintopython.org/performance_tuning/timeit.html
will allow you to do your own timings.

- Paddy.


 
John Machin
Newsgroups: comp.lang.python
From: John Machin <sjmac...@lexicon.net>
Date: Sat, 22 Dec 2007 04:04:17 -0800 (PST)
Local: Sat, Dec 22 2007 7:04 am
Subject: Re: Performance on local constants?
On Dec 22, 9:53 pm, William McBrine <wmcbr...@users.sf.net> wrote:

> [...] significantly faster than something like this: [...]

No.

> ? Or is Python clever enough to see that the value of s will be the same
> on every call,

No. It doesn't have a crystal ball.

> and thus only compile it once?

But it is smart enough to maintain a cache, which achieves the desired
result.

Why don't you do some timings?

While you're at it, try this:

def t2(whatnot):
    return re.search('whatever', whatnot)

and this:

t3 = re.compile('whatever').search

HTH,
John


 
Duncan Booth
Newsgroups: comp.lang.python
From: Duncan Booth <duncan.bo...@invalid.invalid>
Date: 22 Dec 2007 12:05:00 GMT
Subject: Re: Performance on local constants?

The best way to answer these questions is always to try it out for
yourself. Have a look at 'timeit.py' in the library: you can run
it as a script to time simple things or import it from longer scripts.

C:\Python25>python lib/timeit.py -s "import re;s=re.compile('whatnot')" "s.search('some long string containing a whatnot')"
1000000 loops, best of 3: 1.05 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.compile('whatnot').search('some long string containing a whatnot')"
100000 loops, best of 3: 3.76 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.search('whatnot', 'some long string containing a whatnot')"
100000 loops, best of 3: 3.98 usec per loop

So it looks like it takes a couple of microseconds overhead if you
don't pre-compile the regular expression. That could be significant
if you have simple matches as above, or irrelevant if the match is
complex and slow.

You can also try measuring the compile time separately:

C:\Python25>python lib/timeit.py -s "import re" "re.compile('whatnot')"
100000 loops, best of 3: 2.36 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.compile('<(?:p|div)[^>]*>(?P<pat0>(?:(?P<atag0>\\<a[^>]*\\>)\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>\\</a\\>)|\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>)</(?:p|div)>|(?P<pat1>(?:(?P<atag1>\\<a[^>]*\\>)\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>\\</a\\>)|\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>)')"
100000 loops, best of 3: 2.34 usec per loop

It makes no difference whether you use a trivial regular expression
or a complex one: Python remembers (if I remember correctly) the last
100 expressions it compiled, so the compilation overhead will be pretty
constant.
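
[Editor's sketch, not part of the original post.] The same three-way
comparison can be scripted rather than run from the command line. This
assumes Python 3 and uses only the standard timeit and re modules; the
function names and the loop count are made up for illustration, and the
absolute numbers will differ from the Python 2.5 figures above.

```python
import re
import timeit

TEXT = 'some long string containing a whatnot'
NUMBER = 20_000  # calls per timing run (arbitrary choice for this sketch)

precompiled = re.compile('whatnot')

def use_precompiled():
    # Pattern object created once, outside the timed code.
    return precompiled.search(TEXT)

def compile_each_call():
    # Calls re.compile() every time; relies on the re module's cache.
    return re.compile('whatnot').search(TEXT)

def module_level_search():
    # re.search() looks the pattern up in the same cache internally.
    return re.search('whatnot', TEXT)

timings = {}
for func in (use_precompiled, compile_each_call, module_level_search):
    # Best of three runs, converted to microseconds per call.
    best = min(timeit.repeat(func, number=NUMBER, repeat=3))
    timings[func.__name__] = best / NUMBER * 1e6
    print('%-20s %.3f usec per call' % (func.__name__, timings[func.__name__]))
```

No expected figures are given here because they depend on the machine
and the Python version.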


 
Steven D'Aprano
Newsgroups: comp.lang.python
From: Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au>
Date: Sat, 22 Dec 2007 12:18:09 -0000
Local: Sat, Dec 22 2007 7:18 am
Subject: Re: Performance on local constants?

Let's find out:

>>> import re
>>> import dis

>>> def spam(x):
...     s = re.compile('nobody expects the Spanish Inquisition!')
...     return s.search(x)
...
>>> dis.dis(spam)
  2           0 LOAD_GLOBAL              0 (re)
              3 LOAD_ATTR                1 (compile)
              6 LOAD_CONST               1 ('nobody expects the Spanish Inquisition!')
              9 CALL_FUNCTION            1
             12 STORE_FAST               1 (s)

  3          15 LOAD_FAST                1 (s)
             18 LOAD_ATTR                2 (search)
             21 LOAD_FAST                0 (x)
             24 CALL_FUNCTION            1
             27 RETURN_VALUE

No, the Python compiler doesn't know anything about regular expression
objects, so it compiles a call to the RE engine which is executed every
time the function is called.

However, the re module keeps its own cache, so in fact the regular
expression itself may only get compiled once regardless.
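
[Editor's sketch, not part of the original post.] The cache can be
observed directly. This assumes CPython and Python 3; returning the very
same pattern object on a cache hit is an implementation detail rather
than a documented guarantee, so treat the identity checks as
illustrative only. re.purge() is the documented call that clears the
cache.

```python
import re

PATTERN = 'nobody expects the Spanish Inquisition!'

re.purge()                    # start from an empty cache
first = re.compile(PATTERN)   # cache miss: the pattern is really compiled
second = re.compile(PATTERN)  # cache hit: the cached object is handed back

same_object = first is second
print('second call reused the cached object:', same_object)

re.purge()                    # empty the cache again
third = re.compile(PATTERN)   # compiled afresh while `first` is still alive
recompiled = third is not first
print('compile after purge built a new object:', recompiled)
```

Even on a cache hit, the function call and the cache lookup still run on
every invocation, which is the overhead the timings in this thread
measure.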

Here's another approach that avoids the use of a global variable for the
regular expression:

>>> def spam2(x, s=re.compile('nobody expects the Spanish Inquisition!')):
...     return s.search(x)
...
>>> dis.dis(spam2)
  2           0 LOAD_FAST                1 (s)
              3 LOAD_ATTR                0 (search)
              6 LOAD_FAST                0 (x)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE

What happens now is that the regex is compiled by the RE engine once, when
the def statement is executed, then stored as the default value for the
argument s. If you don't supply another value for s when you call the
function, the default regex is used. If you do, the overriding value is
used instead:

>>> spam2("nothing")
>>> spam2("nothing", re.compile('thing'))
<_sre.SRE_Match object at 0xb7c29c28>

I suspect that this will be not only the fastest solution, but also the
most flexible.
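
[Editor's sketch, not part of the original post.] One way to check that
a default argument is evaluated only once is to count the calls.
counting_compile and compile_calls are hypothetical helpers invented for
this illustration; the sketch assumes Python 3.

```python
import re

compile_calls = 0

def counting_compile(pattern):
    # Hypothetical wrapper: counts how often a compile is requested.
    global compile_calls
    compile_calls += 1
    return re.compile(pattern)

# The default value is evaluated here, when the def statement runs,
# not each time spam2 is called.
def spam2(x, s=counting_compile('Spanish Inquisition')):
    return s.search(x)

results = [spam2(text) for text in
           ('nobody expects the Spanish Inquisition!', 'nothing', 'Spanish')]

print('compile requested', compile_calls, 'time(s) for', len(results), 'calls')
```

The counter is bumped when the function is defined and stays unchanged
across the three calls.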

--
Steven


 
Dustan
Newsgroups: comp.lang.python
From: Dustan <DustanGro...@gmail.com>
Date: Sat, 22 Dec 2007 05:36:28 -0800 (PST)
Local: Sat, Dec 22 2007 8:36 am
Subject: Re: Performance on local constants?
On Dec 22, 6:04 am, John Machin <sjmac...@lexicon.net> wrote:

> t3 = re.compile('whatever').search

Ack! No! Too Pythonic! GETITOFF! GETITOFF!!

 
Terry Reedy
Newsgroups: comp.lang.python
From: "Terry Reedy" <tjre...@udel.edu>
Date: Sat, 22 Dec 2007 13:38:28 -0500
Local: Sat, Dec 22 2007 1:38 pm
Subject: Re: Performance on local constants?

"Steven D'Aprano" <st...@REMOVE-THIS-cybersource.com.au> wrote in message

news:13mq041tef27vad@corp.supernews.com...
| >>> def spam2(x, s=re.compile('nobody expects the Spanish Inquisition!')):
| ...     return s.search(x)
|
| I suspect that this will be not only the fastest solution, but also the
| most flexible.

'Most flexible' in a different way is

def searcher(rex):
    crex = re.compile(rex)
    def _(txt):
        return crex.search(txt)
    return _

One can then create and keep around multiple searchers based on different
patterns, to be used as needed.

tjr


 
John Machin
Newsgroups: comp.lang.python
From: John Machin <sjmac...@lexicon.net>
Date: Sat, 22 Dec 2007 12:46:31 -0800 (PST)
Local: Sat, Dec 22 2007 3:46 pm
Subject: Re: Performance on local constants?
On Dec 23, 5:38 am, "Terry Reedy" <tjre...@udel.edu> wrote:

I see your obfuscatory ante and raise you several dots and
underscores:

class Searcher(object):
    def __init__(self, rex):
        self.crex = re.compile(rex)
    def __call__(self, txt):
        return self.crex.search(txt)

Cheers,
John


 
Terry Reedy
Newsgroups: comp.lang.python
From: "Terry Reedy" <tjre...@udel.edu>
Date: Sat, 22 Dec 2007 22:39:05 -0500
Local: Sat, Dec 22 2007 10:39 pm
Subject: Re: Performance on local constants?

"John Machin" <sjmac...@lexicon.net> wrote in message

news:ab88db50-ce4e-4298-bcec-079de67dbcb8@e25g2000prg.googlegroups.com...
| On Dec 23, 5:38 am, "Terry Reedy" <tjre...@udel.edu> wrote:
| > 'Most flexible' in a different way is
| >
| > def searcher(rex):
| >     crex = re.compile(rex)
| >     def _(txt):
| >         return crex.search(txt)
| >     return _
| >
|
| I see your obfuscatory ante and raise you several dots and
| underscores:

I will presume you are merely joking, but for the benefit of any beginning
programmers reading this, the closure above is a standard functional idiom
for partial evaluation of a function (in this case, re.search(crex, txt))

| class Searcher(object):
|    def __init__(self, rex):
|        self.crex = re.compile(rex)
|    def __call__(self, txt):
|        return self.crex.search(txt)

while this is the equivalent OO version. Intermediate Python programmers
should know both.
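
[Editor's sketch, not part of the original post.] Placing the two
versions side by side shows that callers cannot tell them apart. This
assumes Python 3; the inner function is named search instead of _ for
readability, and the sample pattern and strings are made up.

```python
import re

def searcher(rex):
    # Closure version: crex lives on in the inner function's enclosing scope.
    crex = re.compile(rex)
    def search(txt):
        return crex.search(txt)
    return search

class Searcher:
    # OO version: crex lives on as an instance attribute.
    def __init__(self, rex):
        self.crex = re.compile(rex)
    def __call__(self, txt):
        return self.crex.search(txt)

find_digits_closure = searcher(r'\d+')
find_digits_object = Searcher(r'\d+')

for text in ('order 66', 'no numbers here'):
    a = find_digits_closure(text)
    b = find_digits_object(text)
    # Both are called exactly the same way and agree on the result.
    print(repr(text), a and a.group(), b and b.group())
```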

tjr


 
John Machin
Newsgroups: comp.lang.python
From: John Machin <sjmac...@lexicon.net>
Date: Sat, 22 Dec 2007 22:55:07 -0800 (PST)
Local: Sun, Dec 23 2007 1:55 am
Subject: Re: Performance on local constants?
On Dec 23, 2:39 pm, "Terry Reedy" <tjre...@udel.edu> wrote:

Semi-joking; I thought that your offering of this:

def searcher(rex):
    crex = re.compile(rex)
    def _(txt):
        return crex.search(txt)
    return _
foo_searcher = searcher('foo')

was somewhat over-complicated, and possibly slower than already-
mentioned alternatives. The standard idiom etc etc it may be, but the
OP was interested in getting overhead out of his re searching loop.
Let's trim it a bit.

step 1:
def searcher(rex):
    crexs = re.compile(rex).search
    def _(txt):
        return crexs(txt)
    return _
foo_searcher = searcher('foo')

step 2:
def searcher(rex):
    return re.compile(rex).search
foo_searcher = searcher('foo')

step 3:
foo_searcher = re.compile('foo').search
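
[Editor's sketch, not part of the original post.] Step 3 works because
search is a bound method: it carries its compiled pattern along with it.
This assumes Python 3 on CPython, where the bound method exposes the
pattern object as __self__; the sample strings are made up.

```python
import re

foo_searcher = re.compile('foo').search

# The bound method is called like any plain function.
match = foo_searcher('seafood')
print(match.span())

# The compiled pattern stays reachable through the bound method.
print(foo_searcher.__self__.pattern)

# No match returns None, just as with pattern.search().
print(foo_searcher('bar'))
```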

HTH,
John


 
William McBrine
Newsgroups: comp.lang.python
From: William McBrine <wmcbr...@users.sf.net>
Date: Wed, 26 Dec 2007 22:58:28 GMT
Local: Wed, Dec 26 2007 5:58 pm
Subject: Re: Performance on local constants?
Thanks for all the answers on this. (And sorry for the lousy Subject line; I
couldn't think of a better one.)

--
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 -- pass it on


 
Gabriel Genellina
Newsgroups: comp.lang.python
From: "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
Date: Thu, 27 Dec 2007 06:47:20 -0300
Local: Thurs, Dec 27 2007 4:47 am
Subject: Re: Performance on local constants?
En Sun, 23 Dec 2007 03:55:07 -0300, John Machin <sjmac...@lexicon.net>  
escribió:

Nice derivation! Like the word-stairs game: love -> rove -> rave -> have  
-> hate

--
Gabriel Genellina


 
Matthew Franz
Newsgroups: comp.lang.python
From: "Matthew Franz" <mdfr...@gmail.com>
Date: Thu, 27 Dec 2007 14:53:32 -0600
Local: Thurs, Dec 27 2007 3:53 pm
Subject: Re: Performance on local constants?
I get class Searcher(object) but can't for the life of me see why
(except to be intentionally obtuse) one would use the def
searcher(rex) pattern which I assume you would call with
searcher(r)(t) right?

- mdf


 
John Machin
Newsgroups: comp.lang.python
From: John Machin <sjmac...@lexicon.net>
Date: Thu, 27 Dec 2007 13:37:55 -0800 (PST)
Local: Thurs, Dec 27 2007 4:37 pm
Subject: Re: Performance on local constants?
On Dec 28, 7:53 am, "Matthew Franz" <mdfr...@gmail.com> wrote:

> I get class Searcher(object) but can't for the life of me see why
> (except to be intentionally obtuse) one would use the def
> searcher(rex) pattern which I assume you would call with
> searcher(r)(t) right?

The whole point of the thread was performance across multiple searches
for the one pattern. Thus one would NOT do
    searcher(r)(t)
each time a search was required; one would do
    s = searcher(r)
ONCE, and then do
    s(t)
each time ...
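
[Editor's sketch, not part of the original post.] A small illustration
of "build once, call many times", using the closure from earlier in the
thread. This assumes Python 3; the log lines and the pattern are made up.

```python
import re

def searcher(rex):
    crex = re.compile(rex)
    def _(txt):
        return crex.search(txt)
    return _

lines = ['error: disk full', 'ok', 'error: timeout']

# Built ONCE, outside the loop.
find_error = searcher(r'error: (\w+)')

# Called once per line; no re.compile() call happens in here.
hits = []
for line in lines:
    m = find_error(line)
    if m:
        hits.append(m.group(1))

print(hits)
```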

 
Matthew Franz
Newsgroups: comp.lang.python
From: "Matthew Franz" <mdfr...@gmail.com>
Date: Thu, 27 Dec 2007 16:04:01 -0600
Local: Thurs, Dec 27 2007 5:04 pm
Subject: Re: Performance on local constants?
Thanks, that makes more sense. I got tripped up by the function
returning a function thing and (for a while) thought _ was some sort
of spooky special variable.

- mdf

--
Matthew Franz
http://www.threatmind.net/

 