New implementation of re module
MRAB  
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Mon, 27 Jul 2009 17:34:03 +0100
Local: Mon, Jul 27 2009 12:34 pm
Subject: New implementation of re module
Hi all,

I've been working on a new implementation of the re module. The details
are at http://bugs.python.org/issue2636, specifically from
http://bugs.python.org/issue2636#msg90954. I've included a .pyd file for
Python 2.6 on Windows if you want to try it out.

I'm interested in how fast it is generally, compared with the current re
module, but especially when faced with those 'pathological' regular
expressions which seem to take a long time to finish, for example:

     re.search(r"^(.+|D)*A$", "x" * 25 + "B")

which on my PC (1.8GHz) takes 18.98secs with the re module but <0.01secs
with this new implementation.

TIA
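
The blow-up described above can be reproduced at a smaller scale with the standard re module. The sketch below is an illustration in Python 3 (the thread itself used Python 2.6); the helper name time_search and the chosen subject lengths are assumptions, kept short so the run finishes quickly. Each additional "x" should roughly double the running time, because the nested quantifiers let a backtracking engine try every way of splitting the run of x's before it gives up.

```python
import re
import time

PATTERN = r"^(.+|D)*A$"

def time_search(n):
    """Return (match, seconds) for searching n 'x' characters followed by 'B'."""
    subject = "x" * n + "B"
    start = time.perf_counter()
    match = re.search(PATTERN, subject)
    return match, time.perf_counter() - start

# Small sizes only: the cost grows exponentially with n.
for n in (12, 14, 16, 18):
    match, seconds = time_search(n)
    print(n, match, "%.4f s" % seconds)
```

The search never succeeds (the subject contains no "A"), so all of the measured time is spent backtracking.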


 
William Dode
Newsgroups: comp.lang.python
From: William Dode <w...@flibuste.net>
Date: 27 Jul 2009 19:04:59 GMT
Local: Mon, Jul 27 2009 3:04 pm
Subject: Re: New implementation of re module

On 27-07-2009, MRAB wrote:
> Hi all,

> I've been working on a new implementation of the re module. The details
> are at http://bugs.python.org/issue2636, specifically from
> http://bugs.python.org/issue2636#msg90954. I've included a .pyd file for
> Python 2.6 on Windows if you want to try it out.

Can someone remind me how to compile it (on Debian lenny), if possible
with Python 2.5? I also have Python 3.1, which I built myself...

I could test it with pytextile; I have a bunch of texts to benchmark and
compare.

Did you announce it on the unladen-swallow list? They wanted to hack on
RE also...

--
William Dodé - http://flibuste.net
Informaticien Indépendant


 
Wolfgang Rohdewald
Newsgroups: comp.lang.python
From: Wolfgang Rohdewald <wolfg...@rohdewald.de>
Date: Mon, 27 Jul 2009 21:27:55 +0200
Local: Mon, Jul 27 2009 3:27 pm
Subject: Re: New implementation of re module
On Monday 27 July 2009, MRAB wrote:

> I've been working on a new implementation of the re module. The
> details are at http://bugs.python.org/issue2636, specifically from
> http://bugs.python.org/issue2636#msg90954. I've included a .pyd
> file for Python 2.6 on Windows if you want to try it out.

how do I compile _regex.c on Linux?

--
Wolfgang


 
MRAB
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Mon, 27 Jul 2009 21:00:48 +0100
Local: Mon, Jul 27 2009 4:00 pm
Subject: Re: New implementation of re module

All I can do is point you to
http://docs.python.org/extending/extending.html.

For Linux (which I don't have) you'll need to use _regex.h and _regex.c
to compile to _regex.so instead of _regex.pyd.

> Did you announce it on the unladen-swallow list ? They wanted to hack on
> RE also...

No. I haven't subscribed to it.

 
Aahz
Newsgroups: comp.lang.python
From: a...@pythoncraft.com (Aahz)
Date: 27 Jul 2009 19:52:26 -0700
Local: Mon, Jul 27 2009 10:52 pm
Subject: Re: New implementation of re module
In article <mailman.3787.1248712420.8015.python-l...@python.org>,

MRAB  <pyt...@mrabarnett.plus.com> wrote:

>I've been working on a new implementation of the re module. The details
>are at http://bugs.python.org/issue2636, specifically from
>http://bugs.python.org/issue2636#msg90954. I've included a .pyd file for
>Python 2.6 on Windows if you want to try it out.

How does it handle the re module's unit tests?
--
Aahz (a...@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer


 
OKB (not okblacke)
Newsgroups: comp.lang.python
From: "OKB (not okblacke)" <brenNOSPAMb...@NObrenSPAMbarn.net>
Date: Tue, 28 Jul 2009 04:41:34 GMT
Local: Tues, Jul 28 2009 12:41 am
Subject: Re: New implementation of re module

        Variable-length lookbehind!  My hero!

--
--OKB (not okblacke)
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is
no path, and leave a trail."
        --author unknown


 
MRAB
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Tue, 28 Jul 2009 15:59:06 +0100
Local: Tues, Jul 28 2009 10:59 am
Subject: Re: New implementation of re module

Aahz wrote:
> In article <mailman.3787.1248712420.8015.python-l...@python.org>,
> MRAB  <pyt...@mrabarnett.plus.com> wrote:
>> I've been working on a new implementation of the re module. The details
>> are at http://bugs.python.org/issue2636, specifically from
>> http://bugs.python.org/issue2636#msg90954. I've included a .pyd file for
>> Python 2.6 on Windows if you want to try it out.

> How does it handle the re module's unit tests?

Basically, it passes all those tests I expect it to pass. :-)

It fails those where the intended behaviour has changed, such as re.sub
treating unmatched groups as empty strings, as requested in
http://bugs.python.org/issue1519638.
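
To illustrate the behaviour change mentioned above: in the standard re module of Python 3.5 and later, a group that did not take part in the match is replaced by an empty string in re.sub, whereas earlier versions raised an "unmatched group" error. A minimal sketch; the pattern and input are made up for illustration:

```python
import re

# Only one of the two groups participates in each match, so the other
# backreference refers to an unmatched group.
result = re.sub(r"(a)|(b)", r"[\1\2]", "ab")
print(result)  # [a][b] on Python 3.5+
```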


 
Christopher Arndt
Newsgroups: comp.lang.python
From: Christopher Arndt <chris.ar...@web.de>
Date: Tue, 28 Jul 2009 10:07:26 -0700 (PDT)
Local: Tues, Jul 28 2009 1:07 pm
Subject: Re: New implementation of re module
On 27 Jul., 21:27, Wolfgang Rohdewald <wolfg...@rohdewald.de> wrote:

> how do I compile _regex.c on Linux?

This simple setup.py file should do the trick:

from distutils.core import setup, Extension

setup(name='regex',
    version='1.0',
    py_modules = ['regex'],
    ext_modules=[Extension('_regex', ['_regex.c'])],
)

Also, you need to copy "unicodedata_db.h" from the "Modules" directory
of the Python source tree to your working directory, since this file
apparently is not installed into the include directory of a Python
installation.
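
A quick way to confirm whether the header is present is to ask Python for its include directory and look there. This is a sketch for a current Python 3 using the standard sysconfig module (on the Python 2.5/2.6 discussed here, distutils.sysconfig.get_python_inc() plays the same role); the variable names are illustrative.

```python
import os
import sysconfig

# Directory where this interpreter's C headers (Python.h etc.) are installed.
include_dir = sysconfig.get_paths()["include"]
header = os.path.join(include_dir, "unicodedata_db.h")

print("include directory:", include_dir)
print("unicodedata_db.h installed:", os.path.exists(header))
```

If the second line prints False, the header has to be taken from the Python source tree as described above.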

I get an error for Python 2.5 on Mac OS X 10.4 and Linux though. Seems
that the module is not compatible with Python 2.5.

_regex.c: In function 'getstring':
_regex.c:2425: error: invalid type argument of '->'
_regex.c: In function 'getstring':
_regex.c:2425: error: invalid type argument of '->'
lipo: can't figure out the architecture type of: /var/tmp//ccT3oDXD.out
error: command 'gcc' failed with exit status 1

and on Linux:

_regex.c: In function 'getstring':
_regex.c:2425: warning: implicit declaration of function 'Py_TYPE'
_regex.c:2425: error: invalid type argument of '->'
_regex.c: In function 'get_match_replacement':
_regex.c:2900: warning: implicit declaration of function 'PyLong_AsSsize_t'
error: command 'gcc' failed with exit status 1

With the official Python 2.6 distribution for Mac OS X it works.

Chris


 
MRAB
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Tue, 28 Jul 2009 18:25:43 +0100
Local: Tues, Jul 28 2009 1:25 pm
Subject: Re: New implementation of re module

The source code is intended to replace the current 're' module in Python
2.7 (and I'll be porting it to Python 3.2), so I'm not that worried
about Python versions earlier than 2.6 for testing, although if there's
sufficient need then I could tweak the sources for 2.5.

 
William Dode
Newsgroups: comp.lang.python
From: William Dode <w...@flibuste.net>
Date: 28 Jul 2009 20:05:35 GMT
Local: Tues, Jul 28 2009 4:05 pm
Subject: Re: New implementation of re module

On 28-07-2009, MRAB wrote:
> With the official Python 2.6 distribution for Mac OS X it works.

> The source code is intended to replace the current 're' module in Python
> 2.7 (and I'll be porting it to Python 3.2), so I'm not that worried
> about Python versions earlier than 2.6 for testing, although if there's
> sufficient need then I could tweak the sources for 2.5.

I understand now why I couldn't compile it!

So yes, I would like that, if it's not too much work for you.

--
William Dodé - http://flibuste.net
Informaticien Indépendant


 
Aahz
Newsgroups: comp.lang.python
From: a...@pythoncraft.com (Aahz)
Date: 28 Jul 2009 13:57:37 -0700
Local: Tues, Jul 28 2009 4:57 pm
Subject: Re: New implementation of re module
In article <mailman.3843.1248793153.8015.python-l...@python.org>,

Then you should definitely publish to PyPI and post a message to
c.l.py.announce to get more users.
--
Aahz (a...@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer


 
Mark Lawrence
Newsgroups: comp.lang.python
From: Mark Lawrence <breamore...@yahoo.co.uk>
Date: Tue, 28 Jul 2009 23:40:02 +0100
Local: Tues, Jul 28 2009 6:40 pm
Subject: Re: New implementation of re module

I tried this on my 3GHz PC; the timings were pretty much the same.

Starting from http://bugs.python.org/issue1721518 I knocked up this:

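import time
import re
import regex  # the new implementation under test

s = "Add.1, 2020 and Add.1, 2021-2023, 2025, 2028 and 2029 and Add.1) R"
# Raw string so the backslash escapes reach the regex engine unchanged.
r = r"(?:\s|,|and|Add\S*?|Parts?|\([^\)]*\)|[IV\-\d]+)*$"

# Time the new regex module first.
t0 = time.clock()
print regex.search(r, s)
t1 = time.clock()
print "time", t1 - t0

# Then the stock re module, which backtracks heavily on this input.
print "It's going to crash"
t0 = time.clock()
print re.search(r, s)
t1 = time.clock()
print "It hasn't crashed time", t1 - t0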

The output shows a slight change in timing :).

<_regex.RE_Match object at 0x0243A1A0>
time 0.00279001940191
It's going to crash
<_sre.SRE_Match object at 0x024396B0>
It hasn't crashed time 98.4238155967

> TIA

I also got the files bm_regex_effbot.py and bm_regex_v8.py from
http://code.google.com/p/unladen-swallow/source/browse/#svn/tests/per...
and ran them, then reran them having substituted regex for re.  Output
timings were roughly effbot re 0.14secs, effbot regex 1.16secs, v8 re
0.17secs and v8 regex 0.67secs.
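
For anyone repeating the comparison on a current interpreter, here is a hedged Python 3 variant of the timing script above. time.clock() was removed in Python 3.8, so it uses time.perf_counter(); the helper timed_search and the shortened subject string are assumptions made so that the stock re module finishes quickly, and the regex import is optional because it is a third-party package.

```python
import re
import time

def timed_search(search, pattern, text):
    """Run search(pattern, text) and return (match, elapsed_seconds)."""
    start = time.perf_counter()
    match = search(pattern, text)
    return match, time.perf_counter() - start

pattern = r"(?:\s|,|and|Add\S*?|Parts?|\([^\)]*\)|[IV\-\d]+)*$"
text = "Add.1, 2020 and Add.1) R"  # deliberately shortened subject

match, elapsed = timed_search(re.search, pattern, text)
print("re   ", match, elapsed)

try:
    import regex  # third-party package; skipped if not installed
except ImportError:
    regex = None

if regex is not None:
    print("regex", *timed_search(regex.search, pattern, text))
```

Lengthening the subject towards the original string makes the re timing climb steeply, so it is worth growing it a little at a time.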

HTH.

--
Kindest regards.

Mark Lawrence.


 
MRAB
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Wed, 29 Jul 2009 01:58:50 +0100
Local: Tues, Jul 28 2009 8:58 pm
Subject: Re: New implementation of re module
William Dode wrote:
> On 28-07-2009, MRAB wrote:

>> With the official Python 2.6 distribution for Mac OS X it works.
>> The source code is intended to replace the current 're' module in Python
>> 2.7 (and I'll be porting it to Python 3.2), so I'm not that worried
>> about Python versions earlier than 2.6 for testing, although if there's
>> sufficient need then I could tweak the sources for 2.5.

> I understand now why I couldn't compile it!

> So yes, I would like that, if it's not too much work for you.

There's a new version which should compile with Python 2.5 now.

 
Mike
Newsgroups: comp.lang.python
From: Mike <tutu...@gmail.com>
Date: Wed, 29 Jul 2009 08:24:52 -0700 (PDT)
Local: Wed, Jul 29 2009 11:24 am
Subject: Re: New implementation of re module
On Jul 27, 11:34 am, MRAB <pyt...@mrabarnett.plus.com> wrote:

> I've been working on a new implementation of the re module.

Fabulous!

If you're extending/changing the interface, there are a couple of sore
points in the current implementation I'd love to see addressed:

- findall/finditer doesn't find overlapping matches.  Sometimes you
really *do* want to know all possible matches, even if they overlap.
This comes up in bioinformatics, for example.

- split won't split on empty patterns, e.g. empty lookahead patterns.
This means that it can't be used for a whole class of interesting
cases.  This has been discussed previously:

    http://bugs.python.org/issue3262
    http://bugs.python.org/issue852532
    http://bugs.python.org/issue988761

- It'd be nice to have a version of split that generates the parts
(one by one) rather than returning the whole list.

- Repeated subgroup match information is not available.  That is, for
a match like this

    re.match('(.){3}', 'xyz')

there's no way to discover that the subgroup first matched 'x', then
matched 'y', and finally matched 'z'.  Here is one past proposal
(mine), perhaps over-complex, to address this problem:

    http://mail.python.org/pipermail/python-dev/2004-August/047238.html
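
On the first point in the list above (overlapping matches), a common workaround with the current re module is to put the pattern in a capturing group inside a zero-width lookahead, so that each match consumes no characters and the scan advances one position at a time. A small sketch with a made-up DNA-style string:

```python
import re

sequence = "ATATATA"

# Plain findall skips any match that overlaps an earlier one.
plain = re.findall(r"ATA", sequence)

# A capturing group inside a lookahead reports every starting position.
overlapping = [(m.start(), m.group(1))
               for m in re.finditer(r"(?=(ATA))", sequence)]

print(plain)        # ['ATA', 'ATA']
print(overlapping)  # [(0, 'ATA'), (2, 'ATA'), (4, 'ATA')]
```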

Mike


 
MRAB
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Wed, 29 Jul 2009 16:45:55 +0100
Local: Wed, Jul 29 2009 11:45 am
Subject: Re: New implementation of re module
Mike wrote:
> On Jul 27, 11:34 am, MRAB <pyt...@mrabarnett.plus.com> wrote:
>> I've been working on a new implementation of the re module.

> Fabulous!

> If you're extending/changing the interface, there are a couple of sore
> points in the current implementation I'd love to see addressed:

> - findall/finditer doesn't find overlapping matches.  Sometimes you
> really *do* want to know all possible matches, even if they overlap.
> This comes up in bioinformatics, for example.

Perhaps by adding "overlapped=True"?

> - split won't split on empty patterns, e.g. empty lookahead patterns.
> This means that it can't be used for a whole class of interesting
> cases.  This has been discussed previously:

>     http://bugs.python.org/issue3262
>     http://bugs.python.org/issue852532
>     http://bugs.python.org/issue988761

Already addressed (see issue2636 for the full details).

> - It'd be nice to have a version of split that generates the parts
> (one by one) rather than returning the whole list.

Hmm, re.splititer() perhaps.
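
A generator-style split can be approximated today on top of re.finditer. The function name split_iter below is hypothetical, not an existing API, and this sketch ignores capture groups and maxsplit, both of which re.split handles:

```python
import re

def split_iter(pattern, text):
    """Lazily yield the pieces of text between successive matches of pattern."""
    last = 0
    for m in re.finditer(pattern, text):
        yield text[last:m.start()]
        last = m.end()
    # Whatever follows the final match (the whole text if nothing matched).
    yield text[last:]

parts = split_iter(r",\s*", "a, b,c")
print(next(parts))   # a
print(list(parts))   # ['b', 'c']
```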

> - Repeated subgroup match information is not available.  That is, for
> a match like this

>     re.match('(.){3}', 'xyz')

> there's no way to discover that the subgroup first matched 'x', then
> matched 'y', and finally matched 'z'.  Here is one past proposal
> (mine), perhaps over-complex, to address this problem:

>     http://mail.python.org/pipermail/python-dev/2004-August/047238.html

Yikes! I think I'll let you code that... :-)

 
Mike
Newsgroups: comp.lang.python
From: Mike <tutu...@gmail.com>
Date: Wed, 29 Jul 2009 10:21:39 -0700 (PDT)
Local: Wed, Jul 29 2009 1:21 pm
Subject: Re: New implementation of re module
On Jul 29, 10:45 am, MRAB <pyt...@mrabarnett.plus.com> wrote:

> Mike wrote:
> > - findall/finditer doesn't find overlapping matches.  Sometimes you
> > really *do* want to know all possible matches, even if they overlap.

> Perhaps by adding "overlapped=True"?

Something like that would be great, yes.

> > - split won't split on empty patterns, e.g. empty lookahead patterns.
> Already addressed (see issue2636 for the full details).

Glad to hear it.

> > - Repeated subgroup match information is not available.  That is, for
> > a match like this

> >     re.match('(.){3}', 'xyz')

> > there's no way to discover that the subgroup first matched 'x', then
> > matched 'y', and finally matched 'z'.  Here is one past proposal
> > (mine), perhaps over-complex, to address this problem:

> >    http://mail.python.org/pipermail/python-dev/2004-August/047238.html

> Yikes! I think I'll let you code that... :-)

I agree that that document looks a little scary--maybe I was trying to
bite off too much at once.

My intuition, though, is that the basic idea should be fairly simple
to implement, at least for a depth-first matcher.  The repeated match
subgroups are already being discovered, it's just that they're not
being saved, so there's no way to report them out once a complete
match is found.  If some trail of breadcrumbs were pushed onto a stack
during the DFS, it could be traced at the end.  And the whole thing
might not even be that expensive to do.

The hardest parts about this, in my mind, are figuring out how to
report the repeated matches out in a useful form (hence all that
detail in the proposal), and getting users to understand that using
this feature *could* suck up a lot of memory, if they're not careful.

As always, it's possible that my intuition is totally wrong.  Plus I'm
not sure how this would work out in the breadth-first case.

Details aside, I would really, really, really like to have a way to
get at the repeated subgroup matches.  I write a lot of code that
would be one-liners if this capability existed.  Plus, it just plain
burns me that Python is discovering this information but impudently
refuses to tell me what it's found!  ;-)
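
As a concrete illustration of the limitation described above: with the current re module only the last repetition of a group is kept. In simple cases the individual repetitions can be recovered by re-scanning the matched text with the inner pattern. This is only a sketch of that workaround and does not generalise to patterns whose inner group depends on surrounding context. (The third-party regex package, if installed, exposes the repetitions through a captures() method on match objects.)

```python
import re

m = re.match(r"(.){3}", "xyz")
print(m.group(1))   # z  (only the final repetition survives)

# Workaround: re-scan the overall match with the inner pattern.
repeats = re.findall(r".", m.group(0))
print(repeats)      # ['x', 'y', 'z']
```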


 
Wolfgang Rohdewald
Newsgroups: comp.lang.python
From: Wolfgang Rohdewald <wolfg...@rohdewald.de>
Date: Thu, 30 Jul 2009 12:35:22 +0200
Local: Thurs, Jul 30 2009 6:35 am
Subject: Re: New implementation of re module
On Tuesday 28 July 2009, Christopher Arndt wrote:

> setup(name='regex',
>     version='1.0',
>     py_modules = ['regex'],
>     ext_modules=[Extension('_regex', ['_regex.c'])],
> )

> Also, you need to copy "unicodedata_db.h" from the "Modules"
> directory of the Python source tree to your working directory,
> since this file apparently is not installed into the include
> directory of a Python installation.

using issue2636-20090729.zip

I have Python 2.6.2 on ubuntu 9.04

with gcc-4.3:
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c _regex.c -o
build/temp.linux-i686-2.6/_regex.o
_regex.c: In function 'bmatch_context':
_regex.c:1462: error: lvalue required as increment operand
_regex.c:1470: error: lvalue required as increment operand
_regex.c:1478: error: lvalue required as decrement operand

with gcc-4.4:
gcc-4.4 -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c _regex.c -o
build/temp.linux-i686-2.6/_regex.o
_regex.c: In function ‘bmatch_context’:
_regex.c:1462: error: lvalue required as increment operand
_regex.c:1470: error: lvalue required as increment operand
_regex.c:1478: error: lvalue required as decrement operand

--
Wolfgang


 
MRAB
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Thu, 30 Jul 2009 12:05:58 +0100
Local: Thurs, Jul 30 2009 7:05 am
Subject: Re: New implementation of re module

There are other lines which are similar, eg line 1487. Do they all give
the same/similar error with your compiler?

 
Wolfgang Rohdewald
Newsgroups: comp.lang.python
From: Wolfgang Rohdewald <wolfg...@rohdewald.de>
Date: Thu, 30 Jul 2009 14:29:07 +0200
Local: Thurs, Jul 30 2009 8:29 am
Subject: Re: New implementation of re module
On Thursday 30 July 2009, MRAB wrote:

> There are other lines which are similar, eg line 1487. Do they all
> give the same/similar error with your compiler?

yes. The full output with gcc-4.3:

notebook:~/kmj/src$ LANG=C python setup.py  build
running build
running build_py
running build_ext
building '_regex' extension
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c _regex.c -o
build/temp.linux-i686-2.6/_regex.o
_regex.c: In function 'bmatch_context':
_regex.c:1462: error: lvalue required as increment operand
_regex.c:1470: error: lvalue required as increment operand
_regex.c:1478: error: lvalue required as decrement operand
_regex.c:1487: error: lvalue required as decrement operand
_regex.c:1593: error: lvalue required as increment operand
_regex.c:1606: error: lvalue required as decrement operand
_regex.c:1616: error: lvalue required as increment operand
_regex.c:1625: error: lvalue required as increment operand
_regex.c:1634: error: lvalue required as decrement operand
_regex.c:1643: error: lvalue required as decrement operand
_regex.c:2036: error: lvalue required as increment operand
_regex.c:2047: error: lvalue required as increment operand
_regex.c:2059: error: lvalue required as decrement operand
_regex.c:2070: error: lvalue required as decrement operand
_regex.c:2316: error: lvalue required as increment operand
In file included from _regex.c:2431:
_regex.c: In function 'umatch_context':
_regex.c:1462: error: lvalue required as increment operand
_regex.c:1470: error: lvalue required as increment operand
_regex.c:1478: error: lvalue required as decrement operand
_regex.c:1487: error: lvalue required as decrement operand
_regex.c:1593: error: lvalue required as increment operand
_regex.c:1606: error: lvalue required as decrement operand
_regex.c:1616: error: lvalue required as increment operand
_regex.c:1625: error: lvalue required as increment operand
_regex.c:1634: error: lvalue required as decrement operand
_regex.c:1643: error: lvalue required as decrement operand
_regex.c:2036: error: lvalue required as increment operand
_regex.c:2047: error: lvalue required as increment operand
_regex.c:2059: error: lvalue required as decrement operand
_regex.c:2070: error: lvalue required as decrement operand
_regex.c:2316: error: lvalue required as increment operand
error: command 'gcc' failed with exit status 1

--
Wolfgang


 
MRAB
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Thu, 30 Jul 2009 13:56:32 +0100
Local: Thurs, Jul 30 2009 8:56 am
Subject: Re: New implementation of re module

So it complains about:

     ++(RE_CHAR*)context->text_ptr

but not about:

     ++info->repeat.count

Does this mean that the gcc compiler thinks that the cast makes it an
rvalue? I'm using Visual C++ 2008 Express Edition, which doesn't
complain. What does the C standard say?


 
Wolfgang Rohdewald
Newsgroups: comp.lang.python
From: Wolfgang Rohdewald <wolfg...@rohdewald.de>
Date: Thu, 30 Jul 2009 15:18:27 +0200
Local: Thurs, Jul 30 2009 9:18 am
Subject: Re: New implementation of re module
On Thursday 30 July 2009, MRAB wrote:

> So it complains about:

>      ++(RE_CHAR*)context->text_ptr

> but not about:

>      ++info->repeat.count

> Does this mean that the gcc compiler thinks that the cast makes it
> an rvalue? I'm using Visual C++ 2008 Express Edition, which doesn't
> complain. What does the C standard say?

I am not really a C expert but I found some links. Most helpful:
http://developer.apple.com/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc...

(search -fnon-lvalue-assign)

so I did the conversion mentioned there. This works:

--- _regex.c    2009-07-29 11:34:00.000000000 +0200
+++ n   2009-07-30 15:15:22.000000000 +0200
@@ -1459,7 +1459,7 @@
             if (text_ptr < (RE_CHAR*)context->slice_end && text_ptr[0] != '\n')
               {
                 context->node = node->next_1;
-                ++(RE_CHAR*)context->text_ptr;
+                ++*(RE_CHAR**)&context->text_ptr;
             } else
                 context = reject_context(state, context);
             break;

--
Wolfgang


 
Wolfgang Rohdewald
Newsgroups: comp.lang.python
From: Wolfgang Rohdewald <wolfg...@rohdewald.de>
Date: Thu, 30 Jul 2009 15:24:28 +0200
Local: Thurs, Jul 30 2009 9:24 am
Subject: Re: New implementation of re module
On Thursday 30 July 2009, Wolfgang Rohdewald wrote:

> so I did the conversion mentioned there. This works:

I actually do not know if it works - but it compiles.

--
Wolfgang


 
Piet van Oostrum
Newsgroups: comp.lang.python
From: Piet van Oostrum <p...@cs.uu.nl>
Date: Thu, 30 Jul 2009 15:39:35 +0200
Local: Thurs, Jul 30 2009 9:39 am
Subject: Re: New implementation of re module

>>>>> MRAB <pyt...@mrabarnett.plus.com> (M) wrote:
>M> Hi all,
>M> I've been working on a new implementation of the re module. The details
>M> are at http://bugs.python.org/issue2636, specifically from
>M> http://bugs.python.org/issue2636#msg90954. I've included a .pyd file for
>M> Python 2.6 on Windows if you want to try it out.
>M> I'm interested in how fast it is generally, compared with the current re
>M> module, but especially when faced with those 'pathological' regular
>M> expressions which seem to take a long time to finish, for example:
>M>     re.search(r"^(.+|D)*A$", "x" * 25 + "B")
>M> which on my PC (1.8GHz) takes 18.98secs with the re module but <0.01secs
>M> with this new implementation.

Is this version also going to use the Thompson approach?
--
Piet van Oostrum <p...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org

 
Hrvoje Niksic
Newsgroups: comp.lang.python
From: Hrvoje Niksic <hnik...@xemacs.org>
Date: Thu, 30 Jul 2009 20:32:10 +0200
Local: Thurs, Jul 30 2009 2:32 pm
Subject: Re: New implementation of re module

MRAB <pyt...@mrabarnett.plus.com> writes:
> So it complains about:

>     ++(RE_CHAR*)context->text_ptr

> but not about:

>     ++info->repeat.count

> Does this mean that the gcc compiler thinks that the cast makes it an
> rvalue?

The cast operator does yield an rvalue; treating it as an lvalue used to
be an extension in popular compilers, including, ironically, gcc.  The
standard-compliant way of writing the above would be:

++ *(RE_CHAR **) &context->text_ptr


 
MRAB
Newsgroups: comp.lang.python
From: MRAB <pyt...@mrabarnett.plus.com>
Date: Thu, 30 Jul 2009 22:44:31 +0100
Local: Thurs, Jul 30 2009 5:44 pm
Subject: Re: New implementation of re module
Wolfgang Rohdewald wrote:
> On Thursday 30 July 2009, Wolfgang Rohdewald wrote:
>> so I did the conversion mentioned there. This works:

> I actually do not know if it works - but it compiles.

Yes, it works. I've updated my code accordingly and it'll be in the next
release.

 