I have an if-elif chain in which I'd like to match a string against
several regular expressions. Also I'd like to use the match groups
within the respective elif... block. The C-like idiom that I would
like to use is this:
    if (match = my_re1.match(line)):
        # use match
    elif (match = my_re2.match(line)):
        # use match
    elif (match = my_re3.match(line)):
        # use match
...but this is illegal in Python. The other way is to open up an else:
block in each level, do the assignment and then the test. This
unnecessarily leads to deeper and deeper nesting levels which I find
ugly. Just as ugly as first testing against the RE in the elif: clause
and then, if it matches, to re-evaluate the RE to access the match
groups.
Thanks,
robert
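For comparison, the nested else: variant described in the question might look like the sketch below. The three patterns and the classify() helper are made up for illustration and are not from the original post; the code is written for Python 3.

```python
import re

# Hypothetical patterns, for illustration only.
my_re1 = re.compile(r"(\d+)")
my_re2 = re.compile(r"([a-z]+)")
my_re3 = re.compile(r"(\s+)")

def classify(line):
    """Assign first, then test; each further pattern adds a nesting level."""
    kind = None
    match = my_re1.match(line)
    if match:
        kind = "number"
    else:
        match = my_re2.match(line)
        if match:
            kind = "word"
        else:
            match = my_re3.match(line)
            if match:
                kind = "space"
    # match is None here if no pattern matched
    return kind, (match.group(1) if match else None)

result = classify("abc 123")
```

Each pattern is evaluated only once, but the indentation grows with every additional pattern, which is the objection raised above.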
This might help:
-----------
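import re

s = "foo"

class Tester(object):
    def __call__(self, pattern):
        # remember the match object so it can be used after the test
        self.m = re.match(pattern, s)
        return self.m is not None

    def __getattr__(self, name):
        # delegate group(), groups(), ... to the last match object
        return getattr(self.m, name)

test = Tester()

if test("bar"):
    print "wrong"
elif test("foo"):
    print "right"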
-------------
Diez
How about this (untested) code:
    for regex in (re1, re2, re3):   # not named "re", to avoid hiding the re module
        match = regex.match(line)
        if match:
            # use it
            break
This requires that "use it" means the same for each regular expression
though...
Uli
<ot>
Isn't it the third or fourth time this very same question pops up here?
Starts to look like a FAQ.
</ot>
The canonical solution is to iterate over a list of (expression, function)
pairs, i.e.:

    def use_match1(match):
        pass  # code here

    def use_match2(match):
        pass  # code here

    def use_match3(match):
        pass  # code here

    for exp, func in [
        (my_re1, use_match1),
        (my_re2, use_match2),
        (my_re3, use_match3),
    ]:
        match = exp.match(line)
        if match:
            func(match)
            break
The alternate solution is Diez's Test object.
HTH
> The canonical solution is to iterate over a list of
> expression,function pairs, ie:
Although that solution is pretty, it is not the canonical solution
because it doesn't cover the important case of "if" bodies needing to
access common variables in the enclosing scope. (This will be easier
in Python 3 with 'nonlocal', though.) The snippet posted by Diez is
IMHO closer to a canonical solution to this FAQ.
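To illustrate the point about enclosing-scope variables: in Python 3 the handler functions can rebind a variable of the enclosing function once it is declared nonlocal. A minimal sketch, with made-up patterns and a hypothetical summarize() helper:

```python
import re

# Hypothetical patterns, for illustration only.
my_re1 = re.compile(r"(\d+)")
my_re2 = re.compile(r"([a-z]+)")

def summarize(lines):
    numbers = 0   # shared state that the handlers update
    words = 0

    def use_match1(match):
        nonlocal numbers          # required to rebind the enclosing variable
        numbers += int(match.group(1))

    def use_match2(match):
        nonlocal words
        words += 1

    for line in lines:
        for exp, func in [(my_re1, use_match1), (my_re2, use_match2)]:
            match = exp.match(line)
            if match:
                func(match)
                break             # first matching pattern wins
    return numbers, words

totals = summarize(["10", "abc", "5", "?"])
```

Without nonlocal (that is, in Python 2), the handlers could read numbers and words but not rebind them, so a mutable container or an object attribute would be needed instead.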
Hello everybody,
thanks for the various answers. I'm actually pretty puzzled because I
expected to see some obvious solution that I just hadn't found before.
In general I find Python more elegant and syntactically richer than C
(that's where I come from), so I didn't expect the solutions to be a
lot more verbose and/or ugly (no offense) than the original idea which
would have worked if Python's assignment statement would double as
expression, as in C.
Thanks again,
robert
PS: Since I'm testing only three REs, and I only need the match
results from one of them, I just re-evaluate that one.
Is it really a lot to change to have it
    if my_re1.match(line):
        match = my_re1.match(line)
    elif my_re2.match(line):
        match = my_re2.match(line)
    elif my_re3.match(line):
        match = my_re3.match(line)
?
That reads clearly to me...
> On May 21, 1:47 pm, Hrvoje Niksic <hnik...@xemacs.org> wrote:
>
>> Although that solution is pretty, it is not the canonical solution
>> because it doesn't cover the important case of "if" bodies needing to
>> access common variables in the enclosing scope. (This will be easier
>> in Python 3 with 'nonlocal', though.) The snippet posted by Diez is
>> IMHO closer to a canonical solution to this FAQ.
>
> Hello everybody,
>
> thanks for the various answers. I'm actually pretty puzzled because I
> expected to see some obvious solution that I just hadn't found before.
> In general I find Python more elegant and syntactically richer than C
> (that's where I come from), so I didn't expect the solutions to be a
> lot more verbose and/or ugly (no offense) than the original idea which
> would have worked if Python's assignment statement would double as
> expression, as in C.
Well, it's a design-decision - and I'm pretty ok with it being a bit verbose
here - as it prevents a *great* deal of programming errors that would
otherwise happen from accidentally writing a = b where a == b was meant.
One could argue that regular expressions - which seem to be THE case where
it bugs people - should offer a standard way that essentially works as my
solution - by keeping state around, making series of tests easier.
Diez
And it wastes time. Regular expressions can be expensive to match, so doing
it twice might hurt.
Diez
    match = (my_re1.match(line) or my_re2.match(line) or
             my_re3.match(line))
?
Depends if the OP wants to know that...
Well, in *general* one wants that. So as a general-purpose solution this is
certainly *not* the way to go.
Diez
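For what it's worth, the or-chain does not completely lose the information about which pattern matched: a match object keeps a reference to the compiled pattern that produced it in its re attribute. A small sketch under that observation, written for Python 3, with made-up patterns and a hypothetical first_match() helper:

```python
import re

# Hypothetical patterns, for illustration only.
my_re1 = re.compile(r"\d+")
my_re2 = re.compile(r"[a-z]+")
my_re3 = re.compile(r"\s+")

def first_match(line):
    # 'or' returns the first truthy operand; match objects are always truthy,
    # and the result is None when no pattern matches.
    return my_re1.match(line) or my_re2.match(line) or my_re3.match(line)

match = first_match("abc 123")

# match.re is the compiled pattern object that produced this match
matched_second = match is not None and match.re is my_re2
```

This still ends in a chain of comparisons against match.re, so it only helps when the per-pattern handling is light.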
One thing I hate about C is assignment in expressions... Forcing
myself to write
    0 == Something
rather than
    Something == 0
just to make sure I wasn't mistakenly assigning values in statements is
annoying; it ruins the ease of reading.
I kind of agree with the select:case, but I think a key issue is how
to implement it. Elif is reasonable for now.
Diez, true I guess, but then we haven't seen what these expressions
are, and why there has to be three.
Since the OP is processing alternative regexp's, he may be drifting
close to parsing territory. Here is a pyparsing example showing how
to match one of 'n' alternatives, and then apply functional logic
depending on which expression was matched. This example actually
shows 4 ways to address this question (in roughly increasing order of
OO-ness):
- explicit if/elif/... test on the name of the specific match to see
which alternative matched
- more Pythonic dispatch using a dict of names and corresponding
expression processing functions
- parse-time processing using pyparsing parse actions
- parse actions invoke class constructors, to return expression-
specific objects
-- Paul
from pyparsing import oneOf, Combine, Optional, Word, \
    alphas, nums, alphanums, QuotedString

# define basic expressions
sign = oneOf("+ -")
integer = Combine(Optional(sign) + Word(nums))
real = Combine(Optional(sign) + Word(nums) + "." +
               Optional(Word(nums)))
name = Word(alphas,alphanums) | QuotedString("'")

# define alternates with results names
item = real("real") | integer("integer") | name("name")

print "\nUse results names to determine which pattern matched"
for it in item.searchString("abc 123 -3.14 'phase of the moon'"):
    if it.getName() == "real":
        print "Real:", (float(it[0]))
    elif it.getName() == "integer":
        print "Int:", (int(it[0]))
    else:
        print "String:", it[0]

print "\nUse dict to dispatch to type-specific functions " \
      "- more Pythonically canonical"
def processRealItem(it):
    print "Real:", (float(it[0]))

def processIntegerItem(it):
    print "Int:", (int(it[0]))

def processNameItem(it):
    print "String:", it[0]

for it in item.searchString("abc 123 -3.14 'phase of the moon'"):
    {"real"    : processRealItem,
     "integer" : processIntegerItem,
     "name"    : processNameItem }[it.getName()](it)

print "\nMove expression-specific logic into parse-time parse actions"
def convertInt(t):
    return int(t[0])

def convertReal(t):
    return float(t[0])

integer.setParseAction(convertInt)
real.setParseAction(convertReal)
item = real("real") | integer("integer") | name("name")

for it in item.searchString("abc 123 -3.14 'phase of the moon'"):
    print "%s: %s %s" % (it.getName(), it[0], type(it[0]))

print "\nUse class constructors as parse-time parse actions " \
      "- results names no longer needed"
class IntObject(object):
    def __init__(self,t):
        self.type, self.value = "Int", int(t[0])

class RealObject(object):
    def __init__(self,t):
        self.type, self.value = "Real", float(t[0])

class StringObject(object):
    def __init__(self,t):
        self.type, self.value = "String", t[0]

integer.setParseAction(IntObject)
real.setParseAction(RealObject)
name.setParseAction(StringObject)
item = real | integer | name

for it in item.searchString("abc 123 -3.14 'phase of the moon'"):
    print "%s: %s (%s)" % (it[0].type, it[0].value,
                           it[0].__class__.__name__)

Prints:
Use results names to determine which pattern matched
String: abc
Int: 123
Real: -3.14
String: phase of the moon
Use dict to dispatch to type-specific functions - more Pythonically canonical
String: abc
Int: 123
Real: -3.14
String: phase of the moon
Move expression-specific logic into parse-time parse actions
name: abc <type 'str'>
integer: 123 <type 'int'>
real: -3.14 <type 'float'>
name: phase of the moon <type 'str'>
Use class constructors as parse-time parse actions - results names no longer needed
String: abc (StringObject)
Int: 123 (IntObject)
Real: -3.14 (RealObject)
String: phase of the moon (StringObject)
You are perfectly correct. Python's design is lacking here IMO. But
what is your question?
Interesting trick, I've never thought of that or seen it.
Although if Python implemented it, I think it should default to giving
warnings when you use = in an expression; that way you don't have to worry.
That introduces complications though: do you want to see a pageful of
warnings every time you import a module that uses ='s?
You could specify in your Python file that you want to suppress that
warning, but then you'd never know when you used = by accident when you
meant to use ==.
Anyway, I was thinking you could have a second assignment operator to use
just in expressions, and only allow that. It could be := since some
languages use that. I wouldn't like it as a general assignment
operator, but assignment in expressions is a special case. Also <- or ->.
C uses -> for member access through a pointer, but I think
math/calculators use it for assignment.
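Python 3.8 later added an assignment expression operator spelled := (PEP 572), so on a recent interpreter the idiom from the original question can be written almost literally. A minimal sketch, assuming Python 3.8 or newer; the patterns and the classify() helper are hypothetical:

```python
import re

# Hypothetical patterns, for illustration only.
my_re1 = re.compile(r"(\d+)")
my_re2 = re.compile(r"([a-z]+)")
my_re3 = re.compile(r"(\s+)")

def classify(line):
    # ':=' binds the match object and yields it as the value being tested
    if (match := my_re1.match(line)):
        return "number", match.group(1)
    elif (match := my_re2.match(line)):
        return "word", match.group(1)
    elif (match := my_re3.match(line)):
        return "space", match.group(1)
    return None, None

kind, text = classify("abc 123")
```

Because the operator is := rather than =, a mistyped == cannot silently turn into an assignment, which addresses the concern raised earlier in the thread.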
My preference would be ?=.
    if match ?= my_re1.match(line):
        # use match
    elif match ?= my_re2.match(line):
        # use match
    elif match ?= my_re3.match(line):
        # use match
> if (match = my_re1.match(line)):
>     # use match
> elif (match = my_re2.match(line)):
>     # use match
> elif (match = my_re3.match(line)):
>     # use match
>
> ...but this is illegal in Python.
Assignment expressions are disallowed in Python to protect against a
very common bug in C/C++ programs, the (accidental) confusion of
    if (match = my_re1.match(line))
with
    if (match == my_re1.match(line))
or vice versa.
You could use named groups to search for all three patterns at once
like this:
Original:
    prog1 = re.compile(r'pat1')
    prog2 = re.compile(r'pat2')
    prog3 = re.compile(r'pat3')
    ...
Becomes:
    prog = re.compile(r'(?P<p1>pat1)|(?P<p2>pat2)|(?P<p3>pat3)')
    match = prog.match(line)
    if match:   # match is None when none of the alternatives matched
        for p in 'p1 p2 p3'.split():
            # a group that did not take part in the match is None
            if match.groupdict()[p] is not None:
                do_something_for_prog(p)
- Paddy.
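A variation on the combined pattern: the match object's lastgroup attribute names the last named group that matched, which for a top-level alternation like this one is the alternative that succeeded, so the loop over group names can be skipped. A minimal sketch for Python 3; the concrete sub-patterns and the which_pattern() helper are placeholders, not part of the post above:

```python
import re

# Placeholder sub-patterns standing in for pat1, pat2 and pat3.
prog = re.compile(r"(?P<p1>\d+)|(?P<p2>[a-z]+)|(?P<p3>\s+)")

def which_pattern(line):
    match = prog.match(line)
    if match is None:
        return None
    name = match.lastgroup          # "p1", "p2" or "p3"
    return name, match.group(name)

found = which_pattern("abc 123")
```

The returned name can then be used as a key into a dict of handler functions.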
This is just a syntactical issue. But what is the *value* of an
assignment? In Python it has none: assignments are statements, not
expressions.
However Guido and team have found a *pragmatic* solution for this at
another place:
    with open("myFile") as f:
        BLOCK
Compare this with a possible syntactical form of an if-statement:
    if EXPR as NAME:
        BLOCK
This isn't ugly syntax-wise. It's just a bit harder to understand the
semantics of an if-statement. It might read like this:
"Evaluate EXPR and compute bool(EXPR). If this value is True, assign
the value of EXPR to NAME and execute BLOCK. Otherwise skip both the
assignment and BLOCK execution."
Maybe assignment can be performed unconditionally as in the C case.
I'm not sure about this.
But that is an error that occurs because of the specific symbols chosen
to represent an assignment or equality test. At the time Python was
first designed, other symbols for assignment, like := and <-, were
already in use in other languages.
So if preventing errors was the main motivation, why not allow an
assignment to be an expression but use a symbol for the assignment
that would prevent that kind of error?
I find it hard to believe that a design choice like whether or not
to have assignment behave as an expression was decided
on the grounds of a particular lexical representation of the assignment
symbol.
--
Antoon Pardon
Try this.
-- Paul
class TestValue(object):
    """Class to support assignment and test in single operation"""
    def __init__(self,v=None):
        self.value = v

    # Add support for quasi-assignment syntax using '<<' in place of '='.
    def __lshift__(self,other):
        self.value = other
        return bool(self.value)

import re

tv = TestValue()
integer = re.compile(r"[-+]?\d+")
real = re.compile(r"[-+]?\d*\.\d+")
word = re.compile(r"\w+")

for inputValue in ("123 abc 3.1".split()):
    if (tv << real.match(inputValue)):
        print "Real", float(tv.value.group())
    elif (tv << integer.match(inputValue)):
        print "Integer", int(tv.value.group())
    elif (tv << word.match(inputValue)):
        print "Word", tv.value.group()

Prints:
Integer 123
Word abc
Real 3.1
class TestValue(object):
    """Class to support assignment and test in single operation"""
    def __init__(self,v=None):
        self.value = v

    # Add support for quasi-assignment syntax using '<<' in place of '='.
    def __lshift__(self,other):
        self.value = other
        return self

    def __bool__(self):
        return bool(self.value)
    __nonzero__ = __bool__
-- Paul
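A usage sketch for this second variant under Python 3, where truth testing goes through __bool__ (Python 2 uses __nonzero__, hence the alias). The describe() helper is made up for illustration; the three patterns are the ones from the earlier post:

```python
import re

class TestValue(object):
    """Holds a value so it can be assigned and tested in one expression."""
    def __init__(self, v=None):
        self.value = v

    def __lshift__(self, other):
        # 'tv << x' stores x and returns tv itself, so 'if' tests tv
        self.value = other
        return self

    def __bool__(self):
        return bool(self.value)
    __nonzero__ = __bool__   # Python 2 spelling of the same hook

tv = TestValue()
integer = re.compile(r"[-+]?\d+")
real = re.compile(r"[-+]?\d*\.\d+")
word = re.compile(r"\w+")

def describe(token):
    # hypothetical helper: classify a token and convert its text
    if tv << real.match(token):
        return "Real", float(tv.value.group())
    elif tv << integer.match(token):
        return "Integer", int(tv.value.group())
    elif tv << word.match(token):
        return "Word", tv.value.group()
    return None

results = [describe(token) for token in "123 abc 3.1".split()]
```

Returning self from __lshift__ rather than a bool means the expression can also be chained or stored, at the cost of one more special method.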