
regex function driving me nuts


MartinD.

Oct 23, 2012, 3:51:34 PM
Hi,

I'm new to Python.
Does someone have an idea what's wrong? I tried everything. The only regex that is tested is the last one in a whole list of regexes in keywords1.txt
Thanks!
Martin


########
import re

def checkKeywords( str, lstKeywords ):

    for regex in lstKeywords:
        match = re.search(regex, str,re.IGNORECASE)
        # If-statement after search() tests if it succeeded
        if match:
            print match.group() ##just debugging
            return match.group() ## 'found!

    return

#########

keywords1 = [line for line in open('keywords1.txt')]
resultKeywords1 = checkKeywords("string_to_test",keywords1)
print resultKeywords1

Prasad, Ramit

Oct 23, 2012, 4:29:10 PM
to pytho...@python.org
Hi Martin,
It is always helpful to provide the Python version, operating system version, full error message,
and input/expected output for the code. I can tell you are using Python 2.x, but
without any clue about what is in keywords1.txt it is impossible to figure out
what the problem might be. Other than using regular expressions, that is. :)

Ramit Prasad



Ian Kelly

Oct 23, 2012, 4:32:31 PM
to Python
On Tue, Oct 23, 2012 at 1:51 PM, MartinD. <cyber...@gmail.com> wrote:
> Hi,
>
> I'm new to Python.
> Does someone have an idea what's wrong? I tried everything. The only regex that is tested is the last one in a whole list of regexes in keywords1.txt
> Thanks!
> Martin

How do you know that it's the only one being tested? Your debugging
statement only prints one that actually matches, not each one that is
tried.
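
One way to check which patterns are actually tried is to print every pattern before searching, not only the ones that match. The following is a minimal sketch: the helper name check_keywords and the sample patterns are made up for illustration and are not from the original post. It is written so that it should run on both Python 2.7 and Python 3.

```python
import re

def check_keywords(text, patterns):
    """Return the first matched text, printing every pattern as it is tried."""
    for pattern in patterns:
        # repr() shows the pattern exactly as stored, including any whitespace.
        print("trying %r" % pattern)
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            return match.group()
    return None

# Hypothetical patterns; the real contents of keywords1.txt are not shown.
result = check_keywords("string_to_test", ["foo\n", "string\n", "test"])
print(result)
```

With this kind of output, every pattern appears in the log whether or not it matches, so it becomes clear that all of them are tested.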

> keywords1 = [line for line in open('keywords1.txt')]

Note that each "keyword" will include the trailing newline, if any.
This is probably why you are only seeing the last keyword match:
because it is the only one without a trailing newline.

To remove the newlines, call the str.strip or str.rstrip method on
each line, or use the str.splitlines method to get the keywords:

keywords1 = open('keywords1.txt').read().splitlines()
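
The effect of the trailing newline can be reproduced without the file. This is only a sketch: the list raw_lines is a made-up stand-in for what iterating over keywords1.txt might yield, and the test string is invented as well.

```python
import re

# Hypothetical stand-in for the lines of keywords1.txt; the real file is not shown.
# Only the last entry lacks a trailing newline, as with a file that has no final newline.
raw_lines = ["foo\n", "bar\n", "string_to_test"]

text = "string_to_test and foo"

# With the newline kept, "foo\n" only matches where "foo" is followed by a newline.
raw_hits = [p for p in raw_lines if re.search(p, text, re.IGNORECASE)]

# After removing the newline, "foo" matches as expected.
stripped = [line.rstrip("\n") for line in raw_lines]
stripped_hits = [p for p in stripped if re.search(p, text, re.IGNORECASE)]

print(raw_hits)       # ['string_to_test']
print(stripped_hits)  # ['foo', 'string_to_test']
```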

Vlastimil Brom

Oct 23, 2012, 4:36:40 PM
to MartinD., pytho...@python.org
2012/10/23 MartinD. <cyber...@gmail.com>:
> --
> http://mail.python.org/mailman/listinfo/python-list

Hi,
just a wild guess, as I don't have access to the file containing the list of
potentially problematic regex patterns:
does
keywords1 = [line.strip() for line in open('keywords1.txt')]
possibly fix your problem?
The lines from the file iterator also preserve newlines, which might not
be expected in your keywords; strip() removes (by default) any
leading and trailing whitespace.
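
A small sketch of the strip() approach follows, using io.StringIO as a made-up stand-in for keywords1.txt (the contents are invented). It also skips lines that are empty after stripping, which is an addition to the suggestion above: an empty pattern matches any string, so a blank line in the file would otherwise always "match".

```python
import io
import re

# Hypothetical file contents standing in for keywords1.txt.
fake_file = io.StringIO(u"foo\n  bar \n\nbaz")

# strip() removes whitespace on both ends; the `if` drops lines that end up empty.
keywords = [line.strip() for line in fake_file if line.strip()]
print(keywords)

# An empty pattern matches at position 0 of any string, and the matched text is "".
empty_match = re.search("", "string_to_test")
print(empty_match is not None)
```

Note that strip() also removes leading and trailing spaces, so if a pattern intentionally starts or ends with a space, line.rstrip("\n") may be the safer choice.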

hth,
vbr

cyber...@gmail.com

Oct 23, 2012, 7:51:11 PM
to MartinD., pytho...@python.org
Stripping the line did it !!!
Thank you very much to all !!!
Cheers! :-)
Martin


On Tuesday, October 23, 2012 at 16:36:44 UTC-4, Vlastimil Brom wrote:
> 2012/10/23 MartinD.
