issue implementing the following in a .pyl file


Socr...@gmail.com

Feb 8, 2008, 2:46:40 PM
to pyggy
I'm parsing IIS logs and am having a hell of a time getting a regex
that will work for the page portion of the file.

Given something like: 2007-02-09 23:59:59 GET /ABC.aspx or
2007-02-09 23:59:59 GET /Search/Styles/XYZ.css

I need to match on the /blah.aspx, /blah/blah.css, etc. portion of the
line.

A regex that will do this fine in a general fashion is:
/[A-Za-z0-9]+(/[A-Za-z0-9]+)*\.[A-Za-z0-9]+
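
For reference, here is a minimal sketch of that general regex run through Python's standard `re` module against the two sample lines. It only shows what the pattern is meant to capture; it says nothing about how pylly handles it. The names `PAGE_RE`, `find_page` and `samples` are made up for this example.

```python
import re

# The general-purpose pattern, written on a single line.
PAGE_RE = re.compile(r"/[A-Za-z0-9]+(/[A-Za-z0-9]+)*\.[A-Za-z0-9]+")


def find_page(line):
    """Return the page portion of an IIS log line, or None if there is none."""
    match = PAGE_RE.search(line)
    return match.group(0) if match else None


samples = [
    "2007-02-09 23:59:59 GET /ABC.aspx",
    "2007-02-09 23:59:59 GET /Search/Styles/XYZ.css",
]

for line in samples:
    print(find_page(line))
# prints:
# /ABC.aspx
# /Search/Styles/XYZ.css
```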

However, the closest approximation I can get into a pylly spec file
is: "/[[:alnum:]]+(/[[:alnum:]]+)+\.[[:alnum:]]+" : return "PAGE"
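
One thing that stands out when comparing the two patterns: the spec line repeats the group with `+`, while the general regex repeats it with `*`. The sketch below checks the difference in plain Python, not in pylly. It assumes `[A-Za-z0-9]` is a fair stand-in for `[[:alnum:]]`, since Python's `re` has no POSIX classes; `ALNUM`, `plus_re`, `star_re` and `results` are names invented for the example.

```python
import re

# Stand-in for [[:alnum:]]; Python's re module has no POSIX character classes.
ALNUM = "[A-Za-z0-9]"

# Group repeated with "+": at least one extra /segment is required.
plus_re = re.compile(rf"/{ALNUM}+(/{ALNUM}+)+\.{ALNUM}+")
# Group repeated with "*": extra /segments are optional.
star_re = re.compile(rf"/{ALNUM}+(/{ALNUM}+)*\.{ALNUM}+")

paths = ["/ABC.aspx", "/Search/Styles/XYZ.css"]

# Map each path to (matches with "+", matches with "*").
results = {
    path: (bool(plus_re.fullmatch(path)), bool(star_re.fullmatch(path)))
    for path in paths
}

for path, (plus_ok, star_ok) in results.items():
    print(path, plus_ok, star_ok)
# prints:
# /ABC.aspx False True
# /Search/Styles/XYZ.css True True
```

So even once the spec parses, the `+` form would skip single-segment pages such as /ABC.aspx. That is separate from whatever makes getparser reject the line.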

This seems to puke when I try to run getparser on the .pyl file that
contains it.

I've tried several variations on this, and can't get anything to work
the way the standard regex does. Any ideas?



Also, it doesn't look like I can use some other standard regex
constructs such as \d. Am I missing something, or are some of those
just not implemented?
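
For comparison, in Python's own `re` module `\d` and an explicit `[0-9]` class match the same thing on ASCII input like these logs, so spelling the class out is one possible fallback. Whether pylly accepts either form is exactly what I have not confirmed. The names `with_shorthand` and `spelled_out` are just for this sketch.

```python
import re

line = "2007-02-09 23:59:59 GET /ABC.aspx"

# Date field written with the \d shorthand.
with_shorthand = re.compile(r"\d+-\d+-\d+")
# Same field with \d spelled out as an explicit character class.
spelled_out = re.compile(r"[0-9]+-[0-9]+-[0-9]+")

date_a = with_shorthand.match(line).group(0)
date_b = spelled_out.match(line).group(0)

print(date_a)
print(date_b)
# prints:
# 2007-02-09
# 2007-02-09
```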

Thanks in advance,
-Devin