Socr...@gmail.com
Feb 8, 2008, 2:46:40 PM
to pyggy
I'm parsing IIS logs and am having a hell of a time getting a regex
that will work for the page portion of each line.
Given something like:
2007-02-09 23:59:59 GET /ABC.aspx
or
2007-02-09 23:59:59 GET /Search/Styles/XYZ.css
I need to match on the /ABC.aspx, /Search/Styles/XYZ.css, etc. portion
of the line.
A regex that does this fine in a general fashion is:
/[A-Za-z0-9]+(/[A-Za-z0-9]+)*\.[A-Za-z0-9]+
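For reference, here is a minimal sketch of that general-purpose pattern run against the two sample lines. It uses Python's standard re module rather than pylly, so it only shows what the pattern is expected to match; the names PAGE_RE and find_page are made up for illustration.

```python
import re

# The general-purpose pattern from this post, compiled with Python's re
# module (not pylly). PAGE_RE and find_page are illustrative names only.
PAGE_RE = re.compile(r"/[A-Za-z0-9]+(/[A-Za-z0-9]+)*\.[A-Za-z0-9]+")


def find_page(line):
    """Return the page portion of an IIS log line, or None if there is none."""
    match = PAGE_RE.search(line)
    return match.group(0) if match else None


samples = [
    "2007-02-09 23:59:59 GET /ABC.aspx",
    "2007-02-09 23:59:59 GET /Search/Styles/XYZ.css",
]

# Extract the page portion from each sample line.
pages = [find_page(s) for s in samples]
for line, page in zip(samples, pages):
    print(line, "->", page)
```

Note the `*` on the group: it lets the pattern match a page with no subdirectory, such as /ABC.aspx.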
However, the closest approximation I can get into a pylly spec file
is:
"/[[:alnum:]]+(/[[:alnum:]]+)+\.[[:alnum:]]+" : return "PAGE"
This seems to choke when I try to run getparser on the .pyl file that
contains it.
I've tried several variations on this and can't get anything to work
the way the standard regex does. Any ideas?
Also, it doesn't look like I can use some other standard regex
constructs such as \d. Am I missing something, or are some of those
just not implemented?
Thanks in advance,
-Devin