<a href="http://somelink.com">The description</a> (12/29/2007 10:34 PM)
However, HTML being what it is, I can't guarantee that there won't be
more whitespace and other bits of text than appear in this example,
so I've had to write a fairly permissive regex. Right now the pattern
is this:
<a.*?href="(?<url>.*?)".*?>(?<tag>.*?)</a>.*?\((?<date>\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}\s\w{2})\)
The problem is that the pattern matches from the first <a href> in the
document through to the first date it comes across. That first match
swallows a lot of text, including other hrefs, and in particular the
href-date pair I was hoping it would find on its own.
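
To make the problem concrete, here is a minimal reproduction. It is
sketched in Python rather than the regex flavour I'm actually using, so
the named groups are written (?P<name>...) instead of (?<name>...). The
sample markup and the first.example URL are made up for illustration.

```python
import re

# Hypothetical sample: one link with no date, then the link/date pair I want.
html = (
    '<a href="http://first.example">First</a> some text without a date '
    '<a href="http://somelink.com">The description</a> (12/29/2007 10:34 PM)'
)

# Same pattern as above, with Python-style named groups.
pattern = re.compile(
    r'<a.*?href="(?P<url>.*?)".*?>(?P<tag>.*?)</a>.*?'
    r'\((?P<date>\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}\s\w{2})\)'
)

m = pattern.search(html)

# The url and tag come from the FIRST link, the date from the second one.
print(m.group('url'))   # http://first.example
print(m.group('tag'))   # First
print(m.group('date'))  # 12/29/2007 10:34 PM

# The whole match has swallowed the second anchor along the way.
print('http://somelink.com' in m.group(0))  # True
```

The lazy .*? between </a> and the date is free to step over any number
of other anchors, which is exactly the unwanted behaviour.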
Is there any way to tell the pattern to reject a match if it
contains another href? I tried adding a negative lookahead,
(?!http:\/\/), but that only seems to be checked at the one position
where it appears, and not against text in the middle of a match.
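
One idea I'm considering is to repeat the negative lookahead in front
of every character of the gap, so the gap can never step over the start
of another anchor. Below is a rough sketch in Python, so the named
groups are written (?P<name>...). The sample markup, the first.example
URL, and the [^>] and [^"] character classes that replace the original
.*? pieces are all my own assumptions, and I haven't tried it on the
real pages, so please treat it as a guess rather than a solution.

```python
import re

# Hypothetical sample: one link with no date, then the link/date pair I want.
html = (
    '<a href="http://first.example">First</a> some text without a date '
    '<a href="http://somelink.com">The description</a> (12/29/2007 10:34 PM)'
)

tempered = re.compile(
    # Stay inside the opening tag: [^>] cannot cross '>', [^"] cannot cross '"'.
    r'<a\b[^>]*?href="(?P<url>[^"]*)"[^>]*>'
    # Link text: any character, as long as it does not begin '</a>'.
    r'(?P<tag>(?:(?!</a>).)*?)</a>'
    # Gap before the date: any character, as long as it does not begin '<a'.
    r'(?:(?!<a\b).)*?'
    r'\((?P<date>\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}\s\w{2})\)'
)

m = tempered.search(html)

# The attempt starting at the first link fails (no date before the next <a),
# so the match now starts at the second link.
print(m.group('url'))   # http://somelink.com
print(m.group('tag'))   # The description
print(m.group('date'))  # 12/29/2007 10:34 PM
```

Tightening only the gap is not enough in this sketch: if the tag group
stays as a plain .*?, backtracking lets it grow across the second anchor
instead, which is why every wildcard is restricted here.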
Thanks for any help.
--Brent