Help with Regex to get images


isaac2004

May 12, 2009, 11:51:59 PM
to Regex
Hi,
I'm trying to get the playerId for a baseball player from ESPN's
website. The info is in a link like so:

<a href="/mlb/players/profile?playerId=6195">Hanley Ramirez</a></td>


I'm trying to get just the 6195 but am having some trouble doing it.
Can someone help me? Thanks.

Eugeny Sattler

May 18, 2009, 6:50:33 AM
to re...@googlegroups.com
(View this in a non-proportional font like Courier.)

Why not just use:
/mlb/players/profile\?playerId=(\d+)\"\>([^<]+)\</a>
                               \___/    \_____/
                                 │         │
                          $1 will catch    │
                         numeric playerID  │
                                           │
                                     $2 will catch
                                     player name

I would add optional whitespace (\s*), just in case:
/mlb/players/profile\?playerId=(\d+)\s*\"\s*\>\s*([^<]+)\s*\</a>
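
For anyone who wants to try this, here is a minimal sketch in Python (the thread does not say which language the original poster uses, so that choice is an assumption). The function name extract_player is made up for illustration. The backslashes before ", < and > are dropped because Python's re module does not need them, and in some other regex flavors \< and \> mean word boundaries rather than literal angle brackets.

```python
import re

# Same pattern as above, minus the escapes on ", < and >,
# which are not needed in Python's re module.
PLAYER_LINK = re.compile(
    r'/mlb/players/profile\?playerId=(\d+)\s*"\s*>\s*([^<]+)\s*</a>'
)

def extract_player(html):
    """Return (player_id, player_name) for the first player link, or None.

    extract_player is a hypothetical helper name, not part of any library.
    """
    match = PLAYER_LINK.search(html)
    if match is None:
        return None
    # group(1) is the numeric ID ($1), group(2) is the link text ($2).
    return match.group(1), match.group(2)

# The sample line from the original question.
sample = '<a href="/mlb/players/profile?playerId=6195">Hanley Ramirez</a></td>'
print(extract_player(sample))
```

The result is a pair of strings, so the ID would still need int() if a number is wanted.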

DuCakedHare

Jun 4, 2009, 6:34:37 AM
to Regex
In minimalist fashion, for future reference:

\?playerId=(\d+).*?>(.+)</a>

You only need to identify the playerId parameter and extract the number,
whatever the path leading up to it.
If for some reason anything else appears between the ID and the end of
the opening tag, the .*?> skips over it.
Assuming the link is in an <a> element, the closing </a> is sufficient to
mark the end of the name.
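
A rough sketch of how the minimal pattern could be run over a whole row of links, again assuming Python. One deliberate change from the pattern as posted: the name group is lazy, (.+?) instead of (.+), because a greedy (.+) runs to the last </a> on the line when there is more than one link. The helper name find_players and the second link in the sample (playerId=1234, "Some Player") are made up for illustration.

```python
import re

# Minimal pattern, with a lazy (.+?) for the name so that each match
# stops at the first closing </a> rather than the last one on the line.
MINIMAL = re.compile(r'\?playerId=(\d+).*?>(.+?)</a>')

def find_players(html):
    """Return a list of (player_id, player_name) tuples.

    find_players is a hypothetical helper name, not part of any library.
    """
    return MINIMAL.findall(html)

# First link is from the original question; the second one is invented
# sample data with an extra attribute after the href.
row = (
    '<td><a href="/mlb/players/profile?playerId=6195">Hanley Ramirez</a></td>'
    '<td><a href="/mlb/players/profile?playerId=1234" class="x">Some Player</a></td>'
)

for player_id, name in find_players(row):
    print(player_id, name)
```

Note that . does not match a newline by default, so this only works when the whole link sits on one line.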