How to process all matches of RegEx?

Shedo

Jan 3, 2015, 5:37:36 PM
to haxe...@googlegroups.com
I have a RegEx that works as intended; the problem is that I can't figure out how to get all the matching results. Here's an example. Say I have the following:

~/dog(.*)cat(.*)/g

<a href="dog1cat9">text</a>
<a href="dog2cat8">text</a>
<a href="dog3cat7">text</a>
<a href="dog4cat6">text</a>

I can only retrieve "1" and "9" from the first set via r.matched(1) and r.matched(2).
How do I get the rest of the matching sets?

Marc Weber

Jan 3, 2015, 6:48:24 PM
to haxelang
Excerpts from Shedo's message of Sat Jan 03 22:37:36 +0000 2015:
> I can only retrieve "1" and "9" from the first set via the r.matched(1) and
> r.matched(2).
> How do I get the rest of the matching sets?
Using regex to match XML is often a bad idea. I usually prefer XML tools.

You can use match, matched(0) and matchSub in a loop of your own until
you've processed the whole string; matchedPos tells you where each match
ended, so you know where to resume.

http://api.haxe.org/EReg.html
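
A minimal sketch of such a loop, using only EReg.matchSub, matched and matchedPos from the page linked above. The class and function names are made up for illustration, and the pattern assumes the captured values are digits (\d+) rather than the greedy (.*) from the question, so that each capture stops at the end of the number:

```haxe
class AllMatches {
    // Returns one [dogValue, catValue] pair per match found in `input`.
    public static function findAll(input:String):Array<Array<String>> {
        // Assumption: the captured values are digits only.
        var r = ~/dog(\d+)cat(\d+)/;
        var results = [];
        var pos = 0;
        // matchSub starts searching at `pos` instead of the start of the string.
        while (pos < input.length && r.matchSub(input, pos)) {
            results.push([r.matched(1), r.matched(2)]);
            var p = r.matchedPos();
            // Resume right after this match. The pattern can never match an
            // empty string, so `pos` always moves forward.
            pos = p.pos + p.len;
        }
        return results;
    }

    static function main() {
        var html = '<a href="dog1cat9">text</a>\n<a href="dog2cat8">text</a>';
        for (pair in findAll(html)) {
            trace(pair[0] + " " + pair[1]);
        }
    }
}
```

If the pattern could match an empty string, the loop would need to advance by at least one character per iteration. EReg.map with the g flag is another way to have a callback invoked for every match.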

In your particular case, splitting on "<a href=\"dog" and then applying the
regex to each piece might be simplest (or then splitting on "cat").
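
The splitting idea could look roughly like this. It is only a sketch: the names are hypothetical, and it assumes every link starts with exactly the text `<a href="dog` and that the values are digits:

```haxe
class SplitMatches {
    // Splits on the fixed prefix, then matches the remainder of each piece.
    public static function extract(input:String):Array<Array<String>> {
        // Each piece after the delimiter should begin with "<digits>cat<digits>".
        var r = ~/^(\d+)cat(\d+)/;
        var out = [];
        var parts = input.split('<a href="dog');
        // parts[0] is whatever came before the first link, so skip it.
        for (i in 1...parts.length) {
            if (r.match(parts[i])) {
                out.push([r.matched(1), r.matched(2)]);
            }
        }
        return out;
    }

    static function main() {
        var html = '<a href="dog1cat9">text</a>\n<a href="dog2cat8">text</a>';
        for (pair in extract(html)) {
            trace(pair[0] + " " + pair[1]);
        }
    }
}
```

This breaks as soon as an anchor has another attribute before href, which is the trade-off of matching on a literal prefix.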

Marc Weber

Shedo

Jan 3, 2015, 6:56:42 PM
to haxe...@googlegroups.com
Do you have some example usage of the XML tools?

Marc Weber

Jan 4, 2015, 6:02:48 AM
to haxelang
> Do you have some example usage of the XML tools?
I tend to use Ruby / Nokogiri for such small tasks.

Then it's

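require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(URI.open('uri'))
doc.css('a').each do |a|
  # print both captured groups for every href that matches
  puts "#{$1} #{$2}" if /dog(.*)cat(.*)/ =~ a["href"]
end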
(or similar)

For Haxe, you may want to try Google, which yields
http://old.haxe.org/doc/cross/xml for example
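
As a rough illustration of the XML route in Haxe itself, here is a sketch using the standard Xml class (Xml.parse, firstElement, elementsNamed, get). It assumes the markup is well-formed XML with a single root element, which real-world HTML often is not; the wrapping `<div>`, the class name and the digit-only pattern are assumptions made for the example:

```haxe
class XmlMatches {
    // Collects the href attribute of every <a> that is a direct child of the root.
    public static function hrefs(markup:String):Array<String> {
        var root = Xml.parse(markup).firstElement();
        var out = [];
        for (a in root.elementsNamed("a")) {
            var href = a.get("href");
            // get returns null when the attribute is missing.
            if (href != null) out.push(href);
        }
        return out;
    }

    static function main() {
        // Assumption: the links are wrapped in one root element.
        var markup = '<div><a href="dog1cat9">text</a><a href="dog2cat8">text</a></div>';
        var r = ~/^dog(\d+)cat(\d+)$/;
        for (href in hrefs(markup)) {
            if (r.match(href)) {
                trace(r.matched(1) + " " + r.matched(2));
            }
        }
    }
}
```

Because the regex now only sees the attribute value, extra attributes or other tags with an href no longer affect the result. Note that elementsNamed only visits direct children, so nested links would need a recursive walk.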

Well, do whatever works in your case. The Haxe regex docs already mention
that other tools (e.g. preg_match_all in PHP, or Python's re module) could
get your job done faster, unless it must be written in Haxe, because
they automatically do what you want (match the regex as often as
possible).

However, implementing the loop I talked about is also trivial. You must
know whether it's OK to miss <a target="foo" href="dog3cat178">..</a>
or whether it's OK to catch <dummy href="dog... as well. I don't know
your case.

You may also want to have a look at
http://lib.haxe.org/legacy/p/xpath
or
https://github.com/djcsdy/haxe-xpath

I have never used either of those.

Marc Weber