S> wxRegEx reEmail(_T("[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/
S> =?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-
S> z0-9-]*[a-z0-9])?"), wxRE_ADVANCED | wxRE_ICASE);
S>
S> reEmail.Matches(text);
S>
S> wcout << reEmail.GetMatchCount() << _T('\n'); // Returns just 1
Well, yes: your regex contains no capturing groups (only non-capturing
(?:...) ones), so GetMatchCount() returns 1, for the whole match. Please
read the GetMatchCount() documentation.
S> wcout << reEmail.GetMatch(text, 0).wchar_str().data(); // Why is
S> 'text' required to be passed again here?
Because wxRegEx doesn't copy the (potentially big) string unnecessarily.
S> How do I extract multiple matches? Do I have to use Mid()? If so,
S> wxWidgets must really consider redesigning this confusing class.
I think you need to understand that a regex simply matches or doesn't
match. If you want to apply it again, you need to do it yourself. I.e. in
this case you can get the end of the first match (from GetMatch()) and call
Matches() again starting at this offset. And again and again until it
doesn't match any more.
Regards,
VZ
S> size_t start = 0;
S> size_t len = 0;
S> size_t prevstart = 0;
S>
S> while (reEmail.Matches(text.Mid(prevstart)) &&
S>        reEmail.GetMatch(&start, &len))
S> {
S>     wcout << text.Mid(prevstart + start, len).wchar_str().data();
S>     prevstart += start + len;
S> }
S>
S> Though the code does the job, it is hard to get a grasp of the logic.
Really? How much simpler can it be when this is exactly the C++
translation of the following pseudo-code:
    while match found:
        show matching string
        advance past its end
I can't help wondering how else this could be written.
S> I really think wxWidgets should redesign wxRegEx class.
FWIW I disagree.
S> It's filled with misnamed functions such as 'GetMatchCount()' (maybe it
S> should be called GetCapturedExpressionCount()?)
First, "one" != "filled with". Second, GetMatchCount() might be slightly
unclear but IMHO it takes a big effort to not understand what it does after
reading its documentation.
S> and has a confusing function interface.
What do you mean by this? It seems pretty logical to me.
S> It's sad that for what the wxRegEx class is mostly used for, i.e.
S> extracting multiple matches from input,
I question the assumption that it's mostly used for this. For instance, I
have never used it for that so far.
S> we programmers have to make do with cryptic code like the above.
Please propose a simpler version.
S> Is there any way to file an improvement proposal with the wxWidgets heads?
This should be done on our Trac (http://trac.wxwidgets.org/) but in this
particular case I don't think it should be done at all.
Regards,
VZ