Norman Peelman wrote:
> On 06/27/2015 12:53 PM, Juan Garcia wrote:
>> Situation is string:
>>
>> The boy love movies [any variable string] and sport but sport is more
>> important.
>>
>> I would like to search string "love movies...and sport" but regular
>> expression i have return "love movies...and sport but sport".
>>
>> Which is the correct regular expression for this?
>
> You should supply the regex you are currently working with so that
> others can help you better. You don't really say exactly what you are
> after but this may get you started:
>
> (love movies)(.+)(and sport)
>
> love movies [any variable string] and sport

Without flags, “.+” means “any string of at least one character that does
not contain newline” in ERE/PCRE. A truly arbitrary “any variable string”,
one that may span lines, is only matched by “.*” or “.+” with PCRE and the
“s” flag, or by an all-character class like “[\S\s]”, which does not depend
on the flag.
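
A minimal sketch of the newline behaviour, using Python’s “re” module as a
stand-in for PCRE (an assumption on my part; its DOTALL flag corresponds to
the “s” flag). The sample text is made up for illustration:

```python
import re

# Hypothetical sample: the "variable string" contains a line break.
text = "love movies\nof all kinds and sport"

# Without flags, "." does not match a newline, so nothing is found.
no_flag = re.search(r"love movies.+and sport", text)

# With DOTALL (the "s" flag), "." matches newlines as well.
dotall = re.search(r"love movies.+and sport", text, re.DOTALL)

# The all-character class matches newlines without any flag.
char_class = re.search(r"love movies[\S\s]+and sport", text)

print(no_flag)                      # None
print(dotall.group(0) == text)      # True
print(char_class.group(0) == text)  # True
```

If the variable part never contains a line break, the plain “.+” is enough.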

The parentheses make the engine store the match of each subexpression so
that it can be retrieved or back-referenced later. They are a waste of
runtime and memory if you only want to know whether, or where, the
expression matches a string.
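
To illustrate what the parentheses store, here is a small sketch, again
with Python’s “re” module standing in for PCRE. The words “very much” are
only a placeholder for the “[any variable string]” part:

```python
import re

# "very much" is a hypothetical stand-in for the variable part.
text = "The boy love movies very much and sport but sport is more important."

# Capturing groups: every parenthesized subexpression is stored separately.
with_groups = re.search(r"(love movies)(.+)(and sport)", text)
print(with_groups.groups())  # ('love movies', ' very much ', 'and sport')

# No groups: only the overall match is kept, nothing extra is stored.
plain = re.search(r"love movies.+and sport", text)
print(plain.group(0))        # love movies very much and sport
print(plain.groups())        # ()
```

Both expressions match the same substring; the groups only matter if the
parts are needed afterwards.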

The OP’s problem is probably that they used

  /love movies.*sport/

or

  /love movies.+sport/

and did not consider that quantifiers are greedy by default, i.e. they
match as much as possible. Here the “.*” or “.+” runs on to the last
“sport” in the string. There are three ways to work around that:

A) provide more context, as you indicated (by specifying the “and” as
   well)

B) make the quantifier non-greedy:

     /love movies.*?sport/

   or

     /love movies.+?sport/

C) (only in PCRE) use negative lookahead or (here) lookbehind to reject
   a candidate match, in this case one where “sport” is directly
   preceded by “ but ”:

     /love movies.+(?<! but )sport/
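
The greedy match and the three workarounds can be compared side by side.
This is only a sketch: it uses Python’s “re” module rather than PCRE
proper, and “very much” is a made-up stand-in for the variable part of the
OP’s string:

```python
import re

# "very much" is a hypothetical stand-in for "[any variable string]".
text = "The boy love movies very much and sport but sport is more important."

# Greedy: ".+" runs on to the LAST "sport" in the string.
greedy = re.search(r"love movies.+sport", text).group(0)

# A) more context: require the "and" before "sport".
context = re.search(r"love movies.+and sport", text).group(0)

# B) non-greedy quantifier: stop at the FIRST "sport".
lazy = re.search(r"love movies.+?sport", text).group(0)

# C) negative lookbehind: reject a "sport" directly preceded by " but ".
lookbehind = re.search(r"love movies.+(?<! but )sport", text).group(0)

print(greedy)      # love movies very much and sport but sport
print(context)     # love movies very much and sport
print(lazy)        # love movies very much and sport
print(lookbehind)  # love movies very much and sport
```

Note that B) stops at the first “sport” whatever precedes it, while C)
still matches greedily and only skips candidates that follow “ but ”, so
the two can differ on other input.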
--
PointedEars
Zend Certified PHP Engineer
Twitter: @PointedEars2
Please do not cc me.