newbie question


SpaceMarine

Oct 29, 2009, 1:57:37 PM
to Regex
hello,

i have a test regex that looks for one of two possible favorite
colors:

red|blue

...no problem. now i'd like to include "Favorite color: " in the test
string, like so:

Favorite color: blue

...how do i modify the original regex to include this static label?
i've tried /b but it didn't work out.


thanks!
sm

eugeny....@gmail.com

Oct 30, 2009, 6:01:09 AM
to Regex


On Oct 29, 9:57 pm, SpaceMarine <spacemar...@mailinator.com> wrote:
> hello,
>
> i have a test regex that looks for one of two possible favorite
> colors:
>
>     red|blue
>
> ...no problem. now id like to include "Favorite color: " into the test
> string, like so:
>
>     Favorite color: blue

Favorite color\: (red|blue)
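
A quick way to try this pattern, sketched with Python's re module. Python is an assumption here (the thread does not name a regex flavor), and favorite_color is a hypothetical helper name used only for illustration:

```python
import re

# Pattern from the reply above; the round brackets capture the color.
pattern = re.compile(r"Favorite color\: (red|blue)")

def favorite_color(text):
    """Return the captured color, or None when the label or color is missing."""
    m = pattern.search(text)
    return m.group(1) if m else None

print(favorite_color("Favorite color: blue"))   # the label and a listed color
print(favorite_color("Favorite color: green"))  # the label, but an unlisted color
```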

--
regards,
Eugeny

inhahe

Oct 30, 2009, 6:20:03 AM
to re...@googlegroups.com


On Fri, Oct 30, 2009 at 6:01 AM, Eugeny....@gmail.com <eugeny....@gmail.com> wrote:
 
> Favorite color\: (red|blue)


do :'s really need to be escaped? that could explain why some regex i wrote the other day wouldn't work

also.. one thing i don't get about | -- how does it know you're looking for "red" or "blue" and not "re" + ("d" or "b") + "lue"?

Eugeny Sattler

Oct 31, 2009, 11:57:57 AM
to re...@googlegroups.com
>> Favorite color\: (red|blue)
>>
>
> do :'s really need to be escaped? that could explain why some regex i wrote
> the other day wouldn't work

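Not necessary. A bare ":" is an ordinary character and matches itself.
It only has a special meaning as part of a longer construct, right after "(?"
(for example, non-capturing groups start with "(?:" and end with ")").
I escape ":" out of habit when I want to match it as a literal, but check that
your regex flavor accepts "\:", because some flavors reject escapes of
ordinary characters.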
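
To see how the colon behaves in one concrete engine, here is a small sketch using Python's re module (an assumption, since no flavor is named in the thread). It compares an unescaped colon, an escaped colon, and the "(?:" non-capturing form:

```python
import re

text = "Favorite color: blue"

plain = re.search(r"color: (red|blue)", text)     # unescaped colon
escaped = re.search(r"color\: (red|blue)", text)  # escaped colon
noncap = re.search(r"color: (?:red|blue)", text)  # "(?:" opens a non-capturing group

# Both spellings of the colon match exactly the same text in Python.
same_match = plain.group(0) == escaped.group(0)

print(same_match)
print(plain.groups(), noncap.groups())  # capturing vs non-capturing group
```

Other flavors may treat "\:" differently, so this only shows what Python does.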

> also.. one thing i don't get about | -- how does it know you're looking for
> "red" or "blue" and not "re" + ("d" or "b") + "lue"?

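Thanks to round brackets! Here, the first option starts right after
opening round bracket and ends before a pipe, the second option starts
right after pipe and ends before closing round bracket.
The pipe binds more loosely than anything else in a regex, so an option is
never cut off in the middle of a word: it runs all the way to the nearest
enclosing bracket (or to the start or end of the whole expression when there
are no brackets).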
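
The three possible readings can be compared side by side. This is a sketch in Python's re module (an assumption; the sample strings are made up), using fullmatch so that each pattern has to account for the whole string:

```python
import re

grouped = re.compile(r"Favorite color: (red|blue)")
ungrouped = re.compile(r"Favorite color: red|blue")  # "Favorite color: red" OR "blue"
letters = re.compile(r"re(d|b)lue")                  # "re" + ("d" or "b") + "lue"

def hits(pattern, samples):
    """Return True/False per sample, depending on whether the whole string matches."""
    return [pattern.fullmatch(s) is not None for s in samples]

grouped_hits = hits(grouped, ["Favorite color: blue", "blue"])
ungrouped_hits = hits(ungrouped, ["Favorite color: blue", "blue"])
letters_hits = hits(letters, ["red", "blue", "redlue", "reblue"])

print(grouped_hits, ungrouped_hits, letters_hits)
```

Only the last pattern behaves like "re" + ("d" or "b") + "lue", and it needs its own brackets around the single letters to do so.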

For single letters or digits, character class syntax is more suitable.
If you want to match "a" or "b" or "c", use the [abc] construct,
although you won't be punished if you use the (a|b|c) construct.

The (a|b|c) construct is meant for words, not for single letters.
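
A short sketch of the difference, again assuming Python's re module and made-up sample text:

```python
import re

text = "cabbage"

# Single characters: a character class and an alternation find the same letters.
with_class = re.findall(r"[abc]", text)
with_alternation = re.findall(r"(a|b|c)", text)

# Whole words need alternation; a class only ever matches one character.
words = re.findall(r"(red|blue)", "red, green, blue")
one_char = re.findall(r"[redblue]", "red")

print(with_class == with_alternation)
print(words, one_char)
```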

--
regards, Eugeny

inhahe

Oct 31, 2009, 12:03:46 PM
to re...@googlegroups.com
On Sat, Oct 31, 2009 at 11:57 AM, Eugeny Sattler <eugeny....@gmail.com> wrote:

>> also.. one thing i don't get about | -- how does it know you're looking for
>> "red" or "blue" and not "re" + ("d" or "b") + "lue"?
> Thanks to round brackets! Here, the first option starts right after
> opening round bracket and ends before a pipe, the second option starts
> right after pipe and ends before closing round bracket.
>
> For single letters or digits a character class syntax is more suitable.
> If you want to match "a" or "b" or "c" use [abc] construct.
> Although you won't be punished if you use (a|b|c) construct.
>
> (a|b|c) construct was meant for words, not for single letters.


But how greedy is the |?

i mean for example if i have a(.*?b)+?(?=hi|bye).blah|hi|green$ .. how is that grouped?  where does the alternative before the "|hi" begin? 

Eugeny Sattler

Oct 31, 2009, 12:37:31 PM
to re...@googlegroups.com
> But how greedy is the |?
> i mean for example if i have a(.*?b)+?(?=hi|bye).blah|hi|green$ .. how is
> that grouped?  where does the alternative before the "|hi" begin?


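1) that syntax is valid, even though the pipe between "blah" and "hi"
and the one between "hi" and "green" are not enclosed in round brackets.
A pipe outside any brackets splits the whole expression, so you get
three options:
    a(.*?b)+?(?=hi|bye).blah
    hi
    green$
The option before "|hi" therefore begins at the very start of the
expression. If you meant something narrower, add round brackets to say so.
Ask your regex processor if it has the same opinion. :)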

2) the boundaries of a lookahead construct can serve as the boundaries of
an alternation, so there is no need to write
(?=(hi|bye))
when it is enough to write
(?=hi|bye)

just as you did in your example.
Just bear in mind that "(?=(hi|bye))" and "(?=hi|bye)" match the same text;
the only difference is that the extra brackets also create a capturing group.
The first option starts after "?=" and ends before the pipe. The second
option starts after the pipe and ends before the closing round bracket.
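
The two lookahead spellings can be compared directly. A minimal sketch, assuming Python's re module and an invented sample string:

```python
import re

text = "x-hi y-bye z-no"

bare = re.compile(r"\w-(?=hi|bye)")       # alternation bounded by the lookahead itself
wrapped = re.compile(r"\w-(?=(hi|bye))")  # same test, plus an inner capturing group

# group(0) is the text actually consumed; the lookahead consumes nothing.
bare_hits = [m.group(0) for m in bare.finditer(text)]
wrapped_hits = [m.group(0) for m in wrapped.finditer(text)]

print(bare_hits == wrapped_hits)
print(bare.groups, wrapped.groups)  # number of capturing groups in each pattern
```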

My advice: stop thinking that the regex engine first finds a pipe in your
expression and afterwards looks for the ends of your alternate options to
the right and to the left (in a greedy or lazy way).
Greedy and lazy apply to quantifiers, not to the pipe. An option always
extends as far as the nearest enclosing round bracket, or to the start or
end of the whole expression if there is none. If you want an alternation
to cover only part of the expression, you have to wrap that part in
round brackets.
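
One way to ask a regex processor how it groups the expression from the question is simply to try it. A minimal sketch with Python's re module (Python is an assumption, and the test strings are made up for illustration):

```python
import re

# The expression from the question, unchanged.
pattern = re.compile(r"a(.*?b)+?(?=hi|bye).blah|hi|green$")

# Do "hi" and "green" match entirely on their own?
hi_alone = pattern.fullmatch("hi") is not None
green_alone = pattern.fullmatch("green") is not None

# Does "$" constrain "hi" as well, or only "green"?
hi_not_at_end = pattern.search("hi there") is not None
green_not_at_end = pattern.search("greenish") is not None

print(hi_alone, green_alone, hi_not_at_end, green_not_at_end)
```

In Python, at least, the pattern compiles, "hi" and "green" each match by themselves, and the "$" applies to "green" only. That is the behavior you get when the unbracketed pipes split the whole expression into three options.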

--
regards, Eugeny

SpaceMarine

Nov 4, 2009, 1:51:33 PM
to Regex
thanks!
