Succinct regex to match this


Ross Presser

Oct 6, 2009, 8:57:48 PM
to Regex
I want to match strings that represent a number of years, months and
days, like this:

12y7m19d
12y19d
7m19d
12y7m
19d
7m
etc.

Reducing it to a simpler problem, I want to match any substring of

abc

in order, but NOT match the empty string, and NOT match out of order
strings like

bac

Is there a succinct way to express this, or do I have to do

a|b|c|ab|ac|bc|abc
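
For scale, the explicit form grows quickly: n ordered optional pieces give 2^n - 1 alternatives. A minimal Python sketch that generates it, assuming Python purely for illustration (the function name `ordered_alternation` is hypothetical):

```python
from itertools import combinations


def ordered_alternation(pieces):
    """Build the explicit alternation of every non-empty, in-order selection.

    Longer selections are listed first so that an unanchored search
    prefers the longest alternative.
    """
    alternatives = []
    for size in range(len(pieces), 0, -1):
        # combinations() preserves the input order, so "ba" is never produced.
        for combo in combinations(pieces, size):
            alternatives.append("".join(combo))
    return "|".join(alternatives)


print(ordered_alternation(["a", "b", "c"]))
```

If the pieces themselves contained `|`, each would need wrapping in a non-capturing group first; the sketch assumes they do not.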

Thank you.

Accmailer

Oct 7, 2009, 3:38:36 AM
to Regex


On Oct 7, 05:57, Ross Presser <rpres...@gmail.com> wrote:
> I want to match strings that represent a number of years, months and
> days, like this:
>
> 12y7m19d
> 12y19d
> 7m19d
> 12y7m
> 19d
> 7m
> etc.
>
> Reducing it to a simpler problem, I want to match any substring of
>
> abc
Frankly speaking, I did not quite get how the first question transforms
into the second one :)

>
> in order, but NOT match the empty string, and NOT match out of order
> strings like
>
> bac
>
> Is there a succinct way to express this, or do I have to do
>
> a|b|c|ab|ac|bc|abc

I am not sure that "ac" is a substring of "abc", but if you insist,
a?b?c?
is a more elegant expression.
It will match everything you listed: "a", "b", "c", "ab", "ac", "bc"
or "abc".
Problem: it will also match the zero-length string and report "FOUND!" :)
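
A quick way to see both the good and the bad behaviour, sketched in Python's `re` module (Python and the helper name `matches_whole` are assumptions chosen for illustration):

```python
import re

optional_chain = re.compile(r"a?b?c?")


def matches_whole(s):
    """True if the entire string s is matched by a?b?c?."""
    return optional_chain.fullmatch(s) is not None


for candidate in ["a", "ac", "abc", "bac", ""]:
    print(repr(candidate), matches_whole(candidate))
```

The in-order strings are accepted and "bac" is rejected, but the empty string is accepted too, which is the problem noted above.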

-- Regards, Eugeny

Accmailer

Oct 7, 2009, 3:48:53 AM
to Regex
> > Is there a succinct way to express this, or do I have to do
>
> > a|b|c|ab|ac|bc|abc
>
> I am not sure that "ac" is a substring of "abc", though ...if you
> insist...
> a?b?c?
> can be a more elegant expression.
> It will match all things you listed: "a" or "b" or "c" or "ab" or "ac"
> or "bc" or "abc".
> Problem: it will match zero length string and report "FOUND!" :)
An idea just came to mind: the above-mentioned problem can
be solved with a positive lookahead that requires at least one letter
to be present, like this:

\A(?=\w+)a?b?c?\Z

where \A and \Z are the start-of-input and end-of-input anchors.
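
A minimal sketch of this pattern in Python's `re` module (Python and the helper name `is_ordered_nonempty` are assumptions for illustration). One caveat: in Python `\Z` matches only at the very end of the string, while in Perl-style engines it also matches before a final newline.

```python
import re

# The lookahead (?=\w+) demands at least one word character at the start,
# which rules out the empty string without consuming any input.
guarded = re.compile(r"\A(?=\w+)a?b?c?\Z")


def is_ordered_nonempty(s):
    """True if s is a non-empty, in-order selection from a, b, c."""
    return guarded.match(s) is not None


for candidate in ["a", "ac", "abc", "bac", ""]:
    print(repr(candidate), is_ordered_nonempty(candidate))
```

Since a single character is enough to prove the string is non-empty, `(?=\w)` would behave the same as `(?=\w+)` here.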

Ralph Boland

Oct 12, 2009, 5:14:36 PM
to Regex
If you have a set difference operation ('-') and the epsilon string ('e'),
then you can write abc - e.

I am developing a tool for generating finite state machines from
regular expressions. I will be implementing a not-empty operator
('!').
With it you could write (abc)!

So there should be an easy way to do what you want but I'm guessing
it is not available on the system you are using.

Regards

Ralph Boland

Ross Presser

Nov 3, 2009, 7:00:54 PM
to Regex


On Oct 7, 2:38 am, Accmailer <eugeny.satt...@gmail.com> wrote:
> > Reducing it to a simpler problem, I want to match any substring of
>
> > abc
>
> Frankly speaking i did not quite get how the first question transforms
> into the second one :)

"a", "b" and "c" are to be considered markers for regex pieces, not
just characters. So, matching something like

12y13d

is equivalent to matching

ac

where
a = [0-9]+y
b = [0-9]+m
c = [0-9]+d

> I am not sure that "ac" is a substring of "abc", though ...if you
> insist...

"Substring" is perhaps a bad word for the concept. What's meant is a
string composed of elements that must appear in a fixed order,
where each element is optional but at least one must be present.

> a?b?c?
> can be a more elegant expression.
> It will match all things you listed: "a" or "b" or "c" or "ab" or "ac"
> or "bc" or "abc".
> Problem: it will match zero length string and report "FOUND!" :)

Exactly -- that was what led me to post in the first place.

> Just now an idea got into my mind: The abovementioned problem can
> be solved via positive lookahead requiring at least one letter
> to be present. This way:
> \A(?=\w+)a?b?c?\Z

That's probably the concept I hadn't thought of -- positive
lookahead. Thanks for that.
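
Applied to the year/month/day pieces, the lookahead idea might look like the following Python sketch (Python, the constant names and `is_duration` are assumptions for illustration). Because every piece starts with a digit, the lookahead only needs to check for one digit.

```python
import re

# Each piece is one or more digits followed by its unit letter.
YEARS, MONTHS, DAYS = r"[0-9]+y", r"[0-9]+m", r"[0-9]+d"

duration = re.compile(
    r"\A(?=[0-9])"  # lookahead: rejects the empty string
    rf"(?:{YEARS})?(?:{MONTHS})?(?:{DAYS})?"  # optional pieces, fixed order
    r"\Z"
)


def is_duration(s):
    """True if s is a non-empty, in-order mix of year, month and day pieces."""
    return duration.match(s) is not None


for candidate in ["12y7m19d", "12y19d", "7m", "7m12y", "12", ""]:
    print(repr(candidate), is_duration(candidate))
```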

In the end, I solved my problem by adding a prefix character
to the string to be matched. So
instead of matching "ac" with /a?b?c?/,
I am matching "xac" with /xa?b?c?/.

This is probably less expensive than positive lookahead, and it's
easier to document in any case.
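
A minimal Python sketch of the prefix idea (Python, the sentinel value in `PREFIX` and the helper name are assumptions for illustration). It shows that the prefixed pattern no longer finds an empty match in arbitrary text. It also shows one limit: a bare prefix with no pieces after it still matches, so rejecting that case would need a separate check.

```python
import re

PREFIX = "x"  # hypothetical sentinel, assumed never to occur in real input

bare = re.compile(r"(?:[0-9]+y)?(?:[0-9]+m)?(?:[0-9]+d)?")
prefixed = re.compile(r"x(?:[0-9]+y)?(?:[0-9]+m)?(?:[0-9]+d)?")


def match_prefixed(s):
    """True if PREFIX + s is matched in full by the prefixed pattern."""
    return prefixed.fullmatch(PREFIX + s) is not None


# The bare pattern "finds" an empty match anywhere; the prefixed one does not.
print(bare.search("no duration here") is not None)
print(prefixed.search("no duration here") is not None)

for candidate in ["12y19d", "19d12y", ""]:
    print(repr(candidate), match_prefixed(candidate))
```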

Ross Presser

Nov 3, 2009, 7:03:00 PM
to Regex


On Oct 12, 4:14 pm, Ralph Boland <rpbol...@gmail.com> wrote:
> If you have set difference operation ('-')  and  epsilon string ('e')
>
> Then you can write  abc - e.

I'm not familiar with those things. As I mentioned to Accmailer
above, a, b and c are to be considered regex particles, not single
characters. I don't know if that changes your answer.

>
> I am developing a tool for generating finite state machines from
> regular expression.  I will be implementing a not empty operator
> ('!').
> With it you could write  (abc)!
>
> So there should be an easy way to do what you want but I'm guessing
> it is not available on the system you are using.

Probably not. I found a workaround by prefixing my test string as
well as the regex.