How to find duplicate numbers between commas?


smartscottiedog

Oct 27, 2009, 2:03:42 PM
to Regex
I have the following regex in .NET:

^(?!.*?\b44\b)(?:\s*(\d+)\s*,?)*$

This allows a comma-separated list of numbers, excluding one number
that I specify (44 in the pattern above). I would like to modify it so
that it does not allow any duplicate numbers in the list.

For example

12, 22, 3, 45, 3, 36

should fail validation because the number 3 is repeated. Is that
even possible?
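
To make the current behaviour easy to reproduce, here is a minimal sketch. It uses Python's re module rather than .NET (an assumption made only for illustration; the pattern itself uses no .NET-specific syntax), and the helper name passes_original is hypothetical.

```python
import re

# The pattern from the post, unchanged.
ORIGINAL = re.compile(r"^(?!.*?\b44\b)(?:\s*(\d+)\s*,?)*$")


def passes_original(text: str) -> bool:
    """Return True if the original pattern accepts the whole string."""
    return ORIGINAL.match(text) is not None


# The list with the repeated 3 is accepted, which is the problem described.
print(passes_original("12, 22, 3, 45, 3, 36"))
# A list containing the excluded number 44 is rejected by the lookahead.
print(passes_original("12, 44, 3"))
```

The negative lookahead only rules out the one hard-coded number, so nothing in the pattern compares the list items with each other.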

eugeny....@gmail.com

Oct 28, 2009, 4:47:06 AM
to Regex
> For example
>
> 12, 22, 3, 45, 3, 36
>
> should fail the validation since number 3 has been repeated.  Is that
> even possible?

I would try the opposite approach: "If a value appears twice in a
string, that string is no good." You just invert the logical test in
your code. The regex to test whether a CSV string contains duplicate
numbers would be

(\d+)[, ]+(?:\d+, )*?\1[, \r\n]+

Note that the spaces inside the pattern are literal and must be kept.

Let me know whether that approach works.
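
The inverted check can be sketched as follows. This is only an illustration in Python's re module (not .NET, and not the PowerGREP setup mentioned in the thread). The function names are hypothetical, and the second pattern, STRICT, is an assumption added here rather than something posted in the thread: it uses \b word boundaries so that only whole numbers are compared.

```python
import re

# Pattern as posted: a number, separators, optional other items, the same number again.
POSTED = re.compile(r"(\d+)[, ]+(?:\d+, )*?\1[, \r\n]+")

# Hypothetical stricter variant (not from the thread): compare whole numbers only.
STRICT = re.compile(r"\b(\d+)\b.*\b\1\b")


def has_duplicate_posted(text: str) -> bool:
    """True if the posted pattern finds a repeated number somewhere in text."""
    return POSTED.search(text) is not None


def has_duplicate_strict(text: str) -> bool:
    """True if the boundary-based variant finds a repeated whole number."""
    return STRICT.search(text) is not None


line = "12, 22, 3, 45, 3, 36"
print(has_duplicate_posted(line), has_duplicate_strict(line))

# Two cases where the variants disagree:
print(has_duplicate_posted("13, 3, 5"), has_duplicate_strict("13, 3, 5"))
print(has_duplicate_posted("3, 5, 3"), has_duplicate_strict("3, 5, 3"))
```

In this sketch the posted pattern reports "13, 3, 5" as containing a duplicate, because the 3 at the end of 13 is allowed to start a match, and it misses "3, 5, 3" because it requires a separator or line break after the second occurrence. Whether that matters depends on the input, for example whether every line ends with a newline.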

smartscottiedog

Oct 28, 2009, 11:25:28 AM
to Regex
First, thank you for replying.

I must say that I'm no good with regular expressions. The expression
above came from someone else, and I only understand bits of it. I
changed it as you recommended,

from

^(?!.*?\b44\b)(?:\s*(\d+)\s*,?)*$

to

^(?!.*?\b44\b)(?:(\d+)[, ]+(?:\d+, )*?\1[, \r\n]+)*$

Is this what you intended me to try out? A complete expression string
would be great.

And thanks again.
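
For anyone who wants a single complete expression, one possible sketch follows. It is an assumption added for illustration, not a pattern posted in this thread: the duplicate test is placed in a second negative lookahead, so the list part of the original pattern stays as it was. It is shown with Python's re module instead of .NET, and the names COMBINED, ATTEMPT and is_valid are hypothetical.

```python
import re

# Hypothetical combined pattern:
#   (?!.*?\b44\b)               reject the excluded number 44
#   (?!.*\b(\d+)\b.*\b\1\b)     reject any whole number that appears twice
#   (?:\s*\d+\s*,?)*            the comma-separated list itself
COMBINED = re.compile(r"^(?!.*?\b44\b)(?!.*\b(\d+)\b.*\b\1\b)(?:\s*\d+\s*,?)*$")

# The modified pattern from the post above, for comparison.
ATTEMPT = re.compile(r"^(?!.*?\b44\b)(?:(\d+)[, ]+(?:\d+, )*?\1[, \r\n]+)*$")


def is_valid(text: str) -> bool:
    """True if text is a list of numbers with no 44 and no repeated number."""
    return COMBINED.match(text) is not None


print(is_valid("12, 22, 3, 45, 3, 36"))  # list with a repeated 3
print(is_valid("12, 22, 3, 45, 36"))     # list with no repeats

# The attempted pattern requires the whole string to consist of duplicate
# groups, so it does not accept an ordinary list without repeats.
print(ATTEMPT.match("12, 22, 3, 45, 36") is not None)
```

Pasting the duplicate-finding pattern into the list part inverts the intent: a pattern written to find a repeat has to be negated, either in code or with a negative lookahead, before it can serve as validation.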


eugeny....@gmail.com

Oct 30, 2009, 6:22:32 AM
to Regex
Hi,
First, define your goal.
If it is to "not allow any duplicate numbers in the list", then you
already have my answer. I tested it in PowerGREP (where no code
wrapper is required at all) and it works. I hope it will work in your
environment as well, once you wrap it in the .NET-specific code.

Otherwise, if your goal is to "find a CSV line that contains the value
44 at least twice", then I would recommend the same expression I
posted before, with the first \d+ changed to 44.
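
As a rough illustration of that substitution, here is a sketch in Python's re module (an assumption; the thread itself targets .NET and PowerGREP). The name has_44_twice is hypothetical.

```python
import re

# The earlier duplicate pattern with the first \d+ replaced by the literal 44.
TWICE_44 = re.compile(r"(44)[, ]+(?:\d+, )*?\1[, \r\n]+")


def has_44_twice(text: str) -> bool:
    """True if 44 occurs, and occurs again later followed by a separator."""
    return TWICE_44.search(text) is not None


print(has_44_twice("44, 5, 44, 7"))  # 44 appears twice
print(has_44_twice("44, 5, 7"))      # 44 appears once
```

As with the general pattern, the second occurrence must be followed by a comma, space, or line break for the match to succeed.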

But why bother checking for duplicates of 44, then of 569, then of
19875624, and so on endlessly, when you already have a universal
solution in your hands?
From your first message I understood that you need a universal
solution.

--
Regards, Eugeny
