Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

how can I use randbetween without repeating numbers in a set

9,413 views
Skip to first unread message

Maria

unread,
Oct 21, 2009, 9:04:01 PM10/21/09
to

Gord Dibben

unread,
Oct 21, 2009, 9:16:09 PM10/21/09
to
See JE McGimpsey's site for a solution to this problem.

http://www.mcgimpsey.com/excel/udfs/randint.html

Note: requires use of VBA


Gord Dibben MS Excel MVP

On Wed, 21 Oct 2009 18:04:01 -0700, Maria <Ma...@discussions.microsoft.com>
wrote:


Ms-Xl-Learner

unread,
Oct 21, 2009, 9:24:55 PM10/21/09
to
On Oct 22, 6:04 am, Maria <Ma...@discussions.microsoft.com> wrote:
>

Increase the Maximum Number value in your Randbetween formula.

For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.

If you use the formula like this

=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)

Then you will not get any duplicates.

Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.

-------------------------
(Ms-Exl-Learner)
-------------------------

David Biddulph

unread,
Oct 22, 2009, 1:25:35 AM10/22/09
to
I'm confused by your reply to Maria.

Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability of
duplicates doesn't become zero until the range of numbers from which you are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is non-zero.]

Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph

David Biddulph

unread,
Oct 22, 2009, 1:29:05 AM10/22/09
to
I'm confused by your reply to Maria.

Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability of
duplicates doesn't become zero until the range of numbers from which you are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is non-zero.]

Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph

Ms-Exl-Learner

unread,
Oct 22, 2009, 4:06:04 AM10/22/09
to
David Sir I am not having any in depth knowledge about excel. I am learning
it and know some little bit about excel by practice. I have also come across
this issue and in that time I used to do like this to avoid duplicates.

Open a New work book and in A1 Cell paste this formula

=RANDBETWEEN(1,10)

Apply the formula upto A10, after that select the column or Range A1:A10 and
do copy and paste it as values. In B1 cell apply this formula
=COUNTIF(A:A,A1) and drag it upto B10. Now you can see the values “2” in B
Column. When we apply the Randbetween for 10 cells and if we mention the
Maximum Randbetween value as 10 then it’s creating duplicates.

Now open another new workbook and paste the below formula in A1 Cell

=RANDBETWEEN(1,100)
OR
=RANDBETWEEN(1,50)

As mentioned above repeat the same steps like pasting it upto A10 cell and
do copy and paste it as values and apply countif formula to check for
duplicates. Now all the countif results will be “1”.

So if we increase the Randbetween Maximum value depends upon the data then
it’s not creating any duplicate values. That is the reason I have suggested
it.

If this method is wrong then please guide me I will also stop trying this.

--------------------
(Ms-Exl-Learner)
--------------------

"David Biddulph" wrote:

> .
>

Mike H

unread,
Oct 22, 2009, 5:28:01 AM10/22/09
to
Maria,

I can't reply directly because you didn't put anything in the body of your
message so I'll use David's response to reply.

Try this macro, Set your maximum number to whatever you want but it must be
larger than the number of cells in the fill range.

Sub Marine()
Dim MyMax As Long
MyMax = 1000 'Change to suit
Dim FillRange As Range
Set FillRange = Range("A1:a100")
For Each c In FillRange
Do
c.Value = Int((MyMax * Rnd) + 1)
Loop Until WorksheetFunction.CountIf(FillRange, c.Value) < 2
Next
End Sub


Mike

"David Biddulph" wrote:

> .
>

David Biddulph

unread,
Oct 22, 2009, 7:30:08 AM10/22/09
to
In this case it's not a knowledge of Excel that counts, but a very basic
knowledge of elementary statistics.

To go back to Excel:
With your RANDBETWEEN(1,100), either recalculate a number of times (just hit
F9) or copy across a number of columns, and then see whether your COUNTIF is
always 1.
I copied this across all 256 column of Excel 2003 & counted in how many
columns out of the 256 there were duplicates. Again the value will change
when you recalculate but the values are typically between 70 and 120 columns
out of the 256 which contain duplicates with 10 samples from
RANDBETWEEN(1,10)

You can use Excel to calculate the probability, as follows.
When you've got a random number in row 1, the probability of the random
number in row 2 not being a duplicate is 99/100
If you've got 2 different random numbers in rows 1 & 2, the probability of
the random number in row 3 not being a duplicate of either of those is
98/100
If you've got 3 different random numbers in rows 1 to 3, the probability of
the random number in row 4 not being a duplicate of any of those is 97/100
and so on down to
If you've got 9 different random numbers in rows 1 to 9, the probability of
the random number in row 10 not being a duplicate of any of those is 91/100
Therefore the probability of there not being a duplicate in your 10 random
samples from 100 is the product of those 9 probabilities above, which works
out at 62.82%.
For interest I then worked out the expected (mean) number of columns with
duplicates out of my 256 columns, and this is =256*(1-62.82%), or about 95
(and 95 is the middle of the typical range of 70 to 120 which I quoted above
when I recalculated a number of times).

If you change from RANDBETWEEN(1,100) to your other suggestion of
RANDBETWEEN(1,50), the probability of no duplicates drops from 62.82% to
38.17%, and the expected number of columns with duplicates increases to
about 158 of the 256 (and typical values seen are between 143 and 173).

Q.E.D.
--
David Biddulph


"Ms-Exl-Learner" <Ms.Exl....@gmail.com> wrote in message
news:C0CFE96D-5D4F-4439...@microsoft.com...

Ms-Exl-Learner

unread,
Oct 22, 2009, 12:18:07 PM10/22/09
to
David Sir Thanks a lot for spending your valuable time in explaining /
guiding me by way of giving a brief valuable note about randbetween function.
Now it’s clear to me that the method which was followed by me will not stop
creating the duplicate values.

Before that I used to increase the randbetween maximum value depends upon
the number of cells for which I am going to apply. For example if I am
applying the randbetween formula for 10 cells then I used to square it like
(10*10) so previously I had the thought if I use the randbetween Maximum
value as 100 for 10 cells then it will not create any duplicates. In the same
case for 100 cells I was used 10000 as the maximum value (100*100) and
assumed that the squaring the number of cells and using that as the maximum
value in randbetween will rectify the duplication issue. But now only I
understood that this is not the right way to stop creating the duplicates.

Today I have learned another informative message about the method of using
the Randbetween Function from your post.

Once again Thank you very much!!!

--------------------
(Ms-Exl-Learner)
--------------------

"David Biddulph" wrote:

> .
>

Bernd P

unread,
Oct 29, 2009, 7:31:49 AM10/29/09
to
Hello,

> See JE McGimpsey's site for a solution to this problem.
>
> http://www.mcgimpsey.com/excel/udfs/randint.html

A faster and more flexible RandInt you can find here:
http://sulprobil.com/html/randint.html

Regards,
Bernd

Joe User

unread,
Oct 29, 2009, 10:41:41 AM10/29/09
to
"Bernd P" <bplu...@gmail.com> wrote:
> A faster and more flexible RandInt you can find here:
> http://sulprobil.com/html/randint.html

Test the error paths. They all return #VALUE instead of the intended error.
Hint: compare the results of the following.

Function test1() As Long()
test1 = CVErr(xlErrNum)
End Function

Function test2()
test2 = CVErr(xlErrNum)
End Function

Bernd P

unread,
Oct 29, 2009, 11:28:29 AM10/29/09
to
> Test the error paths.  They all return #VALUE instead of the intended error.

Hello,

Thank you. Corrected.

Regards,
Bernd

Tushar Mehta

unread,
Oct 29, 2009, 11:56:29 AM10/29/09
to
For a variety of options see
Select elements at random without repetition
http://www.tushar-mehta.com/excel/newsgroups/rand_selection/index.html

On Wed, 21 Oct 2009 18:04:01 -0700, Maria
<Ma...@discussions.microsoft.com> wrote:

Regards,

Tushar Mehta
Microsoft MVP Excel 2000-present
www.tushar-mehta.com
Excel and PowerPoint tutorials and add-ins

Joe User

unread,
Oct 29, 2009, 6:45:29 PM10/29/09
to
"Bernd P" <bplu...@gmail.com> wrote:
> A faster and more flexible RandInt you can find here:
> http://sulprobil.com/html/randint.html

I don't think the "late initialization" algorithm works correctly when the
lMin-to-lMax range includes zero.

I am not taking the time to prove it by testing. But consider the following
mental experiment with lMin < 0, lMax = 0, and lRept = 1.

Assume the first random position is lT(x) such that 1 <= x < lRange; thus,
the first random integer is x-1+lMin. Then lT(x) is replaced with
lT(lRange). Since lT(lRange) is zero (uninitialized), lT(x) is replaced
with lRange-i+lMin, which is zero. So far, so good.

Now assume the second random position is the same lT(x) coincidentally. In
that case, the second random integer should be zero; that is, it should be
lT(lRange) that was copied down. But since lT(x) is zero, it will be
interpreted as uninitialized. So the second random integer will be x-1+lMin
again.

Not only is that not the intended random integer, but also it violates the
lRept requirement of one.

If you agree, I think the simplest solution is to disable the "late
initialization" algorithm when the lMin-to-Lmax range includes zero. To
wit:

If lRange < CLateInit Or (lMin <= 0 And lMax >= 0) Then '[sic]
For i = 1 To lRange
lT(i) = Int((i - 1) / lRept) + lMin
Next i
i = 1
For lRow = 1 To UBound(lR, 1)
For lCol = 1 To UBound(lR, 2)
lRnd = Int(((lRange - i + 1) * Rnd) + 1)
lR(lRow, lCol) = lT(lRnd)
lT(lRnd) = lT(lRange - i + 1)
i = i + 1
Next lCol
Next lRow
Else
'If we have a huge range of possible random integers and a
'comparably small number of draws, i.e. if
'(lMax - lMin) * lRept >> lCount
'then we can save some runtime with late initialisation.
i = 1
[...etc...]


PS: I do not agree that "lRange < CLateInit" is sufficient to test the
condition (lMax-lMin)*lRept >> lCount. But that's a matter of opinion.

Bernd P

unread,
Oct 30, 2009, 2:19:11 AM10/30/09
to
Hello again,

On 29 Okt., 23:45, "Joe User" <joeu2004> wrote:
> ...


> I don't think the "late initialization" algorithm works correctly when the
> lMin-to-lMax range includes zero.

> ...

Thanks for spotting. I corrected that.

> ...


> If you agree, I think the simplest solution is to disable the "late
> initialization" algorithm when the lMin-to-Lmax range includes zero.

> ...

I do not agree. A runtime check revealed that there is enough time to
shift the results.

> ...


> PS:  I do not agree that "lRange < CLateInit" is sufficient to test the
> condition (lMax-lMin)*lRept >> lCount.  But that's a matter of opinion.

I agree and I changed it - now it's a border for a ratio (lRange /
lCount). But it's still a matter of opinion...

Thanks again for your help.

Regards,
Bernd

0 new messages