Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

RANDOM # around a MEAN with STANDARD DEVIATION

892 views
Skip to first unread message

Tanim

unread,
Sep 11, 2007, 10:08:01 PM9/11/07
to
In Excel 2007, how can I create random numbers around a mean of 7 or some
other specific number?

I have to generate 36 random numbers around the mean of 6 with standard
deviation of 10.

Also how can I change the seed in 2007?

Your help is greatly appreciated.

joeu2004

unread,
Sep 11, 2007, 10:21:31 PM9/11/07
to
On Sep 11, 7:08 pm, Tanim <Ta...@discussions.microsoft.com> wrote:
> I have to generate 36 random numbers around the mean of 6 with
> standard deviation of 10.

=norminv(rand(), 6, 10)


Tanim

unread,
Sep 11, 2007, 10:30:00 PM9/11/07
to
I have tried it but it kept highlighting the 6 for some reason and the
formula isn't working. I can't get out of the formula box.

joeu2004

unread,
Sep 11, 2007, 11:55:39 PM9/11/07
to
On Sep 11, 7:30 pm, Tanim <Ta...@discussions.microsoft.com> wrote:
> I have tried it but it kept highlighting the 6 for some reason and
> the formula isn't working. I can't get out of the formula box.

I cannot imagine what your usage error is; most likely a typo. Try
this:

1. Go back to my previous posting, carefully highlight the formula
from "=" to the end of line, and right-click Copy (or type ctrl-C).

2. Highlight a cell in your spreadsheet, and right-click Paste (or
type ctrl-V).

Press F9 many times to verify that you are getting random results in
the range that you expect.

Mike Middleton

unread,
Sep 12, 2007, 12:15:27 AM9/12/07
to
Tanim -

(1) For random numbers from a normal distribution, use =NORMINV(RAND(),6,10)

(2) You cannot change the seed of the worksheet function RAND. (a) If you
write Excel VBA, you can use the Excel VBA RND function for random values of
your custom user-defined function, and you can use the Excel VBA RANDOMIZE
statement to set the seed. (b) Third-party add-ins for Excel allow you to
specify the seed for their random number generator functions, e.g., my
RiskSim add-in with free tryout available from www.treeplan.com.

- Mike Middleton
http://www.DecisionToolworks.com
Decision Analysis Add-ins for Excel

"Tanim" <Ta...@discussions.microsoft.com> wrote in message
news:2EF60B8A-5611-48E1...@microsoft.com...

Bernd P

unread,
Sep 12, 2007, 6:39:53 AM9/12/07
to
Hello,

On average this might be true, but doesn't the OP need 36 numbers with
a mean of 6 and a STDEV of 10 exactly?

Regards,
Bernd

Dana DeLouis

unread,
Sep 12, 2007, 4:53:43 PM9/12/07
to
Just to add. Under "Data Analysis", you can select "Random Number
Generation."
You can pick a "Normal" distribution, along with Number of points, Mean, SD,
and Random Seed.

--
Dana DeLouis
Windows XP & Excel 2007


"Tanim" <Ta...@discussions.microsoft.com> wrote in message
news:2EF60B8A-5611-48E1...@microsoft.com...

joeu2004

unread,
Sep 12, 2007, 7:43:54 PM9/12/07
to
On Sep 12, 3:39 am, Bernd P <bplumh...@gmail.com> wrote:
> On average this might be true, but doesn't the OP need 36
> numbers with a mean of 6 and a STDEV of 10 exactly?

I bristle at the word "exactly" in this context. But otherwise ....

Good point! I think it's a matter of interpretation. Mine et al: 36
random numbers from the distribution with a mean 6 and sd of 10; or
yours: 36 random numbers that have a mean of 6 and sd of 10.

For the latter, perhaps the following meets the requirement:

A1: =norminv(rand(), m, s)
A2: =2*m - A1

where "m" is 6 and "s" is 10, or references to cells with those
values.

Copy A1:A2 down through A36. That will produce a mean of 6. But it
does not necessarily produce an sd of 10. One fix: replace A35 with
the following array formula (commit with ctrl-shift-Enter):

=sqrt((s^2 - sum((A1:A34-m)^2/n))*n/2) + m

where "n" is 36 or a reference to a cell with that value.

That will generate 36 numbers with a mean of 6 and sd (STDEVP) of 10
"exactly", within the accuracy of binary computers.

But, someone argue, that is only 16 random numbers, not 36.

I do not believe we can generate exactly 36 random numbers and ensure
that they meet the criteria as you interpret them, namely: the 36
numbers have the required mean and sd.

Arguably, we could generate 34 random numbers and only 2 dependent
numbers that meet that criteria. But I think that runs the risk that
one or both of the 2 dependent numbers are extreme outliers. (And
that __still__ is not "36 random number" exactly.)

Alternatively, using my methodology, we could generate 74 numbers, 36
of which are random. But we must include all 74 numbers in the
solution in order to ensure that the criteria are met. That does not
sound like the solution the OP is looking for.

David Biddulph

unread,
Sep 12, 2007, 7:57:43 PM9/12/07
to
And even forgetting the fact that you've only got 16 random numbers, at your
A35 stage you may find yourself with the
((s^2 - sum((A1:A34-m)^2/n))*n/2) term negative, in which case you wouldn't
be able to achieve STDEV of 10.

I had previously thought along partly similar lines, producing 34 random
numbers and intending to set the 35th and 36th to achieve the mean and STDEV
required, but had discarded the idea because the spread of the first 34
might be so large as to prevent the STDEV being achievable.
--
David Biddulph

"joeu2004" <joeu...@hotmail.com> wrote in message
news:1189640634.3...@y42g2000hsy.googlegroups.com...

joeu2004

unread,
Sep 13, 2007, 12:06:42 AM9/13/07
to
On Sep 12, 4:57 pm, "David Biddulph" <groups [at] biddulph.org.uk>
wrote:

> at your A35 stage you may find yourself with the
> ((s^2 - sum((A1:A34-m)^2/n))*n/2) term negative, in which
> case you wouldn't be able to achieve STDEV of 10.

I thought about that, too; so at one point, the formula was (s -
sqrt(v))^2*n/2, where "v" is sum((A1:A34-m)^2)/n. (The "/n" does need
to be inside the SUM(). It was a typo, although they are the same
mathematically -- albeit not necessarily so "numeric analytically".)

But I concluded that I can prove that s^2 is always greater than "v"
-- ergo, the sqrt argument will never be negative. The derivation of
the formula might be insightful (or insiteful <smile>). We want:

s^2 = varp(A1:A36) = sum( (A1:A36-m)^2 ) / n
= sum( (A1:A34-m)^2 ) / n + ( (A35-m)^2 + (A36-m)^2 ) / n

That is by definition. Since all the terms of the sum are non-
negative, sum(...)/n, the partial variance, must be less than or equal
to s^2, the total variance. Therefore, s^2 - sum(...)/n must be non-
negative.

(For the rest of the derivation, see the Endnotes below.)

But aha! I just generated a random set of numbers where that is not
true(!). What's wrong with my proof?

Perhaps nothing. I suspect that the binary arithmetic error is
catching up to me. I did not expect that for such a small set of
numbers -- but I should have written the formula to accomodate it
anyway.

So the formula for A35 should be the array formula (commit with ctrl-
shift-Enter):

=sqrt( ( s - sqrt( sum( (A1:A34-m)^2 )/n ) )^2 *n/2 ) + m

But even with that correction, I get a 1% error difference in the
expected v. actual sd with one particular set of random numbers.

I guess that proves the point I made about "bristling" at Bernd's use
of the word "exactly" <wink>.


> I had previously thought along partly similar lines, producing 34 random
> numbers and intending to set the 35th and 36th to achieve the mean and STDEV
> required, but had discarded the idea because the spread of the first 34
> might be so large as to prevent the STDEV being achievable.

That was the point I made about "extreme outliers".

I don't think my method avoids extreme outliers completely. But I do
think they are less likely. That is merely based on horrible
empirical experience with the other method (the paradigm: 34 random
numbers plus 2 dependent numbers).


-----

Endnotes

To complete the derivation ....

We select A35 and A36 in the same that we selected the other pairwise
"random" values, namely:

A36 = 2*m - A35

So ....

s^2 = varp(A1:A36) = sum( (A1:A36-m)^2 ) / n
= sum( (A1:A34-m)^2 ) / n + ( (A35-m)^2 + (A36-m)^2 ) / n

s^2 - sum((A1:A34-m)^2)/n = ( (A35-m)^2 + (2*m-A35-m)^2 )/n
= ( (A35-m)^2 + (m-A35)^2 )/n
= 2*(A35-m)^2/n

Somehow, I think that last term should have been obvious to me. But I
had not stopped to think about it.

So ....

(s^2 - sum((A1:A34-m)^2)/n) * n / 2 = (A35-m)^2

sqrt( (s^2 - sum((A1:A34-m)^2)/n) * n / 2 ) + m = A35

QED

joeu2004

unread,
Sep 13, 2007, 12:48:30 AM9/13/07
to
On Sep 12, 9:06 pm, I wrote:
> So the formula for A35 should be the array formula (commit with
> ctrl-shift-Enter):
> =sqrt( ( s - sqrt( sum( (A1:A34-m)^2 )/n ) )^2 *n/2 ) + m

Hmm, I don't think so. Obviously s^2 - v does not equal (s -
sqrt(v))^2.

Oh well, back to the drawing board. As I noted previously, I believe
the (original) formula is mathematically correct. But I need to deal
with the numerical error caused by binary computer arithmetic. Sorry,
don't have the time for this now. Perhaps someone else can offer a
solution.

joeu2004

unread,
Sep 13, 2007, 3:25:02 AM9/13/07
to
On Sep 12, 9:06 pm, joeu2004 <joeu2...@hotmail.com> wrote:
> On Sep 12, 4:57 pm, "David Biddulph" <groups [at] biddulph.org.uk>
> wrote:
> > at your A35 stage you may find yourself with the
> > ((s^2 - sum((A1:A34-m)^2/n))*n/2) term negative, in which
> > case you wouldn't be able to achieve STDEV of 10.

Of course you are right.


> I thought about that, too [...].


> But I concluded that I can prove that s^2 is always greater than "v"

> [....] Since all the terms of the sum are non-negative, sum(...)/n,


> the partial variance, must be less than or equal to s^2, the total
> variance. Therefore, s^2 - sum(...)/n must be non-negative.

Of course that is wrong. I confused myself between what would be true
__if__ sd were 10 and what __is__ true given the sd of the 34
generated data points. Klunk!

I will write 10,000 times "I will try not to solve math problems while
I am rushing to get ready for a trip". Double klunk!


> I had previously thought along partly similar lines, producing 34
> random numbers and intending to set the 35th and 36th to achieve
> the mean and STDEV required, but had discarded the idea because
> the spread of the first 34 might be so large as to prevent the STDEV
> being achievable.

Now I wonder if even that is feasible. After all, isn't that
essentially what I tried to do? It is easy to meet the mean criteria,
at least by my method. But how else would you meet the sd criteria,
other than with my method, which does not work?

It seems that the solution must ensure that the variance of the 34
data values (actually 34/36 of the variance) does not exceed the
desired variance of the 36 data values. I 'spose we could iterate in
a macro until that condition is met. But I generally avoid iterative
implementations that are not guaranteed to terminate in a finite (and
reasonable) time. Of course, we could put a limit on the number of
iterations, returning #NUM if we fail. Hmm, that sounds familiar
<wink>.

Oh well....

Bernd P

unread,
Sep 13, 2007, 4:25:25 AM9/13/07
to
Hello again,

Why not just producing 36 random numbers,
then "shifting" them to get average 6,
finally "zooming" them to get sd 10?

Regards,
Bernd

Bernd P

unread,
Sep 13, 2007, 5:48:17 AM9/13/07
to
Put into A1:C3:
Mean =AVERAGE(B3:B38) =AVERAGE(C3:C38)
STDEV =STDEV(B4:B38) =STDEV(C4:C38)
Data =RAND() =6+(B3-$B$1)*10/$B$2

Then copy B3:C3 down to B38:C38.

In column C you will have your 36 random numbers which show exactly a
mean of 6 and a stdev of 10.

Regards,
Bernd

David Biddulph

unread,
Sep 13, 2007, 6:33:40 AM9/13/07
to
Yes, though I think you probably intended your STDEV formulae to refer to
rows 3:38, not 4:38?

And of course if you change B3:B38 from =RAND() to the previously suggested
=NORMINV(RAND(), 6, 10) then you can see the difference between the original
"random sample from a distribution" approach and the revised version.
--
David Biddulph

"Bernd P" <bplu...@gmail.com> wrote in message
news:1189676897....@22g2000hsm.googlegroups.com...

Bernd P

unread,
Sep 13, 2007, 8:37:02 AM9/13/07
to
Hello David,

Thanks. You are right.

Regards,
Bernd

joeu2004

unread,
Sep 13, 2007, 10:04:56 AM9/13/07
to

Excellent, with David's corrections! I would use STDEVP() in this
context instead of STDEV(). But no matter, as long as the OP is
consistent in his usage.

0 new messages