Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Generate a danom bumber that follow a certain probability mass distribribution

2 views
Skip to first unread message

Maurizio Cerrato

unread,
Jun 29, 2009, 5:53:46 AM6/29/09
to
Hi,

I want to generate a random rumber (say between 1 and 7), but the
value generate must follow a custom probability mass distribution,
for example it' should have 10% to generate 1, 20% to generate a 7 and
60% to generate a 6.
Thanks for help

Pascal J. Bourguignon

unread,
Jun 29, 2009, 6:27:51 AM6/29/09
to

If you start from an equiprobable random number generator, you can
implement a weighted random number generated by summing the weights,
and dispatching the equiprobable random value:

(defun weighted-random (weights)
"Returns an random integer between 0 and (1- (length weights)) inclusive
with a probability given by the weigths.
P(i) = (/ (elt weights i) (reduce (function +) weights))."
(let ((erv (random (reduce (function +) weights))))
(loop
for i from 0
for w in weights
do (if (< erv w)
(return i)
(decf erv w)))))


(histogram (loop repeat 100 collect (weighted-random '(1 4 9 25 9 4 1))))
--> #(3 6 14 47 23 5 2)

(histogram (loop repeat 1000 collect (weighted-random '(1 4 9 25 36 25 9 4 1))))
--> #(9 25 80 222 308 229 86 28 13)

(histogram (loop repeat 100000 collect (weighted-random (mapcar (lambda (x) (* x 4))
'(0 10 10/4 10/4 10/4 10/4 60 20)))))
--> #(0 9828 2393 2483 2527 2581 59829 20359)

--
__Pascal Bourguignon__

Kai-Uwe Bux

unread,
Jun 29, 2009, 7:21:55 PM6/29/09
to
Maurizio Cerrato wrote:

Well, 10+20+60 = 90. I think, you want the probabilities add up to 100%.

For small examplex with known weights, you can get away with ad-hoc methods.
E.g., for the above (assuming that you want the remaining 10% to be even
distributed among the numbers 2,3,4, and 5) you could do the following:

generate a uniform random integer in [0,200) and output after mapping:

[0,20) --> 1
[20,60) --> 7
[60,180) --> 6
[180,185) --> 2
[185,190) --> 3
[190,195) --> 4
[195,200) --> 5


If the weights are unknown, huge in number, but fixed and do not change, you
should read Knuth: The Art o Computer Programming, Vol 2, Chapter 3.4.1
Section A where Walker's method is explained.

If the weights can change, you might want to google for the "hat container
class" by "AngleWyrm". This class implements a good scheme that allows for
logarithmic update of weights. Knuth also gives a reference to a more
complicated method that allows for constant expected time updates of
weights.


Best

Kai-Uwe Bux

Gene

unread,
Jul 2, 2009, 11:22:19 PM7/2/09
to

Pascal's suggestion is called the method of inverses. This is because
you create a graph of the inverse cumulative density function, then
use this curve to "look up" the aribitrarily distributed random value
given a pseudorandom uniformly distributed value in [0..1]. The
inverse cumulative density of an empircal distribution like yours is
very easy to construct. See any good textbook on discrete event
simulation. This one has a nice discussion:

http://www.amazon.com/Simulation-Modeling-Analysis-Expertfit-Software/dp/0073294411/ref=sr_1_2?ie=UTF8&s=books&qid=1246591187&sr=1-2

0 new messages