I want to generate a random rumber (say between 1 and 7), but the
value generate must follow a custom probability mass distribution,
for example it' should have 10% to generate 1, 20% to generate a 7 and
60% to generate a 6.
Thanks for help
If you start from an equiprobable random number generator, you can
implement a weighted random number generated by summing the weights,
and dispatching the equiprobable random value:
(defun weighted-random (weights)
"Returns an random integer between 0 and (1- (length weights)) inclusive
with a probability given by the weigths.
P(i) = (/ (elt weights i) (reduce (function +) weights))."
(let ((erv (random (reduce (function +) weights))))
(loop
for i from 0
for w in weights
do (if (< erv w)
(return i)
(decf erv w)))))
(histogram (loop repeat 100 collect (weighted-random '(1 4 9 25 9 4 1))))
--> #(3 6 14 47 23 5 2)
(histogram (loop repeat 1000 collect (weighted-random '(1 4 9 25 36 25 9 4 1))))
--> #(9 25 80 222 308 229 86 28 13)
(histogram (loop repeat 100000 collect (weighted-random (mapcar (lambda (x) (* x 4))
'(0 10 10/4 10/4 10/4 10/4 60 20)))))
--> #(0 9828 2393 2483 2527 2581 59829 20359)
--
__Pascal Bourguignon__
Well, 10+20+60 = 90. I think, you want the probabilities add up to 100%.
For small examplex with known weights, you can get away with ad-hoc methods.
E.g., for the above (assuming that you want the remaining 10% to be even
distributed among the numbers 2,3,4, and 5) you could do the following:
generate a uniform random integer in [0,200) and output after mapping:
[0,20) --> 1
[20,60) --> 7
[60,180) --> 6
[180,185) --> 2
[185,190) --> 3
[190,195) --> 4
[195,200) --> 5
If the weights are unknown, huge in number, but fixed and do not change, you
should read Knuth: The Art o Computer Programming, Vol 2, Chapter 3.4.1
Section A where Walker's method is explained.
If the weights can change, you might want to google for the "hat container
class" by "AngleWyrm". This class implements a good scheme that allows for
logarithmic update of weights. Knuth also gives a reference to a more
complicated method that allows for constant expected time updates of
weights.
Best
Kai-Uwe Bux
Pascal's suggestion is called the method of inverses. This is because
you create a graph of the inverse cumulative density function, then
use this curve to "look up" the aribitrarily distributed random value
given a pseudorandom uniformly distributed value in [0..1]. The
inverse cumulative density of an empircal distribution like yours is
very easy to construct. See any good textbook on discrete event
simulation. This one has a nice discussion: