Generate random numbers with fixed sum and different constraints

Dmitrey Yershov

unread,

Nov 14, 2012, 9:19:17 AM11/14/12

to

Hello. I need to generate non-negative rundom numbers sum of which is equal to 1. Each number xi is constrained: ai<=xi<=bi. How can I do this? Similar question was solved here

http://www.mathworks.com/matlabcentral/fileexchange/9700

but in this alghorithm a<=xi<=b (a and b are the same for all xi). Any ideas?

Greg Heath

unread,

Nov 17, 2012, 6:22:09 AM11/17/12

to

"Dmitrey Yershov" <pierrev...@mail.ru> wrote in message <k80995$9eo$1...@newscl01ah.mathworks.com>...

> Hello. I need to generate non-negative rundom numbers sum of which is equal to 1. Each number xi is constrained: ai<=xi<=bi. How can I do this? Similar question was solved here
>
> http://www.mathworks.com/matlabcentral/fileexchange/9700
>
> but in this alghorithm a<=xi<=b (a and b are the same for all xi). Any ideas?

Z = a + (b-a)*rand(m,n);

sumZ = repmat(sum(Z),m,1);

I'll let you figure out the rest.

Hope this helps

Greg

Roger Stafford

unread,

Nov 17, 2012, 1:32:16 PM11/17/12

to

"Greg Heath" <he...@alumni.brown.edu> wrote in message <k87s11$h6u$1...@newscl01ah.mathworks.com>...

- - - - - - - - - -
I'd like to know the answer to that too, Greg. Suppose your m = 3, n = 1, a = 0, and b = 2/3, and suppose z comes up randomly with z = [.6;.1;.1] as is possible. How is that point supposed to be projected onto a plane so as to have a sum of 1? A simple division by its sum(z) = .8 gives [.75;.125;.125] which exceeds the stated limit.

The space of points in this case having a sum of 1 within the permitted three-dimensional cube cuts it in half in a planar hexagon, and it is difficult to see how to project all points in the cube onto this hexagon in a simple manner using just the sum(z), never mind doing so in an area-wise uniform manner throughout the hexagon.

This is one of the reasons I went to the trouble of writing 'randfixedsum' which breaks such a hexagon into triangles and deals with each separately. However doing a similar thing with an n-dimensional rectangular box other than an n-dimensional cube as Dmitrey wishes to do is a project involving much more complicated simplex structure. At the moment I have no idea how to set about such a task.

Roger Stafford

Bruno Luong

unread,

Nov 17, 2012, 2:59:12 PM11/17/12

to

"Roger Stafford" wrote in message <k88l7g$7tm$1...@newscl01ah.mathworks.com>...

> "Greg Heath" <he...@alumni.brown.edu> wrote in message <k87s11$h6u$1...@newscl01ah.mathworks.com>...

>

> This is one of the reasons I went to the trouble of writing 'randfixedsum' which breaks such a hexagon into triangles and deals with each separately. However doing a similar thing with an n-dimensional rectangular box other than an n-dimensional cube as Dmitrey wishes to do is a project involving much more complicated simplex structure. At the moment I have no idea how to set about such a task.

Might be convert linear inequality into hull vertices. Use Delaunay to decompose the convex polytopes into union of simplexes then apply the barycenter coordinates to generate uniform distribution on simplexes. Put all that together, it should be able to generate the uniform distribution with required constraints.

Bruno

Roger Stafford

unread,

Nov 17, 2012, 8:44:16 PM11/17/12

to

"Bruno Luong" <b.l...@fogale.findmycountry> wrote in message <k88qag$nm1$1...@newscl01ah.mathworks.com>...

> Might be convert linear inequality into hull vertices. Use Delaunay to decompose the convex polytopes into union of simplexes then apply the barycenter coordinates to generate uniform distribution on simplexes. Put all that together, it should be able to generate the uniform distribution with required constraints.
>
> Bruno

- - - - - - - - - -

Yes, that is a conceivable approach, Bruno. However it faces some formidable difficulties with large dimensionality. For an n-dimensional cube with n equal to, say, 51 the number of vertices in the n-1 dimensional polytope with a fixed sum set at or near the half-way point would be 51!/25!^2 = 6,446,940,928,325,352. This would present quite a challenge to 'delaunayn'! Moreover, besides providing for the uniform distribution of each simplex, one must also choose different simplices in proportion to their n-1 dimensional volumes, and certainly with the unequal bound values Dmitrey has requested there would be a vast number of different volumes among them to compute. For a practical program to be able to handle large values of n there should be some underlying symmetry principle that greatly simplifies such proceedings as these, and at the moment I can't think what that might be for
unequal bounding intervals.

Roger Stafford

Greg Heath

unread,

Nov 17, 2012, 9:05:23 PM11/17/12

to

"Roger Stafford" wrote in message <k88l7g$7tm$1...@newscl01ah.mathworks.com>...

> "Greg Heath" <he...@alumni.brown.edu> wrote in message <k87s11$h6u$1...@newscl01ah.mathworks.com>...
> > "Dmitrey Yershov" <pierrev...@mail.ru> wrote in message <k80995$9eo$1...@newscl01ah.mathworks.com>...
> > > Hello. I need to generate non-negative rundom numbers sum of which is equal to 1. Each number xi is constrained: ai<=xi<=bi. How can I do this? Similar question was solved here
> > >
> > > http://www.mathworks.com/matlabcentral/fileexchange/9700
> > >
> > > but in this alghorithm a<=xi<=b (a and b are the same for all xi). Any ideas?
> >
> > Z = a + (b-a)*rand(m,n);
> >
> > sumZ = repmat(sum(Z),m,1);
> >
> > I'll let you figure out the rest.
> >
> > Hope this helps
> >
> > Greg
> - - - - - - - - - -
> I'd like to know the answer to that too, Greg. Suppose your m = 3, n = 1, a = 0, and b = 2/3, and suppose z comes up randomly with z = [.6;.1;.1] as is possible. How is that point supposed to be projected onto a plane so as to have a sum of 1? A simple division by its sum(z) = .8 gives [.75;.125;.125] which exceeds the stated limit.

WHOOPS! Silly mistake.

Please pardon my youthful exuberance.

Greg

P.S. The OP did not indicate that the number of points is specified. Was that an unintended omission?

Dmitrey Yershov

unread,

Nov 18, 2012, 12:04:09 PM11/18/12

to

"Roger Stafford" wrote in message <k89ehg$rm6$1...@newscl01ah.mathworks.com>...

Thanks to everyone who reply! In fact, I'll be happy if I have method for n<=5 which is able to generate 1000 points in less than 2 sec (processor 2,4 Ghs).

Bruno Luong

unread,

Nov 18, 2012, 1:46:10 PM11/18/12

to

I have not fully tested, but it should go like this:

% lower + upper bounds
lo = [1 0 2 1 0];
up = [5 6 4 4 2];
% target sum
s = 15;
% number of samples
n = 1000;

m = length(lo);
slo = sum(lo);
sup = sum(up);

x0 = interp1([slo; sup], [lo; up], s);
if any(isnan(x0))
error('no solution exists')
end

x0 = x0(:);
B = null(ones(size(lo)));
A = [-B; B];
b = [x0-lo(:); up(:)-x0];

% FEX: http://www.mathworks.com/matlabcentral/fileexchange/30892
% Thank you Matt
[V,nr,nre] = lcon2vert(A,b,[],[],tol);

T = delaunayn(V);
P = V(T,:);
P = reshape(P, [size(T) m-1]);
P = permute(P, [3 2 1]);
np = size(P,3);

% http://www.mathworks.com/matlabcentral/fileexchange/9700
% % Thank you Roger
Q = bsxfun(@minus, P, P(:,end,:));
Q(:,end,:) = [];
vol = arrayfun(@(k) det(Q(:,:,k)), 1:np);
vol = abs(vol);
w = randfixedsum(m,n,1,0,1);

% generate n samples of v with probability vol
vol = vol / sum(vol);
c = cumsum(vol); c(end)=1;
[~, i] = histc(rand(1,n),[0 c]);

V = P(:,:,i);
y = zeros(m-1, n);
for k=1:n
y(:,k) = V(:,:,k)*w(:,k);
end
x = bsxfun(@plus, x0, B*y);

% Bruno

Bruno Luong

unread,

Nov 19, 2012, 1:39:13 AM11/19/12

to

Sorry, I miss the definition of the variable 'tol' in my previous posted code; here we go again:

% lower + upper bounds
lo = [1 0 2 1 0];
up = [5 6 4 4 2];
% target sum
s = 15;
% number of samples
n = 1000;

%% Engine

m = length(lo);
slo = sum(lo);
sup = sum(up);

% A particular solution, x0

x0 = interp1([slo; sup], [lo; up], s);
if any(isnan(x0))
error('no solution exists')
end

% feasible set: S = { x=x0+A*y : A*y <= b }

x0 = x0(:);
B = null(ones(size(lo)));
A = [-B; B];
b = [x0-lo(:); up(:)-x0];

% FEX: http://www.mathworks.com/matlabcentral/fileexchange/30892

tol = 1e-6;

[V,nr,nre] = lcon2vert(A,b,[],[],tol);

% Split S in simplex

T = delaunayn(V);
P = V(T,:);
P = reshape(P, [size(T) m-1]);
P = permute(P, [3 2 1]);
np = size(P,3);

% Compute the volume of simplexes

Q = bsxfun(@minus, P, P(:,end,:));
Q(:,end,:) = [];
vol = arrayfun(@(k) det(Q(:,:,k)), 1:np);
vol = abs(vol);

% generate n samples (i) of v with probability ~ vol

vol = vol / sum(vol);
c = cumsum(vol); c(end)=1;
[~, i] = histc(rand(1,n),[0 c]);

% Random rarycentric coordinates
% FEX: http://www.mathworks.com/matlabcentral/fileexchange/9700
w = randfixedsum(m,n,1,0,1);

% Put together

V = P(:,:,i);
y = zeros(m-1, n);
for k=1:n
y(:,k) = V(:,:,k)*w(:,k);
end

% final result, m x n array of random vector

David Epstein

unread,

Nov 19, 2012, 5:26:09 AM11/19/12

to

"Bruno Luong" <b.l...@fogale.findmycountry> wrote in message <k8badi$8rd$1...@newscl01ah.mathworks.com>...

I would be interested to know how this approach compares with straightforward rejection sampling, that is, not cutting up into simplices. If the dimension is n, the relevant subspace is defined using 2*n inequalities and one equality, so, in dimension 51 only 102 inequalities. I was thinking of the following procedure:

1. Let f be the change of scale map from Dmitrey's box B to the unit cube U. f sends Dimitrey's probability measure to the standard probability measure.
2. Let V be the subspace of B defined by one linear equality and 2n inequalities. Then f(V) in U is defined by one linear equality and 2n inequalities.
3. Use Gram-Schmidt to construct linear isometry g that sends U into an n-dimensional euclidean space Z and f(V) into the subspace z_1=0. The input data for Gram-Schmidt has its first vector v a unit vector orthogonal to f(V), followed by the standard basis, orthogonal to the various (n-1)-dimensional faces of U. (Perhaps omit the basis vector nearest to v.)
4. Using max and min, find the smallest rectangle R in Z containing g(U), and hence a rectangle S, obtained from R by setting z_1=0. S contains g(f(V)).
5. Generate a uniform random sample in S and keep only those satisfying the appropriate 2*n inequalities.

I suppose that the problem with this method is that the volume of g(f(V)) may be small compared with the volume of S. But maybe the ratio could be improved by tricks, for example "adaptive sampling"---whenever a random point p of S turns out not to lie in g(f(V)), then we can use the coordinates of p to try to chop off part of S while maintaining the rectangular shape relative to the coordinate axes.

@Roger: what were your reasons for rejecting this approach in your randfixedsum package on FEX?

David

Dmitrey Yershov

unread,

Nov 19, 2012, 5:58:07 AM11/19/12

to

"Bruno Luong" <b.l...@fogale.findmycountry> wrote in message <k8ck6h$lhp$1...@newscl01ah.mathworks.com>...

Thank you very much, Bruno! I think that your procedure is the card! BUT now when I try to generate points with the following input parameters:

lo = [0 0 0];
up = [2 1 2];
s = 2;
n = 1000;

it gives me the error:

Error using qhullmx
QH6154 qhull precision error: initial facet 1 is coplanar with the interior point
ERRONEOUS FACET:

While executing: | qhull d Qt Qbb Qc
Options selected for Qhull 2010.1 2010/01/14:
run-id 1283078028 delaunay Qtriangulate Qbbound-last Qcoplanar-keep
_pre-merge _zero-centrum Qinterior-keep Pgood _max-width 2.7
Error-roundoff 3.8e-15 _one-merge 2.7e-14 Visible-distance 7.6e-15
U-coplanar-distance 7.6e-15 Width-outside 1.5e-14 _wide-facet 4.5e-14

The input to qhull appears to be less than 3 dimensional, or a
computation has overflowed.

Qhull could not construct a clearly convex simplex from points:

The center point is coplanar with a facet, or a vertex is coplanar
with a neighboring facet. The maximum round off error for
computing distances is 3.8e-15. The center point, facets and distances
to the center point are as follows:

facet p1 p3 p0 distance= 0
facet p2 p3 p0 distance= -1.1e-16
facet p2 p1 p0 distance= -2.2e-16
facet p2 p1 p3 distance= -1.1e-16

These points either have a maximum or minimum x-coordinate, or
they maximize the determinant for k coordinates. Trial points
are first selected from points that maximize a coordinate.

The min and max coordinates for each dimension are:
0: -0.8392 0.8928 difference= 1.732
1: -1.239 1.493 difference= 2.732
2: 0 2.732 difference= 2.732

If the input should be full dimensional, you have several options that
may determine an initial simplex:
- use 'QJ' to joggle the input and make it full dimensional
- use 'QbB' to scale the points to the unit cube
- use 'QR0' to randomly rotate the input for different maximum points
- use 'Qs' to search all points for the initial simplex
- use 'En' to specify a maximum roundoff error less than 3.8e-15.
- trace execution with 'T3' to see the determinant for each point.

If the input is lower dimensional:
- use 'QJ' to joggle the input and make it full dimensional
- use 'Qbk:0Bk:0' to delete coordinate k from the input. You should
pick the coordinate with the least range. The hull will have the
correct topology.
- determine the flat containing the points, rotate the points
into a coordinate plane, and delete the other coordinates.
- add one or more points to make the input full dimensional.

This is a Delaunay triangulation and the input is co-circular or co-spherical:
- use 'Qz' to add a point "at infinity" (i.e., above the paraboloid)
- or use 'QJ' to joggle the input and avoid co-circular data

Error in delaunayn (line 101)
t = qhullmx(x', 'd ', opt);

Error in Untitled (line 32)
T = delaunayn(V);

I will be obliged to you if you can fixe it.

Bruno Luong

unread,

Nov 19, 2012, 7:04:13 AM11/19/12

to

"Dmitrey Yershov" <pierrev...@mail.ru> wrote in message

> Error in delaunayn (line 101)
> t = qhullmx(x', 'd ', opt);
>
> Error in Untitled (line 32)
> T = delaunayn(V);
>
> I will be obliged to you if you can fixe it.

Please try to replace the delaunayn() command with:

if m==3
T = delaunay(V);
else
T = delaunayn(V);
end

Obviously it doesn' work for 2-dimensional triangulation.

Bruno

Dmitrey Yershov

unread,

Nov 19, 2012, 7:29:08 AM11/19/12

to

"Bruno Luong" <b.l...@fogale.findmycountry> wrote in message <k8d77s$lfd$1...@newscl01ah.mathworks.com>...

Thank you very VERY much, Bruno! Your procedure works fast enough to solve my problem and gives proper results (at least by sight for 3 dimensions)!

P.S.: Special thanks to Roger Stafford and Matt J for basis procedures.

Bruno Luong

unread,

Nov 19, 2012, 7:39:13 AM11/19/12

to

"Dmitrey Yershov" <pierrev...@mail.ru> wrote in message
>

> Thank you very VERY much, Bruno! Your procedure works fast enough to solve my problem and gives proper results (at least by sight for 3 dimensions)!

You could further optimize the code by replacing the for-loop of V*w with James's MTIMESX

http://www.mathworks.com/matlabcentral/fileexchange/25977-mtimesx-fast-matrix-multiply-with-multi-dimensional-support

Bruno

Bruno Luong

unread,

Nov 19, 2012, 8:02:11 AM11/19/12

to

It seems to me this task
w = randfixedsum(m,n,1,0,1)

can be optimized as well. But let save this optimization topic for another day.

Bruno

Roger Stafford

unread,

Nov 19, 2012, 6:10:18 PM11/19/12

to

"Bruno Luong" <b.l...@fogale.findmycountry> wrote in message <k8dakj$39u$1...@newscl01ah.mathworks.com>...

> It seems to me this task
> w = randfixedsum(m,n,1,0,1)

> can be optimized as well. .....

- - - - - - - - -

Yes, I agree, Bruno; 'randfixedsum' is rather cumbersome to use just for a single simplex. How about this instead:

R = bsxfun(@power,rand(m-1,n),1./(m-1:-1:1)');
X = cumprod([ones(1,n);R]).*[ones(m,n)-[R;zeros(1,n)]];

Roger Stafford

Roger Stafford

unread,

Nov 19, 2012, 8:54:09 PM11/19/12

to

"David Epstein" <David.Eps...@remove.warwick.ac.uk> wrote in message <k8d1g1$4gr$1...@newscl01ah.mathworks.com>...

> @Roger: what were your reasons for rejecting this approach in your randfixedsum package on FEX?

- - - - - - - - - -

I think you will find that as n increases the acceptance rate in such a procedure shrinks toward zero altogether too rapidly, thereby restricting one in practice to a rather small range for n.

To get a feeling for this, consider an n-dimensional hypersphere of radius 1/2 enclosed in an n-dimensional cube with unit-length sides. The n-dimensional volume of the cube is 1 whereas that of the hypersphere for even n is (pi/4)^(n/2)/(n/2)! (See http://en.wikipedia.org/wiki/N-sphere.) For n = 50 this would be 1.53E-28, a small acceptance rate indeed.

Roger Stafford

Bruno Luong

unread,

Nov 20, 2012, 1:53:08 AM11/20/12

to

"Roger Stafford" wrote in message <k8ee8q$k6f$1...@newscl01ah.mathworks.com>...

Really cool Roger! I would love if you could share the justification of this formula.

Bruno

Bruno Luong

unread,

Nov 20, 2012, 2:02:09 AM11/20/12

to

Just a small typo correction in a comment of my code. It should read:

% feasible set: S = { x=x0+B*y : A*y <= b }

% Bruno

Bruno Luong

unread,

Nov 20, 2012, 2:17:08 AM11/20/12

to

"Roger Stafford" wrote in message <k8ens1$kfr$1...@newscl01ah.mathworks.com>...

IMHO, the difficulty is in the same order than solving linear programing with the constraints A*y <= b, which have to find accurately the vertices without ambiguity. At least for the convex part.

I have little experience in Delaunay in high dimensional space. But I could imagine it 's challenging too.

Bruno

james bejon

unread,

Nov 20, 2012, 3:16:14 AM11/20/12

to

"Roger Stafford" wrote in message <k8ee8q$k6f$1...@newscl01ah.mathworks.com>...

> R = bsxfun(@power,rand(m-1,n),1./(m-1:-1:1)');
> X = cumprod([ones(1,n);R]).*[ones(m,n)-[R;zeros(1,n)]];

Very neat! (Though I have no idea how it works).

Roger Stafford

unread,

Nov 20, 2012, 2:54:09 PM11/20/12

to

"Bruno Luong" <b.l...@fogale.findmycountry> wrote in message <k8f9ck$gng$1...@newscl01ah.mathworks.com>...

> Really cool Roger! I would love if you could share the justification of this formula.

"james bejon" wrote in message <k8fe8e$2lo$1...@newscl01ah.mathworks.com>...
> Very neat! (Though I have no idea how it works).

- - - - - - - -

To Bruno, James, and anyone else interested:

I recommended the two lines of code

R = bsxfun(@power,rand(m-1,n),1./(m-1:-1:1)');
X = cumprod([ones(1,n);R]).*[ones(m,n)-[R;zeros(1,n)]];

as a substitute for Bruno's

w = randfixedsum(m,n,1,0,1)

(with X here instead of w) because 'randfixedsum' is designed for handling complicated polytopes which break down to into a great many individual simplices, whereas Bruno's call represents a single m-1 dimensional simplex within R^m. (It's analogous to using a cannon to swat flies.) It is more efficient to compute the single simplex directly without having to bother with provisions for a decomposition into simplices. Its vertices are the m points:

[1;0;0;0;...], [0;1;0;0;...], [0;0;1;0;...], ..., [0;0;0;...;0;0;1]

which we can call P1, P2, ..., Pm.

For example letting m = 4 we have a 3D tetrahedron simplex imbedded in R^4 space, and as can be seen the two code lines, in effect, carry out the computation

r1 = rand^(1/3); r2 = rand^(1/2); r3 = rand^(1);
X = [ (1)*(1-r1) ; (1*r1)*(1-r2) ; (1*r1*r2)*(1-r3) ; (1*r1*r2*r3)*(1-0) ];

which with the definitions of the points is the same as

X = (1-r1)*P1 + r1*((1-r2)*P2 + r2*((1-r3)*P3 + r3*P4));

This distributes points within the tetrahedron in a statistically uniform manner. The product coefficients of the four points amount to barycentric coordinates with a sum of one. The expression

(1-r3)*P3 + r3*P4

would distribute uniformly along the line between P3 and P4 using r3 = rand. The quantity

(1-r2)*P2 + r2*((1-r3)*P3 + r3*P4)

would then distribute uniformly within the triangle P2P3P4 with r2 = rand^(1/2) since the area of a triangle varies as the square of its height. Finally the full expression with r1 = rand^(1/3) gives a uniform distribution within the tetrahedron P1P2P3P4 since its volume is proportional to the cube of its height.

Roger Stafford

Bruno Luong

unread,

Nov 21, 2012, 2:38:17 AM11/21/12

to

Thanks Roger for the explanation.

Bruno

james bejon

unread,

Nov 21, 2012, 4:14:09 PM11/21/12

to

Very nice. I hadn't thought of it in those terms.

JS Hong

unread,

Sep 17, 2013, 1:52:09 AM9/17/13

to

Similar error happened to me with five variable.
Could anybody help me to fix this?

I copy your code and run it.
But with following parameters,

s = 183.8975;

lo = [ 0 0 0 0 0];

up = [420.0241 424.2856 648.9985 249.4882 113.4215];

the same error occurs.

Is this because s is small?

"Dmitrey Yershov" <pierrev...@mail.ru> wrote in message <k8d3bv$a2s$1...@newscl01ah.mathworks.com>...

JS Hong

unread,

Sep 17, 2013, 2:28:06 AM9/17/13

to

Actually, I found the similar error with original parameters

lo = [1 0 2 1 0];
up = [5 6 4 4 2];

If s = any number between 2 and 6, the same error occurs.
2 and 6 is the max number whose both lower bounds are 0.

Can you tell me how to handle this error?
Thanks.