what problem is this?

Richard Frost

unread,

Aug 2, 2001, 4:15:27 AM8/2/01

to

Hi,

Here is what I believe is an integer programming problem.
If you recognize it as belonging to some specific form
studied by others, I would be very interested to know
the common name of the problem and any leading references.

Thanks in advance!

This problem arises in a dynamics simulation. We have
N quanta, each of which can obtain 1 of k states.
It is always the case that N is much bigger than k.
Each state has an associated energy.
We must choose state for each quanta such that the
sum of the state energies over all quanta is maximized.
The constraint is that a minimum number of each state
must be chosen.

Here is a small example to illustrate the problem.
[Best viewed in Courier or other non-proportional font]

states
1 2 3 4

quanta energy per state state selection

1 3.8010 0.5205 0.0000 4.1145 3
2 6.7669 0.7796 0.0000 1.3483 2
3 2.4238 6.8272 3.9676 5.5862 2
4 9.8537 6.1103 3.6638 0.0000 2
5 0.0000 0.0000 2.5809 7.6061 3
6 8.4254 0.0000 0.2815 2.0483 2
7 4.1260 0.0000 8.0204 0.0000 3
8 6.8177 7.9486 0.0000 1.8128 2
9 9.3119 7.9454 4.9854 5.0084 4
10 5.9040 9.5057 8.1729 2.4243 1
11 1.7657 4.5349 7.6196 8.5012 4
12 0.0000 9.3196 5.6693 8.3141 2
13 4.0614 0.8154 0.7023 0.0000 3
14 5.6432 7.1313 0.0000 2.8673 2
15 9.7479 6.7587 8.7914 3.5300 3
16 6.4028 2.1059 3.1347 8.8705 4
17 8.9003 7.7349 7.8964 6.2441 4
18 1.7349 5.4328 0.4950 4.4327 4
19 0.0000 6.0686 2.7465 7.0289 2
20 3.3274 9.6733 8.2947 9.3464 2

required minimum selections per state
1 5 4 2
actual number of selections per state
1 9 5 5

total energy of selections = 112.9145

Jerry Harder

unread,

Aug 2, 2001, 5:13:45 AM8/2/01

to

"Richard Frost" <fro...@san.rr.com> wrote in message
news:z_7a7.39620$Gj5.17...@typhoon.san.rr.com...

> Hi,
>
> Here is what I believe is an integer programming problem.
> If you recognize it as belonging to some specific form
> studied by others, I would be very interested to know
> the common name of the problem and any leading references.
>
> Thanks in advance!
>
> This problem arises in a dynamics simulation. We have
> N quanta, each of which can obtain 1 of k states.
> It is always the case that N is much bigger than k.
> Each state has an associated energy.
> We must choose state for each quanta such that the
> sum of the state energies over all quanta is maximized.
> The constraint is that a minimum number of each state
> must be chosen.
>

[snipped]

I do not have a fancy name for this problem. It is a simple zero-one or
binary programming problem. I am emailing an Excel spreadsheet which with
the Solver addin can easily solve a problem of this size in less than a
minute.

--
Good luck,

Jerry Harder
remove spamnein from address to reply

Dmitri Chubarov

unread,

Aug 2, 2001, 5:02:45 AM8/2/01

to Richard Frost

Hello,

On Thu, 2 Aug 2001, Richard Frost wrote:

>
> This problem arises in a dynamics simulation. We have
> N quanta, each of which can obtain 1 of k states.
> It is always the case that N is much bigger than k.
> Each state has an associated energy.
> We must choose state for each quanta such that the
> sum of the state energies over all quanta is maximized.
> The constraint is that a minimum number of each state
> must be chosen.
>

I would suggest dealing with it as a 0-1 integer linear
programming problem.

Let q(i,j), i=1..N, j=1..k be boolean variables, such that
q(i,j)=1 if i-th quanta is in state j.

Then the first system of linear constraints is the uniqueness of
state:

(i in 1..N) :
sum (j in 1..k) q(i,j) = 1

The minimum number of each states can be formulated as

(j in 1..k) :
sum (i in 1..N) q(i,j) .GE. C(j)

Finally, the objective function is

maximize:
sum (i in 1..N, j in 1..k) E(i,j) * q(i,j)

That is what is referred to as the model.

For the example provided, I have received the following results
with the student version of CPLEX:

optimal solution; objective 154.2056;

which results in the following distribution of states:
1 2 3 4
5 7 4 4

1: (2, 4, 6, 9, 13)
2: (3, 8, 10, 12, 14, 18, 20)
3: (7, 11, 15, 17)
4: (1, 5, 16, 19)

/DC

Paul A. Rubin

unread,

Aug 2, 2001, 2:24:18 PM8/2/01

to

[posted and mailed]

I agree with previous posts that this can be modeled as a 0-1 integer
program, but it can also be modeled as a transportation problem, which
should solve faster (especially for large N).

You need N+1 source nodes. Each of the N quanta is a source node with
supply 1. The remaining source node (which I'll call the dummy node) has
supply N(k-1). You need 2k demand nodes (two for each state), which I'll
label 1..k and 1'...k'. The demand at each node i in 1...k is the minimum
number d_i of quanta required in that state. The demand at each "sibling"
node i' is N-d_i. Thus total demand is sum_{i=1...k} (d_i + N-d_i) = Nk,
which matches total supply.

Choose a positive constant M greater than any energy level. For each
combination of a quantum i (in 1...N) and a state j (in 1...k), draw an arc
from i to j and an arc from i to j', both of which cost M-e_ij, where e_ij
is the energy level for quantum i in state j. Also draw arcs from the dummy
node to each node 1'...k' (but *not* to 1...k), all with cost 0. Now solve
it as a transportation problem. Letting x_ij be 1 if quantum i is assigned
state j and zero if not, you are minimizing sum_{i,j} (M-e_ij)x_ij, but
since sum_{j} x_ij = 1 for every i, sum_{i,j} Mx_ij reduces to MN, a
constant, and hence minimizing sum_{i,j} (M-e_ij)x_ij is equivalent to
maximizing sum_{i,j} e_ij*x_ij, the sum of the energy levels of the assigned
states.

You can find descriptions of transportation problems in pretty much any
introductory operations research or mathematical programming text.

-- Paul

"Richard Frost" <fro...@san.rr.com> wrote in news:z_7a7.39620$Gj5.17571047
@typhoon.san.rr.com:

--
***************************************************************************
Paul A. Rubin Phone: (517) 432-3509
Department of Management Fax: (517) 432-1111
The Eli Broad Graduate School of Management E-mail: ru...@msu.edu
Michigan State University http://www.msu.edu/~rubin/
East Lansing, MI 48824-1122 (USA)
***************************************************************************
Mathematicians are like Frenchmen: whenever you say something to them,
they translate it into their own language, and at once it is something
entirely different. J. W. v. GOETHE

Lutz Tautenhahn

unread,

Aug 2, 2001, 4:01:58 PM8/2/01

to

"Dmitri Chubarov" <chub...@cs.man.ac.uk> schrieb im Newsbeitrag
news:Pine.LNX.4.33.010802...@voronkov.cs.man.ac.uk...

Hi,
this looks much like a 'Transportation problem':
You could use the following transportation tableau:

[Best viewed in Courier or other non-proportional font]

M must be a 'big value' and MM must be even 'bigger' than M.
n_i is the minimum number of a state, which must be choosen.
Diff = N-sum n_i, which is always >=0, otherwise there would
be no feassible solution.

I solved the problem this way and found
1:{2, 4, 6, 9, 13}
2:{3, 8, 12, 14, 18, 20}
3:{7, 11, 15, 17}
4:{1, 5, 16, 19}
which is the same result as that of Dmitri Chubarov.
I assume, you know about the complexity difference between a
0-1 integer problem and a transportation problem.

Regards, Lutz Tautenhahn

Richard Frost

unread,

Aug 2, 2001, 11:36:23 PM8/2/01

to

Thank you to Dmitri, Jerry, Paul, and Lutz for their quick replies -- and to
Paul for pointing out
that my return email address was incorrect.

The mapping to a transportation problem is of interest, esp. in terms of
efficiency. I'm aware that some
integer programming problems can be solved in P, while others remain NP. If
I could please indulge a related
question: how do the methods in your solutions fair? My real problems have
N = O(10^6) and k = O(10^2).

- Richard Frost
fr...@san.rr.com

Paul A. Rubin

unread,

Aug 3, 2001, 11:14:59 AM8/3/01

to

[posted and mailed]

What Lutz and I separately proposed, the transportation (aka Hitchcock)
problem, is a special case of a network flow model, which is in turn a
special case of a linear program. There are polynomial time methods for
LPs, and I'm pretty sure a transportation problem can be hammered into the
format needed by, say, Karmarkar's method. So the formal answer is that the
transportation problem can be solved in polynomial time. I don't know a
good complexity bound for it. The informal answer is: LPs typically solve
orders of magnitude faster than integer programs (the typical solution
procedure for an IP, branch and cut, solves thousands/millions of same-size
LPs as subproblems); network models tend to solve faster than generic LPs of
the same size; and transportation problems are pretty basic as network
problems go. So I'm inclined to say that this is as good as it gets.

Which of course doesn't tell you how long it will take, which is a function
not just of problem size but also of the hardware/software you're willing to
throw at it. You're looking at O(10^6) nodes and O(10^8) arcs the way I
formulated it. I don't have first-hand experience with problems that size,
so I'll just say that, based on what I've read, larger problems have been
solved in practice.

-- Paul

"Richard Frost" <fr...@san.rr.com> wrote in
news:X_oa7.41097$Gj5.17...@typhoon.san.rr.com:

--

Mike Hennebry

unread,

Aug 3, 2001, 11:18:51 AM8/3/01

to

In article <X_oa7.41097$Gj5.17...@typhoon.san.rr.com>,

Richard Frost <fr...@san.rr.com> wrote:
>The mapping to a transportation problem is of interest, esp. in terms of
>efficiency. I'm aware that some
>integer programming problems can be solved in P, while others remain NP. If
>I could please indulge a related
>question: how do the methods in your solutions fair? My real problems have
>N = O(10^6) and k = O(10^2).

The transportation problem is the best you are going to get.
Given the problem statement, 10^8 variables are inevitable,
regardless of the particular formulation.

It might be possible to improve the problem statement.
Quantum nodes can be merged if their respective states
have the same energy levels.

For an approximation scheme, one might merge quantum
nodes that are not quite identical. In that case, arc
costs should be averaged and optimal disaggregation
is a set of independent transportation problems.
You might decide that the approximation is good enough
or you might use it to initialize the full blown 10^8
variable version.

Since each quantum must be in some state, the differences
between the energy levels are what are important. Perform
the transformation E2[q,j]=E[q,j]-E[q,1], which makes each
quantum have zero energy in state 1. Doing so makes it
easier to find quanta to merge.

My recollection is that LPs with a million variables are
fairly ordinary these days. I'm not sure about a hundred
million.

--
Mike henn...@web.cs.ndsu.NoDak.edu
"No, no: the purpose of language
is to cast spells on other people ..."
Lisa S Chabot

Dima Pasechnik

unread,

Aug 3, 2001, 12:10:07 PM8/3/01

to

henn...@web.cs.ndsu.nodak.edu (Mike Hennebry) writes:

[...]

>
> My recollection is that LPs with a million variables are
> fairly ordinary these days. I'm not sure about a hundred
> million.
>

yes, but do you really need an LP here?
One can solve maxflow problems by other methods (Goldberg-Tarjan, etc),
that might be more suitable for a problem of sich size...

---
Dmitrii

Paul A. Rubin

unread,

Aug 3, 2001, 1:01:25 PM8/3/01

to

Dima Pasechnik <d.pas...@twi.tudelft.nl> wrote in
news:ug0b9y...@twi.tudelft.nl:

This would be more of a min-cost flow than a maxflow problem. There are
non-LP methods, of course (including the classic "stepping-stone" method),
but current commercial LP solvers may be at least competitive. I know that
CPLEX, for instance, recognizes network structures in LPs, has network
optimization code, and handles sparse matrices efficiently, so it might be a
contender.

-- Paul

Richard Frost

unread,

Aug 3, 2001, 11:28:30 PM8/3/01

to

"Paul A. Rubin" <ru...@msu.edu> wrote in message
news:Xns90F28593B...@35.8.2.20...

>
> This would be more of a min-cost flow than a maxflow problem.

Yes, exactly. Looking at the flow algorithms, I have arrived at a solution
which utilizes k inverted-heaps
(heaps that maintain minimums instead of maximums) to collect the required
number of quanta for each
state. The run time is T(N,k) <= N . ( k.lg(k) + k.O(lg(N/k))) where the
"." is multiplication. The heaps
are very interesting from the implementation point of view. k is small
enough that individual threads or
processes can be established for each heap, each front-ended by a queue.
This reduces the run time
to N . (k.lg(k) + O(lg(N/k))) = O(Nk.lg(k)). Whenever N is small enough
that N=k I then have
O(N^2 . lg(N)) which I believe matches the canonical methods for the minflow
problem.

- Richard F.

Lutz Tautenhahn

unread,

Aug 4, 2001, 6:55:56 PM8/4/01

to

Hello again,

"Paul A. Rubin" <ru...@msu.edu> schrieb im Newsbeitrag
news:Xns90F273878...@35.8.2.20...

> [posted and mailed]
>
> What Lutz and I separately proposed, the transportation (aka Hitchcock)

I was really surprised that we even used the same symbols (dummy and M),
in Germany there is a saying "zwei Doofe - ein Gedanke", which means in
english "two dummies - one thought".
Btw, a view years ago I saw a documentation on TV about russian
scientists, who did research and training on the field of telepathy and
really had some kind of success with it, maybe I should apply as a medium?

> problem, is a special case of a network flow model, which is in turn a
> special case of a linear program. There are polynomial time methods for
> LPs, and I'm pretty sure a transportation problem can be hammered into the
> format needed by, say, Karmarkar's method.

Although, I wouldn't do this.

>So the formal answer is that the
> transportation problem can be solved in polynomial time. I don't know a
> good complexity bound for it.

Teleportation problems are typically solved in 2 phases, which is similar to
the simplex method:
1. Find a basis solution (polynomial, for our problem O(N*k) with Greedy)
2. Iteratively find a better basis solution until the optimum is found:
To check, if a basis solution is optimal, can be done in O(N*k).
Improving a basis solution is (as I think) also about O(N*k).

[I'm just looking at my program code about that (I had written a C++ prog
about that a few years ago), but it is not so easy to estimate, because
there are recursive procedure calls and the number of recursions depends on
the data.]

The number of iterations required to find the optimum from a depends on
1. the starting solution, which is used and
2. how much the optimum basis solution is degenerated, that means how many
of the basis variables are equal to zero.
In general, the more degenerated, the more iterations are required [assuming
you use a standard method like the Distribution method], because there will
be more iterations which do only an exchange of zero-basis-variables without
an improvement of the objective value.
You can easy find a lower bound for the number of degenerated basis
variables: Select for every quantum the state of maximum energy and count,
how many of the minimum state constraints are violated. My conjecture is,
that the number of zero-basis-variables is at least the number of the
violated constraints.

I think, it should be possible to take advantage of the special problem
structure [the sum in the data rows is equal to 1, similar to the assignment
problem], which could it make easier to solve, then a general transportation
problem. For the selection/creation of the 'best' method to solve the
problem, it would be interesting to know:
* How is the distribution of the energy values (for instance is there an
ordering so that always energy of state n <= energy of state n+1) or are
the energies randomly distributed?
* Is the optimum basis solution expected to be highly degenerated ?
* What is the ratio (sum of the required minimum selections per state)/N ?

>The informal answer is: LPs typically solve
> orders of magnitude faster than integer programs (the typical solution
> procedure for an IP, branch and cut, solves thousands/millions of
same-size
> LPs as subproblems); network models tend to solve faster than generic LPs
of
> the same size; and transportation problems are pretty basic as network
> problems go. So I'm inclined to say that this is as good as it gets.
>
> Which of course doesn't tell you how long it will take, which is a
function
> not just of problem size but also of the hardware/software you're willing
to
> throw at it.

With 4 byte per number one needs about
10^6 * 10^2 * 4 byte = 400 MB storage (RAM)
which is possible in our days.

I would definitely map the real values on the domain of (long-)integer
values (1. faster calculation, 2. no trouble, caused by rounding errors
during calculation).

Regards, Lutz.

Paul A. Rubin

unread,

Aug 6, 2001, 5:04:12 PM8/6/01

to

[posted and mailed]

"Lutz Tautenhahn" <lutz.ta...@wirtschaft.tu-chemnitz.de> wrote in
news:9khucn$q2b$1...@narses.hrz.tu-chemnitz.de:

>
> I was really surprised that we even used the same symbols (dummy and
> M), in Germany there is a saying "zwei Doofe - ein Gedanke", which
> means in english "two dummies - one thought".

I think I prefer the American version: "Great minds often think alike."
:-)

> Btw, a view years ago I saw a documentation on TV about russian
> scientists, who did research and training on the field of telepathy and
> really had some kind of success with it, maybe I should apply as a
> medium?

Unless you can channel either somebody more important than I am or somebody
more dead than I am, I suspect your skill won't be particularly appreciated.
:-)

-- Paul