What is the algorithm or principle to combine all the combinations of
4s
into the smallest combination of 6s from a range of 10 consecutive
numbers?
Is there a way to calculate what the minimum 0f 6 numbers will be
before doing it?
The data is below.
The set of 10 numbers are { 1,...,10}.
There are 210 subsets of 4 numbers [a,b,c,d] in this set of 10 numbers
I want to find the lowest number of subsets of 6 numbers
[a,b,c,d,e,f,g]
which
will contain all the subset of 4's[a,b,c,d]
21 is the minimum number of subsets of 6 numbers [a,b,c,d,e,f,g] which
will contain all the subset of 4's[a,b,c,d] _ see below what my result
is?
I used brut force to get the result but I am not sure 21 subsets of 6's
is the lowest?
I am talking here about an algorithmic process which I can understand.
Please spare my brain and if you can be really down to earth , thanks?
an be then written in Basic,
Then I can write from what your explanation my version in Basic, C,
C++, Java etc
1) How does the algorithm works to find the minimum of number of
subsets
2) Is it possible to use matrices to resolve this problem? and how?
3) Any websites, and information appreciated
==
Explanation:
Combinations of 4's or set of 4's
[1 2 3 5], [1 2 3 5], [ 1 2 3 6], ....,[6 7 9 10], [6
8 9
10], [7 8 9 10]
(please - no permutations)
Combinations of 6's or set of 6's
[1 2 3 4 5 6], [1 2 3 4 6 7],....,[5 6 7 8 9 10]
Expected result below of 21 sets of 6's or better than below ( better
means less than 20 sets of 6's)
4) My results for grouping all 4s into the least number
of combinations of 6s .
I find 21 combinations of 6s?
But what I am after is the way you set up the algorithm or explain how
it works
without using brut force?
1 1 2 3 4 5 6
2 1 2 3 4 7 8
3 1 2 3 4 9 10
4 1 2 4 5 7 10
5 1 2 4 6 8 9
6 1 2 5 6 7 9
7 1 2 5 6 8 10
8 1 3 5 6 7 8
9 1 3 5 6 9 10
10 1 3 7 8 9 10
11 1 4 5 8 9 10
12 1 4 6 7 9 10
13 2 3 4 5 8 10
14 2 3 4 6 7 9
15 2 3 5 7 9 10
16 2 3 6 8 9 10
17 2 4 5 7 8 9
18 2 4 6 7 8 10
19 3 4 5 6 7 10
20 3 4 5 6 8 9
21 5 6 7 8 9 10
I have read this question over half a dozen times, and I still can't
figure out what you're asking.
Maybe you should show what you mean with a smaller example; that is
] combine all the combination of 2s into the smallest combination of 3s
] from a range of 5 consecutive numbers.
Maybe then I can see what you're after.
(Cross-posted to alt.sci.math.combinatorics, since that's the
mathematical area where it belongs.)
--- Christopher Heckman
Didn't you make another post showing a set of 20, or was I
halucinating?
Anyway, I got a program that finds a 21 set solution. It's running
overnight to see if I can get a better set. I'll post results tomorrow.
There are 210 combinations of 10 items taken 4 at a time.
There are 210 combinations of 10 items taken 6 at a time.
There are 15 combinations of 6 items taken 4 at a time.
One set of 6 numbers covers 15 of the 210 4-number combinations.
If his example is correct, all 210 of the 4-number combinations are
covered by his set of 21 6-number combinations
I think he wants to know
- is 21 the minimum (some combinations are duplicated with 21 sets)
- how to write an algorithm that finds the minimum set.
I've got an algorithm that's cooking overnight, but I don't know if
will
find the minimum.
>> is 21 the minimum (some combinations are duplicated with 21 sets)
Yes this is Exactly what I am looking for.
>> how to write an algorithm that finds the minimum set.
Exactly correct.
>> Didn't you make another post showing a set of 20, or was I
hallucinating?
Yes absolutely correct. it was showing a set of 20, this set
of 20 works but this is not my work.
It was emailed to me 3 years ago.
I can post the set of 20 but it does not prove this is the lowest
number we can get.
I believe we can get 19.
The person who send it to me could not tell me how it was obtained. All
he mention in a very brief email was that he was using a lot of 1's and
0's and was taking a huge amount of time.
This person who emailed me 3 years ago never emailed me back even after
I send him a few email.
He send me this URL
http://www.cs.sunysb.edu/~algorith/files/set-cover.shtml
The problem I simply can't figure it out how to do what is mentioned on
the URL.
This is from a rare book..
http://iris.gmu.edu/~khoffman/papers/set_covering.html
The reason I removed my previous posting was because the question
looked solved when actually the answer of 20 is not necessarily
correct.
I have no way of verifying if 20 is correct and how it was obtained.
That person told me he was using a lot of 0and 1 to get this answer.
and it was taking a huge time to solve that problem.
All I can tell you are the bits and pieces are remember and the only
part of the email I got left.
The person who send me the email, or did not know how to do it or did
not want to be bothered with the explanations.
But this is an interesting problem totally genuine.
I'll be happy to post the 20 lines but I believe it will stop people
form trying to post what I am looking for.
What I am looking for is not the answer of 20 or 19 but the process how
it is done,
The process must be very interesting. This is what I am after, the
process and the logic behind it.
Thanks again for putting your computer to test this?
I wrote a program but stopped 3 yesr ago because my computyer was not
powerfull enough.
gnh888
Yes, please re-post it.
> but it does not prove this is the lowest
> number we can get.
> I believe we can get 19.
My first algorithm got 21 as the smallest covering set. It's possible
it could do better if I changed some things around to allow more
cases to be checked. I'll play around with it this weekend.
>
> The person who send it to me could not tell me how it was obtained. All
> he mention in a very brief email was that he was using a lot of 1's and
> 0's and was taking a huge amount of time.
Yeah, my program uses a lot of 1s and 0s also. I'll post some
explanations later when I have more time.
>
> This person who emailed me 3 years ago never emailed me back even after
> I send him a few email.
> He send me this URL
> http://www.cs.sunysb.edu/~algorith/files/set-cover.shtml
> The problem I simply can't figure it out how to do what is mentioned on
> the URL.
> This is from a rare book..
> http://iris.gmu.edu/~khoffman/papers/set_covering.html
>
>
> The reason I removed my previous posting was because the question
> looked solved when actually the answer of 20 is not necessarily
> correct.
> I have no way of verifying if 20 is correct and how it was obtained.
That's why I want you to re-post it, I can easily verify whether or not
it is correct.
>
> That person told me he was using a lot of 0and 1 to get this answer.
> and it was taking a huge time to solve that problem.
I believe a brute force check using my algorithm would take 210! or
>>> print gmpy.fac(210)
10582362029223656378427428424334835305758990578716
90195623527375221444875324002101478493690117146739
54768265316577892528273760626189481169051055226066
65074118957389727368479141118013403943916006656189
58385010008177116826257256704776162675986612591949
75646029749546282594356217374097544153589482020891
75077473501255831346084682486417203023912212889600
0000000000000000000000000000000000000000000000000
loops to test all possibilities. Obviously, we don't want to have
to do that. I found 21 simply by testing 210 possibilities.
I'm pondering an open ended scheme that would simply run
forever and track the best answer.
> All I can tell you are the bits and pieces are remember and the only
> part of the email I got left.
>
> The person who send me the email, or did not know how to do it or did
> not want to be bothered with the explanations.
>
> But this is an interesting problem totally genuine.
Definitely. I just completed a program to generate the Cartesian
Product of a set with filters to create the subsets
Permutations with Replacement
Combinations with Replacement
Permutations without Replacement
Combinations without Replacement
Your problem is a case of Combinations without Replacement and
I'm using it to exercise my program.
>
> I'll be happy to post the 20 lines but I believe it will stop people
> form trying to post what I am looking for.
On the contrary, it gives me something to shoot for.
>
> What I am looking for is not the answer of 20 or 19 but the process how
> it is done,
I'll discuss the process in a later post. There is, of course,
no guarantee that my algorithm is the best.
>
> The process must be very interesting. This is what I am after, the
> process and the logic behind it.
> Thanks again for putting your computer to test this?
> I wrote a program but stopped 3 yesr ago because my computyer was not
> powerfull enough.
I'm using Python, which I highly recommend. Very high level. I'll post
the code and explain the logic behind it when I get a chance.
>
> gnh888
Like Mensanator and I found, and probably others ,
this problem looks TRIVIAL but it is NOT TRIVIAL
IMPORTANT: For READERS WHO READ QUICKLY
the result of 20 below , I repeat 20 below is NOT necessarily CORRECT.
[ 20 is NOT my answer]
20 or LOWER IS NOT WHAT WE ARE LOOKING FOR.
1 2 3 4 7 9
1 2 3 5 6 9
1 2 3 8 9 10
1 2 4 5 7 10
1 2 4 6 7 8
1 2 5 6 8 10
1 3 4 5 6 8
1 3 4 5 9 10
1 3 5 6 7 10
1 3 6 7 8 9
1 4 6 8 9 10
1 5 7 8 9 10
2 3 4 5 6 7
2 3 4 5 8 10
2 3 6 7 8 10
2 4 5 6 9 10
2 4 7 8 9 10
2 5 6 7 8 9
3 4 5 7 8 9
3 4 6 7 9 10
We are after the logic and the algorithm to do this?
or information to do this and better than this.
I will release more of my ideas ,
but I can say "we" are trying to get other readers interested and have
some input.
Too many details now may derail the goal.
The subgoals are to prove.
1) How to calculate and prove this is a minimum?
2) To explain the logic behind being a minimum?
3) To also find the minimum for 11, 12 numbers ,etc.. etc..
If the algorithm is NO found it may takes 10 years or more to do it by
brut force with 11,12 etc...
4) I believe the solution has been found, so what is it? how it works.
gnh888
Just a quick note, I have verified that this set is, in fact,
a covering set. Verify means showing that each of the
210 C(10,4) combinations appears at least once in the
aggregate C(6,4) combinations of the 20 entries in the
covering set.
One way to do that is assign each of the 210 C(10,4)
combinations a unique bit position in a 210-bit binary number
and then keep a running tally by ORing the binary number
created from each entry in the covering set (each will be
a 210-bit binary number with exactly 15 1-bits). If, at the
end, the running tally has exactly 210 1-bits, then it is a valid
covering set. Any 0 bit would represent a C(10,4) combination
that wasn't covered.
We can even watch the progress as the verification runs
(but for space considerations, only the left most 50 bits are
shown).
10000000000000000000000000000000000000000000000000
10000000000000000000000000000000000000000000000000
10000000000000000000000000000000000010000000000000
10000000000000000000000000000100000010000000000000
10000000000000000000000001000100000010000000000000
10000000000100000000000001000100000010000000000000
10000000000100000000000001000100001010000000000000
10000000000100000000000001100100001010000000001000
10000000000101000000000001100100001010000001001001
10000100000101000000000001100100001010010011111001
10100100000101001000111001100100001010010011111001
11100111110101001000111001100100001010010011111001
11100111110101001000111001100100001110010011111001
11100111110101001000111001110100001110010011111101
11101111110101001000111001110100001110110111111101
11101111111101001000111001110100111110110111111101
11101111111101001111111001110100111110110111111101
11101111111111111111111001110100111110110111111101
11101111111111111111111001111111111110110111111111
11111111111111111111111111111111111111111111111111
(In the first couple lines, the bit changes were mostly in the
section not shown.)
FYI, I wrote a little backtracking Python program using a greedy ordering
heuristic: at each step, sort the 6-sets that haven't yet been used, in
decreasing order of the number of 4-sets still remaining (== weren't covered
by a previous 6-set selection) covered by the 6-set. Wow -- that's a
mouthful ;-). IOW, of all the choices available at each step, try those
that "cross off" (cover) a maximal number of remaining 4-sets first, and
those that cross off a minimal number last, recursing after each choice.
There's no point trying a 6-set all of whose 4-subsets have already been
crossed off, and that allows for significant pruning of the implicit search
tree. Subsets were represented as bit vectors (integers), and to save a lot
of repeated computation a setup step created a dict mapping each 6-set to a
set containing the 15 4-sets it covers.
Anyway, that worked about as expected: it found a covering using 20 6-sets
in a fraction of a second, but looks like it will run until the heat death
of the universe failing to find a smaller covering. Backtracking turned out
to be necessary: the first ("100% greedy") cover it found used 21 6-sets.
The _general_ minimum set-cover problem is NP-hard, so there's no evident
reason to hope that this can be done by any poly-time algorithm. The sets
here have a lot of structure, though, so perhaps there's a way to out-think
this particular case (don't look at me ;-)). OTOH, that the "100% greedy"
cover was not minimal strongly suggests there's not enough exploitable
structure to make this easy.
I ) >> I have verified that this set is, in fact, a covering set.
You mean this set of 20 combinations of 6s is a valid "covering
set".
I am not familiar with the word "covering set" but I understand
what you mean.
II ) You would say my previous set of 21 combinations is also a valid
"covering set"?
[21 is not the optimal set of 6s covering all 4s]
III) Do you think 20 is the optimal or least numbers of subsets of 6's
III) I do my verification slightly differently than you, a bit more
clumsy using Powerbasic,
but the result is the same .
This is why I am not familiar with some of your terms which are much
more mathematical.
Your verification using binary number is easier and will use it in my
future verifications.
IV) Is it possible to find out if the number 20, as a minimum, can be
improved on?
My feelings are telling me 19 and even 18 is possible.
V) I mean, is it possible to design a system / algorithm /
brain_storming_pool which will
find automatically the least number of combination of 6's covering
all the combinations of 4's?
not only for 10 number but for 11, 12 etc...?
gnh888
>> perhaps there's a way to out-think
>>this particular case (don't look at me ;-))
I like this Tim :-) :-) just joking
What about using smaller sets containing 4, 5 , 6, 7, 8, 9 numbers
and extrapolating from there?
I have not done it yet but this is an idea, I'll pursue.I'll be
programming in PB again shortly.
Part answer: For people reading this post for the firsttime please read
previous posting , starting from the top.
A) One combination of 6's cover all the combinations of 4's with
the set of numbers 1,2,3,4
B) One combination of 6's cover all the combinations of 4's with the
set of numbers 1,2,3,4,5
C) One combination of 6's cover all the combinations of 4's with the
set of numbers 1,2,3,4,5,6
D) Three combinations of 6's cover all the combinations of 4's with
the set of numbers 1,2,3,4,5,6,7
E) I guess "five" [need to be checked] combinations of 6's cover all
the combinations of 4's with the set of numbers 1,2,3,4,5,6,7,8
Sorry, can't do it mentally, I need to go back to my PB program....
just as a guess I would say 2
F) I guess "12" [need to be checked] combinations of 6's cover all
the combinations of 4's with the set of numbers 1,2,3,4,5,6,7,8,9
gnh888 also posted this to alt.math.recreational, where he got an
answer: It's a block design.
--- Christopher Heckman
Yes. From that set of 20 six-number combinations, all 210
of the four-number combinations can be created.
> I am not familiar with the word "covering set" but I understand
> what you mean.
>
> II ) You would say my previous set of 21 combinations is also a valid
> "covering set"?
It must be verified. A set of 21 certainly doesn't
guarantee that it's a covering set. My test found cases
where it took 28 six-number combinations to achieve
covering.
> [21 is not the optimal set of 6s covering all 4s]
Right. You can prove that simply by demonstrating the
existence of a 20 number covering set, of which I found
numerous examples.
>
> III) Do you think 20 is the optimal or least numbers of subsets of 6's
As pointed out by others, 18 is the minimum, but you
have to find an 18 or 19 set to prove it. Apparently,
no one (including me) has found one yet.
>
> III) I do my verification slightly differently than you, a bit more
> clumsy using Powerbasic,
> but the result is the same .
> This is why I am not familiar with some of your terms which are much
> more mathematical.
I'll try to do a write up of my algorithm later. I also
use the binary number thing to create the covering sets
(sounds like your e-mail friend was doing something similar
if his program involved a lot of 1s and 0s). My program
finds sets of 20 quite quickly, so maybe my approach to
finding them is better.
>
> Your verification using binary number is easier and will use it in my
> future verifications.
I hope your PowerBasic supports Big Integers!
>
> IV) Is it possible to find out if the number 20, as a minimum, can be
> improved on?
After reading the web sites listed in this thread, no.
> My feelings are telling me 19 and even 18 is possible.
Sure, possible. But, being NP-hard, it might not be provable.
>
> V) I mean, is it possible to design a system / algorithm /
> brain_storming_pool which will
> find automatically the least number of combination of 6's covering
> all the combinations of 4's?
You'll see the term "greedy" used a lot on the references.
My program uses that technique. Basically, what it means
is that at every decision point, pick the best choice.
As we assemble a covering set, each choice could cover
a maximum of 15 of the 210 C(10,4) elements. But, once
you've added a few elements to the covering set, you'll
set that, due to duplication, some potential elements
of the C(10,6) set cover less than 15. The greedy
approach is to always add an element that covers the
maximum number of uncovered targets.
It seems logical, the sum of optimal choices should give
the optimal answer. Alas, it doesn't always happen.
The greedy algorithm _does_ solve some problems, such
as finding a Minimal Spanning Tree, but fails at others
such as the Bridge Crossing Problem (where you have to
make a short-term non-optimal choice to get a later
optimal choice that makes up for it).
One good thing is that a greedy algorithm often gives you
a good answer even when in can't guarantee that it's the
best. So if it turns out that 20 is the best, then the
greedy algorithm is solving the covering set problem.
> not only for 10 number but for 11, 12 etc...?
You can certainly find good answers.
Here's the results of my overnight test once I got my
program tuned up:
Greedy Hamming Distance Random Shuffle Test
Test run: 7 hours
Total trials: 1,320,000
Covering Occurrences
Set
18 - 0.0000%
19 - 0.0000%
20 5,292 0.4009%
21 56,472 4.2782%
22 171,368 12.9824%
23 309,587 23.4536%
24 490,456 37.1558%
25 231,923 17.5699%
26 52,638 3.9877%
27 2,257 0.1710%
28 7 0.0005%
29 - 0.0000%
Covering Distribution
Set
18
19
20 n
21 mmmmm
22 mmmmmmmmmmmmmmmm
23 mmmmmmmmmmmmmmmmmmmmmmmmmmmmm
24 mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
25 mmmmmmmmmmmmmmmmmmmmmm
26 mmmmm
27 i
28 .
29
I did not check if there were duplicates amongst the
5292 size 20 covering sets.
>
> gnh888
Quickly to answer this
>>I hope your PowerBasic supports Big Integers!
I don't know what you mean by "big integer"
Powerbasic supports Quad-integers:
" Quad-integers are 64-bit (8 byte) signed integers (twice as many
bits as Long integers) with a range of -9.22x10^18 to 9.22x10^18 (
-2^63 to 2^63 -1)"
"A C/C++ LARGE_INTEGER and a Delphi int64 are both equivalent to a
PowerBASIC Quad integer"
The result of a search on "Quad intergers" I made on Powrbaisc website:
http://www.powerbasic.com/support/forums/search.cgi?action=simplesearch&SearchIn=ALL&ForumChoice=ALL&SearchTerms=Quad+integers&BooleanAND=YES&SearchDate=ALL&SearchUser=&ExactName=no&File=temp-2453764-160228-rwrG.cgi&Total=46&StartAt=A:000004
Guy (gnh888)
Typically, that means arbitrary precision, no limit (other than
your computers memory and speed) on how many bits an
integer can have.
>
> Powerbasic supports Quad-integers:
> " Quad-integers are 64-bit (8 byte) signed integers (twice as many
> bits as Long integers) with a range of -9.22x10^18 to 9.22x10^18 (
> -2^63 to 2^63 -1)"
Which means you can't have a 210-bit integer required by my
program. Sure there are other ways to do it, such as arrays of
64-bit integers or strings of characters "1" and "0". But then you
have to write a bunch of utility functions to use them.
My decision algorithm is based on Hamming Distance:
Help on built-in function hamdist:
hamdist(...)
hamdist(x,y): returns the Hamming distance (number of
bit-positions where the bits differ) between x and y.
I calculate the Hamming distance for each C(10,6) item compared
to the aggregate total of the covering set under construction and
greedily take the one that produces the largest Hamming Distance.
That choice is then bit-wise ORed to the aggregate and the item
appended to the covering set. Repeat until all C(10,4) items are
covered (when the aggregate = 2**210 - 1).
Having math functions like hamdist and bit-wise OR that can work
with arbitrarily large integers makes this process a lot easier.
>
> "A C/C++ LARGE_INTEGER and a Delphi int64 are both equivalent to a
> PowerBASIC Quad integer"
>
> The result of a search on "Quad intergers" I made on Powrbaisc website:
> http://www.powerbasic.com/support/forums/search.cgi?action=simplesearch&SearchIn=ALL&ForumChoice=ALL&SearchTerms=Quad+integers&BooleanAND=YES&SearchDate=ALL&SearchUser=&ExactName=no&File=temp-2453764-160228-rwrG.cgi&Total=46&StartAt=A:000004
My research would be too constrained by such toy programming
languages. I often have problems where the parameters can have
50,000 decimal digits. That is one of the reasons I use Python.
Other languages like Java have Big Integer libraries and you can
get the GMP library for C/C++ . I use the Python version of GMP
because although Python has built-in support for Big Integers,
GMP is where I get the Hamming Distance function from.
Meanwhile, having had fairly good success locating size 20
covering sets using the Hamming Distance Random Shuffle
algorithm, I thought I might try ordering the C(10,6) entries
by entropy to see what that does.
Entropy measures the dispersion of 1s and 0s in a binary
number. 00001111 has high entropy and 01010101 has
low entropy. A value can be calculated by dividing the number
of 0s by the number of contiguous groups of 0s.
I wanted to test whether the greedy algorithm works better
if it scans all the low entropy choices first. I still do random
shuffling but preserve the entropy order.
In 22000 trials, I didn't get a single 20 covering! So I tried
reverse entropy order. That was worse, in 22000 trilas, no
20 or 21 coverings. Maybe I'm onto something.
> Guy (gnh888)
[follow-ups added to alt.math.recreational, since OP was also there]
Hi, Guy:
Concerning "the logic and the algorithm" for such designs,
you may find the expository write up here helpful:
http://answers.google.com/answers/threadview?id=367273
The lower bound 20 for blocks in an optimal (10,6,4)-covering design
is an easy application of the Schonheim bound, discussed there and
in the PDF on "New Constructions" I pointed you to at the La Jolla
Covering Repository site.
Let C(v,k,t) be the minimal number of blocks required for a (v,k,t)-
covering design, ie. a collection of k-subsets ("blocks") of a set of
v elements such that every t-subset of that set is contained in some
set of the collection.
The Schonheim bound says that:
C(v,k,t) >= (v/k) * C(v-1,k-1,t-1)
So, if one first analyzes the requirements of a (9,5,3)-covering design
and finds that 12 blocks are necessary for one of those, then:
C(10,6,4) >= (10/6) * 12 = 20
Putting this together with a 20 block "covering" settles the specific
case. Check out the 1995 paper by Dan Gordan, Greg Kuperberg,
and Oren Patashnik for a very readable account of constructions:
http://www.ccrwest.org/cover/cover.pdf
Dan's recent improvements page should point you to some search
terms for the latest and greatest innovations.
regards, chip