I've been lurking in this group for a while and been entertained (and
educated) by the postings.
Can anyone help? I need a routine that will create all possible
sequences (about 14M) for a 6/49 Lottery draw with no duplications. I
guess off the top of my head that the filesize created will be more
than 250 Megs. No problem on the new beast I have just bought.
There is too small a number of 'real' draws around the globe for me to
make valid conclusions from the data to my level of satisfaction.
There may be some unreliable results from some lotteries ;-)
My plan is to test a number of ideas against three criteria - at this
stage.
1. Theory: Probability calculations.
2. Actual Results. 30,000 in database to date.
3. Simulated Results: Hence the need for the routine.
Hope you can help.
Ed
Remove nospam from the email address.
It might be a good idea to produce the combinations in lexicographic
ordering. That's not too difficult and should give you an idea how to code
it. i.e.
You start with a list = [1,2,3,4,5,6] with indices 0,1,2,3,4,5. You can add
up to 43 to any or all of these values to get a valid combination, as long
as no numbers are repeated. If generating a lex ordering there are maximum
values any indexed element can take, from 44 for list[0], to 49 for list[5].
There is also the constraint that the list elements must be strictly
increasing from low index to high index. As long as you maintain these
constraints you just need to keep adding 1 to the highest indexed element
possible until you generate all combinations.
[1,2,3,4,5,6]
[1,2,3,4,5,7]
...
[1,2,3,4,5,49] reached max for list[5] so go to list[4]
[1,2,3,4,6,7] added 1 to list[4], so must set list[5] to list[4] + 1
[1,2,3,4,6,8] can now start adding to list[5] again
...
[1,2,3,4,6,49]
[1,2,3,4,7,8]
...
[1,2,3,4,48,49] list[5] and list[4] at max values so go to list[3]
[1,2,3,5,6,7] list[3] increased by 1, list[4]=list[3]=1,
list[5]=list[4]+1
[1,2,3,5,6,8] can now start adding to list[5] again
...
[1,2,3,5,6,49] list[5] at max, so try list[4]
[1,2,3,5,7,8]
...
[43,45,46,47,48,49] list[5] at max, so try list[4], try list[3] etc.
[44,45,46,47,48,49] all at max, finished
So that's the basic idea. Be careful how you code it because it can take a
while to run.
Duncan
"Ed Hopkins" <ed.ho...@no-spamntlworld.com> wrote in message
news:3bf6db24...@news.cable.ntlworld.com...
I would have liked a program that would do what I did but I don't know how
to program computers so I did it the hard way....which wasn't too hard
really.
Duncan Smith wrote in message <9t72ch$842$1...@newsg4.svr.pol.co.uk>...
Calculations and filling the array all performed in memory (Athlon 1.8
XP and 512 Megs DDR) - to improve speed.
On completion passed to a text file (on 60 Gb hard drive).
The time consuming part I expect to be the design, trial run and final
accuracy checks before and after the 14 M main run.
Hopefully there is a helpful soul out there who has already invented
the wheel :-)
Thanks for your response Duncan. Will study it in more depth after I
have sent this quick message back to the group.
Thanks for your thoughtful response.
Regards
Ed
You should be able to find similar code to this on the Web somewhere.
/* there are all sorts of variations of this code */
- find_all_combinations
{
/*
Print all combinations of "n" numbers taken "x" at a time.
Example: n=7, x=4
1 2 3 4
1 2 3 5
1 2 3 6
1 2 3 7
1 2 4 5
1 2 4 6
1 2 4 7
1 2 5 6
1 2 5 7
1 2 6 7
1 3 4 5
1 3 4 6
1 3 4 7
1 3 5 6
1 3 5 7
1 3 6 7
1 4 5 6
1 4 5 7
1 4 6 7
1 5 6 7
2 3 4 5
2 3 4 6
2 3 4 7
2 3 5 6
2 3 5 7
2 3 6 7
2 4 5 6
2 4 5 7
2 4 6 7
2 5 6 7
3 4 5 6
3 4 5 7
3 4 6 7
3 5 6 7
4 5 6 7
*/
int n,x,i;
int a[100]; /* reasonable limit for value of n(100)in this example */
n=49;
x=6;
/*
Initialization:
start with a[0]=0,a[1]=1,a[2]=2,....a[x]=x -1
also, i=x
*/
for(i=0;i<x+1;i++)
a[i]=i;
a[x]=a[x]-1;
i=x;
while(a[1] < n-x+1)
{
if(a[i] < n-x+i) /* increment to max value */
a[i]++;
else
{
i--;
while(a[i] >= n-x+i) /* shift left */
i--;
a[i]++; /* increment */
while(i < x) /* roll-over all numbers to right */
{a[i+1]=a[i]+1;i++;}
} /* end else */
/* print combination here : a[0],a[1],a[2], .....a[x] */
} /* end while(a[1] < n-x+1) */
return self;
}
>Can anyone help? I need a routine that will create all possible
>sequences (about 14M) for a 6/49 Lottery draw with no duplications. I
>guess off the top of my head that the filesize created will be more
>than 250 Megs. No problem on the new beast I have just bought.
Below is a post of a very rough routine I
wrote a few years back for GFA Basic Windows
and DOS.
You should be able to convert it easily.
For Next Loops might be easier for fixed
values, could be quicker as well.
e.g.
For a% = 1 To 44
For b% = (a% + 1) To 45
For c% = (b% + 1) To 46
For d% = (c% + 1) To 47
For e% = (d% + 1) To 48
For f% = (e% + 1) To 49
SUB or routines here.
Next etc.
'/**** etc are REM's
a% ++ Means ADD 1 to a%
GFA variables are not quite the same as
Microsoft. GFA is also a lot faster especially
when compiled. Running through all 14M
combos (no printout) takes about .3 secs
on a 667 PIII machine.
The following saved as say lottery.lst should run
in the 16 Bit Demo version of GFA Win from
http://gfasoft.gfa.net/eng/index.htm
It will not work without proper variable
declarations on the 32 Bit version.
16 bit is very forgiving ;-)
Warning try say a 20 ball lottery first to see
how long it takes.
==============Paste=====================
Openw #1,30,30,_x * .6,_y * .8,-1
ClearW #1
Deflng "A-Z" '/** Does not seem to work default = #
tim1% = Timer
f$ = "rollout3.txt"
Open "o",#1,f$
a% = 1 , b% = 2, c% = 3, d% = 4, e% = 5, f% = 6
LotterySize% = 49 '/** Maximum size of lottery 25, 49 etc
RollSize% = 6
TotalCombs% = 0
Dim Lot%(6)
'===========================================================
Do Until a% = LotterySize% - RollSize% + 2 '/*** LOOP 1 ***/
Do Until b% = LotterySize% - 2 '/*** LOOP 2 ***/
Do Until c% = LotterySize% - 2 '/*** LOOP 3 ***/
Do Until d% = LotterySize% - 1 '/*** LOOP 4 ***/
Do Until e% = LotterySize% '/*** LOOP 5 ***/
Do Until f% >= LotterySize% + 1 '/*** INNER LOOP***/
Lot%(1) = a% 'assign draw to array
Lot%(2) = b%
Lot%(3) = c%
Lot%(4) = d%
Lot%(5) = e%
Lot%(6) = f%
TotalCombs% ++ '/ linecount of total combs
Print #1, Using "##",Lot%(1);
For ct| = 2 To 6
Print #1, Using "###",Lot%(ct|);
' Print #1, " "; ' space as required
Next ct|
Print #1
'Print " Total Cmbs = ";TotalCombs% ' to screen only
f% ++
Loop ' /*** INNER LOOP x ***/
e% ++
f% = e% + 1
Loop ' /*** LOOP 4 ***/ d
d% ++
e% = d% + 1, f% = e% + 1
Loop ' /*** LOOP 4 ***/ d
c% ++
d% = c% + 1 , e% = d% + 1, f% = e% + 1
Loop ' /*** LOOP 3 ***/ c
b% ++
c% = b% + 1, d% = c% + 1 , e% = d% + 1, f% = e% + 1
Loop ' /*** LOOP NO 2 ***/ b
'-----------------------------------------------------------
Print " End of Loop 1 "
a% ++
b% = a% + 1, c% = b% + 1, d% = c% + 1 , e% = d% + 1, f% = e% + 1
Loop ' /*** LOOP NO 1 ***/ a
Close #1
Print " Time taken ";
Print (Timer - tim1%) / 1000;
Print " secs"
Print
Print
'===================================================
Print " For Lottery of ";LotterySize%;" balls"
Print " Total possible combinations = ";TotalCombs%
Print
Sound 250,.9
Print
Print " Press Enter to finish"
'Keyget w%
Input ask$
'Screen 3 'DOS GFA Only
'Edit 'DOS
CloseW #1
'========================================================
Duncan
"Ed" <ed.ho...@nospamntlworld.com> wrote in message
news:3bf716cb...@news.cable.ntlworld.com...
>ed.ho...@no-spamntlworld.com (Ed Hopkins) wrote:
>
>>Can anyone help? I need a routine that will create all possible
>>sequences (about 14M) for a 6/49 Lottery draw with no duplications. I
>>guess off the top of my head that the filesize created will be more
>>than 250 Megs. No problem on the new beast I have just bought.
>
>Below is a post of a very rough routine I
>wrote a few years back for GFA Basic Windows
>and DOS.
I compiled the program to check that it works OK.
It took 313 secs to generate and save to disk all
the combinations for a 6/49 Lottery (14M lines).
The file size was 253mb and it can be viewed
and scrolled using a decent text editor. (The
freeware PFE Editor works slowly, but OK).
I'm not sure if such a large file is of any use
for programming. It would be better to apply
a filter first. Saving 2 or 3 draws to each line
might speed up access.
Just out of curiosity I compressed it with
PKZIP it finished up at 33.7mb still quite
large, so don't as me to e-mail it to ya. ;-)
I may add a routine to save only 1+2+3 draws
that would reduce it to 2,851,200 lines (51mb)
Sean B
When I have it working I will post the VB source code to the group.
Oh, if anyone wants me to run some tests for them on the 14M Sequences
send me a wish list and I will see what I can do.
Ed
On Sun, 18 Nov 2001 18:12:46 -0000, "Duncan Smith"
As I do not know probability theory at an advanced level. The
alternatives seem to be:
1. Study statistics at advanced level. No time - gotta work.
2. Ask an expert ;-)
3. Use brute force to count sequences to get the probability.
One plan is to count sequences that meet particular criteria related
to the 'gaps' between the ball numbers.
For example I would not use 1,2,3,4,5,6 in the lottery because the
'gap' between each number is the same - too much 'order' in a
'disordered' (random) lottery system. The sum of the 'gap' size is
also very small.
I don't know if there is value following this particular line of
thought although I do have a gut feeling that if any analysis shows a
bell curve it should be useful in excluding particular sequences.
Ed
Frank
"Richard McTavish" <ra.mc...@sympatico.ca> wrote in message
news:3BF7389F...@sympatico.ca...
And Python, although C is probably best for this job. Sorry, I don't use
(or know any) Java.
Duncan
When you start looking at things like that it can get difficult to calculate
the probabilities, so filtering to get them is probably the best thing to
do.
>
> For example I would not use 1,2,3,4,5,6 in the lottery because the
> 'gap' between each number is the same - too much 'order' in a
> 'disordered' (random) lottery system.
Which of these sequences of coin tosses is more likely?
H H H H H H or H T H T T H
They are both equally likely. (1/64)
What's the probability of 6 heads in 6 tosses? 1/64
What's the probability of 3 heads in 6 tosses? 20/64
Do you see what I'm getting at?
The sum of the 'gap' size is
> also very small.
>
> I don't know if there is value following this particular line of
> thought although I do have a gut feeling that if any analysis shows a
> bell curve it should be useful in excluding particular sequences.
It's easy to get a bell curve. Just look at the distribution of sums. It
is approximately normally distributed. Note: this is the theoretical
distribution under the assumption that all combinations are equally likely.
So if you look at the empirical distribution of the sums, you want anything
BUT a bell curve. If you generate a distribution by filtering all possible
combinations, then you want to see something (significantly) different to
that distribution in the draw history.
Duncan
Frank
"Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote in message
news:9tbk9j$rop$1...@news8.svr.pol.co.uk...
I have written a small app. that will write all lottery combinations for a
k/N lottery to a space delimited text file. Lex ordering as per my earlier
post. It's written in Python and I haven't optimised it, so it will take
maybe 40 mins to write all combs for a 6/49. But if you only need to run it
once... Anyone want it?
Duncan
>Which of these sequences of coin tosses is more likely?
>
>H H H H H H or H T H T T H
>
>They are both equally likely. (1/64)
>
>What's the probability of 6 heads in 6 tosses? 1/64
>What's the probability of 3 heads in 6 tosses? 20/64
>
>Do you see what I'm getting at?
>
Sorry Duncan, I'm not sure what you'r getting at. Please explain or
give me some more clues. Maybe it's too late in the evening for me.
>The sum of the 'gap' size is
>> also very small.
>>
>> I don't know if there is value following this particular line of
>> thought although I do have a gut feeling that if any analysis shows a
>> bell curve it should be useful in excluding particular sequences.
>
>It's easy to get a bell curve. Just look at the distribution of sums. It
>is approximately normally distributed.
I agree. No problem so far......
>Note: this is the theoretical
>distribution under the assumption that all combinations are equally likely.
Err.... ok (really - it's ok) I accept that there is an equal
likyhood of any ball being selected in one draw. In addition, any
group of balls (6 in a 6/49) are also equally likely to be drawn.
But, I have this nagging thought that there are more odd numbered
balls than even numbered balls (in the pool of 49 balls of a 6/49
lottery). The implication being that there is a slightly increased
probability that the first ball drawn will be an odd numbered ball
with a knock-on effect when all six balls are drawn. Hmm......
interesting network of possibilities here for the math experts in the
group.
>So if you look at the empirical distribution of the sums, you want anything
>BUT a bell curve. If you generate a distribution by filtering all possible
>combinations, then you want to see something (significantly) different to
>that distribution in the draw history.
Sorry, I got lost when I reached the part about empirical
distribution. My free education showing its true value.
But, surely sequences CAN be excluded with 95% or 99% certainty when
there is a distribution curve. Whether it is Normal, Binomial, or
Hypergeometric etc I don't know the technical aspects of these curves
or their purpose (too advancd) but I seem to recall that using the
mean and standard deviation, exclusions can be made with 95% or 99%
certainty. This should be a useful strategy to remove unlikely
sequences.
I do think there are benefits in using both methods.
Regards
Ed
No, it WAS a bit vague. The point is that most people would intuitively say
that the all heads sequence was less probable, because it looks less typical
of a sequence of coin tosses. ie. because it doesn't have an approximately
equal number of heads and tails. In the same way, just because a lottery
combination is atypical in terms of its sum, odd/even split etc., it doesn't
mean it is less probable. Mid-range sums (around 130 - 170 for a 6/49)
would be expected to come up relatively frequently because there's more of
them. (Each sum in this range has probability over 1%, outside this range
all probabilities are less than 1%.) It appears that they are drawn with
pretty much the frequency you would expect from a fair lottery.
>
> >The sum of the 'gap' size is
> >> also very small.
> >>
> >> I don't know if there is value following this particular line of
> >> thought although I do have a gut feeling that if any analysis shows a
> >> bell curve it should be useful in excluding particular sequences.
> >
> >It's easy to get a bell curve. Just look at the distribution of sums.
It
> >is approximately normally distributed.
>
> I agree. No problem so far......
>
> >Note: this is the theoretical
> >distribution under the assumption that all combinations are equally
likely.
>
> Err.... ok (really - it's ok) I accept that there is an equal
> likyhood of any ball being selected in one draw. In addition, any
> group of balls (6 in a 6/49) are also equally likely to be drawn.
>
Under the assumption that the lottery is fair (in the probabilistic sense,
rather than meaning rigged).
> But, I have this nagging thought that there are more odd numbered
> balls than even numbered balls (in the pool of 49 balls of a 6/49
> lottery). The implication being that there is a slightly increased
> probability that the first ball drawn will be an odd numbered ball
> with a knock-on effect when all six balls are drawn. Hmm......
> interesting network of possibilities here for the math experts in the
> group.
Well yes. A 4/2 odd/even split is slightly more likely than a 2/4 split.
But again, there's slightly more possible 4/2 splits than 2/4 splits.
That's why the probabilities are different, not because certain combinations
are more likely. (After all, the calcs ARE based on them all being equally
likely.)
>
> >So if you look at the empirical distribution of the sums, you want
anything
> >BUT a bell curve. If you generate a distribution by filtering all
possible
> >combinations, then you want to see something (significantly) different to
> >that distribution in the draw history.
>
> Sorry, I got lost when I reached the part about empirical
> distribution. My free education showing its true value.
Just the frequency distribution of the observed data, rather than the
theoretical probability distribution. You wouldn't expect them to be
identical, but they should be 'pretty close' if the lottery is fair.
>
> But, surely sequences CAN be excluded with 95% or 99% certainty when
> there is a distribution curve. Whether it is Normal, Binomial, or
> Hypergeometric etc I don't know the technical aspects of these curves
> or their purpose (too advancd) but I seem to recall that using the
> mean and standard deviation, exclusions can be made with 95% or 99%
> certainty. This should be a useful strategy to remove unlikely
> sequences.
You can produce prediction intervals that way. So you lop off the 5% in top
tail and 5% in the bottom tail and you can be 90% confident the next
'observation' is within the retained 90% (or some corresponding range of
sums or whatever). But, as long as the lottery is fair, you are just
getting rid of 10% of the combinations in order to be 90% sure you haven't
rejected the winner.
If you WERE trying to predict the sum, you would definitely be better of
choosing 150 rather than 21. 165772 to 1 in your favour.
But you want to predict the winning ticket (not the sum, odd/even split,
deltas...). If you pick 21 and the sum is 21, you win the jackpot. If you
pick 150 and the winning ticket sums to 150, it's 165772 to 1 against you
winning; simply because there are 165772 combinations that sum to 150.
Go back to my coin tossing game. You need to predict the sequence to win.
H H H H H H is just as likely as H T H T T H. You might feel like playing H
T H T T H because it looks more typical. Well, you're more likely to match
the number of heads, but no more likely to match the sequence. Of course,
you're no worse off in terms of your chances of winning, it would just be a
mistake to think you'd improved your chances.
Duncan
So, "before selecting a filter/figure be aware of the lines it costs."
Frank
"Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote in message
news:9tcdln$s1o$1...@news6.svr.pol.co.uk...
Unfortunately VB isn't any better - the graphical front end stuff is
nice but underlying it you've got ancient creaking-at-the-joints
Microsoft Basic, and if you want to do any serious software development
you have to shift into a decent language.
I'm still looking for a decent language which supports hugely better
precisions than the IEEE standard for double-precision floating point
so that I can rewrite my lottery system in it.
Evil Nigel
Frank
"Nigel Hurneyman" <ni...@loud-n-clear.com> wrote in message
news:3BFB81F5...@loud-n-clear.com...
Duncan
I now have the equivalent written in C. Same thing, but only takes around 7
minutes. Anyone who can handle a 255Mb text file is welcome to it (source
code or exe).
Duncan
I'll make my best ... but I don't promise anything ... (another computer
language?! to learn?! ...)
Cheers,
Frank
"Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote in message
news:9thhkn$i0l$1...@newsg2.svr.pol.co.uk...
pray do tell. What would this file be used for and how?
not to pry that is just curios...
Ed asked for this so we just obliged.
We are always open to suggestions
regarding methods of storing filtered
draws to disk. The best method I can
think of would be switching bits on/off
in a file approx. 1.7mb in size. I've
never actually tried this in practice
though.
Perhaps you could suggest a more
convenient way to store the combinations
other than in a 253mb file, I'm sure it
would be very useful.
Cheers,
Sean B
>I now have the equivalent written in C. Same thing, but only takes around 7
>minutes. Anyone who can handle a 255Mb text file is welcome to it (source
>code or exe).
>
>Duncan
Hi Duncan,
I haven't used C or C++ for a years, but would
be interested in looking at how this is done
in C.
Possibly if its under say 500 lines you could
post the code here, alternatively please e-mail
it to me (remove the XXX). Thanks.
I'm not interested in the EXE, because I've
already got my compiled GFA Basic version,
which is about the same speed.
Cheers,
Sean B
The alternative is to use 'brute force' methods which I think most
people have done at some stage (probably with excel or another
spreadsheet).
Computers are now powerful enough for the home user to work on
extremely large data sets. For example, all possible combinations in
a 6/49 lottery.
Within five years I think the home user will be able to create, store
and analyse lottery simulations of many millions of sequences. For
example, to analyse the 'gaps' or 'age' between events on a data set
that is large enough to give meaningful conclusions. But, this is
another story........
In answer to the question
>>pray do tell. What would this file be used for and how?
Short answer: Why do people climb mountains?
Long answer: To use brute force to analyse the data set.
For example, to count sequences to compare with probability
calculations. To count sequences instead of trying to calculate
probability calculations that may be too tedious or complicated to
perform.
Homework: Topic - Science History - Middle Ages - Readabout Tycho
Brahe, name the guy who got hold of Tycho's data. Describe the the
three laws of planetary motion he deduced from the data.
The uses will depend on the expertise of the user. For me I hope to
gain useful insights by just doing the analysis. It may be the case
that I will become bored with the effort and want to create a random
lottery dataset that contains many millions of sequences so I can
analyse events over time. In any case, I will learn from the
experience.
People in this group have been a great help and I appreciate all the
responses. At least we now know the level of programming expertise in
the group and the various programming languages we can call on if
required.
In other words, I may test the dataset for one reason and realise
something else. Just doing the thing will increase my comprehension.
Mail me and get a free CyberPint of your favourite frothy substance.
Cheers and Happy Christmas.
Ed
int main() {
int n, k, i, j, d;
cout << endl << "Please enter k and N for your k/N lottery" << endl;
cout << endl << "Combinations will be placed in results.txt" << endl;
cout << endl << "k?" << endl;
cin >> k;
cout << endl << "N?" << endl;
cin >> n;
ofstream f;
f.open("results.txt", ios::out);
f << setiosflags(ios::left);
int *a = new int[k]; //Array to contain the lowest combination
int *m = new int[k]; //Array to contain the highest combination
d = n - k; //d = m[i] - a[i] for all i
for(i=0; i<k; i++) {
a[i] = i + 1;
m[i] = i + d + 1;
}
d += 1; //when a[i] == d the highest combination has been reached (a == m)
while(a[0] != d){
for(i=0; i<k; i++){ //print the current combination
f << setw(2) << a[i] << " ";
}
f << endl;
i = k-1;
if(a[i] != m[i])
a[i] += 1; //increment the value in the rightmost position by 1
else{
for(i=k-2; i>=0; i--){
if(a[i] != m[i]){ //this value can be incremented by 1
a[i] += 1;
for(j=i+1; j<k; ++j){ //reset higher indexed values e.g. 33,34,35
a[j] = a[j-1] + 1;
}
break;
}
}
}
}
for(i=0; i<k; i++){
f << setw(2) << a[i] << " "; //print out the last combination
}
delete [] a;
delete [] m;
f.close();
return 0;
}
----------------------------------------------------------------------------
-------------------------
import Numeric
import sys
import string
print ''
print 'Enter k and N for your k/N lottery'
print ''
k = input('k = ')
n = input('N = ')
print ''
f = open('results.txt', 'w')
print 'Processing...'
print ''
sys_stdout = sys.stdout
sys.stdout = f
def combinations(n, k): # produces combinations in lex order
current = range(1, k + 1)
max = range(n - k + 1, n + 1)
while current != max:
for number in current:
print '%2d'% (number),
print ''
index = -1
if current[index] != max[index]:
current[index] = current[index] + 1
else:
for indices in range(-2, -k - 1, -1):
if current[indices] != max[indices]:
current[indices] = current[indices] + 1
for other_indices in range(indices + 1, 0):
current[other_indices] = current[indices] +
other_indices - indices
break
for number in max:
print '%2d'% (number),
print ''
print ''
combinations(n, k)
f.close()
sys.stdout = sys_stdout
print 'Results are in', filename
----------------------------------------------------------------------------
-----------------------------
Thanks for posting this Duncan. Now I
remember why I stopped using C++ ;-)
Python looks interesting, I'll download
it later.
I'll have a look at the code later when I've
finished looking at the ridiculously high
number of 1+2+3 draws in the UK Lottery.
It would appear that the lexicographic ref.
no. of any combination could be calculated
for your method of traveling through the draws.
A poster of a couple of years ago (Don McDonald)
used a method of numbering combinations and
also supplied the formula. It could well be the
same sequence. Reversal did not seem
possible though.
The size of the datafile could probably be
reduced by saving the complete line only
when 2 Ball numbers change. Still be 14M
lines though!!! Ugh!!!!
Might work putting all the numbers on one line.
1 2 3 4 5 6
7 8 9 10 11 12 13 14 .......... 47 48 49
1 2 3 4 6 7
8 9 10 11 12 13 14 15 .......... 47 48 49
etc.
Cheers,
Sean B
Eli
"Ed Hopkins" <ed.ho...@no-spamntlworld.com> wrote in message
news:3bf6db24...@news.cable.ntlworld.com...
> Hi,
>
> I've been lurking in this group for a while and been entertained (and
> educated) by the postings.
>
> Can anyone help? I need a routine that will create all possible
> sequences (about 14M) for a 6/49 Lottery draw with no duplications. I
> guess off the top of my head that the filesize created will be more
> than 250 Megs. No problem on the new beast I have just bought.
>
I've maybe missed something, but what do you mean by 1+2+3 draws? How many
have there been in how many draws?
>
> It would appear that the lexicographic ref.
> no. of any combination could be calculated
> for your method of traveling through the draws.
> A poster of a couple of years ago (Don McDonald)
> used a method of numbering combinations and
> also supplied the formula. It could well be the
> same sequence. Reversal did not seem
> possible though.
>
What do you mean by reversal? Producing the combination from the ref. no.?
> The size of the datafile could probably be
> reduced by saving the complete line only
> when 2 Ball numbers change. Still be 14M
> lines though!!! Ugh!!!!
>
Probably needs a computer scientist, but how efficiently can bit strings be
stored? All we need are 14M bit strings of length 49. Then the algorithm
is basically just bit shifting. Do we then need a low level language to
implement it?
Duncan
So what happens if you load a machine with balls numbered from 1 to 7 but
in seven different colours? Do the odds change? What about numbering the
balls with pictures of peoples faces instead of numbers? Do the odds
change? Finally ... what if I number the balls with pictures of numbers?
Do the odds change?
--
john R. Latala
jrla...@golden.net
>
>"Sean B" <se...@XXXsbacc3.fsnet.co.uk> wrote in message
>news:04910u8dmecaem43j...@4ax.com...
>> "Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote:
>>
>> Thanks for posting this Duncan. Now I
>> remember why I stopped using C++ ;-)
>> Python looks interesting, I'll download
>> it later.
>>
>> I'll have a look at the code later when I've
>> finished looking at the ridiculously high
>> number of 1+2+3 draws in the UK Lottery.
>
>I've maybe missed something, but what do you mean by 1+2+3 draws? How many
>have there been in how many draws?
>
Dividing the 49 balls into groups 1-9,10-19,20-29 etc.
1 from each group, 3 from each group etc.
1+2+3 would be any order, 2+3+1 etc.
In other words all the possible ways of entering
1+2+3
(2,851,200 out of 13,983,816 = 20.39% possible)
This is currently running at just below 30% and has been
high for hundreds of draws. I'm not complaining
Lottery Analysis pays off and by playing only
1+2+3's I am increasing my chances of winning ;-)
I first spotted this advantage around draw 50 of the
UK Lottery.
I will post the analysis in the next few days.
>>
>> It would appear that the lexicographic ref.
>> no. of any combination could be calculated
>> for your method of traveling through the draws.
>> A poster of a couple of years ago (Don McDonald)
>> used a method of numbering combinations and
>> also supplied the formula. It could well be the
>> same sequence. Reversal did not seem
>> possible though.
>>
>
>What do you mean by reversal? Producing the combination from the ref. no.?
>
Yes it can be calculated, but, from the ref.
no. its either trial and error, look-up tables
or loop through a program.
>> The size of the datafile could probably be
>> reduced by saving the complete line only
>> when 2 Ball numbers change. Still be 14M
>> lines though!!! Ugh!!!!
>>
>
>Probably needs a computer scientist, but how efficiently can bit strings be
>stored? All we need are 14M bit strings of length 49. Then the algorithm
>is basically just bit shifting. Do we then need a low level language to
>implement it?
>
14M bit strings of length 49 would still be quite large.
A single file of 14m bits (about 1.7mb) would be
sufficient, just turn off the bit of any combination
ref. no that you want to filter out.
It should be possible in any language. One way
would be just convert binary strings to hex. and
save them using the largest strings possible.
Many versions of even Basic can also manipulate
bits directly.
It must be possible, because many commercial
programs such as Lotwin filter down from large
numbers of combinations fairly quickly.
It's something I keep meaning to look into, but
never seem to be able to find the spare time.
Cheers.
Sean B
Well I agree with that figure.
> This is currently running at just below 30% and has been
> high for hundreds of draws. I'm not complaining
> Lottery Analysis pays off and by playing only
> 1+2+3's I am increasing my chances of winning ;-)
That I'm not so sure of (yet?) ;-)
> I first spotted this advantage around draw 50 of the
> UK Lottery.
>
> I will post the analysis in the next few days.
>
> >>
> >> It would appear that the lexicographic ref.
> >> no. of any combination could be calculated
> >> for your method of traveling through the draws.
> >> A poster of a couple of years ago (Don McDonald)
> >> used a method of numbering combinations and
> >> also supplied the formula. It could well be the
> >> same sequence. Reversal did not seem
> >> possible though.
> >>
> >
> >What do you mean by reversal? Producing the combination from the ref.
no.?
> >
>
> Yes it can be calculated, but, from the ref.
> no. its either trial and error, look-up tables
> or loop through a program.
>
I'll maybe have a look at this.
> >> The size of the datafile could probably be
> >> reduced by saving the complete line only
> >> when 2 Ball numbers change. Still be 14M
> >> lines though!!! Ugh!!!!
> >>
> >
> >Probably needs a computer scientist, but how efficiently can bit strings
be
> >stored? All we need are 14M bit strings of length 49. Then the
algorithm
> >is basically just bit shifting. Do we then need a low level language to
> >implement it?
> >
>
> 14M bit strings of length 49 would still be quite large.
> A single file of 14m bits (about 1.7mb) would be
> sufficient, just turn off the bit of any combination
> ref. no that you want to filter out.
>
Bit strings would be quick to filter.
> It should be possible in any language. One way
> would be just convert binary strings to hex. and
> save them using the largest strings possible.
> Many versions of even Basic can also manipulate
> bits directly.
>
I can't figure out how to do that in Python (I usually code in Python before
converting to C++ if necessary).
> It must be possible, because many commercial
> programs such as Lotwin filter down from large
> numbers of combinations fairly quickly.
>
> It's something I keep meaning to look into, but
> never seem to be able to find the spare time.
>
Me too.
Duncan
I don't visit the bookies but I have read that bookmakers change the
odds quite frequently.
Arn't odds calculated differently from probabilities?
To expand on my earlier statement. The set of all possible
probabilities do not 'change'. Whatever probability you get depends
on the the question being asked
For example, in a 6/49 lottery;
Question: What is the probability of the first drawn ball being odd?
Answer: There are 25 odd numbered balls. Probability = 25/49.
Question: What is the probability of the first drawn ball being even?
Answer: There are 24 even numbered balls. Probability = 24/49.
The probabilities have not changed.
There is a different answer (probability) because each question is
different.
My statement is correct.
Your statements are correct.
I agree that the numbers on the balls are only labels and have no
effect on the draw.
But, these labels are useful and can be used to help us design
interesting questions.
Interesting questions:
What is the probability of a number selected in the previous draw
being selected again in the next draw?
What is the probability of the same six balls being selected in
consecutive lottery draws?
What formula should I use to be 99% confident that the same six
numbers are not repeated consecutively?
Most interesting question: A years supply of CyberJuice to the first
correct answer. :o)
During the last five draws the lowest numbered balls form a group of
five named as Set5. What formula should I use to be 99% confident of
the 'age' or 'gap' between Set5 and the appearance of another Set5?
My point here is that numbers on balls are useful. I would be hard
pushed to design the same questions without ball numbers.
Ed
>"Sean B" <se...@XXXsbacc3.fsnet.co.uk> wrote in message
Duncan,
I've managed to find an old post showing Don's
method of numbering, I'm not sure if this is the same
as your order. It should be of interest though.
Regards.
Sean B
=============pasted=================================
Newsgroups: rec.gambling.lottery,alt.lotto.players,
sci.stat.math,alt.sci.math.combinatorics,
alt.sci.math.statistics.prediction,alt.sci.math.probability
Subject: Re: Algorithm wanted on combination
Date: Mon, 31 May 1999 19:14:38 GMT
(Sean B) wrote:
>GFA Basic has a Combin(ation) Function so I knocked up the
>following brief program which will run on the free trial version
>of GFA.
>
-> snip
>
The modified program listed below will accept any legitimate set
of 6 balls via the keyboard. They must be in ascending
order. I could have added a sort but it would take up
more space.
It should be possible to write a program that will reverse
this calculation. A test would be needed to find the value
of the 6th ball; fairly easy because all starting values after
the increment of the 6th ball equal 0+0+0+0+0+(6th Ball value)
So if we want to find the 6 numbers that are value 120,500 a test
of the combinations of the 6th ball only would give
1-2-3-4-5-25 = 134,597 and 1-2-3-4-5-24 = 100,948 so the 6th ball
must be 24. Quite possibly the other values could be found in a
similar way. The value of the 6th ball having been found would
remain a constant for the rest of the program; so it could probably
be deducted 120,500 - 100,948 = 19,552 The 5th ball now sets
the values from 0+0+0+0+(5th Ball value) to 0+0+0+0+(5th Ball value
+1)
so we would have to find the two 5th ball values that 19,552 lies
between and take the lower on...... then proceed to the 4th ball.
Happy pondering,
Sean B
>
>'//* or ' = REM statement
>
>'===================save as COMBNUMS.LST===================
>
>'Program will run on the 16 bit Windows trial version
>'of GFA Basic (runs OK on W95) runs on the DOS
>'version as well.
>'download from http://www.gfasoft.gfa.net/eng/trials.htm
>'From file menu select new, then merge, then choose COMBNUMS.LST
>'Delete these 6 lines if they cause any problems.
Screen 18 '//* For DOS ver of GFA
Color 7,1 '//* ignored by Windows ver. (this ain't Microsoft)
Titlew #1, " Lottery Combination Calculator v1.01 "
Openw #1,30,30,_x * .9,_y * .9,-1
ClearW #1
Defdbl "A-Z"
'Following algorithm by Don McDonald
'Arranged into ascending order, the lucky numbers were
'3 5 14 22 30 44. The combination sequence number of these
'numbers is :
'2C1 + 4C2 + 13C3 + 21C4 + 29C5 + 43C6 +1 = 6,221,489.
'//* Alter the 6 balls below
A = 1
B = 2
C = 3
D = 5
E = 6
F = 9
'//* If you rem out the PRINT and INPUT lines below
'//* the values given above will be used instead of the
'//* keyboard entry
Print '//* rough and ready input routine follows.
Print "Enter Six Sorted Lottery numbers, press enter after each one "
Print
Input A,B,C,D,E,F
'//* reduce all by 1
A = A - 1
B = B - 1
C = C - 1
D = D - 1
E = E - 1
F = F - 1
If A >= 1
AC = Combin(A,1)
Else
AC = 0
EndIf
Print
Print
Print " 1st Combination Total = ";AC
If B >= 2
BC = Combin(B,2)
Else
BC = 0
EndIf
Print " 2nd Combination Total = ";BC
If C >= 3
CC = Combin(C,3)
Else
CC = 0
EndIf
Print " 3rd Combination Total = ";CC
If D >= 4
DC = Combin(D,4)
Else
DC = 0
EndIf
Print " 4th Combination Total = ";DC
If E >= 5
EC = Combin(E,5)
Else
EC = 0
EndIf
Print " 5th Combination Total = ";EC
If F >= 6
FC = Combin(F,6)
Else
FC = 0
EndIf
Print " 6th Combination Total = ";FC
X = AC + BC + CC + DC + EC + FC + 1
Print
Print " Lottery Line = ";
Print A + 1,B + 1,C + 1,D + 1,E + 1,F + 1
Print
Print " Combination Number = ";
Print Using"###,###,###",X
Print
Print
Print
Print " Any Key to finish"
Keyget w
CloseW #1
"Eli Bielawski" <eli...@execulink.com> wrote in message news:<u02c4ui...@corp.supernews.com>...
Would the following adaptation work?
for i:=1 to 49-5 do
for j:=i+1 to 49-4 do
for k:=j+1 to 49-3 do
for l:=k+1 to 49-2 do
for m:=l+1 to 49-1 do
for n:=m+1 to 49 do
writeln(f, i,',',j,',',k,',',l,',',m,',',n);
Evil Nigel
C(N,k) - (C(n-a,k) + C(n-b,k-1) + ... + C(n-y,2) + C(n-z,1) + 1)
Obviously, just remove the 1 in the brackets if the indexing starts at 1.
Now to figure out if it's easily inverted.
Duncan
"Sean B" <se...@XXXsbacc3.fsnet.co.uk> wrote in message
news:rsjb0uc5dp18hne2r...@4ax.com...
i = C(N,k) - (C(N-a,k) + C(N-b,k-1) + ... + C(N-y,2) + C(N-z,1) + 1)
It turns out that it is easily inverted.
'a' is the maximum such that C(N-a,k) !> C(N,k) - i
'b' is the maximum such that C(N-b,k-1) !> C(N,k) - i - C(N-a,k)
'c' is the maximum such that C(N-c,k-2) !> C(N,k) - i - C(N-a,k)
-
C(N-b,k-1)
etc.
Not a closed form solution, but will run pretty quick when coded up.
Duncan
Duncan
Excellent, Duncan this now provides a
calculation that supplies the same number
as a loop. I realised after looking further
into Don's sequence that writing a loop
to provide the same numbers was not
straightforward (see below).
"Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote:
>
>"Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote in message
>news:9uat7g$v52$1...@news8.svr.pol.co.uk...
>> Sean,
>> Don's numbering method is for some other ordering. I can see why
>it
>> might be useful to be able to work out the index of a combination, so I
>have
>> (after some time) figured out how it can be done for a lex ordering. For
>> the line a,b,...,y,z such that a<b<...<y<z for a k/N lottery the index
>> (starting at zero) is given by:
>>
>> C(N,k) - (C(n-a,k) + C(n-b,k-1) + ... + C(n-y,2) + C(n-z,1) + 1)
>>
>> Obviously, just remove the 1 in the brackets if the indexing starts at 1.
>>
>> Now to figure out if it's easily inverted.
>>
>> Duncan
>>
>That should be:
>
>i = C(N,k) - (C(N-a,k) + C(N-b,k-1) + ... + C(N-y,2) + C(N-z,1) + 1)
>
Should be no problems with this.
>It turns out that it is easily inverted.
>
>'a' is the maximum such that C(N-a,k) !> C(N,k) - i
>'b' is the maximum such that C(N-b,k-1) !> C(N,k) - i - C(N-a,k)
>'c' is the maximum such that C(N-c,k-2) !> C(N,k) - i - C(N-a,k)
I'm not completely clear on this without actually
trying it out..... shouldn't C(N-b,k-1) also be on
the 'c' is the maximum line?
> -
>C(N-b,k-1)
>etc.
>
>Not a closed form solution, but will run pretty quick when coded up.
>
Yep, I assume you mean to test ascending values
of k in a loop. Lookup tables could probably speed
things up, at least for the 1st ball, which can only
have 44 possible values.
>Duncan
>
>
Dons Comb. Sequence.
This has one big advantage, in that the
sequence numbers, remain the same
irrespective of Lottery size.
i.e. All the numbers in a 6/42 Lottery will
be the same as the first 42 balls of a
6/49 Lottery. So all loops, lookup tables etc.
would work for any size of Lottery
Seems difficult to write into a computer
loop though..... any volunteers? ;-)
1 2 3 4 5 6
1 2 3 4 5 7
1 2 3 4 6 7
1 2 3 5 6 7
1 2 4 5 6 7
1 3 4 5 6 7
2 3 4 5 6 7
1 2 3 4 5 8
1 2 3 4 6 8
1 2 3 5 6 8
1 2 4 5 6 8
1 3 4 5 6 8
2 3 4 5 6 8
1 2 3 4 7 8
1 2 3 5 7 8
1 2 4 5 7 8
1 3 4 5 7 8
2 3 4 5 7 8
1 2 3 6 7 8
1 2 4 6 7 8
==============
perhaps easier to do/see reversed
6 5 4 3 2 1
7 5 4 3 2 1
7 6 4 3 2 1
7 6 5 3 2 1
7 6 5 4 2 1
7 6 5 4 3 1
7 6 5 4 3 2
___________
8 5 4 3 2 1
8 6 4 3 2 1
8 6 5 3 2 1
8 6 5 4 2 1
8 6 5 4 3 1
8 6 5 4 3 2
8 7 4 3 2 1 3rd col drops back?
8 7 5 3 2 1
8 7 5 4 2 1
8 7 5 4 3 1
8 7 5 4 3 2
8 7 6 3 2 1 3rd col advances?
8 7 6 4 2 1
==============
The increment on each line moves
generally diagonally as below.
==============
Reversed
6 5 4 3 2 1
7 5 4 3 2 1
7 (6) 4 3 2 1
7 6 (5) 3 2 1
7 6 5 (4) 2 1
7 6 5 4 (3) 1
7 6 5 4 3 (2)
___________
8 5 4 3 2 1
8 (6) 4 3 2 1
8 6 (5) 3 2 1
8 6 5 (4) 2 1
8 6 5 4 (3) 1
8 6 5 4 3 (2)
8 7 4 3 2 1 3rd col drops back
8 7 (5) 3 2 1
8 7 5 (4) 2 1
8 7 5 4 (3) 1
8 7 5 4 3 (2)
8 7 6 3 2 1 3rd col advances
8 7 6 (4) 2 1
==============
Sean B
Yes, (it is below). It just didn't post the same as it looked on my
monitor. But I did make a mistake when I calculated this. I think it can
be corrected. Found the mistake when I was coding it.
eg. The C(N-a,k) comes from C(N-a+1,k) - C(N-a,k-1), and the correct result
will probably involve one or both of these terms.
>
> > -
> >C(N-b,k-1)
> >etc.
> >
> >Not a closed form solution, but will run pretty quick when coded up.
> >
> Yep, I assume you mean to test ascending values
> of k in a loop. Lookup tables could probably speed
> things up, at least for the 1st ball, which can only
> have 44 possible values.
>
Once I get it sorted out there should be some reasonably obvious
optimisations.
>
> >Duncan
> >
> >
> Dons Comb. Sequence.
> This has one big advantage, in that the
> sequence numbers, remain the same
> irrespective of Lottery size.
>
> i.e. All the numbers in a 6/42 Lottery will
> be the same as the first 42 balls of a
> 6/49 Lottery. So all loops, lookup tables etc.
> would work for any size of Lottery
>
I might have a go at this later (in Python first).
>That should be:
>
>i = C(N,k) - (C(N-a,k) + C(N-b,k-1) + ... + C(N-y,2) + C(N-z,1) + 1)
Duncan,
I've knocked together a rough program in
GFA BASIC to calculate the CSN using your
formula. Worked OK until I tried Balls in
the range 44-49. So I had to split the line
up and calculate each ball separately.
I haven't tried the reversal yet and won't get
a chance to until next weekend.
Cheers,
Sean B
================program paste================
' Works in the GFA Basic Demo from
' http://www.gfasoft.gfa.net/eng/trials.htm
Screen 18 '//* For DOS ver of GFA
Color 7,1 '//* ignored by Windows ver.
Titlew #1, " Lottery Combination Sequence Calculator v2.00 "
Openw #1,30,30,_x * .9,_y * .9,-1
ClearW #1
Defdbl "A-Z"
' Combination sequence number algorithm by Duncan Smith
' Save file as COMB_DS1.LST
' Choose File New then File Merge in GFA for Widows
' Pick this file from list
'redo: '//* only needed if you are entering via keyboard
'//* Change balls below
a = 44
b = 45
c = 46
d = 47
e = 48
f = 49
'//* If you rem out the PRINT and INPUT lines below
'//* the values given above will be used instead of the
'//* keyboard entry. Now jumped over instead.
Goto missinput '//* left in so input can be added later
Print '//* rough and ready input routine follows.
Print "Enter Six Sorted Lottery numbers, press enter after each one "
Print
Input a,b,c,d,e,f
missinput:
N = 49
k = 6
i = Combin(N,k)
If a = 44
ac = 0
Else
ac = Combin(N - a,k)
EndIf
If b = 45
bc = 0
Else
bc = Combin(N - b,k - 1)
EndIf
If c = 46
cc = 0
Else
cc = Combin(N - c,k - 2)
EndIf
If d = 47
dc = 0
Else
dc = Combin(N - d,k - 3)
EndIf
If e = 48
fc = 0
Else
ec = Combin(N - e,k - 4)
EndIf
If f = 49
fc = 0
Else
fc = Combin(N - f,k - 5)
EndIf
'the following (one line) will not work
'with 44-49 values, hence the above IF/ENDIF's
' t = Combin(N - a,k) + Combin(N - b,k - 1) +
' Combin(N - c,k - 2) + Combin(N - d,k - 3) +
' Combin(N - e,k - 4) + Combin(N - f,k - 5)
t = ac + bc + cc + dc + ec + fc '//* so add them up here
i = i - t '//* could be incorporated into line above
Print
Print " Lottery Line = ";
Print a ,b ,c ,d ,e ,f
Print
Print " Combination Number = ";
Print Using"###,###,###",i
Print " Combinations Total = ";
Print Using"###,###,###",Combin(N, k)
Print
Print " i = ";i
Print " t = ";t
Print
Print
Print " Any Key to finish"
Keyget w
'Goto redo '//* only if you are entering via keyboard
CloseW #1
Yes, strange (see below). I'll have to double-check that it works for a lex
ordering (although that's what I set out to do). Maybe it also has
something to do with the problems I'm having getting the reversal to work?
>>> import lottery
>>> lot = lottery.combinations(18,6)
>>> lot[500]
[1, 2, 4, 5, 10, 14]
>>> lottery.IndexFromComb(18, [1, 2, 4, 5, 10, 14])
500L
>>> lot[327]
[1, 2, 3, 8, 14, 17]
>>> lottery.IndexFromComb(18, [1, 2, 3, 8, 14, 17])
327L
>>> lot[18561]
[12, 13, 15, 16, 17, 18]
>>> lottery.IndexFromComb(18, [12, 13, 15, 16, 17, 18])
18560L
>>> lot[18560]
[12, 13, 14, 16, 17, 18]
<snip>
> > The size of the datafile could probably be
> > reduced by saving the complete line only
> > when 2 Ball numbers change. Still be 14M
> > lines though!!! Ugh!!!!
> >
>
> Probably needs a computer scientist, but how efficiently can bit strings
be
> stored? All we need are 14M bit strings of length 49. Then the algorithm
> is basically just bit shifting. Do we then need a low level language to
> implement it?
>
> Duncan
I have to assume that this file is going to be used by a UI in the first
place. Since this is more than likely the case (don't know what one would
do with the text file without a way to either associate information with
each entry, or do comparisons with the lottery's historical data), the first
move would certainly be to store it in a binary format.
In fact, with the Cash 5 (with a more manageable 201,376 possible
combinations) Lotto here in Colorado, a complete set of five numbers (1 <=
x <= 32) can be stored in a single four byte Integer (32 bits provide 32
Boolean values).
So, using two four byte integers, you could store a single set of 649
numbers. Comparatively speaking, the smallest space it would take to store
a set of 649 numbers in a text file (assuming best case - all single digits)
is 12 bytes (six bytes for six numbers, plus six spaces - see below), with a
worst-case maximum of 18 bytes.
Not only can bit-arrays save space, but by using logical bitwise operators,
calculations involving them can be extremely fast. From a cursory review of
your code (i.e., without analyzing the algorithm itself), you might try the
following:
1) Although it might not look as elegant, you could probably observe a
performance increase by eliminating the loop which iterates through the six
numbers. Output all the numbers with one statement. By combining the loop
which iterates six times with it's associated "f << endl;", you'll be
turning seven statements into one. Depending on what compiler you're using,
it could help out a little.
2) Regardless of whether that helps with speed, it should at least result in
shrinking the filesize by eliminating the unnecessary final " " (that's 14M
extra bytes).
2) You might also consider storing several sets of numbers on one line (four
sets would still fall into a reasonable 80 cols). Storing four sets per
line could save 10-19M bytes, depending on whether "endl" is a CR or a CRLF.
3) I don't have a C (ANSI or otherwise) help file in front of me, so I'm not
sure what the default buffering is when it comes to text files. I know that
the buffer can be set to zero (i.e., don't buffer anything), which is great
for data integrity, but generally bad for performance.
Flushing the buffer excessively requires overhead that can slow the process
considerably. Consider playing with the buffer size to see if you can
reduce the number of times that the buffer has to automatically flush
(assuming you're buffering the output in the first place).
However, ideally, this data belongs in a properly indexed database
(InterBase 6.0 Open Source is free!).
Even if you don't go with a binary format, you should still be able to speed
up the process and shrink the filesize considerably. Regardless, I find
this stuff quite fascinating, and hope my brief analysis hasn't been
completely useless <g>.
Cheers,
Andy Kazmaier
Define C(n,k) as zero for k>n and it should work. Python did the same (but
only because I neglected this in my code.).
Duncan
Yes, my next move. It should be doable in Python. I was maybe going to
look at using NumPy.
From a cursory review of
> your code (i.e., without analyzing the algorithm itself), you might try
the
> following:
>
> 1) Although it might not look as elegant, you could probably observe a
> performance increase by eliminating the loop which iterates through the
six
> numbers. Output all the numbers with one statement. By combining the
loop
> which iterates six times with it's associated "f << endl;", you'll be
> turning seven statements into one. Depending on what compiler you're
using,
> it could help out a little.
>
Yes.
> 2) Regardless of whether that helps with speed, it should at least result
in
> shrinking the filesize by eliminating the unnecessary final " " (that's
14M
> extra bytes).
>
> 2) You might also consider storing several sets of numbers on one line
(four
> sets would still fall into a reasonable 80 cols). Storing four sets per
> line could save 10-19M bytes, depending on whether "endl" is a CR or a
CRLF.
>
> 3) I don't have a C (ANSI or otherwise) help file in front of me, so I'm
not
> sure what the default buffering is when it comes to text files. I know
that
> the buffer can be set to zero (i.e., don't buffer anything), which is
great
> for data integrity, but generally bad for performance.
>
> Flushing the buffer excessively requires overhead that can slow the
process
> considerably. Consider playing with the buffer size to see if you can
> reduce the number of times that the buffer has to automatically flush
> (assuming you're buffering the output in the first place).
>
> However, ideally, this data belongs in a properly indexed database
> (InterBase 6.0 Open Source is free!).
>
Or NumPy array?
> Even if you don't go with a binary format, you should still be able to
speed
> up the process and shrink the filesize considerably. Regardless, I find
> this stuff quite fascinating, and hope my brief analysis hasn't been
> completely useless <g>.
>
Not at all.
> Cheers,
>
> Andy Kazmaier
>
>
i = C(N,k) - (C(N-a,k) + C(N-b,k-1) + ... + C(N-y,2) + C(N-z,1))
where C(a,b) = 0 when b>a.
It turns out that it is easily inverted. For combination a,b,c,... with
indexing starting at 1.
a = max(a) s.t. C(N-a,k) !> C(N,k) - i
b-a = max(b) s.t. C(N-b,k-1) !> C(N,k) - i - C(N-a,k)
c-b = max(c) s.t. C(N-c,k-2) !> C(N,k) - i - C(N-a,k)
-C(N-b,k-1)
etc.
Python code (default indexing start at 0) below:
---------------------------------------------------------
def IndexFromComb(N, comb=[], start=0): ##comb is list [a,b,...]
k=len(comb)
index = C(N, k) - 1 + start
for i in range(k):
index = index - C(N - comb[i], k - i)
return index
def CombFromIndex(N, k, index, start=0):
combsum = C(N, k) - index
if start == 0:
combsum = combsum - 1
comb = []
last_n = 0
for i in range(k):
n = last_n + 1
while C(N - n, k - i) > combsum:
n = n + 1
n = n + last_n
comb.append(n)
combsum = combsum - C(N - n, k - i)
return comb
def C(N, x):
if x > N: return 0
combs = factorial(N) / (factorial(x) * factorial(N-x))
return combs
def factorial(n):
if n < 0: return None
if n == 0 or n == 1: return 1
fact = 1L ##Long integer type
for i in xrange(2, n+1): ##i.e. [2,3,...,n]
fact = fact * i
return fact
>>> lottery.IndexFromComb(80, [3, 7, 12, 13, 15, 16, 17, 18, 44, 49, 52, 58,
59, 61, 65, 67, 69, 70, 72, 80])
1886694139820414439L
profiled at 0.02 secs
>>> lottery.CombFromIndex(80, 20, 1886694139820414439L)
[3, 7, 12, 13, 15, 16, 17, 18, 44, 49, 52, 58, 59, 61, 65, 67, 69, 70, 72,
80]
>>>
profiled at 0.86 secs
ed.ho...@no-spamntlworld.com (Ed Hopkins) wrote in message news:<3bf6db24...@news.cable.ntlworld.com>...
Cheers,
Sean B
>A decent test 80/20. Pretty quick.
>
Very impressive, sure beats jumping
through all them loops ;-)
Now we need a minimal way of saving
all the CSN's to a file ;-(
Sean B
--------------------------
"Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote in message
news:9ugjjg$2e1$1...@newsg3.svr.pol.co.uk...
<snip>
> > 3) I don't have a C (ANSI or otherwise) help file in front of me, so I'm
> not
> > sure what the default buffering is when it comes to text files. I know
> that
> > the buffer can be set to zero (i.e., don't buffer anything), which is
> great
> > for data integrity, but generally bad for performance.
> >
> > Flushing the buffer excessively requires overhead that can slow the
> process
> > considerably. Consider playing with the buffer size to see if you can
> > reduce the number of times that the buffer has to automatically flush
> > (assuming you're buffering the output in the first place).
> >
> > However, ideally, this data belongs in a properly indexed database
> > (InterBase 6.0 Open Source is free!).
> >
>
> Or NumPy array?
Unfortunately, I'm not familiar with Python nor any of its addons, which
NumPy appears to be However, I can only assume that the NumPy array is an
array class that provides built-in searching/indexing, sorting, etc., which
would certainly help out if there was a user interface to allow for
manipulation or viewing of the data. However, it wouldn't really affect
disk I/O performance unless it also featured some sort of built-in
super-optimized file output functionality, which might be worth
investigating.
But, for a desktop application, yes, using a flat file along with the NumPy
array can probably be as efficient as any desktop database.
More importantly, although the ofstream appears to implement buffering
automatically, using endl after each row actually forces the buffer to flush
its contents to disk. So, you're performing ~ 14M flush operations, each of
which includes a costly synchronization between what's written to the file,
and what's contained in the buffer.
Because the ofstream handles buffering automatically, there's no reason
(aside from mission-critical data integrity reasons) to flush each output.
Generally, the size of the buffer coincides with an amount of data that is
equal to, or a multiple of the file system's Page Size, which is usually
4096KB or 8192KB)
Use the '\n' character when you need to move to the next line without
invoking a flush. Closing the file should force a flush of whatever the
last data is in the buffer, but you can also force the last flush (the only
manual flush in the whole process) but outputting an endl prior to the call
to close.
The drawback to the automatic buffering is that you will lose whatever data
is residing in the buffer, in the event of a system failure of some type.
Andy
MetaKit now seems to be free from
http://www.equi4.com/ and is usable from
Phython. I used their free disk catalogue
program Catfish for years on Win 3.11
Worth a look.
Sean B
Well, there's supposed to be five pieces in my Wendy's Chicken Nuggets, but
sometimes...<g> (a little HTML humor there).
Seriously, I have no doubt that some genius could come up with some way to
store the physical data within <i>a</i> lotto game (assuming it was a 260 or
less complicated game...).
The problem is not the data storage, but the instantaneous retrieval. With
no offense to the inventors of your two-bit system, there's nothing more
efficient than a bit retrieval system. In fact, there's nothing more
elementary.
I would be willing to bet on the fact that whatever 16-bit system you're
talking about couldn't even handle the 532 Colorado Cash 5 system I'm
referring to, without relying upon an algorithm that requires more than
bitwise operations for number-to-number comparisons.
With 32-bits, I can store the complete values (instantaneously available) of
the Colorado Cash 5.
There's no physical way, with 80x86, 68000 series, or other processors, that
you could do that with 16-bits (these are double value bits, not some
trinary alien variety), and complete mathematical operations as fast as you
could with the values stored plainly in a larger format.
I do appreciate your link to MetaKit. Prior to this NG, Python was
something to watch out for in the rainforests.
Kind regards,
Andy
"Sean B" <se...@XXXsbacc3.fsnet.co.uk> wrote in message
news:08ss0uk04qmi64lpo...@4ax.com...
There's no doubt that you could store such values in a bitarray. The
problem with that is that you've just added an extra layer of abstraction.
Now, there's no doubt that one of the features of OOP (aside from
encapsulation, inheritance, and polymorphism) is abstraction.
However, when that abstraction adds complexity to data comparison and
manipulation, it increases the time to complete those types of operations.
It's fine to store data in such a way, as long as you can rapidly extract
that data to a form which is more conducive to such comparison and
manipulation operations.
If physical storage was the issue causing problems (and you've so much RAM
that it's going out of style), you've got a plan.
However, these days, it's more likely the speed than HD space that causes
customer complaints.
Cheers!
Andy
"www" <x...@hotmail.com> wrote in message
news:3C0E83B1...@hotmail.com...
>Seriously, I have no doubt that some genius could come up with some way to
>store the physical data within <i>a</i> lotto game (assuming it was a 260 or
>less complicated game...).
>
>The problem is not the data storage, but the instantaneous retrieval. With
>no offense to the inventors of your two-bit system, there's nothing more
>efficient than a bit retrieval system. In fact, there's nothing more
>elementary.
>
It was someone in the newsgroup who said
he knew of a method of doing this.
>I would be willing to bet on the fact that whatever 16-bit system you're
>talking about couldn't even handle the 532 Colorado Cash 5 system I'm
>referring to, without relying upon an algorithm that requires more than
>bitwise operations for number-to-number comparisons.
>
Lotwin is one of the best programs at filtering
down lines. The original 16 bit program
ran on W3.1 and quite basic machines. Speed
never seemed to be a problem.
>With 32-bits, I can store the complete values (instantaneously available) of
>the Colorado Cash 5.
>
Perhaps you could further describe your method.
>There's no physical way, with 80x86, 68000 series, or other processors, that
>you could do that with 16-bits (these are double value bits, not some
>trinary alien variety), and complete mathematical operations as fast as you
>could with the values stored plainly in a larger format.
>
>I do appreciate your link to MetaKit. Prior to this NG, Python was
>something to watch out for in the rainforests.
>
Named after Monty I believe ;-)
Sean B
>Kind regards,
>
>Andy
Or shrubberies.;-)
>
> Kind regards,
>
> Andy
[snip]
Okay, to clarify, the system we're talking about here is a generic system
used to store, manipulate, and retrieve lottery results (specificaly how
best to store single sets of lottery numbers). I never meant to refer to an
OS itself, or its variations (16-bit vs. 32-bit Windows).
Although bit manipulation is most efficient when done on a bitfield less
than or equal to the register size of the CPU, it's not a necessity( i.e.,
one can have four-byte datatypes on a 2-byte processor).
>
> >With 32-bits, I can store the complete values (instantaneously available)
of
> >the Colorado Cash 5.
> >
>
> Perhaps you could further describe your method.
Oops, well, I wasn't trying to make it sound as if this was some method I
had developed. Rather, using the tried-and-true method of bitwise
arithmetic, one can process data quickly, and store it efficiently, with a
lotto number that's stored in a bitarray. With Colorado's 532 Cash 5 Lotto,
one could store all five numbers of a single drawing within a 32-bit
unsigned integer by setting the correct bits:
bit positions:
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7
6 5 4 3 2 1 0
lotto numbers:
0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0
1 0 0 0 0 1 0 0 0 0 0 0
This number (Lotto 7 - 12 - 19 - 23 - 30) yields 541329472 in decimal (the
above representation is it's binary equivalent). Signed integer arithmetic
(one's/two's complement) may complicate this on some systems, but there
should be unsigned types available.
Comparing values like two integers will generally be faster than comparing
any other types of values (a lot of which occurs when one is doing
statistical analyses).
Andy
Thanks. I thought perhaps you had method of
actually compressing the line rather than
just turning bits on, which would need 2 sets
of 32 bit bytes for a 6/49 and result in quite
a big file.
I don't think compression would be much use
anyway. One method I thought of was just
saving the gaps between the draw numbers
quite a lot of these would be single figure.
My original idea of just saving 14M bits to a
file still sounds feasible. Looking at how
black and white bitmaps are stored may help.
Another possibility is reading and writing bytes
to and from the sectors on a ram disk. For
filtering we will be going through numbered
loops and several sectors could be processed
before and after x numbers of loops. My current
setup already has a 100MB ram disk. I would
only need 14MB of this. (Obviously it would
have to be empty before the program was
run).
I'm not too concerned about the extra time
taken on any conversions. Any programs
I write are for personal use only. DOS
managed for many years swapping 64k
pages back and forth under EMS. I
remember once seeing a shareware
package that allowed you to have huge
arrays of up to 82MB in EMS. It also
allowed directly addressing fields of
1-32 bits at any location fairly quickly.
I use mostly Windows these days so
there's not much point in me trying to
see if its still available.
Thanks again for the help anyway.
Sean B
"Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote in message news:<9thi59$i9u$1...@newsg2.svr.pol.co.uk>...
> "Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote in message
> news:9tbrkt$iqj$1...@news6.svr.pol.co.uk...
> >
> > "Duncan Smith" <buz...@urubu.freeserve.co.uk> wrote in message
> > news:9t8tnp$7pl$1...@newsg3.svr.pol.co.uk...
> > > Well I have something that could be easily tweaked to write the
> combinations
> > > to a, say, tab/comma delimited text file. But it would take a while to
> run.
> > > So if I get the time I'll do something in C (there is also C-code for
> this
> > > on the web, do a search for 'C' 'combinations' 'numerical recipes' or
> > > something).
> > >
> > > Duncan
> >
> > I have written a small app. that will write all lottery combinations for a
> > k/N lottery to a space delimited text file. Lex ordering as per my
> earlier
> > post. It's written in Python and I haven't optimised it, so it will take
> > maybe 40 mins to write all combs for a 6/49. But if you only need to run
> it
> > once... Anyone want it?
> >
> > Duncan
> >
>
> I now have the equivalent written in C. Same thing, but only takes around 7
> minutes. Anyone who can handle a 255Mb text file is welcome to it (source
> code or exe).
>
> Duncan