Why don't I get random numbers?


LFS

Oct 21, 2012, 4:34:15 AM
to sage-s...@googlegroups.com
Hiya guys!
I have a function:
def GDD(num,P):
    D=GeneralDiscreteDistribution(P)
    D.reset_distribution()
    L=[D.get_random_element() for _ in range(num)]
    return L

and then commands
P=[0.5,0.33, 0.17]
L=GDD(num,P)

I am definitely not getting random distributions within my lists L (i.e. the same lists repeat themselves much too often).
Am I not understanding something here?
Thanks for any help.

Dima Pasechnik

Oct 21, 2012, 5:18:35 AM
to sage-s...@googlegroups.com
On 2012-10-21, LFS <lfah...@gmail.com> wrote:
Sage, as (almost?) any other system, only gets you pseudorandom numbers,
i.e. a sequence that is deterministic (from a given random seed you
always get the same sequence), but looks like a random one.

D.reset_distribution() restarts the sequence from a default random seed.
Now you will always get the same segment of the sequence!
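
A minimal sketch of that effect, using plain Python's random module as a stand-in for the Sage distribution object (an assumption made for illustration; `draw` is a hypothetical helper, not a Sage function). Restarting a generator from the same seed replays the same numbers:

```python
import random

def draw(seed, n=5):
    """Return n pseudorandom integers in 1..6 from a generator seeded with `seed`."""
    rng = random.Random(seed)  # fresh generator, fixed seed
    return [rng.randint(1, 6) for _ in range(n)]

first = draw(seed=0)
second = draw(seed=0)   # same seed again: the sequence restarts
print(first == second)  # True: both lists are identical
```

Every call that reseeds with the same value hands back the same segment of the sequence, which matches the repeated lists seen in the question.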

To fix it, you need e.g. to change the function GDD to accept D as a
parameter:

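def GDDinit(P):
    # build (and reset) the distribution once
    D=GeneralDiscreteDistribution(P)
    D.reset_distribution()
    return D

def GDD(num,D):
    # draw num elements from an already initialized distribution
    L=[D.get_random_element() for _ in range(num)]
    return L

D=GDDinit([0.5,0.33, 0.17])
L1=GDD(10,D)
L2=GDD(10,D)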

Now your L1 and L2 should not correlate much...
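
The same pattern can be tried without Sage. This is only a sketch in plain Python, assuming random.Random plays the role of the distribution object; `gdd_init` and `gdd` are hypothetical stand-ins for GDDinit and GDD, and the seed 2012 is arbitrary:

```python
import random

def gdd_init(seed=None):
    """Create the generator once (hypothetical stand-in for GDDinit)."""
    return random.Random(seed)

def gdd(num, rng, P):
    """Draw num indices 0..len(P)-1 with probabilities P from an existing generator."""
    return rng.choices(range(len(P)), weights=P, k=num)

P = [0.5, 0.33, 0.17]
rng = gdd_init(seed=2012)
L1 = gdd(1000, rng, P)
L2 = gdd(1000, rng, P)  # continues the sequence instead of restarting it
print(L1 == L2)         # False: the second list is a later segment
```

Because the generator is created once and only passed around afterwards, each call picks up where the previous one stopped.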

HTH,
Dmitrii

> Thanks for any help.
>

LFS

Oct 21, 2012, 6:30:13 AM
to sage-s...@googlegroups.com
Hiya Dmitrii
Thanks so much for your quick reply, but I am still getting the same problem. 
I go to Excel and use RANDBETWEEN(1,1000) and just sort them according to my probability and all is good, i.e. the histogram of the expected values is "normally distributed".
Here I keep getting the same counts over and over (see line 5) of http://sage.math.canterbury.ac.nz/home/pub/201 and the histogram looks awful.
Thanks again!
Linda
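
The Excel recipe described in this message (draw an integer from 1 to 1000, then bin it by cumulative probability) can be sketched in plain Python. The 1..1000 scale and P come from the thread; `cutoffs` and `outcome` are names made up for this illustration:

```python
import random
from bisect import bisect_left
from itertools import accumulate

P = [0.5, 0.33, 0.17]
# Cumulative cut-offs on the 1..1000 scale: [500, 830, 1000]
cutoffs = [round(1000 * c) for c in accumulate(P)]

def outcome(r):
    """Map an integer r in 1..1000 to an outcome index 0, 1 or 2."""
    return bisect_left(cutoffs, r)

rng = random.Random()  # seeded once, from the operating system
sample = [outcome(rng.randint(1, 1000)) for _ in range(10)]
print(sample)
```

Integers 1..500 map to outcome 0, 501..830 to outcome 1 and 831..1000 to outcome 2, so the three outcomes appear with probabilities 0.5, 0.33 and 0.17.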

LFS

Oct 21, 2012, 6:42:23 AM
to sage-s...@googlegroups.com
Hiya Dimitri, It might actually be working. Not sure, but when I increased the number of sets ns, it does look better. Still a lot of repeats, but the histogram looks better. Linda

David Kirkby

Oct 21, 2012, 6:52:02 AM
to sage-s...@googlegroups.com
On 21 October 2012 11:30, LFS <lfah...@gmail.com> wrote:
> Hiya Dmitrii
> Thanks so much for your quick reply, but I am still getting the same
> problem.

I don't know if this is the best way in Sage, but it is common practice
to seed the random number generator from the number of seconds since
the Epoch (1/1/1970). So every time you seed it, you get a different
sequence.

Another option is to seed it with some bytes from /dev/random or
/dev/urandom. In fact, /dev/urandom gives about as good as you can get
for random numbers, as they are considered cryptographically secure.
However, if there is insufficient entropy in the system, /dev/random
can block and give no data until more is available.

One advantage of using a pseudo random number generator is that it is
possible to repeat an experiment, using the same seed, and you will
get the same numbers. That's not possible if you use /dev/random or
/dev/urandom.
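
A sketch of both seeding options in plain Python, assuming the standard random module rather than any Sage-specific generator. The names `time_seed`, `os_seed`, `run` and `replay` are made up for illustration. If the seed taken from the operating system is written down, the run can still be replayed exactly:

```python
import os
import random
import time

time_seed = int(time.time())                    # seconds since the Epoch
os_seed = int.from_bytes(os.urandom(8), "big")  # 8 bytes of OS-supplied randomness

rng = random.Random(os_seed)
run = [rng.random() for _ in range(3)]

# Keeping os_seed allows the same experiment to be repeated later.
replay_rng = random.Random(os_seed)
replay = [replay_rng.random() for _ in range(3)]
print(run == replay)  # True
```

Only the seed comes from the operating system here; the numbers drawn afterwards are still produced by the pseudorandom generator.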

Dave

David Kirkby

Oct 21, 2012, 6:57:09 AM
to sage-s...@googlegroups.com
On 21 October 2012 11:42, LFS <lfah...@gmail.com> wrote:
> Hiya Dimitri, It might actually be working. Not sure, but when I increased
> the number of sets ns, it does look better. Still a lot of repeats, but the
> histogram looks better. Linda

If there are a number of repeats, something is definitely wrong. It
should either give you pseudo-random numbers if seeded properly, or a
100% reproducible set of numbers if seeded with a fixed seed.

It should not look better, but still not right. Something is wrong in
that case.

Dave

LFS

Oct 21, 2012, 7:03:11 AM
to sage-s...@googlegroups.com
Hiya Dave,
What would the line of code look like to reseed it with the epoch thing each time I call it?
Thanks so much,
Linda

David Kirkby

Oct 21, 2012, 7:03:15 AM
to sage-s...@googlegroups.com
One simple way to get a qualitative feel for the quality of random
numbers is to take them two at a time, starting with the first two being
x1, y1. So you generate:

x1, y1
x2, y2
x3, y3 etc

then plot a graph of x,y for all pairs xn, yn

The graph should look like a scatter graph, with no obvious pattern.
There are more quantitative measures to check them. I'll give you a
reference to my PhD thesis if you want it, as I looked at random number
quality in some detail for Monte Carlo simulations.
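
A rough sketch of the pairing idea in plain Python, with the sample correlation of the pairs as one simple quantitative check. `consecutive_pairs` and `correlation` are hypothetical helper names, and random.Random(1) is just an example generator to test:

```python
import random

def consecutive_pairs(values):
    """Group a flat list into (x1, y1), (x2, y2), ... pairs."""
    return list(zip(values[0::2], values[1::2]))

def correlation(pairs):
    """Pearson correlation between the x and y coordinates of the pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
pairs = consecutive_pairs([rng.random() for _ in range(20000)])
r = correlation(pairs)  # should be close to 0 for a decent generator
print(len(pairs), r)
```

The list `pairs` can be handed to any plotting tool for the scatter graph. A correlation near zero is necessary but not sufficient, so the plot is still worth a look.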

Dave

David Kirkby

Oct 21, 2012, 7:07:28 AM
to sage-s...@googlegroups.com
The truth is I don't know offhand. I've spent a lot of time porting
Sage to Solaris, but have not used it much at all. I'm sure Python has
some code to get the seconds since the Epoch. One thing to
watch with this method is that if you call it too often (more than once
per second), you will be seeding it with the same number, and so get
the same sequence. I know someone who came unstuck. It was OK on a slow
computer, but when he went to a fast computer, things went all
wrong, as he was seeding it with the same numbers.
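
The pitfall can be reproduced in plain Python. In this sketch the timestamps are fixed, made-up values so the outcome does not depend on when it is run; in real code they would come from time.time(). `draws_seeded_at` is a hypothetical helper:

```python
import random

def draws_seeded_at(timestamp, n=5):
    """Reseed from whole seconds since the Epoch, then draw n numbers."""
    rng = random.Random(int(timestamp))  # the fractional part is thrown away
    return [rng.random() for _ in range(n)]

a = draws_seeded_at(1350818591.10)
b = draws_seeded_at(1350818591.85)  # same second as a: same seed
c = draws_seeded_at(1350818592.10)  # next second: different seed
print(a == b, a == c)               # True False
```

Two calls that land in the same second share a seed and therefore return identical numbers, which is what bites on a fast machine.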

I'll try to find an example in the Sage manual. I'm sure it must have
something on this, but personally I don't know how to best do it. I'm
just relying on my experience of using pseudo-random numbers in C code
for Monte Carlo modelling.

dave

David Kirkby

Oct 21, 2012, 7:16:43 AM
to sage-s...@googlegroups.com
On 21 October 2012 12:03, LFS <lfah...@gmail.com> wrote:
http://www.sagemath.org/doc/reference/sage/misc/randstate.html

says

If set_random_seed() is called with no arguments, then a new seed is
automatically selected. On operating systems that support it, the new
seed comes from os.urandom(); this is intended to be a truly random
(not pseudo-random), cryptographically secure number. (Whether it is
actually cryptographically secure depends on operating system details
that are outside the control of Sage.)


I tend to disagree with what's quoted there. The seed will be truly
random, but the sequence of numbers will not be. They will still be
pseudo random.

Sage no doubt has endless ways of generating random numbers, and that
method might only work for one or more of the RNGs, but not all of
them.


Dave

LFS

Oct 21, 2012, 7:17:41 AM
to sage-s...@googlegroups.com
Thanks Dave - If I have a moment I will try the scatter plot method to check. That seems like a useful idea for students.
(This was just for a class lesson on testing the CLT and I am way over my time allotment :))
Thanks to both you and Dimitri for feedback!
Linda

Dima Pasechnik

Oct 21, 2012, 7:24:09 AM
to sage-s...@googlegroups.com
On 2012-10-21, David Kirkby <david....@onetel.net> wrote:
> On 21 October 2012 12:03, LFS <lfah...@gmail.com> wrote:
>> Hiya Dave,
>> What would the line of code look like to reseed it with the epoch thing each
>> time I call it?
>> Thanks so much,
>>
>> Linda
>
>
> http://www.sagemath.org/doc/reference/sage/misc/randstate.html
>
> says
>
> If set_random_seed() is called with no arguments, then a new seed is
> automatically selected. On operating systems that support it, the new
> seed comes from os.urandom(); this is intended to be a truly random
> (not pseudo-random), cryptographically secure number. (Whether it is
> actually cryptographically secure depends on operating system details
> that are outside the control of Sage.)
>
>
> I tend to disagree with what's quoted there. The seed will be truly
> random, but the sequence of numbers will not be. They will still be
> pseudo random.

the paragraph above only claims the randomness of the seed, rather than
of the whole sequence.

P Purkayastha

Oct 21, 2012, 7:24:37 AM
to sage-s...@googlegroups.com
I am getting quite a different output here. I simply ran "Evaluate all"
and attached is the empirical distribution I get. It is quite close to a
normal one.
Histogram.png

LFS

Oct 21, 2012, 7:32:21 AM
to sage-s...@googlegroups.com
Yes - I think Dimitri's solution is working!  (I changed the published file to his algorithm and upped the number of sets ns for the CLT and I also now seem to be getting good empirical data.)
Thank-you all! Linda

LFS

Oct 21, 2012, 7:34:51 AM
to sage-s...@googlegroups.com
Sorry Dmitrii to continually be typing your name wrong (using the Macedonian spelling where I live :)). Linda

Dima Pasechnik

Oct 21, 2012, 7:31:48 AM
to sage-s...@googlegroups.com
On 2012-10-21, LFS <lfah...@gmail.com> wrote:
in the worksheet, the code:

def GDD(num,P):
    L=[D.get_random_element() for _ in range(num)]
    return L

is wrong. It must be def GDD(num,D), not def GDD(num,P)

Perhaps that's why it does not work for you as it should.

> Linda
>

LFS

Oct 21, 2012, 7:41:17 AM
to sage-s...@googlegroups.com
Done. Thanks

LFS

Oct 21, 2012, 7:48:42 AM
to sage-s...@googlegroups.com
Actually Dmitrii with this change it is giving me exactly the same empirical data each time!

LFS

Oct 21, 2012, 7:53:43 AM
to sage-s...@googlegroups.com
I am not sure, but your image looks like the bottom histogram, which is the normal distribution.
The middle histogram is from the empirical data.
Linda

Dima Pasechnik

Oct 21, 2012, 7:55:49 AM
to sage-s...@googlegroups.com
On 2012-10-21, LFS <lfah...@gmail.com> wrote:
>
> Actually Dmitrii with this change it is giving me exactly the same
> empirical data each time!

well, I just gave you a general framework for initializing and using a
pseudo-random number generator.
If you initialize it with the same seed, you get the same pseudo-random
sequence. (sometimes useful, if you want to check that you get the
same results from a seemingly random computation)
So if you restart your computation from the very beginning, including
the initializing of the random seed with the same value, you will get
the same data each time you do the computation.

But if you want to emulate true randomness, you only have to initialize
the seed once.
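
A sketch of the difference for a CLT-style experiment, in plain Python rather than Sage. The seed 42, the set size 30, the 200 sets and the helper `sample_mean` are all assumptions chosen for illustration; P is the distribution from the thread:

```python
import random

P = [0.5, 0.33, 0.17]
VALUES = [0, 1, 2]

def sample_mean(rng, num):
    """Mean of num draws from VALUES with probabilities P."""
    draws = rng.choices(VALUES, weights=P, k=num)
    return sum(draws) / num

# Reseeding inside the loop: every set of 30 draws is the same set.
bad = [sample_mean(random.Random(42), 30) for _ in range(200)]

# Seeding once, outside the loop: each set continues the sequence.
rng = random.Random(42)
good = [sample_mean(rng, 30) for _ in range(200)]

print(len(set(bad)), len(set(good)))
```

The first list contains a single repeated value, so its histogram is one spike. The second contains many different sample means, which is what a CLT histogram needs.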

HTH,
Dmitrii

LFS

Oct 21, 2012, 8:04:04 AM
to sage-s...@googlegroups.com
yes - i finally saw that and took the call to the GDDinit out of the loop and this may be working, but I don't know how to explain this to the kiddies.
Probably should have just stuck with Excel where I understand the generators. Too complicated by far.
Thanks everyone for your help.

David Kirkby

Oct 21, 2012, 8:22:34 AM
to sage-s...@googlegroups.com
On 21 October 2012 13:04, LFS <lfah...@gmail.com> wrote:
> yes - i finally saw that and took the call to the GDDinit out of the loop
> and this may be working, but I don't know how to explain this to the
> kiddies.
> Probably should have just stuck with Excel where I understand the
> generators. Too complicated by far.
> Thanks everyone for your help.

But kids are used to Windows, so if you can encourage them to use a
non-Windows system, it would be useful as a side-benefit.

I'm not sure if it ever happened, but I know an 8-year old contacted
William about being a Sage developer.

Dave

LFS

Oct 21, 2012, 9:17:14 AM
to sage-s...@googlegroups.com
Oh I agree Dave for several reasons (not least of all that Excel is not free, nor is it math based).
However I have to admit that I have not been happy with any non-Excel program for just "teaching" probability and statistics.
Too complicated and the kiddies will give up. It took me 4 days to get this simulation in Sage going (I think it works now) and I gave up in GeoGebra without a fight. (Please don't think that I think Mathematica et al. are any better. If anything they are even more complicated, not to mention costly.) I just want it to be relatively simple and mathematical to empirically test stuff :)

P Purkayastha

Oct 21, 2012, 9:32:27 AM
to sage-s...@googlegroups.com
It was the middle one. You can figure it out from the x labels. But also
attached here is the worksheet for you to verify :)

It still has the wrong function definition of GDD, but it doesn't matter
since D is considered a global variable and it still works.
Generate Random Discrete Data and Find Expected Value BAD.sws
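
The remark about D being a global can be illustrated in plain Python. This is a sketch only, with random.Random(7) standing in for the Sage distribution object: a name that is neither a parameter nor a local variable is looked up in the enclosing (global) scope when the function runs.

```python
import random

D = random.Random(7)  # module-level (global) generator

def GDD(num, P):
    # P is accepted but never used; D is not a parameter or a local,
    # so Python finds the global D defined above.
    return [D.random() for _ in range(num)]

L = GDD(3, P=[0.5, 0.33, 0.17])
print(L)
```

This is why the mislabelled parameter goes unnoticed in the worksheet: the function silently uses whatever D exists at the time of the call.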

Dima Pasechnik

Oct 21, 2012, 9:44:50 AM
to sage-s...@googlegroups.com
On 2012-10-21, LFS <lfah...@gmail.com> wrote:
>
> yes - i finally saw that and took the call to the GDDinit out of the loop
> and this may be working, but I don't know how to explain this to the
> kiddies.
Just tell them the truth, as we all should be telling our
students. :-)
Explaining to them what pseudorandom numbers are (and why it's hard to
get "true" random numbers from a computer) is much more
useful, and easier, than that very advanced (and I am a professional
mathematician, you know) stuff you are trying to talk about...
At least the concept (you have a very long, but finite, cyclic
sequence of numbers, and you draw from this sequence starting from
the place determined by the seed) is much easier than non-finite
probability.
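
For the classroom, the cyclic-sequence picture can be made concrete with a toy generator. This is a sketch with deliberately tiny, made-up parameters (a=5, c=3, m=16) so the whole cycle fits on one line; real generators use the same idea with an astronomically longer cycle. `lcg_cycle` is a hypothetical name:

```python
def lcg_cycle(seed, a=5, c=3, m=16, n=16):
    """Return n successive states of x -> (a*x + c) % m, starting after seed."""
    out, x = [], seed
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x)
    return out

cycle = lcg_cycle(0)
print(cycle)              # [3, 2, 13, 4, 7, 6, 1, 8, 11, 10, 5, 12, 15, 14, 9, 0]
print(lcg_cycle(7, n=3))  # [6, 1, 8]: the same cycle, entered at a different place
```

After 16 steps the sequence repeats from the start, and changing the seed only changes where in the cycle the drawing begins.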

> Probably should have just stuck with Excel where I understand the
> generators. Too complicated by far.

Hmm, what do you mean? IMHO, until this conversation here, you did not
know why it actually works in excel!
What excel does behind the scene is initializing the
random seed from system time, at least that's what rumours on the net
say. In its usual sloppiness, M$ does not bother to document this.
How one can trust computations done with this piece of software, I have
no clue.
Must be church-like indoctrination, no less :-)
Just make sure that your copy of M$ Office is paid for,
and it will work, by magic. If my kid was in a class like this,
I'd have had very serious objections to the teacher.

Cheers,
Dmitrii

> Thanks everyone for your help.
>
> On Sunday, 21 October 2012 13:56:06 UTC+2, Dima Pasechnik wrote:
>>