Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Trying to get better frequency distributions...

41 views
Skip to first unread message

Chris M. Thomasson

unread,
Sep 21, 2015, 5:29:46 PM9/21/15
to
Here is some example code using the default random number generator
(std::rand()).

I do not have time to explain it, but it can create small differences
between the average
minimum and maximum values of a frequency distribution:

http://codepad.org/ZZhBc65N
__________________________________________________________
#include <iostream>
#include <algorithm>
#include <vector>
#include <climits>
#include <cassert>
#include <cstdlib>
#include <cstring>
#define AN (UCHAR_MAX + 1U)


// crappy little rand...
unsigned int random_byte()
{
double rn = std::rand() / ((double)RAND_MAX);
return (unsigned int)((rn) * (AN - 1U));
}


// stores byte counts...
struct counts
{
unsigned int m_count[AN];
unsigned int m_total;

#define COUNTS_SINIT() { { 0 }, 0 }

void inc(unsigned int n)
{
assert(n < AN);
m_count[n] = m_count[n] + 1;
m_total = m_total + 1;
}

bool inc_if(unsigned int n, unsigned int nmax, unsigned int tmax)
{
assert(n < AN);
if (m_count[n] >= nmax || m_total >= tmax) return false;
inc(n);
return true;
}

void display() const
{
std::cout << m_total << " bytes...\r\n\r\n";

unsigned int total = 0;
double avg_total = 0.0;
double avg_min = 999999999999999.0;
double avg_max = 0.0;

for (unsigned int i = 0; i < AN; ++i)
{
if (m_count[i])
{
double bavg = m_count[i] / (double)m_total;

avg_min = std::min(avg_min, bavg);
avg_max = std::max(avg_max, bavg);

avg_total = avg_total + bavg;
total = total + m_count[i];

std::cout << "[" << i << "] = " << m_count[i] << ", " <<
bavg << "\r\n";
}
}

double avg_diff = avg_max - avg_min;
assert(avg_diff >= 0.0);

std::cout << "total = " << total << "\r\n";
std::cout << "avg_min = " << avg_min << "\r\n";
std::cout << "avg_max = " << avg_max << "\r\n";
std::cout << "avg_diff = " << avg_diff << "\r\n";
std::cout << "avg_total = " << avg_total << "\r\n";
}
};


// crude attempt to get a "flat" frequency distribution...
struct flat_freq
{
counts m_counts;

#define FLAT_FREQ_SINIT() { COUNTS_SINIT() }


void prv_process(unsigned int n, unsigned int nmax, unsigned int tmax)
{
for (unsigned int i = 0; i < n; ++i)
{
// Color Monitor
unsigned int rn = (random_byte() * ((i + 1) * 13)) % AN;

if (! m_counts.inc_if(rn, nmax, tmax))
{
//std::cout << "inc_if failed = \t" << rn << " \r";
}
}
}

unsigned int process(unsigned int n, unsigned int iter = 150)
{
unsigned int i = 0;
unsigned int ndiv = n / AN;
//unsigned int nmax = (ndiv) ? (ndiv + 1U) : (ndiv + 1U);
unsigned int nmax = (ndiv + 0U);
double nmax_real = 0.0;

//unsigned int nrem = ndiv * AN;
//unsigned int nmax = (nrem) ? (ndiv + 1U) : (nrem);
unsigned int iters_total = 0;
while (m_counts.m_total < n)
{
iters_total = iters_total + 1;

double tir = m_counts.m_total / (n - 0.0);

//nmax_real = nmax_real + 1.681; // large
nmax_real = nmax_real + 0.281; // small

for (i = 0; m_counts.m_total < n && i < 128; ++i)
{

unsigned int ptotal = m_counts.m_total;
prv_process(iter, (unsigned int)nmax_real, n);
if (ptotal == m_counts.m_total)
{
std::cout << "infinite loop detected at:" << ptotal <<
"! \r";
break;
}
}
}

std::cout << "\r\n\r\n";

std::cout << "iters_total = " << iters_total << "\r\n";

return i;
}

void display() const
{
m_counts.display();
}
};




int main()
{
{
std::srand(123);

flat_freq ffreq = FLAT_FREQ_SINIT();

unsigned int i = ffreq.process(10003);

std::cout << "i = " << i << "\r\n\r\n";

ffreq.display();
}

std::cout << "\r\n_________________\r\nComplete!\r\n\r\n";
std::cin.get();

return 0;
}
___________________________________________________________




I will explain this further very soon. It has to do with encryption.

bartekltg

unread,
Sep 21, 2015, 8:15:28 PM9/21/15
to
On 21.09.2015 23:29, Chris M. Thomasson wrote:
> Here is some example code using the default random number generator
> (std::rand()).
>
> I do not have time to explain it, but it can create small differences
> between the average
> minimum and maximum values of a frequency distribution:
>
> http://codepad.org/ZZhBc65N

Where is a problem? You have (pseudo)random process (not much, because
of " std::srand(123);") so number of hits in one particular bucket
also will be a random variable. For every run (or seed) you get slighty
different results.

Stefan is right, use <random>.
mt19937 _is_ better PRGN than rand();
For very strongly uncorrelated numbers you can use ranlux24/ranlux48.


> I will explain this further very soon. It has to do with encryption.

Mersenne twister is not a "cryptographic secure" PRNG.
I dont know about randlux, but also probably not.


pzdr
bartekltg

Chris M. Thomasson

unread,
Sep 22, 2015, 4:19:43 PM9/22/15
to
> "bartekltg" wrote in message news:mtq6ii$r8d$1...@speranza.aioe.org...

> > On 21.09.2015 23:29, Chris M. Thomasson wrote:
> > Here is some example code using the default random number generator
> > (std::rand()).
> >
> > I do not have time to explain it, but it can create small differences
> > between the average
> > minimum and maximum values of a frequency distribution:
> >
> > http://codepad.org/ZZhBc65N

> Where is a problem? You have (pseudo)random process (not much, because
> of " std::srand(123);") so number of hits in one particular bucket
> also will be a random variable. For every run (or seed) you get slighty
> different results.

I am trying to fill the counters up to a point where they are all basically
equal, and reduce the spikes in the freq graph.

I want the difference between the average max and min frequency counts to be
as small as possible.


WRT the encryption angle, here is what I am trying to do:

https://groups.google.com/d/topic/sci.crypt/g6CPhD9mh24/discussion

https://groups.google.com/d/topic/sci.crypt/KH1A8KeWlvw/discussion

https://groups.google.com/d/topic/sci.crypt/AlM10CzGclQ/discussion

[...]

Chris M. Thomasson

unread,
Sep 22, 2015, 6:56:05 PM9/22/15
to
> "Stefan Ram" wrote in message
> news:dist-2015...@ram.dialup.fu-berlin.de...

> > "Chris M. Thomasson" <nos...@nospam.nospam> writes:
> >I want the difference between the average max and min frequency counts to
> >be
> >as small as possible.

> The following function will yield pseudo-random numbers that
> are either 0 or 1, and the difference between the count of 0s
> and 1s will be 1 at most, which is as small as possible.

> int random(){ static int i = 0; return *i++ % 2; }

After I set rid of the (*i++):

int random(){ static int i = 0; return i++ % 2; }

it produces:

0, 1, 0, 1, ...

The pattern is too easily spotted... ;^)

BTW, I am not advocating the use of prng's for encryption. I want to use
fractals.

I am just trying to explore a way to get flat frequency distributions of
bytes out of any random source.

JiiPee

unread,
Sep 22, 2015, 6:56:13 PM9/22/15
to
On 22/09/2015 23:39, Stefan Ram wrote:
> int random(){ static int i = 0; return *i++ % 2; }

does not compile, gives an error: error: invalid type argument of unary
'*' (have 'int')

bartekltg

unread,
Sep 22, 2015, 7:57:46 PM9/22/15
to
On 22.09.2015 22:19, Chris M. Thomasson wrote:
>> "bartekltg" wrote in message news:mtq6ii$r8d$1...@speranza.aioe.org...
>
>> > On 21.09.2015 23:29, Chris M. Thomasson wrote:
>> > Here is some example code using the default random number generator
>> > (std::rand()).
>> >
>> > I do not have time to explain it, but it can create small differences
>> > between the average
>> > minimum and maximum values of a frequency distribution:
>> >
>> > http://codepad.org/ZZhBc65N
>
>> Where is a problem? You have (pseudo)random process (not much, because
>> of " std::srand(123);") so number of hits in one particular bucket
>> also will be a random variable. For every run (or seed) you get slighty
>> different results.
>
> I am trying to fill the counters up to a point where they are all basically
> equal, and reduce the spikes in the freq graph.

OK. Why?

> I want the difference between the average max and min frequency counts
> to be
> as small as possible.

But what contains do youhave? You want a random number generator,
that always will keep almost equal distribution? Random process mean
buckets _won't_ be equally filled.

You want that after N*k draws there will be k hits in every
of N buckets? Then prepare an array with k times a sequence
{0,1,2... N-1} and do random permutation on it.
The result is your 'random' sequence.
If you not have time for proper explanation, you surely understand
that most people do not have time to read three fat threads ;-)

bartekltg

Christian Gollwitzer

unread,
Sep 23, 2015, 2:54:47 AM9/23/15
to
Am 23.09.15 um 00:55 schrieb Chris M. Thomasson:
> I am just trying to explore a way to get flat frequency distributions of
> bytes out of any random source.

Random means random, it does not mean that you get perfectly flat
distributions from a uniform source, because you only ever get a finite
number of samples. Good PRNGs are usually tested using statistical tests
to show that the distributions fall within th eexpected range of a
uniform uncorrelated random source:

http://csrc.nist.gov/groups/ST/toolkit/rng/documents/nissc-paper.pdf

If a known good generator (try KISS
https://de.wikipedia.org/wiki/KISS_%28Zufallszahlengenerator%29 ) does
not give you, what you want, then maybe you want something different?

For example, a quasi-random sequence, also called low discrepancy
sequence, provides a more even filling of the space:

https://en.wikipedia.org/wiki/Low-discrepancy_sequence

Another option is to generate the numbers as a whole - for instance, the
Poisson disc can be used to fill space evenly, if that is wht you want
to achieve.

Or if you have a skewed random source, and want to massage it such that
you get a uniform distribution, pass it through an arithmetic coder.

Christian
0 new messages