2 views

Skip to first unread message

Jul 18, 2007, 3:12:13â€¯AM7/18/07

to

I'm looking for a fast way to sample from a Markov-1 sequence of

random bits. The method below is 3600 times slower than built-in

RandomInteger function, can it be made much faster?

random bits. The method below is 3600 times slower than built-in

RandomInteger function, can it be made much faster?

p = 0.9; (* the probability of encountering 00 or 11 *)

f = RandomChoice[{p^# (1 - p)^(1 - #), p^(1 - #) (1 - p)^#} -> {1, 0}]

&;

NestList[f, RandomChoice[{0, 1}], 100000]

Jul 19, 2007, 3:38:58â€¯AM7/19/07

to

Yaroslav Bulatov schrieb:

Hi Yaroslav,

isn't this formula a bit too complicated to be used in a fast generator?

Just return the same bit as the last with probability p:

SeedRandom[1];

Timing[With[{p=0.9},

rlist=NestList[If[Random[]<=p,#,1-#]&,Random[Integer],10^6];]][[1]]

Count[Partition[rlist,2,1],{x_,x_},{1}]/Length[rlist]//N

Out[32]=

0.300019 Second

Out[33]=

0.899887

, which is pretty near 0.9 :-)

Peter

Jul 19, 2007, 3:39:59â€¯AM7/19/07

to

Yaroslav Bulatov wrote:

> I'm looking for a fast way to sample from a Markov-1 sequence of

> random bits. The method below is 3600 times slower than built-in

> RandomInteger function, can it be made much faster?

>

> p = 0.9; (* the probability of encountering 00 or 11 *)

> f = RandomChoice[{p^# (1 - p)^(1 - #), p^(1 - #) (1 - p)^#} -> {1, 0}]

> &;

> NestList[f, RandomChoice[{0, 1}], 100000]

>

>

> I'm looking for a fast way to sample from a Markov-1 sequence of

> random bits. The method below is 3600 times slower than built-in

> RandomInteger function, can it be made much faster?

>

> p = 0.9; (* the probability of encountering 00 or 11 *)

> f = RandomChoice[{p^# (1 - p)^(1 - #), p^(1 - #) (1 - p)^#} -> {1, 0}]

> &;

> NestList[f, RandomChoice[{0, 1}], 100000]

>

>

Here's some code that is roughly 25-30 times faster than the code sent

for a list of 10^6 bits. The code starts with a list containing a 0 or 1

(generated with equal probability) followed by n 0 or 1 values generated

with 1 having probability p. Those values are then looped through,

iteratively replacing any value that follows a 0 with the other value

(the probabilities flip-flop for values following a 0, so we can just

switch the value). In effect this trades the more costly n calls to

RandomChoice for a single call to RandomChoice and an update loop

through the list, and uses compilation to speed up the pass through the

loop.

In[1]:= markovBits[p_, n_] := Block[{not},

Compile[{{rr, _Integer, 1}},

not[1] = 0;

not[0] = 1;

Module[{rrnew = rr},

Do[If[rrnew[[i]] === 0,

rrnew[[i + 1]] = not[rrnew[[i + 1]]]], {i,

Length[rrnew] - 1}];

rrnew]

][Join[{RandomInteger[]}, RandomChoice[{p, 1 - p} -> {1, 0},

n]]]]

In[2]:= markovBits[.9, 10^6]; // Timing

Out[2]= {0.89, Null}

The following counts the values of absolute differences of adjacent

values to give a check of the adjacency probabilities, so the 0s are the

00 and 11 cases.

In[3]:= Tally[Abs[Most[#] - Rest[#]]] &[markovBits[.9, 10^6]]

Out[3]= {{0, 900210}, {1, 99790}}

Darren Glosemeyer

Wolfram Research

Jul 20, 2007, 3:35:06â€¯AM7/20/07

to

Here are 3 functions. The first is your version; the second is your

version compiled; the third is a slightly different version also

compiled.

fun = Function[{n, p},

NestList[

RandomChoice[{p^# (1 - p)^(1 - #), p^(1 - #) (1 - p)^#} -> {1,

0}] &, RandomChoice[{0, 1}], n]];

fun1 = Compile[{{n, _Integer}, {p, _Real}},

NestList[

RandomChoice[{p^# (1 - p)^(1 - #), p^(1 - #) (1 - p)^#} -> {1,

0}] &, RandomChoice[{0, 1}], n]];

fun2 = Compile[{{n, _Integer}, {p, _Real}},

NestList[

If[# == 0, If[RandomReal[] < p, 0, 1],

If[RandomReal[] < 1 - p, 0, 1]] &, RandomChoice[{0, 1}], n]];

This gives the relative timings:

First[#]/#& @ Table[Timing[i[10^5, .9]][[1]], {i, {fun, fun1, fun2}}]

On my machine I get a factor of 5 speed up for fun1 and a factor of 25

speed up for fun2 relative to fun.

--Mark.

Jul 21, 2007, 4:22:11â€¯AM7/21/07

to

Well, I like both Peter's and Darren's solutions. Peter's can be

speeded up just a touch by compiling, but it beats mine even without

compiling. Darren's can be speeded up by using Peter's idea and

rearranging a bit. It ends up being about the same speed as Peter's.

markovBits1 =

Compile[{{n, _Integer}, {p, _Real}},

Module[{r =

Join[{RandomInteger[]}, RandomChoice[{p, 1 - p} -> {1, 0}, n]]},

Do[If[r[[i]] === 0, r[[i + 1]] = 1 - r[[i + 1]]], {i, n}];

r]]

--Mark

Jul 21, 2007, 4:49:49â€¯AM7/21/07

to

The Compiled one-liner (fun2) seems to be the fastest solution.

I also compared it with a simple java solution, and it was about 2.7

times faster (19 seconds vs 7 seconds). Using Java solution through

JLink seems to take about the same time as running stand-alone.

Running JVM with some optimization flags (server,compileThreshold)

took the time to 6 seconds. A simple C solution took 2 seconds, 1.5

seconds with gcc -O9 optimization.

The bottom line seems to be that efficient Mathematica solution can be

25 times faster than inefficient one, porting to C can give 10x speed-

up, and Java is somewhere in between.

fun2 = Compile[{{n, _Integer}, {p, _Real}},

NestList[

If[# == 0, If[RandomReal[] < p, 0, 1],

If[RandomReal[] < 1 - p, 0, 1]] &, RandomChoice[{0, 1}], n]];

Timing[fun2[4 10^7, .9]][[1]]

public class hmm {

public static int[] sample(double p, int length) {

int[] result = new int[length];

result[0]=(Math.random()>.5)?0:1;

for (int i = 1; i < length; ++i)

result[i]=(Math.random()>p)?(1-result[i-1]):result[i-1];

return result;

}

public static void main(String[] args) {

long start = System.currentTimeMillis();

sample(.9,4*10000000);

System.out.println((System.currentTimeMillis()-start)/1000.);

}

}

#include <stdio.h>

#include <stdlib.h>

#include <time.h>

#define n 40000000

main() {

int* result = malloc(4*n);

double start = clock();

int i;

result[0] = 0;

for (i = 1; i<n;++i)

result[i]=(rand()/((double)RAND_MAX + 1)>.9)?(1-result[i-1]):

(result[i-1]);

printf("%d\n",result[i-1]);

printf("%f\n",(clock()-start)/CLOCKS_PER_SEC);

return 0;

Jul 22, 2007, 4:34:35â€¯AM7/22/07

to

Yaroslav Bulatov wrote:

>The Compiled one-liner (fun2) seems to be the fastest solution.

>

>I also compared it with a simple java solution, and it was about 2.7

>times faster (19 seconds vs 7 seconds). Using Java solution through

>JLink seems to take about the same time as running stand-alone.

>Running JVM with some optimization flags (server,compileThreshold)

>took the time to 6 seconds. A simple C solution took 2 seconds, 1.5

>seconds with gcc -O9 optimization.

>

>The bottom line seems to be that efficient Mathematica solution can be

>25 times faster than inefficient one, porting to C can give 10x speed-

>up, and Java is somewhere in between.

>

>

Here is a much faster solution:

markov[len_, p_] := BitAnd[

Accumulate @ Prepend[ UnitStep[RandomReal[{-p, 1 - p}, len]],

RandomInteger[]],

1

]

In[7]:= fun2[10^7, .9]; // Timing

Out[7]= {4.656,Null}

In[8]:= markov[10^7, .9]; // Timing

Out[8]= {0.766,Null}

A test that toggles occur about 10% of the time:

In[10]:= res = markov[10^7, .9];

N@Total[BitXor[Most[res], Rest[res]]]/(Length[res] - 1)

Out[11]= 0.0998825

A bit of explanation. In general, an efficient Mathematica algorithm

will use vector/array operations as much as possible, so that the kernel

function does the looping through a list. In this vein, both BitAnd[_,

1] and UnitStep[_] are vector operations relying on the listability of

BitAnd and UnitStep.

In fun2, the function NestList is slow because it has to loop through

the list, and doesn't take advantage of vector operations. It might seem

at first glance that a NestList type of approach is unavoidable, since

your application needs to repeatedly apply a custom function to an

initial value. However, there is at least one kernel function,

Accumulate, that very quickly applys a particular function (Plus) to an

initial value (I can't think of any others that are as quick).

Accumulate is a new version 6 function, and it can be about an order of

magnitude quicker than the old equivalent FoldList[Plus, 0, list]

approach. So, using Accumulate is the key to the speed improvement.

In order to apply Accumulate, I created a random vector having 0s with

probability p and 1s with probability 1-p. Then applying Accumulate will

create a list like:

{0,0,0,1,1,1,2,2,2, ...}

The even numbers correspond to one value and the odd numbers correspond

to the other value, since toggling twice brings us back to the same value.

One more comment. If your probability p is equal to a fraction with

small integers, say p=num/den, then it is possible to speed things up

further. For example, with p=.9, we have:

In[21]:= UnitStep[RandomReal[{-.9, .1}, 10^7]]; // Timing

UnitStep[RandomInteger[{-9, 0}, 10^7]]; // Timing

Out[21]= {0.562,Null}

Out[22]= {0.297,Null}

This improvement brings the timing down to about .5 seconds, which is

almost 10 times faster than your fun2 function.

Carl Woll

Wolfram Research

>fun2 = Compile[{{n, _Integer}, {p, _Real}},

> NestList[

> If[# == 0, If[RandomReal[] < p, 0, 1],

> If[RandomReal[] < 1 - p, 0, 1]] &, RandomChoice[{0, 1}], n]];

Jul 23, 2007, 3:54:26â€¯AM7/23/07

to

> >>Here are 3 functions. The first is your version; the second is your

> >>version compiled; the third is a slightly different version also

> >>compiled.

>

> >>fun = Function[{n, p},

> >> NestList[

> >>version compiled; the third is a slightly different version also

> >>compiled.

>

> >>fun = Function[{n, p},

> >> NestList[

> >> RandomChoice[{p^# (1 - p)^(1 - #), p^(1 - #) (1 - p)^#} -> {1,

> >> 0}] &, RandomChoice[{0, 1}], n]];

>

> >>fun1 = Compile[{{n, _Integer}, {p, _Real}},

> >> NestList[

>

> >>fun1 = Compile[{{n, _Integer}, {p, _Real}},

> >> NestList[

> >> RandomChoice[{p^# (1 - p)^(1 - #), p^(1 - #) (1 - p)^#} -> {1,

> >> 0}] &, RandomChoice[{0, 1}], n]];

>

> >>fun2 = Compile[{{n, _Integer}, {p, _Real}},

> >> NestList[

> >> If[# == 0, If[RandomReal[] < p, 0, 1],

> >> If[RandomReal[] < 1 - p, 0, 1]] &, RandomChoice[{0, 1}], n]];

>

> >>This gives the relative timings:

>

> >>First[#]/#& @ Table[Timing[i[10^5, .9]][[1]], {i, {fun, fun1, fun2}}]

>

> >>On my machine I get a factor of 5 speed up for fun1 and a factor of 25

> >>speed up for fun2 relative to fun.

>

> >>--Mark.

>

> >>fun2 = Compile[{{n, _Integer}, {p, _Real}},

> >> NestList[

> >> If[# == 0, If[RandomReal[] < p, 0, 1],

> >> If[RandomReal[] < 1 - p, 0, 1]] &, RandomChoice[{0, 1}], n]];

>

> >>This gives the relative timings:

>

> >>First[#]/#& @ Table[Timing[i[10^5, .9]][[1]], {i, {fun, fun1, fun2}}]

>

> >>On my machine I get a factor of 5 speed up for fun1 and a factor of 25

> >>speed up for fun2 relative to fun.

>

> >>--Mark.

Carl makes the clever suggestion to speed things up further using

RandomInteger, but he seems to suggest this idea isn't generally

applicable by referring to "small integers". A small amount of

experimentation suggests it works well for arbitrary p (as long as p

is a machine number). One simply uses Rationalize[p,0] to get the

nearest fraction. Here's a modification of Carl's function:

markov1[len_, p_] :=

With[{r = Rationalize[p, 0]},

BitAnd[Accumulate@

Prepend[UnitStep[

RandomInteger[{-Numerator[r],

Denominator[r] - Numerator[r] - 1}, len]], RandomInteger[]],

1]]

--Mark

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu