How to get fast random samples from a TF1?

55 views
Skip to first unread message

Christoph Deil

unread,
Oct 2, 2013, 5:51:42 AM10/2/13
to rootpy...@googlegroups.com
Hi,

I'd like to sample a ROOT TF1 and have the unbinned values in a numpy array.
Is there a fast way to do this, i.e. where the looping doesn't happen in Python and calling TF1::GetRandom() for each number?

Christoph






Noel Dawe

unread,
Oct 2, 2013, 6:16:53 AM10/2/13
to rootpy...@googlegroups.com
Good idea. I just added this to root_numpy:


You can use my master branch until this is merged in. I'll add TF2 and TF3 sampling support.

The looping is done on the cython end that gets compiled, so it's pretty fast.

Noel



Christoph






--
You received this message because you are subscribed to the Google Groups "rootpy users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rootpy-users...@googlegroups.com.
To post to this group, send email to rootpy...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rootpy-users/F6551A9F-C037-4EA3-BFCC-1DD4AF8896D1%40gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.

Noel Dawe

unread,
Oct 2, 2013, 7:02:23 AM10/2/13
to rootpy...@googlegroups.com
TF2 and TF3 are now also supported:

>>> from root_numpy import random_sample
>>> from ROOT import TF1, TF2, TF3
>>> random_sample(TF1("f1", "TMath::DiLog(x)"), 10000)
array([ 0.99989852,  0.43970852,  0.5705115 , ...,  0.76644511,
        0.57712561,  0.99453755])
>>> random_sample(TF2("f2", "sin(x)*sin(y)/(x*y)"), 10000)
array([[ 0.82546291,  0.89052123],
       [ 0.99849559,  0.175143  ],
       [ 0.01066793,  0.56227858],
       ..., 
       [ 0.9717297 ,  0.32888917],
       [ 0.07574692,  0.23760245],
       [ 0.58331683,  0.42593687]])
>>> random_sample(TF3("f3", "sin(x)*sin(y)*sin(z)/(x*y*z)"), 10000)
array([[  4.08825739e-01,   5.29644626e-02,   9.44930468e-02],
       [  1.36657898e-01,   8.06204008e-01,   4.62176733e-01],
       [  4.07551758e-01,   1.74937939e-01,   4.54898638e-01],
       ..., 
       [  4.67710629e-01,   3.36476975e-01,   5.08900918e-01],
       [  3.67303658e-04,   1.99492895e-01,   6.25048016e-02],
       [  2.35600911e-01,   8.33069892e-02,   5.75918953e-03]])

Christoph Deil

unread,
Oct 2, 2013, 7:09:40 AM10/2/13
to rootpy...@googlegroups.com
Awesome!

Thank you Noel.

Christoph

Noel Dawe

unread,
Oct 2, 2013, 7:10:05 AM10/2/13
to rootpy...@googlegroups.com
Currently, it's about 4 times faster than looping in python:

In [1]: from ROOT import TF1

In [2]: from root_numpy import random_sample

In [3]: %timeit random_sample(TF1("f1", "TMath::DiLog(x)"), 10000)
1000 loops, best of 3: 1.69 ms per loop

In [4]: func = TF1("f1", "TMath::DiLog(x)")

In [5]: import numpy as np

In [6]: %timeit np.array([func.GetRandom() for i in xrange(10000)])
100 loops, best of 3: 6.79 ms per loop

Noel Dawe

unread,
Oct 2, 2013, 7:44:59 AM10/2/13
to rootpy...@googlegroups.com
By removing some Python API calls in the inner loop I was able to make it a bit faster:

In [1]: from ROOT import TF1

In [2]: from root_numpy import random_sample

In [3]: %timeit random_sample(TF1("f1", "TMath::DiLog(x)"), 10000)
1000 loops, best of 3: 1.29 ms per loop

Christoph Deil

unread,
Oct 2, 2013, 7:50:10 AM10/2/13
to rootpy...@googlegroups.com
On Oct 2, 2013, at 1:10 PM, Noel Dawe <noel...@gmail.com> wrote:

Currently, it's about 4 times faster than looping in python:

Looking at the implementation you can quickly see where it's spending it's time (integral array computation on the first GetRandom call and then a loop over the integral array):

I hadn't done any timing before and thought the speedup would be more, but for my current application a factor of 4 is already very helpful.

Christoph


Noel Dawe

unread,
Oct 2, 2013, 7:57:36 AM10/2/13
to Christoph Deil, rootpy...@googlegroups.com
Good point. Sampling 1M times shows a speedup of x13:

In [1]: from ROOT import TF1

In [2]: from root_numpy import random_sample

In [3]: %timeit random_sample(TF1("f1", "TMath::DiLog(x)"), 1000000)
10 loops, best of 3: 50.3 ms per loop

In [4]: func = TF1("f1", "TMath::DiLog(x)")

In [5]: import numpy as np

In [6]: %timeit np.array([func.GetRandom() for i in xrange(1000000)])
1 loops, best of 3: 663 ms per loop



On Wed, Oct 2, 2013 at 4:51 AM, Christoph Deil <deil.ch...@googlemail.com> wrote:

On Oct 2, 2013, at 1:44 PM, Noel Dawe <noel...@gmail.com> wrote:

In [3]: %timeit random_sample(TF1("f1", "TMath::DiLog(x)"), 10000)
1000 loops, best of 3: 1.29 ms per loop


I'll try this as soon as it's merged in master (being pressed on time and lazy, sorry).
You could see if you get better speedups for 1e6 instead of 1e4 samples because there's the overhead of computing the integrals in sub-bins on the first call.

Thank you very much!
Christoph

Noel Dawe

unread,
Oct 2, 2013, 8:22:54 PM10/2/13
to Christoph Deil, rootpy...@googlegroups.com
Hi Christoph,

The PR is now merged:


Cheers!
Noel
Reply all
Reply to author
Forward
0 new messages