How to get fast random samples from a TF1?

Christoph Deil

unread,

Oct 2, 2013, 5:51:42 AM10/2/13

to rootpy...@googlegroups.com

Hi,

I'd like to sample a ROOT TF1 and have the unbinned values in a numpy array.
Is there a fast way to do this, i.e. where the looping doesn't happen in Python and calling TF1::GetRandom() for each number?

Christoph

Noel Dawe

unread,

Oct 2, 2013, 6:16:53 AM10/2/13

to rootpy...@googlegroups.com

Good idea. I just added this to root_numpy:

https://github.com/rootpy/root_numpy/pull/100

You can use my master branch until this is merged in. I'll add TF2 and TF3 sampling support.

The looping is done on the cython end that gets compiled, so it's pretty fast.

Noel

Christoph

--
You received this message because you are subscribed to the Google Groups "rootpy users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rootpy-users...@googlegroups.com.
To post to this group, send email to rootpy...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rootpy-users/F6551A9F-C037-4EA3-BFCC-1DD4AF8896D1%40gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.

Noel Dawe

unread,

Oct 2, 2013, 7:02:23 AM10/2/13

to rootpy...@googlegroups.com

TF2 and TF3 are now also supported:

>>> from root_numpy import random_sample

>>> from ROOT import TF1, TF2, TF3

>>> random_sample(TF1("f1", "TMath::DiLog(x)"), 10000)

array([ 0.99989852, 0.43970852, 0.5705115 , ..., 0.76644511,

0.57712561, 0.99453755])

>>> random_sample(TF2("f2", "sin(x)*sin(y)/(x*y)"), 10000)

array([[ 0.82546291, 0.89052123],

[ 0.99849559, 0.175143 ],

[ 0.01066793, 0.56227858],

...,

[ 0.9717297 , 0.32888917],

[ 0.07574692, 0.23760245],

[ 0.58331683, 0.42593687]])

>>> random_sample(TF3("f3", "sin(x)*sin(y)*sin(z)/(x*y*z)"), 10000)

array([[ 4.08825739e-01, 5.29644626e-02, 9.44930468e-02],

[ 1.36657898e-01, 8.06204008e-01, 4.62176733e-01],

[ 4.07551758e-01, 1.74937939e-01, 4.54898638e-01],

...,

[ 4.67710629e-01, 3.36476975e-01, 5.08900918e-01],

[ 3.67303658e-04, 1.99492895e-01, 6.25048016e-02],

[ 2.35600911e-01, 8.33069892e-02, 5.75918953e-03]])

Christoph Deil

unread,

Oct 2, 2013, 7:09:40 AM10/2/13

to rootpy...@googlegroups.com

Awesome!

Thank you Noel.

Christoph

To view this discussion on the web visit https://groups.google.com/d/msgid/rootpy-users/CAMKNB%2Bt_K-8W7BAwCK8WUjm%2BC_jn3XVriKd2qmnra5F3eZkg-Q%40mail.gmail.com.

Noel Dawe

unread,

Oct 2, 2013, 7:10:05 AM10/2/13

to rootpy...@googlegroups.com

Currently, it's about 4 times faster than looping in python:

In [1]: from ROOT import TF1

In [2]: from root_numpy import random_sample

In [3]: %timeit random_sample(TF1("f1", "TMath::DiLog(x)"), 10000)

1000 loops, best of 3: 1.69 ms per loop

In [4]: func = TF1("f1", "TMath::DiLog(x)")

In [5]: import numpy as np

In [6]: %timeit np.array([func.GetRandom() for i in xrange(10000)])

100 loops, best of 3: 6.79 ms per loop

Noel Dawe

unread,

Oct 2, 2013, 7:44:59 AM10/2/13

to rootpy...@googlegroups.com

By removing some Python API calls in the inner loop I was able to make it a bit faster:

In [1]: from ROOT import TF1

In [2]: from root_numpy import random_sample

In [3]: %timeit random_sample(TF1("f1", "TMath::DiLog(x)"), 10000)

1000 loops, best of 3: 1.29 ms per loop

Christoph Deil

unread,

Oct 2, 2013, 7:50:10 AM10/2/13

to rootpy...@googlegroups.com

On Oct 2, 2013, at 1:10 PM, Noel Dawe <noel...@gmail.com> wrote:

Currently, it's about 4 times faster than looping in python:

Looking at the implementation you can quickly see where it's spending it's time (integral array computation on the first GetRandom call and then a loop over the integral array):

http://root.cern.ch/root/html/src/TF1.cxx.html#gYdi6C

I hadn't done any timing before and thought the speedup would be more, but for my current application a factor of 4 is already very helpful.

Christoph

To view this discussion on the web visit https://groups.google.com/d/msgid/rootpy-users/CAMKNB%2Bti9qKxhWThhcx0nNQ7n%3DKdH92BmUEcabdkcURtygD1EA%40mail.gmail.com.

Noel Dawe

unread,

Oct 2, 2013, 7:57:36 AM10/2/13

to Christoph Deil, rootpy...@googlegroups.com

Good point. Sampling 1M times shows a speedup of x13:

In [1]: from ROOT import TF1

In [2]: from root_numpy import random_sample

In [3]: %timeit random_sample(TF1("f1", "TMath::DiLog(x)"), 1000000)

10 loops, best of 3: 50.3 ms per loop

In [4]: func = TF1("f1", "TMath::DiLog(x)")

In [5]: import numpy as np

In [6]: %timeit np.array([func.GetRandom() for i in xrange(1000000)])

1 loops, best of 3: 663 ms per loop

On Wed, Oct 2, 2013 at 4:51 AM, Christoph Deil <deil.ch...@googlemail.com> wrote:

On Oct 2, 2013, at 1:44 PM, Noel Dawe <noel...@gmail.com> wrote:

In [3]: %timeit random_sample(TF1("f1", "TMath::DiLog(x)"), 10000)
1000 loops, best of 3: 1.29 ms per loop

I'll try this as soon as it's merged in master (being pressed on time and lazy, sorry).

You could see if you get better speedups for 1e6 instead of 1e4 samples because there's the overhead of computing the integrals in sub-bins on the first call.

Thank you very much!
Christoph

Noel Dawe

unread,

Oct 2, 2013, 8:22:54 PM10/2/13

to Christoph Deil, rootpy...@googlegroups.com

Hi Christoph,

The PR is now merged:

https://github.com/rootpy/root_numpy/pull/100

Cheers!

Noel

Reply all

Reply to author

Forward