More collaborative approach between iminuit, probfit, pyhf and other statistics tools.


Matthieu Marinangeli

Nov 1, 2018, 3:14:37 AM
to Scikit-HEP forum

Dear all,

I would like to start a discussion about building common blocks that can be shared between packages.

For instance, there is no RooStats equivalent in pure Python except pyhf, but pyhf is for binned shape analyses/fits. That's why I started a project called statnight (based on iminuit), see https://github.com/marinang/statnight, where you can do unbinned shape analyses.
Only the asymptotic calculator is implemented for now, and you can see one example here:


I also know that some people from Zurich are designing a fitter using TensorFlow, see https://github.com/zfit/zfit/tree/pdfbase.

Something common to all these packages is that, for instance, everyone defines their own Gaussian. Maybe it should be defined once and shared between packages.
Some ideas that could be shared:
  • a package for pdfs that could be used with different backends (Numpy, TensorFlow, ...). The pdfs should all provide call, integrate, logpdf, cdf, etc., in analytical form where possible, plus tools to build more complex pdfs.
  • a package for building cost functions.
  • a package for generating pseudo-experiments / toys
  • ....
What do you think? There is already a good basis for all of that in probfit, which could perhaps be split up.
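
For concreteness, here is a rough sketch of what such a shared, backend-agnostic pdf interface could look like (all names are illustrative; nothing like this exists yet):

    import numpy as np
    from scipy.special import ndtr  # standard normal CDF, exposed as a fast ufunc

    class PDF:
        """Common interface; a TensorFlow backend would keep the same names."""
        def __call__(self, x):                 # probability density at x
            raise NotImplementedError
        def logpdf(self, x):                   # override when a stabler form exists
            return np.log(self(x))
        def cdf(self, x):
            raise NotImplementedError
        def integrate(self, a, b):             # default in terms of the cdf
            return self.cdf(b) - self.cdf(a)

    class Gaussian(PDF):
        def __init__(self, mu, sigma):
            self.mu, self.sigma = mu, sigma
        def __call__(self, x):
            z = (np.asarray(x) - self.mu) / self.sigma
            return np.exp(-0.5 * z**2) / (self.sigma * np.sqrt(2 * np.pi))
        def cdf(self, x):
            return ndtr((np.asarray(x) - self.mu) / self.sigma)

    g = Gaussian(0.0, 1.0)
    print(g.integrate(-1.0, 1.0))   # ~0.6827, the 1-sigma probability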

Cheers,
Matt 

Jim Pivarski

Nov 1, 2018, 10:23:57 AM
to matthieu.m...@gmail.com, scikit-h...@googlegroups.com
This sounds great! Common protocols for sharing information about fitting would enable specialized fitters to spring up and solve particular problems while still being usable in an analysis that uses multiple fitting techniques. It's much easier (and more desirable, in my opinion) to develop specialized tools with common protocols than one tool to try to win the mindshare of everyone.

However, I wanted to point out that there's already a pythonic library of PDFs: SciPy. See their continuous distribution library and their discrete distribution library. It gives you all of the PDFs, CDFs, and random deviates you're likely to need in fitting in a variety of ways. More importantly, it standardizes a set of names and calling procedures to get the same kind of information from each, and it plays well with Numpy (after all, Numpy was developed to support SciPy from the beginning).

If some distributions we need for physics are not present, it could be a good idea to create a library of those missing distributions, in the protocol already established by SciPy, so that they're interchangeable with those from SciPy. Maybe they could even be contributed upstream, but maintaining an "extras" library would be faster and wouldn't be mutually exclusive with formally including them in SciPy. (It could be an on-ramp, the way Boost is effectively an on-ramp for C++ standardization...)
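
To illustrate the mechanism (with a toy x^2 density standing in for a real physics shape), a new distribution only needs to subclass scipy.stats.rv_continuous and it inherits the whole SciPy protocol:

    from scipy import stats

    class CubeGen(stats.rv_continuous):
        """Toy density f(x) = 3x^2 on [0, 1]; a real physics shape would go here."""
        def _pdf(self, x):
            return 3.0 * x**2   # must be normalized over the support
        def _cdf(self, x):
            return x**3         # analytic CDF makes ppf and rvs fast

    cube = CubeGen(a=0.0, b=1.0, name="cube")
    print(cube.pdf(0.5), cube.cdf(0.5), cube.rvs(size=5))  # full protocol for free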

(There might already be an "extras" library of statistical distributions in SciPy form... We should search before inventing.)

Also, if you're not aware of the HSF/PyHEP-Histogramming Gitter, I'm trying to develop a standard for exchanging histograms across libraries and this would be very relevant for fitting. I want this protocol to be able to encode such information as which histograms belong to which signal and control regions, which represent systematic variations on a single distribution, etc. This information shouldn't have to be derived from naming conventions by parsing names or re-expressed by forcing the user to create data cards assigning hundreds of free-floating histograms to particular meanings in the fit. Also, because I know that unbinned fits are important and may be part of a combined binned+unbinned fit, this histogram protocol will probably include a simple flat ntuple type, so that a control region within the layout can be easily swapped between binned and unbinned.

The histogram protocol should not describe the fitting model, which is among the things you're talking about standardizing here. I'd like to collaborate on how to draw this line and how a user who wants to combine a pack of histograms (and ntuples) with a model has an easy time doing so.

Cheers,
-- Jim




Matthieu Marinangeli

Nov 1, 2018, 10:54:57 AM
to Jim Pivarski, scikit-h...@googlegroups.com
Hi Jim,
This sounds great! Common protocols for sharing information about fitting would enable specialized fitters to spring up and solve particular problems while still being usable in an analysis that uses multiple fitting techniques. It's much easier (and more desirable, in my opinion) to develop specialized tools with common protocols than one tool to try to win the mindshare of everyone.  

It is also my opinion.


However, I wanted to point out that there's already a pythonic library of PDFs: SciPy. See their continuous distribution library and their discrete distribution library. It gives you all of the PDFs, CDFs, and random deviates you're likely to need in fitting in a variety of ways. More importantly, it standardizes a set of names and calling procedures to get the same kind of information from each, and it plays well with Numpy (after all, Numpy was developed to support SciPy from the beginning). 

Yes, true, but if we want to build more complex fit models it would be nice to have a package that adds pdfs, normalizes them, distributes shared parameters correctly, etc., while keeping the same SciPy standard for the total pdf, for instance.

Cheers,
Matt

Henry Schreiner

Nov 1, 2018, 10:55:30 AM
to Jim Pivarski, matthieu.m...@gmail.com, scikit-h...@googlegroups.com
Just a note on scipy: it is very, very slow. I've gone back through several codes and managed a roughly 100x speedup in each just by redoing norm.pdf and norm.cdf in numpy. (I can get another factor of 10 with numba, but that was not as important as the first 100x.) I'd like to see a similar, high-performance set of distributions that might be numba- or even (dreaming here) CUDA/ROCm-friendly (could be accelerated, or already accelerated if numba is present). I would also be interested in PDF composition, which SciPy does not support. It's not that easy to apply the PDFs to general fits, either.

I’m not against using it as a standard, but some things like performance and usability need to be kept in mind if it is.
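
For reference, the rewrite is tiny; the gain comes from skipping the per-call argument-handling machinery in scipy.stats (largest when a function is called many times on small inputs, as in a minimizer loop), not from better math. Something like this, with scipy.special.ndtr supplying the normal CDF as a plain ufunc:

    import numpy as np
    from scipy import stats, special

    def norm_pdf(x, mu=0.0, sigma=1.0):
        z = (x - mu) / sigma
        return np.exp(-0.5 * z * z) / (sigma * np.sqrt(2.0 * np.pi))

    def norm_cdf(x, mu=0.0, sigma=1.0):
        return special.ndtr((x - mu) / sigma)   # ndtr is a raw ufunc

    x = np.random.normal(size=1000)
    np.testing.assert_allclose(norm_pdf(x), stats.norm.pdf(x))  # same math
    np.testing.assert_allclose(norm_cdf(x), stats.norm.cdf(x))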

Henry
 

Jim Pivarski

Nov 1, 2018, 11:25:45 AM
to matthieu.m...@gmail.com, scikit-h...@googlegroups.com
I see. I didn't know about the slowness problem (is the venerable old SciPy getting ignored in the new era?). Vectorizability and co-processors would also be good, and maybe SciPy won't be taking that on soon. Also, that point about composability (distribution + distribution = distribution) is clearly important for RooFit-style fit-building, and I guess it's unlikely that SciPy will take that on.

Okay, I'm sold on the inadequacy of SciPy the library, but its centrality in the ecosystem breaks the symmetry of possible choices of conventions: we should adopt the spellings and method signatures that they use. The original example of "everybody implements their own Gaussian" made me think of the fact that I usually write the formula for a Gaussian PDF inline because I can remember its formula better than "normal" vs "norm" vs "gauss" vs "gaussian" vs "Gauss" (capitalized because it's someone's name) vs "gaus" (why did ROOT leave off the second "s"???).

Maybe we could reproduce a well-specified subset of the SciPy interface, not bothering to implement a method for the nth moment, while also supersetting it to provide (e.g.) composition. (Replacing functions with objects that have __add__, __mul__, etc. to build a fit function, and its derivatives, would be pretty cool.)
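
A sketch of that operator-overloading idea (hypothetical classes wrapping frozen SciPy distributions, so the SciPy spellings stay authoritative):

    from scipy import stats

    class Model:
        def __call__(self, x):
            raise NotImplementedError
        def __add__(self, other):        # model + model
            return Sum(self, other)
        def __mul__(self, coeff):        # coefficient * model
            return Scaled(self, coeff)
        __rmul__ = __mul__

    class Sum(Model):
        def __init__(self, left, right):
            self.left, self.right = left, right
        def __call__(self, x):
            return self.left(x) + self.right(x)

    class Scaled(Model):
        def __init__(self, model, coeff):
            self.model, self.coeff = model, coeff
        def __call__(self, x):
            return self.coeff * self.model(x)

    class Wrap(Model):
        def __init__(self, frozen):      # any frozen SciPy distribution
            self.frozen = frozen
        def __call__(self, x):
            return self.frozen.pdf(x)

    # a signal-plus-background fit function, built by composition:
    model = 0.8 * Wrap(stats.norm(0.0, 1.0)) + 0.2 * Wrap(stats.expon(scale=5.0))
    print(model(0.0))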

Using the same names and method structure would also simplify unit testing: overlaps with SciPy should reproduce SciPy results (SciPy can be required in the testing; it doesn't matter if the tests are a little slow).

If possible, it could be valuable to define the implementations purely in terms of Numpy ufuncs, so that it transparently works for scalars, arrays, jagged arrays, delayed Dask arrays, and CuPy on GPUs. These, too, would have to be introduced to the testing early so that not too much is built on a mistaken assumption about how transparent those plug-ins are. (For instance, awkward-array, Dask, and CuPy objects pass through Numpy ufuncs because they implement the __array_ufunc__ protocol, but not Numpy functions. There's an upcoming NEP to make that possible.)
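
To illustrate: a function whose body contains nothing but arithmetic and ufuncs needs no changes to accept any array type that implements __array_ufunc__ (the CuPy line is hypothetical):

    import numpy as np

    def expon_logpdf(x, tau):
        # nothing here but operators and np.log (a ufunc)
        return -x / tau - np.log(tau)

    print(expon_logpdf(1.5, 2.0))                    # plain scalar
    print(expon_logpdf(np.linspace(0, 5, 6), 2.0))   # NumPy array
    # expon_logpdf(cupy.linspace(0, 5, 6), 2.0)      # would stay on the GPU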

-- Jim

Matthieu Marinangeli

Nov 1, 2018, 12:02:17 PM
to Jim Pivarski, scikit-h...@googlegroups.com
I really like the idea of building functions as objects, with a common base class like SciPy's https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html#scipy.stats.rv_continuous

and adding __mul__, __add__, etc. I would also add an "integrate" method in order to avoid cdf(b) - cdf(a).

I have also thought about Numpy ufuncs; they are less easy to implement, though, I guess.

Matt

Jim Pivarski

Nov 1, 2018, 12:26:31 PM
to Matthieu Marinangeli, scikit-h...@googlegroups.com
While I can't dedicate my time to it as a primary developer, I can help with particularly hard ufunc problems and review code to avoid lock-in issues downstream. The hardest things to turn into ufuncs are function implementations that have to "iterate until converged," but there are some things that can be done.

Here, I'm just talking about a PDF library (and again, keep checking for an already-existing one!). The initial e-mail discussed several points for standardization.

Jim


Brian Pollack

Nov 1, 2018, 12:54:45 PM
to Jim Pivarski, Matthieu Marinangeli, scikit-h...@googlegroups.com
Just my own two cents: I've enjoyed using `lmfit` (http://lmfit.github.io/lmfit-py/index.html) for a lot of my recent curve fitting in python (including multidimensional and simultaneous fitting). This package wraps a lot of scipy.optimize but provides what I find to be relatively easy-to-use and straightforward model building (http://lmfit.github.io/lmfit-py/model.html). Similar to Henry's comment, I use a combination of numpy and numba decorators to improve the execution time of these function calls. This package already implements a slew of optimization algorithms, parameter bounds, constraints, plotting methods, and the ability to create composite models (http://lmfit.github.io/lmfit-py/model.html#composite-models-adding-or-multiplying-models), although I'm not sure if the entire set of desired algebra is currently implemented.
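
For those who haven't tried it, composite model building in lmfit looks roughly like this (an untested sketch following the docs linked above):

    import numpy as np
    from lmfit.models import GaussianModel, LinearModel

    # Gaussian peak plus linear background, composed with `+`
    model = GaussianModel(prefix="sig_") + LinearModel(prefix="bkg_")

    x = np.linspace(-5, 5, 201)
    y = (3.0 * np.exp(-0.5 * x**2) + 0.1 * x + 0.5
         + np.random.normal(0.0, 0.05, x.size))

    params = model.make_params(sig_amplitude=2.0, sig_center=0.0, sig_sigma=1.0,
                               bkg_slope=0.0, bkg_intercept=0.0)
    result = model.fit(y, params, x=x)
    print(result.fit_report())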

Also, @Henry, could GooFit fit into this discussion, especially regarding performance and GPU parallelization?

-Brian


--
Brian Pollack, Ph.D.
Experimental Particle Physics
Northwestern University

Matthieu Marinangeli

Nov 5, 2018, 8:27:02 AM
to Jim Pivarski, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
Just back at this. I had a look at how probability density functions are defined in scipy, and they actually use ufuncs, the so-called special functions, see https://docs.scipy.org/doc/scipy/reference/special.html. But it's true that it is nonetheless slow, super slow.

In TensorFlow Probability there are some distributions already coded, https://www.tensorflow.org/probability/api_docs/python/tfp/distributions, and also a way to combine pdfs called "Mixture".

Matt

Eduardo Rodrigues

Nov 5, 2018, 8:31:58 AM
to Matthieu Marinangeli, Jim Pivarski, Henry Fredrick Schreiner, scikit-h...@googlegroups.com

Hi all,

 

Just for the sake of it, let me say that I very much like the idea posted by Matt. We actually talked about the matter before, the two of us. IIRC Jim will be at CERN this week. Shall those of us who can try and meet? Jim, if you let us have your availability we can hopefully find a slot ... Maybe you, Matt, could come over from Lausanne?

 

Thanks,

Eduardo

 

| Eduardo Rodrigues | University of Cincinnati | LHCb Experiment @ CERN & DIANA/HEP | Tel. +41 (0) 22 76 72097 |



 






Matthieu Marinangeli

Nov 5, 2018, 8:36:41 AM
to Jim Pivarski, Henry Fredrick Schreiner, Eduardo Rodrigues, scikit-h...@googlegroups.com
Hi Eduardo,

Good idea. Yes I can easily come from Lausanne.

Matt 

Jim Pivarski

Nov 5, 2018, 8:41:32 AM
to Eduardo Rodrigues, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
Which Jim? Me? I won't be at CERN. I can join a conference call, though.

Jim Pivarski

Eduardo Rodrigues

Nov 5, 2018, 8:50:12 AM
to Jim Pivarski, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com

Hi Jim,

 

Yep, I was under the impression that you would be physically here for the HSF software forum on Wednesday. Will you be coming any time soon? Else we could indeed arrange a meeting with Vidyo ...

 

Thanks,

Eduardo

 

| Eduardo Rodrigues | University of Cincinnati | LHCb Experiment @ CERN & DIANA/HEP | Tel. +41 (0) 22 76 72097 |

 

 

Jim Pivarski

Nov 5, 2018, 9:29:17 AM
to Eduardo Rodrigues, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
I'll be calling in to the HSF meeting. The next time I'll be at CERN will either be for a week before or (more likely) after ACAT in March 2019.

We can meet by your choice of video service, though.

Jim



Eduardo Rodrigues

Nov 5, 2018, 10:39:55 AM
to Albert Puig, Jim Pivarski, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com

Hi,

 

Seems there is interest above threshold. How would Thursday 16h30 fit? Could most of you attend?

 

Thanks,

Eduardo

 

| Eduardo Rodrigues | University of Cincinnati | LHCb Experiment @ CERN & DIANA/HEP | Tel. +41 (0) 22 76 72097 |

 

From: Albert Puig
Sent: 05 November 2018 15:56
To: Jim Pivarski
Cc: Eduardo Rodrigues Figueiredo; Matthieu Marinangeli; Henry Fredrick Schreiner; scikit-h...@googlegroups.com
Subject: Re: More collaborative approach between iminuit, probfit, pyhf and other statistics tools.

 

Hi all,

 

we have done a lot of this work in zfit already, staying close to TF distributions (and its ecosystem), but we want to build something with a common API so that tools are reasonably interchangeable. Of course, sharing too much goes against specialisation and speed, but at least it would be great to define a common API for interacting with certain common things.

 

BTW, I can also be around for a discussion this week.

 

Albert

 

PS: The goal of zfit is to build a complete NLL minimisation library based on tensorflow (for now), along with PDFs, that is interchangeable with the rest of the python ecosystem, with a focus on scalability, parallelization and ease of use (no cython, no C++ needed at all to extend)

 

 


 

 


Jim Pivarski

Nov 5, 2018, 10:51:41 AM
to Eduardo Rodrigues, alber...@cern.ch, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
On Mon, Nov 5, 2018 at 9:39 AM Eduardo Rodrigues <eduardo....@cern.ch> wrote:

Seems there is interest above threshold. How would Thursday 16h30 fit? Could most of you attend?


Yes, I can attend. Let me know what the venue is (Vidyo, Skype, phone, etc.).
-- Jim
 

Hans Dembinski

Nov 5, 2018, 12:32:47 PM
to Jim Pivarski, matthieu.m...@gmail.com, scikit-h...@googlegroups.com
Jumping late into the discussion...

> On 1. Nov 2018, at 16:25, Jim Pivarski <jpiv...@gmail.com> wrote:
>
> I see. I didn't know about the slowness problem (is the venerable old SciPy getting ignored in the new era?). Vectorizability and co-processors would also be good, and maybe SciPy won't be taking that on soon. Also, that point about composability (distribution + distribution = distribution) is clearly important for RooFit-style fit-building, and I guess it's unlikely that SciPy will take that on.

When you have scaled pdfs that are added as part of the model (signal + background pdf is a typical example), the amplitudes of those pdfs can be fitted semi-analytically with the NNLS algorithm. This can make the fit much faster and/or more stable, so it is good if the fitting tools can distinguish between parameters which change pdfs in shape and amplitude parameters which scale those pdfs.
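
For illustration: with the shape parameters held fixed, the predicted yields are linear in the amplitudes, so the amplitudes can be solved for directly with scipy.optimize.nnls (toy binned example):

    import numpy as np
    from scipy.optimize import nnls

    edges = np.linspace(0.0, 10.0, 51)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # template shapes at the current values of the shape parameters
    signal = np.exp(-0.5 * ((centers - 5.0) / 0.5) ** 2)
    background = np.exp(-centers / 10.0)
    A = np.column_stack([signal, background])

    observed = 100.0 * signal + 50.0 * background   # toy data

    amplitudes, residual = nnls(A, observed)        # nonnegative least squares
    print(amplitudes)                               # ~ [100, 50]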

> Okay, I'm sold on the inadequacy of SciPy the library, but its centrality in the ecosystem breaks the symmetry of possible choices of conventions: we should adopt the spellings and method signatures that they use.

+1

> The original example of "everybody implements their own Gaussian" made me think of the fact that I usually write the formula for a Gaussian PDF inline because I can remember its formula better than "normal" vs "norm" vs "gauss" vs "gaussian" vs "Gauss" (capitalized because it's someone's name) vs "gaus" (why did ROOT leave off the second "s"???).

Yes, why did they do this??? It is probably inherited from PAW. People educated in FORTRAN tend to treat letters in variable and function names like they cost them money.

> Maybe we could reproduce a well-specified subset of the SciPy interface, not bothering to implement a method for the nth moment, while also supersetting it to provide (e.g.) composition. (Replacing functions with objects that have __add__, __mul__, etc. to build a fit function— and derivaties!— would be pretty cool.)
>
> Using the same names and method structure would also simplify unit testing: overlaps with SciPy should reproduce SciPy results (SciPy can be required in the testing; it doesn't matter if the tests are a little slow).

+1

I think that would also keep the project relevant in the long run, because it can be merged upstream into the scipy distribution.

> If possible, it could be valuable to define the implementations purely in terms of Numpy ufuncs, so that it transparently works for scalars, arrays, jagged arrays, delayed Dask arrays, and CuPy on GPUs. These, too, would have to be introduced to the testing early so that not too much is built on a mistaken assumption about how transparent those plug-ins are. (For instance, awkward-array, Dask, and CuPy objects pass through Numpy ufuncs because they implement the __array_ufunc__ protocol, but not Numpy functions. There's an upcoming NEP to make that possible.)

Best regards,
Hans

Eduardo Rodrigues

Nov 6, 2018, 7:17:07 AM
to Jim Pivarski, alber...@cern.ch, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com

Hi Matt, Jim, Henry, Hans, Albert, Brian, all,

 

I hope as many of you can join and discuss the proposal from Matt this Thursday 16h30 CET.

We have a room booked at CERN for those who appreciate the face-to-face chat when possible.

 

The conference room 14-4-002

has been booked for Eduardo Rodrigues

on 8 Nov 2018 from 16:30 to 18:00.

Reason: Scikit-HEP Meeting

 

We will be using my room on vidyo as 14-4-002 does not have equipment. It’s the link https://vidyoportal.cern.ch/join/1RPErGNe9gMf.

 

See you, thank a lot,

Eduardo et al.

 

| Eduardo Rodrigues | University of Cincinnati | LHCb Experiment @ CERN & DIANA/HEP | Tel. +41 (0) 22 76 72097 |

 


Jim Pivarski

Nov 7, 2018, 8:26:26 AM
to Hans Dembinski, Matthieu Marinangeli, scikit-h...@googlegroups.com
I was thinking about this project to standardize nonlinear fitting models in Python (like RooStats or SciPy distributions, but fast in Python), and I remembered a few more packages to consider, then found more.

There's a widely-used package called "statsmodels" that is supposed to be the statistical model builder for Python. It emulates R's "formula" interface closely because they're trying to adhere to that established pattern. However, statsmodels provides a hundred different ways to do linear fitting and no truly nonlinear fitting, as far as I could find. It reminds me of the stats book the experienced statistician at my old job had us read: the first half of the book was about modifications to the ordinary least squares loss function, but all of the fit models were linear. (The second half was machine learning. It's a good book anyway, with great insight into how different orders of regularization work.) statsmodels seems to have the same emphasis.

There was a GSoC project to bring nonlinear fitting to statsmodels, but it is listed as being incomplete/not merged. See also: their blog.

Another library that seems like what we want is "patsy," a project for "describing statistical models in Python." However, this seems to actually be about the "kernel trick," in which you apply a nonlinear transform to your features and do a linear fit to the result. It's a great trick when you can do it, because linear fitting is fast and deterministic. They also use a formula language that looks like it was inspired by R.

Scikit-Optimize is all about using machine learning to optimize functions, and therefore must include nonlinear fitting, but I don't see much in its documentation about building models from common functions.

One thing that looks exactly like the package we'd want is "LMFIT," a nonlinear least-squares fitter. It has a models library that admittedly has only the basics, but their Model class is composable in a formulaic way: by overriding __add__, __mul__, etc. They have confidence intervals that you can plot as contours and heatmaps— now I'm beginning to wonder if the authors are physicists... Yes, they are: the authors, Matthew Newville and Till Stensitzki, have publications in X-ray and IR-laser spectroscopy, respectively. They're not high-energy physicists, but I'm not at all surprised that they're physicists, given that they're actually interested in nonlinear fitting.

For completeness, the general recommendation for StackOverflow questions about nonlinear fitting is to use SciPy's curve_fit. We've talked about that.

I had to dig to find LMFIT, but it looks like a pretty good match to what we want and I suspect that we'd want to adapt it to our use-cases, starting by adding models. The interface looks pretty much like what I'd design if I were doing it. Can anyone give me a good argument not to team up with LMFIT?

-- Jim






Henry Schreiner

Nov 7, 2018, 8:54:26 AM
to Jim Pivarski, Hans Dembinski, Matthieu Marinangeli, scikit-h...@googlegroups.com
Hi Jim,

If you look back at earlier portions of the thread, LMFit was suggested. I taught a class on fitting distributions a few weeks ago, and I used LMFit and GooFit; probfit could not be installed on our Python 3-only system at that time (since fixed by a new release). I didn't see a clear way to do an unbinned NLL fit, but otherwise LMFit looked like it provides a lot of what we need. If we could team up, and make sure that either the methods or maybe just the documentation cover unbinned fits and some HEP functionality, that might be ideal. I'm a big fan of not reinventing the wheel unless we have to.

I won’t be at tomorrow’s discussion due to being on a detector shift. I might listen in if it’s quiet, though.

Henry


Jim Pivarski

Nov 7, 2018, 9:04:54 AM
to henrysch...@gmail.com, Hans Dembinski, Matthieu Marinangeli, scikit-h...@googlegroups.com
Ack! Sorry Brian, for missing your suggestion. I started the e-mail about statsmodels and was actively web-searching while I wrote it, finding LMFIT at the end of that.

It is true that unbinned fits are essential. I didn't expect that it would work out of the box; in fact, I was expecting to find something like the SciPy distributions, whose general API we adhere to but which we don't use directly. LMFIT looks like it would be a smaller delta to our needs than implementing from scratch (which isn't always the case!).

-- Jim

Brian Pollack

Nov 7, 2018, 10:14:13 AM
to Jim Pivarski, henrysch...@gmail.com, Hans Dembinski, Matthieu Marinangeli, scikit-h...@googlegroups.com
No problem ;) I will do my best to attend tomorrow's meeting as well, so we can discuss further. Like you and Henry noted, LMFIT doesn't seem to have native support for unbinned NLL fitting, or really any kind of fitting that operates on raw univariate data. But I think adding support would be relatively minor compared to all the groundwork that has been laid thus far by Matt Newville and co.

-Brian



Christoph Deil

Nov 7, 2018, 10:30:15 AM
to Jim Pivarski, Hans Dembinski, Matthieu Marinangeli, scikit-h...@googlegroups.com
Hi,

I think in Python there are too many projects for modelling & fitting, and unfortunately none has yet emerged with high quality and a large user and contributor base.

E.g. in Julia it seems that people are collaborating much more. I'm not using it myself, but it seems they have a coordinated and concerted effort to develop a set of packages that complement and build on each other: https://www.juliaopt.org/ .

It would be really nice to have the same in Python.

I'm an astronomer, and there are also a few modelling packages on our side, two big ones among them.
They have nice modelling languages, but not really cost-function builders or unbinned analysis.
There were some efforts to unite, at least in astronomy, like discussion sessions a few years ago at one of the http://openastronomy.org/pyastro/ conferences, and something was done (https://saba.readthedocs.io), but overall it hasn't happened yet.

Sherpa has a few good algorithms for parameter error estimation (https://sherpa.readthedocs.io/en/latest/fit/index.html#estimating-errors), whereas most nonlinear optimisation packages don't yield parameter errors. I see that lmfit merged https://github.com/lmfit/lmfit-py/pull/506 a week ago, using numdifftools - I haven't tried it yet; it would be interesting to know how that compares to HESSE.

I guess for the people who want something built on top of Numpy, talking to the lmfit, statsmodels and scipy.optimize devs is a good idea, and for Bayesian analysis maybe https://docs.pymc.io ?

In the Tensorflow universe I guess  https://www.tensorflow.org/probability/  is starting to emerge and contributing there would be a good idea?
Or does that not do what you need, and instead you plan to focus your efforts on GooFit or something else?

To explain a bit where I'm coming from: I'm a gamma-ray astronomer at MPIK Heidelberg working on Gammapy and models like https://docs.gammapy.org/0.8/api/gammapy.cube.models.SkyModel.html - if you look, you'll see that our data is mostly 3D, with an energy axis and two sky coordinates (spherical geometry), and our code is built on Astropy, so it is too domain-specific to be appropriate for HEP analysis. I'm mostly interested in collaboration on lower-level packages for parameter optimisation and parameter error analysis, i.e. what iminuit and Sherpa do, but extracted into a Python package that just builds on Numpy (or alternatively Tensorflow, I guess) and is itself ideally pure Python, i.e. a light and simple dependency to adopt that, after some development in the next year or two, is long-term stable like Numpy.

I’ll try to join the call tomorrow.

Christoph


Albert Puig

Nov 7, 2018, 10:35:26 AM
to Scikit-HEP forum
Dear all,

(begin shameless plug) The zfit package that Matthieu mentioned in his first email could also be interesting to fill this gap (https://github.com/zfit/zfit/). With a design philosophy similar to LMFIT's, zfit is more focused on HEP and therefore aims at providing the features that are most useful to our community. Also, it's based on tensorflow, with all the advantages that this entails.

While still not in beta stage (getting closer, though; we're working on finalizing the API to be able to interact with other packages), it already provides a fitting experience similar to LMFIT's for unbinned (extended) fits, although the model library is not that complete yet (we're integrating with tf_distributions) and the library needs a bit of polishing. Detailed documentation is also high on our priority list.

In addition, it provides a few advantages over LMFIT:
  • More HEP-oriented, so things like PDF handling, integration, and unbinned fits, as well as generation, are built into the library. iminuit integration is also there.
  • It's not really built from scratch but on top of scipy optimizers, TensorFlow Probability distributions, etc.
  • GPU integration, courtesy of tensorflow.
  • Advanced fit mode, also courtesy of tensorflow. While by default every operation is executed eagerly, it's possible to switch to a non-eager mode to perform advanced graph manipulation.

On top of this, we are working on several features using TensorFlow capabilities, such as model sharing, that could be very useful for analysis preservation/reproducibility.

All this to say that by using zfit we would not be approaching the problem from scratch either, and even though it's not yet at production level, its feature set is closer to HEP by construction. We can discuss tomorrow.

Albert

PS: In a couple of weeks we'll know if we have funding to have somebody work part-time on zfit, in case it helps.

Christoph Deil

Nov 7, 2018, 11:02:06 AM
to Albert Puig, Scikit-HEP forum
Hi Albert,

Looking at https://github.com/zfit/zfit it's really hard to see what it is; even pasting the description from below into the README and docs would be very useful if you already want co-developers or early adopters to try it out.

There's also the sticky license split preventing collaboration and re-use. I see you put zfit under the GPL. That means people working on projects with a liberal license like BSD or MIT (like numpy, scipy, Tensorflow, and the larger fraction of the Python scientific ecosystem) can't import zfit in their BSD or MIT code and won't look at the code, because it would be a license violation to re-use it in the liberally licensed package they work on.

I know there are good reasons for choosing the GPL; sometimes you must (if you build on GPL code yourself), and some people prefer it strongly over a liberal license; that's a completely valid choice. I'm just saying: if you don't care strongly about the license, and want to maximise collaboration and sharing, consider a liberal license (like Numpy or Tensorflow and most packages built on those).

This unfortunate situation in open source and free software will always remain to some extent and is a real issue that hinders collaboration and re-use. E.g. the scipy devs would like to wrap FFTW because it's the fastest in the west, but don't, because they want a liberal license and FFTW is GPL.

Christoph


Albert Puig

Nov 7, 2018, 11:16:57 AM
to deil.ch...@googlemail.com, scikit-h...@googlegroups.com
Hi Christoph,

thanks for the feedback, you're right. A few comments below

On Wed, Nov 7, 2018 at 5:02 PM Christoph Deil <deil.ch...@googlemail.com> wrote:
Hi Albert,

Looking at https://github.com/zfit/zfit it's really hard to see what it is; even pasting the description from below into the README and docs would be very useful if you already want co-developers or early adopters to try it out.


Yeah, this is true. As I said, we're not yet ready to share it with early adopters (but we think this should happen before Christmas); it just seemed a worthy addition to the discussion. By the time we make it "public", we plan to have a bit of documentation and tutorials (you can see https://github.com/zfit/zfit/blob/develop/examples/simple_dist1.ipynb for a first attempt at outlining the features of the library, as well as the tests in the develop branch).
 
There's also the sticky license split preventing collaboration and re-use. I see you put zfit under the GPL. That means people working on projects with a liberal license like BSD or MIT (like numpy, scipy, Tensorflow, and the larger fraction of the Python scientific ecosystem) can't import zfit in their BSD or MIT code and won't look at the code, because it would be a license violation to re-use it in the liberally licensed package they work on.

I know there are good reasons for choosing the GPL; sometimes you must (if you build on GPL code yourself), and some people prefer it strongly over a liberal license; that's a completely valid choice. I'm just saying: if you don't care strongly about the license, and want to maximise collaboration and sharing, consider a liberal license (like Numpy or Tensorflow and most packages built on those).

This unfortunate situation in open source and free software will always remain to some extent and is a real issue that hinders collaboration and re-use. E.g. the scipy devs would like to wrap FFTW because it's the fastest in the west, but don't, because they want a liberal license and FFTW is GPL.


There should not be any problem changing the license, to be honest. We made a choice, but we didn't have a strong motivation. What we want is the license that is most useful for the community, so feedback like yours is super useful. Given no restrictions, what would be the ideal license from the Scikit-HEP/HEP community point of view?

Thanks,
Albert


--
Dr. Albert Puig Navarro
SNSF Ambizione Fellow
Physik-Institut @ Universität Zürich
http://cern.ch/albert.puig

Jim Pivarski

Nov 7, 2018, 12:23:46 PM
to apui...@gmail.com, deil.ch...@googlemail.com, scikit-h...@googlegroups.com
On Wed, Nov 7, 2018 at 10:17 AM Albert Puig <apui...@gmail.com> wrote:
There should not be any problem changing license, to be honest. We made a choice, but we didn't have a strong motivation. What we want is a license that is the most useful for the community, so feedback like yours is super useful for this. Given no restrictions, what would be the ideal license from the SciKit-HEP/HEP community point of view?

I used to use Apache-2, but switched to BSD-3 because that's what the other projects in Scikit-HEP were using. (Also, I didn't see a difference between Apache, BSD, MIT, etc. Just GPL— that's the one that has the "copyleft" clause.)

Christoph Deil

Nov 7, 2018, 1:10:16 PM
to Jim Pivarski, Albert Puig, scikit-h...@googlegroups.com
There’s a detailed comment by Anthony Scopatz on the difference between BSD and MIT and a strong advocacy for BSD here:

The main difference is that BSD requires attribution if someone copies some code from your package into theirs, whereas MIT doesn't; there, others can just copy e.g. a function or code snippet, put it into their package (which can be MIT or BSD or GPL licensed) and not even mention that it comes from you.

I don't know much about Apache-2 - it is longer and mentions e.g. "patent" and "trademark":
Google chose it for Tensorflow, you can read it here: https://github.com/tensorflow/tensorflow/blob/master/LICENSE 

Those three are all pretty similar, all very liberal.

GPL is the one that’s really different, there’s many presentations and blog posts and articles discussing the differences.


It can get complicated:

ROOT is LGPL: https://root.cern.ch/license : it’s OK to link against (and in Python import) LGPL from liberally licenses packages, e.g. iminuit code is MIT and links and bundles MINUIT which is LGPL. So the total Python package effectively is LGPL. But we could e.g. copy & paste code from the iminuit part into a new package and put that under any license we like.

License choice can actually matter a lot: e.g. in the Tensorflow ecosystem you probably won't be very successful (i.e. be used and promoted by Google) if it's not a liberal license - they wouldn't be able to build their commercial products on such a package. Another example: we could discuss and find that RooFit has a great design, but should really be rewritten in Python & Numpy to be useful and extensible via Python & Numpy. Then you have to be very careful if you want a more liberal license than LGPL (e.g. BSD) for the new package: it would have to be a "clean room" implementation where you only look at a high-level description of RooFit, not the code, while doing it.

I don’t think scikit-hep should dictate the license for affiliated packages. Astropy doesn’t do this. Most Astropy-affiliated packages are BSD, but I think a few are GPL (have to be if they e.g. use FFTW or GSL, or maybe the author preferred the GPL philosophy). That’s a choice every author / project / repo has to make.

Christoph (not a lawyer)

Albert Puig

Nov 7, 2018, 3:17:37 PM
to deil.ch...@googlemail.com, Jim Pivarski, scikit-h...@googlegroups.com
Hi Christoph, Jim,

thanks a lot for your input. We have changed our license to BSD-3.

Albert

Eduardo Rodrigues

Nov 8, 2018, 10:33:44 AM
to Jim Pivarski, alber...@cern.ch, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com

Hi all,

 

We will be starting soon ...

 

Thanks,

Eduardo

 

| Eduardo Rodrigues | University of Cincinnati | LHCb Experiment @ CERN & DIANA/HEP | Tel. +41 (0) 22 76 72097 |

 

 



Jim Pivarski

Nov 8, 2018, 10:36:47 AM
to Eduardo Rodrigues, alber...@cern.ch, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
I've got the infinite-refresh issue with Vidyo (it's recently started doing this). I'm working on it; rebooting various things to try to connect...


Hans Dembinski

Nov 8, 2018, 10:38:05 AM
to Jim Pivarski, Eduardo Rodrigues, Albert Puig, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
Hi, Christoph and I also have trouble connecting. Is moving to Skype a possible alternative?

Jim Pivarski

Nov 8, 2018, 12:20:54 PM
to Eduardo Rodrigues, alber...@cern.ch, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
Great meeting, everyone! Here's the Google Doc:


It has been templated and is ready to fill in. (Yes, that's an editable link. We're all friends here.)

-- Jim

Jim Pivarski

Nov 12, 2018, 10:33:31 AM
to alber...@cern.ch, Eduardo Rodrigues, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
On Mon, Nov 12, 2018 at 5:13 AM Albert Puig <alber...@cern.ch> wrote:
before filling in the Google Doc, we've spent some time formalising the high-level API (meaning minimize, loss functions, etc., not model building) that we were discussing during the meeting. You can find it in 


This is basically what zfit implements (or plans to implement). Please let us know what you think.

First, thanks for the clarity! My comments are below.

I don't understand why the "sample" function requires a "limits" argument. Deviate sampling with the rejection method would require parameters like this, but if you can compute the inverse CDF of your distributions, then you can throw random numbers in (0, 1) and pass them through the inverse CDF to get random deviates. I also find the name "n_draws" unnatural. "n" would work. I don't think it's unreasonable to expect an implementation to be able to produce inverse CDFs of its distributions. Does anyone have a counterexample?
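
In SciPy spelling (ppf is the inverse CDF), even sampling within limits needs no rejection step:

    import numpy as np
    from scipy import stats

    rng = np.random.RandomState(42)
    d = stats.norm(0.0, 1.0)

    x = d.ppf(rng.uniform(0.0, 1.0, size=10000))             # unrestricted deviates

    a, b = -1.0, 2.0                                         # truncation to (a, b):
    xt = d.ppf(rng.uniform(d.cdf(a), d.cdf(b), size=10000))  # uniforms in (F(a), F(b))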

Parameter's "upper_error" and "lower_error" are good, but it would be nice if these were aliases to a general function "error(p)" or "error_at(p)". That way, 95% confidence levels would be part of the same mechanism as getting asymmetric errors.

Does "gauss.prob" return probabilities (integrations of the PDF) or probability densities (the PDF)? In the example, it looks like the latter, and if so, the name should not be "prob".

We should decide once and for all "normal" vs "gauss" and whether people's last names need to be capitalized. Some of the distributions will be people's last names. I like "normal" because it occupies a special place in the set of distributions and I think more libraries are calling it "normal" than "gauss" nowadays.

It shouldn't be assumed that a minimizer has an internal state. That's just how Minuit works; it doesn't have to work that way. To get all the parameter estimates that we need, a minimizer would only need:
  • A minimize function, which returns a parameter coordinate or raises an exception when it doesn't converge.
  • A derivative (gradient) and second derivative (hessian) function, evaluated at a given parameter coordinate, usually the result of minimize.
  • A root-finding function, which finds sets of parameter coordinates for the function equal to a given value under constraints. Minuit's MINOS is a special case: it finds the two parameter coordinates closest to the minimum point where the function is equal to the minimum point plus 1 or 0.5 under the constraint that all parameters are fixed at the minimum point except one parameter at a time. Minuit's CONTOUR generalizes this somewhat for two parameters. These are things the user generally wants, but the fitter should ask for a more basic building block from the minimizer.
Generality aside, the main point is that the minimizer shouldn't be expected to hold state in today's multiprocessor world. Even though the minimizer just told us the fitted coordinates, we should pass the fitted coordinates back to the minimizer when asking the next question: the second derivative or roots. Minuit can be wrapped to make that so.
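
A stateless contract along these lines could be as small as the following sketch (names hypothetical):

    class Minimizer:
        """Every query passes the coordinates back in; nothing is cached."""

        def minimize(self, fn, start):
            """Return the best-fit coordinates, or raise if not converged."""
            raise NotImplementedError

        def hessian(self, fn, at):
            """Matrix of second derivatives of fn, evaluated at `at`."""
            raise NotImplementedError

        def crossings(self, fn, at, level, index):
            """Coordinates where fn == level, varying only parameter `index`
            (MINOS is the special case level = minimum + 0.5 or 1.0)."""
            raise NotImplementedError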

-- Jim


 

Eduardo Rodrigues

Nov 12, 2018, 10:38:13 AM
to Jim Pivarski, alber...@cern.ch, jonas....@cern.ch, may...@jonas.eschle.com, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com

Hi Albert, cc Jonas as author,

 

Thanks a lot for the document. It will help a lot. I will look at it properly asap. For now let me just say that I find it a good idea to move the API away from being too coupled to MINUIT, even though MINUIT is so good. It would be nicer IMO to have MINUIT-specific calls done via arguments, so that all minimisers we may have/want will still have the same main calls. More when I actually read the doc properly.

 

Thanks,

Eduardo

 

| Eduardo Rodrigues | University of Cincinnati | LHCb Experiment @ CERN & DIANA/HEP | Tel. +41 (0) 22 76 72097 |

 



Christoph Deil

Nov 13, 2018, 3:06:44 AM
to Jim Pivarski, alber...@cern.ch, Eduardo Rodrigues, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com

On 12. Nov 2018, at 16:32, Jim Pivarski <jpiv...@gmail.com> wrote:

On Mon, Nov 12, 2018 at 5:13 AM Albert Puig <alber...@cern.ch> wrote:
before filling in the Google Doc, we've spent some time formalising the high-level API (meaning minimize, loss functions, etc., not model building) that we were discussing during the meeting. You can find it in 


This is basically what zfit implements (or plans to implement). Please let us know what you think.


What is the scope of the API you are designing?

Is the scope fitting 1D distributions to unbinned data?

I think for binned data (e.g. counts histograms with Poisson data, or other kind of binned data) the API would need to be different?
And for 2D or 3D or ND data and models, again the API would have to be different, no?

Even if the scope here for many people is 1D distributions, the optimiser and error estimator interface could still be shared with people like me that have 2D or 3D or ND data (in my case, e.g. 3D binned counts histograms with Poisson statistics, and a 3D model with a power-law spectrum and an elongated Gaussian spatial model on the sphere), because there the number of dimensions of the data doesn’t matter, only the number of parameters in the cost function and there you also already have several parameters of course.

I suggest stating the scope of the API you are describing at the top of the notebook.

Christoph



Christoph Deil

Nov 13, 2018, 3:41:39 AM
to Albert Puig, Jim Pivarski, Eduardo Rodrigues Figueiredo, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
Hi Albert,

In the PDF and Loss functions part of the API you have limits and range arguments.

These would have to be changed to support more than one dimension.

E.g. where you have fit_range=(-10, 10) for 1D, for 3D it could be fit_range = [(x_lo, x_hi), (y_lo, y_hi), (z_lo, z_hi)] or fit_range = ([x_lo, y_lo, z_lo], [x_hi, y_hi, z_hi]). At least in my applications we don't use such box limits: we have binned data and a boolean mask that specifies which bins should contribute to the cost function and which should be excluded. So the places where the model and loss function interact with the data are different, and it partly shows in the API you show in https://github.com/zfit/zfit-tutorials/blob/master/API.ipynb .

The Parameters and Minimizers part of the API is generic, doesn’t depend on the kind of model or data. This part I’m certainly very interested in.

For that part: could you please be a bit more specific in the API document about what kind of state the Parameter and PDF objects hold?
Why do you have a Parameter.init_value attribute?
Presumably the Minimizer will change model parameter values in place on the PDF / Parameter object when the fit runs?
And after running hesse(), the parameter values will be left in the state of whatever the last function evaluation was (usually not the best-fit values)?

FYI: we are also working on modelling / fitting for our application here, and are struggling a bit with the design decision of where to put the parameter error information. Currently we don't have parameter error information on the Parameter object, but on a Parameters object which holds a list of Parameter objects plus a covariance matrix (https://github.com/gammapy/gammapy/blob/master/gammapy/utils/fitting/parameter.py). And now we're trying to decide how to add asymmetric parameter errors.

One option we are considering is to follow Sherpa (see https://sherpa.readthedocs.io/en/latest/fit/index.html#estimating-errors) and not include or expose parameter error information via the Parameter class, but only via an extra class (called ErrorEstResults in Sherpa, similar to the current Parameters in Gammapy) that holds the parameter error information and offers an API to retrieve it. The resulting API isn’t as nice as my_model.my_parameter.error, where everything is exposed via the model (called PDF in your case) and parameter objects. But it has the advantage that the state of the model and parameter objects is limited to only values, i.e. those objects are more loosely coupled to the cost function and minimiser / error estimator objects.
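
(A sketch of that separation, with hypothetical names modelled on the Sherpa / Gammapy classes mentioned above:)

    import numpy as np

    class Parameter:
        """Holds only a value; no error state."""
        def __init__(self, name, value):
            self.name = name
            self.value = value

    class ErrorEstResults:
        """Error information lives outside the model/parameter objects."""
        def __init__(self, names, covariance):
            self.names = list(names)
            self.covariance = np.asarray(covariance)

        def error(self, name):
            i = self.names.index(name)
            return float(np.sqrt(self.covariance[i, i]))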

Christoph

On 13. Nov 2018, at 09:08, Albert Puig <alber...@cern.ch> wrote:

Dear Christoph,

in principle we want to be as general as possible. Why do you say the API should be changed to deal with N-D distributions or binned data? If we've missed something, we really need to change it :-)

Thanks,
Albert




Christoph Deil

unread,
Nov 13, 2018, 5:47:19 AM11/13/18
to Albert Puig, Jim Pivarski, Eduardo Rodrigues Figueiredo, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com


On 13. Nov 2018, at 10:20, Albert Puig <alber...@cern.ch> wrote:

Hi Christoph,

I see your point.

On Tue, Nov 13, 2018 at 9:41 AM 'Christoph Deil' via Scikit-HEP forum <scikit-h...@googlegroups.com> wrote:
Hi Albert,

In the PDF and Loss functions part of the API you have limits and range arguments.

These would have to be changed to support more than one dimension.

E.g. where you have fit_range=(-10, 10) for 1D, for 3D it could be fit_range = [(x_lo, x_hi), (y_lo, y_hi), (z_lo, z_hi)] or fit_range = ([x_lo, y_lo, z_lo], [x_hi, y_hi, z_hi]).

You're right, the API is unclear here. Ranges and limits should be multidimensional; it's just that the examples we put here are not (but in zfit multidimensional ranges are supported interchangeably).
 
At least in my applications, we don’t use such box limits, but we have binned data and a boolean mask that specifies which bins should contribute to the cost function and which should be excluded. So the places where the model and loss function interact with the data are different, and it partly shows in the API you show at https://github.com/zfit/zfit-tutorials/blob/master/API.ipynb .

This is a very interesting use case, clearly not included in the API. I have a question: how do you integrate the PDFs in this case (i.e., without limits)? Or do you just use unnormalized functions?

We mostly use unnormalised functions. Our models are flux models, which are multiplied with an effective area and an observation time, and then integrated over bins, usually simply by evaluating at the bin centre and multiplying by the bin width. In cases where this is not sufficient we oversample on sub-grids to get more precise approximations of the integrals. In very few cases (basically Gaussian only) there are analytical integrals available if you’re not in 1D. Sherpa has separate data classes for binned data (https://sherpa.readthedocs.io/en/latest/data/index.html#binned-data) and an API in model evaluation that supports models where analytical integrals are available (https://sherpa.readthedocs.io/en/latest/evaluation/examples.html#evaluating-a-2d-model-to-match-a-data2d-object).

There are also convolutions, e.g. an energy dispersion matrix and a point spread function. Basically the strategy is to “forward-fold” the model to counts space and then do a simple Poisson likelihood comparing predicted counts per bin with observed counts per bin.
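
(Schematically, with made-up shapes and a toy power-law flux; this is not the Gammapy code, just the chain described above:)

    import numpy as np

    def predicted_counts(params, e_ctr, e_width, aeff, t_obs, edisp):
        flux = params[0] * e_ctr ** (-params[1])      # toy flux model
        counts_true = flux * aeff * t_obs * e_width   # integral ~ centre * width
        return edisp.T @ counts_true                  # fold with energy dispersion

    def poisson_nll(params, observed, *fold_args):
        mu = predicted_counts(params, *fold_args)
        return np.sum(mu - observed * np.log(mu))     # up to a constant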

So for binned analysis we don’t need PDFs, nor to distinguish between the parameters of shapes that integrate to 1 and the extra amplitude parameters. I think for unbinned analysis one has to do it the way you do. Still, you could consider calling your class Model instead of PDF, if it can represent non-normalised models. E.g. a good use case to support would be fitting a histogram with a background with some parameters (e.g. polynomial) and a signal (e.g. Gaussian).

I’m not saying your design should accommodate these things. It’s a large task to design a model evaluation and fitting framework.
It might make sense to try and figure out the part about models, parameters and parameter error information and the cost function and minimiser interface first, because that is generic. For the data model and model evaluation it will probably be much harder to develop a solution that works for everyone.


 

The Parameters and Minimizers part of the API is generic and doesn’t depend on the kind of model or data. This part I’m certainly very interested in.

For that part: could you please be a bit more specific in the API document about what kind of state the Parameter and PDF objects hold?

The Parameter should just hold the value and the errors. The PDF just holds the parameters.
 
Why do you have a Parameter.init_value attribute?

Sometimes it's interesting to know what the initial value was, but I'm not sure this should stay, tbh.
 
Presumably the Minimizer will change model parameter values in-place on the PDF / Parameter object when the fit runs?

That is the current idea, yes, and the "minimizer state" is just a view of the PDF/parameters.
 
And after running hesse() the parameter values will be left in the state of whatever the last function evaluation there was (usually not the best-fit values)?


They should be left at the same values as before the call, I think. We have not been precise here, and I think you make an excellent point. What do people think about this?

I’m not sure.

Evaluating models and likelihood without changing model parameter attributes is probably possible, but then basically you’re working on a copy of the parameters somehow, and you might just work on a copy of the whole model? And in that case I guess it’s a matter of taste whether users should make copies of models where they like (model2 = model.copy()), or whether optimiser calls should make copies of models on input and leave inputs unmodified.

I guess if everyone is happy to assume that making model copies is cheap, then leaving inputs unmodified is the cleaner solution?
But even in that case, some users will be surprised when they fit a model, access its parameters afterwards, and find that they are left at the initial values, not the fit result.

That’s why I asked about the “init_value” attribute you mention in the Parameters section at https://github.com/zfit/zfit-tutorials/blob/master/API.ipynb
It’s really a key design question which state is stored where and who updates it when.
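
(The copy-on-input option, sketched; minimizer.minimize here is a hypothetical interface, not an existing one:)

    import copy

    def fit(model, data, minimizer):
        """Leave the caller's model untouched; all fit state goes to the result."""
        work = copy.deepcopy(model)            # assumes copying models is cheap
        return minimizer.minimize(work, data)  # result carries values and errors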


 
FYI: we are also working on modelling / fitting for our application here, and are struggling a bit with the design decision of where the parameter error information should live. Currently we don’t have parameter error information on the Parameter object, but on a Parameters object which holds a list of Parameter objects plus a covariance matrix (https://github.com/gammapy/gammapy/blob/master/gammapy/utils/fitting/parameter.py). And now we’re trying to decide how to add asymmetric parameter errors.


Oh, that is interesting! You make me realize right now the API doesn't contemplate covariance matrices outside the Minimizer, and maybe that is not a good idea, per Jim's comments. Let me think about this and get back to you…

I think the simplest solution would be to keep Parameter and Model limited to a “value”, and have all covariance and parameter error information separate.

If you look, that’s how it’s done in Sherpa (see the docs linked above).
The main drawback here is that the API is less user-friendly: one cannot say my_model.my_parameter.error; one always has to access parameter error information via a second object / API that’s separate from the Model.

The problem with adding error information on the model & parameter objects is when and how to set it (e.g. in a compound "model = background + signal”), but also how to support all use cases (e.g. 2-sigma errors, asymmetric errors; for MCMC Bayesian analysis people will want to have parameter sample arrays).

A compromise could be to have three numbers error / error_lo / error_hi on the Parameter object, and to have those set sometimes at the end of optimiser or error estimator calls. So the most common / limited parameter error info is available via the model, and more fancy information (like the covariance matrix or parameter traces) only elsewhere. Not sure if this is a good idea.
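
(The compromise in code, sketched:)

    class Parameter:
        """Value plus only the most common error info; fancier information
        (covariance matrix, MCMC traces) lives in a separate results object."""
        def __init__(self, name, value):
            self.name = name
            self.value = value
            self.error = None      # symmetric (HESSE-style) error, if computed
            self.error_lo = None   # asymmetric (MINOS-style) errors, set at the
            self.error_hi = None   # end of an error estimator call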


 
One option we are considering is to follow Sherpa (see https://sherpa.readthedocs.io/en/latest/fit/index.html#estimating-errors) and not include or expose parameter error information via the Parameter class, but only via an extra class (called ErrorEstResults in Sherpa, similar to the current Parameters in Gammapy) that holds the parameter error information and offers an API to retrieve it. The resulting API isn’t as nice as my_model.my_parameter.error, where everything is exposed via the model (called PDF in your case) and parameter objects. But it has the advantage that the state of the model and parameter objects is limited to only values, i.e. those objects are more loosely coupled to the cost function and minimiser / error estimator objects.


Thanks for the info, this is super useful. I also like this idea, and would split the MinimizerState into FitResults and FitErrors, avoiding the concept of keeping state. Let's see if we can come up with something nice on this front, basically using the Sherpa and Gammapy ideas :-)
 
Christoph


Thanks a lot again!
Albert
 


Hans Dembinski

unread,
Nov 13, 2018, 6:11:27 AM11/13/18
to Eduardo Rodrigues, Jim Pivarski, alber...@cern.ch, jonas....@cern.ch, may...@jonas.eschle.com, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com

> On 12. Nov 2018, at 16:38, Eduardo Rodrigues <eduardo....@cern.ch> wrote:
>
> Hi Albert, cc Jonas as author,
>
> Thanks a lot for the document. It will help a lot. I will look properly asap. For now let me just say that I find it a good idea to move the API away from being too coupled with MINUIT, even though MINUIT is so good. It would be nicer IMO to have MINUIT-specific calls done via arguments, so that all minimisers we may have/want will still have the same main calls. More when I actually read the doc properly.

I agree. MINUIT works pretty well, but there are so many good optimizers out there; I used libnlopt very successfully. SciPy also has a great collection of optimizers. The special thing about MINUIT is the computation of covariance matrices, contours, and asymmetric errors. This is hard to find in other packages, but this functionality in MINUIT could in principle be combined with any other minimizer.

Christoph once started a project to re-implement MINUIT in Python. Perhaps that project should be revived, with the goal shifted: not to reimplement the minimizer of MINUIT, for which there are other and better alternatives, but to reimplement the parts necessary to compute covariances, contours, and asymmetric errors.

Best regards,
Hans

Christoph Deil

unread,
Nov 13, 2018, 6:53:32 AM11/13/18
to Hans Dembinski, Eduardo Rodrigues, Jim Pivarski, alber...@cern.ch, jonas....@cern.ch, may...@jonas.eschle.com, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
I just thought about this; I don’t have the expertise or time to re-code MINUIT in Python / Numpy.

But yes, I am very much interested to have a Python package that includes error estimation without the C++ dependency and frozen API due to the long legacy that MINUIT or Sherpa have.

In previous discussions here contributing to and evolving lmfit was discussed as an option. Is someone investigating that route?

If that doesn’t work out for whatever reason (e.g. scope too large and different, or they also already have design choices that prevent a good solution), then I think we should start a new package.

For me, having it work with just Python & Numpy would be important, and if it can also work with TensorFlow that would be a plus, but it would have to be optional. Note that on the TensorFlow side there is https://www.tensorflow.org/probability/, which already contains optimisers, and presumably it will grow error estimators like HESSE / MINOS?

I’m not even sure an exact MINUIT clone is needed. I always thought that it’s the super-robust solution, but then there are simple cases where it doesn’t work (see https://github.com/scikit-hep/iminuit/issues/271), and when it fails, the user has few or no options to control the MIGRAD / HESSE / MINOS algorithms.

Christoph


Best regards,
Hans



Henry Fredrick Schreiner

unread,
Nov 13, 2018, 7:21:48 AM11/13/18
to Timothy David Evans, Hans Dembinski, Eduardo Rodrigues, Jim Pivarski, Albert Puig Navarro, Jonas Eschle, may...@jonas.eschle.com, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
Adding Tim because he might be interested; he works on AmpGen, which does a great job of handling complex components and incoherent/coherent sums with a simple symbolic interface in C++14. Also, I’m not sure TensorFlowAnalysis has been mentioned yet; it has been doing fits in TensorFlow to 3-body amplitudes (at least) for a couple of years now. Tim: there’s a gitter channel as well with a lot more discussion.

Jonas Eschle

unread,
Nov 13, 2018, 7:43:48 AM11/13/18
to Henry Fredrick Schreiner, Timothy David Evans, Hans Dembinski, Eduardo Rodrigues, Jim Pivarski, Albert Puig Navarro, may...@jonas.eschle.com, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil

Hi Henry, Christoph,

to comment briefly on that: we're trying to build a new, improved (more generalized) version of TensorFlowAnalysis using (alongside TensorFlow) a lot of what's available in TensorFlow Probability; it's called zfit and is still in pre-beta. We're also developing close to the discussion here/in the gitter channel. So we're currently exploring this direction.

If you have specific ideas/comments, let us know.

Cheers,
Jonas

On 13.11.18 13:21, Henry Fredrick Schreiner wrote:

Henry Fredrick Schreiner

unread,
Nov 13, 2018, 7:50:31 AM11/13/18
to Jonas Eschle, Timothy David Evans, Hans Dembinski, Eduardo Rodrigues, Jim Pivarski, Albert Puig Navarro, may...@jonas.eschle.com, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
Hi Jonas,

I just spoke with Eduardo and learned that zfit was informed by TensorFlowAnalysis, which is great. It shares quite a bit with AmpGen too, I expect, so Tim might have more input. AmpGen builds everything in static graphs in C++, then renders an output file in pure C or CUDA for the final PDFs, not that different from TensorFlow’s graphs.

Henry

Jim Pivarski

unread,
Nov 13, 2018, 10:56:55 AM11/13/18
to Henry Fredrick Schreiner, jonas....@cern.ch, timothy.d...@cern.ch, Hans Dembinski, Eduardo Rodrigues, alber...@cern.ch, may...@jonas.eschle.com, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
On Tue, Nov 13, 2018 at 2:08 AM Albert Puig <alber...@cern.ch> wrote:
On Tue, Nov 13, 2018 at 9:06 AM 'Christoph Deil' via Scikit-HEP forum <scikit-h...@googlegroups.com> wrote:
What is the scope of the API you are designing? 
 
in principle we want to be as general as possible. Why do you say the API should be changed to deal with N-D distributions or binned data? If we've missed something, we really need to change it :-)

Just a general comment from experience in designing specifications: defining the scope and explicitly leaving things out (naming them, as Arrow does here) is the most important thing the first draft of a specification does. "As general as possible" is literally unspecified.

I didn't comment on this before because it's hard to turn into a to-do action, but the zFit notebook is more like documentation of a library than a specification. It's very useful input to this discussion about what kinds of things we want a fitter to do, as is the LMFIT documentation, SciPy documentation, Hans's mock-up API, etc. They're all examples from a set, but a specification needs to be a definition of a set, saying what implementations must do and must not do.

You (everyone here) may come out of this with the conclusion that you don't really need a specification, but a library developed from within this community, and zFit may be that library. That would be preferred if you know from the outset that there's only going to be enough time and people to create one library that everyone uses, and everyone is willing to contribute to that one library. A specification would be preferred if you expect a lot of small libraries that each do one focused thing but have to share components.

The first question is: "Are we all sure we want a specification? Does that fit our problem? Are we really going to be sharing fit models? Why?"

The second (if "yes") is: "What needs to be specified and what can be left to the implementations? What's out of scope (keeping in mind that a specification is more likely to be successful if it's not too broad)?"
Getting back to the origin of this discussion, there was a lot of initial interest in sharing distributions for model building: their names and parameters. Specification of the objective function itself was mentioned at the beginning but only taken up in earnest later. Specifying the entire library— how physicists configure it and interact with it— is overkill because it implies that you'll have more than one fitting library that behaves the same way. That might be desirable across languages (like the failed AIDA project), but why would anyone want two Python fitting libraries with exactly the same interface?

It seems to me that you want a distributions and models specification for fitting libraries like zFit to use. Maybe you want such a specification to also describe full objective functions (with binned vs unbinned), since those are only a little broader than the models themselves and can then fully encapsulate a fitting problem. But the API of how you do a fit is exactly what a user would want to move from one library to another for, right? If so, then it shouldn't be part of the specification.

-- Jim

Christoph Deil

unread,
Nov 13, 2018, 2:04:15 PM11/13/18
to Jim Pivarski, alber...@cern.ch, Eduardo Rodrigues, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
On 12. Nov 2018, at 16:32, Jim Pivarski <jpiv...@gmail.com> wrote:


Jim or anyone - Do you have an example showing how to do this?

Is it possible to use MINUIT (via iminuit?) to run “hesse” and “minos” and “profile” and “contour” without having to hold on to the iminuit.Minuit object in between the calls?

I’m prototyping a fitting backend where I try to wrap existing methods in Minuit, Sherpa. For now, I hold on to the iminuit.Minuit object in between computations and close my eyes, hoping it will not get out of sync with state on the other objects that I have. Looks like this: https://github.com/gammapy/gammapy/blob/d24df5266d85d4b6b2dd6fe3987095a13488142f/gammapy/utils/fitting/fit.py#L213 


-- Jim


 


Jim Pivarski

unread,
Nov 13, 2018, 3:21:10 PM11/13/18
to Christoph Deil, alber...@cern.ch, Eduardo Rodrigues, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com
On Tue, Nov 13, 2018 at 1:04 PM Christoph Deil <deil.ch...@googlemail.com> wrote:
Is it possible to use MINUIT (via iminuit?) to run “hesse” and “minos” and “profile” and “contour” without having to hold on to the iminuit.Minuit object in between the calls?

I’m prototyping a fitting backend where I try to wrap existing methods in Minuit, Sherpa. For now, I hold on to the iminuit.Minuit object in between computations and close my eyes, hoping it will not get out of sync with state on the other objects that I have. Looks like this: https://github.com/gammapy/gammapy/blob/d24df5266d85d4b6b2dd6fe3987095a13488142f/gammapy/utils/fitting/fit.py#L213

I don't remember, and it's been about 10 years since I've looked at the Minuit API. Of course, it probably hasn't changed at all in that time; just saying.

I think it might be possible to create a new Minuit state object and assign parameters and errors to it: errors are interpreted as step sizes before minimization. The question I don't know the answer to is whether HESSE and MINOS would happily do their computations, starting from that point. It might not be possible to initialize an object with a non-diagonal covariance matrix, since errors before minimization are interpreted as step sizes.

Beyond that, I don't know if there are any global settings that are defined outside of this state object, which would render any process that changes those variables thread-unsafe, even if threads use local state objects.
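
(The experiment itself is short; a sketch against a recent iminuit-style API, where whether the hesse()/minos() calls accept a fresh object seeded at the minimum is exactly the open question above:)

    from iminuit import Minuit

    def cost(a, b):                   # stand-in objective, minimum at (1, 2)
        return (a - 1) ** 2 + (b - 2) ** 2

    m = Minuit(cost, a=1.0, b=2.0)    # fresh state, seeded at the known minimum
    m.errordef = Minuit.LEAST_SQUARES
    m.errors["a"] = 0.1               # pre-fit errors act as step sizes
    m.errors["b"] = 0.1
    m.hesse()                         # does this work without migrad()?
    m.minos()                         # ...and does MINOS accept this start?
    print(m.params)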

Hans Dembinski

unread,
Nov 15, 2018, 3:55:45 AM11/15/18
to Jim Pivarski, Christoph Deil, alber...@cern.ch, Eduardo Rodrigues, Matthieu Marinangeli, Henry Fredrick Schreiner, scikit-h...@googlegroups.com


> On 13. Nov 2018, at 21:20, Jim Pivarski <jpiv...@gmail.com> wrote:
>
> On Tue, Nov 13, 2018 at 1:04 PM Christoph Deil <deil.ch...@googlemail.com> wrote:
> Is it possible to use MINUIT (via iminuit?) to run “hesse” and “minos” and “profile” and “contour” without having to hold on to the iminuit.Minuit object in between the calls?
>
> I’m prototyping a fitting backend where I try to wrap existing methods in Minuit, Sherpa. For now, I hold on to the iminuit.Minuit object in between computations and close my eyes, hoping it will not get out of sync with state on the other objects that I have. Looks like this: https://github.com/gammapy/gammapy/blob/d24df5266d85d4b6b2dd6fe3987095a13488142f/gammapy/utils/fitting/fit.py#L213
>
> I don't remember, and it's been about 10 years since I've looked at the Minuit API. Of course, it probably hasn't changed at all in that time; just saying.
>
> I think it might be possible to create a new Minuit state object and assign parameters and errors to it: errors are interpreted as step sizes before minimization. The question I don't know the answer to is whether HESSE and MINOS would happily do their computations, starting from that point. It might not be possible to initialize an object with a non-diagonal covariance matrix, since errors before minimization are interpreted as step sizes.

I haven't tested it, but I also think that this should work.

HESSE does not need to know anything about the minimum. It just computes the second derivative and inverts it.

MINOS does a scan starting from a point along a parameter axis. It probably uses the diagonal terms of the covariance matrix to find a good initial step size for this scan, but it does not need to know about the non-diagonal terms of the covariance matrix.

CONTOUR may be the only one that fully benefits from the state data accumulated by MINUIT.
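
(For unbounded parameters the HESSE core described above really is small; a sketch, using numdifftools, which is an assumption, for the second derivatives:)

    import numpy as np
    import numdifftools as nd   # assumption: numdifftools for numerical Hessians

    def hesse(nll, x_min):
        """Covariance = inverse Hessian of the negative log-likelihood at the
        minimum; symmetric errors are the diagonal square roots."""
        cov = np.linalg.inv(nd.Hessian(nll)(np.asarray(x_min, dtype=float)))
        return cov, np.sqrt(np.diag(cov))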


Hans Dembinski

unread,
Nov 15, 2018, 4:15:06 AM11/15/18
to Jim Pivarski, Henry Fredrick Schreiner, jonas....@cern.ch, timothy.d...@cern.ch, Eduardo Rodrigues, alber...@cern.ch, may...@jonas.eschle.com, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
Dear Jim,

> On 13. Nov 2018, at 16:56, Jim Pivarski <jpiv...@gmail.com> wrote:
>
> Just a general comment from experience in designing specifications: defining the scope and explicitly leaving things out (naming them, as Arrow does here) is the most important thing the first draft of a specification does. "As general as possible" is literally unspecified.
>
> I didn't comment on this before because it's hard to turn into a to-do action, but the zFit notebook is more like documentation of a library than specification. It's very useful input to this discussion about what kinds of things we want a fitter to do, as is the LMFIT documentation, SciPy documentation, Hans's mock-up API, etc. They're all examples from a set, but a specification needs to be a definition of a set, saying what implementations must do and must not do.
>
> You (everyone here) may come out of this with the conclusion that you don't really need a specification, but a library developed from within this community, and zFit may be that library. That would be preferred if you know from the outset that there's only going to be enough time and people to create one library that everyone uses, and everyone is willing to contribute to that one library. A specification would be preferred if you expect a lot of small libraries that each do one focused thing but have to share components.
>
> The first question is: "Are we all sure we want a specification? Does that fit our problem? Are we really going to be sharing fit models? Why?"
>
> The second (if "yes") is: "What needs to be specified and what can be left to the implementations? What's out of scope (keeping in mind that a specification is more likely to be successful if it's not too broad)?"
> Getting back to the origin of this discussion, there was a lot of initial interest in sharing distributions for model building: their names and parameters. Specifications for the objective function itself was mentioned at the beginning but was only taken up in earnest later. Specifying the entire library— how physicists configure it and interact with it— is overkill because it implies that you'll have more than one fitting library that behaves the same way. That might be desirable across languages (like the failed AIDA project), but why would anyone want two Python fitting libraries with exactly the same interface?
>
> It seems to me that you want a distributions and models specification for fitting libraries like zFit to use. Maybe you want such a specification to also describe full objective functions (with binned vs unbinned), since those are only a little more broad than the models themselves and can then fully encapsulate a fitting problem. But the API of how you do a fit is exactly what a user would want to move from one library to another for, right? If so, then it shouldn't be part of the specification.

These are important questions. I don't know how to design a spec and I am not sure whether this is the right approach for the problem, as explained below.

In my understanding, we would like to have something like RooFit but without the ROOT part. A library that makes it easy to design models that are fitted to data. The library should offer a high-level language to describe the model and the fitted parameters, but hide the technical details of how to do the actual fit, for the most part by choosing best practices for any given situation. Let's keep calling it cool_api for now, just to have a name.

cool_api would use other libraries as a backend. Here, your spec would be useful, which would allow this library to communicate with backends in a unified way. However, I personally find it challenging to come up with a spec without a working prototype. I am good at abstracting from the specific, but without the specific I feel lost. I would first implement a prototype of the high-level library and then develop a spec based on this for backends, but perhaps that's not even necessary/useful.

Potential backends basically fall into two groups. The first group are the fine people who are already part of this community. They already work on components that can serve as backends for this high-level library, for example GooFit and iminuit, and have a natural interest in making these work with cool_api. We can just adapt our projects or write backend adaptors for the new library. The other group consists of the big external packages like TensorFlow, which are unlikely to consider our spec, because we are a niche for them. So even if we write a spec, they will ignore it. Since we are small and they are big, the only solution is again to write backend adaptors in cool_api for those packages.
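
(A backend adaptor for cool_api could be as thin as this sketch around scipy; the class name is hypothetical:)

    from scipy.optimize import minimize as scipy_minimize

    class ScipyBackend:
        """Hypothetical cool_api adaptor: we write it, scipy never needs to know."""
        def minimize(self, cost, x0):
            res = scipy_minimize(cost, x0, method="BFGS")
            if not res.success:
                raise RuntimeError(res.message)
            # BFGS also returns an inverse-Hessian approximation, a free
            # (if rough) covariance estimate for likelihood-style costs
            return res.x, res.hess_inv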

What do you think?

Best regards,
Hans


Jim Pivarski

unread,
Nov 15, 2018, 8:55:13 AM11/15/18
to Hans Dembinski, Henry Fredrick Schreiner, Jonas Eschle, timothy.d...@cern.ch, Eduardo Rodrigues, alber...@cern.ch, may...@jonas.eschle.com, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
You made a good argument for specifying the optimizer backends as well. As I hear it, there's a desire to standardize:

   * The model building, definitely probability distributions and maybe aspects of the fit function, such as binned/unbinned, regularization terms, etc. The goal of this would be a unified way for physicists to express a fit problem, what it is they want to have optimized.

   * The fitting backend, so that Minuit and TensorFlow look the same to the fitting library.

These are two scopes, and neither one of them is a whole fitting library. The library (or libraries) can develop the normal way, without specification.

The first of these is user-facing and a lot of thought will have to go into making it human-friendly. Much of the early discussion was about this. The best approaches here introduce as few names and concepts as possible, e.g. + is better than SumPDF, insofar as the latter can be avoided. It should make easy things easy and all things possible, as the saying goes.
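
(The '+' sugar costs one dunder method, sketched:)

    class PDF:
        def __add__(self, other):
            return SumPDF(self, other)   # users write pdf1 + pdf2, never SumPDF

    class SumPDF(PDF):
        def __init__(self, *components):
            self.components = components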

There's another option: we could do the easier task of creating a non-user-facing distribution standard and then write our favorite human interfaces over that in different libraries. The problem is that it's hard to get a group of people to agree on what "easy" is. Different libraries may promote different visions of what the ideal human interface is, while still using the same distribution library and model builders via some export/import commands.

The second thing to standardize, the optimization library backends, is not user-facing. It can be baroque and namey, where the main point is to avoid getting stuck, unable to do something because of the structure of the interface.

You're right that TensorFlow would never listen to us— we'd have to provide the adaptor. We should keep that in mind when we design it: that work falls on our shoulders.

If it's easiest to keep coming up with examples, rather than start specifying, then we're in the early brainstorming stage. Specifying is a bit like fitting itself, in which a set of samples gets generalized as a function. If we need more samples, then let's keep doing that.

But I think at this point that we can focus those examples on two main problems: describing models/optimization problems and describing fitting backends.

Unless I've missed anything?

Jim

Brian Pollack

unread,
Nov 15, 2018, 9:57:30 AM11/15/18
to Jim Pivarski, Hans Dembinski, henry.fredri...@cern.ch, jonas....@cern.ch, timothy.d...@cern.ch, Eduardo Rodrigues, alber...@cern.ch, may...@jonas.eschle.com, Matthieu Marinangeli, scikit-h...@googlegroups.com, deil.ch...@googlemail.com
Hi Jim, Hans, All,

I just want to point out that `lmfit` provides what I think is a good example of separating the user-facing API from the fitting backend.  All model building is done in a manner that is completely agnostic to the fitting backend/optimization algorithm that will eventually be used.  Once a model is generated by the user, either through using a pre-defined model (e.g. `GaussianModel(...)`, etc) or a user-defined parameterization, the user can specify an optimization algorithm from a list of available algorithms and then calls `Model.fit`.  The devs for `lmfit` are responsible for coding up any necessary translations needed for each individual fitting backend, which makes it very simple for the end user to swap out algorithms without any modifications to the model.  I think this is an excellent way to design a fitting package, and makes adoption of the library as reasonably simple as possible for the end users.  I promise the `lmfit` devs are not paying me to plug their product; I just really enjoy using it :)
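
(For concreteness, the pattern described above looks roughly like this with lmfit's real API:)

    import numpy as np
    from lmfit.models import GaussianModel

    x = np.linspace(-5, 5, 201)
    y = 3.0 * np.exp(-0.5 * x**2) + np.random.normal(0, 0.1, x.size)

    model = GaussianModel()
    params = model.guess(y, x=x)
    # the same model fits with any available backend via `method`
    result = model.fit(y, params, x=x, method="leastsq")
    result_alt = model.fit(y, params, x=x, method="nelder")
    print(result.fit_report())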

-Brian



--
Brian Pollack, Ph.D.
Experimental Particle Physics
Northwestern University

Brian Pollack

unread,
Nov 19, 2018, 1:45:27 PM11/19/18
to alber...@cern.ch, Jim Pivarski, Hans Dembinski, Henry Schreiner, Jonas Eschle, timothy.d...@cern.ch, Eduardo Rodrigues, Jonas Eschle, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
Hi All,
While slightly off-topic, this post was made recently in the CMS hypernews regarding TMinuit, which might also affect the standalone version:

*** Discussion title: Statistical Tools for Physics Analysis

Dear Colleagues:

I was checking the statistics questionnaire for SMP-18-012. One of
the authors of that analysis (Josh Bendavid) claims that he had to use
TensorFlow instead of Minuit/RooFit because of problems with gradient
calculations. See, in particular, slide 24 of

(link removed, internal talk for CMS)

This prompted me to look into gradient calculations in TMinuit:
https://root.cern.ch/doc/master/TMinuit_8cxx_source.html#l02187

I am afraid that whoever wrote that code had no idea about
the issues involved in calculating derivatives on a computer.
While the formula used there is standard,

df/dx = (f(x+h) - f(x-h))/(2 h),

there are two important problems:

1) Use of a wrong power of machine precision in the determination
   of h. h should be O(eps^(1/3)), it looks like it is instead
   O(eps^(1/4)).

2) Not using an exactly machine-representable number for h.

If you are curious to know more, you can find a detailed
discussion of these issues in the "Numerical Derivatives"
section of the "Numerical Recipes" book by Press et al.

While problem 1) can be hackishly addressed by changing the effective
machine precision with a Minuit command, problem 2) can only be fixed
by modifying the code. In combination, these problems can easily
result in dumbing down the precision of gradient calculation by
several orders of magnitude.

Best regards,
Igor Volobouev
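
(For reference, the textbook remedy for both points fits in a few lines; a sketch, not the TMinuit code:)

    import numpy as np

    def central_diff(f, x):
        eps = np.finfo(float).eps
        h = eps ** (1.0 / 3.0) * max(abs(x), 1.0)  # point 1: O(eps^(1/3)) scaling
        temp = x + h
        h = temp - x                               # point 2: exactly representable h
        return (f(x + h) - f(x - h)) / (2.0 * h)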

On Thu, Nov 15, 2018 at 10:10 AM Albert Puig <alber...@cern.ch> wrote:
Hi Brian,

I wholeheartedly agree with you. The interface to access any minimization/optimization algorithm should be the same.

In the "API" proposal that was sent around this fact was encoded in a loss function that could be used by any minimizer (so, a couple more steps than in lmfit), but shortcuts to have an interface as lmfit seem to be very desirable. High level "simple" fitting functions still cover 90% of uses cases, so it's important these are easy and transparent for the users.

Albert 



Hans Dembinski

unread,
Nov 20, 2018, 6:36:42 AM11/20/18
to Brian Pollack, alber...@cern.ch, Jim Pivarski, Henry Schreiner, Jonas Eschle, timothy.d...@cern.ch, Eduardo Rodrigues, Jonas Eschle, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
Hi Brian,

thanks for sharing this. I wouldn't be surprised if Igor is right and the gradient calculation could be improved. I think when you really study the MINUIT source code, as I did, you will find many things that should worry you. For this and other reasons, I think that Minuit is probably not the best minimizer in the world. It is, however, one of very few packages out there that can compute covariance matrices and confidence intervals.

That's why I think someone should extract the algorithms that MINUIT uses to compute covariances and confidence intervals and put those into a separate package that can be used with an arbitrary minimizer.

Best regards,
Hans

Jim Pivarski

unread,
Nov 20, 2018, 7:27:39 AM11/20/18
to Hans Dembinski, Brian Pollack, alber...@cern.ch, Henry Schreiner, Jonas Eschle, timothy.d...@cern.ch, Eduardo Rodrigues, Jonas Eschle, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
I've long thought that Minuit as we know it could be built out of pieces from SciPy: minimizers, naturally, but also the numerical derivatives for HESSE and the root-finders for MINOS. Am I missing anything? Is there some part that's essential to MINOS that can't be done with a SciPy root-finder?

It would have to be one of the multidimensional root-finders, since it needs to minimize all dimensions except the one in which it's searching for roots.

And if it's just that feature that's missing, what about changing the objective function so that the desired roots (the +1 and –1 sigma points) are local minima in an objective function with an extra term? Then an ordinary minimizer would be a MINOS-like root-finder.
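
(That trick in a few lines; profile_nll is a hypothetical profiled objective, and this is a sketch rather than a robust implementation:)

    from scipy.optimize import minimize_scalar

    def minos_like(profile_nll, x_hat, f_min, step, up=0.5):
        # roots of profile_nll(x) - (f_min + up) become minima of its square
        g = lambda x: (profile_nll(x) - (f_min + up)) ** 2
        hi = minimize_scalar(g, bracket=(x_hat, x_hat + step)).x
        lo = minimize_scalar(g, bracket=(x_hat - step, x_hat)).x
        return lo - x_hat, hi - x_hat   # asymmetric -1/+1 sigma offsets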

Jim



Hans Dembinski

unread,
Nov 20, 2018, 11:15:15 AM11/20/18
to Jim Pivarski, Brian Pollack, Albert Puig, Henry Schreiner, Jonas Eschle, timothy.d...@cern.ch, Eduardo Rodrigues, Jonas Eschle, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
Dear Jim,

> On 20. Nov 2018, at 13:27, Jim Pivarski <jpiv...@gmail.com> wrote:
>
> I've long thought that Minuit as we know it could be built out of pieces from SciPy: minimizers, naturally, but also the numerical derivatives for HESSE and the root-finders for MINOS. Am I missing anything? It's there some part that's essential to MINOS that can't be done with a SciPy root-finder?
>
> It would have to be one of the multidimensional root-finders, since it needs to minimize all dimensions except the one in which it's searching for roots.
>
> And if it's just that feature that's missing, what about changing the objective function so that the desired roots (the +1 and –1 sigma points) are local minima in an objective function with an extra term? Then an ordinary minimizer would be a MINOS-like root-finder.

Christoph discovered this document,

https://github.com/sherpa/sherpa/files/2581603/algs.pdf

from Mark Fischler, who studied the HESSE algorithm. The workings are quite complicated, mostly because one has to consider bounded parameters and optimal step sizes for the various derivatives. You will easily find a software package that computes first and second derivatives for unbounded parameters (e.g. in scipy or the GSL), but I don't know of an algorithm that computes the full Hessian effectively for bounded parameters.

However, implementing MINOS may actually be as simple as you describe.

Best regards,
Hans

Brian Pollack

unread,
Nov 20, 2018, 11:42:34 AM11/20/18
to Hans Dembinski, Jim Pivarski, alber...@cern.ch, Henry Schreiner, Jonas Eschle, timothy.d...@cern.ch, Eduardo Rodrigues, Jonas Eschle, Matthieu Marinangeli, scikit-h...@googlegroups.com, Christoph Deil
Hi Hans, Jim,

I want to also point out that "lmfit" implements bounds directly from the formulation used in MINUIT (https://lmfit.github.io/lmfit-py/bounds.html).  This is to assist with calculating derivatives near the bounds, as Hans has pointed out.  Additionally, "lmfit" also includes confidence interval calculation and calculates the covariances of the free parameters by default when performing a fit.  I have not tested the confidence interval implementation myself, but I've skimmed through the documentation (https://lmfit.github.io/lmfit-py/confidence.html).
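
(Both pieces, sketched with lmfit's real API; the bound transform lmfit documents is MINUIT's sine mapping, x_ext = min + (sin(x_int) + 1)/2 * (max - min):)

    import numpy as np
    import lmfit

    def resid(params, x, y):
        return y - (params["a"] * x + params["b"])

    x = np.linspace(0, 1, 50)
    y = 2.0 * x + 0.5 + np.random.normal(0, 0.05, x.size)

    params = lmfit.Parameters()
    params.add("a", value=1.0, min=0.0, max=10.0)  # bounded parameter
    params.add("b", value=0.0)

    mini = lmfit.Minimizer(resid, params, fcn_args=(x, y))
    result = mini.minimize()                # covariance computed by default
    ci = lmfit.conf_interval(mini, result)  # profile-based confidence intervals
    print(lmfit.ci_report(ci))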

Cheers,
Brian

Jonas Eschle

unread,
Sep 11, 2019, 6:44:23 AM9/11/19
to scikit-h...@googlegroups.com
Hi all,

digging out this one-year-old exchange about fitting tool APIs and
interoperability to give an update on it: we have implemented and
progressed a lot of what was discussed here. Today at 17h there will be
a PyHEP meeting and discussion about Python fitting tools, including zfit:

https://indico.cern.ch/event/834210/

zfit is an ongoing WIP and any input is still very welcome!

Cheers,
Jonas

