Hello everyone!

I was wondering if there was interest in having an interactive front-end for some of lmfit's capabilities.
I'm new to lmfit, but need some specific functionality for my current project, and don't know of any good solutions already in the Python ecosystem.
I understand that this would be a very large project in terms of maintenance & (optional) dependencies, so I completely understand (and rather expect) that this is out of scope. However, if there is interest, I'm happy to develop this either as part of lmfit or some kind of extension rather than as an unrelated module.
A little about me - I'm interested in these capabilities for my current work in x-ray detector development. Matt, I don't know if you recall specific detectors on the APS beamlines, but I work for the group behind the MM-PAD detectors (Cornell, Sol Gruner). They're high dynamic range, high frame rate, low pixel count units with Si (and soon CdTe!) sensors.
The functionality I'm looking to write is basically:
1) Interactive selection of initial parameters. Rather than laboriously reading every coordinate off a graph, just drag around and resize the model components
2) An interactive plot of both the fit and the residuals
3) An interactive way to modify the composition of the model
I've got a matplotlib prototype of all of this built off of HyperSpy's fitting capabilities, and it has immensely improved my work "quality of life". However, HyperSpy is extremely heavy for what I need and is GPLv3-licensed. Additionally, I'm trying to migrate off matplotlib and towards the Bokeh/pyviz ecosystem.
A little more on my use case: I usually know the form of what I'm trying to fit and have good initial guesses, but have a lot of free parameters. For example, I might have 5 Gaussian peaks and a linear background, which is 17 free parameters. This sort of tool makes it much, much easier for me to get accurate results quickly. Plus, it's always ground my gears a bit that (to my knowledge) no one in the Python community has built an open-source & improved version of MATLAB's curve fitting toolbox, so there's some personal motivation in there for me as well.
Thanks a lot for your time & consideration! Looking forward to hearing back from you guys either way. Great work so far, been loving it!
--
You received this message because you are subscribed to a topic in the Google Groups "lmfit-py" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lmfit-py/oFv_6Y96qXc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lmfit-py+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lmfit-py/CA%2B7ESbp8UdSjWLqzEO1cCin5MedCyR3tKyMK%2BjMOvcqZb%3D2DJw%40mail.gmail.com.
Hi Matt,

Great to hear from you. A quick note: the data I'm interested in fitting is just 1D, so I'm looking for a general-purpose 1D fitting app.
As for my project, that's just it: image stacks with metadata added transparently.
While this may seem mundane or trivial, it has only recently become possible thanks to packages like pymeasure, and has become more important due to the increased complexity of next-gen PADs with adaptive gain, 10 kHz framing, and so on. I also think that the knock-on effects will be large - speaking for myself at least, it's transformed my workflow from data collection-centric to analysis-centric.
PyX also enables automation, letting me acquire meaningful 5- or 6-dimensional datasets very quickly, but then I somehow have to interpret that data. Enter my interest in interactive regression! I want to extract some regressed values at every point (10s to 1000s) of my dataset. But then I need a way to quickly verify that all regressions are sufficiently well-conditioned, and to fix those that are not. I think this is where some degree of graphical interactivity is called for. If lmfit had an interface that allowed me to quickly and interactively show the fit, change the regression parameters, and re-run the fit, I could use this interface to manually fix 10-100 of the worst fits without too much pain.
Interactivity is also something I think may be able to help crack the eventual challenge of visualizing the entire dataset. This is what I call "sparse, medium-D data" - say from 4-8 data points per dimension and 4-8 dimensions - and I want to use interactive regression tools as a small-scale test of these ideas.
You may be surprised to know that I am also not a fan of Jupyter & broadly skeptical of web interfaces as well. After all, I grew up with matplotlib and never really transitioned away. However, I don't want to write a web app: I just want the browser as my front-end, controlled interactively by a terminal. One reason for this is portability: the browser makes cross-platform remote work (highly relevant right now!) far more practical.

But a bigger driver for me is that I'm looking for a lot of interactivity, and despite my preference for matplotlib I think that within the past 6-12 months the Bokeh ecosystem has begun to outstrip matplotlib in this regard. It's not that one couldn't do it in matplotlib, but 1) there'd probably be backend-specific code and 2) I think the high-level API offered by Panel and holoviews is unmatched, and upcoming projects like hvPlot and Lumen will only further this. In principle, and I think in practice too, no custom JS will be needed; Bokeh manages both the JS and the client/server connection all on its own.
I rather like plotly as well, but it seems to me that I can put plotly inside Bokeh (via Panel) a lot easier than I can do the reverse. But for this particular application, Bokeh already has built-in support for draggable points & shapes, whereas plotly doesn't and apparently won't (and this demo JS hack no longer works for me on Chrome/Firefox).
I understand if this isn't a direction you want to pursue for lmfit, but, if you're curious about the recent progress on Bokeh & co, such an add-on might serve as an interesting risk-free demo/way to see what it is or isn't capable of!
Hello everyone!

Apologies, I got briefly pulled off to another project. However, I much appreciate everybody's thoughts and I'm very glad there's interest. I've got a relatively small Bokeh/Panel prototype put together. You can only fit one set of data, but I think it's illustrative (details below).

William - Here are Bokeh's graph-editing tools, if that's of interest.

Evan - I also like pyqtgraph, though I've not tried Veusz. I think for a desktop app this would be a great choice, but I've had some issues with Qt in the past - remote access, the low-level interface, and integrating other technologies with it.

Ray - I did not know about NexPy's fitting interface! That's very close to the base of what I was imagining, and I've replicated part of its interface. I'll touch base with Jacob and see if he has more thoughts.

Shane - Very glad to hear that someone who's already had to work around this problem thinks this is a viable solution! 1e7 fits is certainly a big number. And I agree that lmfit should preserve the option to ship headless.

Matt - Apologies for any confusion with my earlier remarks. I hope my description below will clear things up, and thanks for putting together the wiki. Also:

> Is this the sort of thing you would want? Use such a GUI app for interactive and/or problematic cases but be able to save the Model this interactive fit generates to then be able to "apply to 10,000 similar datasets"?

Yes, that's pretty much exactly what I'm looking for.
> Are draggable data points desirable?

See below - not dragging the data points, but dragging "handle points" for model components. Though now that you mention it, maybe that's a good strategy for exploring the impact of outliers on a regression!
> Personally, I would probably suggest starting with 1-D or simple 2-D fitting as a standalone unit ("lmfit GUI - fityk replacement") and think about how to have that as part of the analysis workflow along with domain-specific data reduction.

I think we're on the same page. As for the analysis workflow, my approach here has been to make regression interactive & easy: then you simply re-run the analysis and regression top-to-bottom.
However, it would be nice to have changes in pre-processing reflected live in your regression window. Doing this manually (`igs.data = new_data`) is certainly possible, but you've got me thinking... doing this in a truly live/transparent way may be possible in an IPython context.
--- About my prototype ---

My code's not well documented & I'd like to put together an illustrative demo before I show everyone. However, I think my description below will help add some specifics to discuss.

The prototype I wrote extends Model by adding an `interactive_guess` method. It works like `guess`, but returns an `InteractiveGuessSession` object instead of a `ModelResult`. `InteractiveGuessSession` is conceptually similar to `ModelResult`, but it's also attached to a front-end. For example, it has a `params` attribute to get the current parameters displayed by the front-end.

Here's a picture of the front-end in my browser (aesthetics very much in progress). This is linked to a command-line session.
You can edit parameters in one of three ways:
1) Programmatically, via `igs.params['g0_amplitude'].value = 1`
2) Interactively, via the table on the right
3) Interactively, by dragging the red points on the graph or using the scroll wheel

This works with CompositeModels, and all of it is synced together with the front-end. FYI, this is only ~300 lines of code.
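A minimal stdlib sketch (the class and names are hypothetical, not my actual prototype code) of how all three edit paths can stay in sync: every path ultimately sets `.value`, and a property setter notifies any registered front-end callbacks.

```python
class WatchedParam:
    """Parameter-like object that notifies listeners when its value changes."""

    def __init__(self, name, value=0.0):
        self.name = name
        self._value = value
        self.callbacks = []  # e.g. one callback that redraws the Bokeh glyphs

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        self._value = new
        for cb in self.callbacks:
            cb(self.name, new)

# All three edit paths (command line, table widget, drag handler) funnel here:
updates = []
p = WatchedParam('g0_amplitude')
p.callbacks.append(lambda name, val: updates.append((name, val)))
p.value = 1.0   # e.g. igs.params['g0_amplitude'].value = 1
p.value = 2.5   # e.g. a table edit or a drag event setting the same attribute
print(updates)  # [('g0_amplitude', 1.0), ('g0_amplitude', 2.5)]
```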
To make a model component interactive, it must implement two functions:
1) a map from its Parameters to a polygon shape
2) a map from a polygon shape to its Parameters
It may optionally implement:
3) a way of updating its parameters when the scroll wheel is scrolled

If initial parameters are not given to `interactive_guess`, the component must also support `guess`.

Current holes include:
- Changing models is somewhat awkward. It's patched in via a `set_model` method on `InteractiveGuessSession`, but you essentially have to rebuild everything. You also can't change the data. However, if you want to change that, is having a method on Model even a logical entry point for this function?
- The interactivity relies on callbacks added to the Parameter objects. This means that if you replace the Parameter objects (for example via `Parameters.update`), the interactivity breaks; you have to modify the attributes of the Parameter objects without replacing them.
- Because JSON doesn't support infinities, the min/max bounds are clipped to ±1e308, which just looks ugly.
- Right now you have to select the model component you want to drag via a dropdown. It'd be nice if you could just double-click, or if all of the model handles could be shown together.
- A bug in Bokeh means you have to click 4 times after each drag before you can drag again.

As for dependency management, I think the easiest option is to attempt to import the requisite libraries at import time:
- If it fails, issue a warning (or a log message)
- If it works, dynamically replace the class definitions with the interactive versions, or hack in some mixins by modifying `__bases__`

An alternative that I personally don't prefer:
- Attempt to dynamically load the requisite libraries only when an interactive function is called (since I also modify Parameters, Parameter, and the pre-built models). However, this requires scattering the dependency checks around the source code, and users will see functions that they expect to work but that might fail anyway.

Please let me know if anyone has thoughts on my approach here. I'll try to send out a nice usable demo within the next few days.
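For the first (preferred) option, the import-time guard might look something like this sketch (the module names are real; the function name is hypothetical):

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_interactive_backend():
    """Try to import the optional GUI dependencies.

    Returns the (bokeh, panel) modules if available, else None after
    logging a message, so headless installs keep working untouched.
    """
    try:
        bokeh = importlib.import_module("bokeh")
        panel = importlib.import_module("panel")
    except ImportError as exc:
        logger.info("Interactive fitting disabled: %s", exc)
        return None
    # Here one would swap in the interactive subclasses / mixins,
    # e.g. by rebinding class attributes or modifying __bases__.
    return bokeh, panel


backend = load_interactive_backend()
print("interactive" if backend else "headless")
```

The point of the guard is that a missing Bokeh/Panel degrades to a log line rather than an ImportError at `import lmfit`.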
Hello everyone!
As promised, I've cleaned up the code for my demo; it's here on GitHub (gist). There's a quick guide included. Let me know if you have any issues playing with it or other thoughts!

Matt -

As my response is admittedly quite long, I've divided it into sections:
- Meta: about this discussion itself
- Goals: identifying users and use-cases, justifying this approach
- Implementation: responding to your comments on my prototype so far

--- meta ---

> For sure, this sort of conversation can turn into "well, I do this" with a reply of "well, this other thing works for me". And, I think that could be OK: we don't currently have a "general-purpose lmfit app", and maybe the desire to have one simply is not that strong. OTOH, if that is the actual goal, I think it would be helpful to discuss this as if the intention was to consider working together on such a project. I'll continue making that assumption for a few more rounds of this discussion.

Apologies, I'm trying to avoid the "well, I do this" sort of conversation. My intention is also to work together, but I know many people are already fairly oversubscribed. Thus, I was planning to do most or all of the coding work, and wanted feedback & ideas on usability/ecosystem/interface/code style/strategy. I figured I'd come out with a more complete/polished/widely applicable solution that hopefully someone else could find useful. If this model is not desirable to you, I'd be happy to discuss distributing the work in a different manner.
> But I should also be realistic: I don't know that we'll agree on how to do this, even if the time and resources were available.

I could be wrong, but I really suspect we agree on almost all the major points. That being said, I'm having a hard time describing what's in my head, and I think that accounts for much of the apparent difference in opinion. I hope that looking at the demo will help with this. If this continues to be a problem I'm happy to set up a time to talk; if you're amenable to this, I've found it's a lot easier and faster to communicate clearly in real time.
--- goals ---

> What is your target user and what do you expect them to have to do to be able to do a fit? It might also be good to start with a use-case.

What is your target user? Anyone who wants to fit complex models built out of simple components, and/or who has a lot of data to apply the same model to. Experimental physicists come to mind.
What do you have to do for a fit? Call `model.interactive_guess(data)`, then edit your model to your satisfaction. No more, no less.
I'll talk about target users first. Who's got...

1) Complex models from simple components? While I don't know enough about experimental physics to argue it permeates the field, packages that think primarily in this manner seem pretty widespread. Five have already been mentioned:
- NexPy, as Ray mentioned
- XASViewer, as you mentioned on the wiki
- HyperSpy, and its nearly 30 pre-built 1D model components
- I can speak somewhat for the PAD group here at Cornell - this kind of modeling is the basis for many of our bread-and-butter measurements
- Will's prototype GUI
I think there are probably others if we went hunting. So now:
a) if you believe that there are users who want to regress complex models that are simple components summed together,
b) and that the simple components have a natural interactive interface, such as the line I showed on the Gaussian peak that you can drag to resize it,

then I think this is a good use-case!
2) A lot of data to apply the same model to? Part of what I'm selling with PyX is that even if you don't think you have a lot of data, you actually want & can have a lot of data, making this capability useful. So I'd tell you that this is everyone in the hard sciences - but I know that's more conjectural, so I'll stick to:
- Shane described this as exactly his use-case (in quantum computing)
- Again speaking for my group, we'd love to be able to, say, automatically optimize detector performance, as measured by metrics derived from a fit
- Materials scientists & co. are also starting to think hard about high-throughput characterization & automated phase-space exploration. See for example SARA (which recently brought in $15M) or this review in Joule. One may consider those proposals relatively aggressive, but even in a conservative sense, deriving performance metrics from a fit seems perfectly reasonable to do with tons of phase-space data.

I think Shane nailed the reasons why an interactive interface would be good for this use case:

> Since most of our fitting succeeds automatically, we would want to activate the interactivity on demand, less than 1% of the time. For example, when a fit fails it's often because our automated parameter initialization was bad, so we would interactively fix them.
Now, on to the actual fitting: `model.interactive_guess(data)` is really a replacement for the `guess`/`fit`/`plot` workflow. You gather your data and do all the preprocessing on the command line. Then, rather than trying to guess reasonable initial parameters à la `guess`/`fit`/`plot`, you call `interactive_guess` to launch a non-blocking GUI. You select your first guess, and can then fit interactively in a live setting. If the fit isn't good enough, you change your initial guesses/fitting parameters as needed until it is, either via the GUI or the command line. If you want to change your preprocessing, and thus the data you're fitting, you may do that via the command line (again, the GUI is still open, there's no need to explicitly "save" the model/parameters/etc., and nothing blocks the command line). Once you're happy with your data & fit, you close the GUI. You're left with a command-line object that's a complete record of your data, model, parameters, and fit results. This can be saved/processed however you like.
I like this workflow for small-scale fitting (1-10 datasets). When you have multiple datasets and well-defined models with many parameters (e.g. a sum of 5 Gaussians), this is a significant quality-of-life & speed improvement in my opinion. But I think this actually enables new possibilities in the 10-10000 dataset range (more on the importance of this below). Shane's figure for doing many regressions was a <1% failure rate. At 10000 datasets, that's <100 that need fixing. Fixing 100 regressions by repeatedly doing `fit`, tweaking `params`, and looking at `plot` sounds pretty time-consuming to me. I think it'd take me at least 4-5 minutes each; 500 minutes is over 8 hours - an entire work day. However, with `interactive_guess`, I could:
1) Easily come up with a good first guess to apply to 10k datasets, via the already-described workflow
2) Write a script to apply it to all 10000 datasets, then identify the failed regressions (~100)
3) Call `interactive_guess` sequentially on all 100 failed datasets. As soon as the user is satisfied with the fit and closes the GUI, launch the next GUI with the next dataset.

I think I could do that at about 1 minute per regression. That's 1.6 hours total - half my morning. While that's not great, it's a lot better than a full day! You can disagree with my specific numbers, but my core argument is that I expect this workflow to be 4-5x faster per regression, and for a sufficiently large number of regressions this becomes an enabling difference in what is possible.
And finally, a use-case: my prototypical one is extracting gain & noise figures from x-ray detectors. This essentially involves taking 3D image stacks, doing some preprocessing down to 1D, and then fitting to a sum of 5 or so Gaussians. In theory this is only one fit - but these are early-stage detectors. Given the current empirical understanding of variations in gain & noise, I actually need to do 6 separate, but similar, fits to properly characterize one detector at one operating point. This is a little painful to do without the interactive interface.

But now I want to optimize this detector, say to decrease its performance variation. I'll do this by changing operating parameters. But for every single set of operating parameters I try, I've got to do another 6 regressions. Two things here:
1) There are literally hundreds of operating parameters with maybe 6-10 steps each, so a complete exploration is combinatorially hopeless - even just 7 parameters at 200 values each would be 200**7 ≈ 1e16 sets of parameters. With some intuition, maybe I only care about 2-5 parameters and 3-6 steps at a time. That's between 3**2 = 9 and 6**5 ≈ 7800 sets of parameters. At every point I need 6 regressions, so that's between roughly 50 and 47000 regressions.
2) The "6 regressions" figure is empirically determined. What if the nature of the performance variation has been misunderstood? A complete exploration would require 2*128*128 (every pixel) ≈ 33000 fits per set of operating parameters. Realistically, with maybe 100 regressions I can be fairly confident my set of 6 is sufficient.

What's my point? I've got a space that theoretically requires something like 1e20 regressions to explore, but I've only got the tools to do 1-10 regressions. I've got to guess, via my best intuition, the behavior of my detector in this huge space from a limited set of data points.
If I could do more regressions - on the 100-10000 scale - relatively easily, that would be tremendously beneficial and massively limit the guesswork required.

Traditionally, people close this gap with intuition & experience. That's hard. Moreover, even experienced people can't be 100% certain, because they've had to make assumptions; they simply don't have the data. With regression capability in the 100-10000 range, we could have much, if not all, of the data required. Don't forget that my task here is to come up with just one number, supposedly requiring just one regression. Imagine what a more complex task might require!

I've described a narrow use-case: radiation imaging detector development. But I think this general case - a large operating-parameter space plus metrics extracted by regression - is common, perhaps much more common than is broadly admitted. For example, of the past 3 major projects I've worked on, I would claim that at least 2 were majorly held back for lack of practical bulk (100-1000 fits) regression ability. I had enough raw data, but not the tools to make sense of it.

To support my claim I'll speak briefly about another project. I was essentially doing I-V characterization of some novel perovskites at the ~100 fA scale. I discovered that these perovskites had all kinds of nasty hysteretic responses to voltage (literally degrading during measurement), light, air, humidity, temperature, you name it. I wanted to understand how the apparent trap density & carrier mobility changed (again - just two numbers), which required a fit to a piecewise linear & quadratic model. I stared at those numbers for months, one dataset at a time, but there were just too many variables that I couldn't directly control. I couldn't understand any of the trends.
One day I buckled down and wrote a highly kludgy auto-regression framework that barely worked - and literally had insights overnight (which I manually confirmed, of course!). If I had had easy access to the ability to perform several hundred regressions and verify the output earlier, it would absolutely have been a game changer for me.
--- implementation ---

> OK. I think I cannot tell if you think the screenshots shown on the wiki pages interesting or worth trying to use or emulate.

I do think they're interesting and worth emulating. I've already taken the parameter table, and plan to implement the same kind of interface for `expr` & fit results. Perhaps more uniquely, I really like the sidebar where it appears one can scroll through different files - I hadn't considered that yet, and I think it could be quite a valuable option to have, especially in my & Shane's use cases.
> I'm afraid I may not fully understand you here. I think you might mean that you want to click and drag on a plot of data and have a model function "follow the mouse" - redrawing. I'm not certain of that. I'm also not certain how you would like that to happen, at least for a truly general case.

Yes, I want the model to "follow the mouse". I agree there's no truly general solution, but recall that my use-case here is specifically models built from a number of simple components. I think this case is far more straightforward:
1) Select the component model you want to edit
2) Each component decides how it should "follow the mouse" - interpreting drags or scrolls
3) Only one component at a time follows the mouse; the model as a whole updates when any component changes
I've found this greatly speeds up the regression process, which as I discussed in "goals" is what I'm looking for.
This also relates to:

> How do you know what parameters to change on mouse or scroll-wheel events?

The component being actively edited decides for itself how to respond.
Thus, each component in the model has to implement this interface. However, many models are reusable, it takes very little code (~20 easy lines) to do so, and the interface is pretty reusable between shapes. For example, the interface for a Gaussian distribution would be nearly identical to that for the Voigt or Lorentzian distributions, just with renamed parameters for center/spread/height.

Related:

> What is a polygon shape?

Each component should represent itself as a connected series of line segments (a "polygon shape"). When those line segments move, the component updates itself to match.
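As an illustration, here is a hypothetical version of the two maps for a Gaussian component (the handle layout and function names are mine, not the prototype's): the handles are the peak top plus two points one sigma out from the center, and the two maps are inverses of each other.

```python
import math

SQRT2PI = math.sqrt(2 * math.pi)


def gaussian_to_polygon(params):
    """Map (center, sigma, amplitude) to three draggable handle points:
    the peak top and two width handles one sigma out from the center."""
    c, s, a = params['center'], params['sigma'], params['amplitude']
    height = a / (s * SQRT2PI)
    return [(c, height), (c - s, height / 2), (c + s, height / 2)]


def polygon_to_gaussian(points):
    """Inverse map: recover the parameters after the handles were dragged."""
    (c, height), (left, _), (right, _) = points
    s = (right - left) / 2
    return {'center': c, 'sigma': s, 'amplitude': height * s * SQRT2PI}


p = {'center': 1.0, 'sigma': 0.5, 'amplitude': 2.0}
roundtrip = polygon_to_gaussian(gaussian_to_polygon(p))
```

The Voigt/Lorentzian versions would be the same two functions with the spread parameter renamed.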
> Hm, that seems kind of fragile. Would subclassing Parameters be helpful for this?

I agree that this is fragile and want to find a better way. I've already had to subclass Parameter to hack in the callbacks. I can think of a few possible workarounds, but none are great:
1) Put watchers on Parameters so that whenever a Parameter is replaced, we transfer the callbacks. However, Parameters subclasses OrderedDict. Off the top of my head, one could replace a Parameter with bracket-indexing, via `.set`, or via `.update`, and I feel like there are probably others - I just don't readily know where to put a callback that catches all instances of a Parameter being replaced.
2) When a user wants to replace a Parameter object, instead just copy all the attributes over. In addition to sharing the problem with 1), this sounds like it could have a lot of unintended side effects.
3) Watch for the Parameter attributes themselves changing. However, I personally don't know of a clean way to "watch" an arbitrary Python variable change.
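For option 3, one stdlib-only possibility (a sketch, not what the prototype does) is to override `__setattr__` so that any attribute assignment on the object notifies registered watchers:

```python
class AttrWatcher:
    """Notify registered callbacks on every attribute assignment."""

    def __init__(self):
        # Bypass our own __setattr__ so the list exists before anything fires.
        super().__setattr__('_watchers', [])

    def watch(self, callback):
        self._watchers.append(callback)

    def __setattr__(self, name, value):
        super().__setattr__(name, value)
        for cb in self._watchers:
            cb(name, value)


obj = AttrWatcher()
seen = []
obj.watch(lambda name, value: seen.append((name, value)))
obj.value = 3.0   # both assignments are observed by the watcher
obj.min = -1.0
print(seen)       # [('value', 3.0), ('min', -1.0)]
```

This only watches one object's attributes, though - it still doesn't catch the object itself being replaced inside the Parameters dict.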
> Not sure I follow you there, but maybe that is specific to the toolkit choices you're making. Should I understand what 'igs' is?

Apologies, I had an editing error. `igs` is the `InteractiveGuessSession` instance - I'm saying you could send new data to the front-end display via a method call from the Python terminal.

> Is this IPython or Jupyter? Or is it in some browser? I don't recognize it.

This is not IPython or Jupyter and does not depend on them in any way. This is Panel, which provides HTML/JS front-end tools for Python and is primarily designed to run standalone with a web browser. It is very high-level and built on top of Bokeh.
> Wait, JSON doesn't support infinities?

That's what I said too! As it turns out, Python's default behavior (`allow_nan=True`) is not officially JSON-compliant: the JSON specification doesn't support NaN/Inf. For this reason, Bokeh has had to disable Python's NaN/Inf serialization, and forcibly re-enabling it indeed causes errors on the JS side for me.
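This is easy to verify with Python's own `json` module; the `1e308` clipping value mirrors the prototype's workaround:

```python
import json

# Python's default serializer happily emits 'Infinity' -- a Python/JS
# extension, not valid per the JSON specification (RFC 8259).
print(json.dumps({'max': float('inf')}))   # {"max": Infinity}

# In strict, spec-compliant mode the same value is rejected outright:
try:
    json.dumps({'max': float('inf')}, allow_nan=False)
except ValueError as exc:
    print(exc)

# Hence clipping infinite bounds to a large representable float:
clipped = min(float('inf'), 1e308)
print(json.dumps({'max': clipped}))        # {"max": 1e+308}
```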
I hope this was informative and worth reading. Thank you for working with me on this!