Introduction: ZeroGuess - Better initial parameters through ML estimation


Deniz Bozyigit

Mar 11, 2025, 8:47:10 PM
to lmfit-py
Hi everyone,
I wanted to introduce a tool I've developed that might help many of you with a common curve fitting challenge - finding good initial parameters.
After years of struggling with parameter sensitivity in complex fits (and seeing similar questions on this list), I created ZeroGuess, an open-source library that uses machine learning to automatically generate optimal starting parameters for curve fitting functions.
The Problem It Solves:
We've all been there:
  • A fit converges with one set of starting parameters but fails with slightly different ones
  • You're batch processing hundreds of datasets and can't manually tune each one
  • You resort to computationally expensive global optimization methods because local methods are too sensitive to starting points
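The sensitivity described in these bullets can be reproduced with a toy cost surface. This is a minimal sketch, assuming a noise-free unit sine as the data and frequency as the only free parameter (both are illustrative choices, not taken from ZeroGuess). Scanning the sum of squared residuals over frequency shows several local minima, so a local optimizer started in the wrong basin settles on the wrong frequency.

```python
import math

def sse(freq, xs, ys):
    """Sum of squared residuals for a unit sine of the given frequency."""
    return sum((math.sin(2 * math.pi * freq * x) - y) ** 2 for x, y in zip(xs, ys))

# Noise-free "data": a sine with frequency 0.5 sampled on [0, 20].
xs = [20 * i / 199 for i in range(200)]
true_freq = 0.5
ys = [math.sin(2 * math.pi * true_freq * x) for x in xs]

# Scan the frequency axis and record every interior local minimum of the cost.
freqs = [0.05 + 0.005 * i for i in range(191)]  # 0.05 .. 1.00
costs = [sse(f, xs, ys) for f in freqs]
local_minima = [
    freqs[i]
    for i in range(1, len(freqs) - 1)
    if costs[i] < costs[i - 1] and costs[i] < costs[i + 1]
]

# The global minimum sits at the true frequency; the others are traps.
best = min(zip(costs, freqs))[1]
print(f"{len(local_minima)} local minima; global minimum near f = {best:.3f}")
```

Each local minimum other than the one at the true frequency is a place where a gradient-based fit can stop, which is why the starting value matters so much for oscillating models.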
How It Works:
ZeroGuess trains a neural network on your specific fitting function to recognize what parameters would likely produce a given curve. It integrates seamlessly with lmfit:

from zeroguess.integration import ZeroGuessModel

# Enhanced lmfit Model with parameter estimation
model = ZeroGuessModel(
    wavelet,
    independent_vars_sampling={"x": x_data},
    estimator_settings={
        "snapshot_path": "model_dg.pth",  # saves and loads model automatically
    },
)

# Still use your usual parameter hints
model.set_param_hint("frequency", min=0.05, max=1.0)
model.set_param_hint("phase", min=0.0, max=2.0 * np.pi)
model.set_param_hint("position", min=5.0, max=15.0)
model.set_param_hint("width", min=0.1, max=3.0)

# But now the guess step is ML-powered
params = model.guess(y_data, x=x_data)

# Run the fit
result = model.fit(y_data, x=x_data, params=params)
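The idea of learning the inverse mapping, from a curve back to its parameters, can be illustrated without a neural network. The sketch below is not ZeroGuess's implementation. It swaps the network for a nearest-neighbour lookup over simulated curves, and the `gaussian` function, the parameter ranges, and the sample count are all assumptions chosen for brevity.

```python
import math
import random

def gaussian(x, position, width):
    """Hypothetical fit function used only for this illustration."""
    return math.exp(-((x - position) ** 2) / (2 * width ** 2))

# Fixed x sampling and finite parameter ranges (both assumed for this sketch).
xs = [20 * i / 49 for i in range(50)]
ranges = {"position": (5.0, 15.0), "width": (0.5, 3.0)}

# "Training": simulate curves for randomly drawn parameter sets.
rng = random.Random(0)
library = []
for _ in range(2000):
    p = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
    library.append((p, [gaussian(x, **p) for x in xs]))

def estimate(ys):
    """Return the parameters of the simulated curve closest to ys (least squares)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry[1], ys))
    return min(library, key=dist)[0]

# "Inference": a curve with known parameters, to check the estimate.
truth = {"position": 7.0, "width": 1.5}
guess = estimate([gaussian(x, **truth) for x in xs])
print(guess)
```

A local optimizer would then start from `guess` instead of a hand-picked value. The sketch also shows why a fixed x grid and bounded parameter ranges are needed: both define the set of curves that can be simulated ahead of time.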

Key Benefits:
  • Benchmarks show it can match the success rate of global optimizers while using simple local methods (100× faster)
  • Once trained, parameter estimation is extremely fast. The model saves to disk, so you only train once.
  • It handles the "symmetry problem" (when multiple parameter combinations produce identical output)
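The symmetry problem in the last bullet is easy to demonstrate for a Gaussian-windowed sine. The sketch below uses the `wavelet` definition posted later in this thread, rewritten with `math` instead of NumPy, and checks three parameter changes that leave the curve unchanged. How ZeroGuess resolves such duplicates is not described in this post; the example only shows why they make the inverse problem ambiguous.

```python
import math

def wavelet(x, frequency, phase, position, width):
    envelope = math.exp(-((x - position) ** 2) / (2 * width ** 2))
    return envelope * math.sin(2 * math.pi * frequency * (x - position) + phase)

xs = [20 * i / 199 for i in range(200)]
base = dict(frequency=0.5, phase=1.0, position=7.0, width=1.5)

# Three different parameter sets that describe exactly the same curve.
equivalents = {
    "phase + 2*pi": dict(base, phase=base["phase"] + 2 * math.pi),
    "-frequency, pi - phase": dict(base, frequency=-0.5, phase=math.pi - 1.0),
    "-width": dict(base, width=-1.5),
}

max_diffs = {}
for label, p in equivalents.items():
    # Largest pointwise difference between the base curve and the variant.
    max_diffs[label] = max(abs(wavelet(x, **base) - wavelet(x, **p)) for x in xs)
    print(f"{label}: max difference = {max_diffs[label]:.2e}")
```

All three differences are at the level of floating-point rounding. An estimator trained on unrestricted samples would see the same curve labelled with different parameters, so restricting the ranges or mapping parameters to one canonical form removes the ambiguity.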
The library is MIT licensed, Python 3.10+ compatible, and available at:
I'd welcome any feedback from the community and am happy to answer questions. If anyone is working on particularly challenging fits, I'd be interested to hear your use cases too.
Best regards,
Deniz

Matthew Newville

Mar 11, 2025, 10:51:27 PM
to lmfi...@googlegroups.com

Hi Deniz,

That looks interesting and ambitious.

> from zeroguess.integration import ZeroGuessModel
>
> # Enhanced lmfit Model with parameter estimation
> model = ZeroGuessModel(
>     wavelet,
>     independent_vars_sampling={"x": x_data},
>     estimator_settings={
>         "snapshot_path": "model_dg.pth",  # saves and loads model automatically
>     },
> )

Is “wavelet” a model function, or is that some part of the learning algorithm? I think it would be helpful to see a use-case with a clearly defined model function.

What is `x_data` and is that needed? Why does this ZeroGuessModel need that?

> # Still use your usual parameter hints
> model.set_param_hint("frequency", min=0.05, max=1.0)
> model.set_param_hint("phase", min=0.0, max=2.0 * np.pi)
> model.set_param_hint("position", min=5.0, max=15.0)
> model.set_param_hint("width", min=0.1, max=3.0)

Please do not do this.

Parameter hints belong to the Model – like it is right there in the `Model.set_param_hint` call. They are intended to represent cases where a range for a parameter value just makes no sense for the Model. An example would be setting a minimum value for “sigma” to 0 for a Gaussian Model: no matter what the data is, `sigma` is a positive value. GaussianModel works if amplitude=1.3e9, center=34e7, and sigma=803. It also works if amplitude=-80000, center=-2.3, and sigma=0.03. The initial parameters will need to be different, but the model is the same. And sigma < 0 just does not make sense, so the Model disallows it.

Models are independent of data. Parameters made for a model to be used to fit data will need to have values that depend on the data. I would assume that you, writing a library to guess initial values, would fully understand this. Bounds on parameters for a particular set of data belong to the parameters. They do not belong to the Model.

Parameter hints for a Model should not be set based on a particular range of data.

If the user is not expected to be able to have good starting values on their own, and needs ZeroGuessModel, what makes you think they will have realistic estimates of the bounds?

In my experience, a really common problem that people have is setting bounds too tightly, often with parameter hints, or setting the bounds based on some heuristics.  I would suggest trying to avoid that problem.

 

 

 

> Key Benefits:
>   • Benchmarks show it can match the success rate of global optimizers while using simple local methods (100× faster)
>   • Once trained, parameter estimation is extremely fast. The model saves to disk, so you only train once.
>   • It handles the "symmetry problem" (when multiple parameter combinations produce identical output)

It would be interesting to see the benchmarks.

But also (and again), training a model implies that it is for some range of data values, and so is dependent on a range of data, which are not part of the Model itself.

--Matt


Laurence Lurio

Mar 12, 2025, 9:42:17 AM
to lmfi...@googlegroups.com
Looks intriguing.  I have some problems where this seems like the right solution, I'll try to give it a shot.

Larry Lurio


Deniz Bozyigit

Mar 22, 2025, 10:09:51 AM
to lmfit-py
Hi Matt,

Thank you for your feedback! (I tried to answer earlier via email, but that seems not to have worked)


> Is “wavelet” a model function, or is that some part of the learning algorithm? I think it would be helpful to see a use-case with a clearly defined model function.
The wavelet is the model function. It was imported as a convenience function, which I agree was confusing. I added simplified, more explicit example code at the end.


> What is `x_data` and is that needed? Why does this ZeroGuessModel need that?
Currently, zeroguess only supports data with consistent x-axis sampling [see limitations]. It is practically used for synthetic data generation for learning. Planning to remove this limitation in the next version. (As for your next point, this should probably go with the estimator - rather than the model.)
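
Until that limitation is lifted, one possible workaround (an assumption, not a documented ZeroGuess feature) is to resample each measured curve onto the fixed x grid the estimator was trained on before calling `guess`. A minimal sketch using plain linear interpolation; `resample` is a hypothetical helper, and with NumPy arrays `np.interp` does the same job.

```python
from bisect import bisect_right

def resample(x_src, y_src, x_grid):
    """Linearly interpolate (x_src, y_src) onto x_grid.

    x_src must be strictly increasing and have at least two points.
    Grid points outside the measured range take the nearest end value.
    """
    out = []
    for x in x_grid:
        # Clamp to the measured range, then find the bracketing interval.
        x = min(max(x, x_src[0]), x_src[-1])
        i = min(bisect_right(x_src, x), len(x_src) - 1)
        x0, x1 = x_src[i - 1], x_src[i]
        t = (x - x0) / (x1 - x0)
        out.append(y_src[i - 1] + t * (y_src[i] - y_src[i - 1]))
    return out

# Irregularly sampled measurement mapped onto a regular grid.
measured_x = [0.0, 1.0, 3.0]
measured_y = [0.0, 10.0, 30.0]
grid = [0.0, 0.5, 2.0, 3.0]
print(resample(measured_x, measured_y, grid))
```

Interpolation changes the noise properties of the data slightly, so the final fit is better run on the original samples, with the resampled curve used only to obtain the initial guess.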

> Parameter hints belong to the Model [...]
I see your point - let me reflect on this. The main reason I added the estimator to the Model class was that I figured it would fit the lmfit approach to implement Model.guess. Ultimately, to make the estimation learnable, there needs to be a finite parameter space - a limitation - but probably still very useful for many applications. Maybe cleaner to keep the estimator separate from Model. Let me know if you have any opinions here.

> It would be interesting to see the benchmarks.
You can see the benchmarks here: 
All the best
Deniz

Example Code:  

import numpy as np

# Define a simple wavelet function directly
def wavelet(x, frequency, phase, position, width):
    envelope = np.exp(-((x - position) ** 2) / (2 * width**2))
    return envelope * np.sin(2 * np.pi * frequency * (x - position) + phase)

# Create some synthetic experimental data with known parameters
true_params = {"frequency": 0.5, "phase": 1.0, "position": 7.0, "width": 1.5}

# Generate x, y data points
x_data = np.linspace(0, 20, 200)
y_clean = wavelet(x_data, **true_params)

# Add noise
np.random.seed(42) # For reproducibility
noise_level = 0.05
y_data = y_clean + np.random.normal(0, noise_level * (np.max(y_clean) - np.min(y_clean)), size=y_clean.shape)

from zeroguess.integration import ZeroGuessModel

# Enhanced lmfit Model with parameter estimation
model = ZeroGuessModel(
    wavelet,
    independent_vars_sampling={"x": x_data},
    estimator_settings={
        # Save and load model automatically
        "snapshot_path": "estimator_lmfit.pth",
    },
)

# Set parameter hints
model.set_param_hint("frequency", min=0.05, max=1.0)
model.set_param_hint("phase", min=0.0, max=2.0 * np.pi)
model.set_param_hint("position", min=5.0, max=15.0)
model.set_param_hint("width", min=0.1, max=3.0)

# Guess parameters with ZeroGuess estimator
params = model.guess(y_data, x=x_data)

# Run the fit
result = model.fit(y_data, x=x_data, params=params)

Deniz Bozyigit

Mar 22, 2025, 10:09:51 AM
to lmfit-py
Hi Laurence,

Great to hear! Let me know if you want to share any of the problems where this might be a fit - happy to help.

I simplified the quickstart (based on Matt's comments), which should make it a bit easier.

Benchmarks might be interesting to look at:

Best
Deniz

PS: I had answered a few days ago via email, but that seems not to have worked...

Laurence Lurio

Mar 27, 2025, 4:48:33 PM
to lmfi...@googlegroups.com
Hi Deniz,

I finally got around to trying the zeroguess code.  Here is a section of my code where I tried to use it.  I wasn't sure exactly how the line

independent_vars_sampling={"x": x_data}

was supposed to be handled.  My independent variable is 'alpha' but I have another variable 'E0' which is just a constant I input to the function.  I put it in the same line, e.g.

independent_vars_sampling={'alpha': alpha[rr], 'E0':E0}

but I'm not sure if that's correct.

In any case, when I ran my code (see below) I got an error

"Loading failed: Model file not found: estimator_lmfit.pth "

I'm not sure what that means.  

For reference, here is a longer detail of the code I used and below that the error message.  

This is not the full code; I have the functions defined in an auxiliary file, but I think that shouldn't matter for understanding how I'm using the code.

Larry Lurio

----


from zeroguess.integration import ZeroGuessModel
alpha = th*scc.degree
bi_refl_model = ZeroGuessModel(
    bi_refl_wrapper,
    independent_vars_sampling={'alpha': alpha[rr], 'E0':E0},
    estimator_settings={
        # Configure training parameters
        # "n_samples": 1000,
        # "n_epochs": 200,
        # "validation_split": 0.2,
        # "add_noise": True,
        # "noise_level": 0.1,
        # 'verbose': True
        # Provide a function to make parameters canonical
        # "make_canonical": ...,
        # Save and load model automatically
        "snapshot_path": "estimator_lmfit.pth",
    },
)

bi_refl_model.set_param_hint('I0',min=1e7,max=1e11)
bi_refl_model.set_param_hint('dalpha',min=0,max=0.02*scc.degree)
bi_refl_model.set_param_hint('res',value=1e-6,vary=False)
bi_refl_model.set_param_hint('alpha_foot',value=0.0162,vary=False)
bi_refl_model.set_param_hint('sig_sio2',value=3.84,vary=True, min=1, max = 10)
bi_refl_model.set_param_hint('rho_a',value=0.7,vary=True, min=0.7, max = 1.5)
bi_refl_model.set_param_hint('d_b',value=44.317,vary=True, min=30, max = 70)
bi_refl_model.set_param_hint('d_h',value=3,vary=True, min=1.5, max = 6)
bi_refl_model.set_param_hint('d_m',value=0,vary=False)
bi_refl_model.set_param_hint('d_sio2',value=9.107,vary=True,min = 5, max = 20)
bi_refl_model.set_param_hint('sig',value=6.03,vary=True,min = 1, max = 20)


# Guess parameters with ZeroGuess estimator
params = bi_refl_model.guess(I[rr], alpha=alpha[rr], E0 = E0)
# Run the fit
result = bi_refl_model.fit(I[rr], alpha=alpha[rr], params=params)

-----

Loading estimator of type NeuralNetworkEstimator from estimator_lmfit.pth
Loading failed: Model file not found: estimator_lmfit.pth
Creating new estimator.

FileNotFoundError Traceback (most recent call last)
File c:\Users\th0lxl1\AppData\Local\anaconda3\Lib\site-packages\zeroguess\estimators\nn_estimator.py:539, in NeuralNetworkEstimator.create_or_load(cls, snapshot_path, device, **kwargs)
    537 raise ValueError("No path provided.")
--> 539 estimator = cls.load(snapshot_path, device)
    540 estimator.function = kwargs.get("function", None)

File c:\Users\th0lxl1\AppData\Local\anaconda3\Lib\site-packages\zeroguess\estimators\nn_estimator.py:566, in NeuralNetworkEstimator.load(cls, snapshot_path, device)
    565 if not os.path.exists(snapshot_path):
--> 566 raise FileNotFoundError(f"Model file not found: {snapshot_path}")
    568 # Determine device to load model onto

FileNotFoundError: Model file not found: estimator_lmfit.pth

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)
File c:\Users\th0lxl1\AppData\Local\anaconda3\Lib\site-packages\zeroguess\integration\lmfit_adapter.py:197, in Model._initialize_estimator(self, **train_kwargs)
    196 # Create or load estimator
--> 197 self._estimator = zeroguess.create_estimator(
    198 function=self.func,
    199 param_ranges=self.param_ranges,
    200 independent_vars_sampling=self.independent_vars_sampling,
    201 # # Load if snapshot_path is provided
    202 # snapshot_path=self.estimator_settings.get("snapshot_path", None),
...
    213 # If initialization or training fails, log the error and set estimator to None
    215 self._estimator = None
--> 217 raise RuntimeError(f"Failed to initialize or train parameter estimator: {str(e)}. ")

RuntimeError: Failed to initialize or train parameter estimator: Sampling for E0 must be a numpy array.
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...
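
Reading the output above: the "Model file not found" message appears to be informational, since the log continues with "Creating new estimator", and the final RuntimeError names the actual failure, namely that the sampling given for `E0` is a scalar and not a NumPy array. One possible workaround, untested against ZeroGuess, is to bind the constant inside a small wrapper so that `alpha` is the only independent variable the estimator sees. In the sketch below, `toy_reflectivity` is a hypothetical stand-in for `bi_refl_wrapper` (whose signature is not shown in the post) and `bind_energy` is a hypothetical helper.

```python
import inspect

def toy_reflectivity(alpha, E0, I0, d_b):
    """Hypothetical stand-in for bi_refl_wrapper; the real signature is not shown."""
    return I0 * E0 / (1.0 + (alpha * d_b) ** 2)

def bind_energy(func, E0):
    """Return a function of (alpha, I0, d_b) with E0 fixed to a constant.

    The wrapper has an explicit signature so that tools which inspect
    argument names see only the remaining variables.
    """
    def bound(alpha, I0, d_b):
        return func(alpha, E0, I0, d_b)
    return bound

model_func = bind_energy(toy_reflectivity, E0=10.0)
print(list(inspect.signature(model_func).parameters))  # E0 no longer appears
```

With a wrapper like this, `independent_vars_sampling` would contain only `alpha`, and `E0` would not need to be passed to `guess` or `fit`. Whether ZeroGuess offers its own way to pass constant arguments is not stated in this thread.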
