
Fit: Asymmetric errorbars?


Ingo Thies

Mar 10, 2010, 11:34:36 AM
Dear all,

a colleague of mine is looking for a tool that can fit a power law (or a
linear function to be transformed into a power law) to a data set that
contains values for x, y, and, for each y, a lower and an upper error
bar. The task is to create a best fit to these data with respect to the
asymmetric error bars. In particular, the error bar for a typical y
value extends from slightly below y to well above it; e.g. y = 10,
y- = 9, and y+ = 15. Using the mean error as the standard error would
lead to wrong results (the fit curve would come out too low).

As far as I can see there is no documented feature for this in the Fit
section of the current gnuplot.pdf. Does anyone know whether such a
feature is on the horizon, or whether it can be achieved by some trick?
Or did I simply overlook an option that is already there?

Ingo

Hans-Bernhard Bröker

Mar 11, 2010, 2:43:23 PM
Ingo Thies wrote:

> asymmetric errorbars. In particular, the errorbars for a typical y value
> extends from slightly below y to high above y; e.g. y = 10, y-=9, and
> y+=15. Using the mean error as standard error would lead to wrong
> results (fit curve too low).

No, gnuplot's "fit" command has no algorithms for asymmetric data
errors whatsoever. Like all least-squares fitting programs, it is based
on the assumption of Gaussian error distributions. Once that assumption
breaks, you can't use it. Errors as lopsided as those can't possibly be
Gaussian, so 'fit' can't be used here.

One would need a much more complex computing environment than gnuplot
ever wanted to be to attack that kind of maximum likelihood
approximation with completely distorted errors. You need a real
numerics workstation. Octave or something like that should work.
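[Editor's sketch: what the simplest such treatment might look like. The two-piece ("split-normal") error model below is an assumption of this sketch, not something the thread prescribes; each residual is simply scaled by the error bar on its own side of the data point.]

```python
# Sketch (assumption: split-normal errors): chi^2 in which a residual
# above the data point is scaled by the upper error bar, and one below
# by the lower error bar.

def asym_chi2(model_y, y, sig_lo, sig_hi):
    """Chi^2 where each residual uses the error bar on its side."""
    total = 0.0
    for m, yi, lo, hi in zip(model_y, y, sig_lo, sig_hi):
        sigma = hi if m > yi else lo     # model above -> upper bar
        total += ((m - yi) / sigma) ** 2
    return total

# Toy data in the spirit of the example (y = 10, y- = 9, y+ = 15):
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
sig_lo = [0.5] * 4                       # tight lower error bars
sig_hi = [2.5] * 4                       # much wider upper error bars

# Crude scan over the slope of y = a*x; a real fit would use an optimizer.
best_a = min((i / 1000.0 for i in range(1000, 3000)),
             key=lambda a: asym_chi2([a * x for x in xs], ys, sig_lo, sig_hi))
```

With the wide upper bars, the minimum lands a bit above the symmetric least-squares slope (about 1.99 for these points), which is exactly the "fit curve too low" bias the original poster worries about, corrected.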

Ingo Thies

Mar 12, 2010, 5:06:28 AM
Hans-Bernhard Bröker wrote:

> One would need a much more complex computing environment than gnuplot
> ever wanted to be to attack that kind of maximum likelihood
> approximation with completely distorted errors. You need a real
> numerics workstation. Octave or something like that should work.

Hmm, I didn't expect this problem to be that complex. I would have
expected that a simple transformation could map asymmetric errors onto
an interval [-sigma:sigma]. I had a similar problem with asymmetric
error bars that were actually logarithmically mapped Poisson errors, to
be fitted as a line in a log-log plot (i.e. actually a power law). There
I could simply fit the underlying power law in gnuplot directly and then
transform back into log space (this was one of the two reasons for my
recent post "Use fitting error values as variables"; the second concerns
approximation splines; see that thread).
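[Editor's sketch of the log-space trick described here, reconstructed under the assumption of errors with constant relative size, which become symmetric after taking logs: a power law y = A*x^B is then an ordinary straight-line fit in log-log space.]

```python
import math

def fit_line(us, vs):
    """Unweighted least-squares fit vs = a*us + b; returns (a, b)."""
    n = len(us)
    mu = sum(us) / n
    mv = sum(vs) / n
    a = (sum((u - mu) * (v - mv) for u, v in zip(us, vs))
         / sum((u - mu) ** 2 for u in us))
    return a, mv - a * mu

# Exact power law y = 3 * x^2; in log-log space it is a straight line
# with slope 2 (the exponent) and intercept log(3) (the log prefactor).
xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0 * x ** 2 for x in xs]
slope, intercept = fit_line([math.log(x) for x in xs],
                            [math.log(y) for y in ys])
prefactor = math.exp(intercept)
```

The same transformation turns multiplicative error bars (y-/y constant, y+/y constant) into symmetric additive ones, which is why this special case works in gnuplot directly.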

Since the real error distribution is unknown, I would suggest something
like mapping [minerr:maxerr] onto [-1:1] via some simple continuous and
strictly monotonic function, calculating chi^2 in the transformed space,
and then converting back. However, this discussion might be more
on-topic in a math group, so thanks anyway.

By the way, are you sure that Octave can do it? My colleague just told
me that even expensive professional tools like Origin (according to
their web page) cannot deal with asymmetric errors...

Ingo

Oliver Jennrich

Mar 12, 2010, 3:37:14 PM
Ingo Thies <ingo....@gmx.de> writes:

> Hans-Bernhard Bröker wrote:
>
>> One would need a much more complex computing environment than
>> gnuplot ever wanted to be to attack that kind of maximum likelihood
>> approximation with completely distorted errors. You need a real
>> numerics workstation. Octave or something like that should work.
>
> Hmm, I didn't expect this problem to be that complex.

Welcome to the wonderful world of non-Gaussian probability
distributions.

> I'd expect that
> a simple transformation could map asymmetric errors on an intervall
> [-sigma:sigma].

Sure, you can always do that. But you shouldn't expect the fit to be
optimal in any statistical sense. Nor should you expect the fitted
parameters to be maximum-likelihood estimates, or the error bars of the
parameters to have anything to do with the width (or indeed any other
property) of the PDF of the parameters.

> By the way, are you sure that Octave can do it?

Yes, of course. Pretty much the same way that Matlab, C, Fortran or any
other programming language can do it: you have to implement an algorithm
that evaluates the posterior PDF of the parameters according to Bayes'
formula.
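[Editor's sketch of that recipe in miniature. The split-normal likelihood and the flat prior are assumptions of this sketch, not Oliver's prescription; its split-normal normalizing constant 2/(lo+hi) does not depend on the parameter, so it is dropped.]

```python
import math

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.1, 5.9]
sig_lo = [0.3] * 3                   # tight lower error bars
sig_hi = [1.5] * 3                   # wider upper error bars (asymmetric)

def log_like(a):
    """Split-normal log-likelihood for the slope of y = a*x; the
    a-independent normalizing constant is omitted."""
    ll = 0.0
    for x, y, lo, hi in zip(xs, ys, sig_lo, sig_hi):
        r = a * x - y
        s = hi if r > 0 else lo      # error bar on the residual's side
        ll -= 0.5 * (r / s) ** 2
    return ll

grid = [1.0 + i * 0.001 for i in range(2001)]   # slope a in [1.0, 3.0]
post = [math.exp(log_like(a)) for a in grid]    # flat prior: post ~ like
z = sum(post)
post = [p / z for p in post]                    # normalized on the grid
a_map = grid[post.index(max(post))]             # posterior mode
```

The full normalized posterior, not just the mode, is the payoff: its (generally asymmetric) shape is what gives honest parameter uncertainties here.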

--
Space - The final frontier

Hans-Bernhard Bröker

Mar 12, 2010, 3:39:10 PM
Ingo Thies wrote:

> Hmm, I didn't expect this problem to be that complex. I'd expect that a
> simple transformation could map asymmetric errors on an intervall
> [-sigma:sigma].

That asymmetry of the error measure is only the outermost symptom of the
problem. It's the proverbial tip of the iceberg.

The real problem is that errors like that clearly aren't normally
distributed, and there's not much chance they will be after some more or
less arbitrarily chosen transformation.

> Since the real error distribution is unknown, I would suggest something
> like mapping [minerr:maxerr] to [-1:1] via some simple continuous and
> strictly monotonic function, calculate the chi^2 in this transformed
> space, and then convert back.

Just hammering the distribution into a symmetric shape in some haphazard
way is pretty much guaranteed to be no help at all.

Paraphrasing Knuth: the probability that an unknown random distribution,
transformed by some randomly chosen method, ends up Gaussian can safely
be assumed to be zero.

> By the way, are you sure that Octave can do it?

No.

Dr Engelbert Buxbaum

Mar 14, 2010, 12:56:54 PM
On Mar 11, 2010, at 3:43 PM, Hans-Bernhard Bröker
<HBBr...@t-online.de> wrote:

> Like all least-squares fitting programs it's based on the assumption of
> Gaussian error distributions. Once that assumption breaks, you can't
> use it.

There is one common method that does not rely on a Gaussian error
distribution, namely the Simplex algorithm. As long as you can calculate
a goodness-of-fit in any way at all, Simplex works. One application I
ran into was fitting a function where both the controlled and the
dependent variable spanned 6 orders of magnitude, with constant relative
error. Marquardt-Levenberg simply ignored the low values, but fitting
with Simplex for minimal chi^2 worked beautifully. In the OP's case,
fitting for minimal median deviation could perhaps work (in a practical
if not a theoretical sense), but that depends on what he is actually
doing.
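[Editor's sketch of the Simplex approach, with SciPy's Nelder-Mead implementation standing in for the Pascal code from the Byte article below; the side-dependent chi^2 used as the goodness-of-fit is this sketch's invention, chosen to match the OP's asymmetric error bars.]

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 2.0, 4.0, 8.0])
y = np.array([2.9, 12.5, 47.0, 195.0])   # roughly y = 3 * x^2
sig_lo = np.full(4, 1.0)                  # tight lower error bars
sig_hi = np.full(4, 5.0)                  # much wider upper error bars

def score(p):
    """Side-dependent chi^2: each residual of the power law a*x^b is
    scaled by the error bar on its own side of the data point."""
    a, b = p
    r = a * x ** b - y
    s = np.where(r > 0, sig_hi, sig_lo)
    return float(np.sum((r / s) ** 2))

# Nelder-Mead needs only this scalar score -- no derivatives, no
# Gaussian assumption. One restart guards against premature shrinkage.
res = minimize(score, x0=[2.0, 2.0], method='Nelder-Mead')
res = minimize(score, res.x, method='Nelder-Mead')
a_fit, b_fit = res.x
```

Any other robust score (e.g. the minimal median deviation suggested above) drops into `score` unchanged, which is exactly the flexibility being claimed for Simplex.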

@article{Cac-84,
AUTHOR= {Caceci, M.S. and Cacheris, W.P.},
TITLE= {Fitting Curves to Data: {T}he Simplex Algorithm is the Answer},
JOURNAL= {Byte},
VOLUME= {9},
NUMBER= {5},
YEAR= {1984},
PAGES= {340-362},
ABSTRACT= {complete implementation in Pascal},
LANGUAGE= {engl}
}

@article{Nel-65,
AUTHOR= {Nelder, J.A. and Mead, R.},
TITLE= {A simplex method for function minimization},
JOURNAL= {The Computer Journal},
VOLUME= {7},
NUMBER= {4},
PAGES= {308-313},
YEAR= {1965},
ABSTRACT= {Simplex algorithm for non-linear curve fitting},
DOI= {10.1093/comjnl/7.4.308}
}

The disadvantage of Simplex is that by itself it cannot provide error
estimates for the fit parameters; these have to be calculated by
bootstrapping. On a PC compatible of the mid-80s that took a couple of
minutes; today the computational effort isn't worth mentioning:

@article{Str-92,
AUTHOR= {Straume, M. and Johnson, M.L.},
TITLE= {Monte Carlo Method for Determining Complete Confidence
Probability Distributions of Estimated Model Parameters},
JOURNAL= {Meth. Enzymol.},
VOLUME= {210},
YEAR= {1992},
PAGES= {117-129},
ABSTRACT= {error bounds, via bootstrapping, for curve parameters
determined with Simplex},
DOI= {10.1016/0076-6879(92)10009-3},
LANGUAGE= {engl}
}

Note that this method can even detect asymmetric error distributions of
the parameters.
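[Editor's sketch of that bootstrap step, using a deliberately trivial through-the-origin least-squares fit of its own invention in place of a Simplex refit: resample the data pairs with replacement, refit each resample, and read confidence limits off the percentiles of the refitted parameter.]

```python
import random

random.seed(1)  # deterministic resampling for the example

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.2, 3.8, 6.3, 7.9, 10.4, 11.7]    # roughly y = 2*x

def fit_slope(pairs):
    """Least-squares slope of y = a*x (through the origin)."""
    return sum(px * py for px, py in pairs) / sum(px * px for px, _ in pairs)

data = list(zip(xs, ys))
a_hat = fit_slope(data)

# Refit 2000 resamples drawn with replacement, then read an approximate
# 95% confidence interval off the percentiles of the refitted slope.
slopes = sorted(fit_slope([random.choice(data) for _ in data])
                for _ in range(2000))
lo_95, hi_95 = slopes[50], slopes[1949]
```

Unequal distances from the estimate to the two limits are precisely the asymmetric parameter error distributions the method can reveal.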
