Hi all,
Please take a look at the following plot:
http://img63.imageshack.us/img63/5599/gggyf5.jpg
There are two curves:
The red one is the ideal one and the blue one is the real data from
experiments.
Due to some numerical instability, the real data strays away from the ideal
curve at around x=15, and then begins to oscillate weirdly after x=19. Since
the real data is definitely wrong beyond x=19, I only capture the real data
from x=1 to x=19.
I also know that at x=+infinity, the asymptotic value of the curve should be
1.5. That's to say, the ideal curve slowly converges to y=1.5 from below and
in theory it never touches y=1.5, but numerically, it can reach y=1.5 when x
is a very large number.
The question is how to recover the ideal curve (the red one) from the
portion of the real data (the blue curve, up to x=19). More specifically:
1. Is there a systematic method to detect the turning point at which
the real curve begins to stray away from the ideal curve (here,
x=15)?
If I throw away the data from x=15 to x=19 and only do the interpolation
based on the data from x=1 to x=15, then I will probably do a very good job
fitting the curve. However, I want to find a systematic method that works
automatically and programmatically for all such curves, which have varying
turning points. It's going to be troublesome if I have to inspect every
curve by eye.
2. Is there a way to utilize the knowledge of the asymptotic value y=1.5?
I tried a polynomial fit and a cubic line fit for x=1 to 19 plus the point
x=100000 (at which y=1.5), but the result is not very good.
There must be a way to exploit the smoothness of the curve and recover the
ideal curve (the red one) based on the partial real data (the blue curve, up
to x=19, before it diverges)...
Please shed some light on this! Thanks a lot!
You want a way to massage your experimental data to make it match your
theory of how it should behave?
If so, your idea of ignoring points that don't lie on the curve predicted by
your theory seems the best approach; just extend this idea to any other
points that don't conform with your theory.
This comes with the territory. Another example: if you interpolate
the graph of 1/(1+x^2) at equally spaced points by a high-degree
polynomial, the graph of the polynomial will oscillate wildly near the
ends of the interval (Runge's phenomenon).
> I also know that at x=+infinity, the asymptotic value of the curve should be
> 1.5. That's to say, the ideal curve slowly converges to y=1.5 from below and
> in theory it never touches y=1.5, but numerically, it can reach y=1.5 when x
> is a very large number.
>
> The questions is how to recover the ideal curve(the red one) from the
> portion of the real data(the blue curve, up to x=19).
In general, it can't be done.
> More specifically:
>
> 1. Is there a sy[s]tematic method to detect the turning point starting from
> which the real curve begins to stray away from the ideal curve (here is
> x=15)?
>
> If I throw away the data from x=15 to x=19, and only do the interpolaton
> based on the data from x=1 to x=15. Then I will probably do a very good job
> fitting the curve. However, I want to find a systematic method that works
> automatically and programmatically for all such curves but with varying
> turning points. It's going to be troublesome if every time I have to
> physically inspect the curves use my eyes manually.
>
> 2. Is there a way to utilize the knowledge of the asymptotic value y=1.5?
> I tried to do polynomial fit and other cubic line fit for x=1 to 19 and then
> x=100000(at which y=1.5), but the result is not very good.
>
> There must be a way to exploit the smoothness of the curve and recover the
> ideal curve(the red one) based on the partial real data(the blue curve, up
> to x=19, before it diverges)...
>
> Please shed some lights on me! Thanks a lot!
What type of curve are you assuming your ideal curve to be? This also
plays a big part in how the interpolation comes out. For instance, if
you try to fit the data to a polynomial, the resulting curve won't work:
no nonconstant polynomial has a finite limit as x approaches infinity,
so it cannot level off at 1.5. (This may have been what you did and
found out.) You should try looking at curves
which approach 1.5 as x approaches infinity. The form
y = 1.5 - A e^(-B x)
comes to mind (where A and B are positive constants yet to be
determined). It might or might not be what you want, though. (If so,
then there's a way to turn the problem into a linear regression. I'll
give more details if this is what you want to do.)
--- Christopher Heckman
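A minimal Python sketch of the linear-regression idea for y = 1.5 - A e^(-B x), not taken from the post itself. Taking logs gives ln(1.5 - y) = ln(A) - B x, which is a straight line in x. The function name and the synthetic values A = 1.2, B = 0.25 are assumptions made up for illustration.

```python
import math


def fit_exponential_approach(xs, ys, limit=1.5):
    """Fit y = limit - A*exp(-B*x) by a straight-line fit to ln(limit - y).

    Points with y >= limit are skipped because the logarithm is undefined there.
    """
    pts = [(x, math.log(limit - y)) for x, y in zip(xs, ys) if y < limit]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points below the limit")
    mean_x = sum(x for x, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in pts)
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return math.exp(intercept), -slope  # (A, B)


# Synthetic check on x = 1..15 with made-up parameters A = 1.2, B = 0.25.
xs = list(range(1, 16))
ys = [1.5 - 1.2 * math.exp(-0.25 * x) for x in xs]
A, B = fit_exponential_approach(xs, ys)
```

Note that the log transform changes how errors are weighted: noise in points close to y = 1.5 is magnified, so on real data a weighted or nonlinear fit may behave better.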
Here is a suggestion. I can't predict how well it will work but you can
give it a try. Implicit in this, I am assuming you are at least familiar with the
basics of numerical derivatives and numerical regression. You may even
remember being taught a curvature formula in your calculus days. :-)
STEP 1: Smooth the data with a moving average or kernel smoother as much
as necessary before STEP 2. Since it may not be a priori obvious how much smoothing
is required to facilitate the best performance of the following steps, you may wish to
make, say the size of the smoothing window, a parameter of the overall "systematic
method" which you can adjust later on by trying out different values.
STEP 2: From the smoothed data, call it f(x), "plot" T(x) = (R(f)(x))^(-2), where
R(f)(x) is the radius of curvature. From calculus,
R(f)(x) = (1+f'(x)^2)^(3/2)/|f''(x)|
So, T(x) = (R(f)(x))^(-2) = (f''(x))^2/(1+f'(x)^2)^3 , which is the square of the
curvature. What this really means is that you will fill one array with the first
numerical derivative of your data, f'(x); you will fill a second array with the second
numerical derivative of your data, f''(x); and you will fill a third array with T(x).
STEP 3: The "systematic method" locates x1 corresponding to the first peak
of T(x), locates x2 corresponding to the second peak of T(x), then locates xm
corresponding to the minimum of T(x) inside the interval [x1,x2]. The T(x) is increasing
again after xm, but your "real" curve isn't supposed to do that. According to your
claim, there is only going to be one turning point, and we've already seen it at x1.
So, surely anything going up to point (x2,T(x2)) is headed in the wrong direction,
whereas everything going down to point (xm,T(xm)) was at least headed in the
right direction, if not entirely perfect up to xm. So throw away the data which comes
after xm.
STEP 4: Use a regression method and model of your choosing against the
good original data, which STEP 3 says continues up to xm.
ARTIFICIAL EXAMPLE: Imagine that the real data was generated by the
function f(x) = x^3+x , which does resemble your JPG somewhat, up to and
including the second turning point. The first turning point was good. The
second turning point was bad. According to theory,
T(x) = 36*x^2*(1+(1+3*x^2)^2)^(-3)
If you plot this T(x), you will see that T(x) attains its maximum at x1 = -.3407498972
and x2 = .3407498972 and attains its minimum at xm = 0.0 . We would throw away
the data that comes after xm = 0.0 and then attempt to recover f(x) from the remaining
data.
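STEPS 1 to 3 can be sketched in plain Python and run on the artificial example f(x) = x^3+x. This is only an illustration under stated assumptions: the helper names, the centered moving average, the finite-difference derivatives and the grid spacing of 0.01 are all choices made here, not part of the method as posted. With that spacing the detected peaks land on the grid points nearest to +-.3407498972.

```python
def moving_average(ys, window):
    """Centered moving average; the window shrinks near the two ends."""
    half = window // 2
    out = []
    for i in range(len(ys)):
        lo, hi = max(0, i - half), min(len(ys), i + half + 1)
        out.append(sum(ys[lo:hi]) / (hi - lo))
    return out


def derivative(xs, ys):
    """Central differences inside, one-sided differences at the ends."""
    n = len(xs)
    d = [0.0] * n
    d[0] = (ys[1] - ys[0]) / (xs[1] - xs[0])
    d[-1] = (ys[-1] - ys[-2]) / (xs[-1] - xs[-2])
    for i in range(1, n - 1):
        d[i] = (ys[i + 1] - ys[i - 1]) / (xs[i + 1] - xs[i - 1])
    return d


def find_cutoff(xs, ys, window=1):
    """Return (i1, i2, im): indices of the first two peaks of T and of the
    minimum of T between them, or None if T has fewer than two peaks."""
    f = moving_average(ys, window)            # STEP 1: smooth
    d1 = derivative(xs, f)                    # STEP 2: f', f'' and T
    d2 = derivative(xs, d1)
    T = [b * b / (1.0 + a * a) ** 3 for a, b in zip(d1, d2)]
    peaks = [i for i in range(1, len(T) - 1) if T[i - 1] < T[i] >= T[i + 1]]
    if len(peaks) < 2:
        return None
    i1, i2 = peaks[0], peaks[1]               # STEP 3: peaks and minimum
    im = min(range(i1, i2 + 1), key=lambda i: T[i])
    return i1, i2, im


# Artificial example: f(x) = x^3 + x sampled on [-1, 1] with spacing 0.01.
xs = [(i - 100) / 100 for i in range(201)]
ys = [x ** 3 + x for x in xs]
i1, i2, im = find_cutoff(xs, ys, window=1)    # window=1 means no smoothing
x1, x2, xm = xs[i1], xs[i2], xs[im]
kept_x, kept_y = xs[: im + 1], ys[: im + 1]   # data handed on to STEP 4
```

On noisy data the window would need to be larger than 1, and spurious small peaks in T may have to be filtered out before picking x1 and x2.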
In any case where the "real data" is not exact, what you want to do is
regression, not interpolation. Typically you might choose some
functions f_i(x), i = 1 .. n, and try to fit the data to the formula
f(x) = 1.5 + sum_{i=1}^n c_i f_i(x), using linear least squares.
There may be theoretical considerations about the behaviour of the ideal
function as x -> infinity and as x -> 0 which will help choose the f_i.
For example, you might try f_i(x) = 1/x^i if that sort of behaviour
would be reasonable.
--
Robert Israel isr...@math.MyUniversitysInitials.ca
Department of Mathematics http://www.math.ubc.ca/~israel
University of British Columbia Vancouver, BC, Canada V6T 1Z2
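A small Python sketch of the linear least squares fit f(x) = 1.5 + sum c_i f_i(x) with the suggested basis f_i(x) = 1/x^i. It is an illustration only: the function names, the use of the normal equations, and the synthetic coefficients (-1.0 and 0.2) are assumptions made here. Because every basis function tends to 0 as x grows, the fitted curve approaches 1.5 by construction.

```python
def solve_linear(M, b):
    """Solve M c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * c[k] for k in range(i + 1, n))
        c[i] = (aug[i][n] - s) / aug[i][i]
    return c


def fit_inverse_powers(xs, ys, n_terms=2, limit=1.5):
    """Least-squares c_i in f(x) = limit + sum_i c_i / x**i, i = 1..n_terms."""
    basis = [[1.0 / x ** i for i in range(1, n_terms + 1)] for x in xs]
    resid = [y - limit for y in ys]
    # Normal equations (A^T A) c = A^T r; adequate for a few terms only.
    ata = [[sum(row[p] * row[q] for row in basis) for q in range(n_terms)]
           for p in range(n_terms)]
    atr = [sum(row[p] * r for row, r in zip(basis, resid))
           for p in range(n_terms)]
    return solve_linear(ata, atr)


def evaluate(coeffs, x, limit=1.5):
    """Evaluate the fitted model at x."""
    return limit + sum(c / x ** i for i, c in enumerate(coeffs, start=1))


# Synthetic check: data generated from made-up coefficients -1.0 and 0.2.
xs = list(range(1, 16))
ys = [1.5 - 1.0 / x + 0.2 / x ** 2 for x in xs]
coeffs = fit_inverse_powers(xs, ys, n_terms=2)
far_value = evaluate(coeffs, 1e6)
```

The normal equations become ill-conditioned as more terms are added, so for a larger basis a QR-based least squares routine from a numerical library would be the safer choice.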
I would fit cubic splines, then look for the peak in f'' in much the same
way as Klueless. This is harder to do than moving average, but moving
average has a lag which will shift the turning point to the right.
If the data deviates near, or even before the bend in the graph then you
will get garbage coming out.
I have used an alternative technique.
First the data is normalised.
A polynomial approximation is calculated.
Then the "best" functional candidates are sought, with their Taylor polynomials
matching the original polynomial as well as possible.
In the one-variable case the x-interval is divided into several "wavelets".
Finally the best function is found using nonlinear regression among a group
of candidates.
The method has some restrictions that are not very strict.
1 The phenomenon and its derivatives must be continuous.
2 The function set resembles a neural net with 8 parameters.
The nodes can be different functions and parameters are set only at the
lowest level of the net.
The result looks like the following:
f =
(((2.01+c)*(0.129+a))+EXP((.903+c)+(1.04+b)))/((5.05+b)*(0.306+a))*SIN((1.87+a)*(0.708+b))
See more at www.estlab.com
No, you absolutely don't want to do a polynomial fit.
I'd also recommend not doing a nonlinear fit to a
negative exponential. You are likely to see lack of fit,
unless that is a realistic model for the process.
I'd also suggest that dropping some points that fail
to fit theory is not the worst thing to do in life, WITH
A STRONG CAVEAT.
When you write this up in a paper or your thesis,
show the entire data set. Describe a logical
mechanism for the deviation from theory. Only
now would you drop those few points from the
end of the curve.
How would I select the points to drop? I'd fit an
interpolating cubic spline to the curve. Compute
the second derivative of the curve. This will be
a piecewise linear curve. Find that point where
the second derivative goes negative, i.e., where
it crosses the x axis.
Now I'd refit the curve with only the data below
that point. Myself, I'd use a least squares cubic
spline for the fit, where the last knot extrapolates
to some reasonably large number. I'd constrain
the curve to
1. be a monotone increasing function
2. have an everywhere negative second derivative
3. have its value everywhere <= 1.5.
Given the first two requirements, this last
requires only that the right end point of the
curve be less than or equal to 1.5.
HTH,
John
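The point-selection step described above (interpolating cubic spline, piecewise linear second derivative, first crossing of the x axis) can be sketched in plain Python. Everything specific here is an assumption for illustration: a natural spline (second derivative zero at both ends) is used, the sketch reports the first sign change in either direction, and the test data is a made-up concave curve with an artificial cubic deviation added after x = 15.

```python
import math


def natural_spline_second_derivs(xs, ys):
    """Second derivatives M[i] at the knots of the natural interpolating
    cubic spline (M[0] = M[-1] = 0), found with the Thomas algorithm."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    diag = [2.0 * (h[j] + h[j + 1]) for j in range(n - 1)]
    rhs = [6.0 * ((ys[j + 2] - ys[j + 1]) / h[j + 1]
                  - (ys[j + 1] - ys[j]) / h[j]) for j in range(n - 1)]
    for j in range(1, n - 1):                 # forward elimination
        w = h[j] / diag[j - 1]
        diag[j] -= w * h[j]
        rhs[j] -= w * rhs[j - 1]
    u = [0.0] * (n - 1)
    u[-1] = rhs[-1] / diag[-1]
    for j in range(n - 3, -1, -1):            # back substitution
        u[j] = (rhs[j] - h[j + 1] * u[j + 1]) / diag[j]
    return [0.0] + u + [0.0]


def first_sign_change(xs, M):
    """x where the piecewise linear second derivative first crosses zero,
    looking only between interior knots; None if it never does."""
    for i in range(1, len(M) - 2):
        if M[i] * M[i + 1] < 0.0:
            t = M[i] / (M[i] - M[i + 1])
            return xs[i] + t * (xs[i + 1] - xs[i])
    return None


# Made-up data: concave approach to 1.5, corrupted by a cubic term after x = 15.
xs = [float(x) for x in range(1, 20)]
ys = [1.5 - 1.2 * math.exp(-0.25 * x)
      + (0.002 * (x - 15.0) ** 3 if x > 15.0 else 0.0) for x in xs]
M = natural_spline_second_derivs(xs, ys)
x_cross = first_sign_change(xs, M)
kept = [(x, y) for x, y in zip(xs, ys) if x <= x_cross]  # points to refit
```

The forced zero second derivative at the two end knots distorts the spline near the boundaries, which is why the end intervals are excluded from the search. The constrained least squares spline refit is not attempted here.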
Perhaps the physics/chemistry of the phenomenon should be driving the
mathematics and not the other way round. You are clearly justified in
taking the inverse relation existing between x and y with accuracy in
your theory and mathematical model as valid up to x = 15. The
assumptions in the theory may be stretched too far after that, and
you may perhaps want to look for what caused this abrupt unsustainable
deviation. Remember how Boyle's law was modified by van der Waals in
his famous equation incorporating inter-particle attractive forces?
Narasimham
The equation is an equation of state for a fluid composed of
particles that have a non-zero size and a pairwise attractive
inter-particle force (such as the van der Waals force). It was derived by
Johannes Diderik van der Waals in
1873, based on a modification of the ideal gas law. The equation
approximates the behavior of real fluids, taking into account the
nonzero size of molecules and the attraction between them.
you wrote:
(snip)
> The questions is how to recover the ideal curve(the red one) from the
> portion of the real data(the blue curve, up to x=19).
(snip)
> There must be a way to exploit the smoothness of the curve and recover the
> ideal curve(the red one) based on the partial real data(the blue curve, up
> to x=19, before it diverges)...
I suggest using an extended Kalman filter where the governing
equations in the filter are based on the fundamental physics of the
experiment.
The Kalman filter is a very powerful tool.
Book: Grewal and Andrews, "Kalman Filtering: Theory and Practice Using
MATLAB" (2nd ed.), 2001, ISBN 0-471-39254-5