[SciPy-User] Comparing variable time-shifted two measurements

Gökhan Sever

Nov 5, 2009, 9:21:15 PM

to SciPy Users List, Discussion of Numerical Python

Hello,

I have two aircraft-based aerosol measurements. The first is dccnConSTP (blue) and the second is CPCConc (red), as shown in this screen capture (http://img513.imageshack.us/img513/7498/ccncpclag.png). My goal is to compare these two measurements, which are expected to be positively correlated throughout the flight. However, the instrument that gives CPCConc was experiencing a sampling issue, so its measurements are shifted in time by a varying amount relative to the first instrument. (In the first box the shift is about 20 seconds, and in the second about 24 seconds, before the dccnConSTP measurement shows up.) In other words, at different altitude levels the two measurements have similar shapes but varying time offsets. So my first task is to correct this variable shift before I start the comparisons.

Is there a known automated approach for correcting this varying lag? If so, how?

Thank you.

--
Gökhan

Anne Archibald

Nov 5, 2009, 11:48:21 PM

to SciPy Users List

2009/11/5 Gökhan Sever <gokha...@gmail.com>:

There are several tools you can use, depending on exactly what the problem is.

If the problem is that there's a constant lag for each data set but
you don't know what it is, then you can use the correlation to fit for
the lag - if you take the correlation of two vectors, then the highest
peak in the correlation vector is the lag where the two vectors are
most similar. Correlations can be calculated rapidly using FFTs.
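
A minimal sketch of this constant-lag estimate, assuming both series are evenly sampled NumPy arrays. The function name `estimate_lag` and the synthetic data are illustrative, not part of SciPy. Zero-padding before the FFT keeps the correlation linear rather than circular.

```python
import numpy as np

def estimate_lag(a, b):
    """Return the number of samples by which b lags a (negative if b leads).

    Hypothetical helper for illustration; assumes evenly sampled 1-D inputs.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()   # remove offsets so the peak reflects shape, not level
    b = b - b.mean()
    n = len(a) + len(b) - 1            # length of the full linear correlation
    nfft = 1 << (n - 1).bit_length()   # next power of two >= n
    fa = np.fft.rfft(a, nfft)
    fb = np.fft.rfft(b, nfft)
    # corr[k] = sum_t b[t + k] * a[t], computed circularly on the padded arrays
    corr = np.fft.irfft(fb * np.conj(fa), nfft)
    k = int(np.argmax(corr))
    # indices >= len(b) hold the negative lags, wrapped around the end
    return k - nfft if k >= len(b) else k

# Synthetic check: b is a copy of a delayed by 20 samples.
rng = np.random.default_rng(0)
a = rng.standard_normal(600)
b = np.roll(a, 20)
print(estimate_lag(a, b))
```

With real data the peak is usually broader than with this white-noise example, so it can help to inspect the correlation curve itself rather than trusting the argmax alone.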

If the lag isn't constant over a data set, you can try using
correlations to find the lag at several points in the data set and
interpolate to get the lag as a function of time (but be careful -
depending on what caused the lag, a steadily-drifting model isn't
necessarily appropriate; maybe you'll have periods of constant offset
separated by jumps).
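
The windowed approach might be sketched as follows, assuming evenly sampled arrays of equal length and a lag that stays well below the window length. The names `windowed_lags`, `window`, `step`, and `max_lag` are illustrative choices, and linear interpolation with `np.interp` is only one possible model; if the lag jumps between constant plateaus, a step-like model may fit better.

```python
import numpy as np

def windowed_lags(a, b, window, step, max_lag):
    """Estimate the lag of b relative to a in successive windows.

    Hypothetical helper: returns (window centers, lag in samples per window).
    """
    if max_lag >= window:
        raise ValueError("max_lag must be smaller than window")
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    centers, lags = [], []
    for start in range(0, len(a) - window + 1, step):
        seg_a = a[start:start + window]
        seg_b = b[start:start + window]
        seg_a = seg_a - seg_a.mean()
        seg_b = seg_b - seg_b.mean()
        # corr[i] corresponds to lag i - (window - 1); positive means b is late
        corr = np.correlate(seg_b, seg_a, mode="full")
        zero = window - 1
        search = corr[zero - max_lag: zero + max_lag + 1]
        lags.append(int(np.argmax(search)) - max_lag)
        centers.append(start + window // 2)
    return np.array(centers), np.array(lags)

# Synthetic example: the lag jumps from 5 to 12 samples halfway through.
rng = np.random.default_rng(0)
t = np.arange(2000)
a = rng.standard_normal(t.size)
true_lag = np.where(t < 1000, 5, 12)
b = a[np.clip(t - true_lag, 0, None)]

centers, lags = windowed_lags(a, b, window=200, step=200, max_lag=30)
lag_of_t = np.interp(t, centers, lags)   # lag as a function of time
print(lags)
```

Shorter windows follow changes in the lag more closely but give noisier estimates per window, so the window length is a trade-off to tune against the data.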

If you know the lag, but it isn't constant and you're not sure how to
resample your data set to remove the lag, look at scipy's ndimage.
This should have the tools to do what you want.
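
For 1-D data, one way to use ndimage here is `scipy.ndimage.map_coordinates`, which samples an array at arbitrary fractional positions. The sketch below assumes the lag is already known as an array `lag` (in samples, one value per sample of the reference series, positive when the second series is late); `remove_lag` is a hypothetical helper name.

```python
import numpy as np
from scipy import ndimage

def remove_lag(b, lag, order=1):
    """Resample b so that aligned[t] = b[t + lag[t]].

    Hypothetical helper; order=1 means linear interpolation between samples,
    and mode="nearest" repeats the edge value where t + lag runs off the end.
    """
    b = np.asarray(b, dtype=float)
    t = np.arange(len(b), dtype=float)
    coords = t + np.asarray(lag, dtype=float)
    return ndimage.map_coordinates(b, [coords], order=order, mode="nearest")

# Synthetic example: a feature at time t in a shows up at t + lag(t) in b.
t = np.arange(1000)
a = np.sin(2 * np.pi * t / 200)
lag = 5 + 0.01 * t                                   # slowly growing lag
b = np.sin(2 * np.pi * ((t - 5) / 1.01) / 200)       # a warped by that lag
aligned = remove_lag(b, lag)
valid = t + lag <= len(t) - 1                        # samples not extrapolated
print(np.max(np.abs(aligned[valid] - a[valid])))
```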

If your data sets are unevenly sampled, so that you can't use simple
correlations, I'm not sure quite what to suggest, except perhaps
interpolating them to evenly-spaced samples and then running the
correlation. For this try scipy.interpolate.
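
Resampling onto an even grid could look like the sketch below, which assumes distinct timestamps and uses linear interpolation through `scipy.interpolate.interp1d`. The helper name `to_even_grid` and the parameter `dt` are hypothetical, and other interpolation kinds may suit the data better.

```python
import numpy as np
from scipy.interpolate import interp1d

def to_even_grid(t, y, dt=1.0):
    """Linearly interpolate samples (t, y) onto a grid with spacing dt.

    Hypothetical helper; returns (grid, values on the grid).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = int(np.floor((t.max() - t.min()) / dt)) + 1
    grid = t.min() + dt * np.arange(n)
    # Hold the end values if rounding pushes a grid point just past the data
    f = interp1d(t, y, kind="linear", bounds_error=False,
                 fill_value=(y[np.argmin(t)], y[np.argmax(t)]))
    return grid, f(grid)

# Unevenly spaced timestamps in seconds, resampled to 1 Hz.
t_raw = np.array([0.0, 0.7, 2.1, 3.0, 4.6, 6.0])
y_raw = 2.0 * t_raw + 1.0
grid, y_even = to_even_grid(t_raw, y_raw, dt=1.0)
print(grid, y_even)
```

Once both series are on the same even grid, the usual correlation-based lag estimate applies.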

If you do end up fitting for the lag, keep in mind that you'll have
adjusted the lags to make the time series as similar as possible, so
that there's a risk of overestimating their similarities. But the only
way around that problem is to know the lags from some independent
source.

Anne

_______________________________________________
SciPy-User mailing list
SciPy...@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

Gökhan Sever

Nov 6, 2009, 6:36:23 PM

to SciPy Users List

On Thu, Nov 5, 2009 at 10:48 PM, Anne Archibald <peridot...@gmail.com> wrote:

> There are several tools you can use, depending on exactly what the problem is.
>
> If the problem is that there's a constant lag for each data set but
> you don't know what it is, then you can use the correlation to fit for
> the lag - if you take the correlation of two vectors, then the highest
> peak in the correlation vector is the lag where the two vectors are
> most similar.

That's how I discovered the varying lag. I expected a better correlation after shifting the data by a constant value, but that did not work out, and later analysis showed that the lags are not constant.

> Correlations can be calculated rapidly using FFTs.

How would I use FFTs in this case?

> If the lag isn't constant over a data set, you can try using
> correlations to find the lag at several points in the data set and
> interpolate to get the lag as a function of time (but be careful -
> depending on what caused the lag, a steadily-drifting model isn't
> necessarily appropriate; maybe you'll have periods of constant offset
> separated by jumps).

OK, good idea. Presumably the finer the segments I correlate, the more accurate the lag estimates and therefore the better the interpolated result. "Steadily-drifting model" is a new term to me.

> If you know the lag, but it isn't constant and you're not sure how to
> resample your data set to remove the lag, look at scipy's ndimage.
> This should have the tools to do what you want.

This is 1D data. Could you give me an example of how to use the ndimage library for my case?

> If your data sets are unevenly sampled, so that you can't use simple
> correlations, I'm not sure quite what to suggest, except perhaps
> interpolating them to evenly-spaced samples and then running the
> correlation. For this try scipy.interpolate.

I don't think uneven sampling is an issue in my case. Both instruments sample at 1 Hz. One draws a sample flow of 0.5 L/min and the other 1.0 L/min, but the latter cannot maintain this rate when the pressure drops.

> If you do end up fitting for the lag, keep in mind that you'll have
> adjusted the lags to make the time series as similar as possible, so
> that there's a risk of overestimating their similarities. But the only
> way around that problem is to know the lags from some independent
> source.

Thank you for your suggestions. For now I am sure that these varying lags can only be determined by manual inspection. If I had recorded the sample flow rate, then correcting the data would be easy; unfortunately, that will have to wait for future experiments.
