
Correlation length and spatial correlation


D

Aug 8, 2013, 3:45:13 AM
Hi MATLABbers!

I should have posted here ages ago, but apparently I enjoy getting tangled in knots. Many thanks in advance, and apologies for such a long post, but I wanted to really clarify what my problem is!

Here's what I'm trying to do:

1. I have a vector field of velocities corresponding to the flow patterns in a population of kidney cells. The field is split into V_x and V_y in two different matrices. These matrices are often 100 x 100.

2. I would like to compute the *spatial* correlation length (correlation coefficient as a function of distance) within a single instance of the vector field. Here is why: I am trying to demonstrate that I can control how these cells behave, and I already can cleanly show this with order parameters and pretty plots, but correlation length is a standard benchmark in the swarm community. So, I would like to be able to say that I have motion/orientations that are coordinated across a distance of L. This is very similar to an Nth nearest neighbor correlation. The best image I could find for the general equation is here: http://patentimages.storage.googleapis.com/WO2011047033A2/imgf000021_0001.png

3. Here is my understanding of how to do this conceptually: go to an element and take the dot product of that vector with every other element in the matrix *that is ~a certain distance 'R' away*. Repeat this operation for every element in the system, average the products, normalize, and then increment the radius so that we are now multiplying a given element by the elements that are ~'R+dR' away. In other words, I am correlating a given element against the elements at an increasing radius away from it, over the whole system. The normalization is similar to what is shown in the link above. (A brute-force sketch of this procedure is included after the code below.)

4. Here is how I have tried to do it:

I have tried several methods, such as element-by-element looping (seems to work, but is a nightmare) and k-nearest-neighbors (also seems to work, but it doesn't look like it will scale well beyond a few elements/radii, and I can't parse the output cell array into something I can use for multiple search points at once).

What I'd prefer to use, for speed and 'simplicity', is a cross-correlation method that works directly on the data matrices, but I'm clearly missing some things.

To this end,

%I load a bunch of data in here and process it
% u is a 100 x 100 matrix of x-velocity components
% ur = u-mean2(u); %=residuals of u

C = zeros(1,maxN); %preallocate the output
for NN = 1:maxN

wind=double(bwmorph(imcircle(2*NN+1),'remove')); %Create ring-shaped window matrix (imcircle is a third-party helper from the File Exchange, not a built-in)
% wind is (2*NN+1) x (2*NN+1) and contains all zeros except for ones on a ring of radius NN
%Ex: for the 1st nearest neighbor, wind =...
% 1 1 1
% 1 0 1
% 1 1 1

localprod=conv2(ur,wind,'same'); %'same' keeps the output the size of ur so it lines up element-wise with the next line; since wind is symmetric, this convolution equals a correlation
corrprod = ur.*localprod; %each element times the sum of its ring neighbors
upstairs = mean2(corrprod); %I think that's the definition of the operation I want
downstairs = mean2(ur.*ur)*nnz(wind); %normalize by the zero-distance value times the number of ring elements (edge effects from the zero-padding are ignored) -- still not 100% sure this is right
C(NN) = upstairs/downstairs; %output CorrCoeff(distance)

end
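
For completeness, here is roughly what I mean by the brute-force, element-by-element version of point 3. This is only a sketch under assumptions I'm making up for illustration: no NaNs, unit grid spacing, vr being the mean-subtracted y-component to go with ur, and maxN the largest radius of interest. It's slow, which is exactly why I'd like the convolution approach above to work instead.

[ny, nx] = size(ur);
[X, Y] = meshgrid(1:nx, 1:ny);
Cnum = zeros(1, maxN); %sum of dot products per distance bin
Ccnt = zeros(1, maxN); %number of pairs per distance bin
for ii = 1:numel(ur)
    d = round(hypot(X - X(ii), Y - Y(ii))); %distance of every element from element ii, binned to integers
    for R = 1:maxN
        sel = (d == R);
        Cnum(R) = Cnum(R) + sum(ur(ii)*ur(sel) + vr(ii)*vr(sel));
        Ccnt(R) = Ccnt(R) + nnz(sel);
    end
end
Cbrute = (Cnum ./ Ccnt) / mean2(ur.^2 + vr.^2); %normalize by the zero-distance value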


%Thanks!

%D

Sebastian

Nov 21, 2013, 4:28:11 PM
Hi D, Hi everybody else!
Unfortunately I cannot provide a solution to your problem, but I hope my reply leads other people to look at this question again.
I am facing the exact same problem (and I am a MATLAB beginner). I suppose you derived these vector fields from a PIV (particle image velocimetry) evaluation of your cell system?

If anyone could provide a solution or maybe just some hints on how to tackle this problem, that would really be appreciated!

Thanks in advance.

Fred

Nov 21, 2013, 6:34:14 PM
I think part of the problem is that there are many methods for doing this and no two people do it the same way.

The method I will probably wind up using is to autocorrelate the velocity field itself, treated as a heatmap (a 2D image encoding the spatial information). I think this needs to be done for the V_x and V_y components separately. Once you have the 2D autocorrelation map, you can take a profile plot through it to get a sense of the correlation length in pixels, and that can be converted to a distance in whatever units you have. Maybe this isn't the 'right' way to do it, but it seems like it should work conceptually.
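
Roughly, what I have in mind looks like this. Just a sketch, assuming a NaN-free 100 x 100 x-velocity matrix called Vx (a made-up name); Vy would be handled the same way.

Vxr = Vx - mean(Vx(:));            %remove the mean so the correlation decays toward 0
nOv = xcorr2(ones(size(Vxr)));     %number of overlapping grid points at each lag
Ax  = xcorr2(Vxr) ./ nOv;          %mean product per lag (avoids the fall-off caused by shrinking overlap)
ctr = (size(Ax,1) + 1)/2;          %zero-lag index (Ax is 199 x 199 for a 100 x 100 input)
Ax  = Ax / Ax(ctr, ctr);           %normalize so the zero-lag value is 1
profX = Ax(ctr, ctr:end);          %correlation vs. lag along +x; e.g. plot(0:numel(profX)-1, profX)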

Thoughts, anyone?

Thanks!

Sebastian Forester

Nov 22, 2013, 3:15:18 AM
Dear Fred,

thank you for your reply.
I think what you suggest might work to get an estimate of the correlation, if I understood correctly. Unfortunately I need a more 'robust' value that takes into account all the data I have.

Maybe I need to clarify what I am looking for: I have data of a group of cells (similar to what 'D' mentioned above) and did a particle image velocimetry (PIV) analysis of it. This gives you the x- and y-components of the velocity (by which the cells moved from one image to the next in the sequence) on a grid. The position in the matrix represents the spatial coordinate.
Now I need to show that neighboring velocities are correlated over a certain distance (which would prove that the cell movement is also correlated over a certain distance). Therefore I should basically correlate every value of the velocity matrix with every other value in the matrix and keep track of the distance between the two values, so that I can then get a graph of correlation value versus distance.

Basically I need to do something like Correlation(R) = \sum_r (velocity(r) * velocity(r+R)), where R is the distance between the two velocities being compared and the sum runs over every entry of the matrix. (This will of course have to be normalized somehow afterwards, but that is an easier problem to solve once I know how to solve the correlation problem.)

I think this is possible to do with large for loops, comparing entry (1,1) to all other entries, then entry (1,2) to all other entries, and so on. This seems to be a rather 'brute force' approach, though, and with the amount of data I have it might take a while.

I hope I am making myself clear?

Again, any help would be appreciated.

Thanks!


"Fred" wrote in message <l6m59m$lcb$1...@newscl01ah.mathworks.com>...

Fred

Nov 22, 2013, 4:16:05 AM
Maybe someone else can advise (please!), but my understanding is that this is what the autocorrelation of the velocity field does. I, too, have PIV data. You split it into X and Y components and analyze each separately. You should get a 2D image that you can then use to calculate spatial correlations.
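
In case it helps, going from the 2D autocorrelation map to the correlation-versus-distance curve you wrote down could look roughly like this. It is only a sketch (I have not checked it against my own data), and it assumes a NaN-free, square matrix V holding one velocity component.

Vr   = V - mean(V(:));                     %subtract the mean first
nOv  = xcorr2(ones(size(Vr)));             %how many grid points overlap at each lag
A    = xcorr2(Vr) ./ nOv;                  %mean product at each (dx,dy) lag
ctr  = (size(A,1) + 1)/2;                  %index of the zero-lag point for a square input
[cc, rr] = meshgrid(1:size(A,2), 1:size(A,1));
dist = round(hypot(cc - ctr, rr - ctr));   %integer lag distance of every entry in A
maxR = ctr - 1;
keep = dist <= maxR;
C    = accumarray(dist(keep) + 1, A(keep), [maxR + 1, 1], @mean); %average over rings of equal distance
C    = C / C(1);                           %C(k+1) is the correlation at distance k; plot(0:maxR, C)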

Alternatively, look up variogram methods (they are very similar). There are nice MATLAB implementations from users on the forums, but I have had issues getting them to work with my PIV data due to some of the assumptions they make. YMMV.

Sebastian Forester

Nov 23, 2013, 9:08:19 AM
Dear Fred,

thank you very much for your constant feedback. I think you are right about the autocorrelation. I guess I might use the xcorr2 function for what I want. Just have to figure out what it does exactly and how it works.

In the meantime I tried to solve the problem with several nested for loops, but that takes an insane amount of time (it would take days of calculation for the amount of data I have).

If I figure out a solution that works, I will come back. In the meantime I still welcome any comments or suggestions on how to approach the problem maybe in another way (in case the xcorr2 is not what I need).

Best wishes!


"Fred" wrote in message <l6n7cl$dcc$1...@newscl01ah.mathworks.com>...

Sebastian Forester

Nov 23, 2013, 10:05:16 AM
I spent some time with the xcorr2 function, and I think it is what I need. I am not completely sure yet, because I couldn't test it with my data, since the data contains NaNs at the various positions where no data was acquired.
The problem is that all the solutions I found on the web recommend replacing the NaNs with zeros, but this will certainly alter the correlation.
First, because during the calculation of the correlation the mean of the matrix is computed and subtracted from every data point (which turns all the zeros into zero minus the mean, and at the same time pulls the mean itself down significantly, because all those zeros are included in its calculation).
Second, entries that were previously NaN will now contribute to the correlation (which they should not, since there is no data at those positions).

Does someone have a simple solution for that problem, like using 'nanmean' instead of 'mean', where NaNs are simply ignored? That would be REALLY helpful.
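
To make it concrete, the kind of 'nanmean-style' treatment I imagine (but have not verified, and I do not know whether it is statistically sound) would be, for one velocity component v: set the NaNs to zero, but also cross-correlate a validity mask, so the missing entries neither contribute nor get counted.

vr  = v - nanmean(v(:));           %remove the mean using only the valid entries
ok  = ~isnan(vr);                  %validity mask
vr(~ok) = 0;                       %zeros contribute nothing to the sums below
num = xcorr2(vr);                  %sum of products over pairs where both entries were valid
cnt = xcorr2(double(ok));          %number of such valid pairs at each lag
A   = num ./ max(cnt, 1);          %mean product per lag; the NaN positions are effectively ignored
[m, n] = size(v);
A   = A / A(m, n);                 %the zero-lag entry of the xcorr2 output sits at (m, n)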

Best wishes!


"Sebastian Forester" wrote in message <l6qcsj$79v$1...@newscl01ah.mathworks.com>...

dpb

Nov 23, 2013, 10:20:47 AM
On 11/23/2013 9:05 AM, Sebastian Forester wrote:
> I spent some time with the xcorr2 function. And I think it is what i
> need. I am not completely sure yet, because I couldn't test it with my
> data, since the data contains NaNs at various positions where no data
> was acquired.
...

> Does someone have a simple solution for that problem? Like you can do
> with "nanmean" instead of "mean" where NaN are simply ignored. That
> would be REALLY helpful.
...

Not at a convenient point to test, but I'd first see if the ML
implementation ends up propagating NaNs everywhere or just in the local
area(s). If the latter, you've probably got the best solution other
than interpolating to fill in the missing values first or simply
removing those positions entirely.
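
Something like this would tell you quickly (untested sketch):

A = rand(10); A(4,7) = NaN;   %small test field with a single NaN
spy(isnan(xcorr2(A)))         %any lag whose overlap window touches the NaN comes out NaN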

--

Sebastian Forester

Nov 23, 2013, 10:39:05 AM
Hi dpb,

thanks for your reply.
In my case the xcorr2 function, in combination with my data, ends up propagating NaNs all over the result matrix.
The problem is that the original data comes from a biology experiment. It is a PIV evaluation of some cell clusters growing over time. So I only have data points at some positions (say, in the middle of my field of view, and thus also in the center of the matrix I want to process). Since there are no cells moving at the other positions, there should be no data points there at all, and I cannot interpolate anything there. Also, the shape of the object (i.e. the cell cluster) is amorphous (roughly circular), so it is also not possible to simply remove the values that belong to positions where no cells are present.

So I am really stuck here at the moment.

Thanks for any further input, advice or suggestions!


dpb <no...@non.net> wrote in message <l6qh4f$h5i$1...@speranza.aioe.org>...

Fred

Nov 24, 2013, 9:16:17 PM
My autocorrelation method is also based on using xcorr2, so I think we are doing the same thing.

I had the same NaN issue you have with migrating cells, and there are a lot of ways to handle it. What I normally do is just interpolate the NaNs out. Most PIV software packages do this anyway (I used the excellent PIVLab package for my data), or you can just do a nearest-neighbor interpolation. I think interpolation is often the fairest method, because many of my PIV NaNs are just artifacts in the middle of uniform migration fields, so the interpolation restores the actual behavior of the cells.
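
If it helps, a minimal nearest-neighbor fill before calling xcorr2 could look like this. It is only a sketch with made-up variable names (Vx = one velocity component containing NaNs), and in your case you would probably only want to fill NaNs inside the cell cluster rather than in the empty region around it.

[rows, cols] = size(Vx);
[Xg, Yg] = meshgrid(1:cols, 1:rows);
ok = ~isnan(Vx);
F  = scatteredInterpolant(Xg(ok), Yg(ok), Vx(ok), 'nearest'); %R2013a or newer; TriScatteredInterp on older releases
Vxfilled = Vx;
Vxfilled(~ok) = F(Xg(~ok), Yg(~ok));                          %replace only the NaN positions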

Does that work for you?

How has the xcorr2 implementation worked?

Emanuele Martini

Jun 25, 2015, 4:19:08 AM
Hi to all,
I am facing the same problem, and in the end I am using xcorr2.

But I still have some doubts: how do you normalize the data?
And how do you extract a profile vector from the xcorr2 output?
I use nanmean along the vertical and horizontal dimensions, so in the end I have four correlation plots.
vy and vx are the y and x components coming from the PIV analysis, minus their respective means.
And I use some interpolation methods to deal with the NaNs beforehand.

VY_AUTO = xcorr2(vy);
VY_AUTOalong1 = nanmean(VY_AUTO,1); VY_AUTOalong2 = nanmean(VY_AUTO,2); %profiles of the y-component autocorrelation
VX_AUTO = xcorr2(vx);
VX_AUTOalong1 = nanmean(VX_AUTO,1); VX_AUTOalong2 = nanmean(VX_AUTO,2); %same for the x component
Is this a correct approach? I think so, because I tried some comparisons with the openpiv spatialbox toolbox.

I am also trying xcorr2 on a 'fake' velocity field with the value 1 everywhere.
I expect a flat autocorrelation, because the field is the same everywhere, but that is not what I get.
Is there something I am getting wrong?
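
I wonder if the missing piece is dividing by the number of overlapping grid points at each lag (just a guess on my side, sketched below):

fake = ones(50);                   %constant 'velocity' field
raw  = xcorr2(fake);               %pyramid-shaped, because fewer points overlap at larger lags
nOv  = xcorr2(ones(size(fake)));   %overlap count at each lag (the same pyramid)
flat = raw ./ nOv;                 %all ones, i.e. the flat result I expected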

Thank you,
Emanuele