problem with Correlation function

PRAKASH SARKAR

Feb 12, 2015, 8:17:53 AM
to astroml...@googlegroups.com
Hi,
Can anyone help me estimate the two-point correlation function of 3D data? I have written the code and it runs, but the results do not seem to be correct. I wonder where I have made a mistake.
My input data contains Lambda, eta, x, y, z. I want to estimate the correlation function as a function of R. I have attached the code:

#!/usr/bin/env python3

from astroML.correlation import two_point
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
# X = np.random.random((50000, 3))
file1 = input('Enter file name : ')
file2 = input('Enter output file name : ')
inp1 = input('Enter range : ')
Nlimit = int(inp1)

# 50 bin edges between 0 and Nlimit define 49 bins
bins = np.linspace(0, Nlimit, 50)
print('No of bins are : ', len(bins) - 1)

# Each input line holds: Lambda, eta, x, y, z
with open(file1, 'rt') as fp:
    lines = fp.readlines()

lv = []
ev = []
xv = []
yv = []
zv = []

for line in lines:
    p = line.split()
    lv.append(float(p[0]))
    ev.append(float(p[1]))
    xv.append(float(p[2]))
    yv.append(float(p[3]))
    zv.append(float(p[4]))

# Stack the Cartesian coordinates into an (N, 3) array
X = np.array([xv, yv, zv]).T
print('shape of array is : ', X.shape)
# print(X)

# fig, axes = plt.subplots(1, 3, figsize=(12,4))

# axes[0].plot(xv, yv)
# axes[1].plot(xv, zv)

corr = two_point(X, bins)

# Bin centres, used for both the output file and the plot
xx = 0.5 * (bins[:-1] + bins[1:])

with open(file2, 'wt') as fp:
    for r, c in zip(xx, corr):
        fp.write('%f\t%f\n' % (r, c))
print(len(xx), len(corr))

plt.plot(xx, corr, color="blue", linewidth=1, ls='-', marker='o', markersize=2)
# axes[2].plot(xx, corr)
plt.show()


Jake Vanderplas

Feb 12, 2015, 12:14:37 PM
to PRAKASH SARKAR, astroml...@googlegroups.com
Hi,
It's hard to tell what's going wrong without some description of the data or more information about why you think the results are incorrect.

My guess is this: the correlation function results depend on comparison to a uniform random sample which matches any spatial selection effects in the data. If you don't provide that, the algorithm attempts to construct a random sample through shuffling of the data, but this does not work if your data has high skew, strong correlations, or complicated selection effects. In this case, it's better to construct a suitable uniform comparison sample by hand.
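
To make the role of the comparison sample concrete, here is a minimal NumPy-only sketch. It is not astroML's implementation. The helper names (`uniform_box_sample`, `pair_counts`, `natural_estimator`) are made up for illustration, and it assumes the survey volume is simply the bounding box of the data, which is unlikely to hold for a real footprint. It draws a uniform random sample and forms the simple DD/RR - 1 estimator by brute-force pair counting, so it is only practical for a few thousand points.

```python
import numpy as np


def uniform_box_sample(X, factor=2, seed=0):
    """Draw factor * len(X) points uniformly inside the bounding box of X.

    Assumption: the selection is the axis-aligned bounding box of the data.
    A real footprint or radial selection needs a matching mask instead.
    """
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    return rng.uniform(lo, hi, size=(factor * len(X), X.shape[1]))


def pair_counts(A, bins):
    """Histogram of separations over all unique pairs in A (brute force)."""
    diff = A[:, None, :] - A[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(A), k=1)
    return np.histogram(dist[i, j], bins=bins)[0].astype(float)


def natural_estimator(X, R, bins):
    """xi(r) = (DD / n_DD) / (RR / n_RR) - 1, with counts normalized per pair."""
    n_dd = len(X) * (len(X) - 1) / 2.0
    n_rr = len(R) * (len(R) - 1) / 2.0
    dd = pair_counts(X, bins) / n_dd
    rr = pair_counts(R, bins) / n_rr
    # Empty RR bins give inf or NaN rather than raising a warning.
    with np.errstate(divide="ignore", invalid="ignore"):
        return dd / rr - 1.0


# Toy data: unclustered points, so xi(r) should scatter around zero.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(300, 3))
R = uniform_box_sample(X, factor=2)
bins = np.linspace(0.1, 0.5, 5)
xi = natural_estimator(X, R, bins)
print(xi)
```

With astroML itself, a hand-built sample like `R` would be supplied through the `data_R` argument, for example `two_point(X, bins, data_R=R)`.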

Another possibility: if you are using a dataset which is on the smaller side, there may not be enough points to put a good constraint on the correlation function. In that case, doing a bootstrap (with ``bootstrap_two_point``) to estimate the error in the results might give you more insight into their meaning.
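
The resampling idea behind a bootstrap error can be sketched in plain NumPy. This is not ``bootstrap_two_point`` itself. `bootstrap_error` and `pair_fraction` are hypothetical helpers, and the statistic here is just the fraction of pairs per separation bin, standing in for whatever correlation estimator is actually used.

```python
import numpy as np


def bootstrap_error(stat_fn, X, n_boot=20, seed=0):
    """Resample rows of X with replacement and return (estimate, std error).

    stat_fn is any function mapping an (N, d) array to a 1D array,
    for example a wrapper around a correlation-function estimator.
    """
    rng = np.random.default_rng(seed)
    estimate = np.asarray(stat_fn(X), dtype=float)
    samples = np.empty((n_boot, estimate.size))
    for k in range(n_boot):
        idx = rng.integers(0, len(X), size=len(X))
        samples[k] = stat_fn(X[idx])
    # Masked std ignores bins that came out NaN or inf in some resamples.
    err = np.ma.masked_invalid(samples).std(axis=0, ddof=1)
    return estimate, np.asarray(err.filled(np.nan))


def pair_fraction(A, bins=np.linspace(0.0, 1.0, 6)):
    """Stand-in statistic: fraction of in-range pairs in each separation bin."""
    diff = A[:, None, :] - A[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(A), k=1)
    counts = np.histogram(dist[i, j], bins=bins)[0]
    return counts / counts.sum()


# Toy data: 200 uniform points in the unit cube.
rng = np.random.default_rng(2)
X = rng.uniform(size=(200, 3))
est, err = bootstrap_error(pair_fraction, X, n_boot=20)
for e, s in zip(est, err):
    print("%.4f +/- %.4f" % (e, s))
```

Bins whose error bar is comparable to the estimate itself are the ones where the sample is too small to say much.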

Hope that helps,
   Jake
