Hi,
LDSC is a great tool and is very useful in this era of GWAS, where many large-scale GWAS have already been published.
I am interested in estimating heritability and genetic correlation for two quantitative phenotypes from their summary statistics. For comparison's sake, I used a study that has imputed data and GWAS results for two quantitative traits.
I used the high-quality markers that remain after running "munge_sumstats.py" to estimate the genetic correlation with "ldsc.py". The commands I used are shown below.
python munge_sumstats.py --sumstats file1.input.gz --N 1397 --out file1 --merge-alleles w_hm3.snplist
ldsc.py \
--rg file1.sumstats.gz,file1.sumstats.gz \
--ref-ld-chr eur_w_ld_chr/ \
--w-ld-chr eur_w_ld_chr/ \
--out checking
Then I took the same set of markers that were used for estimating the correlation and estimated heritability using GCTA. This was just to make sure the two analyses are comparable. However, I find that the heritability estimates for the two phenotypes are much higher with LDSC than with GCTA. The GCTA results using whole-genome data and using the selected list of SNPs are almost the same, so I do not think the SNP list makes such a huge difference.
|            | GCTA        | LDSC           |
| phenotype1 | 0.42 (0.23) | 0.759 (0.3352) |
| phenotype2 | 0.33 (0.23) | 0.681 (0.3361) |
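As a rough check on the numbers above, here is a minimal Python sketch that computes a z-score for the difference between the two estimates of each phenotype. It rests on two assumptions that are not established above: that the values in parentheses are standard errors, and that the GCTA and LDSC estimates can be treated as independent. The second is not strictly true, since both come from the same sample, so this is only an approximate guide. The names `estimates` and `z_difference` are made up for this illustration.

```python
import math

# (estimate, standard error) pairs copied from the table above.
# Reading the parenthesised values as standard errors is an assumption.
estimates = {
    "phenotype1": {"GCTA": (0.42, 0.23), "LDSC": (0.759, 0.3352)},
    "phenotype2": {"GCTA": (0.33, 0.23), "LDSC": (0.681, 0.3361)},
}


def z_difference(a, b):
    """z-score for (b - a), treating the two estimates as independent."""
    (est_a, se_a), (est_b, se_b) = a, b
    return (est_b - est_a) / math.sqrt(se_a ** 2 + se_b ** 2)


z_scores = {
    name: z_difference(values["GCTA"], values["LDSC"])
    for name, values in estimates.items()
}

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
```

Under those assumptions, both z-scores are below 1, so with standard errors this large the two sets of estimates are not clearly distinguishable, even though the point estimates look very different.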
Do you have any comments on why the heritability estimates differ so much between the two programs?
With best regards,
Ganesh Chauhan
INSERM, FRANCE