HapMap3 SNPs


Anil Ori

Apr 1, 2016, 7:04:27 PM
to ldsc_users
Hi, 

Quick question on the recommended hm3 SNP list: are these the released consensus (across populations) polymorphic SNPs after QC, or are they specific to CEU?
I am running analyses on the EAS population and want to make sure I am using the correct SNPs. 

Thanks!

Anil

Raymond Walters

Apr 4, 2016, 11:04:41 AM
to Anil Ori, ldsc_users
Hi Anil,
The hm3 list is not Europe-specific. As long as you are using LD scores from the EAS population they should be fine.
Cheers,
Raymond




--
You received this message because you are subscribed to the Google Groups "ldsc_users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ldsc_users+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ldsc_users/201d5c85-1d19-41c1-9b05-54b9bfd20654%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Anil Ori

Apr 4, 2016, 2:37:19 PM
to ldsc_users, anil.p...@gmail.com
Got it, thank you.

Regarding the EAS LD scores: I would like to partition h2 and thus need annotation files specific to EAS. Do you by any chance have the .annot.gz files for the baseline categories for the EAS population? Or perhaps a file with the locations/windows for these 53 baseline categories that would help me create the .annot files for EAS?

Best, Anil

On Monday, April 4, 2016 at 08:04:41 UTC-7, Raymond Walters wrote:

Raymond Walters

Apr 4, 2016, 5:10:43 PM
to Anil Ori, ldsc_users
Hi Anil,
To my knowledge we don’t have pre-computed partitioned LD scores available for EAS, but Hilary might know more. The .annot.gz files indicating the baseline annotations should be available at https://data.broadinstitute.org/alkesgroup/LDSCORE/ in baseline_ldscores.tgz.
Cheers,
Raymond
 

hil...@mit.edu

Apr 5, 2016, 8:51:07 PM
to ldsc_users, anil.p...@gmail.com
Right, we don't have partitioned LD scores available for EAS. 

Best,

Hilary

Masahiro Kanai

Apr 5, 2016, 9:25:51 PM
to ldsc_users, hil...@mit.edu, anil.p...@gmail.com
Hi,

We've also been trying to generate partitioned LD scores for EAS, which we'd like to share with the community when finished. We partially succeeded by extracting annotations for common SNPs of EUR and EAS, but the data is somewhat incomplete and is still undergoing validation.

While generating them, we encountered the issues described below. Could you help us create a complete reference?

1) Are additional annotations available for the complete set of 1KGP SNPs?
Since the provided .annot.gz files only contain SNPs for EUR, several EAS SNPs are missing from them. Do you have any additional annotations/source data for such extra SNPs?


2) Which threshold was applied for the reference panel generation, MAC > 5 or MAF > 5%?
According to the paper, MAF > 5% was adopted (page 9). However, the files retrieved from here (https://data.broadinstitute.org/alkesgroup/LDSCORE/) suggest MAC > 5 (they are named "1000G.mac5eur.*"). We counted the number of variants, and it seems MAC > 5 was indeed applied.
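
For the first issue, the EAS SNPs that have no row in the EUR annotation can be listed by comparing SNP IDs. The following is only a minimal sketch: it assumes the standard PLINK .bim layout (variant ID in the second column) and an .annot header that contains a column named SNP, and the toy rows are made up for illustration.

```python
def snp_ids_from_bim(lines):
    """Collect SNP IDs from PLINK .bim lines (the second column is the variant ID)."""
    return {line.split()[1] for line in lines if line.strip()}


def snp_ids_from_annot(lines):
    """Collect SNP IDs from .annot lines, locating the SNP column via the header.

    Assumption: the header row contains a column literally named "SNP".
    """
    rows = [line.split() for line in lines if line.strip()]
    snp_col = rows[0].index("SNP")
    return {row[snp_col] for row in rows[1:]}


# Toy inputs, invented for illustration (not real 1000 Genomes data).
eas_bim = [
    "22 rsA 0 16050000 A G",
    "22 rsB 0 16051000 C T",
    "22 rsC 0 16052000 G A",
]
eur_annot = [
    "CHR BP SNP CM base",
    "22 16050000 rsA 0 1",
    "22 16052000 rsC 0 1",
]

# SNPs present in the EAS panel but absent from the EUR annotation.
missing = sorted(snp_ids_from_bim(eas_bim) - snp_ids_from_annot(eur_annot))
print(missing)
```

With real files, the same functions could be fed from `open(...)` for the .bim and `gzip.open(..., "rt")` for the .annot.gz.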


Best Regards,
Masa





-- 
Masahiro KANAI
| Laboratory for Statistical Analysis
| RIKEN Center for Integrative Medical Sciences
 

Hilary Finucane

Apr 12, 2016, 5:25:43 PM
to Masahiro Kanai, ldsc_users, anil.p...@gmail.com
Hi Masa,

1. I have put the hg19 bed files in our public data directory http://data.broadinstitute.org/alkesgroup/LDSCORE/baseline_bedfiles.tgz. Hopefully this will solve your problem.

2. For the reference panel MAC>5 was applied. The results reported in our paper, and the default in our software, partition the heritability explained by SNPs with MAF>5%. We include SNPs with MAC>5 but MAF<5% in the reference panel (but not in the set of SNPs we are trying to draw confident conclusions about) because LD to these SNPs is important. This is discussed in the "Choice of regression SNPs and reference SNPs" section of the Supplement.

I hope this helps,

Hilary
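
To make the two thresholds in point 2 concrete, here is a minimal sketch of how a single SNP would fall into each set. It is not LDSC's own code: the function names, the diploid biallelic assumption, and the sample size of 500 are all hypothetical choices for illustration.

```python
def minor_allele_stats(alt_count, n_samples):
    """Return (MAC, MAF) for a biallelic SNP, assuming diploid samples."""
    n_alleles = 2 * n_samples
    mac = min(alt_count, n_alleles - alt_count)
    return mac, mac / n_alleles


def snp_sets(alt_count, n_samples, mac_min=5, maf_min=0.05):
    """Apply the two thresholds from the thread: MAC > 5 and MAF > 5%."""
    mac, maf = minor_allele_stats(alt_count, n_samples)
    return {
        "in_reference_panel": mac > mac_min,    # contributes LD to other SNPs
        "in_partitioned_set": maf > maf_min,    # heritability is partitioned over these
    }


n = 500  # hypothetical panel size
print(snp_sets(3, n))    # too rare for either set
print(snp_sets(20, n))   # MAC = 20, MAF = 2%: reference panel only
print(snp_sets(200, n))  # MAF = 20%: both sets
```

The middle case is the one described above: a SNP with MAC>5 but MAF<5% stays in the reference panel without being in the set that conclusions are drawn about.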

Masahiro Kanai

Apr 14, 2016, 2:31:47 AM
to Hilary Finucane, ldsc_users, Anil Ori
Hi Hilary,

1. It definitely helps! I really appreciate it.

I just wrote a simple script to generate annot files from your bed files. The repo is not yet complete, but I will upload the generated ldscores after validation.

To confirm my approach, I checked the output baseline annotation for chr22 of EUR, using the same .bim (1000G.mac5eur.22.bim) and .bed provided.
However, I found several discrepancies in specific categories: Enhancer_Hoffman, Enhancer_Hoffman.extend.500, UTR_3_UCSC, and UTR_5_UCSC. I manually investigated those SNPs and found that they are basically outside the specified intervals (see the example below).

e.g. rs2379981(22:17030792) for 'Enhancer_Hoffman' category

baseline.22.annot.gz (provided):1013,col19 --- '1'
(mine) --- '0'

Enhancer_Hoffman.bed:115226-115227
>>>
chr22 16928200 16928543
chr22 17082427 17082622
>>>

Since rs2379981 seems to be located outside the intervals, should it be '0'? Did you have any other criteria for inclusion/exclusion, in addition to the intervals specified in the bed files? I attached my baseline annotation for chr22 of EUR just for the record.
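
The interval check for this SNP can be written out as a minimal sketch. This is not the script from the repo mentioned above; it only assumes the usual BED convention (0-based start, exclusive end) and a 1-based SNP position as in the .bim file, and the function name is hypothetical.

```python
def in_bed_intervals(pos, intervals):
    """True if a 1-based SNP position lies in any 0-based, half-open BED interval.

    A BED interval (start, end) covers 1-based positions start+1 .. end.
    """
    return any(start < pos <= end for start, end in intervals)


# The two neighbouring Enhancer_Hoffman.bed intervals quoted in the example.
enhancer_hoffman = [(16928200, 16928543), (17082427, 17082622)]

# rs2379981 at 22:17030792 lies between the two intervals.
annotation = int(in_bed_intervals(17030792, enhancer_hoffman))
print(annotation)  # 0
```

Under these assumptions the SNP gets '0', matching the regenerated annotation rather than the provided one.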

2. Thank you for the explanation. Just for clarity, is the following relationship correct? Reference SNPs: MAC > 5; regression SNPs: MAF > 5%; output SNPs for ldscore: HapMap 3 SNPs (w/ the --print-snps option).

Then, to appropriately calculate baseline ldscores, the command below is sufficient, right?
```
# --bfile:      EAS reference SNPs
# --annot:      new EAS annotation
# --print-snps: EAS HapMap3 SNPs
python ldsc.py \
    --l2 \
    --bfile 1000G.mac5eas.22 \
    --ld-wind-cm 1 \
    --annot baseline.22.annot.gz \
    --out baseline.22 \
    --print-snps eas_hm.22.snp
```
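
If the chr22 call is right, repeating it for every autosome could be scripted roughly as follows. This is only a sketch: it reuses the flags from the command above unchanged, and it assumes the files for the other chromosomes follow the same naming pattern as the chr22 ones. It prints the commands rather than running them.

```python
import shlex


def ldsc_l2_command(chrom):
    """Build the per-chromosome LD score command as an argument list.

    Assumption: every chromosome's files are named like the chr22 examples.
    """
    return [
        "python", "ldsc.py",
        "--l2",
        "--bfile", f"1000G.mac5eas.{chrom}",       # EAS reference SNPs
        "--ld-wind-cm", "1",
        "--annot", f"baseline.{chrom}.annot.gz",   # new EAS annotation
        "--out", f"baseline.{chrom}",
        "--print-snps", f"eas_hm.{chrom}.snp",     # EAS HapMap3 SNPs
    ]


commands = [ldsc_l2_command(chrom) for chrom in range(1, 23)]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the printed commands look correct.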

Sorry for the many questions, but I'd like to make sure I generate appropriate ldscores for EAS. I really appreciate your support.

Masa
mkanai_baseline.22.annot.gz