malformed plink files


Maryiam Shöâeè

Apr 27, 2017, 12:39:18 PM
to Alzheimer's Disease Neuroimaging Initiative (ADNI) Data
Hi ADNI team members


I converted the WGS (GATK-called) vcf.tar files available online to vcf.gz files and then to PLINK files, and started the usual QC. I had to halt the process when running --check-sex in plink generated 430 errors. The only time I have had such issues before was with malformed PLINK files. I am repeating the process to regenerate the PLINK files, hoping that will solve the problem.

 
This is what I have done so far: 

1) Extract the archives with tar.
2) Compress the VCF files with the tabix tools and produce the .tbi index files.
3) Merge the files into a single VCF file. (Interestingly, when I tried to produce a vcf.gz and the corresponding .tbi file for the merged data, there were many errors, so I decided to proceed with the uncompressed VCF file.)
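
For reference, steps 2 and 3 could be sketched roughly as below. This is only a sketch under assumptions, not necessarily what I ran: it assumes the archives have already been extracted (for example with tar -xf), that the per-chromosome files share the same samples and are passed in chromosome order, and that "merge" means bcftools concat. The file names and the run/compress_index_concat helpers are hypothetical. Setting DRY_RUN=1 prints the commands without executing them.

```shell
# Print each command; execute it only when DRY_RUN is not 1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then "$@"; fi
}

# Compress and index each VCF, then concatenate them into one file.
# Arguments: output file name, then the input .vcf files in chromosome order.
compress_index_concat() {
  out="$1"; shift
  for f in "$@"; do
    run bgzip -f "$f"          # writes "$f.gz" and removes "$f"
    run tabix -p vcf "$f.gz"   # writes "$f.gz.tbi"
  done
  # Replace each argument with its ".gz" counterpart.
  for f in "$@"; do
    set -- "$@" "$f.gz"; shift
  done
  run bcftools concat -Oz -o "$out" "$@"
  run tabix -p vcf "$out"
}
```

A dry run such as DRY_RUN=1 followed by compress_index_concat merged.vcf.gz chr1.vcf chr2.vcf lists the six commands that would be executed, which makes it easier to see at which stage errors first appear.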

4) Using bcftools, I assign a different ID to multiallelic SNPs and indels:


bcftools norm -Ou -m -any concatADNI.vcf |
bcftools norm -Ou -f human_g1k_v37.fasta |
bcftools annotate -Ob -x ID -I +'%CHROM:%POS:%REF:%ALT' |
plink --bcf /dev/stdin --keep-allele-order --vcf-idspace-to _ --const-fid --allow-extra-chr 0 --split-x b37 no-fail --make-bed --out output
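
One basic way to test whether the resulting fileset is structurally malformed is to compare the size of the .bed file with what the .bim and .fam files imply: a variant-major .bed file has 3 header bytes followed by ceil(samples / 4) bytes per variant. The sketch below assumes the "output" prefix from the command above; check_bed_size is a hypothetical helper name. A matching size only rules out truncation, it does not by itself explain the --check-sex errors.

```shell
# Check that PREFIX.bed has the size implied by PREFIX.bim and PREFIX.fam.
# Expected size: 3 header bytes + variants * ceil(samples / 4) bytes.
check_bed_size() {
  prefix="$1"
  n_samples=$(( $(wc -l < "$prefix.fam") ))    # one line per sample
  n_variants=$(( $(wc -l < "$prefix.bim") ))   # one line per variant
  expected=$(( 3 + n_variants * ((n_samples + 3) / 4) ))
  actual=$(( $(wc -c < "$prefix.bed") ))
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $prefix.bed is $actual bytes"
  else
    echo "MISMATCH: $prefix.bed is $actual bytes, expected $expected"
    return 1
  fi
}
```

Running check_bed_size output after the conversion reports either OK or MISMATCH and returns a non-zero status on a mismatch.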


So my question is this: what do you recommend for converting the GATK-called VCF files to PLINK format, if not the above? I am repeating each stage in the hope of resolving the issues, but I doubt it will work, so it would be good to know what the recommended alternative is.


Thanks in advance

Maryiam
 

Danielle J Harvey

Apr 27, 2017, 2:29:17 PM
to adni...@googlegroups.com

Hi Maryiam,

 

I don’t work with the genetics data, so I can’t help you. If you do not get a response from this group, I’d recommend using the Ask the Experts tool on the ADNI website and directing your question to the Genetics Core (http://adni.loni.usc.edu/support/experts-knowledge-base/ask-experts/) to see if they can answer it.

 

Danielle


Maryiam Shöâeè

Apr 28, 2017, 10:45:02 AM
to Alzheimer's Disease Neuroimaging Initiative (ADNI) Data
Thanks, I will do.