Conversion from .idat to .vcf


Sintia Belangero

Sep 22, 2022, 4:03:08 PM
to plink2...@googlegroups.com
Dear PLINK-users,
Is there any way to convert .idat files to .vcf without going through GenomeStudio? The only format PLINK accepts as an input file is .vcf, correct?
Thank you.
Sintia 

-----------------------------------
  Sintia I. Belangero
  Universidade Federal de São Paulo
  São Paulo-SP / Brazil




DAVID J Cutler

Sep 22, 2022, 4:37:16 PM
to Sintia Belangero, plink2...@googlegroups.com
Sintia,

An .idat file is an image of a chip, I believe. It takes a lot of very specific
Illumina analysis to turn an image into genotype calls, and there is no
conceivable way to do that without knowing exactly how Illumina designed the
chip. That sort of information isn't really available outside of Illumina
software.

Cheers,
dave
> --
> You received this message because you are subscribed to the Google Groups "plink2-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to plink2-users...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/plink2-users/CAPWBcKAy2EOA1EEsRE%3DTRbP6LKQmdDe6xZvxBS-0u%3Dq4x0svHA%40mail.gmail.com.

Kevin Esoh

Sep 22, 2022, 5:20:21 PM
to Sintia Belangero, plink2...@googlegroups.com
Hi Sintia,

Indeed, you will need the Illumina Array Analysis Platform Genotyping Command Line Interface (iaap-cli) to process IDAT (Illumina intensity) files.
Luckily, iaap-cli is freely available here: https://emea.support.illumina.com/downloads/iaap-genotyping-cli.html
All you need is an Illumina account to download it.

What you actually use in iaap-cli is the Illumina GenCall algorithm, which converts the IDAT files to GTC (the preferred intermediate format if you want a VCF at the end).
You then use the gtc2vcf tool to generate your VCF file.

All the steps are described in detail on the gtc2vcf GitHub page: https://github.com/freeseek/gtc2vcf
You will notice that gtc2vcf has been ported into bcftools as a plugin, which makes it quite simple to use.
You can download the most recent version of the tools from the Broad Institute site: https://software.broadinstitute.org/software/gtc2vcf/

NB: You will need the manifest (.bpm and .csv) and cluster (.egt) files that are specific to the chip used to generate your IDAT files.
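Since a missing manifest or cluster file is an easy way to lose time, here is a minimal pre-flight sketch. The function name check_array_inputs is hypothetical (it is not part of iaap-cli or gtc2vcf); it only checks that each file you pass exists and is non-empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of iaap-cli or gtc2vcf): check that every
# file named on the command line exists and is non-empty. Each problem is
# reported on stderr and the function returns non-zero if any file fails.
check_array_inputs() {
    local status=0 f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

You would call it with your own chip's files before starting, for example: check_array_inputs manifest/H3Africa_2017_20021485_A3.bpm manifest/H3Africa_2017_20021485_A3.csv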

Here are the sample commands I used to process my IDAT files after installing the tools:

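Convert IDAT to GTC
# gencall is a subcommand of the iaap-cli executable.
# Positional arguments: manifest (.bpm), cluster file (.egt), output directory (calls).
iaap-cli gencall \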
   manifest/H3Africa_2017_20021485_A3.bpm \
   manifest/GenomeStudio-H3Africa-array-clusters-HapMap2-186-samples.egt \
   calls \
   --idat-folder intensities \
   --output-gtc \
   --gender-estimate-call-rate-threshold 0.95 \
   --gender-estimate-x-het-rate-threshold 0.2
Here, intensities is the directory containing the .idat files, or the subdirectories that contain them.
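Before running GenCall it can help to confirm that the intensities directory is complete. The sketch below assumes the usual Illumina naming, where each sample has a *_Grn.idat and a *_Red.idat file; the function name list_unpaired_idats is hypothetical. It prints the path prefix of every sample that is missing one of the two channels.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search a directory (and its subdirectories) for IDAT
# files and print the path prefix of each sample that has only one of the
# two expected channel files (*_Grn.idat and *_Red.idat).
list_unpaired_idats() {
    local dir=$1 f base
    find "$dir" -type f \( -name '*_Grn.idat' -o -name '*_Red.idat' \) | sort |
    while IFS= read -r f; do
        case $f in
            *_Grn.idat)
                base=${f%_Grn.idat}
                [ -f "${base}_Red.idat" ] || echo "$base"
                ;;
            *_Red.idat)
                base=${f%_Red.idat}
                [ -f "${base}_Grn.idat" ] || echo "$base"
                ;;
        esac
    done
}
```

Running list_unpaired_idats intensities should print nothing when every sample has both files.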

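Convert GTC to VCF
# Pipeline: gtc2vcf writes the VCF, bcftools sort orders the records,
# bcftools norm checks REF alleles against the FASTA (-c x drops mismatches),
# tee saves the compressed VCF, and bcftools index writes the .tbi index.
# The vcf/ output directory must already exist.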
bcftools +gtc2vcf \
   --bpm manifest/H3Africa_2017_20021485_A3.bpm \
   --csv manifest/H3Africa_2017_20021485_A3.csv \
   --egt clusterFile/GenomeStudio-H3Africa-array-clusters-HapMap2-186-samples.egt \
   --gtcs aw2019.gtc.list \
   --fasta-ref ~/esoh/data/db/human_g1k_v37.fasta.gz \
   --extra calls/AW2019_genotype_stats.tsv | \
   bcftools sort -T ./bcftools-sort.XXXXXX | \
   bcftools norm \
   --threads 15 \
   --no-version \
   -Oz \
   -c x \
   -f ~/esoh/data/db/human_g1k_v37.fasta.gz | \
   tee vcf/AW2019.vcf.gz | \
   bcftools index \
   --threads 15 \
   -ft \
   --output vcf/AW2019.vcf.gz.tbi
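The command above reads the GTC paths from aw2019.gtc.list, and how that file was made is not shown. One way to build such a list is sketched below, assuming the list is simply one GTC path per line and that the GTC files sit under the GenCall output directory (calls). The function name make_gtc_list is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the path of every .gtc file found under a
# directory to a list file, one path per line, sorted for reproducibility.
# Returns non-zero if no GTC file was found (the list file is then empty).
make_gtc_list() {
    local gtc_dir=$1 list_file=$2
    find "$gtc_dir" -type f -name '*.gtc' | sort > "$list_file"
    [ -s "$list_file" ]
}
```

For the layout used here, that would be: make_gtc_list calls aw2019.gtc.list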
I hope this helps

Cheers,
Esoh



--
Kevin Esoh
SADaCC Research Fellow
GeneMAP Research Center
Division of Human Genetics
University of Cape Town
Cape Town, South Africa

Sintia Belangero

Sep 23, 2022, 12:24:43 PM
to Kevin Esoh, plink2...@googlegroups.com
Dave and Esoh,
This will be super useful and helpful!
Thank you very much.
My best 
--

-----------------------------------
  Sintia I. Belangero
  -Associate Professor, Department of Morphology and Genetics 
  -Coordinator of The Lab of Integrative Neuroscience (LINC)
  Universidade Federal de São Paulo