gnomAD 2.1.1

Konrad Karczewski

Aug 28, 2019, 12:40:22 PM
to exac_data_a...@googlegroups.com
Hi ExAC and gnomAD enthusiasts,

As many of you have probably already noticed, we've been busy over here preparing a number of exciting new features for you. Earlier this year, we released a new version of gnomAD (2.1.1), which has many new features and fixes (summary here), as well as new constraint metrics, including updated pLI scores and a more continuous version of the score, LOEUF (see flagship paper). Additionally, in this release, we have generated a dataset of structural variants, which is also available for download (more information here). This is the data freeze used within the gnomAD team for all analyses, and we recommend that all users of a previous version update to it, as it includes substantially improved QC and bug fixes (including critical updates to FAF and constraint). We will shortly release a minor update, 2.1.2, which will contain the same variants from the same underlying data, with fixes only to metadata and RSID mappings.

Perhaps most prominently, we've also deployed a shiny new browser, where you can now browse the current dataset, subsets of the data, the structural variant data, and the legacy ExAC data, all selectable from the dropdown box in the top right corner of the gene and variant pages. Since all the data can now be found there, we will soon be decommissioning the old ExAC browser. The new browser has a number of additional features, including age histograms for all variant carriers, subpopulation frequencies, improved region-based searches, the new constraint metrics, transcript expression data (including pext), and of course, the updated dataset.

We have also prepared a number of manuscripts that describe many aspects of the dataset: a flagship paper describing the callset and the genic tolerance to loss-of-function variation; one describing the structural variants in an overlapping set of genomes; a discussion of the implications of pLoF variants for the discovery and validation of therapeutic drug targets; and a case study to validate the safety of inhibition of LRRK2, a candidate therapeutic target for Parkinson’s disease. We also demonstrate the value of tissue expression data in the interpretation of genetic variation, and we characterize the impact of two understudied classes of human variation: multi-nucleotide variants and variants creating or disrupting open reading frames in the 5’UTR.

Finally, we've also released this dataset lifted over to GRCh38. Later this year, we will release a new callset of ~80K genomes natively aligned to GRCh38. We will send another update when this callset is released.

Happy exploring,
-Konrad