gnomAD v4.0 release

gnomAD Production Team

Nov 1, 2023, 2:25:38 PM

to gnomAD Production Team

Dear gnomAD enthusiasts,

Today, the gnomAD Production Team is proud to announce the release of gnomAD v4! The v4 data set includes data from 807,162 individuals, comprising 730,947 exomes and 76,215 genomes, all mapped to the GRCh38 reference sequence. This release is nearly 5x larger than the combined v2/v3 releases and adds global diversity, including nearly 138,000 individuals of non-European genetic ancestry (see our new stats page for more details).

Over 1 billion variants were analyzed with the following passing QC:

786,500,648 SNVs and 122,583,462 InDels available in the browser and in our downloads
1,199,117 genome structural variants
66,903 rare (< 1% site frequency) exome copy number variants
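
As a quick sanity check on the figures quoted above, the counts can be tallied in a few lines of Python. This is only an illustrative sketch: the numbers are hard-coded from this announcement, the constant and function names are our own, and nothing here queries the gnomAD browser or downloads.

```python
# Counts copied from this announcement (hard-coded; not fetched from gnomAD).
EXOMES = 730_947
GENOMES = 76_215
SNVS = 786_500_648
INDELS = 122_583_462


def total_individuals() -> int:
    """Exomes plus genomes, which should match the stated number of individuals."""
    return EXOMES + GENOMES


def total_short_variants() -> int:
    """SNVs plus indels that passed QC (structural variants and CNVs excluded)."""
    return SNVS + INDELS


if __name__ == "__main__":
    print(f"Individuals: {total_individuals():,}")
    print(f"Passing short variants: {total_short_variants():,}")
```

The exome and genome counts sum to the 807,162 individuals stated above, and the passing SNVs and indels together come to roughly 909 million short variants.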

The v4 release is a minimum viable product (MVP) release, which allows us to get the most critical piece of the v4 gnomAD dataset to our users as soon as possible. It also means that a few features found in v2 or v3 are not yet included in v4 but will be coming soon. These include updated gene constraint, pext scores, sub-genetic ancestry groups, variant co-occurrence, and regional missense constraint. For a full list of features not included in the v4 MVP, and where to continue accessing them in past versions, please see this FAQ. We are also excited to have added a few new features, including combined exome and genome filtering allele frequencies, structural variants in GRCh38, and VRS annotations based on GA4GH standards in our downloadable files.

Further details regarding the production of the v4 release are described in the following blog posts:

We are also happy to announce the launch of the gnomAD forum. This will be a place for our users to help each other, discuss the data, and ask questions, with support from our team. We look forward to seeing you all there. Keep in mind that we are a small team and it can take us some time to respond when we have a high volume of help requests, so we are hoping this forum will enable our users to assist each other.

We hope this new dataset will prove useful to our users, and we welcome feedback on the latest release at our email address: gno...@broadinstitute.org.

Acknowledgements

We wish to acknowledge the extraordinary efforts that made this release possible, beginning with the gnomAD Consortium of 308 data contributors, whose willingness to share data is the cornerstone of all our efforts.

The gnomAD v4 launch team spans multiple groups: 46 individuals across our production, browser, and operations teams, with oversight from our steering committee, as well as individuals from the Broad Data Sciences Platform, the Hail team, the Talkowski lab, and Broad compliance. gnomAD would not be possible without tremendous effort from each team, collaborating extensively to ensure the success of the project. Please join us in acknowledging the gnomAD v4 release team:

MPG Genomic Data and Metadata Collection: Sam Baxter, Christine Stevens, Sinéad Chapman, Caroline Cusick, Sam Bryant, Felecia Cerrato, Trish Vosburg, Candace Patterson

DSP Data Processing: Eric Banks, Laura Gauthier, Sam Lee, Lee Lichtenstein, Charlotte Tolonen, Sam Novod; Broad Compliance: Kelly Flanagan, Lauren Witzgall, with support from Emily Lipscomb and Andrea Saltzman

gnomAD Production Team: Katherine Chao, Julia Goodrich, Mike Wilson, Kristen Laricchia, Qin He, Wenhan Lu, Leo Gruenschloss, Vlad Savelyev, Grace Tiao, Daniel Martin; Hail team: Tim Poterba, Chris Vittal, Dan King, Jackie Goldstein, Daniel Goldstein

Structural variant calling/Talkowski Lab: Harrison Brand, Jack Fu, Xuefang Zhao, Ryan Collins, Mark Walker, Nehir Kurtas, Emma Pierce-Hoffman, Chris Whelan

gnomAD Browser Team: Matt Solomonson, Phil Darnowsky, Riley Grant, Steve Jahl, Konrad Karczewski, Ben Weisburd, Elissa Alarmani

gnomAD Steering Committee: Samantha Baxter, Katherine Chao, Mark Daly, Julia Goodrich, Konrad Karczewski, Daniel G. MacArthur, Benjamin Neale, Anne O'Donnell-Luria, Heidi Rehm, Kaitlin Samocha, Matthew Solomonson, Michael Talkowski

Finally, we would like to thank all the individuals who have enrolled in research. Without their willingness to share data and participate in research, gnomAD would not exist!

Happy Exploring,

The gnomAD Production Team
