[genome-announce] New Short Tandem Repeat (STR) tracks on hg38

Jairo Navarro Gonzalez

Apr 10, 2026, 7:44:33 PM
to genome-...@soe.ucsc.edu
Hello everyone,

We are pleased to announce several new Short Tandem Repeat (STR) tracks on the human genome assembly (GRCh38/hg38).

A new Tandem Repeat Variation track collection brings together population-level tandem repeat variation data from multiple sources:

  • WebSTR – 1,710,833 STR loci from the EnsembleTR panel with allele frequency distributions for five continental populations from the 1000 Genomes Project (3,550 individuals). The EnsembleTR panel represents consensus calls from four STR genotyping methods (HipSTR, GangSTR, ExpansionHunter, and adVNTR).
  • STRchive – 75 disease-associated tandem repeat expansion loci curated from published literature by the STRchive project. Each locus includes the pathogenic repeat motif, the minimum pathogenic repeat count, the mode of inheritance, and the associated disease. Items are colored by inheritance mode.
  • TRExplorer V2 – 5,599,658 tandem repeat loci (STRs and VNTRs) from the TRExplorer catalog at the Broad Institute, compiled from 17 sources including perfect repeats in the reference, polymorphic TRs from T2T assemblies, and curated disease-associated loci. The track includes population allele frequency histograms from the TenK10K and HPRC256 cohorts.
  • ToMMo 61K STR – 174,300 STR loci with allele count distributions from 61,000 Japanese individuals from the Tohoku Medical Megabank Organization (ToMMo), genotyped with ExpansionHunter.
  • 1KG Vienna ONT VNTR – 361,362 VNTR loci with allele statistics from 1,019 samples of the 1000 Genomes ONT Vienna project, genotyped with VAMOS from Oxford Nanopore long-read sequencing data. Because it is built from long reads rather than the short-read data behind the other tracks, this track can span longer repeat regions.

Additionally, a new gnomAD STR track has been added under the gnomAD Variants collection. This track displays genotype data for 87 disease-associated STR loci from gnomAD v3.1.3, including loci associated with Huntington disease, fragile X syndrome, Friedreich ataxia, and various spinocerebellar ataxias. The data were generated using ExpansionHunter v5 on 18,511 whole-genome sequenced samples across 10 populations. Each locus shows the distribution of repeat allele sizes, providing a reference for normal and expanded allele ranges.

We would like to thank Melissa Gymrek (UC San Diego) and the WebSTR team for providing the WebSTR data, Harriet Dashnow (University of Colorado) and the STRchive team for their curated disease-associated loci, Ben Weisburd, Egor Dolzhenko, and the TRExplorer team at the Broad Institute for their tandem repeat catalog, the Tohoku Medical Megabank Organization for the ToMMo STR data, the 1000 Genomes ONT Vienna consortium and the Marschall Lab at Heinrich Heine University Düsseldorf for the VNTR data, and the gnomAD production team for making the STR genotype data available.
