New PrimateAI-3D and PromoterAI variant impact tracks from Illumina

Luis Nassar

May 1, 2026, 1:40:06 PM
to genome-...@soe.ucsc.edu

We are pleased to announce the release of two new variant-impact prediction tracks from Illumina: PrimateAI-3D, which scores every possible coding missense variant on the human GRCh38/hg38 and GRCh37/hg19 assemblies, and PromoterAI, which scores every possible non-coding single-nucleotide substitution in proximal promoter regions on GRCh38/hg38. Together, these deep-learning predictors extend pathogenicity prediction across both protein-coding and regulatory sequence. Both tracks are grouped under the Deleteriousness Predictions container in the Phenotype and Disease Associations track group.

The PrimateAI-3D track displays approximately 70.7 million scored missense variants per assembly, covering every possible single-base missense change across all protein-coding genes. PrimateAI-3D is a deep-learning model that learns variant pathogenicity from common variation observed across 233 non-human primate species and large human population databases. Each variant is colored red (pathogenic) or blue (benign) based on Illumina's prediction call. Items are labeled by default with the nucleotide change (e.g. C>T); the corresponding amino acid change is shown on hover and can be selected as the on-feature label in the Track Settings.
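To illustrate what "every possible missense variant" means at a single codon, here is a minimal Python sketch that enumerates the nine single-base changes of a codon and classifies each one using the standard genetic code. The function name `single_base_changes` is hypothetical, and the labels are written on the coding strand, which is an assumption; this is not Illumina's or the Genome Browser's own code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_changes(codon):
    """Return (nucleotide change, amino acid change, effect) for each of
    the nine single-base substitutions of a codon.

    Hypothetical helper for illustration only. Start codons are not
    special-cased.
    """
    codon = codon.upper()
    if codon not in CODON_TABLE:
        raise ValueError(f"not a valid DNA codon: {codon!r}")
    ref_aa = CODON_TABLE[codon]
    results = []
    for pos, ref_base in enumerate(codon):
        for alt_base in BASES:
            if alt_base == ref_base:
                continue
            alt_codon = codon[:pos] + alt_base + codon[pos + 1:]
            alt_aa = CODON_TABLE[alt_codon]
            if alt_aa == ref_aa:
                effect = "synonymous"
            elif alt_aa == "*":
                effect = "nonsense"
            elif ref_aa == "*":
                effect = "stop-lost"
            else:
                effect = "missense"
            results.append(
                (f"{ref_base}>{alt_base}", f"{ref_aa}>{alt_aa}", effect)
            )
    return results


if __name__ == "__main__":
    for nt_change, aa_change, effect in single_base_changes("GAG"):
        print(nt_change, aa_change, effect)
```

For the codon GAG (glutamate), seven of the nine possible single-base changes are missense, one is synonymous, and one creates a stop codon.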

The PromoterAI track provides pre-computed scores for every possible single-nucleotide substitution within 500 bp of annotated transcription start sites, covering approximately 39.5 million genomic positions. Scores range from −1 to +1, where negative values indicate predicted under-expression and positive values indicate predicted over-expression of the associated transcript; the magnitude reflects the predicted size of the expression change. The container track includes:

  • Four per-base score subtracks, one bigWig for each alternate allele (A, C, G, T). Bars are colored red for positive scores and blue for negative scores.
  • A PromoterAI Overlaps bigBed subtrack summarizing positions where multiple transcripts overlap. Each item lists the contributing transcripts, their per-transcript scores, and their strand orientation. About 90% of overlap items lie at bidirectional promoters where transcripts on opposite strands share regulatory sequence.
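
The sign and magnitude convention for PromoterAI scores described above can be sketched in a few lines of Python. The helper name `describe_promoterai_score` and the handling of a score of exactly zero are assumptions made for illustration. No interpretation thresholds are applied here; those are given on the track description pages.

```python
def describe_promoterai_score(score):
    """Summarize a PromoterAI score by its sign and magnitude.

    Hypothetical helper for illustration only. It applies no
    interpretation threshold.
    """
    # Scores are documented to lie between -1 and +1 (NaN also fails here).
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score outside [-1, +1]: {score!r}")
    if score > 0:
        # Positive: predicted over-expression, drawn as a red bar.
        direction, bar_color = "over-expression", "red"
    elif score < 0:
        # Negative: predicted under-expression, drawn as a blue bar.
        direction, bar_color = "under-expression", "blue"
    else:
        # Assumption: a score of exactly zero means no predicted change.
        direction, bar_color = "no predicted change", None
    return {
        "direction": direction,
        "magnitude": abs(score),
        "bar_color": bar_color,
    }


if __name__ == "__main__":
    print(describe_promoterai_score(-0.42))
```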

[Image: Genome Browser screenshot at the HBB locus on hg38 showing PrimateAI-3D pathogenicity calls across the coding region and PromoterAI per-allele scores in the 5′ UTR and proximal promoter]

PrimateAI-3D and PromoterAI at the start of HBB on hg38 (chr11:5,226,883-5,227,212). PrimateAI-3D scores every coding missense variant as red (pathogenic) or blue (benign), while the four PromoterAI per-allele bigWig subtracks score every possible non-coding substitution in the 5′ UTR and proximal promoter just upstream of the HBB start codon.

See the track description pages for Illumina's recommended interpretation thresholds.

Both tracks are distributed by Illumina under a license agreement and are not available through the Table Browser, Data Integrator, REST API, or public download. The full methods are described in Gao et al. 2023 (PrimateAI-3D, Science) and Jaganathan et al. 2025 (PromoterAI, Science).

We would like to thank Hong Gao, Kishore Jaganathan, and the Illumina PrimateAI-3D and PromoterAI teams for making these predictions available to the research community.
