right syntax for the beginning and end of a transcript (exon) deletion


Bayley, Jean-Pierre (HG - LUMC)

Dec 22, 2025, 4:22:28 AM
to hgvs-nom...@googlegroups.com

Hi,

I'm having trouble finding the right syntax for the beginning and end of a transcript (exon) deletion. For example, 843 is the final nucleotide of exon 8 of SDHB. Is it (843_+1)del or as above, as suggested by the LOVD syntax checker?

Moreover, how should an exon 1 deletion be described? As NC_000001.11(NM_003000.3):c.(?_-151)_(72+1_73-1)del, as widely reported despite the HGVS ban on the use of question marks (see: HGVS/DNA/deletion/exons/deletions extending beyond the transcribed region)? Incidentally, those extra 151 upstream nucleotides are not included in NM_003000.3, and I have no idea where they come from. Should it not actually be c.(1) or c.(1_-1), or something else, e.g. c.(-13_1)? After all, there are 13 nucleotides upstream of exon 1 in NM_003000.3.

Any help would be appreciated.

Thanks,
JP

Johan den Dunnen

Dec 22, 2025, 4:54:56 AM
to HGVS Nomenclature
Dear Jean-Pierre,

The topic you address is discussed on the HGVS nomenclature website under "Recommendations > Uncertain" (https://hgvs-nomenclature.org/stable/recommendations/uncertain/). The description should be as precise as possible and depends on the technology used to detect the deletion. Although exon-based descriptions are popular, they are formally not correct.

> last nucleotide of exon 8
When I check the NM_003000.3 reference sequence, the last nucleotide seems to be c.*159.

> how to describe an exon 1 deletion?
As mentioned, this depends on the technology used to detect the deletion (e.g. MLPA, PCR, sequencing). The popular "exon-based" description would be NM_003000.2:c.(?_-151)_(72+1_73-1)del or NM_003000.3:c.(?_-13)_(72+1_73-1)del. Note that the difference between -151 and -13 arises from the version of the NM_ reference sequence used. In addition, you should always give the description based on a genomic reference sequence as well: NC_000001.11:g.(17044889_17053947)_(17054170_?)del linked to NM_003000.2, or NC_000001.11:g.(17044889_17053947)_(17054032_?)del linked to NM_003000.3.
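As an illustration only (not part of the HGVS recommendations), the layout of these uncertain-range descriptions, "(outer_inner)_(inner_outer)del" with "?" for an unknown position, can be sketched in Python. The function name and its parameters are hypothetical; the code only assembles the string and does not validate positions against the reference sequence.

```python
from typing import Optional, Tuple

# A position is given as a string (e.g. "-13", "72+1", "17044889");
# None stands for an unknown position and is rendered as "?".
Position = Optional[str]


def _render(position: Position) -> str:
    """Render a single position, using '?' when it is unknown."""
    return "?" if position is None else position


def uncertain_deletion(
    reference: str,
    coord_type: str,
    start_range: Tuple[Position, Position],
    end_range: Tuple[Position, Position],
) -> str:
    """Assemble 'REF:x.(a_b)_(c_d)del' from two uncertain breakpoint ranges.

    Hypothetical helper: it formats the string only and performs no
    check that the positions exist in the reference sequence.
    """
    start = f"({_render(start_range[0])}_{_render(start_range[1])})"
    end = f"({_render(end_range[0])}_{_render(end_range[1])})"
    return f"{reference}:{coord_type}.{start}_{end}del"


# Exon-based transcript description quoted in this thread
print(uncertain_deletion("NM_003000.3", "c", (None, "-13"), ("72+1", "73-1")))
# NM_003000.3:c.(?_-13)_(72+1_73-1)del

# Matching genomic description quoted in this thread
print(uncertain_deletion("NC_000001.11", "g",
                         ("17044889", "17053947"), ("17054032", None)))
# NC_000001.11:g.(17044889_17053947)_(17054032_?)del
```

Passing None for the outer boundary makes it explicit in the calling code which side of the deletion was not determined by the assay.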

> the HGVS ban on use of question marks
There is no HGVS ban on the use of the question mark; see the Recommendations > General page, which states: "? (question mark) is used to indicate unknown positions (nucleotide or amino acid)".

> Should it not actually be c.(1) or c.(1_-1) or something else:
As mentioned, the "exon-based" description does not follow HGVS nomenclature. What we see is that different groups make different choices, none of which is strictly correct. The only thing I would note is that a format like "(72+1_73-1)" best shows that the breakpoint is located IN the intron.
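To make the point about "(72+1_73-1)" concrete, here is a minimal Python sketch that checks whether both ends of an uncertain range are intronic, meaning each carries a +/- offset from an exon position. The function names and the regular expression are assumptions made for this illustration; the pattern covers only simple c. positions such as "72+1", "-13" or "*159", not the full HGVS grammar.

```python
import re

# Simplified c. position: optional '-' (5' UTR) or '*' (3' UTR) prefix,
# an exon position, and an optional intronic offset such as '+1' or '-1'.
_POSITION = re.compile(r"^([-*]?\d+)([+-]\d+)?$")


def is_intronic(position: str) -> bool:
    """Return True if the position has an intronic offset (e.g. '72+1')."""
    if position == "?":
        return False  # an unknown position cannot be placed in an intron
    match = _POSITION.match(position)
    if match is None:
        raise ValueError(f"unsupported position: {position!r}")
    return match.group(2) is not None


def range_is_intronic(uncertain_range: str) -> bool:
    """Return True if both ends of a range like '(72+1_73-1)' are intronic."""
    start, end = uncertain_range.strip("()").split("_")
    return is_intronic(start) and is_intronic(end)


print(range_is_intronic("(72+1_73-1)"))  # True
print(range_is_intronic("(?_-13)"))      # False
```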

Best regards,

Johan den Dunnen
HUGO HGVS Variant Nomenclature Committee (HVNC)