Dear Jean-Pierre,
The topic you raise is discussed on the HGVS nomenclature website under "Recommendations > Uncertain" (https://hgvs-nomenclature.org/stable/recommendations/uncertain/). The description should be as precise as possible, and its form depends on the technology used to detect the deletion. Exon-based descriptions are popular, but they are formally not correct.
> last nucleotide of exon 8
When I check the NM_003000.3 reference sequence, the last nucleotide appears to be c.*159.
> how to describe an exon 1 deletion?
As mentioned above, this depends on the technology used to detect the deletion (e.g. MLPA, PCR, sequencing). The popular "exon-based" description would be NM_003000.2:c.(?_-151)_(72+1_73-1)del or NM_003000.3:c.(?_-13)_(72+1_73-1)del. Note that -151 versus -13 reflects the difference between the two versions of the NM_ reference sequence. In addition, you should always give a description based on a genomic reference sequence as well: NC_000001.11:g.(17044889_17053947)_(17054170_?)del when linked to NM_003000.2, or NC_000001.11:g.(17044889_17053947)_(17054032_?)del when linked to NM_003000.3.
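To illustrate how these descriptions are put together, here is a minimal Python sketch. The function names uncertain_range and uncertain_deletion are hypothetical helpers made up for this example; they are not part of any HGVS tool, and the code only assembles the string, it does not validate positions against a reference sequence.

```python
def uncertain_range(start, end):
    """Return an uncertain range '(start_end)'; None stands for an unknown position ('?')."""
    s = "?" if start is None else str(start)
    e = "?" if end is None else str(end)
    return f"({s}_{e})"


def uncertain_deletion(reference, coord_type, five_prime, three_prime):
    """Assemble a deletion description with uncertain breakpoints on both sides.

    five_prime and three_prime are (start, end) pairs that bound the region
    in which each breakpoint is located.
    """
    left = uncertain_range(*five_prime)
    right = uncertain_range(*three_prime)
    return f"{reference}:{coord_type}.{left}_{right}del"


# Coding DNA description of the exon 1 deletion on NM_003000.3
print(uncertain_deletion("NM_003000.3", "c", (None, "-13"), ("72+1", "73-1")))

# Matching genomic description on NC_000001.11
print(uncertain_deletion("NC_000001.11", "g", (17044889, 17053947), (17054032, None)))
```

The two calls reproduce the NM_003000.3 and NC_000001.11 descriptions given above.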
> the HGVS ban on use of question marks
There is no HGVS ban on the use of the question mark. See the "Recommendations > General" page, which states: "? (question mark) is used to indicate unknown positions (nucleotide or amino acid)".
> Should it not actually be c.(1) or c.(1_-1) or something else:
As mentioned, the "exon-based" description does not follow HGVS nomenclature. In practice, different groups make different choices, none of which is formally correct. The one thing I would note is that a format like "(72+1_73-1)" shows most clearly that the breakpoint is located in the intron.
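As a rough illustration of why "(72+1_73-1)" reads as intronic, the sketch below classifies a single coding DNA position by its notation alone. The function classify_position and its category labels are assumptions made up for this example; it handles only simple positions such as those used in this thread and does not consult a reference sequence.

```python
import re

# Optional '-' or '*' prefix, a base position, and an optional intronic offset.
_POSITION = re.compile(r"^(?P<prefix>[-*]?)(?P<base>\d+)(?P<offset>[+-]\d+)?$")


def classify_position(pos):
    """Classify a c. position string (without the 'c.' prefix) by its notation."""
    if pos == "?":
        return "unknown"
    match = _POSITION.match(pos)
    if match is None:
        raise ValueError(f"not a simple c. position: {pos!r}")
    if match.group("offset"):
        # An offset such as +1 or -1 after the base position points into an intron.
        return "intronic"
    if match.group("prefix") == "-":
        return "5' of the ATG"
    if match.group("prefix") == "*":
        return "3' of the stop codon"
    return "coding"


for position in ["?", "-13", "72+1", "73-1", "*159"]:
    print(position, "->", classify_position(position))
```

Both 72+1 and 73-1 come out as intronic, which is the point of writing the 3' breakpoint as "(72+1_73-1)".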
Best regards,
Johan den Dunnen
HUGO HGVS Variant Nomenclature Committee (HVNC)
On Monday 22 December 2025 at 10:22:28 UTC+1, Bayley, Jean-Pierre (HG - LUMC) wrote: