Complex variants with large deletions/large duplications/insertions/inversions on one allele

Aneta Molenda

Mar 11, 2025, 5:26:31 PM

to HGVS Nomenclature

Dear all,

Recently I encountered some complex variants, e.g. a large deep-intronic inversion in the middle, with deletions/insertions/inversions around the breakpoints of the large inversion. I thought it would be easiest to describe such a scenario as an indel (delins).

E.g.

---------- First example

Coding nomenclature:

NM_170675.5: c.754+18340_977+14312delins[754+18710_901-33975inv;CTGCTCAGA]

Nomenclature on genomic level:

NC_000015.10: g. 37228213_37357632delins[TCTGAGCAG; 37276576_37357262inv]

GRCh37/hg19

---------- Second example (without coding nomenclature, as the breakpoint is within a non-coding RNA)

NC_000011.9: g.118342279_119337690delins [119337574_119337679inv; TTTAAAA;118345768_119335442inv] GRCh37/hg19

On the HGVS committee website I could see some examples where a variant is described as multiple events placed linearly on one allele, e.g. NC_000002.12:g.[32310435_32310710del;32310711_171827243inv;insG]

However, this approach seems to produce a longer description (e.g. NC_000015.10: g.[37228213_37276575delinsTCTGAGCAG;37276576_37357262inv;37357262_37357632del] / NM_170675.5: c.[754+18341_754+18710del;754+18710_901-33975inv;901-33974_977+14312delinsCTGCTCAGA]).

I would like to avoid the long nomenclature, as adjacent variants sometimes overlap by a base or more (two adjacent variants can each be shifted slightly left or right).

Which approach do you think is the most correct?
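The overlap concern can be checked mechanically. The following is a minimal Python sketch, not an HGVS validator: it assumes a genomic (g.) allele description whose parts are listed in ascending order, and the helper name `adjacent_range_issues` is hypothetical. It only compares the end of each position range with the start of the next one.

```python
import re

# Matches position ranges such as 37228213_37276575 in a g. description.
RANGE_RE = re.compile(r"(\d+)_(\d+)")


def adjacent_range_issues(allele: str):
    """Report consecutive ranges that overlap or leave a gap.

    Returns a list of (left_range, right_range, kind, size) tuples,
    where kind is "overlap" or "gap" and size is counted in bases.
    Hypothetical helper: assumes parts are listed in ascending order.
    """
    # Drop the reference accession so only the variant part is scanned.
    body = allele.split("g.", 1)[-1]
    ranges = [(int(a), int(b)) for a, b in RANGE_RE.findall(body)]
    issues = []
    for (s1, e1), (s2, e2) in zip(ranges, ranges[1:]):
        delta = s2 - e1 - 1  # 0 means the two ranges are exactly adjacent
        if delta < 0:
            issues.append(((s1, e1), (s2, e2), "overlap", -delta))
        elif delta > 0:
            issues.append(((s1, e1), (s2, e2), "gap", delta))
    return issues


long_form = (
    "NC_000015.10:g.[37228213_37276575delinsTCTGAGCAG;"
    "37276576_37357262inv;37357262_37357632del]"
)
for issue in adjacent_range_issues(long_form):
    print(issue)
```

For the long genomic description quoted in this message, the sketch flags the inversion and the final deletion, which both include position 37357262.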

g.(c.) [v1;v2;v3;v4;v5…]

g.(c.) XX_XXXdelins[v1;v2;v3..]

Thank you

Aneta Molenda

Johan den Dunnen

Mar 12, 2025, 7:33:24 AM

to HGVS Nomenclature

Dear Aneta,

Regarding your question: reporting the variant as NC_000015.10:g.37228213_37357632delins[TCTGAGCAG;37276576_37357262inv] seems OK (please note I removed several spaces). However, I need to check with the HVNC whether it should officially be reported as the longer description you give instead, i.e. NC_000015.10:g.[37228213_37276575delinsTCTGAGCAG;37276576_37357262inv;37357263_37357632del]. This is based on the remark on the nomenclature pages stating "descriptions removing part of a reference sequence and replacing it with part of the same sequence are not allowed" (see https://hgvs-nomenclature.org/stable/recommendations/general/).
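To see which descriptions the quoted remark might apply to, a small Python sketch can list the inserted position ranges that lie inside the deleted range. This is an illustration only, not an official HGVS check: the helper name `reinserted_ranges` is hypothetical, and it assumes a g. delins description with a bracketed insertion list whose position ranges refer to the same reference sequence.

```python
import re

# g.<start>_<end>delins[<part>;<part>;...]
DELINS_RE = re.compile(r"g\.(\d+)_(\d+)delins\[(.*)\]$")
RANGE_RE = re.compile(r"(\d+)_(\d+)")


def reinserted_ranges(description: str):
    """List inserted position ranges that fall inside the deleted range.

    Hypothetical helper: a non-empty result means the description removes
    part of the reference and puts part of that same stretch back.
    """
    match = DELINS_RE.search(description.replace(" ", ""))
    if match is None:
        return []
    del_start, del_end = int(match.group(1)), int(match.group(2))
    inside = []
    for part in match.group(3).split(";"):
        m = RANGE_RE.match(part)
        if m is None:
            continue  # literal sequence such as TCTGAGCAG
        start, end = int(m.group(1)), int(m.group(2))
        if del_start <= start and end <= del_end:
            inside.append((start, end))
    return inside


short_form = "NC_000015.10:g.37228213_37357632delins[TCTGAGCAG;37276576_37357262inv]"
print(reinserted_ranges(short_form))
```

A non-empty result does not by itself make a description invalid; whether the remark applies to this kind of delins is the open question for the HVNC.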

The description NM_170675.5:c.754+18340_977+14312delins[754+18710_901-33975inv;CTGCTCAGA] is not correct; it should be NC_000015.10(NM_170675.5):c.754+18340_977+14312delins[754+18710_901-33975inv;CTGCTCAGA]. NM_170675.5 does not contain intronic nucleotides, so it cannot be used as the reference sequence.
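The reference-sequence correction can be sketched in Python as follows. This is a rough illustration under stated assumptions: the helper name `with_genomic_reference` is hypothetical, the genomic accession must be supplied by the caller, and intronic positions are detected only by the pattern "digit, plus or minus sign, digit" (e.g. 754+18340).

```python
import re

# An intronic offset looks like 754+18340 or 901-33975: digit, sign, digit.
INTRONIC_RE = re.compile(r"\d[+-]\d")


def with_genomic_reference(description: str, genomic_acc: str) -> str:
    """Rewrite 'NM_x:c.<intronic>' as 'NC_y(NM_x):c.<intronic>'.

    Hypothetical helper: descriptions without intronic positions, or
    already in the NC_(NM_) form, are returned with whitespace stripped
    but otherwise unchanged.
    """
    reference, _, variant = description.partition(":")
    reference, variant = reference.strip(), variant.strip()
    needs_genomic = (
        variant.startswith("c.")
        and INTRONIC_RE.search(variant) is not None
        and "(" not in reference
    )
    if needs_genomic:
        return f"{genomic_acc}({reference}):{variant}"
    return f"{reference}:{variant}"


original = "NM_170675.5: c.754+18340_977+14312delins[754+18710_901-33975inv;CTGCTCAGA]"
print(with_genomic_reference(original, "NC_000015.10"))
```

The sketch does not verify that the transcript actually maps to the given chromosome accession; that pairing is taken on trust from the caller.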

So, to be continued after I have discussed the topic with the HVNC.

Best regards,

Johan den Dunnen
HUGO HGVS Variant Nomenclature Committee (HVNC)

On Tuesday, March 11, 2025 at 22:26:31 UTC+1, Aneta Molenda wrote:
