Repeated Sequences vs Duplication calling

Richard Wong

Jan 20, 2025, 2:07:27 PM
to HGVS Nomenclature

Hello,

I was hoping to get your expert opinion on repeated-sequence versus duplication calling in clinical reporting. The particular scenario I am wondering about is the calling of intronic variants. The HGVS notes under repeated sequences state the following:

exception: using a coding DNA reference sequence ("c." description), a repeated sequence variant description can be used only for repeat units with a length which is a multiple of 3, i.e. which can not affect the reading frame. Consequently, use NM_024312.4:c.2692_2693dup and not NM_024312.4:c.2686A[10]; use NM_024312.4:c.1741_1742insTATATATA and not NM_024312.4:c.1738TA[6].

An example in the same section (below) uses the bracket format [X] rather than dup for an intronic variant. Is the suggestion that intronic variants are unlikely to change the reading frame, so brackets are acceptable or even preferred?

CFTR intron 9 NM_000492.3:c.1210-33_1210-6GT[11]T[6] the mixed repeat sequence from position c.1210-33 to c.1210-6 contains 11 GT and 6 T copies.

 

There is uncertainty in clinical testing about the use of "brackets" [X] vs dup when reporting variants in general, and intronic regions are even murkier. The ClinVar example below highlights the variability of calls and also the variable use of the 3' rule.

https://www.ncbi.nlm.nih.gov/clinvar/variation/315710/

NM_005159.5:c.809-58TG[25]  

NM_005159.4:c.809-16_809-13dupTGTG                            

NC_000015.10:g.34791308CA[25]                           

NC_000015.9:g.35083509CA[25]                             

NG_007553.1:g.9374TG[25]                       

LRG_388:g.9374TG[25]                 

LRG_388t1:c.809-16_809-13dup               

 

Do you have any general advice or rules of thumb on this subject?  Thank you for your help!

 

Best wishes,

 

-Richard. 

Johan den Dunnen

Jan 30, 2025, 5:45:59 AM
to HGVS Nomenclature
Dear Richard,

For this subject, it is best to check the recommendation on the DNA > Repeated Sequences page (https://hgvs-nomenclature.org/stable/recommendations/DNA/repeated/). The "[]" format for repeated sequences was adopted from formats already in use when HGVS nomenclature was introduced, for clarity and to reduce the complexity of variant descriptions in repeated sequences. Assume a CA repeat sequence of 10 copies from nucleotides g.100 to g.119. Compare the format used to describe variants like g.100_119CA[11], g.100_119CA[12], ..., g.100_119CA[20], g.100_119CA[21], g.100_119CA[22] versus g.118_119dup, g.116_119dup, g.114_119dup, ..., g.100_119dup, g.[100_119dup;119_120insCA], g.[100_119dup;119_120insCACA].
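To make the comparison concrete, here is a minimal Python sketch that reproduces both series for the hypothetical CA repeat at g.100_119. The function name describe_both_ways and its parameters are made up for illustration. It only mirrors the pattern in the example above (simple expansions of a single repeat unit) and is not a general HGVS normalizer.

```python
def describe_both_ways(start: int, unit: str, ref_copies: int, new_copies: int):
    """Return (bracket_form, dup_form) for an expansion of a simple repeat.

    start      -- first nucleotide of the repeat in the reference (g. position)
    unit       -- repeat unit, e.g. "CA"
    ref_copies -- number of copies in the reference sequence
    new_copies -- number of copies observed (must exceed ref_copies)
    """
    extra = new_copies - ref_copies
    if extra <= 0:
        raise ValueError("this sketch only covers expansions")

    unit_len = len(unit)
    end = start + unit_len * ref_copies - 1  # last nucleotide of the repeat

    # Bracket format: full range of the repeat, unit, observed copy number.
    bracket = f"g.{start}_{end}{unit}[{new_copies}]"

    if extra <= ref_copies:
        # The extra copies fit inside the existing tract: describe them as a
        # duplication of the 3'-most copies.
        dup_start = end - unit_len * extra + 1
        dup = f"g.{dup_start}_{end}dup"
    else:
        # More extra copies than the tract holds: duplicate the whole tract
        # and insert the remainder, following the example in the post.
        remainder = unit * (extra - ref_copies)
        dup = f"g.[{start}_{end}dup;{end}_{end + 1}ins{remainder}]"

    return bracket, dup


for n in (11, 12, 20, 21, 22):
    print(describe_both_ways(100, "CA", 10, n))
```

The bracket form changes only in the copy number, while the dup form changes its range with every extra copy and needs a combined dup plus ins description once the expansion exceeds the reference tract.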

Regarding the examples you give:

- NM_005159.5:c.809-58TG[25]
correct is NC_000015.10(NM_005159.5):c.809-58_809-13TG[25]

- NM_005159.4:c.809-16_809-13dupTGTG
preferred is NC_000015.10(NM_005159.5):c.809-58_809-13TG[25], alternatively NC_000015.10(NM_005159.5):c.809-16_809-13dup. "dupTGTG" should not be used.

- NC_000015.10:g.34791308CA[25]
correct is NC_000015.10:g.34791308_34791353CA[25]

- NC_000015.9:g.35083509CA[25]
correct is NC_000015.9:g.35083509_35083554CA[25]
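The corrections above all replace a single start position with the full range of the repeat in the reference. A minimal Python sketch of that arithmetic follows. The helper names copies_in_range and range_form are hypothetical, and the reference copy number of 23 is not stated in the thread: it is inferred from the 46-nucleotide ranges given above (46 / 2 for a dinucleotide unit).

```python
def copies_in_range(start: int, end: int, unit: str) -> int:
    """Number of whole repeat units spanned by start..end (inclusive)."""
    length = end - start + 1
    if length % len(unit) != 0:
        raise ValueError("range is not a whole number of repeat units")
    return length // len(unit)


def range_form(prefix: str, start: int, unit: str,
               ref_copies: int, observed_copies: int) -> str:
    """Build a range-based repeat description from the first position.

    prefix          -- reference sequence and coordinate type, e.g. "NC_000015.10:g."
    ref_copies      -- copies present in the reference (assumed known)
    observed_copies -- copies in the variant allele
    """
    end = start + len(unit) * ref_copies - 1
    return f"{prefix}{start}_{end}{unit}[{observed_copies}]"


# Reference copy number inferred from the corrected GRCh38 range.
ref_copies = copies_in_range(34791308, 34791353, "CA")
print(ref_copies)

print(range_form("NC_000015.10:g.", 34791308, "CA", ref_copies, 25))
print(range_form("NC_000015.9:g.", 35083509, "CA", ref_copies, 25))
```

With 23 reference copies and 25 observed, the allele carries two extra dinucleotide units, which matches the four-nucleotide c.809-16_809-13dup given as the alternative description.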

Best regards,

Johan den Dunnen
HUGO HGVS Variant Nomenclature Committee (HVNC)
         

Op maandag 20 januari 2025 om 20:07:27 UTC+1 schreef r1wo...@gmail.com:

Richard Wong

Jan 30, 2025, 12:06:43 PM
to HGVS Nomenclature
Hello Johan, 

Thank you for taking the time to answer my question.  I appreciate your advice on this subject.

Best wishes, 

-Richard.   
