Interpreting complex cis variants


Minh-Duc Nguyen

Jan 22, 2026, 10:07:34 AM
to HGVS Nomenclature
Hello everyone,

I am working on somatic variants in the EGFR gene. I would like to ask what the correct way is to annotate and interpret the following case (attachment: EGFR.png).

Should I report these as four separate events (the default from my variant calling pipeline using Mutect2 and VEP):
_ NM_005228.5:c.2232C>A (p.I744=)
_ NM_005228.5:c.2236_2237del (p.E746Ifs*16)
_ NM_005228.5:c.2241_2247del (p.R748Qfs*16)
_ NM_005228.5:c.2248G>C (p.A750P)

Or should I combine the two frameshift events into one in-frame event like so (using Mutalyzer), which I know doesn't follow the "delins" recommendation:
_ NM_005228.5:c.2232C>A (p.I744=)
_ NM_005228.5:c.2236_2247delinsATT (p.E746_A749delinsI)
_ NM_005228.5:c.2248G>C (p.A750P)

My rationale is that the protein change deduced after combining the variants makes more sense to me and aligns better with the clinical data of this sample (in-frame oncogenic deletions are common in EGFR).

Thank you for your time. Please let me know if you require any further information.

Susana Hernández Prieto

Jan 26, 2026, 3:35:07 AM
to HGVS Nomenclature

Dear colleague,


In our laboratory, we identified a case very similar to the one you described, where our analysis software reported two independent variants:

NM_005228.5:c.2236_2248delinsATTC p.(E746_A750delinsIP). Coverage: 7545x; VAF: 66.4%.  This variant includes an additional nucleotide insertion (ATTC versus ATT) compared to the one mentioned in your email.
NM_005228.5:c.2248G>C p.(A750P). Coverage: 7400x; VAF: 66%.


Regarding these two variants, our team, based on the IGV visualization and the description, ultimately reported only the first one, as the second is already encompassed within the first.


As you mentioned, in-frame oncogenic deletions are common in EGFR exon 19. Furthermore, NM_005228.5:c.2236_2248delinsATTC p.(E746_A750delinsIP) has been previously reported in the literature.


Waiting for more input from other colleagues who might be able to provide further details.


Best regards

EGFR.jpg

j.f.j.laros

Jan 30, 2026, 8:03:53 AM
to HGVS Nomenclature
Dear,

Thank you for your question.

Since the variants you have found are in cis, they should be described together in order to obtain correct effect predictions down the line. In HGVS, the allele syntax can be used for this.

In your case, a description like NM_005228.5:c.[2232C>A;2236_2237del;2241_2247del;2248G>C] would be fine, leading to the predicted protein effect NM_005228.5(NP_005219.2):p.(Glu746_Ala750delinsIlePro), which at first glance indeed seems a lot milder than either of the frameshift-inducing variants taken on its own.
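For readers who want to check this prediction by hand, the following is a minimal Python sketch, not a replacement for Mutalyzer or VariantValidator. It applies the four variants together to a short reference window and translates the result. The window `REF` (assumed to be c.2230_2250 of NM_005228.5, codons 744 to 750) was typed in by hand and should be verified against the transcript; `apply_edits` and `translate` are hypothetical helper names introduced only for this illustration.

```python
# Minimal sketch: apply cis variants to a short reference window and translate.
# ASSUMPTION: REF is c.2230_2250 of NM_005228.5 (codons 744-750), typed in by
# hand; check it against the real transcript before relying on it.
REF_START = 2230
REF = "ATCAAGGAATTAAGAGAAGCA"

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def apply_edits(ref, ref_start, edits):
    """Apply (first, last, replacement) edits given in c. coordinates.

    Edits are applied from the 3' end backwards so earlier positions stay valid.
    """
    seq = ref
    for first, last, alt in sorted(edits, reverse=True):
        i, j = first - ref_start, last - ref_start + 1
        seq = seq[:i] + alt + seq[j:]
    return seq


def translate(seq):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# The four calls from the pipeline, as (first, last, replacement).
SUB_2232 = (2232, 2232, "A")   # c.2232C>A
DEL_2236 = (2236, 2237, "")    # c.2236_2237del
DEL_2241 = (2241, 2247, "")    # c.2241_2247del
SUB_2248 = (2248, 2248, "C")   # c.2248G>C

combined = apply_edits(REF, REF_START, [SUB_2232, DEL_2236, DEL_2241, SUB_2248])
alone = apply_edits(REF, REF_START, [DEL_2236])

print(translate(REF))                  # reference window
print(translate(combined))             # all four variants in cis
print((len(REF) - len(combined)) % 3)  # 0 means the reading frame is kept
print((len(REF) - len(alone)) % 3)     # non-zero means a frameshift
```

Under the assumed window, the reference translates to IKELREA and the combined allele to IKIP, which matches p.(Glu746_Ala750delinsIlePro) with Ile744 and Lys745 unchanged. The net loss of nine nucleotides keeps the reading frame, whereas c.2236_2237del on its own removes two nucleotides and shifts it.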

Please note that the current version of Mutalyzer (see link above) normalises this allele description to NM_005228.5:c.2232_2248delinsAAAGATTC (in an upcoming version, this will be: NM_005228.5:c.[2232C>A;2236_2237del;2241_2248delinsC]). If you are interested in the motivation behind this normalisation, please feel free to ask.


With kind regards,
Jeroen.

Johan den Dunnen

Jan 30, 2026, 10:32:54 AM
to HGVS Nomenclature
Dear nguyenminhducbiotech,

Following current HGVS nomenclature recommendations, the variants should be described as:

1) first variant NC_000007.14:g.55174769C>A NM_005228.5:c.2232C>A p.(Ile744=)
Please note that HGVS nomenclature demands that all variants are described at the genomic level; descriptions at other levels may be added when desired. This variant is separated by three nucleotides from the next variant and there seems to be no reason to combine it with the other variants. Finally, when it is a somatic variant, the description changes to NC_000007.14:g.55174769=/C>A NM_005228.5:c.2232=/C>A.

2) following HGVS nomenclature the next three variants are described as NC_000007.14:g.[55174773_55174774del;55174778_55174785delinsC] NM_005228.5:c.[2236_2237del;2241_2248delinsC] p.(Glu746_Ala750delinsIlePro). Note that variants c.2241_2247del and c.2248G>C must be described together as a "delins" because they are not separated by a nucleotide.
An alternative description is NC_000007.14:g.55174773_55174785delinsATTC NM_005228.5:c.2236_2248delinsATTC p.(Glu746_Ala750delinsIlePro).
Finally, when the variant is somatic, the format for somatic variants will be required.
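The equivalence of the two descriptions under 2), and the correspondence between the genomic and coding positions, can be checked with a small Python sketch. It rests on two assumptions that are not stated in this thread: the window `REF` (taken to be c.2236_2250 of NM_005228.5) was typed in by hand, and the g. to c. conversion uses a constant offset, which only holds while all positions lie within one exon of a plus-strand transcript. `g_to_c` and `apply_edits` are hypothetical helper names.

```python
# Minimal sketch: check that the allele description and the single delins
# produce the same sequence, starting from the genomic positions.
# ASSUMPTION: REF is c.2236_2250 of NM_005228.5, typed in by hand.
REF_C_START = 2236
REF = "GAATTAAGAGAAGCA"

# Offset implied by the pair g.55174769 / c.2232 quoted above.
# ASSUMPTION: constant only within a single exon on the plus strand.
G_MINUS_C = 55174769 - 2232


def g_to_c(g_pos):
    """Convert a genomic position to a coding position using the fixed offset."""
    return g_pos - G_MINUS_C


def apply_edits(ref, ref_start, edits):
    """Apply (first, last, replacement) edits, working from the 3' end back."""
    seq = ref
    for first, last, alt in sorted(edits, reverse=True):
        i, j = first - ref_start, last - ref_start + 1
        seq = seq[:i] + alt + seq[j:]
    return seq


# g.[55174773_55174774del;55174778_55174785delinsC]
allele = [
    (g_to_c(55174773), g_to_c(55174774), ""),
    (g_to_c(55174778), g_to_c(55174785), "C"),
]
# g.55174773_55174785delinsATTC
single = [(g_to_c(55174773), g_to_c(55174785), "ATTC")]

print(g_to_c(55174773), g_to_c(55174785))   # coding positions of the outer bounds
print(apply_edits(REF, REF_C_START, allele))
print(apply_edits(REF, REF_C_START, single))
```

Both calls should return the same string, so the choice between the two forms affects the description only and leaves the resulting sequence unchanged.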

Best regards,

Johan den Dunnen
HUGO HGVS Variant Nomenclature Committee (HVNC)

On Friday, January 30, 2026 at 14:03:53 UTC+1, j.f.j.laros wrote:

j.f.j.laros

Jan 30, 2026, 10:51:45 AM
to HGVS Nomenclature
Dear,

At the risk of stating something you already know, since you seem to have evidence that these variants are in trans, the following syntax can be used to describe both alleles: NM_005228.5:c.[2236_2248delinsATTC];[2248G>C]. This format is also described on the alleles page.
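The difference between the cis and trans notations is purely a matter of bracket placement, which a short Python sketch can make explicit. The function names `cis_allele` and `trans_alleles` are hypothetical and only build strings; they do not validate the variants.

```python
# Minimal sketch: build HGVS allele strings for variants in cis or in trans.
# The helper names are illustrative; no validation of the variants is done.

def cis_allele(reference, variants):
    """Variants on the same allele share one pair of brackets."""
    return f"{reference}:c.[{';'.join(variants)}]"


def trans_alleles(reference, allele_1, allele_2):
    """Variants on different alleles get one pair of brackets per allele."""
    return f"{reference}:c.[{';'.join(allele_1)}];[{';'.join(allele_2)}]"


print(cis_allele("NM_005228.5",
                 ["2232C>A", "2236_2237del", "2241_2247del", "2248G>C"]))
print(trans_alleles("NM_005228.5", ["2236_2248delinsATTC"], ["2248G>C"]))
```

The first call reproduces the cis description given earlier in this thread and the second the trans description above.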


With kind regards,
Jeroen.

Minh-Duc Nguyen

Feb 2, 2026, 4:04:17 AM
to HGVS Nomenclature
Dear Dr. Jeroen and Dr. Johan,

Many thanks for your answers! After discussing with my team, we would like to ask for clarifications regarding:

_ The motivation behind the change in normalisation format in Mutalyzer.
_ Several papers (one example related to my case is PMID: 25179728) report cis co-mutations instead of one large delins event, which has led clinical databases such as ClinVar and OncoKB to list these as separate cis events as well. Our clinicians insist that we follow the clinical databases and past papers for variant interpretation and reporting, while I am trying to follow the HGVS recommendations as closely as possible. What would be the best way to harmonize the clinical references with the recommended HGVS formats?

Best regards,
Minh-Duc

On Friday, January 30, 2026 at 22:51:45 UTC+7, j.f.j.laros wrote:

Johan den Dunnen

Feb 4, 2026, 10:43:15 AM
to HGVS Nomenclature
Dear Minh-Duc,

> The motivation behind the change in normalisation format in Mutalyzer.

I cannot answer this question; you should ask the Mutalyzer team.

> ...What would be the best way to harmonize the clinical references with the recommended HGVS formats?

There is only one international standard to describe variants in DNA, RNA and protein sequences, i.e. HGVS nomenclature. Either you follow the standard or you do not. We have no influence on how variants get reported in publications, especially not when journals neither insist on the use of HGVS nomenclature when reporting variants nor demand that the reported variants are checked using tools like Mutalyzer or VariantValidator.

Best regards,

Johan den Dunnen
HUGO HGVS Variant Nomenclature Committee (HVNC)
On Monday, February 2, 2026 at 10:04:17 UTC+1, nguyenminh...@gmail.com wrote: