Respected Sir/Ma'am,
In our laboratory, we perform next-generation sequencing (NGS) using Illumina's short-read technology. A question has arisen regarding the appropriate HGVS nomenclature for reporting duplication copy number variants (CNVs).
Since our CNV calls are derived from NGS data, we are unable to define exact breakpoints. Instead, we report a range of possible breakpoints, as illustrated in the following example:
chr19:(45409925_45411016)_(45411210_45411789)dup
c.(43+1_44-1)_(236+1_237-1)dup
(Duplication of exon 3)
We have identified two limitations with using the dup suffix in this context:
- Tandem assumption: The term "duplication" implies that the amplified region is in tandem with the original sequence. Using short-read NGS data alone, we cannot definitively confirm whether the duplicated segment is arranged in tandem.
- Copy number ambiguity: The dup suffix does not convey the number of copies of the amplified region, which is a value we are able to derive from our NGS data.
To address these limitations, we propose replacing dup with [n], where n represents the copy number determined from NGS data. An example of this proposed notation is as follows:
chr19:(45409925_45411016)_(45411210_45411789) [3]
c.(43+1_44-1)_(236+1_237-1) [3]
(Duplication of exon 3)
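For illustration only, the two notations could be assembled from the same breakpoint ranges as in the following Python sketch. The function names are hypothetical and do not belong to any HGVS tool or library, and the [n] form is our proposal rather than an established HGVS syntax.

```python
def uncertain_range(start_min, start_max, end_min, end_max):
    """Format an uncertain-breakpoint interval as (a_b)_(c_d)."""
    return f"({start_min}_{start_max})_({end_min}_{end_max})"


def dup_notation(prefix, interval):
    # Style used in our first example: interval followed by the dup suffix.
    return f"{prefix}{interval}dup"


def copy_number_notation(prefix, interval, copies):
    # Proposed style (an assumption, not confirmed HGVS): interval followed by [n].
    return f"{prefix}{interval} [{copies}]"


# Breakpoint ranges taken from the exon 3 example in this message.
genomic = uncertain_range(45409925, 45411016, 45411210, 45411789)
coding = uncertain_range("43+1", "44-1", "236+1", "237-1")

print(dup_notation("chr19:", genomic))
print(copy_number_notation("chr19:", genomic, 3))
print(dup_notation("c.", coding))
print(copy_number_notation("c.", coding, 3))
```

Both forms carry the same breakpoint uncertainty; only the suffix differs, with [3] stating the observed copy number explicitly.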
We would like to know whether this annotation approach is acceptable under current HGVS guidelines.
Regards,
Ketaki Karmalkar
Scientist-III