Annotating illumina Dragen Fusion calls


Rini Pauly

May 3, 2024, 10:02:45 AM
to cBioPortal for Cancer Genomics Discussion Group
Hi, 
We are using Illumina DRAGEN for fusion calling, but it uses Ensembl annotations, whereas all the cBioPortal formats are currently in RefSeq. Is there a way to convert Ensembl to RefSeq within cBioPortal?
Thanks,
Rini

de Bruijn, Ino

May 4, 2024, 10:08:19 AM
to Rini Pauly, cBioPortal for Cancer Genomics Discussion Group, Li, Xiang

Hi Rini,

 

Would you be able to share some info on what your fusion calls look like? It depends a bit on how complex the annotations are that Dragen provides.

 

If you’re just looking to do a mapping of RefSeq IDs to Ensembl IDs, you could use one of the mapping files from our variant annotation tool Genome Nexus. It has a recommended Ensembl transcript ID and associated RefSeq ID for each Hugo symbol:

 

https://raw.githubusercontent.com/genome-nexus/genome-nexus-importer/master/data/grch37_ensembl92/export/ensembl_biomart_canonical_transcripts_per_hgnc.txt
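
For illustration, a lookup from Ensembl transcript ID to RefSeq ID could be built from a tab-separated mapping file roughly as follows. This is only a sketch: the column names `ensembl_transcript_id` and `refseq_id` and the IDs in the usage example are placeholders, not the real header or content of the Genome Nexus file, so check the file and pass its actual column names.

```python
import csv
import io


def build_transcript_map(tsv_text, ensembl_col, refseq_col):
    """Return {Ensembl transcript ID: RefSeq ID} from tab-separated text.

    ensembl_col and refseq_col are the header names to read; they are
    passed in because the real header of the mapping file is not assumed.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mapping = {}
    for row in reader:
        ensembl_id = (row.get(ensembl_col) or "").strip()
        refseq_id = (row.get(refseq_col) or "").strip()
        if ensembl_id and refseq_id:
            # Drop a version suffix such as ".4" so versioned IDs still match.
            mapping[ensembl_id.split(".")[0]] = refseq_id
    return mapping


def to_refseq(ensembl_id, mapping):
    """Look up a (possibly versioned) Ensembl transcript ID; None if unmapped."""
    return mapping.get(ensembl_id.split(".")[0])


if __name__ == "__main__":
    # Placeholder rows with made-up IDs, only to show the call pattern.
    sample = (
        "hugo\tensembl_transcript_id\trefseq_id\n"
        "GENE_A\tENST00000000001\tNM_000001\n"
        "GENE_B\tENST00000000002\t\n"
    )
    lookup = build_transcript_map(sample, "ensembl_transcript_id", "refseq_id")
    print(to_refseq("ENST00000000001.3", lookup))
```

Transcripts without a RefSeq entry are left out of the lookup, so `to_refseq` returns `None` for them and they can be handled separately.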

 

If you have specific breakpoints within transcripts, exon numbers, etc., it might get a bit more hairy. The SV output files and their annotations don’t have a common standard format across fusion calling pipelines, afaik, so it’s not that straightforward to write a tool that would work for converting most of them from RefSeq to Ensembl.

 

That said, if you’re only looking to import the data into cBioPortal, the SV format is pretty flexible, and it’s perhaps fine to import the RefSeq annotations on an Ensembl transcript ID with a matching Hugo symbol even if they’re not identical. Most of the SV annotation fields are plain text, so you could indicate there that the annotation is for RefSeq.

 

Hope that helps!

 

Best wishes,

Ino

 


Pauly, Rini (NIH/NCI) [C]

May 6, 2024, 4:30:29 PM
to de Bruijn, Ino, cBioPortal for Cancer Genomics Discussion Group, Li, Xiang

Thank you. I think we would need help with the breakpoints and exon numbers too. I am attaching an example of the fusion format here.

Any help would be appreciated.
Thanks,

Rini

 


 

sample_fusion_final_data.txt

Rini Pauly

May 16, 2024, 4:19:09 PM
to cBioPortal for Cancer Genomics Discussion Group
Hi, 
Just checking, do we have any updates on the below query?
Thanks,
Rini

de Bruijn, Ino

May 22, 2024, 9:40:25 AM
to Rini Pauly, cBioPortal for Cancer Genomics Discussion Group, Satravada, Baby

Hi Rini,

 

Apologies for the delay

 

CC’ing Anusha on our end as well, who helped do the conversion of the SV calls on the MSK side.

 

I would try adding fields iteratively. At a minimum, only Sample_Id, Site1_Hugo_Symbol, Site2_Hugo_Symbol and SV_Status are required. You can, e.g., split the #FusionGene column into cBioPortal’s Site1_Hugo_Symbol and Site2_Hugo_Symbol. If these are all somatic calls, the SV_Status would be SOMATIC.
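
As a rough sketch of that first step, the minimal four-column table could be produced like this. It assumes the fusion file is tab-separated, that #FusionGene values look like GENE1--GENE2 (the "--" separator is an assumption, so adjust `sep` to whatever your file uses), and that the sample ID is supplied by you, since it is not taken from the file here. The function names are made up for this example.

```python
import csv
import io

# The four columns named above as the required minimum.
SV_COLUMNS = ["Sample_Id", "Site1_Hugo_Symbol", "Site2_Hugo_Symbol", "SV_Status"]


def dragen_to_sv_rows(dragen_tsv, sample_id, sv_status="SOMATIC", sep="--"):
    """Split the #FusionGene column into the two Hugo symbol columns."""
    reader = csv.DictReader(io.StringIO(dragen_tsv), delimiter="\t")
    rows = []
    for record in reader:
        fusion = (record.get("#FusionGene") or "").strip()
        parts = fusion.split(sep)
        if len(parts) != 2 or not all(parts):
            continue  # skip values that are not exactly GENE1<sep>GENE2
        rows.append({
            "Sample_Id": sample_id,
            "Site1_Hugo_Symbol": parts[0],
            "Site2_Hugo_Symbol": parts[1],
            "SV_Status": sv_status,
        })
    return rows


def write_sv_file(rows):
    """Serialize the rows as tab-separated text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=SV_COLUMNS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Toy input with an extra column that is simply ignored.
    example = "#FusionGene\tScore\nBCR--ABL1\t0.9\n"
    print(write_sv_file(dragen_to_sv_rows(example, "SAMPLE_1")))
```

Rows whose #FusionGene value does not split into exactly two gene symbols are skipped rather than guessed at, so it is worth comparing the input and output row counts.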

 

You can find more info here:

 

https://docs.cbioportal.org/file-formats/#data-file-9

 

Once that loads and you are able to see the Structural Variant tab, I would start adding other fields. The “Event_Info” column is shown in many places and is mostly free text, so you could put any additional info about the event there (e.g. exon x in transcript y). Even if the transcript IDs don’t match the rest of cBioPortal, it might still be useful for users to be able to see this information.
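
One possible way to compose such an Event_Info string is sketched below. The wording of the text and all parameter names are made up for this example, and the exon and transcript values are whatever your fusion caller reports; cBioPortal does not prescribe this layout.

```python
def build_event_info(gene1, gene2, exon1=None, exon2=None,
                     transcript1=None, transcript2=None, source="Ensembl"):
    """Compose a free-text event description for the Event_Info column.

    `source` labels which annotation the transcript IDs come from, so
    readers can tell they may differ from the portal's own transcripts.
    """
    def side(gene, exon, transcript):
        text = gene
        if exon is not None:
            text += f" exon {exon}"
        if transcript:
            text += f" ({source} {transcript})"
        return text

    left = side(gene1, exon1, transcript1)
    right = side(gene2, exon2, transcript2)
    return f"{left} - {right} fusion"


if __name__ == "__main__":
    # Placeholder transcript ID, only to show the resulting text.
    print(build_event_info("BCR", "ABL1", exon1=14, exon2=2,
                           transcript1="ENST00000000001"))
```

Keeping the annotation source in the text makes it clear that the transcript IDs come from the caller's annotation rather than from cBioPortal.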

 

If you end up writing a script to do the conversion, we would be very grateful if you could share it and we can help make it available for other users via our curation tools repo: https://github.com/cBioPortal/datahub-study-curation-tools.

 

Best wishes,

Ino

 
