Hi Rini,
Would you be able to share some info on what your fusion calls look like? It depends a bit on how complex the annotations are that Dragen provides.
If you’re just looking to do a mapping of RefSeq IDs to Ensembl IDs, you could use one of the mapping files from our variant annotation tool Genome Nexus. It has a recommended Ensembl transcript ID and associated RefSeq ID for each HUGO symbol:
If you have specific breakpoints within transcripts, exon numbers, etc., it might get a bit more hairy. The SV output files and their annotations don’t have a common standard format across fusion calling pipelines afaik, so it’s not that straightforward to write a tool that would work for converting most of them from RefSeq to Ensembl.
That said, if you’re only looking to import the data into cBioPortal, the SV format is pretty flexible, and it’s perhaps fine to import the RefSeq annotations on an Ensembl transcript ID with a matching HUGO symbol even if they’re not identical. Most of the SV annotation fields are plain text, so you could indicate there that the annotation is for RefSeq.
Hope that helps!
Best wishes,
Ino
From:
'Rini Pauly' via cBioPortal for Cancer Genomics Discussion Group <cbiop...@googlegroups.com>
Date: Friday, May 3, 2024 at 10:02 AM
To: cBioPortal for Cancer Genomics Discussion Group <cbiop...@googlegroups.com>
Subject: [EXTERNAL] [cbioportal] Annotating illumina Dragen Fusion calls
Hi, We are using Illumina Dragen for fusion calling. However, they use Ensembl annotations, and all the cBioPortal formats are currently in RefSeq. Is there a way to convert Ensembl to RefSeq within cBioPortal? Thanks, Rini
--
You received this message because you are subscribed to the Google Groups "cBioPortal for Cancer Genomics Discussion Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
cbioportal+...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/cbioportal/f3b44d1e-a59c-45be-8540-92573bd5d37an%40googlegroups.com.
Thank you, I think we would need help with the breakpoints and exon numbers too. I am attaching an example of the fusion format here.
Any help would be appreciated.
Thanks,
Rini
Hi Rini,
Apologies for the delay.
CC’ing Anusha on our end as well, who helped with the conversion of the SV calls on the MSK side.
I would maybe try to add fields iteratively. At a minimum, only Sample_Id, Site1_Hugo_Symbol, Site2_Hugo_Symbol and SV_Status are required. You can e.g. split the #FusionGene column into cBioPortal’s Site1_Hugo_Symbol and Site2_Hugo_Symbol. If these are all somatic calls, the SV_Status would be SOMATIC.
You can find more info here:
https://docs.cbioportal.org/file-formats/#data-file-9
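As a starting point, the minimal conversion described above could be sketched in Python roughly as follows. This is only a sketch under a few assumptions: that the Dragen output is tab-separated, that the #FusionGene values join the two gene symbols with "--" (please check this against your own file), and that one sample ID applies to the whole file. The function names are made up for illustration and are not part of any cBioPortal tool.

```python
import csv
import io

# The four columns named above as the minimum needed for the SV data file.
REQUIRED_COLUMNS = ["Sample_Id", "Site1_Hugo_Symbol", "Site2_Hugo_Symbol", "SV_Status"]


def split_fusion_gene(fusion_gene, separator="--"):
    """Split a '#FusionGene' value into two gene symbols.

    The '--' separator is an assumption about the Dragen output format.
    """
    parts = [part.strip() for part in fusion_gene.split(separator)]
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Cannot split fusion gene value: {fusion_gene!r}")
    return parts[0], parts[1]


def dragen_to_sv_rows(dragen_tsv_text, sample_id, sv_status="SOMATIC"):
    """Turn tab-separated fusion calls into rows with the required SV columns."""
    reader = csv.DictReader(io.StringIO(dragen_tsv_text), delimiter="\t")
    rows = []
    for record in reader:
        site1, site2 = split_fusion_gene(record["#FusionGene"])
        rows.append({
            "Sample_Id": sample_id,
            "Site1_Hugo_Symbol": site1,
            "Site2_Hugo_Symbol": site2,
            "SV_Status": sv_status,
        })
    return rows


def write_sv_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=REQUIRED_COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

You would call dragen_to_sv_rows() once per sample file, concatenate the rows, and write the result with write_sv_tsv() to the study's SV data file.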
Once that loads and you are able to see the Structural Variant tab, I would start adding other fields. The “Event_Info” column is shown in many places and is mostly free text, so you could add any additional info about the event there (e.g. exon x in transcript y). Even if the transcript IDs don’t match the rest of cBioPortal, it might still be useful for users to be able to see this information.
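For the free-text Event_Info field, one possible way to assemble the text is sketched below. The wording of the string, the function names, and the placeholder transcript IDs are all just illustrative assumptions; cBioPortal does not prescribe a format here, so adapt it to whatever is most readable for your users.

```python
def describe_site(gene, transcript=None, exon=None):
    """Describe one side of the fusion, e.g. 'BCR (exon 13, transcript TX_A)'."""
    details = []
    if exon is not None:
        details.append(f"exon {exon}")
    if transcript:
        details.append(f"transcript {transcript}")
    return f"{gene} ({', '.join(details)})" if details else gene


def build_event_info(gene1, gene2, transcript1=None, exon1=None,
                     transcript2=None, exon2=None, annotation_source=None):
    """Build a free-text description for the Event_Info column.

    annotation_source is an optional label (e.g. 'RefSeq' or 'Ensembl')
    so that readers know which transcript set the exon numbers refer to.
    """
    text = (f"{describe_site(gene1, transcript1, exon1)} - "
            f"{describe_site(gene2, transcript2, exon2)} fusion")
    if annotation_source:
        text += f" [{annotation_source} annotation]"
    return text


# Example with placeholder transcript IDs (not real accessions):
print(build_event_info("BCR", "ABL1", "TX_A", 13, "TX_B", 2, "Ensembl"))
```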
If you end up writing a script to do the conversion, we would be very grateful if you could share it and we can help make it available for other users via our curation tools repo: https://github.com/cBioPortal/datahub-study-curation-tools.
Best wishes,
Ino