Question regarding pdac_msk_2024 upload


elias

Oct 28, 2025, 10:12:50 AM
to cBioPortal for Cancer Genomics Discussion Group
Hello cBioPortal Team,
We have set up a private cBioPortal instance for the University Medical Center Göttingen, Germany. We want to compare our studies with the pdac_msk_2024 study from the public cBioPortal. We downloaded the data from the cBioPortal website, as this has worked with other studies so far.

With the pdac_msk_2024 study, the validator reports errors about missing values in the structural variant data. The output looks like this:


DEBUG: data_sv.txt: Starting validation of file
WARNING: data_sv.txt: line 1: Missing genomic information. Consider adding the fields: [Site1_Contig, Site1_Ensembl_Transcript_Id, Site1_Entrez_Gene_Id, Site1_Region, Site1_Region_Number, Site2_Contig, Site2_Ensembl_Transcript_Id, Site2_Entrez_Gene_Id, Site2_Region, Site2_Region_Number]
INFO: data_sv.txt: lines [2, 5, 6, (252 more)]: No Entrez gene id or gene symbol provided for site 2. Assuming either the intragenic, deletion, duplication, translocation or inversion variant
WARNING: data_sv.txt: lines [32, 33, 112, (9 more)]: Gene symbol not known to the cBioPortal instance. This record will not be loaded.; values encountered: ['CDKN2AP14ARF', 'CDKN2AP16INK4A']
WARNING: data_sv.txt: lines [32, 33, 112, (8 more)]: All Entrez Gene Ids and Gene Symbols provided for site 1 and site 2 are not known to the cBioPortal instance. This record will not be loaded.;; values encountered: ['CDKN2Ap14ARF', 'CDKN2Ap16INK4A']
WARNING: data_sv.txt: lines [160, 300, 308]: Gene symbol maps to a single Entrez gene id, but is also associated to other genes as an alias. The system will assume the official gene symbol to be the intended one.; value encountered: 'MET'
ERROR: data_sv.txt: lines [271, 272, 273, (42 more)]: No Entrez gene id or gene symbol provided for site 1 and site 2
INFO: data_sv.txt: line 318: No Entrez gene id or gene symbol provided for site 1. Assuming either the intragenic, deletion, duplication, translocation or inversion variant
INFO: data_sv.txt: Validation of file complete
INFO: data_sv.txt: Read 431 lines. Lines with warning: 16. Lines with error: 45

INFO: -: Validation complete


#######################################################################
One or more errors reported above. Please fix your files accordingly
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


We don't think this should happen; a study of this size should be ready to import into other instances. Is there a way to fix this issue, or is it a known problem?

Thanks for your help.
Best regards
Elias Lautensack

Baby Anusha Satravada

Oct 28, 2025, 1:58:55 PM
to cBioPortal for Cancer Genomics Discussion Group

Hi Elias,

Thanks for sharing the validation log. The errors you’re seeing in the data_sv.txt file correspond to rows where the annotation section indicates:

“DIAGNOSTIC INTERPRETATION: NEGATIVE FOR GENE FUSIONS IN THE CLINICALLY VALIDATED PANEL.NEGATIVE FOR GENE FUSIONS IN THE INVESTIGATIONAL PANEL.”

These entries represent negative fusion results, meaning no structural variants were detected for those samples.

To ensure a successful import, please remove the rows containing these “negative for gene fusions” annotations from your data_sv.txt file. 

After removing these rows, the file should validate and load successfully in your local cBioPortal instance.
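For anyone who would rather script the removal than edit the file by hand, here is a minimal Python sketch. It assumes data_sv.txt is a plain tab-delimited text file whose first line is the header, and it drops any data row that contains the text "NEGATIVE FOR GENE FUSIONS" anywhere in the line, so it does not depend on the exact name of the annotation column. The function name and the command-line usage are hypothetical, not part of cBioPortal.

```python
import sys

# Marker text taken from the annotation quoted above; matched case-insensitively.
MARKER = "NEGATIVE FOR GENE FUSIONS"


def drop_negative_fusion_rows(lines, marker=MARKER):
    """Return (kept_lines, removed_count).

    The first line is assumed to be the header and is always kept.
    Any later line containing the marker text is dropped.
    """
    kept, removed = [], 0
    needle = marker.upper()
    for index, line in enumerate(lines):
        if index > 0 and needle in line.upper():
            removed += 1
            continue
        kept.append(line)
    return kept, removed


def main(src, dst):
    # newline="" preserves the original line endings on read and write.
    with open(src, encoding="utf-8", newline="") as handle:
        lines = handle.readlines()
    kept, removed = drop_negative_fusion_rows(lines)
    with open(dst, "w", encoding="utf-8", newline="") as handle:
        handle.writelines(kept)
    print(f"Removed {removed} of {max(len(lines) - 1, 0)} data rows")


# Hypothetical usage: python filter_sv.py data_sv.txt data_sv.filtered.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Writing to a separate output file keeps the original download intact, so the removed-row count can be compared against the error count in the validation log before the filtered file replaces data_sv.txt and the validator is run again.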

Best regards,

Anusha.
