Hello cBioPortal Team,
We have set up a private cBioPortal instance at the University Medical Center Göttingen, Germany. We want to compare our own studies with the pdac_msk_2024 study from the public cBioPortal. We downloaded the data from the cBioPortal website, as this has worked with other studies so far.
With the pdac_msk_2024 study, the validator reports errors about missing values in the structural variant data. The output looks like this:
DEBUG: data_sv.txt: Starting validation of file
WARNING: data_sv.txt: line 1: Missing genomic information. Consider adding the fields: [Site1_Contig, Site1_Ensembl_Transcript_Id, Site1_Entrez_Gene_Id, Site1_Region, Site1_Region_Number, Site2_Contig, Site2_Ensembl_Transcript_Id, Site2_Entrez_Gene_Id, Site2_Region, Site2_Region_Number]
INFO: data_sv.txt: lines [2, 5, 6, (252 more)]: No Entrez gene id or gene symbol provided for site 2. Assuming either the intragenic, deletion, duplication, translocation or inversion variant
WARNING: data_sv.txt: lines [32, 33, 112, (9 more)]: Gene symbol not known to the cBioPortal instance. This record will not be loaded.; values encountered: ['CDKN2AP14ARF', 'CDKN2AP16INK4A']
WARNING: data_sv.txt: lines [32, 33, 112, (8 more)]: All Entrez Gene Ids and Gene Symbols provided for site 1 and site 2 are not known to the cBioPortal instance. This record will not be loaded.;; values encountered: ['CDKN2Ap14ARF', 'CDKN2Ap16INK4A']
WARNING: data_sv.txt: lines [160, 300, 308]: Gene symbol maps to a single Entrez gene id, but is also associated to other genes as an alias. The system will assume the official gene symbol to be the intended one.; value encountered: 'MET'
ERROR: data_sv.txt: lines [271, 272, 273, (42 more)]: No Entrez gene id or gene symbol provided for site 1 and site 2
INFO: data_sv.txt: line 318: No Entrez gene id or gene symbol provided for site 1. Assuming either the intragenic, deletion, duplication, translocation or inversion variant
INFO: data_sv.txt: Validation of file complete
INFO: data_sv.txt: Read 431 lines. Lines with warning: 16. Lines with error: 45
INFO: -: Validation complete
#######################################################################
One or more errors reported above. Please fix your files accordingly
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
We think that this should not happen and that a study of this size should be ready to import into other instances. Is there a way to fix this issue, or is it a known problem?
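To make the ERROR line easier to reproduce, here is a minimal sketch of how the affected rows could be listed. It is not part of the cBioPortal tooling. It assumes that data_sv.txt is tab-separated, that the gene columns are named Site1_Hugo_Symbol, Site1_Entrez_Gene_Id, Site2_Hugo_Symbol and Site2_Entrez_Gene_Id (the Entrez columns appear to be absent from this file, judging by the first warning), and that an empty cell or "NA" counts as missing. The function name rows_without_genes is our own.

```python
# Sketch only: lists the lines of a structural variant file that carry no
# gene symbol and no Entrez gene id for either site.
# Assumed column names; columns that are absent are treated as empty.
GENE_COLUMNS = (
    "Site1_Hugo_Symbol",
    "Site1_Entrez_Gene_Id",
    "Site2_Hugo_Symbol",
    "Site2_Entrez_Gene_Id",
)

# Assumption: these cell values mean "no value provided".
EMPTY = {"", "NA", "N/A"}


def rows_without_genes(lines):
    """Return the 1-based line numbers of rows with no gene info at all.

    `lines` is any iterable of text lines, for example an open file.
    Lines starting with '#' are skipped but still counted, so the numbers
    refer to physical lines in the file.
    """
    header = None
    hits = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("#"):
            continue
        fields = line.rstrip("\r\n").split("\t")
        if header is None:
            # The first non-comment line holds the column names.
            header = fields
            continue
        row = dict(zip(header, fields))
        if all(row.get(col, "").strip() in EMPTY for col in GENE_COLUMNS):
            hits.append(number)
    return hits
```

Calling rows_without_genes on the opened data_sv.txt should give a list of line numbers that can be compared with the lines named in the ERROR message above, for example to check whether those records have empty gene columns in the downloaded file itself.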
Thanks for your help.
Best regards
Elias Lautensack