missing miRNA genes prevent imports

Katsap Smazhenyi

Aug 2, 2022, 10:55:40 AM
to cBioPortal for Cancer Genomics Discussion Group
Hi,

I am trying to import the datahub data into a cBioPortal instance deployed via Docker. Some of the imports fail, all because of errors related to data_mirna.txt, where every line is skipped:

Reading data from:  /study/datahub/public/pancan_pcawg_2020/data_mirna.txt
Recaching...
Finished recaching...
--> profile id:  154
--> profile name:  miRNA expression (UQ normalized)
--> genetic alteration type:  MRNA_EXPRESSION
--> total number of samples: 749
--> total number of data lines:  1864
--> records inserted into `sample_profile` table: 749
--> total number of data entries skipped (see table below):  1864
org.mskcc.cbio.portal.dao.DaoException: Something has gone wrong!  I did not save any records to the database!
at org.mskcc.cbio.portal.scripts.ImportTabDelimData.importData(ImportTabDelimData.java:307)
at org.mskcc.cbio.portal.scripts.ImportProfileData.run(ImportProfileData.java:125)
at org.mskcc.cbio.portal.scripts.ConsoleRunnable.runInConsole(ConsoleRunnable.java:145)
at org.mskcc.cbio.portal.scripts.ImportProfileData.main(ImportProfileData.java:150)


Indeed, our cBioPortal database does not contain any of the genes or gene aliases listed in the problematic data_mirna.txt files. I am curious as to why the number of miRNA genes has decreased so drastically in recent seedDB files:

katsap@machine:~$ cat seed-cbioportal_hg19_v2.1.0.sql | tr ',' "\n" | grep "hsa-mir" | wc -l
894
katsap@machine:~$ cat seed-cbioportal_hg19_v2.4.0.sql | tr ',' "\n" | grep "hsa-mir" | wc -l
894
katsap@machine:~$ cat seed-cbioportal_hg19_v2.7.2.sql | tr ',' "\n" | grep "hsa-mir" | wc -l
882
katsap@machine:~$ cat seed-cbioportal_hg19_v2.7.3.sql | tr ',' "\n" | grep "hsa-mir" | wc -l
882
katsap@machine:~$ cat seed-cbioportal_hg19_v2.12.8.sql | tr ',' "\n" | grep "hsa-mir" | wc -l
6
katsap@machine:~$ cat seed-cbioportal_hg19_v2.12.12.sql | tr ',' "\n" | grep "hsa-mir" | wc -l
6
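
The same count can be scripted over every seed dump at once. This is only a minimal sketch: the function name count_mirna and the glob pattern are assumptions for illustration, and the dumps are assumed to sit uncompressed in the current directory.

```shell
#!/usr/bin/env bash
# Count comma-separated fields that mention "hsa-mir" in one seed SQL dump.
# Mirrors the pipeline above: split on commas, then count the matching lines.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_mirna() {
    tr ',' '\n' < "$1" | grep -c "hsa-mir" || true
}

# Print one "file<TAB>count" row per seed dump found in the current directory.
# The glob is an assumption about how and where the dumps are stored.
for seed in seed-cbioportal_hg19_v*.sql; do
    [ -e "$seed" ] || continue   # skip the literal pattern when nothing matches
    printf '%s\t%s\n' "$seed" "$(count_mirna "$seed")"
done
```

Using grep -c instead of grep piped into wc -l gives the same number of matching lines, without the padding that some wc implementations add.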

Is this expected?

Srivatsan V

Oct 16, 2023, 9:42:22 AM
to cBioPortal for Cancer Genomics Discussion Group
Hi, I'm having the same issue while trying to import pancan_pcawg_2020. 
cbioportal version: 5.3.19
script to download the seedDB files: wget -O cgds.sql "https://raw.githubusercontent.com/cBioPortal/cbioportal/v5.3.6/db-scripts/src/main/resources/cgds.sql" && wget -O seed.sql.gz "https://github.com/cBioPortal/datahub/raw/master/seedDB/seedDB_hg19_hg38_archive/seed-cbioportal_hg19_hg38_v2.12.14.sql.gz"

Is this issue expected with this version of the seedDB? If so, please advise on how to resolve it.

Thanks,
Srivatsan V

Matthijs Pon

Oct 16, 2023, 9:45:08 AM
to Srivatsan V, cBioPortal for Cancer Genomics Discussion Group
Hi Srivatsan,

Thanks for reaching out! In order to load miRNA data into the cBioPortal database, you need to manually import the microRNA genes. The process to load miRNAs is described here under point 4.
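
To see which identifiers are affected, before and after loading the miRNA genes, something like the following could help. It is only a sketch under assumptions: the function name missing_genes and the file known_genes.txt are hypothetical, the data file is assumed to carry the gene identifier in its first tab-delimited column with a single header line, and the case-insensitive comparison is a choice made here rather than a statement about how the importer matches genes.

```shell
#!/usr/bin/env bash
# List identifiers from a data file that are absent from a list of known genes.
# $1: tab-delimited data file, identifier in column 1, header on line 1 (assumed layout)
# $2: text file with one known gene symbol or alias per line
#     (hypothetical export from the database)
missing_genes() {
    # Drop the header, keep column 1, de-duplicate, then print only the lines
    # that match no entry of $2 exactly (-x), as fixed strings (-F), ignoring case (-i).
    tail -n +2 "$1" | cut -f1 | sort -u | grep -vixFf "$2" || true
}

# Hypothetical usage:
# missing_genes data_mirna.txt known_genes.txt
```

An empty result would suggest that every identifier in the file is known to the database; a long list points at genes that still need to be loaded.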

I hope this answers your question. If you have any further questions, please hit reply all so that our continued conversation is captured by the cBioPortal Google Group.

With kind regards, Matthijs Pon

Data Engineer cBioPortal


E matt...@thehyve.nl

T +31 30 700 9713

W thehyve.nl

    



Matthijs Pon

Oct 17, 2023, 2:49:32 AM
to Srivatsan V, cBioPortal for Cancer Genomics Discussion Group
Glad I could help!

With kind regards, Matthijs Pon

Data Engineer cBioPortal




Srivatsan V

Oct 17, 2023, 2:50:22 AM
to Matthijs Pon, cBioPortal for Cancer Genomics Discussion Group
Hi Matthijs, this worked. I had missed this part because I followed the steps in the deploy with docker section. Thanks a lot!

Thanks,
Srivatsan V
