Hi!
I'm trying to import the hs37d5 reference genome into cBioPortal following this documentation.
However, I deployed cBioPortal using Docker, and the documentation seems to assume a non-Docker deployment (please correct me if I’m wrong).
My Attempts:
1. Using importReferenceGenome.pl
I attempted to imitate the study validation/import process with the following command:
docker compose exec cbioportal importReferenceGenome.pl --ref-genome ./refGenome.txt
But I encountered this error:
OCI runtime exec failed: exec failed: unable to start container process: exec: "importReferenceGenome.pl": executable file not found in $PATH: unknown
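The error only says the executable is not on $PATH, which does not necessarily mean the script is missing from the image. A minimal sketch of how one could search for it is below; locate_script is a hypothetical helper name of my own, not something shipped with cBioPortal, and it relies only on the standard find utility being available.

```shell
# Hypothetical helper (not part of cBioPortal): search a directory tree
# for a script by its file name and print every matching path.
locate_script() {
  # $1 = file name to look for
  # $2 = directory to search (defaults to the filesystem root)
  # Permission-denied noise from find is discarded.
  find "${2:-/}" -type f -name "$1" 2>/dev/null
}

# Example, run from a shell inside the container:
#   locate_script importReferenceGenome.pl
```

The same search can presumably be run from the host as a one-off with docker compose exec cbioportal find / -type f -name importReferenceGenome.pl, assuming find exists in the image.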
2. Direct MySQL Insertion
Since the previous attempt did not work, I tried a suggestion from a Google Group discussion to insert the reference genome directly into the MySQL database. I ran:
The insertion into the database succeeded, but when I tried to validate my data with validateData.py after adding 'reference_genome: hs37d5' to meta_study.txt, I got:
ERROR: meta_study.txt: Unknown reference genome defined. Should be one of ['hg19', 'hg38', 'mm10']; value encountered: 'hs37d5'
To resolve this, I tried restarting the cBioPortal container with docker compose restart and with docker compose down/up, but neither resolved the issue. In addition, the error message seems to say that no other reference genome is accepted. Is this a recent restriction in cBioPortal?
3. Running importReferenceGenome.pl in Bash
In my second attempt I did not import the reference genome with the importReferenceGenome.pl script, and the discussion mentioned above is 5 years old, so I wonder whether that solution no longer applies, especially in a Docker container setting. I therefore went back to the original documentation.
I tried to run importReferenceGenome.pl from a bash shell inside the container:
docker compose cp ./refGenome/refGenome.txt cbioportal:/core/scripts/
docker compose exec cbioportal bash
cd /core/scripts
export PORTAL_HOME=/cbioportal
./importReferenceGenome.pl --ref-genome refGenome.txt
But I got this error:
PORTAL_DATA_HOME Environment Variable is not set. Please set, and try again.
So I tried:
export PORTAL_DATA_HOME=/cbioportal
./importReferenceGenome.pl --ref-genome refGenome.txt
This time the script seemed to run, but it appears to have connection issues. I received:
Reading reference genome from: /core/scripts/refGenome.txt
--> total number of lines: 1
Standard Commons Logging discovery in action with spring-jcl: please remove commons-logging.jar from classpath in order to avoid potential conflicts
09:54:21.523 [main] INFO org.mskcc.cbio.portal.util.GlobalProperties -- Attempting to read properties file: /cbioportal/application.properties
09:54:21.525 [main] INFO org.mskcc.cbio.portal.util.GlobalProperties -- Failed to read properties file: /cbioportal/application.properties
09:54:21.525 [main] INFO org.mskcc.cbio.portal.util.GlobalProperties -- Attempting to read properties file from classpath
09:54:21.526 [main] INFO org.mskcc.cbio.portal.util.GlobalProperties -- Successfully read properties file
09:54:21.527 [main] INFO org.mskcc.cbio.portal.util.GlobalProperties -- Attempting to read properties file: /cbioportal/maven.properties
09:54:21.527 [main] INFO org.mskcc.cbio.portal.util.GlobalProperties -- Failed to read properties file: /cbioportal/maven.properties
09:54:21.527 [main] INFO org.mskcc.cbio.portal.util.GlobalProperties -- Attempting to read properties file from classpath
09:54:21.527 [main] INFO org.mskcc.cbio.portal.util.GlobalProperties -- Successfully read properties file
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
java.sql.SQLException: Cannot create PoolableConnectionFactory (Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.)
at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:633)
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:535)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:711)
at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:159)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:117)
at org.springframework.jdbc.datasource.TransactionAwareDataSourceProxy$TransactionAwareInvocationHandler.invoke(TransactionAwareDataSourceProxy.java:223)
at jdk.proxy1/jdk.proxy1.$Proxy0.prepareStatement(Unknown Source)
at org.mskcc.cbio.portal.dao.DaoReferenceGenome.reCache(DaoReferenceGenome.java:68)
at org.mskcc.cbio.portal.dao.DaoReferenceGenome.<clinit>(DaoReferenceGenome.java:43)
at org.mskcc.cbio.portal.scripts.ImportReferenceGenome.addReferenceGenomesToDB(ImportReferenceGenome.java:101)
at org.mskcc.cbio.portal.scripts.ImportReferenceGenome.importData(ImportReferenceGenome.java:89)
at org.mskcc.cbio.portal.scripts.ImportReferenceGenome.run(ImportReferenceGenome.java:150)
at org.mskcc.cbio.portal.scripts.ConsoleRunnable.runInConsole(ConsoleRunnable.java:145)
at org.mskcc.cbio.portal.scripts.ImportReferenceGenome.main(ImportReferenceGenome.java:179)
Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
There are multiple similar blocks reporting Communications link failure.
On top of this, one of the messages among those blocks states "reference genome added to the database," yet no new entry appeared in the SQL database.
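One thing I notice in the log above is that /cbioportal/application.properties could not be read, so the properties were taken from the classpath instead. My guess, which is only an assumption and not something I have confirmed, is that the database settings used by the script may therefore differ from the ones my running portal uses. A small pre-flight check I could run inside the container before the import is sketched below; preflight is a hypothetical helper of my own, and it only checks the two environment variables the script asked for plus the properties file path that appears in the log.

```shell
# Hypothetical pre-flight check (not part of cBioPortal).
# Reports missing environment variables and an unreadable properties file.
preflight() {
  status=0
  for name in PORTAL_HOME PORTAL_DATA_HOME; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "MISSING: $name is not set"
      status=1
    fi
  done
  # The log shows the script looking for application.properties under PORTAL_HOME.
  if [ -n "${PORTAL_HOME:-}" ] && [ ! -r "$PORTAL_HOME/application.properties" ]; then
    echo "MISSING: $PORTAL_HOME/application.properties is not readable"
    status=1
  fi
  return "$status"
}

# Example: preflight && ./importReferenceGenome.pl --ref-genome refGenome.txt
```

With the settings from my third attempt, I would expect this to report the properties file as unreadable, which matches the "Failed to read properties file" lines in the log. Whether that is the actual cause of the Communications link failure is exactly what I am unsure about.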
Question
How can I properly import a new reference genome into a Docker-deployed cBioPortal? I’m sorry for the lengthy explanation and beginner-level question, but I’d appreciate your guidance!
Thanks in advance!