DB schema mismatch

279 views
Skip to first unread message

Jong Cheol Jeong

unread,
May 15, 2019, 5:27:07 PM5/15/19
to cBioPortal for Cancer Genomics Discussion Group

I am using cBioPortal v2.2.0 and have installed complete MySQL DB from https://github.com/cBioPortal/datahub/
cBioPortal has been successfully working until I am loading my own data.
When metaImport.py was run, it showed DB Schema version mismatch error as shown below.
I have tried migrate_db.py and also followed cBioPortal updates (https://cbioportal.readthedocs.io/en/latest/Updating-your-cBioPortal-installation.html)
None of them has been worked.

Could anyone help me?

Thank you.



#######################################################################
Overriding Warnings. Importing study now
#######################################################################

Data loading step using /home/jjeong/local/opt/cbioportal-2.2.0/cbioportal/scripts/target/scripts-2.2.0-17-g1f2c6beb8-dirty-SNAPSHOT.jar

log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /cbioportal.log (Permission denied)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
at org.apache.log4j.rolling.RollingFileAppender.activateOptions(RollingFileAppender.java:179)
at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.apache.log4j.Logger.getLogger(Logger.java:104)
at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:262)
at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:108)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1025)
at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:844)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:541)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:292)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:269)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:655)
at org.mskcc.cbio.portal.dao.JdbcUtil.<clinit>(JdbcUtil.java:51)
at org.mskcc.cbio.portal.dao.DaoInfo.setVersion(DaoInfo.java:56)
at org.mskcc.cbio.portal.dao.DaoInfo.checkVersion(DaoInfo.java:83)
at org.mskcc.cbio.portal.util.VersionUtil.main(VersionUtil.java:41)
Checked DB schema version: (expected: 2.8.2) (found: 2.7.3)
Error, probably due to this version of the portal being out of sync with the database. Run the database migration script located at CBIOPORTAL_SRC/core/src/main/scripts/migrate_db.py before continuing.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Error occurred during data loading step. Please fix the problem and run this again to make sure study is completely loaded.
Traceback (most recent call last):
  File "./metaImport.py", line 147, in <module>
    cbioportalImporter.main(args)
  File "/home/jjeong/local/opt/cbioportal-2.2.0/cbioportal/core/src/main/scripts/importer/cbioportalImporter.py", line 447, in main
    check_version(jvm_args)
  File "/home/jjeong/local/opt/cbioportal-2.2.0/cbioportal/core/src/main/scripts/importer/cbioportalImporter.py", line 164, in check_version
    run_java(*args)
  File "/home/jjeong/local/opt/cbioportal-2.2.0/cbioportal/core/src/main/scripts/importer/cbioportal_common.py", line 845, in run_java
    raise RuntimeError('Aborting due to error while executing step.')
RuntimeError: Aborting due to error while executing step.


Robert Sheridan

unread,
May 17, 2019, 4:27:46 PM5/17/19
to cBioPortal for Cancer Genomics Discussion Group
From the output you show, it looks like the migration script (migrate_db.py) failed to properly update your database. When you ran migrate_db.py, what was the output from that script? Did it generate errors? Could you show the command you executed to run the script?

migrate_db.py uses a file with mysql commands called "migration.sql".
The current version of that file can be seen here:
The python script applies each section of migration.sql which is numbered higher than the db_schema_version from your currently installed database.
Installing the current seed database installs a database with db_schema_version 2.7.3 ... this can be seen in the "info" table, which stores the current db_schema_version of the database.
So the migrate_db.py should execute the following sections from migration.sql:
##version: 2.7.4
##version: 2.8.0
##version: 2.8.1
##version: 2.8.2

At the end of each section there is a line which updates the value of db_schema_version in the database table. So it seems that your database was not updated at all (not even one step forward).

My guess is that the migrate_db.py program failed because it did not receive all the required properties. One command line argument you pass to migrate_db.py is a properties file, and the other is the migration.sql script filename. The properties file (perhaps called "portal.properties") should have the following 4 settings filled in with values which are correct for your mysql database:
db.host
db.portal_db_name
db.user
db.password

double check that you have filled in the correct values for these properties, and that the mysql user which you have created has authorities granted to CREATE tables, DROP table, ALTER tables, INSERT records, DELETE records, and to do similar database alterations. This file is then referenced with the command line argument "--properties-file portal.properties".

If all that is correct, make sure you supply the proper path on the command line for migration.sql. This is done with something like "--sql db-scripts/src/main/resources/migration.sql" if your current directory is the root directory of your clone of the cbioportal code repository. If you have set the environment variable "PORTAL_HOME" then maybe you can use "--sql $PORTAL_HOME/db-scripts/src/main/resources/migration.sql"

After running (maybe like) "python $PORTAL_HOME/core/src/main/scripts/migrate_db.py --properties-file $PORTAL_HOME/src/main/resources/portal.properties --sql $PORTAL_HOME/db-scripts/src/main/resources/migration.sql" the database should have been updated to database schema version 2.8.2. You can check that by connecting to your database with the mysql command line tool and then doing "select * from info" ... if the db_schema_version column does not have the value "2.8.2", then the migrate_db.py script failed.

If this does not help, please post a follow up message to this group showing what command you used to execute migrate_db.py and any output that came on the screen.

A separate issue you are seeing is the failure to connect the logger to a logfile:
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /cbioportal.log (Permission denied)

This is due to not updating (or not providing) an appropriate log4j.properties file. Please create a file in $PORTAL_HOME/src/main/resources/log4j.properties (it can be based on $PORTAL_HOME/src/main/resources/log4j.properties.EXAMPLE) and fill in a valid path to a location on your filesystem where the log file can be created. The two properties to set are:
log4j.appender.a.rollingPolicy.FileNamePattern
log4j.appender.a.File

the default directory path in the example file includes ${java.io.tmpdir}, which seems to not be set when you launch your portal. You can either replace this with a hardcoded path to a directory which exists on your filesystem, or you might set this when you start up your java JVM with a command line argument to java such as "java -Djava.io.tmpdir=/path/to/tmpdir ... "

I hope that is helpful


Jong Cheol Jeong

unread,
May 17, 2019, 4:34:56 PM5/17/19
to Robert Sheridan, cBioPortal for Cancer Genomics Discussion Group
Thank you for the reply. 

I found that in migration.sql  v 2.7.3 and the SQL did not match since ‘cancer_study_tags’ already exists, so it requires "DROP TABLE `cancer_study_tags`;” statement. 
There is another syntax error. I think it was in 2.8.1. There was a wrong table name. 

I hope this will be helpful for other people as well.

Again, thank you for your help. 


-- 
You received this message because you are subscribed to the Google Groups "cBioPortal for Cancer Genomics Discussion Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cbioportal+...@googlegroups.com.
To post to this group, send email to cbiop...@googlegroups.com.
Visit this group at https://groups.google.com/group/cbioportal.
To view this discussion on the web visit https://groups.google.com/d/msgid/cbioportal/a3e25011-fc1d-4597-a232-655af1f41e2c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Robert Sheridan

unread,
May 18, 2019, 8:48:10 AM5/18/19
to cBioPortal for Cancer Genomics Discussion Group
Thank you very much for pointing this out. I will test the migration script and correct these errors.

For a long time we have wanted to create automated tests which verify the correctness of the migration.sql file ... sorry that we have not put those in place yet and that errors like this still sneak out and affect the cBioPortal community.

Robert Sheridan

unread,
May 20, 2019, 5:04:39 PM5/20/19
to cBioPortal for Cancer Genomics Discussion Group
Hello. I have tried to reproduce the errors you reported but I was not able to see them.
 
One thing which may be unclear in the documentation is which initial database schema to load into the database. I think the instructions on this page:
are a little out of date. For step 1, the instruction say "Import the database schema (/db-scripts/src/main/resources/cgds.sql)" .. and do not tell you that you should import the version of cgds.sql which matches the seed database that you plan to import.
This is explained better on the datahub documentation page here:
Where it gives you links to both the cgds.sql file to use before importing the seed database, and then the seed database itself.
I have tested using the links in datahub section "Seed database schema 2.7.3" which shows the link to the cgds.sql to use as https://raw.githubusercontent.com/cBioPortal/cbioportal/v2.0.0/db-scripts/src/main/resources/cgds.sql
This is the schema as it existed at the time of the 2.0.0 release of cbioportal. That is the correct database schema for the seed database (schema 2.7.3).
I downloaded the migration.sql from the latest version of the cbioportal code base : (master is currently at this point:) https://github.com/cBioPortal/cbioportal/blob/ea3342206abf90bc04e8315d5c71afffaf313b7f/db-scripts/src/main/resources/migration.sql
I downloaded the migrate_db.py script from the latest version of the cbioportal code base: https://github.com/cBioPortal/cbioportal/blob/ea3342206abf90bc04e8315d5c71afffaf313b7f/core/src/main/scripts/migrate_db.py

Here is a transcript showing the output I saw while creating the database, loading the initial schema (2.7.3), loading the seed database, and then migrating the database to the latest schema version (2.8.2) using the script:
%> mysql -u cbio_user -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 32363
Server version: 5.7.17 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>create database sheridan_test_db;
Query OK, 1 row affected (0.00 sec)

mysql> quit
Bye
%> mysql -u cbio_user -p sheridan_test_db < cgds.sql 
Enter password: 
%> mysql -u cbio_user -p sheridan_test_db < seed-cbioportal_hg19_v2.7.3.sql 
Enter password: 
%> python3 migrate_db.py -p seedportal.properties -s migration.sql 
WARNING: This script will alter your database! Be sure to back up your data before running.
Continue running DB migration? (y/n) y
Running statements for version: 2.7.4
    Executing statement: CREATE TABLE `cancer_study_tags` ( `CANCER_STUDY_ID` int(11) NOT NULL, `TAGS` text NOT NULL, PRIMARY KEY (`CANCER_STUDY_ID`), FOREIGN KEY (`CANCER_STUDY_ID`) REFERENCES `cancer_study` (`CANCER_STUDY_ID`) ON DELETE CASCADE );
    Executing statement: ALTER TABLE `mutation_count_by_keyword` MODIFY COLUMN `KEYWORD` VARCHAR(255);
    Executing statement: UPDATE `info` SET `DB_SCHEMA_VERSION`="2.7.4";
Running statements for version: 2.8.0
    Executing statement: ALTER TABLE `cancer_study` MODIFY COLUMN `PMID` varchar(1024) DEFAULT NULL;
    Executing statement: UPDATE `info` SET `DB_SCHEMA_VERSION`="2.8.0";
Running statements for version: 2.8.1
    Executing statement: ALTER TABLE `mutation` MODIFY COLUMN `TUMOR_SEQ_ALLELE1` TEXT;
    Executing statement: ALTER TABLE `mutation` MODIFY COLUMN `TUMOR_SEQ_ALLELE2` TEXT;
    Executing statement: ALTER TABLE `mutation` MODIFY COLUMN `MATCH_NORM_SEQ_ALLELE1` TEXT;
    Executing statement: ALTER TABLE `mutation` MODIFY COLUMN `MATCH_NORM_SEQ_ALLELE2` TEXT;
    Executing statement: ALTER TABLE `mutation` MODIFY COLUMN `TUMOR_VALIDATION_ALLELE1` TEXT;
    Executing statement: ALTER TABLE `mutation` MODIFY COLUMN `TUMOR_VALIDATION_ALLELE2` TEXT;
    Executing statement: ALTER TABLE `mutation` MODIFY COLUMN `MATCH_NORM_VALIDATION_ALLELE1` TEXT;
    Executing statement: ALTER TABLE `mutation` MODIFY COLUMN `MATCH_NORM_VALIDATION_ALLELE2` TEXT;
    Executing statement: ALTER TABLE `mutation_event` DROP INDEX `CHR`;
    Executing statement: ALTER TABLE `mutation_event` MODIFY COLUMN `TUMOR_SEQ_ALLELE` TEXT;
    Executing statement: ALTER TABLE `mutation_event` MODIFY COLUMN `REFERENCE_ALLELE` TEXT;
    Executing statement: ALTER TABLE `mutation_event` ADD KEY `KEY_MUTATION_EVENT_DETAILS` (`CHR`, `START_POSITION`, `END_POSITION`, `TUMOR_SEQ_ALLELE`(255), `ENTREZ_GENE_ID`, `PROTEIN_CHANGE`, `MUTATION_TYPE`);
    Executing statement: UPDATE `info` SET `DB_SCHEMA_VERSION`="2.8.1";
Running statements for version: 2.8.2
    Executing statement: ALTER TABLE `mutation_event` DROP KEY `KEY_MUTATION_EVENT_DETAILS`;
    Executing statement: ALTER TABLE `mutation_event` ADD KEY `KEY_MUTATION_EVENT_DETAILS` (`CHR`, `START_POSITION`, `END_POSITION`, `TUMOR_SEQ_ALLELE`(240), `ENTREZ_GENE_ID`, `PROTEIN_CHANGE`, `MUTATION_TYPE`);
    Executing statement: UPDATE `info` SET `DB_SCHEMA_VERSION`="2.8.2";
Finished.
%> 

After this was complete, the db_schema_version in the info table showed 2.8.2.

I am guessing that in order to see the table name differences, or the missing "drop table" command, you may have used a version of cgds.sql which was not matching with the 2.7.3 db_schema_version of the seed database ... or that you used a migration.sql version which was not from the head of the master branch. (Perhaps a different branch was checked out in your clone of cbioportal, or the branch selection menu had been switched on the github website to a different branch.

If you could try again, using the versions of the source files I have linked to above and let me know if you still have errors, that would be appreciated. Please also send a transcript of the output that you see from running each step (deleting any passwords or sensitive usernames of course)

Jong Cheol Jeong

unread,
May 21, 2019, 1:39:04 PM5/21/19
to Robert Sheridan, cBioPortal for Cancer Genomics Discussion Group
Thank you for the update. 
Yes, it may be possible that I have used different versions since I have tried couple versions and DBs. 
I have been working on integrating public DB and my own data sets using cBioPortal Public MySQL Exports (http://download.cbioportal.org/mysql-snapshots/mysql-snapshots-toc.html
So, I created new DB ‘cbioportal’ and dump the MySQL exports, and then uploaded my data. 
I haven’t reproduced this process, so I am not sure why it happened. 
I have also tested new GitHub source to check the table name mismatch, but it seems the table name is same as what I updated in my old ‘migration.sql’ file.

Again, thank you for your help. 

--
You received this message because you are subscribed to the Google Groups "cBioPortal for Cancer Genomics Discussion Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cbioportal+...@googlegroups.com.
To post to this group, send email to cbiop...@googlegroups.com.
Visit this group at https://groups.google.com/group/cbioportal.

ole...@thehyve.nl

unread,
May 29, 2019, 9:21:20 AM5/29/19
to cBioPortal for Cancer Genomics Discussion Group
Hi,

It seems that you managed to solve the issue. Am I correct or do you need more help from our side?



Op dinsdag 21 mei 2019 19:39:04 UTC+2 schreef Jong Cheol Jeong:
To unsubscribe from this group and stop receiving emails from it, send an email to cbiop...@googlegroups.com.

Jong Cheol Jeong

unread,
May 29, 2019, 9:35:52 AM5/29/19
to ole...@thehyve.nl, cBioPortal for Cancer Genomics Discussion Group
I have solved issues. 

Thank you. 


To unsubscribe from this group and stop receiving emails from it, send an email to cbioportal+...@googlegroups.com.

To post to this group, send email to cbiop...@googlegroups.com.
Visit this group at https://groups.google.com/group/cbioportal.
Reply all
Reply to author
Forward
0 new messages