[Mifos-developer] Proposal for near zero downtime when deploying new release in production.

Vivek Singh

unread,

Nov 18, 2010, 7:59:19 AM11/18/10

to Mifos software development

[This is to solve delays caused from database perspective]

Please rate it between 1-10 and also add your views what would take it to be 10.

terms

(I know we might be using some of the terms differently, but for reading this email please use the following. Thanks)

schema change: adding/changing/deleting any of the database elements (table, column, index, constraint, view)

base data: data required by Mifos in its tables for it to function correctly

business data: data created by the use and running of the Mifos in production

database code: stored procedures, functions and triggers

data migration: transforming business data from one form to another

database upgrade: schema upgrade + base data change + database code upgrade + data migration

goal

near zero downtime for Mifos release deployment

simpler rollback of release when required

where we are today

We need to stop Mifos web server and batch jobs. Database is upgraded. New version of code is deployed. Web servers and batch jobs are started. Database upgrade can be slow depending on the volume of data and upgrade.
Database upgrade is triggered by starting the web server. This is rather complex approach and couples database upgrade to web server start. If there are cluster of servers accidentally started together then it can corrupt the database as well.
Some of the DataMigration/BaseData is carried out in the context of tomcat runtime using the production domain entities and services. This can go wrong when the release upgrade is not sequential. Using an example. If in release 1.7 we carried out some DataMigration/BaseData change. This would work when release is upgraded from 1.6 to 1.7. But when upgrading from 1.6 to 1.8 directly it might fail. The reason being the code used to perform the upgrade would be for 1.8 release, where these services entities might have changed. Such things can easily fall in the blind spot of testers as well. This can also be significantly slower than plain sql based approach.
Mifos is maintaining its own tool for managing database changes.
Web servers at the startup check the revision number of database against the code's revision.
We defragment the change scripts into categorized scripts for our understanding, new deployments and development environments.

proposal

Use dbdeploy instead of maintaining our own tool. (dbdeploy works based on numbers which means that one needs to bit more to manage script numbers across branches. As an implementation detail I have avoided it. If you feel concerned about it please ask for the detail.)
Start writing undo scripts explained here for every upgrade except data migration and database code.
Do not use Java code to do data migration and base data change. If it cannot be avoided then version control the old runnable version of migration.
Continue doing 5&6 above.
Categorize our database scripts as expansion and contraction scripts explained here (http://exortech.com/blog/2009/02/01/weekly-release-blog-11-zero-downtime-database-deployment/).
Change the high level release steps to a) review database upgrade (both do and undo) scripts b) run expansion scripts and data migration scripts when system is in use, preferably in offhours c) deploy the code and database code (only) d) wait for the release to become stable e) run contraction scripts (if stable) or undo of expansion (if unstable) [Please note that database code is treat in the same way as source code.]
When we get into situations where database operations are less per-formant and have impact with running system then we can take a case by case call. There are some good tips here and here.
It is possible that there might be some issues in application when run against expansion script. For example, if the sql fired to database depends on the order of column (unlikely because of hibernate mostly). I cannot think of any other scenario. If there are some then we can run the unit/integration/functional tests against expansion scripts as well as complete upgrade scripts.

--
Vivek Singh | +91 98452 32929 | http://sites.google.com/site/petmongrels

Udai Gupta

unread,

Nov 18, 2010, 8:56:15 AM11/18/10

to Mifos software development

Hi Vivek,

Nice proposal, I will it 8.

What I liked about this proposal is -
- It explains the problem that we can/would face with the current implementation.
- It divides different type of upgrades (DDL change, DML change, migration)
- 6th point is great, If I am correct, it's trying to establish a formal review of all the upgrades before release. (Which I think we are doing this time informally)

What I think will make it 10.
- The 1st point is trying to reverse http://mifosforge.jira.com/browse/MIFOS-2896, I would prefer using LiquiBase compared dbdeploy.
- I think keeping upgrade/migration code outside mifos might be a good idea but we need more formal approach/details here. i.e. with more example how are we going to do that.
- I am more concerned about DDL changes which are most destructive in most of the cases and more responsible for actual downtime. Specially when it is going to be on large tables with lots of data. I like how facebook upgrade MySQL http://tinyurl.com/32zeb86

Thanks,
Udai

Artur Siekielski

unread,

Nov 18, 2010, 9:01:02 AM11/18/10

to Mifos software development

On 11/18/2010 01:59 PM, Vivek Singh wrote:

proposal

Use dbdeploy instead of maintaining our own tool.

Is it a mature tool? Number of commits and size of sources suggests it is in BETA phase. On issue list there are defects unsolved for months, eg. including a comment in scripts results in exception: http://code.google.com/p/dbdeploy/issues/detail?id=40

Do not use Java code to do data migration and base data change.

If it cannot be avoided then version control the old runnable version of migration.

I think in general it is needed to somehow tag the whole source tree which is needed for executing a given upgrade.

I had such situation when I added phone_number_stripped column to customer_address_detail table, and to Hibernate mapping. It broke a previous upgrade which also (indirectly) used this table, because code was new - with the field added to the Hibernate mapping - but the database schema was old - without the new column.

So not using Java is good because we don't use Hibernate, it's bad because logic (sometimes complicated) must be reimplemented and MySQL's doesn't have rich builtin functions library. BTW, currently DatabaseUpgrader from Mifos doesn't support PL/SQL functions because SQL parsers doesn't support changing a DELIMITER needed for a function definition (I've seen hacks with placing semicolons at the beginning of the line but it seems it works only if DECLARE isn't used ;).

Regards,
Artur

Reply all

Reply to author

Forward