Schema changes - what happens to older documents?


Nik Martin

May 3, 2013, 4:13:51 PM
to mongoo...@googlegroups.com
I'm very new to Mongoose and MongoDB, but proficient in relational SQL databases. I have a project that is very well suited to a NoSQL document model, so I'm using mongo(ose).

How do I handle changes to a document's schema over time?  

In my SQL RDB lingo, I'm referring to things like adding required columns (keys) to tables (collections), possibly migrating a data type or required constraint, etc. The process usually involves lengthy migrations, temporary tables, lots of SQL scripts, application code changes, etc. In node/mongo/mongoose, it appears that I just need to open my model.js file, edit the schema definition, make any app changes to support the new schema, and poof, the MongoDB collection magically changes to support my new schema. What happens to all the documents that used the old schema? How are they accessible now when keys have potentially moved, been renamed, deleted, etc.? I've been all over the Group and StackExchange and there are lots of questions, but no answers. Mongoose is supposed to handle exactly this problem with SQL/RDBMSs, but I can't find out how it happens or what the process is.

Is it magic, and everything "just works"?  
How do I know which schema a document was created with? 
What happens when a constraint is added to a key that wasn't there or is now invalid on older documents?

I'm in a cold sweat over this.

Aaron Heckmann

May 3, 2013, 5:18:23 PM
to mongoo...@googlegroups.com
Hi Nik, great question. 

First, MongoDB doesn't have any concept of a schema and will gladly accept any shape of document in any collection in any db at any time. So it's up to the application layer to handle schema enforcement.

One approach is to gradually migrate documents as the application runs. Application code or mongoose plugins/middleware would handle the case where the document is considered "old" and update the document to the new style. As each document is saved, the collection gets migrated. You might add a virtual field to help sort out accessing different properties based on the document version as well. Keep in mind that documents which are never touched again never get updated. Advantage: this approach doesn't really increase write load, since updates only occur when they'd normally be taking place anyway. Disadvantage: multiple document versions coexist and the migration period is long.
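A minimal sketch of the "upgrade old documents as they are touched" idea, written against plain objects so it runs without a database. The field names (`name`, `firstName`, `lastName`) and both function names are made up for illustration; they are not from this thread or from Mongoose.

```javascript
// Hypothetical schema change: old documents store one `name` string,
// new documents store `firstName` and `lastName` instead.
function isOldStyle(doc) {
  return typeof doc.name === 'string' && doc.firstName === undefined;
}

// Upgrades an old-style document in place; new-style documents pass through.
function upgradeIfOld(doc) {
  if (!isOldStyle(doc)) return doc;
  const parts = doc.name.trim().split(/\s+/);
  doc.firstName = parts[0];
  doc.lastName = parts.slice(1).join(' ');
  delete doc.name;
  return doc;
}

// In a Mongoose app this could be called from save middleware
// (schema.pre('save', ...)); that wiring is left out so the sketch
// stays runnable on plain objects.
const legacy = { _id: 1, name: 'Ada Lovelace' };
const current = { _id: 2, firstName: 'Grace', lastName: 'Hopper' };
console.log(upgradeIfOld(legacy));
console.log(upgradeIfOld(current));
```

Because the upgrade only fires when the old shape is detected, calling it on every save is harmless for documents that are already current.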

Another approach is to write a script that walks the entire collection and updates each document one by one. This approach increases write load for the duration of the script, which may impact production performance. Keep in mind MongoDB doesn't have transactions; write atomicity is at the document level, so plan accordingly. The advantage is a shorter migration period, but realistically, clients may still see both versions of the document if the process takes long enough.
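The script approach could look roughly like the following. This is a sketch under assumptions: `migrateCollection`, `phoneToPhones`, and the `phone`/`phones` fields are hypothetical, and an in-memory array stands in for the collection so the example runs as-is. In a real script, `source` would be whatever iterable of documents your driver or Mongoose version provides, and `save` would write one document back.

```javascript
// Walks every document from `source` (any sync or async iterable) and
// writes back only the ones that `upgrade` actually changed.
// `upgrade(doc)` returns true if it modified the document;
// `save(doc)` persists a single document.
async function migrateCollection(source, upgrade, save) {
  let scanned = 0;
  let migrated = 0;
  for await (const doc of source) {
    scanned += 1;
    if (upgrade(doc)) {
      await save(doc); // one write per document; each write is atomic on its own
      migrated += 1;
    }
  }
  return { scanned, migrated };
}

// Hypothetical change: a single `phone` string becomes a `phones` array.
function phoneToPhones(doc) {
  if (doc.phone === undefined) return false;
  doc.phones = [doc.phone];
  delete doc.phone;
  return true;
}

// In-memory stand-in for a collection.
const docs = [
  { _id: 1, phone: '555-0100' },
  { _id: 2, phones: ['555-0101'] },
];

migrateCollection(docs, phoneToPhones, async () => {})
  .then((stats) => console.log(stats));
```

Since already-migrated documents are skipped, the script can be stopped and rerun, which helps when the walk is throttled to limit write load.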

Adding a version property to your schemas that is incremented when you make schema changes is another approach; it could even be managed through mongoose middleware. This might be overkill for some scenarios, since you can often figure out which document version you are using just from existing or missing properties.
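One way to use such a version property is a chain of small upgrade steps, one per schema change. The sketch below is an illustration only: the `schemaVersion` key, the `migrations` table, and the two example changes are hypothetical, and documents without a version are assumed to be version 1.

```javascript
const CURRENT_VERSION = 3;

// migrations[n] upgrades a document from version n to version n + 1.
const migrations = {
  1: (doc) => { doc.tags = doc.tags || []; },
  2: (doc) => { doc.email = String(doc.email || '').toLowerCase(); },
};

// Applies every pending step in order, then stamps the current version.
// Documents written before the version key existed count as version 1.
function bringForward(doc) {
  let version = doc.schemaVersion || 1;
  while (version < CURRENT_VERSION) {
    migrations[version](doc);
    version += 1;
  }
  doc.schemaVersion = version;
  return doc;
}

console.log(bringForward({ email: 'Nik@Example.com' }));
```

Each step only has to know about the version directly before it, so a document several versions behind is brought forward by running the steps in sequence.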

Whatever approach you choose, it's important to make sure that application code and scheduled jobs that read/write to MongoDB are prepared to handle different versions of the documents.

Another aspect of a modified schema is the introduction or removal of indexes. By default, mongoose sends an ensureIndex command for each index declared in the schema during application startup. While convenient for development, this can be a bad thing in production for large datasets, since index creation may take a long time and, if done in the foreground, will block. Mongoose does specify that indexes be created in the background so the database is not blocked during their creation, which is generally better but makes the creation process take much longer. For production I generally recommend disabling the auto index feature and managing indexes manually. Auto index creation can be disabled by setting the autoIndex schema option to false: http://mongoosejs.com/docs/guide.html#autoIndex As for index removal, mongoose does not remove indexes from the collection; this must be performed manually, though there may be a plugin out there that does this for you. Try searching http://plugins.mongoosejs.com.
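For reference, disabling automatic index creation looks roughly like this. The schema and model names are hypothetical; only the `autoIndex: false` option comes from the advice above. The snippet needs the mongoose package installed but does not open a connection.

```javascript
const mongoose = require('mongoose');

// Hypothetical schema; `patientId` and `Report` are illustrative names.
const reportSchema = new mongoose.Schema(
  { patientId: { type: String, index: true } },
  { autoIndex: false } // do not build declared indexes at application startup
);

const Report = mongoose.model('Report', reportSchema);

// With autoIndex off, the declared indexes are built only when asked for,
// e.g. from a maintenance script run off-peak via Report.ensureIndexes().
// Newer Mongoose releases also provide Report.createIndexes().
```

The index declaration stays in the schema as documentation of intent, while the decision of when to pay the build cost moves to an explicit step.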

Bottom line, migrations in MongoDB are not magical; care and planning are still necessary to perform a successful migration with MongoDB.


--
Aaron


Nik Martin

May 3, 2013, 6:25:11 PM
to mongoo...@googlegroups.com
Aaron,

Thank you for the detailed and informative reply. I am actually GLAD MongoDB is schema-less; that's 90% of why I chose it vs. DynamoDB for our document storage. Our document schema is HUGE, it contains an entire Patient Care Report, and after comparing MongoDB + Mongoose to DynamoDB it was a no-brainer. I'm relieved that, with careful planning, an entire collection can be brought forward fairly easily and lazily without issues. A schema_version key with a default value is a perfect way to know how to handle a specific document, if that's even necessary.

Thanks again, and I look forward to using  Mongoose.  

Nik