A lot of this depends on what exactly your application is doing with or expecting from the missing fields, but in general I would prefer b as it saves a bit of space (on disk and when in ram) in a document to omit a field as opposed to setting it to null. Some kind of helper function, like a getOrElse, should make dealing with missing values easy (or at least reduce it to the null case). The following small example in the shell shows off ways to query missing and null field values:
> db.test.drop()
> db.test.insert({ "a" : null })
> db.test.insert({ })
// find just missing field values
> db.test.find({ "a" : { "$exists" : false } })
{ "_id" : ObjectId("...") }
// find just not missing field values
> db.test.find({ "a" : { "$exists" : true } })
{ "_id" : ObjectId("..."), "a" : null }
// find missing or null field values - a bit tricky!
> db.test.find({ "a" : null })
{ "_id" : ObjectId("..."), "a" : null }
{ "_id" : ObjectId("...") }
// find values that are not missing and not null - inefficient because of $ne
> db.test.find({ "a" : { "$ne" : null } })
// no results
If you are planning on using a
sparse index, make sure you read the docs carefully as, in MongoDB 2.6, a sparse index won't be chosen when it could return incomplete results. For example,
> db.test.find({ "a" : null })
would return 1 result if it used a sparse index on a because there would not be an index entry for the document missing an a field. MongoDB >= 2.6 will not use a sparse index on a unless the sparse index is explicitly hinted:
> db.test.ensureIndex({ "a" : 1 }, { "sparse" : true })
> db.test.count({ "a" : null }) // does not use index
2
> db.test.count({ "a" : null }).hint({ "a" : 1 }) // forced to use index
1
Hopefully this is helpful in guiding you on what to do with missing/null fields. Feel free to follow up with further questions.
-Will