Best Practices: Save Empty Fields as null or omit Field completely and Manage missing fields in code?

765 views
Skip to first unread message

Ron Yosipovich

unread,
Jan 29, 2015, 12:34:58 PM1/29/15
to mongod...@googlegroups.com
Given collection that will have 50+ million documents,
and every document will have a maximum number of fields (shown in option a)
What is the best practice for dealing with fields that can be null/sparse?

a)is it better to save every document with the same fields and empty fields as null?

{
  "_id": "54ca5b234d2dfeba4f9ab613",
  "person_id": 1,
  "office_phone": null,
  "person_name": "Ron",
  "lname": " lime",
  "description": "This is where the description is entered",
  "technical_description_derived": null

}


b)Or is it better to leave out the fields if they have no data?

{
  "_id": "54ca5b234d2dfeba4f9ab613",
  "person_id": 1,
  "person_name": "Ron",
  "lname": " lime",
  "description": "This is where the description is entered",
}

What are the considerations in keeping fields with null in terms of mongod ram usage and performance? 
Essentially omitting fields that are empty would create a situation where the application needs to implement a way to check if fields exist.
What considerations should there be at the application level if delegating this "empty check" to code? Is it heavy?
Will omitting fields in the collection if they are empty just delegate the heaviness to the client code layer?

Will Berkeley

unread,
Jan 29, 2015, 1:00:04 PM1/29/15
to mongod...@googlegroups.com
A lot of this depends on what exactly your application is doing with or expecting from the missing fields, but in general I would prefer b as it saves a bit of space (on disk and when in ram) in a document to omit a field as opposed to setting it to null. Some kind of helper function, like a getOrElse, should make dealing with missing values easy (or at least reduce it to the null case). The following small example in the shell shows off ways to query missing and null field values:

> db.test.drop()
> db.test.insert({ "a" : null })
> db.test.insert({ })

// find just missing field values
> db.test.find({ "a" : { "$exists" : false } })
{ "_id" : ObjectId("...") }

// find just not missing field values
> db.test.find({ "a" : { "$exists" : true } })
{ "_id" : ObjectId("..."), "a" : null }

// find missing or null field values - a bit tricky!
> db.test.find({ "a" : null })
{ "_id" : ObjectId("..."), "a" : null }
{ "_id" : ObjectId("...") }

// find values that are not missing and not null - inefficient because of $ne
> db.test.find({ "a" : { "$ne" : null } })
// no results

If you are planning on using a sparse index, make sure you read the docs carefully as, in MongoDB 2.6, a sparse index won't be chosen when it could return incomplete results. For example, 

> db.test.find({ "a" : null })

would return 1 result if it used a sparse index on a because there would not be an index entry for the document missing an a field. MongoDB >= 2.6 will not use a sparse index on a unless the sparse index is explicitly hinted:

> db.test.ensureIndex({ "a" : 1 }, { "sparse" : true })
> db.test.count({ "a" : null })    // does not use index
2
> db.test.count({ "a" : null }).hint({ "a" : 1 })    // forced to use index
1

Hopefully this is helpful in guiding you on what to do with missing/null fields. Feel free to follow up with further questions.

-Will

--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/7a4855f1-b3e8-43bc-bf52-5bbb37a3fd5b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply all
Reply to author
Forward
0 new messages