Suitability of MongoDB for hierarchial type queries

207 views
Skip to first unread message

IanC

unread,
Apr 8, 2011, 3:47:01 PM4/8/11
to mongodb-user
I have a particular data manipulation requirement that I have worked
out how to do in a relational DB. However, I'm not too happy with the
speed, so I am investigating MongoDB (which I haven't used, yet).

The best way to describe the query is as follows. Picture the
hierarchical data of the USA: Country, State, County, City. Let's say
a particular vendor can service the whole of California. Another can
perhaps service only Los Angeles. There are potentially hundreds of
thousands of vendors and they all can service from some point(s) in
this hierarchy down. I am not confusing this with Geo - I am using
this to illustrate the need.

Using recursive queries, it is quite simple to get a list of all
vendors who could service a particular user. If he were in say
Pasadena, Los Angeles, California, we would walk up the hierarchy to
get the applicable IDs, then query back down to find the vendors.

I know this can be optimized. Again, this is just a simple query
example.

I know MongoDB is a document store. That suits other needs I have very
well. The question is how well suited is it to the query type I
describe? (I know it doesn't have joins - those are simulated).

I get that this is a "how long is a piece of string" question. I just
want to know if anyone has any experience with MongoDB doing this sort
of thing.

Jared

unread,
Apr 8, 2011, 4:12:03 PM4/8/11
to mongodb-user
One option is to use an array key. You can store the hierarchy as an
array of values (for example ['US','CA','Los Angeles']). Then you can
query against records based on individual elements in that array key

For example:

First, store some documents with the array value representing the
hierarchy

> db.hierarchical.save({ location: ['US','CA','LA'], name: 'foo'} )
> db.hierarchical.save({ location: ['US','CA','SF'], name: 'bar'} )
> db.hierarchical.save({ location: ['US','MA','BOS'], name: 'baz'} )

Make sure we have an index on the location field so we can perform
fast queries against its values

> db.hierarchical.ensureIndex(['location',1])

Find all records in California

> db.hierarchical.find({location: 'CA'})
{ "_id" : ObjectId("4d9f69cbf88aea89d1492c55"), "location" : [ "US",
"CA", "LA" ], "name" : "foo" }
{ "_id" : ObjectId("4d9f69dcf88aea89d1492c56"), "location" : [ "US",
"CA", "SF" ], "name" : "bar" }

Find all records in Massachusetts

> db.hierarchical.find({location: 'MA'})
{ "_id" : ObjectId("4d9f6a21f88aea89d1492c5a"), "location" : [ "US",
"MA", "BOS" ], "name" : "baz" }

Find all records in the US

> db.hierarchical.find({location: 'US'})
{ "_id" : ObjectId("4d9f69cbf88aea89d1492c55"), "location" : [ "US",
"CA", "LA" ], "name" : "foo" }
{ "_id" : ObjectId("4d9f69dcf88aea89d1492c56"), "location" : [ "US",
"CA", "SF" ], "name" : "bar" }
{ "_id" : ObjectId("4d9f6a21f88aea89d1492c5a"), "location" : [ "US",
"MA", "BOS" ], "name" : "baz" }

Note that in this model, your values in the array would need to be
unique. So for example, if you had 'springfield' in different states,
then you would need to do some extra work to differentiate.

> db.hierarchical.save({location:['US','MA','Springfield'], name: 'one' })
> db.hierarchical.save({location:['US','IL','Springfield'], name: 'two' })
> db.hierarchical.find({location: 'Springfield'})
{ "_id" : ObjectId("4d9f6b7cf88aea89d1492c5b"), "location" : [ "US",
"MA", "Springfield" ], "name" : "one" }
{ "_id" : ObjectId("4d9f6b86f88aea89d1492c5c"), "location" : [ "US",
"IL", "Springfield" ], "name" : "two" }

You can overcome this by using the $all operator and specifying more
levels of the hierarchy. For example:

> db.hierarchical.find({location: { $all : ['US','MA','Springfield']} })
{ "_id" : ObjectId("4d9f6b7cf88aea89d1492c5b"), "location" : [ "US",
"MA", "Springfield" ], "name" : "one" }

> db.hierarchical.find({location: { $all : ['US','IL','Springfield']} })
{ "_id" : ObjectId("4d9f6b86f88aea89d1492c5c"), "location" : [ "US",
"IL", "Springfield" ], "name" : "two" }

-j

IanC

unread,
Apr 9, 2011, 12:56:11 AM4/9/11
to mongodb-user
Reply all
Reply to author
Forward
0 new messages