(This may already be filed, but there are too many index-related tickets to find any one in particular, and the ticket search is not very good.)
Given a collection like this:
db.player = {
name: "Glenn",
scores: {
ping_pong: { points: 2000 },
golf: { points: 100 }
}
}
I want an index on each of the keys of scores, so I can efficiently eg. find and sort on 'scores.ping_pong.points'. However, there are too many keys to create indexes on 'scores.ping_pong', 'scores.golf', and so on--there many be hundreds of possible keys.
It would help if it was possible to create an index on 'scores.$key.points', which would effectively be a compound index on (key name, points), so this type of structure can be indexed without creating an index for every possible key. That way, a single index would work for both find({scores.ping_pong.points: 100}) and find({scores.golf.points: 100}). In the above data, two index entries would be created: one on ('ping_pong', 2000) and the other on ('golf', 100).
This could probably also be used to accelerate $exists. A ({'name': 1, 'scores.$key': 1}) index should allow find({name: 'Glenn', 'scores.ping_pong': {$exists: true}}) to be done efficiently.
A compound key of ('x', 'scores.$key.points', 'y') would expand to ('x', 'key name', 'points', 'y'). Similarly, ('a.$key.b', 'b.$key.c') expands to ('first key name', 'b', 'second key name', 'c').
The workarounds for this are awkward, like representing scores as an array and moving the keys to fields, eg. scores: [{game: 'ping_pong', points: 2000}], so I can index on 'scores.game'. That's painfully unnatural (extra hoops have to be jumped to ensure duplicate entries aren't created; looking up a score for a game is harder, and so on), and prevents covered index searches entirely (since it's now a multikey index). It'd be nice to not need hacks like this.
(Moving scores entries to their own collection isn't ideal, either, since it would make another use case--retrieving all scores for a single user--more expensive. If the user has 100 score entries to look up, and the records are scattered across the collection, that's a lot of disk seeking. That's one reason it's all stored in one document. There's more data in the real entries, so covered-index searches aren't a comfortable workaround--it won't fit in memory.)
--
Glenn Maynard