tags: ['weather', 'hot', 'record', 'april']
However, there is just one quick section on the indexing of JSON objects inside an array:
Additionally, the same technique can be used for fields in embedded objects:
> db.posts.find( { "comments.author" : "julie" } )
{"title" : "How the west was won" ,
"comments" : [{"text" : "great!" , "author" : "sam"},
{"text" : "ok" , "author" : "julie"}],
"_id" : "497ce79f1ca9ca6d3efca325"}
That's if I understand the question.
> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/mongodb-user?hl=en.
>
{a:1, b:1} != {b:1, a:1}
When comparisons are done on a whole document they are basically a binary compare, so field order matters.
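To illustrate why field order matters, here is a minimal sketch in plain JavaScript. It is not MongoDB's comparison code: it only approximates a binary compare by comparing serialized forms, and `sameBinaryForm` is a hypothetical helper name.

```javascript
// Rough model of a binary compare: two documents are "equal" only if their
// serialized forms match, so field order is significant.
// sameBinaryForm is a hypothetical helper, not a MongoDB API.
function sameBinaryForm(docA, docB) {
  // JSON.stringify emits string keys in insertion order.
  return JSON.stringify(docA) === JSON.stringify(docB);
}

const sameOrder = sameBinaryForm({ a: 1, b: 1 }, { a: 1, b: 1 });
const swappedOrder = sameBinaryForm({ a: 1, b: 1 }, { b: 1, a: 1 });
console.log(sameOrder, swappedOrder); // true false
```

The same fields in a different order serialize differently, which is why an exact-document match fails when the order differs.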
On Wed, Feb 23, 2011 at 9:00 AM, Keith Branton <ke...@branton.co.uk> wrote:
> Does your last test work if the order matches? i.e. is it multiple elements
> or arbitrary order that is breaking it?
>
You can't use a whole document index to match fields within that doc.
It would require the match code to decode each document in the index
to do a comparison for just the fields you are looking for; it is
something that could be useful, but it doesn't work that way now. If
you want that behavior you are better off creating a compound index
with just the fields you are concerned about.
> If a query for an exact matching document is used it works and uses the
> index:
> .find({records:{status:200,type:"img/jpg"}}).explain()
>
You can't use a whole document index to match fields within that doc.
It is something that could be useful, but it doesn't work that way now.
Once you have broken it down into k/v then you can just create a
compound index. The reason Eliot suggested the full document index was
since you didn't have a known set of fields.
Nothing can reach into values within an indexed field (like an
embedded document). For example, regex queries that need to check
within a string can't use an index either. It is the same concept.
>>
>> It is something that could be useful, but it doesn't work that way now.
>
> Do you want to open a jira for that or should I? It seems like it's a
> deficiency with some nasty workarounds - either
>
> Create an arbitrary number of compound indexes to speed up all the slow
> queries - very difficult to keep track of since the OP stated that there are
> many possible keys in these documents, and not very scalable because of the
> increase in working set size and degradation in update performance from
> maintaining all these indexes
> Restructure your data to take advantage of what Mongo is able to do today,
> but losing a lot of the intuitiveness of a clean, simple data structure.
>
> Neither particularly appealing imo.
>
> db.test3.save({_id:1, "records" : [ {name:"ralph"}, {age:"old"}, {hair:"brown"}]})
> db.test3.save({_id:2, "records" : [ {name:"scott"}, {age:"old enough"}, {hair:"brown"}]})
> db.test3.save({_id:3, "records" : [ {name:"chris"}, {eyes:"blue"}, {hair:"green"}]})
> db.test3.ensureIndex({records:1})
Now you can search for documents with any number of record elements.
> db.test3.find({records:{hair:"brown"}}) // 2 results
> db.test3.find({records:{$all:[{name:"scott"}, {hair:"brown"}]}}) //just scott
This is very much what Keith suggested, but there is no need to add the
extra k/v fields. Normally when you do the k/v thing you would index
with a compound index on both values, but this is probably simpler and
easier to understand.
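As a rough mental model of why the single-field array elements can be looked up through the index, here is a toy sketch in plain JavaScript. It assumes one index entry per array element, keyed by the whole element; `buildArrayIndex` and `lookup` are hypothetical names, and this is not how MongoDB stores its B-tree.

```javascript
// Toy model of an index on an array field: one index entry per array element.
// The serialized element is the key; the value is the list of matching _ids.
// buildArrayIndex and lookup are hypothetical names for illustration only.
function buildArrayIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const element of doc[field]) {
      const key = JSON.stringify(element);
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

// Exact-element lookup: returns the _ids whose array contains that element.
function lookup(index, element) {
  return index.get(JSON.stringify(element)) || [];
}

// Same sample documents as the test3 example above.
const docs = [
  { _id: 1, records: [{ name: "ralph" }, { age: "old" }, { hair: "brown" }] },
  { _id: 2, records: [{ name: "scott" }, { age: "old enough" }, { hair: "brown" }] },
  { _id: 3, records: [{ name: "chris" }, { eyes: "blue" }, { hair: "green" }] },
];
const recordsIndex = buildArrayIndex(docs, "records");
const brownHaired = lookup(recordsIndex, { hair: "brown" });
console.log(brownHaired); // [ 1, 2 ]
```

Because each element is a complete single-field document, an exact match on the element is enough, and no extra k/v fields are needed.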
On Wed, Feb 23, 2011 at 10:58 AM, Aleksei T <ale...@gmail.com> wrote:
> I would love to use Eliot's suggestion, as it is definitely simple and
> intuitive, but I would need the ability to find documents with a match on an
> arbitrary number of fields in the order that may be different than what's in
> the document. It seems like with that approach I could only match on a
> single field, but if I did multiple in matching or non-matching order, it
> didn't work.
You can match on multiple fields, but the order does matter, and that
can get complicated.
> If the array approach does work with a compound index, that's fine, too, as
> the queries will be generated programmatically, so their "clumsiness" is
> less important than ability to use the index and allow querying by multiple
> elements in random order.
>
The order doesn't matter in the array. The part where the order
mattered is the field order, {a:1, b:1} (the order of a, b in that
doc). Since all of your embedded docs are single-valued, the order is
always the same. If you started to store more than one field in the
array elements, then it would be problematic.
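The distinction between array order and field order can be sketched in plain JavaScript. This is only a model of exact-element matching, not MongoDB's matcher: `matchesAll` is a hypothetical helper, and the sample data is modeled on the documents in this thread.

```javascript
// Toy model of $all on an array of embedded documents: every requested
// element must equal some array element exactly (field order included).
// matchesAll is a hypothetical helper, not MongoDB's matcher.
function matchesAll(array, wanted) {
  const present = new Set(array.map((el) => JSON.stringify(el)));
  return wanted.every((el) => present.has(JSON.stringify(el)));
}

const records = [{ status: 200 }, { url: "yahoo.com" }, { custid: 456 }];

// Array order is irrelevant: both orderings of the criteria match.
const forward = matchesAll(records, [{ status: 200 }, { custid: 456 }]);
const reversed = matchesAll(records, [{ custid: 456 }, { status: 200 }]);

// With two fields per element, field order inside the element matters.
const twoField = [{ status: 200, custid: 456 }];
const sameFieldOrder = matchesAll(twoField, [{ status: 200, custid: 456 }]);
const swappedFieldOrder = matchesAll(twoField, [{ custid: 456, status: 200 }]);

console.log(forward, reversed, sameFieldOrder, swappedFieldOrder); // true true true false
```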
>> r.find({records:{ $all:[ {custid:456},{status:200} ]
>> }})
> { "_id" : ObjectId("4d6574f485904ac21ab43fca"), "bytes" : 10, "id" : 1,
> "records" : [ { "status" : 200 }, { "url" : "yahoo.com" }, { "custid" : 456
> } ] }
> { "_id" : ObjectId("4d6574fa85904ac21ab43fcb"), "bytes" : 10, "id" : 1,
> "records" : [ { "status" : 200 }, { "url" : "cnn.com" }, { "custid" : 456 }
> ] }
>> r.find({records:{ $all:[ {custid:456},{status:200} ] }}).explain()
> {
> "cursor" : "BtreeCursor records_1",
> "nscanned" : 4,
> "nscannedObjects" : 4,
> "n" : 2,
> "millis" : 0,
> "indexBounds" : {
> "records" : [
> [
> {
> "custid" : 456
> },
> {
> "custid" : 456
> }
> ]
> ]
> }
> }
>
> One last question - the explain() statement in the above examples - does
> info in "indexBounds" mean it is using the index? How do you know if the
> query is indeed using an index?
The "cursor" : "BtreeCursor records_1" part means it is using a
BtreeCursor on the index called records_1.
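If the queries are generated programmatically, the same check can be automated. The sketch below inspects an explain() document in the format quoted above. `usesIndex` is a hypothetical helper name, and it assumes that an unindexed scan reports "BasicCursor" in this explain format.

```javascript
// Checks an explain() document in the format quoted above.
// Assumption: an unindexed scan reports "BasicCursor" in this format.
// usesIndex is a hypothetical helper name.
function usesIndex(explainDoc) {
  return typeof explainDoc.cursor === "string" &&
    explainDoc.cursor.startsWith("BtreeCursor");
}

const indexed = usesIndex({ cursor: "BtreeCursor records_1", nscanned: 4, n: 2 });
const unindexed = usesIndex({ cursor: "BasicCursor", nscanned: 4, n: 2 });
console.log(indexed, unindexed); // true false
```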
Yes, and yes; I'm not sure what you mean by works like a compound
index. Since it is a multivalue field it is a little more complicated
and in some ways restrictive. For each value in the array an index
value is created. In order to do an $all it could require scanning the
index for all of the criteria and then taking an intersection, or just
searching for the first term and then analyzing the docs for the rest of
the criteria. I believe ATM mongodb does the latter only.
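The "first term, then filter" strategy described above can be sketched in plain JavaScript. This is a toy model under stated assumptions, not MongoDB internals: `findAll` and the Map-based index are illustrative, and documents 3 and 4 are made-up sample data added so that the filter step has something to reject.

```javascript
// Toy model of the strategy described above: use the index for the first
// $all term only, then check the remaining terms against each candidate.
// findAll and the Map-based index are illustrative, not MongoDB internals.
function findAll(docs, field, terms) {
  // Build a one-entry-per-array-element index: serialized element -> docs.
  const index = new Map();
  for (const doc of docs) {
    for (const element of doc[field]) {
      const key = JSON.stringify(element);
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc);
    }
  }
  const [first, ...rest] = terms;
  // Index lookup for the first term only.
  const candidates = index.get(JSON.stringify(first)) || [];
  // Remaining terms are checked by inspecting each candidate document.
  const matches = candidates.filter((doc) => {
    const present = new Set(doc[field].map((el) => JSON.stringify(el)));
    return rest.every((term) => present.has(JSON.stringify(term)));
  });
  return { scanned: candidates.length, matches };
}

const docs = [
  { id: 1, records: [{ status: 200 }, { url: "yahoo.com" }, { custid: 456 }] },
  { id: 2, records: [{ status: 200 }, { url: "cnn.com" }, { custid: 456 }] },
  { id: 3, records: [{ status: 404 }, { url: "cnn.com" }, { custid: 456 }] },
  { id: 4, records: [{ status: 200 }, { url: "cnn.com" }, { custid: 789 }] },
];
const result = findAll(docs, "records", [{ custid: 456 }, { status: 200 }]);
console.log(result.scanned, result.matches.map((d) => d.id)); // 3 [ 1, 2 ]
```

In this model the number of scanned candidates depends on which term comes first, which is one reason the scanned count can exceed the number of results.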
It is up to the indexing system to choose the best behavior; as I
mentioned, that is the way things currently work, but in the future
the query optimizer could choose a different path. ATM it is rather
simplistic and optimizations will be added.
There are a huge number of improvements/optimizations that can be made.
Sometimes the first step is to get the indexes working at all for certain
use-cases :)
One little problem (and probably the reason we thought of using the array) is that these embedded objects can have different names and values.
For example, these two documents:
{
  "id" : 123,
  "records" : {
    "status" : "200",
    "URL" : "www.cnn.com",
    "customerId" : "12345"
  },
  "bytes" : 2500,
  "requests" : 546
}
vs.
{
  "id" : 456,
  "records" : {
    "contentType" : "img/jpg",
    "status" : "404"
  },
  "bytes" : 45600,
  "requests" : 2300
}
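For documents like these, the array-of-single-field-documents shape discussed in this thread can be produced mechanically. A minimal sketch in plain JavaScript, where `toRecordArray` is a hypothetical helper name:

```javascript
// Turns an embedded document with arbitrary keys into an array of
// single-field documents, the shape used in the test3 example.
// toRecordArray is a hypothetical helper name.
function toRecordArray(records) {
  return Object.entries(records).map(([key, value]) => ({ [key]: value }));
}

const converted = toRecordArray({ contentType: "img/jpg", status: "404" });
console.log(JSON.stringify(converted));
// [{"contentType":"img/jpg"},{"status":"404"}]
```

Since each element then holds exactly one field, the field-order problem inside an element does not arise.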