On Friday, May 11, 2012 8:51:07 AM UTC-4, Daniel Galinkin wrote:
> Just to record it here, we have tried another (failed) approach, that
> was to use a binary mapping field.
> Let's use the same example. Suppose we have thre fields, "f1", "f2"
> and "f3", each of them with three possible values, 1, 2 and 3.
> We would then map those values to bits.
> 1=01
> 2=10
> 3=11
> And then, we built a new field in the documents, "b", that was a
> binary string representing the values of the three fields ("f1", "f2",
> "f3"). The first two bits represent the "f1" value, the next two
> represent the "f2" value, and the last two represent the "f3" value.
> A document that has "f1"=1, "f2"=3 and "f3"=2 would have the field
> "b"=011110.
> Our ideia was to index this field, and then we would have a simple
> index on a single field.
> This would work fine if the user always queried for a specific value
> in all of the three fields. However, as we said in the first post, the
> user can also ask for a "anything" value for a specific field, a
> "don't care" value.
> So, if the user asks for documents in which f1=1,f2=1 and f3=2, we
> would just make a query like db.docs.find({b:'010110'}). However, if
> this user then decides that f2 can be anything, we would have to
> search for documents that match a certain bitmask. This operation is
> only possible with a map reduce operation, which is expensive, as we
> would have to scan the whole collection.
> We then tried to expand the possibilities, and query using an $in
> operation. If the user queries for documents in which f1=1, f2 can be
> anything and f3=2, he would actually be querying for documents in
> which "b" is either 010110, 011010 or 011110. However, our actual
> application has 10+ fields, and since the user can set any number of
> fields to 'anything', the number of combinations would get huge, and
> the query would get too expensive again.
> I just listed this hoping to make this problem clearer, and to record
> our attempts to tackle it.
> On May 11, 9:17 am, Daniel Galinkin <danielgalin...@gmail.com> wrote:
> > Thanks for the answer!
> > I just voted for and began watching this ticket.
> > As for the suggestion, I didn't quite understand how this will enable
> > us to query for documents that match a set of criterias, such as f1=1
> > AND f2=3. Also, what's the advantage of using a separate collection
> > and index for this, instead of indexing the original collection?
> > Thanks once more for your trouble.
> > On May 9, 2:55 pm, Kyle Banker <k...@10gen.com> wrote:
> > > First, I'd recommend commenting and voting and adding yourself as a
> watcher
> > > on SERVER-1000.
> > > Another technique to try would be to store each field along with the
> > > document id it references in a separate collection. So, your entries
> would
> > > look like this:
> > > {_id: ObjectId, d_id: ObjectId, k: "f1", v: 1}
> > > Next, build the following compound index:
> > > {k: 1, v: 1, d_id: 1}
> > > Then you can do a query like this:
> > > db.docs.find({$all: [{k: "f1", v: 1}]}, {_id: 0, d_id: 1})
> > > This should do an index-only scan and return a set of document ids
> > > corresponding to your original document. I know this isn't ideal, but
> the
> > > probably the next best strategy until SERVER-1000 is fixed.
> > > On Tuesday, May 8, 2012 5:58:20 PM UTC-4, Daniel Galinkin wrote:
> > > > Hello,
> > > > We have a problem with our application backed by MongoDB, we were
> > > > wondering if you could give us some hints as to how to proceed.
> > > > Our application is as follows: each document has a number of fields
> > > > that can be included in a query issued by the user. When the user
> > > > queries for the documents, he can either choose a specific value for
> a
> > > > field, or say he does not care what the value of this field is.
> > > > For example: suppose we have the fields "f1", "f2" and "f3". Each of
> > > > those can have 3 possible values: 1, 2 or 3. A possible query the
> user
> > > > can make is: find me the documents where f1=2, f2=1 and f3=2.
> Another
> > > > possible query is: find me the documents in which f1=2, f3=3 and the
> > > > value of f2 can be anything.
> > > > We are having some problems indexing this collection for this query.
> > > > We have many fields (10+), so making a compound index with all the
> > > > fields is too expensive, as the index gets way too big, and since we
> > > > do not use every field in the query every time, it is also
> inneficient
> > > > to use.
> > > > We are currently trying to implement a fulltext search-like
> solution,
> > > > in which we created a array field "a" in each document, that
> contains
> > > > "tags" in it representing the values of our fields. For example, a
> > > > document that has f1=1, f2=3 and f3=1 would have this array containg
> > > > the tags ["f1-1", "f2-3", "f3-1"]. Then, our idea was to use the
> $all
> > > > operator based on this. Using the previously mentioned example
> query:
> > > > find me the documents in which f1=2, f3=3 and the value of f2 can be
> > > > anything. This would be translated to {"a" : {$all : ["f1-2",
> > > > "f3-3"]}. We also indexed this field, following the suggestions
> listed
> > > > inhttp://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mongo.
> When
> > > > making this query, using this index, the results were not good.
> Using
> > > > the explain() feature, we saw that the database is going through all
> > > > documents in which f1=2, and searches in them the ones that have
> f3=3.
> > > > For example, if there are 2000 documents with f1=2, and out of
> those
> > > > just 2 also have f3=3, mongo scans all the 2000 documents to return
> > > > just those 2.
> > > > We came across this jira ticket (https://jira.mongodb.org/browse/
> > > > SERVER-1000 <https://jira.mongodb.org/browse/SERVER-1000>) that
> seems to
> > > > be directly related to our problem, so it
> > > > seems mongo still does not support the solution we tried to use.
> > > > Do you guys have a suggestion of how to solve this problem?
> > > > Thanks in advance.