index issues (parallel arrays and sort without index controversy)

1,392 views
Skip to first unread message

Valentin Kuznetsov

unread,
Jul 29, 2013, 11:23:44 AM7/29/13
to mongod...@googlegroups.com
Hi,
My documents have common key which I indexed, and I always retrieve documents using this index. But most of the times I need to sort retrieved document on additional key. This additional key quite often is an array. But sorting (which is requested by user) can be applied to different keys (whose values are arrays). So on one hand I can't have index on parallel arrays, on another sort can bark when too much value to sort without the index. Here is an example of a document I need to deal with:

{h: value, a:[1,2,3], b:[x,y,z]}

so I have index on h key, but need to sort on either a or b, e.g.

db.collection.find({h:v}).sort({a:DESCENDING})
db.collection.find({h:v}).sort({b:DESCENDING})

If I create two indexes on a and b, I"ll get error: cannot index parallel arrays
while if I don't have index on a and b, occasionally I get error: too much data for sort() with no index

Is it possible to resolve this issue. I know that people may suggest to modify the document structure, but this is not what I'm looking for. The documents should stay as is since they're created dynamically from different streams and that the whole purpose of using Mongo.

I also understand the issue with combinatorics while dealing with cartesian product of the compound keys, but I thought that having not overlapping compound keys where one key is a base type and another is array may be supported.
Thanks,
Valentin.


Asya Kamsky

unread,
Jul 31, 2013, 1:33:59 PM7/31/13
to mongodb-user
Sorting on an array field does not make sense to me. Let's use this example:

doc1: { h: 5, a: [ 1, 5, 100 ] }
doc2: { h: 5, a: [ 6, 11, 43 ] }

I select { h:5 } so I get both documents. Now if I say .sort( { a :
-1 } ) to sort on "a" descending - what does that mean?

Which of doc1 and doc2 should be returned first and which second?
Highest to lowest means doc1 has highest so it should be first but
doc1 also has lowest so ... can you clarify exactly how you would
expect a sort on array field to work? I suspect since it is ambiguous
you may need to adjust your schema to allow sorting that
deterministic.

Asya
> --
> --
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb
>
> ---
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to mongodb-user...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Michael Ryan

unread,
Jul 31, 2013, 1:59:59 PM7/31/13
to mongod...@googlegroups.com
Agreed, if you want to sort on data like this, it would be good to put the value you want to sort on outside said array...  I had a similar issue with case-insensitive string keys...

I added an _meta:{...} property to all my objects in said collection, and copied toLowerCase() strings and other calculated search/sort fields into that object... it worked out pretty well for me.

> Date: Wed, 31 Jul 2013 10:33:59 -0700
> Subject: Re: [mongodb-user] index issues (parallel arrays and sort without index controversy)
> From: as...@10gen.com
> To: mongod...@googlegroups.com

Valentin Kuznetsov

unread,
Aug 3, 2013, 10:33:45 AM8/3/13
to mongod...@googlegroups.com
Asya,

using your example I can create index on both h and a keys without problems, Mongo allows that!!! Here is python code which works just fine:

doc1= { "h": 5, "a": [ 1, 5, 100 ] }
doc2= { "h": 5, "a": [ 6, 11, 43] }
db.col.insert(doc1)
db.col.insert(doc2)
db.col.create_index([("h",-1)])
res=db.col.find({"h":5})
for r in res: print r
{u'a': [1, 5, 100], u'h': 5, u'_id': ObjectId('51fd0e3a3f228d1484520083')}
{u'a': [6, 11, 43], u'h': 5, u'_id': ObjectId('51fd0e3c3f228d1484520084')}

db.col.create_index([("a",-1)])
db.col.index_information()
{u'_id_': {u'key': [(u'_id', 1)], u'v': 1},
 u'a_-1': {u'key': [(u'a', -1)], u'v': 1},
 u'h_-1': {u'key': [(u'h', -1)], u'v': 1}}

res=db.col.find({"h":5}).sort([("a",-1)])
for r in res: print r
{u'a': [1, 5, 100], u'h': 5, u'_id': ObjectId('51fd0e3a3f228d1484520083')}
{u'a': [6, 11, 43], u'h': 5, u'_id': ObjectId('51fd0e3c3f228d1484520084')}

res=db.col.find({"a":{"$gt":0}})
for r in res: print r
{u'a': [1, 5, 100], u'h': 5, u'_id': ObjectId('51fd0e3a3f228d1484520083')}
{u'a': [6, 11, 43], u'h': 5, u'_id': ObjectId('51fd0e3c3f228d1484520084')}

res=db.col.find({"a":{"$gt":50}})
for r in res: print r
{u'a': [1, 5, 100], u'h': 5, u'_id': ObjectId('51fd0e3a3f228d1484520083')}

So, the index on a works just fine and its interpretation is MongoDB internals. Honestly I don't know how it is implemented in MongoDB, but it works. While in my example:

{h: value, a:[1,2,3], b:[x,y,z]}

I can't create index on h, a, and b together and I understand that its combinatoric problem. What I don't understand is why sorting operations are not allowed for the cases which I just discussed above, e.g.

db.collection.find({h:v}).sort({a:DESCENDING})
db.collection.find({h:v}).sort({b:DESCENDING})

In both of this cases I only ask to sort on one array at a time and my understanding that Mongo allows to do so (which I have shown above with working code). But I do understand that indexes should exists a-priory and this is where the problem is (i.e. index on parallel arrays). The remedy seems to me should be creation of intermediate "table" from find results, index it on the field the sort function has been asked to, and yield results back to end-user.

Valentin.

Daniel Coupal

unread,
Aug 3, 2013, 4:07:02 PM8/3/13
to mongod...@googlegroups.com

You can create an index on 'a' or on 'b', and you can create both indexes.
What you can't do is have an array on 'a' and 'b' at the same time ({a:1, b:1}).
That array could create a huge amount of combinations. Every document create len(a) x len(b) entries in the index.
If MongoDB accepts 2 values from the array in a single index, why not accepting 3, 4, 10, ... 100?
I guess you see the combinatorial issue.

Indexes on arrays have a different meaning than regular indexes. The latter have a 1-to-n relationship with the data, while the indexes on arrays have a n-to-n relationship. If you query something like a:2, and a document has two times the value '2', you are still getting the document once in the answer. 

If your values on 'a' and 'b' are limited and they have a relation between each other, you can create tuples then create the index on that tuple. Something like t :[(a1,b1), (a2,b2), ... ], then index on 't'.
Otherwise create 2 separates indexes on 'a' and 'b'.

Daniel Coupal

Asya Kamsky

unread,
Aug 3, 2013, 9:32:11 PM8/3/13
to mongod...@googlegroups.com
Of course mongo allows it - my point is that it does not make sense to sort of a field that has multiple values per document.

Asya

From: Valentin Kuznetsov <vku...@gmail.com>
Date: Sat, 3 Aug 2013 07:33:45 -0700 (PDT)
Subject: Re: [mongodb-user] index issues (parallel arrays and sort without index controversy)

Valentin Kuznetsov

unread,
Aug 4, 2013, 10:28:11 PM8/4/13
to mongod...@googlegroups.com, as...@10gen.com
In this sense, it should not be allowed and proper message should be provided back to the user. Since IT IS allowed, it creates ambiguous interpretation, confusion and as I shown leads to unresolved situation. And, my point was that because sort allows it, it barks at the end if I don't have index when I deal with large number of docs. The correct behavior would be that code should throw exception when sort is applied to array field.

arun....@auw.co.in

unread,
Sep 27, 2016, 4:42:44 PM9/27/16
to mongodb-user
I am also getting the same issue i have not created index yet got error it the time of insertion 

Wan Bachtiar

unread,
Oct 10, 2016, 2:28:56 AM10/10/16
to mongodb-user
 

I am also getting the same issue i have not created index yet got error it the time of insertion 

Hi Arun,

The original question on this thread is 2+ years old, please open a new discussion thread with the following relevant information:

  • MongoDB specific version and storage engine.

  • A description of your issue: i.e. your indexes, your operations, your expected outcome, etc.

  • Provide example documents to help clarify your problem.

  • The error message(s) that you are seeing.

Regards,

Wan.


Arun Yadav

unread,
Oct 12, 2016, 5:50:21 AM10/12/16
to mongod...@googlegroups.com
Hi Wan

Thank you so much for your this kind of help , now my problem has been resolved , that was due to wrong value for it 



Regards
Arun

--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to a topic in the Google Groups "mongodb-user" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mongodb-user/aO9q7mhVcCg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to mongodb-user+unsubscribe@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/4084ed61-0549-4372-ac10-ca2f0c5d6104%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Thanks
Arun Kumar Yadav
Reply all
Reply to author
Forward
This conversation is locked
You cannot reply and perform actions on locked conversations.
0 new messages