Aggregation: documents sorted in reverse order?

83 views
Skip to first unread message

Matt Parlane

unread,
Apr 21, 2015, 3:00:16 AM4/21/15
to mongod...@googlegroups.com
Hi all...

I'm trying to do a "dynamic attributes" system, as described by Asya here: http://askasya.com/post/dynamicattributes

I've got pretty far with it, the only problem I'm having is sorting the documents. This is what I have so far:

db.test.save({data:[{f:1,v:'Adam'}]});
db.test.save({data:[{f:1,v:'Bob'}]});
db.test.save({data:[{f:1,v:'Charles'}]});

db.test.aggregate([
  { $match: { 'data.f': 1 } },
  { $unwind: '$data' },
  { $sort: { 'data.v': 1 } },
  { $group: {
      _id: '$_id',
      data: {
        $push: { f: '$data.f', v: '$data.v' }
      }
    }
  }
]);

Obviously "data" is the array of key/value pairs, "f" if the field ID (in this case 1), and "v" is the value.

The first problem is that it seems to sort in reverse order -- the above spits out Charles/Bob/Adam instead of Adam/Bob/Charles. That's not so much an issue because I can just change 1 to -1, but the other issue is that the sorting is inconsistent -- maybe 10% of the time it will be (eg) Bob/Charles/Adam. I suspect it's using insertion order as a clue.

Have I done the sorting correctly? Any other tips?

Thanks,

Matt

Asya Kamsky

unread,
Apr 21, 2015, 8:19:57 AM4/21/15
to mongodb-user

Matt,

What you describe actually sounds like a bug to me...  What server version are you using?

Asya

--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/a644a35d-48da-4d9a-b22b-363c8b754962%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matt Parlane

unread,
Apr 21, 2015, 2:28:19 PM4/21/15
to Asya Kamsky, mongodb-user
On Wed, Apr 22, 2015 at 12:19 AM, Asya Kamsky <as...@mongodb.com> wrote:

> What you describe actually sounds like a bug to me...  What server version are you using?

3.0.2-rc0, I just haven't got round to trying 3.0.2 yet. My dev machine is running 3.0.2-rc0/WT but our production machines are still running 3.0.1/MMAPv1 and I can reproduce it there too.

If it is a bug, it's pretty easy to reproduce -- just run that snippet and observe the ordering. Are you able to reproduce it?

Thanks,

Matt

Matt Parlane

unread,
Apr 22, 2015, 8:12:50 PM4/22/15
to mongod...@googlegroups.com, ma...@webgenius.co.nz, as...@mongodb.com
I thought I'd post a follow-up here in case anyone else came across the same issue.

It turns out it's not a bug, I created SERVER-18163 but it got quickly shot down (fair enough) because grouping makes no guarantees about ordering.

This is what I ended up doing:

db.test.save({data:[{f:1,v:'Adam'},{f:2,v:'Smith'}]});
db.test.save({data:[{f:1,v:'Bob'},{f:2,v:'Jones'}]});
db.test.save({data:[{f:1,v:'Charles'},{f:2,v:'Franklin'}]});

db.test.aggregate([
  { $match: { 'data.f': 1 } },
  { $unwind: '$data' },
  { $project: { sortF: '$data.f', sortV: '$data.v' } },
  { $match: { sortF: 2} },
  { $sort: { sortV: 1 } },
]);

If we say that field 1 is the first name and field 2 if the last name field, this query selects all records and then sorts by the last name field. If we wanted to search on the first name, you would change { $match: { 'data.f': 1 } } to { $match: { data: { $elemMatch: { f: 1, v: /a/ } } } }. That will select anyone whose first name contains "a".

There are two downsides to this approach: the first is that it changes the structure of the output -- the documents you get back will only have the last name field and the _id. This is not a problem for me because although I'm searching thousands of records I'm only displaying 10 at a time, so I can just run the cursor through my own iterator and retrieve the full object using the ID.

The second downside is that any record which does not include the value will be missing from the output. This one is a bigger deal for me -- if I'm sorting by last name, I will probably have some percentage of records which don't have a last name, and I either want those to go first or last, but currently they are missing from the set. One possible solution is to store a null for every missing value, but that will pad out the documents quite a lot when you have lots of fields, which goes against the whole point of this approach.

Hope that helps someone in future, any questions please let me know.

Thanks,

Matt
Reply all
Reply to author
Forward
0 new messages