distinct and sorting


Dominik Gehl

Oct 3, 2011, 9:01:19 AM
to mongodb-user
Hi,

Currently, the distinct command doesn't seem to support sorting
(http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Distinct).
Is this something that is planned? It would be really nice if the
distinct command supported, in addition to the 'query' filter, a 'sort'
parameter (and why not 'limit' and 'skip' as well).

Dominik

Karl Seguin

Oct 3, 2011, 9:22:35 AM
to mongod...@googlegroups.com
There is a jira open for it:

Aside from limit/offset, though, I'm not sure what advantage sorting in Mongo has over doing it in code (though I agree that once you add limit/offset, sorting in Mongo is critical).


Dominik Gehl

Oct 3, 2011, 9:30:07 AM
to mongod...@googlegroups.com
Independently of the limit/offset part, sorting on fields other than the 'distinct' field would provide something that is currently not possible on the client side. So yes, both together are important, but sorting by itself already has a huge benefit.

Dominik

--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/TvSggMkazioJ.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Karl Seguin

Oct 3, 2011, 9:33:20 AM
to mongod...@googlegroups.com
I guess. Seems like an edge case. Just out of curiosity, why do you want distinct values sorted by a different field? :)

Karl

Dominik Gehl

Oct 3, 2011, 9:38:22 AM
to mongod...@googlegroups.com
One possible example could be email messages and attachments (attachments saved in an array inside the email message document). Then getting the distinct list of attachment names ordered by message date would ask for the distinct on 'attachment name', but with a sort on 'message date'.
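To make the request concrete, here is a minimal sketch in plain JavaScript of the result being asked for, computed over in-memory data rather than by the server. The sample documents, the field names `date` and `attachments`, and the helper `distinctAttachmentsByDate` are all hypothetical. Since one attachment name can appear in several messages, the sketch assumes "ordered by message date" means the date of the most recent message carrying that name, newest first.

```javascript
// Hypothetical sample data: each message embeds an array of attachment names.
const messages = [
  { date: new Date('2011-10-01'), attachments: ['report.pdf', 'logo.png'] },
  { date: new Date('2011-10-03'), attachments: ['notes.txt'] },
  { date: new Date('2011-10-02'), attachments: ['logo.png'] },
];

// Distinct attachment names, ordered by the date of the latest message
// that carries each name (newest first). This ordering is an assumption.
function distinctAttachmentsByDate(msgs) {
  const latest = new Map(); // attachment name -> latest message time (ms)
  for (const msg of msgs) {
    const t = msg.date.getTime();
    for (const name of msg.attachments) {
      if (!latest.has(name) || t > latest.get(name)) {
        latest.set(name, t);
      }
    }
  }
  return [...latest.entries()]
    // Newest first; ties fall back to the name so the order is deterministic.
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([name]) => name);
}

const attachmentNames = distinctAttachmentsByDate(messages);
console.log(attachmentNames);
```

The point is that the sort key (the message date) is not part of the distinct output, so the plain list of distinct values alone is not enough to produce this ordering.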

Dominik


Karl Seguin

Oct 3, 2011, 9:47:45 AM
to mongod...@googlegroups.com
Gotcha. Distinct names sorted by date or by counts makes sense to me now :)

You should upvote the Jira that I sent you and maybe drop a comment with that kind of use case/example.

Karl

Dominik Gehl

Oct 3, 2011, 10:40:21 AM
to mongodb-user
Done :-)

Dominik

Brandon Diamond

Oct 4, 2011, 1:34:18 PM
to mongodb-user
Hey there,

You're right -- we don't currently support a "sort" clause for
distinct(). But there are a few things you can try in the meantime.

First -- noting that mapreduce is currently single threaded and relies
on running javascript code at the server -- you can use a mapreduce to
write a distinct dataset to a temporary collection to be sorted
subsequently. Here's an invocation that simply picks the first
document out of a collision group and drops it into a new collection
(... with a slightly different schema):

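db.runCommand({
  'mapReduce': 'nums',
  // key each document by the field whose values should be distinct
  'map': function() { emit(this.q, this); },
  // keep only the first document seen for each key
  'reduce': function(k, vs) { return vs[0]; },
  'out': { 'replace': 'tmpdistinct' }
})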

Next, you can query this data and sort as you see fit. While there IS
a sort parameter to mapReduce, it won't do what we need: it sorts the
input rather than the output.
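As a rough illustration of the two steps (build the distinct set, then sort it), here is a sketch in plain JavaScript that imitates the map/reduce grouping in memory. Nothing here talks to a server: `simulateMapReduce`, the sample documents, and the field `n` are hypothetical, and `emit` is passed to the map function as an argument instead of being a global as it is on the server. The output documents use the `{ _id, value }` shape that mapReduce writes, which is the "slightly different schema" mentioned above.

```javascript
// Hypothetical in-memory stand-in for the 'nums' collection.
const nums = [
  { _id: 1, q: 'b', n: 30 },
  { _id: 2, q: 'a', n: 20 },
  { _id: 3, q: 'b', n: 10 },
  { _id: 4, q: 'c', n: 5 },
];

// Minimal imitation of map/reduce: collect emitted values per key,
// then reduce each group to a single { _id, value } output document.
function simulateMapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map.call(doc, emit);
  return [...groups].map(([key, values]) => ({
    _id: key,
    // As on the server, reduce is skipped for keys with a single value.
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

// Step 1: one document per distinct value of q (first one wins).
const tmpdistinct = simulateMapReduce(
  nums,
  function (emit) { emit(this.q, this); },
  function (k, vs) { return vs[0]; }
);

// Step 2: sort the distinct set by a field other than the distinct key.
const sorted = [...tmpdistinct].sort((a, b) => a.value.n - b.value.n);
const sortedKeys = sorted.map((d) => d._id);
console.log(sortedKeys);
```

Against a real server, step 2 would be an ordinary query on the output collection, for example `db.tmpdistinct.find().sort({ 'value.n': 1 })`, assuming the original documents carry a field named `n`.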

Note that this is only a good option if you're dealing with a ton of
data. If, on the other hand, you're working with data that'll fit in
memory easily, your best bet is to do the distinct or the sort in your
client code.

Hope this helps,
- Brandon


