Re: [mongodb-user] Using distinct() with limit()

Max Schireson

Aug 1, 2012, 7:26:33 AM
to mongod...@googlegroups.com

Which 20 values would you want?

Often people want, e.g., the most commonly occurring values, which means looking at everything.

Is it really just any arbitrary 20 that you need?

-- Max

On Aug 1, 2012 3:07 AM, "rang3r" <ana...@choister.net> wrote:
Hello.
Please help me with the following issue.
I need to get a limited number of distinct values (result size <= n) for the fields of a collection.
There is a JIRA issue for this (https://jira.mongodb.org/browse/SERVER-2130), but it is unresolved.
How can I get the values I need efficiently?
Please don't suggest running distinct and taking the first n values :)

--
You received this message because you are subscribed to the Google
Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com
To unsubscribe from this group, send email to
mongodb-user...@googlegroups.com
See also the IRC channel -- freenode.net#mongodb

rang3r

Aug 1, 2012, 7:42:36 AM
to mongod...@googlegroups.com
Yes, I need any 10 (or fewer) values for every field in the collection.
It's for debug information.

On Wednesday, August 1, 2012, 15:26:33 UTC+4, Max Schireson wrote:

Max Schireson

Aug 1, 2012, 12:50:33 PM
to mongod...@googlegroups.com

I don't know of a great solution. Part of the challenge is that you may need to look at nearly every value to get 10 distinct ones (imagine there are 12 possible values but 7 of them are very, very rare).

If you can live with sometimes getting fewer than 10, what I'd do in real life is just fetch 100 values (or 1000) and do the distinct client side, stopping once I got to 10. Not sure how fast this needs to be, but you can easily dial between performance and the odds of getting all 10. While it's ugly, that same basic tradeoff exists on any unindexed field even if the server supported it.
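
The sampling idea above could look roughly like the sketch below. It is only an illustration, not a built-in feature: the helper name `sampleDistinct` and its parameters are made up for this example, and `docs` is assumed to be a plain array of documents. In the shell, such an array might come from something like `db.objects.find({}, {type: 1}).limit(1000).toArray()`, where the collection and field names are placeholders.

```javascript
// Hypothetical helper (not part of MongoDB): collect up to `maxDistinct`
// distinct values of `field`, examining at most `sampleSize` documents.
function sampleDistinct(docs, field, maxDistinct, sampleSize) {
    var seen = [];
    var examined = 0;
    for (var doc of docs) {
        // Stop as soon as the sample budget is spent or enough values are found.
        if (examined >= sampleSize || seen.length >= maxDistinct) {
            break;
        }
        examined++;
        var value = doc[field];
        // `!= null` skips both null and missing (undefined) values.
        // indexOf uses strict equality, so this suits primitive values only.
        if (value != null && seen.indexOf(value) === -1) {
            seen.push(value);
        }
    }
    return seen;
}

// Example with in-memory data standing in for a query result.
var exampleDocs = [
    {type: 'a'}, {type: 'b'}, {type: 'a'},
    {type: 'c'}, {type: 'b'}, {type: 'd'}
];
console.log(sampleDistinct(exampleDocs, 'type', 3, 100)); // [ 'a', 'b', 'c' ]
console.log(sampleDistinct(exampleDocs, 'type', 10, 2));  // [ 'a', 'b' ]
```

Raising `sampleSize` improves the odds of reaching `maxDistinct` values at the cost of reading more documents, which is the dial described above.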

Hopefully someone else has a better idea.

-- Max

rang3r

Aug 2, 2012, 5:06:07 AM
to mongod...@googlegroups.com
Thank you for your answers, Max.
I wrote the following function:
function (type, coll) {
    // Collect up to 10 distinct sample values for every top-level field.
    var result = {};
    // `== null` is true for both null and undefined.
    var filter = (type == null) ? {} : {type: type};
    var collection = (coll == null) ? 'objects' : coll;
    db[collection].find(filter).forEach(function (object) {
        for (var field_name in object) {
            var values = result[field_name] || [];
            if (values.length >= 10) {
                continue; // already have enough samples for this field
            }
            var value = object[field_name];
            // Skip null/undefined values and values already recorded.
            // Note: indexOf uses strict equality, so object and array
            // values are never treated as duplicates of each other.
            if (value != null && values.indexOf(value) == -1) {
                values.push(value);
                result[field_name] = values;
            }
        }
    });
    return result;
}
It works as I need.
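
One caveat with the function above is that `indexOf` compares by strict equality, so embedded documents and arrays are never recognized as duplicates. The sketch below shows one possible way around that, assuming that `JSON.stringify` output is an acceptable duplicate key for the values involved. The name `sampleFieldValues` and its parameters are hypothetical, and it runs on a plain array of documents rather than a cursor so it can be tried outside the shell. Shell-specific types such as ObjectId may need a different key function.

```javascript
// Hypothetical variant: up to `maxPerField` distinct values per field,
// using the JSON text of each value as its duplicate-detection key.
function sampleFieldValues(docs, maxPerField) {
    var result = {};   // field name -> array of sampled values
    var seenKeys = {}; // field name -> set of JSON keys already recorded
    docs.forEach(function (doc) {
        Object.keys(doc).forEach(function (field) {
            var value = doc[field];
            if (value == null) {
                return; // skip null and undefined
            }
            var values = result[field] || (result[field] = []);
            var keys = seenKeys[field] || (seenKeys[field] = {});
            if (values.length >= maxPerField) {
                return; // already have enough samples for this field
            }
            // Objects with the same keys in a different order get different
            // JSON text, so they count as different values here.
            var key = JSON.stringify(value);
            if (!Object.prototype.hasOwnProperty.call(keys, key)) {
                keys[key] = true;
                values.push(value);
            }
        });
    });
    return result;
}

// Example with in-memory data: the repeated array ['x'] is recorded once.
var sampleDocs = [{tags: ['x']}, {tags: ['x']}, {tags: ['y'], n: 1}, {n: 1}, {n: '1'}];
console.log(JSON.stringify(sampleFieldValues(sampleDocs, 10)));
// {"tags":[["x"],["y"]],"n":[1,"1"]}
```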

Max Schireson

Aug 2, 2012, 11:58:58 AM
to mongod...@googlegroups.com

Great! Glad it's working, and sorry it's not built in yet...

-- Max