Calculating match percentage using new aggregation pipeline set operators


Tam Do

May 3, 2014, 7:23:00 PM
to mongod...@googlegroups.com
Hi,

I am working on a search algorithm implemented using the mongo aggregation pipeline.  I have essentially two sets:

Set A contains a list of unique items e.g., ['A', 'B', 'C', 'D', 'E', 'F', 'G']

Set B contains a unique list of items which may or may not be present in Set A e.g., ['C', 'B', 'G', 'H', 'J', 'K']

Using the $setIntersection operator I can find a list of elements common to sets A and B

{ $project : { C : { $setIntersection : [A, B] } } }

Which should result in C : ['B', 'C', 'G'] (the order of the elements is not guaranteed)

What I would like to do is take the count of C and divide it by the length of A

percent_match = len(C) / len(A)

and return results which are greater than a defined threshold.
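
To make the arithmetic concrete, here is a small plain-JavaScript sketch of the calculation on the example sets, run outside MongoDB. The `threshold` value of 0.4 is a hypothetical cutoff chosen only for illustration; nothing in the post specifies one.

```javascript
// Example sets from the post.
var A = ['A', 'B', 'C', 'D', 'E', 'F', 'G'];
var B = ['C', 'B', 'G', 'H', 'J', 'K'];

// Elements of A that also appear in B (a plain-JS stand-in for $setIntersection).
var inB = new Set(B);
var C = A.filter(function (item) { return inB.has(item); });

// Fraction of A that was matched: count of C divided by length of A.
var percentMatch = C.length / A.length;

// Hypothetical cutoff, chosen only for illustration.
var threshold = 0.4;
var passes = percentMatch > threshold;

console.log(C, percentMatch, passes);
```

With these sets, three of the seven items in A are matched, so the ratio is 3/7 (about 0.43), which is above the illustrative cutoff.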

Is this possible with the aggregation pipeline? If so, I could use some pointers in terms of operators and structure of the pipeline.

Thanks!

--Tam Do

Asya Kamsky

May 10, 2014, 3:48:08 AM
to mongodb-user

You can use the $size operator (new in 2.6) to project the size of each array, then use $divide and $multiply to calculate the percentage from those sizes.

Asya
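
A minimal sketch of how that suggestion might be assembled into a pipeline, not necessarily what the reply had in mind. It assumes each document stores the two sets as array fields named `A` and `B`; the collection name `items` and the 40 percent `threshold` are hypothetical. Note that `$size` raises an error if its argument is not an array, and `$divide` fails on a zero divisor, so documents with a missing or empty `A` would need to be filtered out or guarded with `$cond` first.

```javascript
// Assumption: each document has array fields "A" and "B".
var threshold = 40; // hypothetical cutoff, in percent

var pipeline = [
  // Stage 1: keep A and compute the elements common to A and B.
  { $project: { A: 1, C: { $setIntersection: ['$A', '$B'] } } },
  // Stage 2: percent_match = 100 * size(C) / size(A).
  {
    $project: {
      C: 1,
      percent_match: {
        $multiply: [100, { $divide: [{ $size: '$C' }, { $size: '$A' }] }]
      }
    }
  },
  // Stage 3: keep only documents whose match percentage exceeds the cutoff.
  { $match: { percent_match: { $gt: threshold } } }
];

// In the mongo shell this could be run as, for example:
//   db.items.aggregate(pipeline)   // "items" is a hypothetical collection name
```

The intersection is computed in its own `$project` stage so that the second stage can refer to it as `$C`; the final `$match` then filters on the computed field.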
