unwind vs reduce

1,305 views
Skip to first unread message

Shiv

unread,
Apr 15, 2017, 4:53:51 PM4/15/17
to mongodb-user
We have new $reduce operator in 3.4 and I'm finding I can replace my existing $unwind calls with $reduce. 

Just wanted to understand the impact. 

1. Is $reduce a better alternative to $unwind ? If yes please explain in terms of space and time complexity.

2. Does $unwind includes sort in its processing ? 

Some references and documentation will be helpful. Appreciate the response. Please let me know if you need specific example.

Rupa Narayanan

unread,
Apr 21, 2017, 4:40:09 PM4/21/17
to mongodb-user
Shiv,

$unwind is an aggregation pipeline stage,  which deconstructs the output by each element in the array and $reduce is an aggregation pipeline operator to combine elements in an array. $reduce needs to be used with an aggregation pipeline stage like $project. For both instances you would need to be cognizant of the aggregation pipeline limitations.

Note that $unwind deconstruct an array input and turns into individual documents. $reduce operator is to process each element in an array through an expression, but is not able to "expand/create" new documents.  They have quite different purpose.
It would be good to understand the structure of your document on how the array is embedded, so could you please share the details with a sample document and how you are using $reduce for your use case, so that I can help provide additional guidance. Additionally, please confirm which MongoDB version you are using with db.version(), is it 3.4?
 
To address your second question, $unwind doesn't include $sort, please refer to this documentation on how to utilize $unwind and $sort together.


Regards,
Rupa

Shiv

unread,
Apr 21, 2017, 5:41:58 PM4/21/17
to mongodb-user
Thank you Rupa for clean explanation. Here is stackoverflow post which I believe demonstrates the use of $reduce and $unwind to find the min item in a collection type.


The post has both answers. Yes my db.version() is 3.4.0.

If you could take a look at both the accepted answer and the other answers and provide your feedback that will be really helpful.

Also, I would be interested to know the Big-0 algorithm space and time complexity for both solutions.

Please let me know if you need more details.

Thanks again.

-Shiv

Rupa Narayanan

unread,
Apr 25, 2017, 7:55:31 PM4/25/17
to mongodb-user
Shiv,

Thanks for sharing the example from Stackoverflow.

In the first instance using $reduce, it compares the current value in the array with the minimum value identified (which is initially set to the first one), as described in the comments the time complexity should O(n), n being the size of the array and the result is stored in the variable $$value, thereby requiring a fixed space O(1).

While using $unwind as described in the second instance, you need to deconstruct the array elements, which would be O(n). Subsequently, need to sort on item_field, once you unwind the array, you can't utilize any available indexes.
In this case, $reduce only executes per document array field items, while $unwind would expand the number of documents to group, i.e. If you unwind 100 documents of 3 array items each, the subsequent stages (sort and group) will end up with 300 documents to process. Thereby increasing the complexity for $unwind to O(n+logn), O(logn) for the sort

Additionally, sort has a couple of limitations - memory and performance that you need to consider, for a bigger array.

Regards,
Rupa


On Saturday, April 15, 2017 at 1:53:51 PM UTC-7, Shiv wrote:
Message has been deleted

Shiv

unread,
Apr 26, 2017, 10:40:19 AM4/26/17
to mongodb-user
Thanks again Rupa for to the point explanation. 

Just one last thing. Would it have matter if we were to use $unwind & $min in the $group instead of $sort and $first ? I believe the Big-0 complexity (O(n+logn) for $unwind, O(n) for $min usage) of this approach. I was just wondering if there is some kind of optimization behind the schema that will reduce it to O(n+logn) for the $unwind part and log n for the min part, same as second instance ? 

Rupa Narayanan

unread,
Apr 27, 2017, 5:37:18 PM4/27/17
to mongodb-user

Shiv,

$min is an option as well and $group has O(n) complexity, so you could achieve the same result. Although the major performance impact for the second instance with $unwind, you will be deconstructing the array, thereby multiplying the number of documents based on the length of 'items' array.
I highly recommend that you run explain on each of these options to understand how the aggregation pipeline processes, each stage and that way you can pick the most efficient query based on your use case.

Regards,
Rupa


On Saturday, April 15, 2017 at 1:53:51 PM UTC-7, Shiv wrote:
Reply all
Reply to author
Forward
0 new messages