What does "BLOCKING" mean in "BLOCKING SORT STAGE"

SeungUck Lee

unread,

Jul 22, 2015, 2:55:22 PM7/22/15

to mongodb-dev

Hi.

In MongoDB, lots of peoples said MongoDB have to do "Blocking sort" when MongoDB optimizer can not utilize index ordering.

(Also including AND_HASH stage, they call "Blocking stage")

https://github.com/mongodb/mongo/blob/master/src/mongo/db/query/planner_analysis.cpp#L575

I don't understand why the word("BLOCKING") is included in this stage name (And what "BLOCKING" means?).

I try to find out this on google, But there's only MongoDB source code listed.

Is there anyone who knows about story of "BLOCKING" keyword ?

Thanks.

MarkCallaghan

unread,

Jul 22, 2015, 9:34:51 PM7/22/15

to mongodb-dev, sungu...@gmail.com

I assume this is a blocking operator as used by many DBMS products. All input must be consumed by the sort step before it can produce any output. If the input is a full table or index scan you might not be happy with the response time. If there is an index to provide the order then you might not need a blocking sort.
http://stackoverflow.com/questions/27168905/blocking-sort-operators

SeungUck Lee

unread,

Jul 23, 2015, 2:45:18 PM7/23/15

to mongodb-dev, mdca...@gmail.com

Thanks Mark.

I have assumed "There's some document or database lock might be involved during sorting, Because stage name contains BLOCKING keyword".

Anyway this "BLOCKING" means buffering all result document for sorting in MongoDB server memory.

David Storch

unread,

Jul 23, 2015, 3:38:18 PM7/23/15

to mongodb-dev, mdca...@gmail.com, sungu...@gmail.com

Hi SeungUck,

Mark is exactly right. When we talk about a "blocking" query execution stage, we mean that the algorithm requires retrieval of all the results from the child stage before it can begin returning results. A "blocking query plan" is a plan that includes at least one blocking execution stage. This has nothing to do with locking or latching; that is, when we say a query plan is "blocking", it does not mean that the execution is waiting on a lock manager lock or otherwise sleeping on an operating system wait queue. (In fact, a .find() operation will request an intent shared, or MODE_IS, lock on the collection at the beginning of the operation. Once this lock is granted, there is no further locking and minimal further latching involved in the query engine layer.)

The blocking stages currently implemented in the MongoDB query engine, as you mentioned, are SORT and AND_HASH. SORT does an in-memory sort of the result set, and therefore must buffer the entire result set before sorting it and then returning the results. AND_HASH is hash-based index intersection. It must retrieve all of the results from its first child index scan in order to construct a hash table. The results from the second child index scan are then used to probe this hash table; hits are returned to the parent stage and misses are thrown out.

Non-blocking plans are referred to as "fully pipelined". This means that as soon as the high-level query-running code starts requesting query results in order to copy into the outgoing network message, the matching results can stream through a pipeline of query stages (all the way from the storage-engine access stages at the leaves to the higher-level stages like PROJECTION or LIMIT).

Hope that helps, and happy to answer any further questions.

Best,

Dave

SeungUck Lee

unread,

Jul 23, 2015, 4:05:51 PM7/23/15

to mongodb-dev, mdca...@gmail.com, david....@mongodb.com

Hi Dave.

Really thanks for your detailed explain.

Regards,

Reply all

Reply to author

Forward