Hello,
I am working on supporting geospatial operations in MongoDB (Intersects, Contains, etc.) using GPUs. My plan is to move the filter phase (where the condition is checked against the bounding boxes) to the GPU. Using the result of the filter phase (a Boolean array), the refine phase (the usual fetch call stack, see below) is invoked only for the records that satisfy the filter phase.
For this purpose, I want to access all the data in a collection before the rows are actually fetched through the usual fetch calls. Currently, I do this using the cursor available in the mongo::CollectionScan::work() method in src/mongo/db/exec/collection_scan.cpp. On the first call to this method, I read all the record geometries from the collection through the cursor, get the query geometry, and invoke the filter phase on the GPU. After the filter phase, in this call and in all subsequent calls, I invoke the remaining fetch functions only if the current row satisfied the filter phase (i.e. has a 1 in the result array from the filter phase).
However, this method has an obvious overhead: every geometry that satisfies the filter phase is fetched twice, once for the filter phase and once for the refine phase. This makes the query slower when the selectivity is greater than 10%.
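For reference, the filter/refine split described above can be sketched roughly as follows. This is a CPU-only illustration, not my actual GPU code and not MongoDB code: the BBox struct and the filterPhase and refinePhase functions are hypothetical names made up for this example, and the loop in filterPhase stands in for what would be one GPU thread per record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical axis-aligned bounding box (not a MongoDB type).
struct BBox {
    double minX, minY, maxX, maxY;
};

// Filter phase: produce one flag per record, 1 if the record's bounding box
// overlaps the query's bounding box. On a GPU, each iteration of this loop
// would be handled by one thread.
std::vector<std::uint8_t> filterPhase(const std::vector<BBox>& records,
                                      const BBox& query) {
    std::vector<std::uint8_t> mask(records.size(), 0);
    for (std::size_t i = 0; i < records.size(); ++i) {
        const BBox& r = records[i];
        const bool overlaps = r.minX <= query.maxX && query.minX <= r.maxX &&
                              r.minY <= query.maxY && query.minY <= r.maxY;
        mask[i] = overlaps ? 1 : 0;
    }
    return mask;
}

// Refine phase: run the expensive exact test only for records whose flag is 1.
// exactTest is a caller-supplied callable (record index -> bool) standing in
// for the exact geometry check; returns the indices of the matching records.
template <typename ExactTest>
std::vector<std::size_t> refinePhase(const std::vector<std::uint8_t>& mask,
                                     std::size_t recordCount,
                                     ExactTest exactTest) {
    assert(mask.size() == recordCount);  // one flag per record
    std::vector<std::size_t> matches;
    for (std::size_t i = 0; i < recordCount; ++i) {
        if (mask[i] != 0 && exactTest(i)) {
            matches.push_back(i);
        }
    }
    return matches;
}
```

In my setup, the role of exactTest is played by the existing fetch call stack shown below, and the double fetch comes from reading each geometry once to build the input to the filter phase and again when the refine phase runs.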
I would like to know:
1. Is there a faster way of fetching rows than using the cursor? I am currently calling the same functions that the fetch call stack uses to extract geometries from the records returned by the cursor.
2. Is mongo::CollectionScan::work() the right place to fetch the data?
3. Do you have any comments on my approach?
I will be happy to answer your questions and provide more information.
Thank you for your help.
Fetch (called the "refine phase" above) call stack:
#1 0x0000000003215de9 in S2Polygon::Intersects (this=0x7ffff59627a0, b=0x7ffff5963120)
    at src/third_party/s2/s2polygon.cc:418
#2 0x0000000002104068 in mongo::GeometryContainer::intersects (this=0x7ffff5d69fa0, otherPolygon=...)
    at src/mongo/db/geo/geometry_container.cpp:792
#3 0x00000000021029c3 in mongo::GeometryContainer::intersects (this=0x7ffff5d69fa0,
    otherContainer=...) at src/mongo/db/geo/geometry_container.cpp:535
#4 0x00000000021678ff in mongo::GeoMatchExpression::matchesSingleElement (this=0x7ffff5f3cc80, e=...)
    at src/mongo/db/matcher/expression_geo.cpp:349
#5 0x0000000002169cd3 in mongo::LeafMatchExpression::matches (this=0x7ffff5f3cc80,
    doc=0x7ffff7eaea60, details=0x0) at src/mongo/db/matcher/expression_leaf.cpp:73
#6 0x00000000020114c8 in mongo::Filter::passes (wsm=0x7ffff5d52a00, filter=0x7ffff5f3cc80)
    at src/mongo/db/exec/filter.h:157
#7 0x00000000020107ea in mongo::CollectionScan::returnIfMatches (this=0x7ffff5f3cb60,
    member=0x7ffff5d52a00, memberID=1, out=0x7ffff7eaed00)
    at src/mongo/db/exec/collection_scan.cpp:178
#8 0x00000000020105ef in mongo::CollectionScan::work (this=0x7ffff5f3cb60, out=0x7ffff7eaed00)
    at src/mongo/db/exec/collection_scan.cpp:170
#9 0x0000000002308f7f in mongo::PlanExecutor::getNextImpl (this=0x7ffff5ceefe0,
    objOut=0x7ffff7eaede0, dlOut=0x0) at src/mongo/db/query/plan_executor.cpp:390
#10 0x00000000023089ee in mongo::PlanExecutor::getNext (this=0x7ffff5ceefe0, objOut=0x7ffff7eaeff0,
    dlOut=0x0) at src/mongo/db/query/plan_executor.cpp:319
Thanks,
Harshada