Will sharding speed up aggregation pipeline queries?


Alex Paransky

Feb 10, 2016, 4:45:59 PM
to mongodb-user
Group,

We are running a 3-member replica set, with our aggregations going against SECONDARY members and proper indexes in place to avoid full collection scans. At times we are aggregating over millions of documents, and things are starting to run a bit slow. If we split our collections across N shards (using a hashed shard key), each shard will effectively hold 1/N of the documents.

Q1: Will this help with aggregations? My understanding is that, effectively, N queries will run across the N shards in parallel and then be combined into a single result by mongos.
Q2: Will each shard actually perform the full aggregation (as if it were the only one returning data), with the results of the N aggregations from the N shards then combined into a single result set by mongos?

In this particular case, I don't want to isolate data that is commonly read together on a single shard, but rather spread it across multiple shards to get parallel performance.

Is my understanding of how sharding helps with queries against a large data set correct?

Thanks.
-AP_

Asya Kamsky

Feb 11, 2016, 11:50:18 PM
to mongodb-user
Alex,

In general, there are a couple of things to consider here, sort of working against each other.

On one hand, yes, having each shard process only 1/Nth of the data means more of the work happens in parallel. However, depending on the exact pipeline, only part of it may be handled on each shard: certain stages must always be finished in a single place ($group and $sort, for example, each have to produce a single "unified" result).
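
For example (a minimal sketch, assuming the collection were already sharded; explain output field names vary by server version, so "splitPipeline", "shardsPart" and "mergerPart" may look different on 3.0), you can explain an aggregation through mongos to see which part of the pipeline each shard runs in parallel and which part has to be finished in a single place:

// Run through mongos against the (hypothetically) sharded collection.
db.getSiblingDB("cdc").getCollection("events-qos-timeupdate").aggregate(
  [
    { "$match": { "vd.ple.vd.acc": "AccountName" } },
    { "$group": { "_id": "$vd.ple.vd.userid", "n": { "$sum": 1 } } }
  ],
  { "explain": true }
)
// In the output, the shard-side part typically contains the $match plus a partial
// $group that each shard computes over its own documents; the merger part then
// combines those partial results into the final totals in a single place.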

That said, there may be other ways to speed up your aggregation. Can you provide some details about the pipeline, and also which version you are using?

Asya



--
Asya Kamsky
Lead Product Manager
MongoDB
Download MongoDB - mongodb.org/downloads
Free MongoDB Monitoring - cloud.mongodb.com
Free Online Education - university.mongodb.com
Get Involved - mongodb.org/community
We're Hiring! - https://www.mongodb.com/careers

Alex Paransky

Feb 12, 2016, 10:41:39 AM
to mongodb-user
Asya,

The query we are performing aggregates just over 3 million documents. Our MongoDB (3.0.8) deployment runs in AWS on EBS-optimized instances with Enhanced Networking enabled.

Using r3.8xlarge machines (32 vCPUs, 244 GiB of memory) we get the following performance:

1) Duration:151202 milliseconds
2) Duration:42559 milliseconds
3) Duration:42318 milliseconds
4) Duration:42656 milliseconds
5) Duration:42270 milliseconds
6) Duration:42298 milliseconds
7) Duration:43097 milliseconds
8) Duration:42425 milliseconds
9) Duration:42419 milliseconds

The first run is obviously the data being loaded into the cache; the rest of the runs, I presume, are served from the cache. What's not clear to me is why the query still takes about 43 seconds when the data is already in memory. While the data is being read from the cache, we don't see any resource such as IOPS, memory, or network at its maximum, and we don't see CPU spikes while it is being processed either. In fact, things look quite idle on this 32-CPU, 244 GiB machine.
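
For what it's worth, here is a quick back-of-envelope on the warm runs (a rough sketch; the ~3 million matched documents is an approximation):

// Rough per-query throughput for the warm (cached) runs.
var matchedDocs = 3 * 1000 * 1000;  // approximate number of documents selected by the $match
var warmSeconds = 42.5;             // typical duration of runs 2 through 9 above
print(Math.round(matchedDocs / warmSeconds));  // ~70588 documents fetched and grouped per second

That rate is per query no matter how many cores the machine has, which would at least be consistent with a single aggregation not being able to keep 32 CPUs busy.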

Here is the query we are doing:

db.runCommand({
"aggregate": "events-qos-timeupdate",
"pipeline": [
{
"$match": {
"$and": [
{
"$or": [
{
"vd.ple.vd.pid":
"ff58e1a5-9ebe-7c04-341d-51bf9a816326"
}
]
},
{
"$or": [
{
"vd.ple.vd.acc": "AccountName"
}
]
}
]
}
},
{
"$project": {
"rR2Rr": "$vd.ple.vd.userid",
"rR3Rr": {
"$year": [
{
"$add": [
"$vd.ts",
-28800000
]
}
]
},
"rR4Rr": {
"$month": [
{
"$add": [
"$vd.ts",
-28800000
]
}
]
},
"rR5Rr": {
"$dayOfMonth": [
{
"$add": [
"$vd.ts",
-28800000
]
}
]
},
"rR6Rr": {
"$hour": [
{
"$add": [
"$vd.ts",
-28800000
]
}
]
},
"rR7Rr": {
"$minute": [
{
"$add": [
"$vd.ts",
-28800000
]
}
]
}
}
},
{
"$group": {
"_id": {
"rR2Rr": "$rR2Rr",
"rR3Rr": "$rR3Rr",
"rR4Rr": "$rR4Rr",
"rR5Rr": "$rR5Rr",
"rR6Rr": "$rR6Rr",
"rR7Rr": "$rR7Rr"
},
"rR8Rr": {
"$sum": 1
}
}
},
{
"$group": {
"_id": {
"rR3Rr": "$_id.rR3Rr",
"rR4Rr": "$_id.rR4Rr",
"rR5Rr": "$_id.rR5Rr",
"rR6Rr": "$_id.rR6Rr",
"rR7Rr": "$_id.rR7Rr"
},
"rR8Rr": {
"$sum": 1
}
}
}
],
"allowDiskUse": true
});
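
As an aside (my reading of the constant): the -28800000 millisecond offset applied before each date operator is exactly 8 hours, presumably shifting the UTC timestamps in vd.ts into US Pacific Standard Time.

// 8 hours expressed in milliseconds
print(8 * 60 * 60 * 1000);  // 28800000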

Here is the explain plan:

{
"stages" : [
{
"$cursor" : {
"query" : {
"$and" : [
{
"$or" : [
{
"vd.ple.vd.pid" :
"ff58e1a5-9ebe-7c04-341d-51bf9a816326"
}
]
},
{
"$or" : [
{
"vd.ple.vd.acc" : "AccountName"
}
]
}
]
},
"fields" : {
"vd.ple.vd.userid" : 1,
"vd.ts" : 1,
"_id" : 1
},
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cdc.events-qos-timeupdate",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"vd.ple.vd.acc" : {
"$eq" : "AccountName"
}
},
{
"vd.ple.vd.pid" : {
"$eq" :
"ff58e1a5-9ebe-7c04-341d-51bf9a816326"
}
}
]
},
"winningPlan" : {
"stage" : "CACHED_PLAN",
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"vd.ple.vd.acc" : 1,
"vd.ple.vd.pid" : 1,
"vd.ts" : 1
},
"indexName" :
"vd.ple.vd.acc_1_vd.ple.vd.pid_1_vd.ts_1",
"isMultiKey" : false,
"direction" : "forward",
"indexBounds" : {
"vd.ple.vd.acc" : [
"[\"AccountName\", \"AccountName\"]"
],
"vd.ple.vd.pid" : [
"[\"ff58e1a5-9ebe-7c04-341d-51bf9a816326\",
\"ff58e1a5-9ebe-7c04-341d-51bf9a816326\"]"
],
"vd.ts" : [
"[MinKey, MaxKey]"
]
}
}
}
},
"rejectedPlans" : [
]
}
}
},
{
"$project" : {
"rR2Rr" : "$vd.ple.vd.userid",
"rR3Rr" : {
"$year" : [
{
"$add" : [
"$vd.ts",
{
"$const" : -28800000
}
]
}
]
},
"rR4Rr" : {
"$month" : [
{
"$add" : [
"$vd.ts",
{
"$const" : -28800000
}
]
}
]
},
"rR5Rr" : {
"$dayOfMonth" : [
{
"$add" : [
"$vd.ts",
{
"$const" : -28800000
}
]
}
]
},
"rR6Rr" : {
"$hour" : [
{
"$add" : [
"$vd.ts",
{
"$const" : -28800000
}
]
}
]
},
"rR7Rr" : {
"$minute" : [
{
"$add" : [
"$vd.ts",
{
"$const" : -28800000
}
]
}
]
}
}
},
{
"$group" : {
"_id" : {
"rR2Rr" : "$rR2Rr",
"rR3Rr" : "$rR3Rr",
"rR4Rr" : "$rR4Rr",
"rR5Rr" : "$rR5Rr",
"rR6Rr" : "$rR6Rr",
"rR7Rr" : "$rR7Rr"
},
"rR8Rr" : {
"$sum" : {
"$const" : 1
}
}
}
},
{
"$group" : {
"_id" : {
"rR3Rr" : "$_id.rR3Rr",
"rR4Rr" : "$_id.rR4Rr",
"rR5Rr" : "$_id.rR5Rr",
"rR6Rr" : "$_id.rR6Rr",
"rR7Rr" : "$_id.rR7Rr"
},
"rR8Rr" : {
"$sum" : {
"$const" : 1
}
}
}
}
],
"ok" : 1
}

Here are the statistics on the collection:

{
"ns" : "cdc.events-qos-timeupdate",
"count" : 123792961,
"size" : 510046799787,
"avgObjSize" : 4120,
"storageSize" : 59246358528,
"capped" : false,
"wiredTiger" : {
"metadata" : {
"formatVersion" : 1
},
"creationString" :
"allocation_size=4KB,app_metadata=(formatVersion=1),block_allocation=best,block_compressor=zlib,cache_resident=0,checkpoint=(WiredTigerCheckpoint.275801=(addr=\"01e37b7ea681e4fa5b2a7de3dc9456c015e45e61955be3dc94abc00ae4c29cabc2808080e50dcb5b8fc0e50df30e3fc0\",order=275801,time=1455234016,size=59912380416,write_gen=9622261)),checkpoint_lsn=(22590,10734592),checksum=on,collator=,columns=,dictionary=0,format=btree,huffman_key=,huffman_value=,id=97,internal_item_max=0,internal_key_max=0,internal_key_truncate=,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=1MB,memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=0,prefix_compression_min=4,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,value_format=u,version=(major=1,minor=1)"
,
"type" : "file",
"uri" : "statistics:table:collection-34--1012075229251210100",
"LSM" : {
"bloom filters in the LSM tree" : 0,
"bloom filter false positives" : 0,
"bloom filter hits" : 0,
"bloom filter misses" : 0,
"bloom filter pages evicted from cache" : 0,
"bloom filter pages read into cache" : 0,
"total size of bloom filters" : 0,
"sleep for LSM checkpoint throttle" : 0,
"chunks in the LSM tree" : 0,
"highest merge generation in the LSM tree" : 0,
"queries that could have benefited from a Bloom filter that did
not exist" : 0,
"sleep for LSM merge throttle" : 0
},
"block-manager" : {
"file allocation unit size" : 4096,
"blocks allocated" : 4556,
"checkpoint size" : 59912380416,
"allocations requiring file extension" : 1527,
"blocks freed" : 1327,
"file magic number" : 120897,
"file major version number" : 1,
"minor version number" : 0,
"file bytes available for reuse" : 462888960,
"file size in bytes" : 59246358528
},
"btree" : {
"btree checkpoint generation" : 101,
"column-store variable-size deleted values" : 0,
"column-store fixed-size leaf pages" : 0,
"column-store internal pages" : 0,
"column-store variable-size leaf pages" : 0,
"pages rewritten by compaction" : 0,
"number of key/value pairs" : 0,
"fixed-record size" : 0,
"maximum tree depth" : 6,
"maximum internal page key size" : 368,
"maximum internal page size" : 4096,
"maximum leaf page key size" : 3276,
"maximum leaf page size" : 32768,
"maximum leaf page value size" : 1048576,
"overflow pages" : 0,
"row-store internal pages" : 0,
"row-store leaf pages" : 0
},
"cache" : {
"bytes read into cache" : 28357059038,
"bytes written from cache" : 727515422,
"checkpoint blocked page eviction" : 0,
"unmodified pages evicted" : 59643,
"page split during eviction deepened the tree" : 0,
"modified pages evicted" : 453,
"data source pages selected for eviction unable to be evicted"
: 24,
"hazard pointer blocked page eviction" : 14,
"internal pages evicted" : 313,
"pages split during eviction" : 259,
"in-memory page splits" : 51,
"overflow values cached in memory" : 0,
"pages read into cache" : 132183,
"overflow pages read into cache" : 0,
"pages written from cache" : 4356
},
"compression" : {
"raw compression call failed, no additional data available" :
293,
"raw compression call failed, additional data available" : 0,
"raw compression call succeeded" : 6013,
"compressed pages read" : 132178,
"compressed pages written" : 20,
"page written failed to compress" : 0,
"page written was too small to compress" : 273
},
"cursor" : {
"create calls" : 37,
"insert calls" : 129400,
"bulk-loaded cursor-insert calls" : 0,
"cursor-insert key and value bytes inserted" : 624634044,
"next calls" : 0,
"prev calls" : 1,
"remove calls" : 0,
"cursor-remove key bytes removed" : 0,
"reset calls" : 34632666,
"search calls" : 34496956,
"search near calls" : 0,
"update calls" : 0,
"cursor-update value bytes updated" : 0
},
"reconciliation" : {
"dictionary matches" : 0,
"internal page multi-block writes" : 98,
"leaf page multi-block writes" : 390,
"maximum blocks required for a page" : 49,
"internal-page overflow keys" : 0,
"leaf-page overflow keys" : 0,
"overflow values written" : 0,
"pages deleted" : 0,
"page checksum matches" : 1950,
"page reconciliation calls" : 911,
"page reconciliation calls for eviction" : 453,
"leaf page key bytes discarded using prefix compression" : 0,
"internal page key bytes discarded using suffix compression" : 0
},
"session" : {
"object compaction" : 0,
"open cursor count" : 37
},
"transaction" : {
"update conflicts" : 0
}
},
"nindexes" : 4,
"totalIndexSize" : 11469344768,
"indexSizes" : {
"_id_" : 7807283200,
"vd.ts_1" : 1212760064,
"vd.ple.vd.acc_1_vd.ts_1" : 1216249856,
"vd.ple.vd.acc_1_vd.ple.vd.pid_1_vd.ts_1" : 1233051648
},
"ok" : 1
}
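
A rough back-of-envelope read of those statistics (my own estimate; the matched document count is approximate):

// Approximate uncompressed volume of document data the query has to fetch,
// based on avgObjSize from the stats above.
var matched = 3 * 1000 * 1000;      // approximate documents selected by the $match
var avgObjSize = 4120;              // bytes, from the collection stats
print(matched * avgObjSize / 1e9);  // ~12.4 GB of document data touched per run

Even when the working set is cached, materializing and projecting that much document data is a substantial amount of work for a single query.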

Here is the slow query log from the server:

Feb 11 20:34:47 ec2-52-72-90-211 mongod.27000[1390]: [conn19] command cdc.$cmd command: aggregate {
aggregate: "events-qos-timeupdate",
pipeline: [
{ $match: { $and: [ { $or: [ { vd.ple.vd.pid: "ff58e1a5-9ebe-7c04-341d-51bf9a816326" } ] }, { $or: [ { vd.ple.vd.acc: "AccountName" } ] } ] } },
{ $project: { rR2Rr: "$vd.ple.vd.userid", rR3Rr: { $year: [ { $add: [ "$vd.ts", -28800000.0 ] } ] }, rR4Rr: { $month: [ { $add: [ "$vd.ts", -28800000.0 ] } ] }, rR5Rr: { $dayOfMonth: [ { $add: [ "$vd.ts", -28800000.0 ] } ] }, rR6Rr: { $hour: [ { $add: [ "$vd.ts", -28800000.0 ] } ] }, rR7Rr: { $minute: [ { $add: [ "$vd.ts", -28800000.0 ] } ] } } },
{ $group: { _id: { rR2Rr: "$rR2Rr", rR3Rr: "$rR3Rr", rR4Rr: "$rR4Rr", rR5Rr: "$rR5Rr", rR6Rr: "$rR6Rr", rR7Rr: "$rR7Rr" }, rR8Rr: { $sum: 1.0 } } },
{ $group: { _id: { rR3Rr: "$_id.rR3Rr", rR4Rr: "$_id.rR4Rr", rR5Rr: "$_id.rR5Rr", rR6Rr: "$_id.rR6Rr", rR7Rr: "$_id.rR7Rr" }, rR8Rr: { $sum: 1.0 } } }
], allowDiskUse: true }
keyUpdates:0 writeConflicts:0 numYields:23795 reslen:36660
locks:{ Global: { acquireCount: { r: 48702 }, acquireWaitCount: { r: 247 }, timeAcquiringMicros: { r: 390753 } }, Database: { acquireCount: { r: 24351 } }, Collection: { acquireCount: { r: 24351 } } }
43244ms
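
One detail worth noting in that entry: the time spent waiting for locks is a tiny fraction of the total runtime, so the 43 seconds appear to go to the fetch, projection, and grouping work itself rather than to lock contention.

// From the log line above: lock wait as a fraction of total runtime
print(390753 / 1000 / 43244);  // ~0.009, i.e. under 1% of the 43.2 seconds was spent acquiring locks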

