Message from discussion Aggregation Framework generally faster than MapReduce?

Received: by 10.52.174.201 with SMTP id bu9mr1728563vdc.5.1349388155086;
        Thu, 04 Oct 2012 15:02:35 -0700 (PDT)
X-BeenThere: mongodb-user@googlegroups.com
Received: by 10.220.224.8 with SMTP id im8ls2823590vcb.4.gmail; Thu, 04 Oct
 2012 15:02:24 -0700 (PDT)
Received: by 10.52.70.82 with SMTP id k18mr1411389vdu.1.1349388144888;
        Thu, 04 Oct 2012 15:02:24 -0700 (PDT)
Date: Thu, 4 Oct 2012 15:02:24 -0700 (PDT)
From: Mark Hansen <m...@digitalbrandmine.com>
To: mongodb-user@googlegroups.com
Message-Id: <b2b3b5d3-f1cf-492f-ad06-d2a5b4024439@googlegroups.com>
In-Reply-To: <f23d4c62-c5d9-4e1c-aa6f-65ae9d585424@googlegroups.com>
References: <51b02aa3-9c42-452a-8d39-61155fd2e883@googlegroups.com>
 <f23d4c62-c5d9-4e1c-aa6f-65ae9d585424@googlegroups.com>
Subject: Re: Aggregation Framework generally faster than MapReduce?
MIME-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_282_10878180.1349388144450"

------=_Part_282_10878180.1349388144450
Content-Type: multipart/alternative; 
	boundary="----=_Part_283_25039990.1349388144450"

------=_Part_283_25039990.1349388144450
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Thanks Sean.  Did a quick test today on one of our long-running MR 
transformations.  The equivalent AF ran 7 times faster.  So, I think we'll 
be porting a lot of code this weekend ... ;-)


On Thursday, October 4, 2012 1:04:12 PM UTC-4, Sean Reilly wrote:
>
> *TL;DR:* Oh heck yes! It will almost certainly be worth it.
>
> More detailed answer: I am using the aggregation framework in anger on a 
> large project (not yet released, and I can't go into any specifics) that 
> is handling lots of data. Interestingly, a previous prototype of the 
> project (somewhat different functionality, written in a different way by a 
> completely different team) did use map reduce for data transformation 
> purposes.
>
> Performance-wise, the aggregation framework won hands down. There are a lot 
> of reasons why, but here are (IMO) the three biggest:
>
> 1. With map reduce, you generally output to a collection (often a 
> temporary collection). The aggregation framework is much better suited to 
> return data directly to a calling library when that's what you actually 
> want to do.
>
> 2. Map reduce is single-threaded per server. A mongod instance can only 
> run one map reduce query at a time, whereas the aggregation framework can 
> run multiple operations at once.
>
> 3. The aggregation framework can use indexes to reduce the cost of 
> operations where you're only interested in a subset of the contents of a 
> collection.
>
> For us, the combination of these three gave the aggregation framework a 
> huge performance advantage over map reduce for similar (but not identical) 
> problems.
>
> In your case, things might be different, but I'd go so far as to say that 
> the aggregation framework should be everybody's default choice for 
> batch/data processing jobs on MongoDB (well, either AF or application code, 
> I suppose). The list of things that map reduce is best at on this platform 
> is quite small, and getting smaller all of the time.
>
> Sean
>
> On Thursday, 4 October 2012 17:24:00 UTC+1, Mark Hansen wrote:
>>
>> Has anybody tried a comparison of the same aggregation function using the 
>> AF vs. MR?  If so, what (if any) performance improvements did you achieve?  
>> We have an MR library that we are thinking of converting to AF, and we are 
>> wondering whether it will be worth the effort in terms of improved performance.
>>
>>
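
For readers following the thread, here is a minimal Python sketch of the same
aggregation written in both styles. It is an illustration only, not the code
either poster ran: the sample documents, the field names (customer, status,
amount) and all three helper functions are hypothetical, and the pipeline is
evaluated by a tiny in-memory stand-in rather than by a real mongod. With a
driver such as PyMongo, the pipeline list would instead be passed to
collection.aggregate(pipeline).

```python
from collections import defaultdict

# Hypothetical sample documents; the field names are assumptions.
ORDERS = [
    {"customer": "a", "status": "shipped", "amount": 10},
    {"customer": "a", "status": "pending", "amount": 5},
    {"customer": "b", "status": "shipped", "amount": 7},
    {"customer": "a", "status": "shipped", "amount": 3},
]


def map_reduce_totals(docs):
    """Map-reduce style: emit (key, value) pairs, then reduce per key."""
    emitted = defaultdict(list)
    for doc in docs:  # the map phase visits every document
        if doc["status"] == "shipped":
            emitted[doc["customer"]].append(doc["amount"])
    # the reduce phase folds each key's emitted values into one result
    return {key: sum(values) for key, values in emitted.items()}


def build_pipeline(status):
    """Aggregation pipeline as plain dicts.

    $match comes first so that, on a real server, an index on `status`
    could narrow the input before $group runs (Sean's point 3).
    """
    return [
        {"$match": {"status": status}},
        {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
    ]


def run_pipeline_locally(docs, pipeline):
    """In-memory stand-in that supports only the two stages built above."""
    match = pipeline[0]["$match"]
    group = pipeline[1]["$group"]
    key_field = group["_id"].lstrip("$")
    sum_field = group["total"]["$sum"].lstrip("$")
    totals = defaultdict(int)
    for doc in docs:
        if all(doc.get(field) == value for field, value in match.items()):
            totals[doc[key_field]] += doc[sum_field]
    return dict(totals)  # returned directly to the caller (Sean's point 1)


if __name__ == "__main__":
    print(map_reduce_totals(ORDERS))
    print(run_pipeline_locally(ORDERS, build_pipeline("shipped")))
```

Both functions produce the same per-customer totals. The difference the thread
describes is in how the server executes them, which this local sketch does not
measure.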
------=_Part_283_25039990.1349388144450--

------=_Part_282_10878180.1349388144450--