Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Pre-aggregation Question
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  4 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Fumiko  
View profile  
 More options Oct 5 2012, 12:54 pm
From: Fumiko <fhanre...@gmail.com>
Date: Fri, 5 Oct 2012 09:54:28 -0700 (PDT)
Local: Fri, Oct 5 2012 12:54 pm
Subject: Pre-aggregation Question

Hi,

I'm new to mongodb and trying to use it for reporting taking some data from
MySQL (e.g. fetch-and-modify as a purchase is made).  There are several
filters to the report such as product category id, gender, country,
affiliate, etc.  I'm thinking of two ways.  One is
{ _id : today,
  sale_amount : 100.00,
  gender : ,
  country : ,
...... }
for each purchase (without pre-aggregation).  Another one is
{ _id : { date : today, gender :    , country :    , (more filters) },
  sale_amount : 100.00,
  sale_count : 1 }

for each combination of filters (with pre-aggregation). An example of what
we need to see in the reporting is "Sales of product A per day made by male
in some country in the last 90 days."
If somebody can give me advice on how pre-aggregation should be done in
this example, I'd highly appreciate it.
Thank you in advance.

Fumiko


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jenna deBoisblanc  
View profile  
 More options Oct 10 2012, 1:36 pm
From: Jenna deBoisblanc <jenna.deboisbl...@10gen.com>
Date: Wed, 10 Oct 2012 10:36:41 -0700 (PDT)
Local: Wed, Oct 10 2012 1:36 pm
Subject: Re: Pre-aggregation Question

Hello Fumiko,

Could you give us a little more information? When you say, "Sales of
product A per day made by male in some country in the last 90 days" - do
you need the total sales of product A in dollars, a count of the sales of
product A, or both?

If you choose the first approach, you can use the aggregation framework
(available in v. 2.2+) or Map Reduce to aggregate total sales based on
query filters:
http://www.mongodb.org/display/DOCS/MapReduce
http://docs.mongodb.org/manual/applications/aggregation/

The better solution may be to use the $inc operator to update an existing
document as new products are purchased {_id: <filter>, same_amount: x,
sale_count: y}
http://www.mongodb.org/display/DOCS/Updating#Updating-%24inc


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Fumiko  
View profile  
 More options Oct 11 2012, 7:44 pm
From: Fumiko <fhanre...@gmail.com>
Date: Thu, 11 Oct 2012 16:44:57 -0700 (PDT)
Local: Thurs, Oct 11 2012 7:44 pm
Subject: Re: Pre-aggregation Question

Hello Jenna,

I truly appreciate your help.  Yes, both dollar amount and count are
needed.  Your suggestion is pre-aggregation.  I'd read those links prior to
this posting and watched some recordings, and I may be having difficulty
understanding it right.  But, using those several filters as compound ids,
and if, say to report "Sales in dollar amount per day made by male in the
past 90 days" (i.e. one filter, gender, instead of multiple filters), then
we are pre-aggregating wtih $inc followed by aggregating ($match: {
_id.gender: "male" }) as well as by day?  There may be 0 or more filters
needed depending on report (sometimes all filters are needed and other
times none - just sales in dollar amount, for example).  I'd like to know
if filters (includes date) as compound id is how it should be done or not.
 If not, could you please suggest what else can be possible?  I'm not at
all familiar with map-reduce.  I'm hoping pre-aggregation or just plain
aggregation itself is the solution.
Thank you.

Fumiko


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jenna deBoisblanc  
View profile  
 More options Oct 19 2012, 12:20 pm
From: Jenna deBoisblanc <jenna.deboisbl...@10gen.com>
Date: Fri, 19 Oct 2012 09:20:56 -0700 (PDT)
Local: Fri, Oct 19 2012 12:20 pm
Subject: Re: Pre-aggregation Question

Hello Fumiko,

I think the following examples may help, but please let me know if this is
not what you are trying to accomplish.

A female in the US purchases an umbrella for $100. We insert a document:

> doc =

{ _id : 1,
  product: "umbrella",
  sale_amount : 100.00,
  sale_count: 1,
  gender : female,
  country : USA,

}
> db.collection.insert(doc)

Another female in the US purchases an umbrella for $86, and we would like
to update the document to adjust the sale amount and count:

> db.collection.update({product: "umbrella", gender: female, country: USA},

{$inc: {sale_amount: 86, sale_count: 1}})
> db.collection.find()

{ _id : 1,
  product: "umbrella",
  sale_amount : 186.00,
  sale_count: 2,
  gender : female,
  country : USA,

}

Alternatively, you can insert a new document every time a purchase is made,
and perform aggregation at a later point.  The advantage of this procedure
is that you maintain a record of every purchase (useful for tracking), but
the disadvantage is that your data size will grow, and in addition, the
aggregation command may be more expensive than a simple update with $inc.
 Here is an example of aggregation using the aggregation framework:
{ _id : 1,
  product: "umbrella",
  sale_amount : 100.00,
  gender : "female",
  country : "USA",
}

{ _id : 2,
  product: "umbrella",
  sale_amount : 86.00,
  gender : "female",
  country : "USA",

}
> db.collection.aggregate(

     {$match: {product: "umbrella", gender: "female", country: "USA"}},
     {$group:
        { _id: {product: "$product", gender: "$gender", country:
"$country"},
          count: {$sum: 1},
          totalAmount: {$sum: "$sale_amount"}
        }
     })

  {
   "result" : [
{
"_id" : {
"product" : "umbrella",
"gender" : "female",
"country" : "USA"

},

"count" : 2,
"totalAmount" : 186
}

],
"ok" : 1
  }

Map Reduce involves first emitting, and subsequently grouping, documents.
The process is similar to the aggregation command above.  The advantage of
MR is that you can output the results to a new collection; however, since
MR uses Javascript (and MongoDB currently uses a single-threaded Javascript
engine), the performance may not be as good as the aggregation framework.
Nevertheless, if you would like sample MR functions, please let me know.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »