MapReduce for on-demand analytics?

tom

Nov 18, 2011, 6:31:22 PM
to google-a...@googlegroups.com
I want to add some analytics to my app, and I'm trying to gauge whether MapReduce could be fast enough to generate reports on demand. I know an exact answer isn't realistic; I'm just looking for an order of magnitude: 1 second? 10 seconds? 100 seconds? For example, user #1 might want a report on how many times each type of event occurred between date-B and date-C. So something like:

in Events: group by event_type, count where (user_id = 1  &&  date >= date-B  &&  date < date-C)

This would query a single Model with 4 properties: user_id (int), event_type (int), start_date (date/time), end_date (date/time).
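To make the query concrete, here is a rough sketch of the map and reduce steps in plain Python. It is not the App Engine mapreduce library API: the `Event` tuple and the function names `map_event`, `reduce_counts` and `count_events_by_type` are made up for illustration, and it assumes the date filter applies to start_date, which the pseudo-query above doesn't actually specify.

```python
from collections import Counter, namedtuple
from datetime import datetime

# Hypothetical in-memory stand-in for the Event model (not a datastore Model).
Event = namedtuple("Event", ["user_id", "event_type", "start_date", "end_date"])


def map_event(event, user_id, date_b, date_c):
    """Map step: emit (event_type, 1) for each event that passes the filter.

    Assumption: "date" in the pseudo-query means start_date.
    """
    if event.user_id == user_id and date_b <= event.start_date < date_c:
        yield event.event_type, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted 1s per event_type."""
    counts = Counter()
    for event_type, n in pairs:
        counts[event_type] += n
    return dict(counts)


def count_events_by_type(events, user_id, date_b, date_c):
    """Run the map step over every event, then reduce the emitted pairs."""
    pairs = (
        pair
        for event in events
        for pair in map_event(event, user_id, date_b, date_c)
    )
    return reduce_counts(pairs)


# Tiny usage example with made-up data.
sample_events = [
    Event(1, 7, datetime(2011, 11, 1), datetime(2011, 11, 2)),
    Event(1, 7, datetime(2011, 11, 5), datetime(2011, 11, 6)),
    Event(1, 9, datetime(2011, 11, 3), datetime(2011, 11, 4)),
    Event(2, 7, datetime(2011, 11, 2), datetime(2011, 11, 3)),  # other user
    Event(1, 7, datetime(2011, 12, 1), datetime(2011, 12, 2)),  # outside range
]
report = count_events_by_type(
    sample_events, 1, datetime(2011, 11, 1), datetime(2011, 12, 1)
)
```

In a real MapReduce job the map step would be sharded across instances and the reduce step would merge the per-shard counts; this sketch runs both in a single process only to show the shape of the computation.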

Let's assume that there are 10,000,000 Event entities in total, of which 1,000,000 have user_id = 1, and 100,000 lie between date-B and date-C, among which there are 200 different types of event. So our output would have 200 rows, each with an integer value.

Let's also assume (just for the sake of example) that we're willing to have however many instances it takes to get the results in ~110% of the quickest possible time. 

Would that take 1 second? 10 seconds? 100 seconds? 1000 seconds?

thanks

tom


