Map reduce index definition


Daniel Zolnjan

Aug 27, 2015, 5:04:35 AM
to RavenDB - 2nd generation document database
I have the following collections:

    public class Prophet
    {
        public string Id { get; set; }
    }

    public class Prediction
    {
        public string Id { get; set; }
        public string ProphetId { get; set; }
        public string Profit { get; set; }
    }

    public class Team
    {
        public string Id { get; set; }
        public List<string> MemberProphetIds { get; set; }
    }

Defining a map-reduce index that yields profit by prophet is straightforward:
   
    public class Predictions_ByProphet : AbstractMultiMapIndexCreationTask<Predictions_ByProphet.Result>
    {
        public class Result
        {
            public string ProphetId { get; set; }
            public double Profit { get; set; }
        }

        public Predictions_ByProphet()
        {
            AddMap<Prediction>(items => from x in items
                                        select new
                                        {
                                            ProphetId = x.ProphetId,
                                            Profit = x.Profit,
                                        });

            Reduce = results => from result in results
                                group result by result.ProphetId into g
                                select new
                                {
                                    ProphetId = g.Key,
                                    Profit = g.Sum(x => x.Profit)
                                };
        }
    }

Would it be possible to create a map-reduce index that yields profit by team?

Kijana Woodard

Aug 27, 2015, 9:58:19 AM
to rav...@googlegroups.com
How many teams can a Prophet be on? 0? 1? Many?


Daniel Zolnjan

Aug 27, 2015, 10:11:21 AM
to RavenDB - 2nd generation document database
The same prophet can be on multiple teams.

----------------------------------------------------------
A simple typo correction: the type of Profit on Prediction should be 'double', not 'string' as in my original post.

    public class Prediction
    {
        public string Id { get; set; }
        public string ProphetId { get; set; }
        public double Profit { get; set; }
    }

Oren Eini (Ayende Rahien)

Aug 28, 2015, 5:53:32 AM
to ravendb
Since you are recording the profits only based on the prophet, how do you share revenues between the various teams when a single prophet is on a few of them?

In more detail, assume that you have 3 teams and 5 prophets, with each team having 3 prophets:

Team 1 - Prophets: A,B,C
Team 2 - Prophets: B,C,D
Team 3 - Prophets: C,D,E

Prophets A, B and E each have four predictions, each with a profit of 150.
Prophets C and D each have one prediction with a profit of 200.

How do you suggest, based on your model, that you'll calculate the profit per team?



Daniel Zolnjan

Aug 28, 2015, 7:34:48 AM
to RavenDB - 2nd generation document database
Thanks for taking a deep interest in this problem.

Team 1: (A + A + A + A) + (B + B + B + B) + (C) = 150*4 + 150*4 + 200 = 1400
Team 2: (B + B + B + B) + (C) + (D) = 150*4 + 200 + 200 = 1000
Team 3: (C) + (D) + (E + E + E + E) = 200 + 200 + 150*4 = 1000

This has nothing to do with sharing revenue. I am simply looking to identify which teams (groups of prophets) are making the highest profit sums, and I want an index so I can show, say, the 10 best-performing teams on the home page. It is perfectly fine to have the same prophet in multiple teams there.
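
As a cross-check of the per-team arithmetic, here is a minimal sketch in plain LINQ (outside RavenDB) that recomputes the totals from the example data given earlier in the thread. The variable names are illustrative assumptions, not part of the model, and each prophet's full sum is counted toward every team the prophet belongs to:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Team membership from the example (a prophet may be on several teams).
var teams = new Dictionary<string, string[]>
{
    ["Team 1"] = new[] { "A", "B", "C" },
    ["Team 2"] = new[] { "B", "C", "D" },
    ["Team 3"] = new[] { "C", "D", "E" },
};

// One entry per prediction: A, B, E have four predictions of 150 each; C, D have one of 200.
var predictions = new List<(string ProphetId, double Profit)>();
foreach (var p in new[] { "A", "B", "E" })
    for (var i = 0; i < 4; i++)
        predictions.Add((p, 150.0));
foreach (var p in new[] { "C", "D" })
    predictions.Add((p, 200.0));

// Sum per prophet first, then add each prophet's total to every team they are on.
var profitByProphet = predictions
    .GroupBy(x => x.ProphetId)
    .ToDictionary(g => g.Key, g => g.Sum(x => x.Profit));

var profitByTeam = teams.ToDictionary(
    t => t.Key,
    t => t.Value.Sum(id => profitByProphet[id]));

foreach (var kv in profitByTeam.OrderBy(e => e.Key))
    Console.WriteLine($"{kv.Key}: {kv.Value}");
// Team 1: 1400
// Team 2: 1000
// Team 3: 1000
```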

I also want a leaderboard by prophet to single out the best individual prophets: show the 10 prophets that are producing the highest profit sums.
I already created the index for this one in my initial post.

So two map-reduce indexes:
1) Sum prediction profits by Prophets - already done
2) Sum prediction profits By Teams

Now, I have a solution for the second one: I simply changed my model so that prophets are no longer stored inside teams as in the original post.
Instead, each Prophet document stores the IDs of the teams he/she is a member of, like so:

 
    public class Prophet
    {
        public string Id { get; set; }
        public List<string> MemberOfTeams { get; set; }
    }

So the team-based map-reduce index simply goes up the hierarchy (Prediction < Prophet < Team):

    public class Prediction_ByTeam : AbstractMultiMapIndexCreationTask<Prediction_ByTeam.Result>
    {
        public class Result
        {
            public string TeamId { get; set; }
            public double Profit { get; set; }
        }

        public Prediction_ByTeam()
        {
            AddMap<Prediction>(items => from x in items
                                        let prophet = LoadDocument<Prophet>(x.ProphetId)
                                        from m in prophet.MemberOfTeams
                                        select new
                                        {
                                            Profit = x.Profit,
                                            TeamId = m,
                                        });

            Reduce = results => from result in results
                                group result by result.TeamId into g
                                select new
                                {
                                    TeamId = g.Key,
                                    Profit = g.Sum(x => x.Profit),
                                };
        }
    }

I'm still wondering whether it is possible to define a map-reduce index for the initial model, where prophets are stored with the team, as it seems the more natural model. I have some ideas, but this post is getting long, so I'm skipping them for now.

Oren Eini (Ayende Rahien)

Aug 28, 2015, 12:06:59 PM
to ravendb
Your second model makes it much easier.
Technically you can try doing a multi map/reduce to try to solve it, but it wouldn't be easy at all.

Daniel Zolnjan

Aug 28, 2015, 1:08:14 PM
to RavenDB - 2nd generation document database
The first model comes down to this example:

    class TeamProphetProfit
    {
        public string TeamId { get; set; }
        public string ProphetId { get; set; }
        public double? Profit { get; set; }
    }

    var results = new List<TeamProphetProfit>()
    {
        new TeamProphetProfit() { TeamId = "team/1", ProphetId = "prophet/1", Profit = 0 },
        new TeamProphetProfit() { TeamId = "team/1", ProphetId = "prophet/2", Profit = 0 },
        new TeamProphetProfit() { TeamId = "team/2", ProphetId = "prophet/2", Profit = 0 },
        new TeamProphetProfit() { TeamId = "team/3", ProphetId = "prophet/1", Profit = 0 },
        new TeamProphetProfit() { TeamId = "-1", ProphetId = "prophet/1", Profit = 1 },
        new TeamProphetProfit() { TeamId = "-1", ProphetId = "prophet/1", Profit = 2 },
        new TeamProphetProfit() { TeamId = "-1", ProphetId = "prophet/2", Profit = 1 },
    };

    var reduce = from result in results.Take(1)
                 let prophetsByTeam = results.Where(x => x.TeamId != "-1")
                 let profitByProphet = results.Where(x => x.TeamId == "-1")
                                              .GroupBy(y => y.ProphetId)
                                              .Select(y => new { ProphetId = y.Key, Profit = y.Sum(p => p.Profit) })
                 let items = prophetsByTeam.Select(t => new
                 {
                     Profit = profitByProphet.Where(x => t.ProphetId == x.ProphetId).Sum(u => u.Profit),
                     TeamId = t.TeamId,
                     ProphetId = "-1"
                 })
                 from item in items
                 group item by item.TeamId into g
                 select new
                 {
                     Profit = g.Sum(x => x.Profit),
                     TeamId = g.Key,
                     ProphetId = "-1"
                 };

This yields the wanted results (when executed manually as a test outside Raven):
TeamId, Profit, ProphetId
team/1, 4, -1
team/2, 1, -1
team/3, 3, -1

Here is the map-reduce definition for the first model:

    public class Prediction_ByTeam_Test : AbstractMultiMapIndexCreationTask<Prediction_ByTeam_Test.Result>
    {
        public class Result
        {
            public string ProphetId { get; set; }
            public string TeamId { get; set; }
            public double Profit { get; set; }
        }

        public Prediction_ByTeam_Test()
        {
            // One entry per (team, member prophet) pair, with no profit of its own.
            AddMap<Team>(items => from x in items
                                  from t in x.MemberProphetIds
                                  select new
                                  {
                                      ProphetId = t,
                                      TeamId = x.Id,
                                      Profit = 0
                                  });

            // One entry per prediction; TeamId = "-1" marks it as not yet assigned to a team.
            AddMap<Prediction>(items => from x in items
                                        select new
                                        {
                                            ProphetId = x.ProphetId,
                                            Profit = x.Profit,
                                            TeamId = "-1"
                                        });

            Reduce = results => from result in results.Take(1)
                                let prophetsByTeam = results.Where(x => x.TeamId != "-1")
                                let profitByProphet = results.Where(x => x.TeamId == "-1")
                                                             .GroupBy(y => y.ProphetId)
                                                             .Select(y => new { ProphetId = y.Key, Profit = y.Sum(p => p.Profit) })
                                let items = prophetsByTeam.Select(t => new
                                {
                                    Profit = profitByProphet.Where(x => t.ProphetId == x.ProphetId).Sum(u => u.Profit),
                                    TeamId = t.TeamId,
                                    ProphetId = "-1"
                                })
                                from item in items
                                group item by item.TeamId into g
                                select new
                                {
                                    Profit = g.Sum(x => x.Profit),
                                    TeamId = g.Key,
                                    ProphetId = "-1"
                                };
        }
    }

But when I try to execute it on Raven, I get an exception from Raven.Client.Lightweight: Sequence contains no elements

The exception's ProblematicText property contains the following:

    from result in Enumerable.Take(results, 1)
    select new {
        result = result,
        prophetsByTeam = results.Where(x => x.TeamId != "-1")
    } into this0
    select new {
        this0 = this0,
        profitByProphet = results.Where(x => x.TeamId == "-1").GroupBy(y => y.ProphetId).Select(y => new {
            ProphetId = y.Key,
            Profit = Enumerable.Sum(y, p => ((double)p.Profit))
        })
    } into this1
    select new {
        this1 = this1,
        items = this1.this0.prophetsByTeam.Select(t => new {
            Profit = Enumerable.Sum(this1.profitByProphet.Where(x => t.ProphetId == x.ProphetId), u => ((double)u.Profit)),
            TeamId = t.TeamId,
            ProphetId = "-1"
        })
    } into this2
    from item in this2.items
    select new {
        this2 = this2,
        item = item
    } into this3
    group this3.item by this3.item.TeamId into g
    select new {
        Profit = Enumerable.Sum(g, x => ((double)x.Profit)),
        TeamId = g.Key,
        ProphetId = "-1"
    }

Oren Eini (Ayende Rahien)

Aug 28, 2015, 1:21:40 PM
to ravendb
You don't have access to the full result set in the index.
We aren't rebuilding the whole thing on each change, but rather build it incrementally.
All your operations have to take that into account.
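
To illustrate what building the result incrementally implies, here is a rough sketch in plain LINQ. It simulates batching by hand and is not RavenDB's indexing engine; `mapped` and `Reduce` are hypothetical names and the data is made up. The point it shows is that a reduce which only groups and sums the entries it is handed gives the same answer whether it runs over everything at once or over partial batches whose outputs are reduced again:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mapped entries (TeamId, Profit), shaped like the output of the second model's map.
var mapped = new List<(string TeamId, double Profit)>
{
    ("team/1", 1.0), ("team/1", 2.0), ("team/2", 1.0), ("team/1", 1.0), ("team/2", 3.0),
};

// A reduce whose output has the same shape as its input and that only
// looks at the entries it was given (no access to "all results").
static List<(string TeamId, double Profit)> Reduce(IEnumerable<(string TeamId, double Profit)> results) =>
    results.GroupBy(r => r.TeamId)
           .Select(g => (TeamId: g.Key, Profit: g.Sum(r => r.Profit)))
           .OrderBy(r => r.TeamId)
           .ToList();

// Reduce everything in one pass...
var allAtOnce = Reduce(mapped);

// ...or reduce two arbitrary batches separately, then reduce the partial outputs again.
var batched = Reduce(Reduce(mapped.Take(2)).Concat(Reduce(mapped.Skip(2))));

Console.WriteLine(allAtOnce.SequenceEqual(batched)); // True
```

By contrast, the reduce in Prediction_ByTeam_Test pairs team entries with prediction entries taken from the whole of `results`. If it is handed a batch that contains only prediction entries (TeamId == "-1"), `prophetsByTeam` is empty and the reduce emits nothing, so those profits cannot be recovered in a later pass.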