Trouble with Max in Reduce


4eyed

Apr 18, 2014, 3:46:37 PM
to rav...@googlegroups.com
I have the following index

    public class PointDetail_Linear : AbstractIndexCreationTask<PointDetail, PointDetailReduce>
    {
        public PointDetail_Linear()
        {
            Map = docs => from doc in docs
                          select new
                          {
                              AccountId = doc.AccountId,
                              RetailerId = doc.RetailerId,
                              AveragePoints = 0,
                              HighPoints = 1,
                              Count = 1,
                              TotalPoints = doc.Points
                          };

            Reduce = results => results
                .GroupBy(r => new { r.AccountId, r.RetailerId })
                .Select(r => new PointDetailReduce
                {
                    AccountId = r.Key.AccountId,
                    RetailerId = r.Key.RetailerId,
                    TotalPoints = r.Sum(x => x.TotalPoints),
                    AveragePoints = 1,
                    HighPoints = 1,
                    Count = r.Sum(x => x.Count)
                })
                .GroupBy(r => r.RetailerId)
                .Select(r => new PointDetailReduce
                {
                    AccountId = string.Empty,
                    RetailerId = r.Key,
                    HighPoints = r.Max(x => x.TotalPoints),
                    Count = r.Sum(x => x.Count),
                    TotalPoints = r.Sum(x => x.TotalPoints),
                    AveragePoints = r.Sum(x => x.TotalPoints) / r.Sum(x => x.Count)
                });
        }
    }


I am trying to get the highest points by AccountID for each retailer, but the results keep coming back with HighPoints = TotalPoints. Is there a problem with doing two GroupBy calls in the Reduce function?

Kijana Woodard

Apr 18, 2014, 5:13:07 PM
to rav...@googlegroups.com
IIRC, you can only do one group by.

"I am trying to get the highest points by AccountID for each retailer."

What's wrong with using the first group by?

    from r in results
    group r by new { r.AccountId, r.RetailerId } into g
    let count = g.Sum(x => x.Count)
    let total = g.Sum(x => x.TotalPoints)
    select new PointDetailReduce
    {
        AccountId = g.Key.AccountId,
        RetailerId = g.Key.RetailerId,
        TotalPoints = total,
        AveragePoints = total / count,
        HighPoints = g.Max(x => x.TotalPoints),
        Count = count
    }


--
You received this message because you are subscribed to the Google Groups "RavenDB - 2nd generation document database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

4eyed

Apr 18, 2014, 5:21:15 PM
to rav...@googlegroups.com
That didn't come out right :} For each RetailerId, I want HighPoints to equal the highest total of any single AccountId. The first GroupBy was meant to sum the points by retailer and account; the second was meant to roll that up to the retailer level and list the account with the highest points. Not sure if that is clear.

Oren Eini (Ayende Rahien)

Apr 18, 2014, 8:57:38 PM
to ravendb
You can only group _once_ per reduced result.
Note that you can group again over the sub-results.


    results => results
        .GroupBy(r => new { r.AccountId, r.RetailerId })
        .Select(r => new PointDetailReduce
        {
            AccountId = r.Key.AccountId,
            RetailerId = r.Key.RetailerId,
            TotalPoints = r.Sum(x => x.TotalPoints),
            AveragePoints = 1,
            HighPoints = 1,
            Count = r.Sum(x => x.Count),
        })

This will give you the updated results per account per retailer.

What is the meaning of High Points?
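
One plausible reading of the HighPoints = TotalPoints symptom, assuming RavenDB feeds Reduce its own earlier output again (re-reduce), can be reproduced with plain LINQ to Objects. The sketch below is not RavenDB code: `PointRow`, `ReReduceDemo`, and the sample data are hypothetical stand-ins, and only the shape of the two-GroupBy Reduce is taken from the index in the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the index's reduce result type.
public class PointRow
{
    public string AccountId { get; set; }
    public string RetailerId { get; set; }
    public int TotalPoints { get; set; }
    public int HighPoints { get; set; }
    public int Count { get; set; }
}

public static class ReReduceDemo
{
    // Same shape as the two-GroupBy Reduce from the original post,
    // executed here with LINQ to Objects instead of inside an index.
    public static List<PointRow> Reduce(IEnumerable<PointRow> rows)
    {
        return rows
            .GroupBy(r => new { r.AccountId, r.RetailerId })
            .Select(g => new PointRow
            {
                AccountId = g.Key.AccountId,
                RetailerId = g.Key.RetailerId,
                TotalPoints = g.Sum(x => x.TotalPoints),
                HighPoints = 1,
                Count = g.Sum(x => x.Count)
            })
            .GroupBy(r => r.RetailerId)
            .Select(g => new PointRow
            {
                AccountId = string.Empty,
                RetailerId = g.Key,
                HighPoints = g.Max(x => x.TotalPoints),
                Count = g.Sum(x => x.Count),
                TotalPoints = g.Sum(x => x.TotalPoints)
            })
            .ToList();
    }

    public static void Main()
    {
        // Made-up map output: account A has 10 + 20 points, account B has 5.
        var mapped = new List<PointRow>
        {
            new PointRow { AccountId = "A", RetailerId = "R", TotalPoints = 10, HighPoints = 1, Count = 1 },
            new PointRow { AccountId = "A", RetailerId = "R", TotalPoints = 20, HighPoints = 1, Count = 1 },
            new PointRow { AccountId = "B", RetailerId = "R", TotalPoints = 5, HighPoints = 1, Count = 1 }
        };

        var first = Reduce(mapped);   // reduce over the map output
        var second = Reduce(first);   // reduce again over the reduce output

        Console.WriteLine("first pass:  High={0} Total={1}", first[0].HighPoints, first[0].TotalPoints);
        // prints "first pass:  High=30 Total=35"
        Console.WriteLine("second pass: High={0} Total={1}", second[0].HighPoints, second[0].TotalPoints);
        // prints "second pass: High=35 Total=35"
    }
}
```

On the first pass HighPoints is the best per-account total. On the second pass the per-account rows no longer exist (AccountId is empty), so the max collapses to the retailer total, which matches the behavior reported above.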




Oren Eini

CEO

Mobile: + 972-52-548-6969

Office:  + 972-4-674-7811

Fax:      + 972-153-4622-7811



Kijana Woodard

Apr 18, 2014, 9:08:42 PM
to rav...@googlegroups.com

Yeah. I think you need two m/r indexes.


4eyed

Apr 21, 2014, 2:21:27 PM
to rav...@googlegroups.com
How would I group again over the sub-results? I split the two GroupBy calls into two indexes, but the only way I can combine them is on the client with multiple queries. I wasn't able to get server-side result transformers to query an index.

Kijana Woodard

Apr 21, 2014, 2:27:16 PM
to rav...@googlegroups.com
I would probably do two queries in a single remote request (Lazily) and combine.



4eyed

Apr 21, 2014, 5:25:59 PM
to rav...@googlegroups.com
I created an index that groups on RetailerId and AccountId and sums up the points and count. That works fine:

            Map = docs => from doc in docs
                          select new PointDetailReduce
                          {
                              AccountId = doc.AccountId,
                              RetailerId = doc.RetailerId,
                              Count = 1,
                              CreateDt = doc.CreateDt,
                              TotalPoints = doc.Points
                          };

            Reduce = results => results
                .GroupBy(r => new { r.AccountId, r.RetailerId })
                .Select(r => new PointDetailReduce
                {
                    AccountId = r.Key.AccountId,
                    RetailerId = r.Key.RetailerId,
                    Count = r.Sum(x => x.Count),
                    CreateDt = r.Max(x => x.CreateDt),
                    TotalPoints = r.Sum(x => x.TotalPoints)
                });


Next I tried to apply the following result transformer, but I'm getting strange results where TotalPoints = HighPoints:



    from result in results
    group result by result.RetailerId into r
    let high = r.GroupBy(p => p.AccountId)
                .Select(p => p.Sum(x => x.TotalPoints))
                .OrderByDescending(p => p)
                .FirstOrDefault()
    select new
    {
        RetailerId = r.Key,
        HighPoints = high,
        Count = r.Sum(x => x.Count),
        TotalPoints = r.Sum(x => x.TotalPoints),
        AveragePoints = r.Sum(x => x.TotalPoints) / r.Sum(x => x.Count)
    }

This works in regular LINQ outside the transformer.



4eyed

Apr 21, 2014, 7:27:38 PM
to rav...@googlegroups.com
So the result transformer actually worked perfectly; I just wasn't streaming the results. Here is my transformer:

    public class GroupByRetailerTransformer : AbstractTransformerCreationTask<PointDetailReduce>
    {
        public GroupByRetailerTransformer()
        {
            TransformResults = results => from result in results
                                          group result by result.RetailerId into r
                                          select new
                                          {
                                              RetailerId = r.Key,
                                              HighPoints = r.Max(x => x.TotalPoints),
                                              Count = r.Sum(x => x.Count),
                                              TotalPoints = r.Sum(x => x.TotalPoints),
                                              AveragePoints = (double)r.Sum(x => x.TotalPoints) / (double)r.Sum(x => x.Count)
                                          };
        }
    }

Then the client-side code to call it:

           
    var q = MvcApplication.CurrentSession.Query<BrandPointReduce>("PointDetail/ByAccountGroupedByRetailer")
        .TransformWith<GroupByRetailerTransformer, BrandPointReduce>()
        .Where(r => r.RetailerId.In(likes));

    var brandstats = new List<BrandPointReduce>();

    using (var enumerator = MvcApplication.CurrentSession.Advanced.Stream<BrandPointReduce>(q))
    {
        while (enumerator.MoveNext())
        {
            BrandPointReduce pd = enumerator.Current.Document;
            brandstats.Add(pd);
            //storepts(acct);
        }
    }

I'm still testing, but this seems to work perfectly. Very nice new feature!
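
The per-retailer roll-up that the transformer performs can be checked in memory with LINQ to Objects. This is a minimal sketch under assumptions: `AccountRow`, `RetailerStats`, `RollUpDemo`, and the sample rows are hypothetical stand-ins for the index output, not RavenDB types, and TotalPoints and Count are assumed to be integers. Under that assumption it also shows why the cast to double matters for AveragePoints.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row: one entry per (RetailerId, AccountId), as the first index would emit.
public class AccountRow
{
    public string RetailerId { get; set; }
    public string AccountId { get; set; }
    public int TotalPoints { get; set; }
    public int Count { get; set; }
}

// Hypothetical per-retailer result, mirroring the fields the transformer selects.
public class RetailerStats
{
    public string RetailerId { get; set; }
    public int HighPoints { get; set; }
    public int Count { get; set; }
    public int TotalPoints { get; set; }
    public double AveragePoints { get; set; }
}

public static class RollUpDemo
{
    // Groups the per-account rows by retailer, like the transformer's query.
    public static List<RetailerStats> RollUp(IEnumerable<AccountRow> rows)
    {
        return (from row in rows
                group row by row.RetailerId into r
                select new RetailerStats
                {
                    RetailerId = r.Key,
                    // Highest per-account total within this retailer.
                    HighPoints = r.Max(x => x.TotalPoints),
                    Count = r.Sum(x => x.Count),
                    TotalPoints = r.Sum(x => x.TotalPoints),
                    // Cast first; int / int would truncate the average.
                    AveragePoints = (double)r.Sum(x => x.TotalPoints) / r.Sum(x => x.Count)
                }).ToList();
    }

    public static void Main()
    {
        // Made-up sample data.
        var rows = new List<AccountRow>
        {
            new AccountRow { RetailerId = "R", AccountId = "A", TotalPoints = 30, Count = 2 },
            new AccountRow { RetailerId = "R", AccountId = "B", TotalPoints = 5, Count = 1 },
            new AccountRow { RetailerId = "S", AccountId = "C", TotalPoints = 9, Count = 2 }
        };

        foreach (var s in RollUp(rows))
        {
            Console.WriteLine("{0}: High={1} Count={2} Total={3} Avg={4}",
                s.RetailerId, s.HighPoints, s.Count, s.TotalPoints, s.AveragePoints);
        }
    }
}
```

For retailer S in the sample data, integer division of 9 by 2 would give 4, while the cast yields 4.5.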



Kijana Woodard

Apr 22, 2014, 12:06:13 AM
to rav...@googlegroups.com

Great!


4eyed

Apr 22, 2014, 11:11:14 PM
to rav...@googlegroups.com
I am facing another issue. I have to use the streaming API to get TransformWith to work properly, and it's making lots of requests to the server. That part I don't understand: if the work is happening on the server, why would it do multiple round trips? Example:

            var q = MvcApplication.CurrentSession.Query<BrandPointReduce>("PointDetail/ByAccountGroupedByRetailer")
            .TransformWith<GroupByRetailerTransformer, BrandPointReduce>()
            .Where(r => r.RetailerId.In(new List<string>{"xxxxx","xxxxx","xxxxx","xxxxxx"}));
            using (var enumerator = MvcApplication.CurrentSession.Advanced.Stream<BrandPointReduce>(q))
            {

                while (enumerator.MoveNext())
                {
                    BrandPointReduce pd = enumerator.Current.Document;
                }
            }

This makes 4 trips to the server. If I don't do this with the streaming API, I do not get back the correct results.



Kijana Woodard

Apr 22, 2014, 11:54:01 PM
to rav...@googlegroups.com

The streaming API....streams in batches. There may be a parameter to tune the batch size.


Oren Eini (Ayende Rahien)

Apr 23, 2014, 2:01:29 AM
to ravendb
Huh? It does a _single_ roundtrip.




4eyed

Apr 23, 2014, 11:38:52 AM
to rav...@googlegroups.com
I just posted a new topic on this; I didn't see the response. When I turn on tracing in my MVC app I see multiple trips, and I even hit the 30 connection exception when the list is too large. Here is the client-side code:

            var q = MvcApplication.CurrentSession.Query<BrandPointReduce>("PointDetail/ByAccountGroupedByRetailer")
                .TransformWith<GroupByRetailerTransformer, BrandPointReduce>()
                .Where(r => r.RetailerId.In(likes));

            var brandstats = new List<BrandPointReduce>();

            using (var enumerator = MvcApplication.CurrentSession.Advanced.Stream<BrandPointReduce>(q))
            {
                while (enumerator.MoveNext())
                {
                    BrandPointReduce pd = enumerator.Current.Document;
                    brandstats.Add(pd);
                }
            }

I get one trip for every entry in "likes", and if I don't use the stream, I only get back one incomplete result. I thought it was better to post this as a new topic so that it didn't get lost.



Oren Eini (Ayende Rahien)

Apr 23, 2014, 1:08:17 PM
to ravendb
Please show the Fiddler output.


