Order in Map-Reduce index

Steven Zhang

Jan 13, 2011, 5:05:53 AM
to rav...@googlegroups.com
Hi,

I tried to use "orderby" in the Reduce clause, but it does not seem to work.

For example, I made some changes to the code from the post http://mookid.dk/oncode/archives/1689:

public class AggregateAmountsPerItem : AbstractIndexCreationTask
{
    public override IndexDefinition CreateIndexDefinition()
    {
        return new IndexDefinition<Order, ItemAggregate>
                    {
                        Map = orders => from order in orders
                                        from item in order.Items
                                        select new {item.Name, item.Amount},
                        Reduce = items => from item in items
                                            group item by item.Name into i
                                            let amount = i.Max(x => x.Amount)
                                            orderby amount descending
                                            select new {Name = i.Key, Amount = i.Sum(x => x.Amount)}
                    }.ToIndexDefinition(DocumentStore.Conventions);
    }
}

But it did not work.

Has anyone tried "orderby" in a Map-Reduce index?

--
Regards,
Steven Zhang

Ayende Rahien

Jan 13, 2011, 5:26:15 AM
to rav...@googlegroups.com
Sorting in RavenDB is done at query time, not at write time.
It doesn't really make sense to order results at write time.
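
To illustrate the split between aggregating at write time and sorting at query time, here is a minimal in-memory sketch. It uses plain LINQ to Objects, not the RavenDB client API, and the names Item, ItemAggregate, and QueryTimeSortSketch are hypothetical stand-ins for the types in the question. The Reduce method only groups and sums; the ordering is applied later, when the results are read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an item inside an Order document.
public class Item
{
    public string Name { get; set; }
    public int Amount { get; set; }
}

// Hypothetical stand-in for the reduce result.
public class ItemAggregate
{
    public string Name { get; set; }
    public int Amount { get; set; }
}

public static class QueryTimeSortSketch
{
    // Plays the role of the Reduce step: group and sum, with no ordering.
    public static IEnumerable<ItemAggregate> Reduce(IEnumerable<Item> items)
    {
        return from item in items
               group item by item.Name into g
               select new ItemAggregate { Name = g.Key, Amount = g.Sum(x => x.Amount) };
    }

    // Plays the role of the query: ordering is applied when results are read.
    public static List<ItemAggregate> QuerySortedByAmount(IEnumerable<ItemAggregate> indexEntries)
    {
        return indexEntries.OrderByDescending(x => x.Amount).ToList();
    }

    public static void Main()
    {
        var items = new[]
        {
            new Item { Name = "beer", Amount = 3 },
            new Item { Name = "nuts", Amount = 10 },
            new Item { Name = "beer", Amount = 4 },
        };

        foreach (var entry in QuerySortedByAmount(Reduce(items)))
            Console.WriteLine(entry.Name + ": " + entry.Amount);
    }
}
```

With a real RavenDB session the ordering would go on the query against the index instead of on an in-memory list; the exact call depends on the client version, so it is not shown here.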

Ayende Rahien

Jan 13, 2011, 7:42:50 AM
to rav...@googlegroups.com
Yes it should

On Thu, Jan 13, 2011 at 2:39 PM, Steven Zhang <jdom...@gmail.com> wrote:
Hi Ayende,

Got it. Thanks for your answer.

BTW, it seems we cannot use Max on a "long" or "DateTime" value.

For example, in the code below, if the type of "Amount" is "long" or "DateTime", would it work?


public class AggregateAmountsPerItem : AbstractIndexCreationTask
{
    public override IndexDefinition CreateIndexDefinition()
    {
        return new IndexDefinition<Order, ItemAggregate>
                    {
                        Map = orders => from order in orders
                                        from item in order.Items
                                        select new {item.Name, item.Amount},
                        Reduce = items => from item in items
                                            group item by item.Name into i
                                            let amount = i.Max(x => x.Amount)
                                            select new {Name = i.Key, Amount = amount}
                    }.ToIndexDefinition(DocumentStore.Conventions);
    }
}


--
Regards,
Steven Zhang

Steven Zhang

Jan 13, 2011, 8:05:38 AM
to rav...@googlegroups.com
But when I use Max on a 'DateTime' value, I get the error below.

Error On:TagAggregateIndex
Error: Cannot convert null to 'int' because it is a non-nullable value type
TimeStamp: Thu Jan 13 2011 20:56:49.166
Document: null
--
Regards,
Steven Zhang

Ayende Rahien

Jan 13, 2011, 8:39:01 AM
to rav...@googlegroups.com
It looks like you hit a null somewhere, is that the case?

Steven Zhang

Jan 13, 2011, 8:53:43 AM
to rav...@googlegroups.com
Not really. I only have 2 documents, and both of them have a value in the LastUpdateTime field.

My code is below. Also, if I comment out the 2 lines that set "LastUpdateTime", it works.

    public class TagAggregateIndex : AbstractIndexCreationTask
    {
        public override IndexDefinition CreateIndexDefinition()
        {
            return new IndexDefinition<Resource, Tag>
            {
                Map = resources => from resource in resources
                                   from tag in resource.Tags
                                   select new
                                   {
                                       Name = tag.Name.ToLower(),
                                       LastUpdateTime = resource.LastUpdateTime,
                                   },

                Reduce = items => from item in items
                                  group item by item.Name into g
                                  let lastUpdateTime = g.Max(x=>x.LastUpdateTime)
                                  let totalCount = g.Count()
                                  select new
                                  {
                                      Name = g.Key,
                                      TotalCount = totalCount,
                                      LastUpdateTime = lastUpdateTime,
                                  },

            }.ToIndexDefinition(DocumentStore.Conventions);

        }
    }
--
Regards,
Steven Zhang

Ayende Rahien

Jan 13, 2011, 8:59:37 AM
to rav...@googlegroups.com
Can you try sending a full repro?

Steven Zhang

Jan 13, 2011, 9:55:49 AM
to rav...@googlegroups.com
Sure. It is a project on GitHub: https://github.com/agilewizard/agilewizard/

I created a branch 'forayende' for you to review.

Some information about the project structure:
  • RavenDB Server: "/tool/ravendb/Server" folder (you probably don't need it)
  • Domain Models: "/src/AgileWizard.Domain/Models" folder
    • Resource: The type of the document
    • Tag: The type of the Reduce result
  • Map-Reduce index: /src/AgileWizard.Domain/QueryIndexes/TagAggregateIndex.cs
  • Integration Test Project: 
    • Uses SpecFlow (NUnit); you don't need to install SpecFlow
    • /src/AgileWizard.IntegrationTests/Steps/Tag.cs: the steps actually invoked by the unit test
    • /src/AgileWizard.IntegrationTests/EventDefinition.cs: when the tests run, it clears all documents, creates some initial documents, and rebuilds the index
To reproduce the index problem:
  • Run RavenDB (default port 8080)
  • Open the AgileWizard solution
  • Run the unit tests in the "AgileWizard.IntegrationTests/Features/Tag.feature.cs" file
--
Regards,
Steven Zhang

Steven Zhang

Jan 13, 2011, 10:12:45 AM
to rav...@googlegroups.com
After reproducing the problem, go to http://localhost:8080/raven/statistics.html

You will see an error log like this:

Error On:TagAggregateIndex
Error: Cannot convert null to 'int' because it is a non-nullable value type
TimeStamp: Thu Jan 13 2011 22:26:12.57
Document: null
--
Regards,
Steven Zhang

Ayende Rahien

Jan 13, 2011, 12:02:24 PM
to rav...@googlegroups.com
I had something similar to this in mind:

Your approach would take quite a bit of time to set up.

Steven Zhang

Jan 14, 2011, 12:00:08 PM
to rav...@googlegroups.com
Hi Ayende,

I can reproduce the defect by making some changes based on your sample.

The sample code is here: https://gist.github.com/779874
--
Regards,
Steven Zhang

Ayende Rahien

Jan 14, 2011, 4:16:18 PM
to rav...@googlegroups.com
Thanks, I'll look at that on Sunday.

Ayende Rahien

Jan 16, 2011, 8:42:30 AM
to rav...@googlegroups.com
The basic problem is that because we don't have a type for the value, it selects int by default.
In order to make both indexes work, you need to write them like this:

    private const string reduce = @"
from agg in results
group agg by agg.Name into g
let createdTime = g.Max(x => (long)x.CreatedTime.Ticks)
select new {Name = g.Key, CreatedTime = createdTime}
";


    private const string reduce = @"
from agg in results
group agg by agg.Name into g
let createdTimeTicks = g.Max(x => (long)x.CreatedTimeTicks)
select new {Name = g.Key, CreatedTimeTicks = createdTimeTicks}
";
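
As a rough illustration of the same idea outside the index, here is a plain LINQ to Objects sketch that takes the maximum over Ticks as a long and then rebuilds a DateTime. It is not RavenDB's index compiler, and TagEntry and MaxTicksSketch are hypothetical names. In statically typed code like this the (long) cast is redundant, since Ticks is already a long; it is kept only to mirror the reduce strings above, where the type is not known.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one mapped entry (a name plus a timestamp).
public class TagEntry
{
    public string Name { get; set; }
    public DateTime CreatedTime { get; set; }
}

public static class MaxTicksSketch
{
    // Finds the latest timestamp per name by comparing Ticks as long values,
    // then converts the winning tick count back into a DateTime.
    public static Dictionary<string, DateTime> LatestPerName(IEnumerable<TagEntry> entries)
    {
        var reduced = from e in entries
                      group e by e.Name into g
                      let createdTicks = g.Max(x => (long)x.CreatedTime.Ticks)
                      select new { Name = g.Key, CreatedTicks = createdTicks };

        return reduced.ToDictionary(x => x.Name, x => new DateTime(x.CreatedTicks));
    }

    public static void Main()
    {
        var entries = new[]
        {
            new TagEntry { Name = "raven", CreatedTime = new DateTime(2011, 1, 13) },
            new TagEntry { Name = "raven", CreatedTime = new DateTime(2011, 1, 16) },
            new TagEntry { Name = "index", CreatedTime = new DateTime(2011, 1, 14) },
        };

        foreach (var pair in LatestPerName(entries))
            Console.WriteLine(pair.Key + ": " + pair.Value.ToString("yyyy-MM-dd"));
    }
}
```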

Steven Zhang

Jan 17, 2011, 1:29:47 AM
to rav...@googlegroups.com
Thank you Ayende, it makes sense to me.
--
Regards,
Steven Zhang