Slow In Operator performance in RavenDb 3


Peter Balzli

Mar 26, 2015, 12:49:01 PM
to rav...@googlegroups.com
Hi

In RavenDb 3.0.3599, the IN operator is much slower than in RavenDb 2.5.

public class TestClass {
    public Guid Id { get; set; }
    public string Test { get; set; }
}


var list = Enumerable.Range(0, 100).Select(i => "Test" + i).ToList();

using (var session = documentStore.OpenSession()) {
    var testClasses = session.Query<TestClass>().Take(1024).Where(t => t.Test.In(list)).ToList();
}

In RavenDb 3 this query takes about 350ms on average; in RavenDb 2.5 it takes about 70ms.

Please see attached test project for details.
RavenDBInOperatorTests.zip

Peter Balzli

Mar 30, 2015, 2:38:00 AM
to rav...@googlegroups.com
When I increase the number of arguments it's even worse.

Number of arguments: 200
RavenDb 2.5.2956: ~100ms
RavenDb 3.0.3599: ~1500ms


Number of arguments: 300
RavenDb 2.5.2956: ~100ms
RavenDb 3.0.3599: ~2500ms


Number of arguments: 400
RavenDb 2.5.2956: ~100ms
RavenDb 3.0.3599: ~4500ms

Number of arguments: 500
RavenDb 2.5.2956: ~120ms
RavenDb 3.0.3599: ~7500ms
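
Taking the RavenDb 3.0.3599 figures above at face value, one rough way to see how fast the cost grows is to compute the slope of log(time) against log(number of arguments) between measurements. The sketch below is only an illustration: `GrowthEstimate` and `Exponent` are hypothetical names, and the inputs are the approximate timings listed above, not new measurements.

```csharp
using System;

public static class GrowthEstimate
{
    // Slope of log(time) against log(argument count) between two measurements.
    // A slope near 1 means roughly linear growth; above 1 means worse than linear.
    public static double Exponent(int n1, double ms1, int n2, double ms2)
    {
        return Math.Log(ms2 / ms1) / Math.Log((double)n2 / n1);
    }

    public static void Main()
    {
        // Approximate timings reported above for RavenDb 3.0.3599.
        int[] args = { 200, 300, 400, 500 };
        double[] ms = { 1500, 2500, 4500, 7500 };

        // Exponent for each consecutive pair of measurements.
        for (int i = 1; i < args.Length; i++)
        {
            double k = Exponent(args[i - 1], ms[i - 1], args[i], ms[i]);
            Console.WriteLine("{0} -> {1} arguments: exponent ~ {2:F2}",
                args[i - 1], args[i], k);
        }

        // Exponent across the whole reported range.
        int last = args.Length - 1;
        Console.WriteLine("Overall: exponent ~ {0:F2}",
            Exponent(args[0], ms[0], args[last], ms[last]));
    }
}
```

With these numbers every step gives an exponent above 1, which is what worse-than-linear growth looks like. The same calculation on the 2.5.2956 figures gives values close to 0, since those timings barely move.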

Federico Lois

Mar 30, 2015, 8:32:00 AM
to rav...@googlegroups.com
Hi Peter,

Is that an automatic index, or is it already indexed?

Federico


Peter Balzli

Mar 30, 2015, 9:07:42 AM
to rav...@googlegroups.com
It's an auto index in the attached test project. 

But I have other examples in my code base where the same slowness appears in index queries.

Federico Lois

Mar 30, 2015, 9:56:40 AM
to rav...@googlegroups.com
It doesn't seem to be attached. Maybe you sent it through another channel. Can you resend it here so I can take a look?

Federico

Peter Balzli

Mar 30, 2015, 10:12:45 AM
to rav...@googlegroups.com
The test project is attached in my initial post.

Federico Lois

Mar 30, 2015, 10:50:25 AM
to rav...@googlegroups.com
Somehow it got messed up when I read it in the email. I found it in the group.

Federico Lois

Mar 30, 2015, 12:04:00 PM
to rav...@googlegroups.com
Hi Peter,

I have been looking into this issue and have a couple of conclusions.

- The timings are not completely right (in both cases). If you print the individual times, you will see that the first query takes 3400ms and the rest are along the lines of 100ms. That's the regular expression compiler at work.
- The issue is not the In operator; it is far more pervasive than that. The problem existed in 2.5 too, but because 2.5 supports fewer types of queries, the threshold is a bit higher.
- There is nothing I could cut from the QueryBuilder that would bring the behavior back to the 2.5 level without tackling the underlying problem.
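
The first point can be checked by timing the first call separately from the later ones. This is a minimal sketch, not code from the attached test project: `QueryTiming`, `Measure` and `fakeQuery` are hypothetical names, and the sleeping delegate only stands in for the real `session.Query<TestClass>()` call.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class QueryTiming
{
    // Runs the action once and records that time on its own, then runs it
    // 'runs' more times. Returns (first-call ms, mean ms of the later calls).
    public static Tuple<double, double> Measure(Action action, int runs)
    {
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");

        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        double first = sw.Elapsed.TotalMilliseconds;

        var later = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            sw.Restart();
            action();
            sw.Stop();
            later[i] = sw.Elapsed.TotalMilliseconds;
        }
        return Tuple.Create(first, later.Average());
    }

    public static void Main()
    {
        // Placeholder workload (an assumption for illustration): the first
        // call pays a one-time cost, later calls are cheaper.
        bool warmedUp = false;
        Action fakeQuery = () =>
        {
            Thread.Sleep(warmedUp ? 5 : 50);
            warmedUp = true;
        };

        var result = Measure(fakeQuery, 10);
        Console.WriteLine("first: {0:F0} ms, warm average: {1:F0} ms",
            result.Item1, result.Item2);
    }
}
```

Reporting the two numbers separately keeps a one-time startup cost from inflating the average, so the 2.5 and 3.0 figures can be compared on warmed-up queries only.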

Therefore, I am changing the name of the issue to "Improve the QueryBuilder with a different parsing model", because even if I could squeeze out some milliseconds here or there, the real culprit is that the cost grows non-linearly with the size of the query. Eventually the query reaches a threshold in size (more easily when you use the IN operator) and starts costing as much as you report.

Federico