Find the memory leak!


Tobi

Jul 15, 2011, 11:52:49 AM7/15/11
to ravendb
It's Friday afternoon, and I'm probably trying to do something stupid
here, but I fail to see what's going wrong.

The code below will cause an OutOfMemoryException with a database
containing a lot of Sale documents.

Where's the memory leak?

// usings assumed for an embedded-client build of this era:
using System.Threading;
using Raven.Abstractions.Data;   // IndexQuery, QueryResult
using Raven.Client.Embedded;     // EmbeddableDocumentStore

using (var documentStore = new EmbeddableDocumentStore())
{
    documentStore.Configuration.DataDirectory = "Data";
    documentStore.Configuration.DefaultStorageTypeName = "esent";
    documentStore.Initialize();

    var skip = 0;
    while (true)
    {
        // query one page of Sale documents via the built-in entity-name
        // index, waiting until the index is no longer stale
        QueryResult queryResult;
        do
        {
            queryResult = documentStore.DatabaseCommands.Query(
                "Raven/DocumentsByEntityName",
                new IndexQuery
                {
                    Query = "Tag:[[Sales]]",
                    Start = skip,
                    PageSize = 1024,
                }, null);
            if (queryResult.IsStale) Thread.Sleep(100);
        } while (queryResult.IsStale);

        if (queryResult.Results.Count == 0) break;
        skip += queryResult.Results.Count;
    }
}

Tobias

Ryan Heath

Jul 15, 2011, 12:16:08 PM7/15/11
to rav...@googlegroups.com
documentStore is caching the results returned from the server?

// Ryan

Tobi

Jul 15, 2011, 12:28:17 PM7/15/11
to rav...@googlegroups.com
On 15.07.2011 18:16, Ryan Heath wrote:

> documentStore is caching the results returned from the server?

It's running local/embedded, so it shouldn't do any caching at all
outside the session scope.

Tobias

Matt Warren

Jul 15, 2011, 12:33:35 PM7/15/11
to ravendb
Are you running under 32 or 64-bit and how much memory is it consuming
when it blows up?

Tobi

Jul 15, 2011, 1:45:24 PM7/15/11
to rav...@googlegroups.com
On 15.07.2011 18:33, Matt Warren wrote:

> Are you running under 32 or 64-bit and how much memory is it consuming
> when it blows up?

The OS is running 64bit, the app is compiled for 32bit. It blows up at
1.6 GB after about 30 seconds.

Tobias

Matt Warren

Jul 15, 2011, 3:13:36 PM7/15/11
to ravendb
> It's running local/embedded, so it shouldn't do any caching at all
> outside the session scope.

That's true, but you're running in embedded mode, so the Client and
Server are in the same process. If you were to run it as a separate
client & server, I would think that all the memory usage would be on the
Server side, not in the Client.

> The OS is running 64bit, the app is compiled for 32bit. It blows up at
> 1.6 GB after about 30 seconds.

Raven generally doesn't run well under 32-bit, as it quite easily uses
up the ~2GB of memory available. You can try changing some of the
memory settings outlined in http://ravendb.net/faq/low-memory-footprint
and http://ravendb.net/documentation/configuration.

However, it may well just be down to high memory usage in Raven and/or
Lucene, as you are doing deep paging, which means that in effect all the
docs are being queried from the index and read from the data store. So
both Raven and Lucene will be doing a lot of caching.

Tobi

Jul 15, 2011, 4:47:26 PM7/15/11
to rav...@googlegroups.com
On 15.07.2011 21:13, Matt Warren wrote:

> Raven generally doesn't run well under 32-bit, as it quite easily uses
> up the ~2GB of memory available.

So far it hasn't been a problem (besides some memory leaks in previous
releases), and I've been using Raven for about a year now on dozens of
systems. And 64-bit isn't really an option - a lot of the embedded systems
we use don't have a 64-bit-capable CPU, and usually there's no more than
2 GB RAM available. And I have Raven running on 512 MB systems as well!

> You can try changing some of the
> memory settings outlined in http://ravendb.net/faq/low-memory-footprint
> and http://ravendb.net/documentation/configuration.

This won't help.

> However, it may well just be down to high memory usage in Raven and/or
> Lucene, as you are doing deep paging, which means that in effect all the
> docs are being queried from the index and read from the data store. So
> both Raven and Lucene will be doing a lot of caching.

I still think this is a bug. I'll check this against previous versions
next week, but I'm pretty sure it worked one or two stable builds back.
Raven shouldn't do this amount of useless heavy caching - not even in a
Client/Server scenario.

Tobias

Matt Warren

Jul 15, 2011, 5:12:25 PM7/15/11
to ravendb
Fair enough, you're right, Raven is probably doing a bit too much
caching.

But is this test a scenario that you run regularly? I.e. do you have a
need to page through all the docs in the database, or is it just a test
that reproduces your issue more quickly?


Tobi

Jul 15, 2011, 6:56:23 PM7/15/11
to rav...@googlegroups.com
On 15.07.2011 23:12, Matt Warren wrote:

> But is this test a scenario that you run regularly? I.e. do you have a
> need to page through all the docs in the database, or is it just a test
> that reproduces your issue more quickly?

I need to do this for data migration. I made some modifications to my
document classes and now need to change the stored docs accordingly.
Until now I could do minor changes with patch commands, but this time
it's more complicated, so I need to touch all documents (well, not ALL
documents, just the docs of a specific type) myself.

Tobias

Matt Warren

Jul 16, 2011, 10:04:21 AM7/16/11
to ravendb
You might want to get those docs via:
documentStore.DatabaseCommands.StartsWith("sale", <page>, <size>)

This will pull docs directly from the Esent/Munin datastore, bypassing
the Lucene index completely. You can also page through them in the
same way as you would with the query.

It might not fix the issue, but it's a more efficient way to do it (as
long as all the docs you want share the same prefix).
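
Roughly, that paging pattern could look like the sketch below (the
"sales/" key prefix is an assumption about your document keys, and the
JsonDocument[] return type of StartsWith(keyPrefix, start, pageSize) is
assumed from builds of that era):

var start = 0;
const int pageSize = 1024;
while (true)
{
    // Pulls documents straight out of the Esent/Munin document store by
    // key prefix, without touching the Lucene index.
    var docs = documentStore.DatabaseCommands.StartsWith("sales/", start, pageSize);
    if (docs.Length == 0) break;

    foreach (var doc in docs)
    {
        // doc.DataAsJson and doc.Metadata can be modified here and written
        // back with documentStore.DatabaseCommands.Put(...).
    }

    start += docs.Length;
}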

Matt Warren

Jul 16, 2011, 10:15:59 AM7/16/11
to ravendb
I've done a bit of digging and Raven uses the
System.Runtime.Caching.MemoryCache internally to cache docs after
they've been read.

However, I can't find anywhere that it exposes the cache settings. But
according to this page you can configure the MemoryCache via a config
file, see http://msdn.microsoft.com/en-us/library/dd941875.aspx. So
you could try that, but I'm guessing a bit here!

Tobi

Jul 16, 2011, 3:28:06 PM7/16/11
to rav...@googlegroups.com
On 16.07.2011 16:04, Matt Warren wrote:

> You might want to get those docs via:
> documentStore.DatabaseCommands.StartsWith("sale", <page>, <size>)

Nice - I wasn't aware of this.

> This will pull docs directly from the Esent/Munin datastore, bypassing
> the Lucene index completely. You can also page through them in the
> same way as you would with the query.
>
> It might not fix the issue, but it's a more efficient way (as long as
> all the docs you want have the same prefix).

For the purpose of data migration this is definitely better than dealing
with indexes. I'll try this as soon as I'm back at work. Thx for the hint!

Tobias

Tobi

Jul 16, 2011, 3:31:15 PM7/16/11
to rav...@googlegroups.com
On 16.07.2011 16:15, Matt Warren wrote:

> I've done a bit of digging and Raven uses the
> System.Runtime.Caching.MemoryCache internally to cache docs after
> they've been read.

I don't think this is causing the problem. I did a quick check with a
memory profiler last Friday and it's a dictionary that sucks up all the
memory. I haven't had a chance to pinpoint this yet - such shit always
happens on Friday afternoon :-)

Tobias

Tobi

Jul 18, 2011, 6:59:54 AM7/18/11
to ravendb
I did some more testing.

First: I was wrong in my assumption that this had worked with a
previous version. I've tried every stable release back to January and
always got an OOME.

Next I tried different ways to query the docs:

1. Using DatabaseCommands.Query("Raven/DocumentsByEntityName", ...):

Retrieved docs: 20480
OutOfMemoryException
Used memory   : 1680752640
Used time     : 00:01:10.8783754

2. Using LuceneQuery<dynamic>("Raven/DocumentsByEntityName"):

Retrieved docs: 24908
Used memory   : 1406980096
Used time     : 00:02:35.3375286

3. Using GetDocumentsWithIdStartingWith():

Retrieved docs: 24908
Used memory   : 434380800
Used time     : 00:00:43.7233648

Only the IndexQuery via DatabaseCommands causes an OOME.
But the LuceneQuery will throw an OOME as well if I simply
read more docs.

GetDocumentsWithIdStartingWith() is by far the fastest method,
using the least memory.
This is probably the best way for what I need to do (modify
the structure of all docs of a specific type). I'm just not
sure if this will work if I modify the docs while paging
through them via GetDocumentsWithIdStartingWith().

I still don't like that RavenDB OOMEs this easily.
Trying the following:

using (DocumentCacher.SkipSettingDocumentsInDocumentCache())
{
...
}

...makes even the first method work.

So I guess Matt was right and this is only a caching issue.

It would be nice to be able to configure the cache parameters in
code.

I'll try to provide a pull request for this soon.

Tobias

Matt Warren

Jul 18, 2011, 7:41:03 AM7/18/11
to ravendb
> ... It's not a caching issue

I had a quick look at it this morning and came to the same conclusion.
I think the problem is that Raven is allocating memory so fast that
the GC can't keep up.

With the original method, add the following lines of code after each
query:

GC.Collect();
GC.WaitForFullGCComplete(2000);

You can see that the memory usage is more reasonable. (I know this
isn't a fix or advisable in production code, it's just to show the
point.)
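
In the original loop that would look something like this (again, just
to demonstrate the point, not something for production code):

    if (queryResult.Results.Count == 0) break;
    skip += queryResult.Results.Count;

    // force a blocking full collection after each page - diagnostic only
    GC.Collect();
    GC.WaitForFullGCComplete(2000);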

Matt Warren

Jul 18, 2011, 7:44:36 AM7/18/11
to ravendb
> > GetDocumentsWithIdStartingWith() is by far the fastest method,
> > using the least memory.
> > This is probably the best way for what I need to do (modify
> > the structure of all docs of a specific type). I'm just not
> > sure if this will work if I modify the docs while paging
> > through them via GetDocumentsWithIdStartingWith().

It's more robust pulling them directly than going via the Lucene
index. If you modify a doc then Raven gets Lucene to re-index it and
as part of that Lucene gives it a new Lucene doc Id. This in turn
affects the default "relevance" order, so the paging could be
affected.

Tobi

Jul 18, 2011, 8:32:11 AM7/18/11
to rav...@googlegroups.com
On 18.07.2011 13:41, Matt Warren wrote:

> With the original method, add the following lines of code after each
> query:
> GC.Collect();
> GC.WaitForFullGCComplete(2000);
> You can see that the memory usage is more reasonable. (I know this
> isn't a fix or advisable in production code, it's just to show the
> point.)

Setting the configuration for MemoryCache works too. All I need is
to decrease the polling interval, which is 2 minutes by default.
(Meaning the cache limits are only checked every 2 minutes, but I reach
the memory limit much earlier.)

http://msdn.microsoft.com/en-us/library/dd941875.aspx

The documentation just seems to be slightly wrong. This is a working
example:

<system.runtime.caching>
  <memoryCache>
    <namedCaches>
      <add name="Default"
           cacheMemoryLimitMegabytes="0"
           physicalMemoryLimitPercentage="50"
           pollingInterval="00:01:00" />
    </namedCaches>
  </memoryCache>
</system.runtime.caching>

But I'd prefer to be able to set this via the Raven configuration in
code, like this:

https://github.com/e-tobi/ravendb/tree/ExposeCachingParameters

It would be possible to expose the megabyte limit as well, but I think
the percentage limit is all anyone will ever need.
The TTL of the cached items could be made configurable as well,
which might be a little bit more helpful.
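
A rough sketch of what that could look like (the property names are
taken from the branch above; where exactly they live and the TimeSpan
type for the check interval are assumptions):

using (var documentStore = new EmbeddableDocumentStore())
{
    documentStore.Configuration.DataDirectory = "Data";
    // assumed: settings from the ExposeCachingParameters branch, applied
    // before Initialize() so the internal MemoryCache picks them up
    documentStore.Configuration.MemoryCacheLimitPercentage = 50;
    documentStore.Configuration.MemoryCacheLimitCheckInterval = TimeSpan.FromMinutes(1);
    documentStore.Initialize();

    // ...
}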

Tobias

Ayende Rahien

Jul 19, 2011, 7:54:19 AM7/19/11
to rav...@googlegroups.com
Tobi,
Great job on figuring this out. I am sorry I wasn't able to lend assistance, but it seems that you figured it out on your own - this seems wonderful!
I'll pull your changes.

Ayende Rahien

Jul 19, 2011, 7:54:37 AM7/19/11
to rav...@googlegroups.com
Urgh!
This seems like a wonderful topic for a blog post, I meant.

Tobi

Jul 19, 2011, 8:29:00 AM7/19/11
to rav...@googlegroups.com
On 19.07.2011 13:54, Ayende Rahien wrote:

> Tobi,
> Great job on figuring this out

Some credit goes to Matt as well - he guessed it was a memory cache issue
before I could track it down :-)

> but it seems that you figured it out on your own, this seems wonderful!
> I'll pull your changes.

I haven't done a pull request yet, because I was thinking about adding
a TTL setting for the cached entries as well. But if you think that
MemoryCacheLimitPercentage and MemoryCacheLimitCheckInterval are
enough, please go ahead.

Tobias

Ayende Rahien

Jul 19, 2011, 8:30:59 AM7/19/11
to rav...@googlegroups.com
I don't think a TTL would work - we probably want to keep cached items
in memory for as long as possible.