CPU peaks at 100% and RavenDB times out each time periodic export runs


Alexey Zimarev

Mar 28, 2017, 5:24:01 AM
to RavenDB - 2nd generation document database
I recently configured periodic export: a full export once a day and an incremental export every hour. Since then, we have been experiencing CPU peaks of 100%, although I am not sure how exactly they correlate in time with the exports.

During these peaks RavenDB does not answer any requests. All clients time out.

Is this normal? I was not expecting that. Of course, I am now thinking about exporting from the replica server instead, but I thought that database operations are naturally prioritised over any other jobs. Isn't that the case?

Oren Eini (Ayende Rahien)

Mar 28, 2017, 6:08:07 AM
to ravendb
No, it isn't normal. Can you take a minidump while this is happening?

Hibernating Rhinos Ltd  

Oren Eini l CEO Mobile: + 972-52-548-6969

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811

 


--
You received this message because you are subscribed to the Google Groups "RavenDB - 2nd generation document database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Plamen Boudinov

Mar 29, 2017, 3:20:49 AM
to RavenDB - 2nd generation document database
We had the same problems some time back, and they turned out to be memory management issues when working with large (> 5 MB) documents. I believe the Raven team has made some fixes since then, which made it into a 3.xx version too.

Alexey Zimarev

Mar 29, 2017, 3:47:32 AM
to RavenDB - 2nd generation document database
It appears to be a memory issue. We had 8 GB on the machine and RavenDB consumed all of it. We have now increased it to 20 GB, and overnight the RavenDB memory footprint grew from 7 GB to 15 GB.
We have around 200 databases and 200 file systems (RavenFS). Do we need to assume that adding more databases will require more memory? If yes, then how much?

Michael Yarichuk

Mar 29, 2017, 4:09:41 AM
to RavenDB - 2nd generation document database
RavenDB will try to use as much memory as possible (unless configured otherwise).

In general, yes, more databases will need more memory. How much? Hard to say: memory utilization depends heavily on the usage pattern of each database and on how active it is.
(For example, if you only query a certain database once every 24 hours, that database will get unloaded and won't take up memory the rest of the time.)
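To make the unloading behaviour concrete, here is a minimal toy model in Python. It is not RavenDB code: the class name, the method names, and the 900-second idle limit are all assumptions made up for illustration. It only shows the general idea that a database occupies memory while it is being used and is dropped once it has been idle for too long.

```python
import time


class IdleDatabaseCache:
    """Toy model of loading databases on demand and unloading idle ones.

    Hypothetical illustration only; this is not how RavenDB is implemented.
    """

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self.clock = clock
        self.last_access = {}  # database name -> time of the last request

    def request(self, name):
        """Record a request; return True if the database had to be loaded."""
        loaded_now = name not in self.last_access
        self.last_access[name] = self.clock()
        return loaded_now

    def unload_idle(self):
        """Unload databases idle longer than the limit; return their names."""
        now = self.clock()
        idle = [name for name, seen in self.last_access.items()
                if now - seen > self.max_idle_seconds]
        for name in idle:
            del self.last_access[name]
        return sorted(idle)

    def loaded(self):
        """Names of the databases currently held in memory."""
        return sorted(self.last_access)


# Usage with a fake clock so the example is deterministic.
fake_now = [0.0]
cache = IdleDatabaseCache(max_idle_seconds=900, clock=lambda: fake_now[0])
cache.request("tenant-a")
cache.request("tenant-b")
fake_now[0] = 600.0
cache.request("tenant-a")        # tenant-a stays active
fake_now[0] = 1000.0
unloaded = cache.unload_idle()   # tenant-b has been idle for 1000 s
print(unloaded, cache.loaded())
```

In this model the memory footprint follows the number of recently used databases, not the total number of databases on the server.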





--
Best regards,

 

Hibernating Rhinos Ltd

Michael Yarichuk l RavenDB Core Team 

RavenDB paving the way to "Data Made Simple"   http://ravendb.net/  

Alexey Zimarev

Mar 29, 2017, 4:12:39 AM
to RavenDB - 2nd generation document database
Let me formulate it differently. 

If I have a server with 32 GB of RAM and 10,000 databases, all of them rather small (10 MB max), will this work?

Michael Yarichuk

Mar 29, 2017, 4:56:05 AM
to RavenDB - 2nd generation document database
The best answer I can give to this question is: it depends. Off the top of my head I see several _potential_ issues with such a setup.
1) I/O bottleneck, especially during peak times: each database instance uses I/O to read and write changes to documents and to do indexing. The I/O overhead depends on the frequency of CRUD operations and on the indexes you have on each database. Thus, each database instance will compete for I/O with all other instances.
2) Threading issues: again, this depends on the usage pattern, especially on how many databases will be active concurrently. Even if you have lots of cores in such a machine, the databases use threads, indexes use threads, HTTP requests use threads, etc.
3) Memory issues: each database has a memory usage overhead that depends on the usage pattern. This, too, will depend heavily on how many databases are active concurrently.
4) Parallel startup of the databases: after some time of inactivity, databases get unloaded from memory. A query or CRUD request to such a database will cause it to "wake up".
If you had a sudden influx of "wake-up" requests to, let's say, hundreds of databases at the same time, it would cause a major I/O and resource bottleneck, because the databases that are "waking up" would compete for resources and mutexes.

To summarize, I think that such a setup is likely to have stability issues during activity peaks, to say the least.

Also, note that in general, with RavenDB, it is much more performant to have a single 100 GB database than 10,000 small 10 MB databases.
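As a rough back-of-envelope illustration of the memory point, the following Python sketch divides the 32 GB from the question evenly among the databases that are active at the same time. The even split and the function name are assumptions for illustration only. It ignores the OS and the server's own overhead, and real usage depends on the usage pattern as described above.

```python
def per_database_budget_mb(total_ram_gb, active_databases, reserved_gb=0.0):
    """Average RAM available per concurrently active database, in MB.

    Simplifying assumption: memory is split evenly among active databases.
    """
    if active_databases <= 0:
        raise ValueError("active_databases must be positive")
    usable_mb = (total_ram_gb - reserved_gb) * 1024
    return usable_mb / active_databases


# 32 GB of RAM shared by 100, 1,000 and 10,000 concurrently active databases.
for active in (100, 1_000, 10_000):
    print(active, round(per_database_budget_mb(32, active), 1))
# prints:
# 100 327.7
# 1000 32.8
# 10000 3.3
```

With all 10,000 databases active at once, each one gets only about 3 MB on average, which is why the number of concurrently active databases matters more than the total count.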


Alexey Zimarev

Mar 29, 2017, 7:10:24 AM
to RavenDB - 2nd generation document database
Well, we have a multi-tenant application, and it is quite easy to use a database per tenant. When I asked in this group two years ago, the answer was "RavenDb supports as many databases as OS supports files in one directory". Querying one small tenant database only has to go through a very small index. If we put everything in one or more large databases, each index would also have to include the TenantId column, which does not sound very efficient.

Oren Eini (Ayende Rahien)

Mar 29, 2017, 8:26:44 AM
to ravendb
Alexey,
We don't have an issue with the number of databases, we have an issue with the concurrent usage.

As a simple example, try putting 1,000 15 GB files in the same directory, then try to read 150 of them all at the same time. You'll find that you spend most of your time with each file fighting with the other files.
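A scaled-down version of this experiment can be sketched in Python. The file count, file size, and function names below are assumptions chosen so that the script runs quickly. Because the files are tiny and freshly written, they will most likely be served from the OS cache, so this only shows the shape of the experiment. Seeing real contention requires files larger than RAM on a real disk.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def create_files(directory, count, size_bytes):
    """Create `count` files of `size_bytes` random bytes; return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"file-{i:05d}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size_bytes))
        paths.append(path)
    return paths


def read_file(path, chunk_size=64 * 1024):
    """Read a file in chunks and return the number of bytes read."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += len(chunk)
    return total


def timed_read(paths, workers):
    """Read all paths using `workers` threads; return (bytes_read, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(read_file, paths))
    return total, time.perf_counter() - start


def run_experiment(file_count=100, size_bytes=256 * 1024, concurrent=15):
    """Read the same subset of files one at a time, then all at once."""
    with tempfile.TemporaryDirectory() as directory:
        paths = create_files(directory, file_count, size_bytes)
        subset = paths[:concurrent]
        sequential = timed_read(subset, workers=1)
        parallel = timed_read(subset, workers=concurrent)
    return sequential, parallel


if __name__ == "__main__":
    (seq_bytes, seq_s), (par_bytes, par_s) = run_experiment()
    print(f"sequential: {seq_bytes} bytes in {seq_s:.3f} s")
    print(f"concurrent: {par_bytes} bytes in {par_s:.3f} s")
```

Comparing the two timings while increasing `size_bytes` and `concurrent` gives a feel for how the concurrent readers start competing for the same disk.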


Alexey Zimarev

Mar 29, 2017, 3:47:51 PM
to rav...@googlegroups.com
Oren, I agree, but I am not talking about 15 GB files but rather 15 MB files.

Oren Eini (Ayende Rahien)

Mar 29, 2017, 3:49:33 PM
to ravendb
Do the same with 1,500 files and you get the same result. That is why we say that it is all about the usage pattern.

Alexey Zimarev

Mar 29, 2017, 3:50:31 PM
to rav...@googlegroups.com
So this means multitenancy with a database per tenant is a bad idea.

Oren Eini (Ayende Rahien)

Mar 29, 2017, 3:54:49 PM
to ravendb
If you intend to run thousands of tenants on the same node, all of which are active, yes.
A DB is a non-trivial resource: it has its own threads, memory, buffers, etc.

We can easily handle a few hundred per node, but the idea is that each DB is a pretty big one.

Alexey Zimarev

Mar 30, 2017, 2:23:22 AM
to rav...@googlegroups.com
The current issue is that we have about 200 tenant databases, and most of them, though not all, are active “now and then”. And we get these CPU spikes for about 30-60 seconds, during which RavenDB becomes unavailable.


Tal Weiss

Mar 30, 2017, 2:40:16 AM
to RavenDB - 2nd generation document database
Alexey, what version of RavenDB server are you using? We have made multiple changes to better support multiple active tenants (in v3.5).


Alexey Zimarev

Mar 31, 2017, 3:25:18 PM
to rav...@googlegroups.com
We are using 3.5. Currently we are removing the use of RavenFS. I suspect large files might cause some issues.