Constantly high CPU usage (70%+) on a simple project

Marcelo Volmaro

unread,

Nov 24, 2015, 1:18:09 AM11/24/15

to RavenDB - 2nd generation document database

Hi,

I just ported a service app that was using mysql to RavenDB. I'm now in the process of testing it, and found that the application uses constantly about 70% of the CPU time.

The application is a simple logger that I use to log data from different sources. The current rate of messages is about 30/s. Each message is similar to this one:

{
    "Count": 1,
    "FirstOccurredOn": "2015-11-23T18:53:48.598028Z",
    "ApplicationName": "TaskManagerServer",
    "Details": "",
    "Id": "2dde00cd25ed4a35a32b0000000bf697",
    "OccurredOn": "2015-11-23T18:53:48.598028Z",
    "Tags": [],
    "TextMessage": "....(some text here)...",
    "Type": 4,
    "Metadata": {
        "ServerId": "1",
        "ProviderId": "1"
    }
}

I also have 3 simple indexes, as following:

public sealed class DebugMessages_ByDefaultIndexes : AbstractIndexCreationTask<DebugMessage>
{
	public DebugMessages_ByDefaultIndexes()
	{
		Map = messages => messages
			.Select(item => new
			{
				item.OccurredOn,
				item.ApplicationName,
				item.Type,
				item.Tags,
				_ = item.Metadata.Select(x => CreateField("Metadata_" + x.Key, x.Value))
			});
	}
}

public sealed class Tags_ByName : AbstractIndexCreationTask<DebugMessage, Tags_ByName.Tag>
{
	public sealed class Tag
	{
		public string Application { get; set; }
		public string Name { get; set; }
	}
		public Tags_ByName()
	{
		Map = messages => messages.SelectMany(message => message.Tags, (message, tag) => new Tag
			{
				Name = tag,
				Application = message.ApplicationName
			});
			Reduce = results => results.GroupBy(tag => new
		{
			tag.Application,
			tag.Name
		}).Select(g => new Tag
		{
			Application = g.Key.Application,
			Name = g.Key.Name
		});
	}
}

public sealed class Applications_ByName : AbstractIndexCreationTask<DebugMessage, Applications_ByName.Application>
{
	public sealed class Application
	{
		public string Name { get; set; }
	}
		public Applications_ByName()
	{
		Map = messages => messages.Select(x => new Application
		{
			Name = x.ApplicationName
		});
			Reduce = results => results.GroupBy(application => new
		{
			application.Name
		}).Select(g => new Application
		{
			Name = g.Key.Name
		});
	}
}

The applications is using RavenDB as embedded DB. Profiling the app shows that most of the time is spent on the indexing (as can be seen on the attached screenshoot).

I don't think the indexes are complex enough, and while the app was working with mysql, I can handle WAY more messages that what the system is currently handling (and I just started using it, not even using it at a 10% of what it should handle).

My questions are:

1) Are my indexes wrong? (I'm new to Raven... so that's the most probable thing).

2) I'm using Esent, as Voron seems to not be working on embedded (it says it doesn't works in 32bits, but the server where this app is running is 64 bits). Switching from embedded to Raven as a service may aleviate the issue?

3) I'm using an async session and adding the items one by one... maybe there is a better way? I receive the messages in batches, but the size of each batch is unknown (could be 1 message or 1000, depending on the load on the other side).

4) Any other ideas to help the system not be so taxed?

Thanks in advance.

ProfilerSession.png

Michael Yarichuk

unread,

Nov 24, 2015, 2:07:54 AM11/24/15

to RavenDB - 2nd generation document database

Hi,

First, what is the build that you are running?

1) The indexes look ok. Just to make sure, do you have suggestions enabled?

2) How is the app that runs Raven instance is compiled? AnyCPU? What happens if you compile it to x64 and run Voron?

3) Can you show code of how you do inserts now?

Also, you might want to take a look at Bulk Insert http://ravendb.net/docs/article-page/3.0/csharp/client-api/bulk-insert/how-to-work-with-bulk-insert-operation

4)

a. Can you send the profiler session so I can take a look?

b. Can you post your Raven configuration?

c. Can you share hardware specs of a machine your app runs on?

--
You received this message because you are subscribed to the Google Groups "RavenDB - 2nd generation document database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--

Best regards,

Michael Yarichuk

RavenDB Core Team

Tel: 972-4-6227811

Fax:972-153-4-6227811

Email : michael....@hibernatingrhinos.com

RavenDB paving the way to "Data Made Simple" http://ravendb.net/

Daniel Häfele

unread,

Nov 24, 2015, 2:47:34 AM11/24/15

to RavenDB - 2nd generation document database

In the last 2 indexes you can propably remove the reduce part and just load the data from the index.

This might also reduce the time needed to update the indexes.

Marcelo Volmaro

unread,

Nov 24, 2015, 7:44:39 AM11/24/15

to RavenDB - 2nd generation document database

On Tuesday, November 24, 2015 at 4:07:54 AM UTC-3, Michael Yarichuk wrote:

Hi,
First, what is the build that you are running?
1) The indexes look ok. Just to make sure, do you have suggestions enabled?

No suggestions at all.

2) How is the app that runs Raven instance is compiled? AnyCPU? What happens if you compile it to x64 and run Voron?

Yes, AnyCPU. Local tests seems to indicate that compiling for x64 makes voron works (at least it doesn't throws any exceptions).

3) Can you show code of how you do inserts now?

await _session.StoreAsync(message) in a foreach. Note that the insertion process doesn't takes almost any time (it's in the order of the ms.. 10/30ms depending on the number of messages per batch).

Also, you might want to take a look at Bulk Insert http://ravendb.net/docs/article-page/3.0/csharp/client-api/bulk-insert/how-to-work-with-bulk-insert-operation
4)
a. Can you send the profiler session so I can take a look?

Sure, no problem. Do you want a specific tool? I'm using Redgate ants for that, but can use any other tool if it has some trial/evaluation version.

b. Can you post your Raven configuration?

c. Can you share hardware specs of a machine your app runs on?

Windows server 2012R2 running on an Intel Xeon 2.6GHz, 2Gb ram, 40Gb hdd (standard disk, not sdd).

Marcelo Volmaro

unread,

Nov 24, 2015, 7:46:56 AM11/24/15

to RavenDB - 2nd generation document database

But the reduce part is there to get a single list of the tags/applications, not to do a projection. How do you suggest I could get a list of all the tags from all the messages, without having to query all of them?

Marcelo Volmaro

unread,

Nov 24, 2015, 7:58:49 AM11/24/15

to RavenDB - 2nd generation document database

Sorry, I forgot:

Build 3.0.3800, and no, I can't use batch inserts, as my messages already have an ID that I must respect.

On Tuesday, November 24, 2015 at 4:07:54 AM UTC-3, Michael Yarichuk wrote:

Oren Eini (Ayende Rahien)

unread,

Nov 24, 2015, 8:01:54 AM11/24/15

to ravendb

If you are writing constantly, we'll also be indexing constantly.

Hibernating Rhinos Ltd

Oren Eini l CEO l Mobile: + 972-52-548-6969

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811

On Mon, Nov 23, 2015 at 9:12 PM, Marcelo Volmaro <mvol...@gmail.com> wrote:

Marcelo Volmaro

unread,

Nov 24, 2015, 8:09:38 AM11/24/15

to RavenDB - 2nd generation document database

Sure, but the same app running on mysql doesn't event touch the 5% of the CPU, and I have the same indexes (with some limitations on the sql, as in order to index the metadata I have several columns where I store each value, and that's one of the reasons I wanted to switch to a nosql db...).

Oren Eini (Ayende Rahien)

unread,

Nov 24, 2015, 8:11:51 AM11/24/15

to ravendb

How many indexes total do you have, how many docs / sec?

Do you have a single instance, replication, sql replication, etc?

Marcelo Volmaro

unread,

Nov 24, 2015, 8:15:52 AM11/24/15

to rav...@googlegroups.com

Not sure what to answer as for how many indexes... Index classes are the 3 ones I shown on the first post. That's it. Fields, again, you can get a grasp from the first post, but right now I have about the equivalent to 16 indexed columns in sql (that includes both the static fields plus tags/metadata). This is embedded, so yes, single instance, no replication, no sql replication.

___________________
Marcelo Volmaro

--
You received this message because you are subscribed to a topic in the Google Groups "RavenDB - 2nd generation document database" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/ravendb/L3H1G8i-yeU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to ravendb+u...@googlegroups.com.

Oren Eini (Ayende Rahien)

unread,

Nov 24, 2015, 8:17:18 AM11/24/15

to ravendb

You write documents using session, with about 30 docs / sec, but how many per session call?

Marcelo Volmaro

unread,

Nov 24, 2015, 8:21:42 AM11/24/15

to rav...@googlegroups.com

It varies, but min 3, max 700 per session.

___________________
Marcelo Volmaro

Oren Eini (Ayende Rahien)

unread,

Nov 24, 2015, 8:24:58 AM11/24/15

to ravendb

Okay, with those rates, can you try using BulkInsert instead?

Marcelo Volmaro

unread,

Nov 24, 2015, 8:28:43 AM11/24/15

to rav...@googlegroups.com

But the documentation states that I can't use my own IDs on bulk insert. That would mean that I will have to change the model on the server (not really an issue), add another index (as I need to load documents by the original ID) and change my client (the one that every app uses to log messages to this app. That could be an issue)...

Is there a way I can use the IDs I already have, in bulk inserts?

___________________
Marcelo Volmaro

Oren Eini (Ayende Rahien)

unread,

Nov 24, 2015, 8:51:16 AM11/24/15

to ravendb

What do you mean, not use your own id?

Daniel Häfele

unread,

Nov 24, 2015, 8:52:15 AM11/24/15

to RavenDB - 2nd generation document database

I think you're mixing some things here.

If you're using the bulk insert feature, then you need to provide the IDs already in the client.

By default it will use HiLo if you don't specify an ID by yourself.

But in your case, you already have one.

So there should be no issue with bulk insert at all.

Marcelo Volmaro

unread,

Nov 24, 2015, 8:55:43 AM11/24/15

to rav...@googlegroups.com

Raven docs says on BulkInserts "Entity Id must be provided by the client. The client by default will use the HiLo generator in order to generate the Id.". My entities (messages) already have an ID at insertion time, that it is being generated per app basis (each app that logs messages generates its own ids).

So, basically, this is the body of an entity being inserted:

{

"Count": 1,

"FirstOccurredOn": "2015-11-24T13:54:40.9691384Z",

"ApplicationName": "TaskManagerServer",

"Details": "",

"Id": "e780334f30ba4b34ab3f0000001ce3ef",

"OccurredOn": "2015-11-24T13:54:40.9691384Z",

"Tags": [],

"TextMessage": "El cliente solicitó el Task",

"Type": 4,

"Metadata": {

"ServerId": "1",

"TaskId": "c6a82530ea5742348f08d591921d2bd7",

"PluginId": "7",

"ClientId": "9",

"ProviderId": "1"

}

As you can see, I already have an ID setup.

___________________
Marcelo Volmaro

Marcelo Volmaro

unread,

Nov 24, 2015, 8:56:46 AM11/24/15

to rav...@googlegroups.com

Ahhh... That's not what I understood. So then I'm good with batch inserts and will try them right now... thanks!

___________________
Marcelo Volmaro

Marcelo Volmaro

unread,

Nov 24, 2015, 10:28:18 AM11/24/15

to rav...@googlegroups.com

Ok, updated the code to use bulk inserts. That doesn't change the CPU usage. Now, force-compiling into x64 and using voron made a huge change. Now the CPU is at about 35%, so that change alone cut the cpu usage in half.

Why do I need to force compile into x64? Is there something wrong detecting the CPU? I have other software that makes uses of different .dlls (native versions using p-invoke) and they detect and work fine in x64.

I still believe the CPU usage is high, but 35% is much better than 70% for the same amount of inserts...

___________________
Marcelo Volmaro

Grisha Kotler

unread,

Nov 24, 2015, 12:27:42 PM11/24/15

to rav...@googlegroups.com

Why are you using those settings?

Hibernating Rhinos Ltd cid:image001.png@01CF95E2.8ED1B7D0

Grisha Kotler l RavenDB Core Team Developer l Mobile: +972-54-586-8647

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811

RavenDB paving the way to "Data Made Simple" http://ravendb.net/

Marcelo Volmaro

unread,

Nov 24, 2015, 1:22:53 PM11/24/15

to rav...@googlegroups.com

To try to optimize indexing, as it is what it takes most of the time.

___________________
Marcelo Volmaro

Michael Yarichuk

unread,

Nov 24, 2015, 4:53:43 PM11/24/15

to RavenDB - 2nd generation document database

What happens to CPU usage if you remove those? RavenDB tracks lots of heuristics and changes those (among others) on the fly, depending on the system.

Best regards,

Michael Yarichuk

RavenDB Core Team

Tel: 972-4-6227811

Fax:972-153-4-6227811

Email : michael....@hibernatingrhinos.com

RavenDB paving the way to "Data Made Simple" http://ravendb.net/

Michael Yarichuk

unread,

Nov 26, 2015, 6:31:43 AM11/26/15

to rav...@googlegroups.com

Excellent... Is there a way to get the tags by app using the same method? That would be awesome.

I already guessed that the problem could be those indexes, so I removed them (and manually maintain a dictionary<string, string[]> with the apps/tags) and now the cpu usage is about 9-14%, so much better, but if I can use the above queries and can skip the dictionary hack, much better (as I have to persist it from session to session)...

___________________
Marcelo Volmaro

Oren Eini (Ayende Rahien)

unread,

Nov 26, 2015, 6:34:30 AM11/26/15

to ravendb

You can use facets to do that, yes.

Marcelo Volmaro

unread,

Nov 26, 2015, 8:50:59 AM11/26/15

to rav...@googlegroups.com

Sorry to bother you with this, as now it is not part of the original topic, but I can't seem to make facets works (probably I'm not understanding how they works).

as a test, I have this:

session.Query<DebugMessage, DebugMessages_ByDefaultIndexes>().Where(x => x.ApplicationName == applicationName).Select(x => x.Tags).ToFacets(new[] { new Facet { Name = "Tags" }}

But I don't get any results. Taking out the Select doesn't seems to change anything...

___________________
Marcelo Volmaro

Oren Eini (Ayende Rahien)

unread,

Nov 26, 2015, 9:14:11 AM11/26/15

to ravendb

This looks something like:

var facetQuery = s.Query<DebugMessage, DebugMessages_ByDefaultIndexes>()

.Where(x=>x.ApplicationName == "Application_0")

.ToFacets(new Facet[]

{

new Facet<DebugMessage>

{

Name = x=>x.Tags

},

});

Reply all

Reply to author

Forward