Clear() function excessive CPU usage


Zachary Turner

Mar 5, 2009, 6:11:07 PM
to Protocol Buffers
I have a fairly old version of the protobuf library, so if this has
been changed let me know, but I have a situation where Message::Clear()
is pushing my CPU usage to around 70% for an extended period of time.

It's also possible this is user error, so please correct me if that's
the case.

Basically what I have is a top level message with a bunch of optional
messages, which I send across the wire.

One of these optional messages is defined as follows:

message DataChunkList {
  required bool is_end_of_list = 1;
  repeated DataChunk data = 2;
};

message DataChunk {
  optional bytes data = 1;
  // Other fields here
};

The "data" field will almost always be exactly 4k, and I will usually
not want to send 1 chunk at a time, but a list of around 32 at a
time.

So I save an instance of the top level message in the class containing
my sending code, and right before I'm about to send data I do the
following:

net::DataChunkList* pChunks = m_CachedTopLevel.mutable_data_chunk_list();

// Should already be clear, but just in case
pChunks->Clear();
prevCount = pChunks->mutable_data()->ClearedCount();

// Top up the cleared pool so it holds at least num_chunks pre-sized chunks
for (int i = prevCount; i < num_chunks; ++i)
{
    net::DataChunk* pChunk = new net::DataChunk();
    pChunk->mutable_data()->reserve(4096);
    pChunks->mutable_data()->AddCleared(pChunk);
}

// Take each chunk out of the cleared pool, fill it, and add it to the list
for (int i = 0; i < num_chunks; ++i)
{
    net::DataChunk* pChunk = pChunks->mutable_data()->ReleaseCleared();
    pChunk->mutable_data()->assign(global_4k_buffer, 4096);
    pChunks->mutable_data()->AddAllocated(pChunk);
}

send(m_CachedTopLevel);

m_CachedTopLevel.Clear();





I ran a profiler on my code, and the very last line (the Clear())
takes up almost 95% of the CPU usage for the function, and the
function takes up about 30% of the CPU usage of the entire app.
So obviously this is a big problem.

However, the comment in the code says that Clear() "does not free any
memory". So why could it be using so much CPU? Am I misunderstanding
the purpose / usage of these methods? What I'm trying to do is just
re-use a pool of 4k buffers for all of these sends.
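
To make the intent concrete, here is a minimal stand-alone sketch of the kind of buffer reuse described above, using plain std::string instead of protobuf. BufferPool and its methods are hypothetical names, not library API. The sketch also assumes that clear() leaves a string's capacity in place, which is typical of common STL implementations but not strictly guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pool of reusable byte buffers (not part of protobuf).
class BufferPool {
public:
    // Pre-allocate `count` buffers, each with room for `bytes` bytes.
    BufferPool(std::size_t count, std::size_t bytes) : free_(count) {
        for (std::string& s : free_) s.reserve(bytes);
    }

    // Hand out a pooled buffer, or an empty string if the pool has run dry.
    std::string Acquire() {
        if (free_.empty()) return std::string();
        std::string s = std::move(free_.back());
        free_.pop_back();
        return s;
    }

    // Take a buffer back. clear() drops the contents; the allocation is
    // normally kept, so the next assign() can reuse it.
    void Release(std::string s) {
        s.clear();
        free_.push_back(std::move(s));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<std::string> free_;
};

// Usage: borrow a buffer, fill it with 4k of data, and hand it back.
void PoolDemo() {
    BufferPool pool(32, 4096);
    std::string chunk = pool.Acquire();
    chunk.assign(4096, 'x');
    pool.Release(std::move(chunk));
    assert(pool.available() == 32);
}
```

If clearing a pooled buffer turns out to be expensive, that would suggest the string implementation is doing more than resetting a length, which is worth checking in the profile.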

Kenton Varda

Mar 5, 2009, 6:20:29 PM
to Zachary Turner, Protocol Buffers
Wow, that's interesting.  I don't know why it would do that.  Can you look deeper into your profiles and see what part of Clear() is taking so long?  For example, is it spending the time clearing STL strings?

Zachary Turner

Mar 5, 2009, 6:23:18 PM
to Protocol Buffers
I'll give it a try. I haven't built the protobuf libraries with
instrumenting support or else I'd already know, but I should be able
to get it working.

On Mar 5, 5:20 pm, Kenton Varda <ken...@google.com> wrote:
> Wow, that's interesting. I don't know why it would do that. Can you look
> deeper into your profiles and see what part of Clear() is taking so long?
> For example, is it spending the time clearing STL strings?

Kenton Varda

Mar 5, 2009, 6:25:29 PM
to Zachary Turner, Protocol Buffers
Add this to your .proto file:
  option optimize_for = SPEED;

Does it help?

Zachary Turner

Mar 5, 2009, 7:02:53 PM
to Kenton Varda, Protocol Buffers
I get somewhat better results with that flag. I built protobuf with profiling enabled, and although I'm a little skeptical that the information is 100% accurate, it seems like std::string::clear() takes up the most time. The percentages don't match what I calculate, though, so I'm not sure where the inconsistency is.

Just out of curiosity, is there even any need for me to call Clear()?  I'm filling out every single field every single time, and always using mutable_data()->assign() to copy the data into the message, so is it fine to just leave it "uncleared" but still stick it back into the cleared list?

Kenton Varda

Mar 5, 2009, 7:24:49 PM
to Zachary Turner, Protocol Buffers
On Thu, Mar 5, 2009 at 4:02 PM, Zachary Turner <diviso...@gmail.com> wrote:
> I get somewhat better results with that flag. I built protobuf with profiling enabled, and although I'm a little skeptical that the information is 100% accurate, it seems like std::string::clear() takes up the most time. The percentages don't match what I calculate, though, so I'm not sure where the inconsistency is.

Can you write a small example program demonstrating the problem which I can play with?

What STL implementation are you using?  (I.e. what compiler?)
 
> Just out of curiosity, is there even any need for me to call Clear()? I'm filling out every single field every single time, and always using mutable_data()->assign() to copy the data into the message, so is it fine to just leave it "uncleared" but still stick it back into the cleared list?

Technically it might work, but if it does I can't guarantee that it wouldn't break in the future.
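
One way to picture the risk is a minimal mock, which is not protobuf's generated code. It assumes a message tracks a "has" flag per optional field, and that DataChunk has one extra optional field, here given the hypothetical name `offset`. If a reused object is not cleared and a later use sets only `data`, the old `offset` still looks set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a generated message class; NOT protobuf's code.
class MockChunk {
public:
    void set_data(const char* p, std::size_t n) {
        data_.assign(p, n);  // assign() fully replaces any previous bytes
        has_data_ = true;
    }
    void set_offset(std::int64_t v) {
        offset_ = v;
        has_offset_ = true;
    }
    bool has_offset() const { return has_offset_; }
    const std::string& data() const { return data_; }

    // Reset every field and every "has" flag.
    void Clear() {
        data_.clear();
        has_data_ = false;
        offset_ = 0;
        has_offset_ = false;
    }

private:
    std::string data_;
    bool has_data_ = false;
    std::int64_t offset_ = 0;   // `offset` is an assumed extra field
    bool has_offset_ = false;
};

// Reuse one object twice; the second use sets only `data`.
// Returns true if the first use's `offset` is still reported as set.
bool StaleAfterReuse(bool clear_first) {
    MockChunk chunk;
    chunk.set_data("old", 3);
    chunk.set_offset(42);
    if (clear_first) chunk.Clear();
    chunk.set_data("new", 3);
    assert(chunk.data() == "new");
    return chunk.has_offset();
}
```

In this mock, skipping Clear() is harmless only while every use sets every field, which is the condition described in the question. Whether the real library's cleared list relies on more than that is not something the mock can show.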

Zachary Turner

Mar 5, 2009, 7:42:26 PM
to Kenton Varda, Protocol Buffers
I'll try to come up with a sample tomorrow, but the surrounding code is pretty complex, so I'm not 100% sure it will still exhibit the same pattern if I do the same thing in a stripped-down application.

As an alternative to not clearing the items before I put them back in the list, would there be any problem with storing my own list of buffers internally, and then calling AddAllocated() a bunch of times while building the message stream and then ReleaseLast() at the end until all the messages are clear? What I really want is a way to just give it a raw memory buffer, tell it how big the buffer is, and then have it just store a pointer to the buffer. Then there are no strings, no copying, etc. It's currently somewhat awkward, because my sequence goes like this:

1) Read some data from the disk into a buffer.
2) Put that data into a protobuf message.
3) Repeat this a number of times, putting each chunk of data into a new message.
4) Serialize the new message, which contains a list of chunks, into an array.
5) Call socket.write() with the serialized array.

But that's 3 copies. There's my original buffer that I read from the disk into, protobuf's message buffer where the data is stored internally as a string, and the final buffer that I serialize into so that I can send it across the wire. It would be nice if I could get rid of all this copying.

Kenton Varda

Mar 5, 2009, 8:12:25 PM
to Zachary Turner, Protocol Buffers
On Thu, Mar 5, 2009 at 4:42 PM, Zachary Turner <diviso...@gmail.com> wrote:
> I'll try to come up with a sample tomorrow, but the surrounding code is pretty complex, so I'm not 100% sure it will still exhibit the same pattern if I do the same thing in a stripped-down application.
>
> As an alternative to not clearing the items before I put them back in the list, would there be any problem with storing my own list of buffers internally, and then calling AddAllocated() a bunch of times while building the message stream and then ReleaseLast() at the end until all the messages are clear?

That would be fine.
 
> What I really want is a way to just give it a raw memory buffer, tell it how big the buffer is, and then have it just store a pointer to the buffer. Then there are no strings, no copying, etc. It's currently somewhat awkward, because my sequence goes like this:
>
> 1) Read some data from the disk into a buffer.
> 2) Put that data into a protobuf message.
> 3) Repeat this a number of times, putting each chunk of data into a new message.
> 4) Serialize the new message, which contains a list of chunks, into an array.
> 5) Call socket.write() with the serialized array.
>
> But that's 3 copies. There's my original buffer that I read from the disk into, protobuf's message buffer where the data is stored internally as a string, and the final buffer that I serialize into so that I can send it across the wire. It would be nice if I could get rid of all this copying.

Yeah, the implementation wasn't really designed for this sort of usage.  :/
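
For anyone who wants to experiment outside the library, here is a rough sketch of the "just store a pointer to the buffer" idea, built directly on the documented wire format rather than on any protobuf API. It writes only the small per-chunk headers and interleaves them with pointers to the caller's buffers, so the result could be handed to a gather-style write. Span, ChunkHeader and BuildGatherList are hypothetical names. The sketch assumes each DataChunk carries only its `data` field, and it leaves out `is_end_of_list` and the enclosing top-level message, which would need their own bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Borrowed view of bytes; a portable stand-in for POSIX struct iovec.
struct Span {
    const char* data;
    std::size_t size;
};

// Append v as a base-128 varint (7 bits per byte, low-order group first).
void AppendVarint(std::string* out, std::uint64_t v) {
    while (v >= 0x80) {
        out->push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out->push_back(static_cast<char>(v));
}

// Header bytes for one `repeated DataChunk data = 2` entry whose only set
// field is `bytes data = 1` holding payload_size bytes.
std::string ChunkHeader(std::size_t payload_size) {
    std::string inner;                  // header of the embedded DataChunk
    inner.push_back('\x0A');            // field 1, wire type 2 (length-delimited)
    AppendVarint(&inner, payload_size);

    std::string header;
    header.push_back('\x12');           // field 2, wire type 2 (length-delimited)
    AppendVarint(&header, inner.size() + payload_size);
    header += inner;
    return header;
}

// Interleave headers with the caller's payload buffers without copying the
// payloads. `headers` owns the header bytes and must outlive the result.
std::vector<Span> BuildGatherList(const std::vector<Span>& payloads,
                                  std::vector<std::string>* headers) {
    headers->clear();
    headers->reserve(payloads.size());  // no reallocation, so pointers stay valid
    std::vector<Span> out;
    for (const Span& p : payloads) {
        headers->push_back(ChunkHeader(p.size));
        out.push_back(Span{headers->back().data(), headers->back().size()});
        out.push_back(p);
    }
    return out;
}

// Usage: two 4k chunks become four spans (header, payload, header, payload).
void GatherDemo() {
    static const char buf[4096] = {};
    std::vector<std::string> headers;
    std::vector<Span> spans =
        BuildGatherList({Span{buf, sizeof buf}, Span{buf, sizeof buf}}, &headers);
    assert(spans.size() == 4);
}
```

With this layout each 4k payload is copied only once, by the socket write itself. The trade-off is that the framing is hand-written and has to be kept in step with the .proto definition.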