[Boost-users] boost::function performance issues with C++11


饭桶

Jul 3, 2014, 7:30:13 PM
to boost...@lists.boost.org
Hi,

Does anyone know why -std=c++11 makes such a difference to boost::function performance?

I wanted to find out whether passing large parameters causes any performance problems.
So I wrote a function that takes a vector by value, as func2 below shows. I know it's better to pass a pointer or reference as a function parameter. I just want to evaluate the performance, so let's keep the vector as the parameter.

However, I found that it's quite slow when compiled with -std=c++11. In detail, it takes 173874 milliseconds with C++11, while it takes 3676 milliseconds without C++11.

About 50 times slower!! How can that be?

I thought boost::function should have the same performance as std::function, so I decided to try std::function in C++11 directly. It takes about 29233 milliseconds. That's still 8 times slower!

Can anyone tell me what happened here?

int func2(std::vector<int> i){
    total += i.size();
    return i.size();
}

    const int T = 1000000;
    s = boost::chrono::system_clock::now();
    for (int i = 0; i < T; ++i)
        call(boost::bind(&func2, v));
    e = boost::chrono::system_clock::now();

In case you need to know my environment: my OS is Arch, the compiler is gcc 4.9.0, and optimizations are default.
The execution time (ms) of three versions I tried:
boost::function with C++11 : 173874
boost::function without C++11 : 3676
std::function in C++11 : 29233

Any thoughts are appreciated!
Thanks,
Athrun


Michael Powell

Jul 3, 2014, 7:55:43 PM
to boost...@lists.boost.org
On Thu, Jul 3, 2014 at 4:10 PM, 饭桶 <athrun...@163.com> wrote:
> Hi,
>
> Does anyone know why -std=c++11 makes such a difference to boost::function
> performance?
>
> I wanted to find out whether passing large parameters causes any performance
> problems.
> So I wrote a function that takes a vector by value, as func2 below shows. I
> know it's better to pass a pointer or reference as a function parameter. I just
> want to evaluate the performance, so let's keep the vector as the parameter.
>
> However, I found that it's quite slow when compiled with -std=c++11. In
> detail, it takes 173874 milliseconds with C++11, while it takes 3676
> milliseconds without C++11.
>
> About 50 times slower!! How can that be?
>
> I thought boost::function should have the same performance as std::function,
> so I decided to try std::function in C++11 directly. It takes about 29233
> milliseconds. That's still 8 times slower!
>
> Can anyone tell me what happened here?

I don't know the inner workings of either boost::function or
std::function. It's not boost's fault per se, but there are a couple
of things you could do differently.

> int func2(std::vector<int> i){
> total += i.size();
> return i.size();
> }

Pass a reference or even a pointer instead of the whole vector. You
are copying the vector every time.

> const int T = 1000000;
> s = boost::chrono::system_clock::now();
> for (int i = 0; i < T; ++i)
> call(boost::bind(&func2, v));
> e = boost::chrono::system_clock::now();

You are also binding func2 every time; not sure if that's getting
optimized or not.

You likely want to bind with a placeholder instead of the vector
itself, then call the binding itself. i.e.

auto binding = boost::bind(&func2, _1);
binding(v);

I do that frequently enough; and with more event-driven systems,
boost::signals, etc, it is unavoidable. You want the placeholder
instead, once-bound/later-called.

> In case you need to know my environment: my OS is Arch, the compiler is gcc
> 4.9.0, and optimizations are default.

Default can mean a lot of things. Debug or release mode? Beyond those
two broad categories, do verify your settings and build for release
mode.

HTH

> The execution time (ms) of three versions I tried:
> boost::function with C++11 : 173874
> boost::function without C++11 : 3676
> std::function in C++11 : 29233
>
> Any thoughts are appreciated!
> Thanks,
> Athrun
_______________________________________________
Boost-users mailing list
Boost...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

饭桶

Jul 3, 2014, 8:24:07 PM
to boost...@lists.boost.org
Hi Michael,

Thanks for your suggestion. I understand that using a reference or a pointer will achieve better performance, and I really appreciate you pointing that out.
But I did this on purpose to highlight my point: performance is quite different with and without -std=c++11, and I'm really curious why.

Consider the situation where I'm using boost::function and its performance can be seriously affected by a compiler flag (-std=c++11). That seems very weird to me, unless someone can tell me what is happening there.

BTW, when I say default, I mean release mode.
Thanks
Xuepeng

Angelo Mondaini

Jul 3, 2014, 8:50:05 PM
to boost...@lists.boost.org
I guess that with -std=c++11 the compiler may not apply all possible optimizations, since C++11 is "new".
How about going wild and using -O3 as a compilation flag?
That gives the optimizer more freedom, and maybe more speed.

饭桶

Jul 3, 2014, 9:23:07 PM
to boost...@lists.boost.org
Thanks, that helps. But it is still about 7 times slower.
Specifically, the performance difference only appears when passing a vector by value. I also tried passing a reference and a shared pointer; with those, the results are similar to boost::function.

So here is my guess, following your suggestion: maybe std::function uses copying code that is easier to optimize. But that's just a guess and still needs proof.

-Athrun

Angelo Mondaini

Jul 3, 2014, 9:38:57 PM
to boost...@lists.boost.org
Maybe trying another compiler would help you obtain your "proof".
How about the Intel compiler?
It also has the "-std=c++11" flag, so if it is a compiler issue, that will become clear.

Michael Powell

Jul 3, 2014, 11:10:34 PM
to boost...@lists.boost.org
On Thu, Jul 3, 2014 at 7:23 PM, 饭桶 <athrun...@163.com> wrote:
> Hi Michael,
>
> Thanks for your suggestion. I understand that using a reference or a pointer
> will achieve better performance, and I really appreciate you pointing that
> out.
> But I did this on purpose to highlight my point: performance is quite
> different with and without -std=c++11, and I'm really curious why.
>
> Consider the situation where I'm using boost::function and its performance
> can be seriously affected by a compiler flag (-std=c++11). That seems very
> weird to me, unless someone can tell me what is happening there.

Well, you're testing potentially two things: STL performance, or
std::bind performance. Couldn't tell you which is the bottleneck, but
neither in my mind are trivial depending how many elements are in your
vector.

Niall Douglas

Jul 4, 2014, 7:46:37 AM
to boost...@lists.boost.org
On 4 Jul 2014 at 5:10, wrote:

> However, I found that it's quite slow when compiled with -std=c++11. In
> detail, it takes 173874 milliseconds with C++11, while it takes 3676
> milliseconds without C++11.
>
> About 50 times slower!! How can that be?
> [snip]
> In case you need to know my environment: my OS is Arch, the compiler is gcc 4.9.0, and optimizations are default.
> The execution time (ms) of three versions I tried:
> boost::function with C++11 : 173874
> boost::function without C++11 : 3676
> std::function in C++11 : 29233
>
>
> Any thoughts are appreciated!

I'd check C++ 11 mode on GCC 4.6. It could be as simple as an
outdated GCC version switch.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/


Adam Romanek

Jul 4, 2014, 8:10:57 AM
to boost...@lists.boost.org

Hi!

I'd run the example code through Callgrind (valgrind --tool=callgrind)
and then compare the reports in KCachegrind. This should at least give
you an idea of the source of the problem.

I may do that myself during the weekend in case you are not able to do
that on your own.

WBR,
Adam Romanek

饭桶

Jul 4, 2014, 7:23:17 PM
to boost...@lists.boost.org
Cool, that may be helpful. I will try it and see if I can identify anything interesting here.

Thanks
Athrun

饭桶

Jul 4, 2014, 7:28:10 PM
to boost...@lists.boost.org
Right, there are 1000 integers in the vector. If something, say the STL, is a bottleneck for boost::bind with C++11, it should also be a bottleneck without C++11, and both builds should perform the same. However, the measurements show that C++11 mode has some additional overhead. I want to figure that out.

Thanks
Athrun

Seeger, Steven D. (GSFC-444.0)[Embedded Flight Systems, Inc]

Jul 5, 2014, 11:32:28 AM
to boost...@lists.boost.org
> Right, there are 1000 integers in the vector. If something, say the STL, is a bottleneck for boost::bind with C++11,
> it should also be a bottleneck without C++11, and both builds should perform the same. However, the measurements
> show that C++11 mode has some additional overhead. I want to figure that out.

What if you try with just 1 integer? Does the difference in time stay proportionally the same?

Steven

饭桶

Jul 7, 2014, 8:28:15 PM
to boost...@lists.boost.org
Hi there,
Thanks for your help, guys. I think I have found the answer I wanted, so I'd like to explain why there are performance issues with C++11 in boost::function.
First of all, I want to reply to the people who gave me hints.

>> What if you try with just 1 integer? Does the difference in time stay proportionally the same?
There is no big difference if there is just 1 integer. So this suggests that it could be a memory copy problem.

>> I'd run the example code through Callgrind (valgrind --tool=callgrind) 
>> and then compare the reports in KCachegrind. This should at least give 
>> you an idea of the source of the problem.
You are right. I ran callgrind, and it showed some interesting things. When compiled without C++11, the hottest spot is "wordcopy_fwd_align". With C++11, the hottest spot is "boost::bind" and there are no records for wordcopy_fwd_align. I think that means that without C++11 some memory copy optimization is applied.

>> Maybe trying another compiler would help you obtain your "proof".
>> How about intel compiler? 
>> It also has the "-std=c++11" flag, so, if it is a compiler issue it will be clear.
As you suggested, I tried the Intel compiler. It's great. The data shows no performance difference for boost::function with or without C++11. That supports your guess that it's a compiler issue.

So here is my conclusion: there is no memory copy optimization when using (C++11, boost::function, gcc4.9/clang3.4), whereas the Intel compiler does well at C++11-related optimizations.

Thanks
Athrun

Michael Powell

Jul 7, 2014, 9:29:12 PM
to boost...@lists.boost.org
On Mon, Jul 7, 2014 at 7:28 PM, 饭桶 <athrun...@163.com> wrote:
> Hi there,
> Thanks for your help, guys. I think I have found the answer I wanted, so I'd
> like to explain why there are performance issues with C++11 in
> boost::function.
> First of all, I want to reply to the people who gave me hints.
>
>>> What if you try with just 1 integer? Does the difference in time stay
>>> proportionally the same?
> There is no big difference if there is just 1 integer. So this suggests
> that it could be a memory copy problem.
>
>>> I'd run the example code through Callgrind (valgrind --tool=callgrind)
>>> and then compare the reports in KCachegrind. This should at least give
>>> you an idea of the source of the problem.
> You are right. I ran callgrind, and it showed some interesting things. When
> compiled without C++11, the hottest spot is "wordcopy_fwd_align". With
> C++11, the hottest spot is "boost::bind" and there are no records for
> wordcopy_fwd_align. I think that means that without C++11 some memory copy
> optimization is applied.

I wouldn't swear off boost::bind or even std::bind. It has its place.
But as I explained in an earlier response, bind is something you
generally want to do once, like when you are connecting signal slots,
wiring up functions, and so on. Once you have the thing bound, leave it
alone. The same is true for any functor-type thing; there is always
overhead instantiating a class (or struct), so generally you want to
pick up the instance(s) you need/want, then work with them forever and
ever, Amen.

>>> Maybe trying another compiler would help you obtain your "proof".
>>> How about intel compiler?
>>> It also has the "-std=c++11" flag, so, if it is a compiler issue it will
>>> be clear.
> As you suggested, I tried the Intel compiler. It's great. The data shows no
> performance difference for boost::function with or without C++11. That
> supports your guess that it's a compiler issue.
>
> So here is my conclusion: there is no memory copy optimization when using
> (C++11, boost::function, gcc4.9/clang3.4), whereas the Intel compiler does
> well at C++11-related optimizations.

Glad to have helped some.

Antony Polukhin

Jul 8, 2014, 2:58:54 AM
to boost...@lists.boost.org



2014-07-08 4:28 GMT+04:00 饭桶 <athrun...@163.com>:
> So here is my conclusion: there is no memory copy optimization when using (C++11, boost::function, gcc4.9/clang3.4), whereas the Intel compiler does well at C++11-related optimizations.

Could you please create a ticket in GCC's bugzilla for this issue? Post the most interesting parts of this discussion in that ticket and then post a link to the ticket in this mailing list.

Thanks in advance!

--
Best regards,
Antony Polukhin

饭桶

Jul 8, 2014, 4:59:51 AM
to boost...@lists.boost.org
Sure, I will report this to the GCC community.

Adam Romanek

Jul 8, 2014, 6:11:47 AM
to boost...@lists.boost.org

As promised I performed a simple test during the weekend and wasn't able
to reproduce the issue. See the code below:

---
#include <vector>
#include <boost/bind.hpp>
#include <boost/function.hpp>

// Invokes the type-erased callable once.
void call(boost::function<int ()> f) {
    f();
}

long long total = 0;

// Takes the vector by value on purpose, as in the original post.
int func2(std::vector<int> i) {
    total += i.size();
    return i.size();
}

int main() {
    std::vector<int> v(100);

    const int T = 1000000;

    // s = boost::chrono::system_clock::now();

    // A new bind object (holding a copy of v) is built on every iteration.
    for (int i = 0; i < T; ++i)
        call(boost::bind(&func2, v));

    // e = boost::chrono::system_clock::now();
}
---

The performance does not change when compiling with -std=c++11 or
without it. I compile the code like this:

$ g++ -I/home/A.Romanek/tmp/boost/boost_1_55_0 main.cpp -std=c++11 -O2 && time ./a.out

real 0m1.669s

My setup is Ubuntu 14.04, gcc 4.8.2 and my CPU is Intel Core2Duo P8700 @
2.53GHz.

Could you please provide a complete example so that I could reproduce
the issue on my desk?

Ben Pope

Jul 9, 2014, 12:06:11 AM
to boost...@lists.boost.org
Make of this what you will, I used your example:

g++-4.8.2
        -std=c++03   -std=c++11
real    0m0.753s     0m0.798s
user    0m0.752s     0m0.797s
sys     0m0.001s     0m0.002s

g++-4.9
        -std=c++03   -std=c++11
real    0m0.786s     0m1.419s
user    0m0.785s     0m1.418s
sys     0m0.001s     0m0.002s

clang++-3.4 (libstdc++ from g++-4.9)
        -std=c++03   -std=c++11
real    0m0.799s     0m1.407s
user    0m0.798s     0m1.406s
sys     0m0.001s     0m0.002s

clang++-3.4 (libc++ trunk)
        -std=c++03   -std=c++11
real    0m1.382s     0m1.389s
user    0m1.381s     0m1.387s
sys     0m0.002s     0m0.002s

clang++-3.5 (libstdc++ from g++-4.9)
        -std=c++03   -std=c++11
real    0m0.812s     0m1.112s
user    0m0.811s     0m1.111s
sys     0m0.002s     0m0.002s

clang++-3.5 (libc++ trunk)
        -std=c++03   -std=c++11
real    0m0.782s     0m0.770s
user    0m0.781s     0m0.769s
sys     0m0.002s     0m0.002s

Ben