On 20.01.2020 13:08, Frederick Gotham wrote:
>
> I have multi-posted this to 'comp.lang.c' and 'comp.lang.c++'. I would have gotten more flack if I cross-posted.
>
> I have taken single-threaded code written in Python to calculate hex digits of pi, and I have ported it to C and C++ to compare speed. Both C and C++ versions are faster than the Python version. The C version is about 10% faster than Python. For some reason there is a 20% speed difference between the C and C++ versions, with the C++ one being faster. I don't know why.
>
> All versions use the GNU Multiprecision library (-lgmp).
[...]
> I compiled the C and C++ versions with "-O3 -DNDEBUG". It doesn't make sense to me that the C++ one is faster if the GNU Multiprecision C++ classes are just an interface/wrapper around the C code.
>
> Anyone got any ideas?
Code speed depends on many factors. A 20% difference is not large and
might be specific to your compiler version, hardware and
hardware-specific compiler options. So there are no quick answers.
Anyway, one common source of speed differences is function inlining.
It might be that for some reason the compiler can inline the C++ Bint *=
operator, but not the C mpz_mul_ui() function call. You can check whether
this is the case by studying the generated assembly.
A minute of googling suggests that mpz_mul_ui() is indeed only declared
in gmp.h, while Bint *= seems to be both declared and defined in gmpxx.h.
If so, the compiler/linker would need to do much more work to get
functions like mpz_mul_ui() inlined. For that you would probably need to
pass more compiler flags to switch on whole program optimization, and
even then it is not certain that it can be done.
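
To make the declared-only versus defined-in-header distinction concrete,
here is a minimal sketch that does not use GMP at all. The names Wrapped,
opaque_mul_ui, via_wrapper and via_call are hypothetical stand-ins, and
the noinline attribute (GCC/Clang specific) is only my way of imitating a
function whose body lives in a separately compiled library:

```cpp
// Stand-ins only: these are NOT the GMP functions, just the same shape.

// Case 1: like an operator defined in a header. The body is visible at
// every call site, so the optimizer is free to inline it.
struct Wrapped {
    unsigned long value;
    Wrapped& operator*=(unsigned long k) { value *= k; return *this; }
};

// Case 2: imitates a function that is only declared in a header and
// defined elsewhere. noinline (a GCC/Clang attribute, an assumption
// about your toolchain) forces a real call, as a library boundary would.
__attribute__((noinline))
void opaque_mul_ui(unsigned long* r, unsigned long k) {
    *r *= k;
}

// Multiplies x by 3, n times, through the inlinable operator.
unsigned long via_wrapper(unsigned long x, int n) {
    Wrapped w{x};
    for (int i = 0; i < n; ++i) w *= 3UL;
    return w.value;
}

// Same computation, but every step goes through the opaque function.
unsigned long via_call(unsigned long x, int n) {
    for (int i = 0; i < n; ++i) opaque_mul_ui(&x, 3UL);
    return x;
}
```

Compiling this with something like "g++ -O2 -S" and reading the output
should show a call instruction to opaque_mul_ui inside via_call, and no
call for the operator inside via_wrapper. The same kind of inspection
applies to the real C and C++ programs.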
Even if the C version does not inline the gmp library calls, it is not
certain that this is the actual cause of the slowdown. And even if it is
the cause in your artificial test case, the slowdown might not appear
with real data.
Also, note that -O3 is not guaranteed to always produce the fastest code.
Sometimes it optimizes the wrong thing and produces slower code. In any
case, you should try -O2 as well.
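
For comparing such builds, a small timing harness along these lines may
help. It is only a sketch: workload() is a hypothetical placeholder for
the code under test (the pi digit loop in your case), and Timing and
best_of are names I made up for illustration.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical placeholder workload; replace with the code under test.
std::uint64_t workload(std::uint64_t n) {
    std::uint64_t acc = 1;
    for (std::uint64_t i = 1; i <= n; ++i)
        acc = acc * 6364136223846793005ULL + i;  // wraps modulo 2^64 by design
    return acc;
}

struct Timing {
    double best_ms;          // fastest observed run, in milliseconds
    std::uint64_t checksum;  // result of the last run
};

// Runs workload(n) `repeats` times and keeps the fastest wall-clock time.
Timing best_of(int repeats, std::uint64_t n) {
    using clock = std::chrono::steady_clock;
    Timing t{-1.0, 0};
    for (int r = 0; r < repeats; ++r) {
        auto start = clock::now();
        std::uint64_t result = workload(n);
        auto stop = clock::now();
        double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (t.best_ms < 0 || ms < t.best_ms) t.best_ms = ms;
        t.checksum = result;
    }
    return t;
}
```

Build the same source once with "-O2 -DNDEBUG" and once with
"-O3 -DNDEBUG", call best_of() from main in each, and print both best_ms
and checksum. Printing the checksum keeps the compiler from discarding
the work, and taking the best of several runs reduces noise from the
rest of the system.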