Alexandru <alexandr...@meshparts.de> wrote:
> On Wednesday, 15 March 2017 at 09:37:30 UTC+1, Christian Gollwitzer wrote:
>> On 15.03.17 at 08:35, Alexandru wrote:
>> >> I tested your code (and had to add some declarations) and now it runs but:
>> >> 1. I get the wrong checksum
>>
>>
>> >> 2. It's still 2x slower...
>>
>> are you sure you compile with full optimizations?
>
> After adding "-O3 -march=native" it's 50% faster than the Tcllib
> implementation. Thanks for the tip! No idea yet what -O3
> -march=native does, but are there any other magical keywords that I
> can use to make it as fast as OpenSSL?
-O3 tells the compiler to apply its highest standard level of
optimization to its output (which often results in substantial
speedups).
-march=native means the output object file is specific to the CPU
generation of the machine you compiled it on. It may fail to run on
other CPUs that lack the particular mix of instructions the compiler
chose for that machine. This is fine as long as you know every end
user will have a CPU that is compatible with the code. But if you plan
a more 'general' distribution, don't use "native"; instead pick a
target that nearly everyone will have (the targets your compiler
supports are listed in the GCC info page for the compiler).
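
One way to see what -march changes is to ask the compiler which
instruction-set extensions it has been told it may assume. The sketch
below is only an illustration, not part of the code under discussion:
the function names are made up, and it checks just a handful of the
x86 feature macros that GCC and Clang predefine (__AVX2__, __AVX__,
__SSE4_2__, __SSE2__).

```c
#include <stdio.h>

/* Hypothetical helper (not from the thread): reports the highest of the
 * few x86 SIMD extensions checked here that the compiler was allowed to
 * assume when building this file. The macros are predefined by GCC and
 * Clang according to the -march / -m<feature> options in effect. */
const char *assumed_simd_level(void)
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none of the x86 macros checked here";
#endif
}

/* Prints the result; call this from your own main(). */
void print_assumed_simd_level(void)
{
    printf("compiler may emit instructions up to: %s\n",
           assumed_simd_level());
}
```

Building the same file once with "-O3 -march=native" and once with a
generic target such as "-march=x86-64" will typically report different
levels on a recent machine; a binary built for the higher level is the
one that can fail on an older CPU.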
>
>>
>> > This is the code:
>> > s = nil;
>> > const int bufsize = 1024*1024; // 1M, must be divisible by 64
>> > int bytes_read;
>> > char *buffer = malloc(bufsize);
>> > while (1) {
>> > int bytes_read = fread(buffer, 1, bufsize, fd);
>>
>> delete the int. This is an error, now you have two variables
>> bytes_read.
>
> The compiler said nothing about it. Weird... I removed the second int
> declaration and now I get the correct result. But still 2x slower.
Because it is not an 'error' as far as the compiler is concerned. It
is perfectly valid, standards-compliant C (well, C99 or later at
least).
The reason it is nevertheless a bug is that you have a "bytes_read"
variable declared outside the block that is the while loop body, and a
second "bytes_read" variable declared inside that block.
This is perfectly valid C, but incorrect for your use, because the
"bytes_read" inside the block only exists inside the block (the
technical term is that it "shadows" the "bytes_read" variable declared
outside the block). What you have is two separate, independent
"bytes_read" variables.
As soon as you exit the while loop, the variable that was declared
inside the loop disappears, and the variable that is used is the one
declared outside the loop. So your later test wasn't testing the
result of the last fread, but a value from before you had read any
data at all. And as you didn't initialize the outer bytes_read, the
'error' comes from using an uninitialized variable outside the loop.
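
Here is a minimal sketch of the effect, reduced to just the read loop.
The function names are invented for illustration, and unlike your code
the outer variable is initialized to 0 so that the demonstration has
defined behaviour. Both functions are meant to return the size of the
final, partial chunk.

```c
#include <stdio.h>

/* Buggy: the "size_t" inside the loop declares a second, independent
 * bytes_read that shadows the outer one. */
size_t last_chunk_shadowed(FILE *fd, char *buffer, size_t bufsize)
{
    size_t bytes_read = 0;          /* outer variable, never assigned again */
    while (1) {
        size_t bytes_read = fread(buffer, 1, bufsize, fd); /* inner variable */
        if (bytes_read < bufsize)
            break;                  /* short read: last chunk or EOF */
    }
    return bytes_read;              /* outer variable: still 0 */
}

/* Fixed: no type on the assignment, so the outer variable is updated. */
size_t last_chunk_fixed(FILE *fd, char *buffer, size_t bufsize)
{
    size_t bytes_read = 0;
    while (1) {
        bytes_read = fread(buffer, 1, bufsize, fd);
        if (bytes_read < bufsize)
            break;                  /* short read: last chunk or EOF */
    }
    return bytes_read;              /* size of the final partial chunk */
}
```

For a 100-byte file and a 64-byte buffer, the fixed version returns 36
while the shadowed version returns 0. GCC and Clang can point out this
kind of mistake if you ask for it with -Wshadow, which is not enabled
by -Wall.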