Hi Everyone,
I need the Integer dev-branch tested. The Integer class is usually solid, but a couple of platforms have intermittent trouble with it. The "intermittent trouble" is a sign of undefined behavior. You usually see its effects when the self tests hang on the BlumBlumShub test.
There were two contributors to the undefined behavior. First, there was an anonymous union detected by Solaris' SunCC. This was a "language lawyer" type finding, and I think it was mostly benign. We cleared it anyways since it was easy to do. It was corrected on Master at
https://github.com/weidai11/cryptopp/commit/f2e5149319e4ed4a86c9f5e3f3bef1b89d06cd19 .
Second, after the f2e5 commit, the union was being used incorrectly. Unions are not necessarily bad in C++, but C++ is more strict than C. Nearly every C++ compiler seems to relax the C++ standard requirement because it consumes C code. For the details, see
https://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior.
The second undefined behavior was cleared on the Integer dev-branch. It was cleared at
http://github.com/weidai11/cryptopp/commit/5e687ecb677558faa30332d76a4dcb7a1d00569d . Effectively, it removes the union of whole word/half words and makes us choose one or the other. When double words (128-bit dwords) are available, it changes this code which used to be used with both dwords/words.
Old code:
word GetLowHalf() const {return m_halfs.low;}
|
word GetHighHalf() const {return m_halfs.high;}
|
word GetHighHalf() const {return m_halfs.high;}
New code:
#ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
# if defined(IS_LITTLE_ENDIAN)
word GetLowHalf() const {return *(reinterpret_cast<const word*>(&m_whole)+0);}
word GetHighHalf() const {return *(reinterpret_cast<const word*>(&m_whole)+1);}
word GetHighHalfAsBorrow() const {return 0-GetHighHalf();}
# else
word GetLowHalf() const {return *(reinterpret_cast<const word*>(&m_whole)+1);}
word GetHighHalf() const {return *(reinterpret_cast<const word*>(&m_whole)+0);}
word GetHighHalfAsBorrow() const {return 0-GetHighHalf();}
# endif // IS_LITTLE_ENDIAN
#else
word GetLowHalf() const {return m_halfs.low;}
word GetHighHalf() const {return m_halfs.high;}
word GetHighHalfAsBorrow() const {return 0-m_halfs.high;}
#endif // CRYPTOPP_NATIVE_DWORD_AVAILABLE
So the above code gets us out of undefined behavior. We tried three different patterns (bitops, memcpy and memory accesses), and it appears to generate the best code. And it avoids some corner case compile problems when using bitops.
The downside is Integer slowed down by about 8% on Linux at -O2 when 128-bit dwords are in effect. That equates to about 200 ms during self tests. On the upside, we are stable a -O2, -O3, -O5 and -Ofast; and we are faster with -O3, -O5 and -Ofast. Overall or 'net', I think we get more than we give back (though I despise giving back 200 ms).
I'd like to start testing this change and verify two things:
1. Is the code stable or more stable (many won't notice the improved stability because they never experienced the hang)
2. How is the performance (are you seeing the 8% drop at -O2 on Linux with x86_64; are you seeing the overall speedup at -O3 and -O5)?
Please report back with results and observations.
Jeff