Integer union removal and Integer dev-branch testing

7 views
Skip to first unread message

Jeffrey Walton

unread,
Sep 20, 2016, 12:37:16 PM9/20/16
to Crypto++ Users
Hi Everyone,

I need the Integer dev-branch tested. The Integer class is usually solid, but a couple of platforms have intermittent trouble with it. The "intermittent trouble" is a sign of undefined behavior. You usually see its effects when the self tests hang on the BlumBlumShub test.

There were two contributors to the undefined behavior. First, there was an anonymous union detected by Solaris' SunCC. This was a "language lawyer" type finding, and I think it was mostly benign. We cleared it anyways since it was easy to do. It was corrected on Master at https://github.com/weidai11/cryptopp/commit/f2e5149319e4ed4a86c9f5e3f3bef1b89d06cd19 .

Second, after the f2e5 commit, the union was being used incorrectly. Unions are not necessarily bad in C++, but C++ is more strict than C. Nearly every C++ compiler seems to relax the C++ standard requirement because it consumes C code. For the details, see https://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior.

The second undefined behavior was cleared on the Integer dev-branch. It was cleared at http://github.com/weidai11/cryptopp/commit/5e687ecb677558faa30332d76a4dcb7a1d00569d . Effectively, it removes the union of whole word/half words and makes us choose one or the other. When double words (128-bit dwords) are available, it changes this code which used to be used with both dwords/words.

Old code:
     word GetLowHalf() const {return m_halfs.low;}
     word GetHighHalf() const {return m_halfs.high;}
     word GetHighHalf() const {return m_halfs.high;}

New code:

#ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
# if defined(IS_LITTLE_ENDIAN)
    word GetLowHalf() const {return *(reinterpret_cast<const word*>(&m_whole)+0);}
    word GetHighHalf() const {return *(reinterpret_cast<const word*>(&m_whole)+1);}
    word GetHighHalfAsBorrow() const {return 0-GetHighHalf();}
# else
    word GetLowHalf() const {return *(reinterpret_cast<const word*>(&m_whole)+1);}
    word GetHighHalf() const {return *(reinterpret_cast<const word*>(&m_whole)+0);}
    word GetHighHalfAsBorrow() const {return 0-GetHighHalf();}
# endif  // IS_LITTLE_ENDIAN
#else
     word GetLowHalf() const {return m_halfs.low;}
     word GetHighHalf() const {return m_halfs.high;}
     word GetHighHalfAsBorrow() const {return 0-m_halfs.high;}
#endif  // CRYPTOPP_NATIVE_DWORD_AVAILABLE

So the above code gets us out of undefined behavior. We tried three different patterns (bitops, memcpy and memory accesses), and it appears to generate the best code. And it avoids some corner case compile problems when using bitops.

The downside is Integer slowed down by about 8% on Linux at -O2 when 128-bit dwords are in effect. That equates to about 200 ms during self tests. On the upside, we are stable a -O2, -O3, -O5 and -Ofast; and we are faster with -O3, -O5 and -Ofast. Overall or 'net', I think we get more than we give back (though I despise giving back 200 ms).

I'd like to start testing this change and verify two things:

  1. Is the code stable or more stable (many won't notice the improved stability because they never experienced the hang)

  2. How is the performance (are you seeing the 8% drop at -O2 on Linux with x86_64; are you seeing the overall speedup at -O3 and -O5)?

Please report back with results and observations.

Jeff

Jeffrey Walton

unread,
Sep 20, 2016, 11:56:00 PM9/20/16
to Crypto++ Users

I need the Integer dev-branch tested. The Integer class is usually solid, but a couple of platforms have intermittent trouble with it. The "intermittent trouble" is a sign of undefined behavior. You usually see its effects when the self tests hang on the BlumBlumShub test.

We were able to zero-in on the issue. It seems GCC was eliding a constructor (I think that's what happened) due to our [non-legal]  use of a union in C++. Effectively we have a DWord class with:

    struct half_words
    {
        word64 low;
        word64 high;
    };

    struct DWord
    {
        DWord(word64 low, word64 high)
        {
            m_halfs.low = low;
            m_halfs.high = high;
        }
        ...

        union
        {
# if WORD128_AVAILABLE
            word128 m_whole;
#endif
            half_words m_halfs;
        };
    };

GCC was generating code to use the m_whole because a 128-bit word was available, so we needed to initialize it and make it the active member:

# if WORD128_AVAILABLE
    DWord(word64 low, word64 high) : word128()
#else
    DWord(word64 low, word64 high) : word64()
#endif
    {
        m_halfs.low = low;
        m_halfs.high = high;
    };

More details with the diff can be found at http://github.com/weidai11/cryptopp/issues/293 .

The extra value initialization cost us 2-3% in performance, but it clears a Coverity finding and initializes the DWord class properly.

We still need thorough testing, but it looks good on our first pass with GCC, Clang, MSC and SunCC.

Jeff
Reply all
Reply to author
Forward
0 new messages