cryptopp-565 core dumps on solaris 11 with sun compiler in unit tests and benchmarks

Andrew Marlow

Oct 14, 2016, 6:33:53 AM
to Crypto++ Users
Hello,

I am sorry to report that cryptest.exe v still core dumps on solaris 11 when using the sun 12.4 compiler. The command I used to build cryptopp was: CXX=/opt/solarisstudio12.4/bin/CC make -j20

The error is:

Testing MessageDigest algorithm SHA-384.
..signal BUS (invalid address alignment) in CryptoPP::SHA512::Transform at line 34 in file "sha.cpp"
   34   #define blk0(i) (W[i] = data[i])

the stack trace is:

(dbx) where
=>[1] CryptoPP::SHA512::Transform(state = <value unavailable>, data = <value unavailable>) (optimized), at 0x1006255a0 (line ~34) in "sha.cpp"
  [2] CryptoPP::IteratedHashWithStaticTransform<unsigned long,CryptoPP::EnumToType<CryptoPP::ByteOrder,1>,128U,64U,CryptoPP::SHA384,48U,false>::HashEndianCorrectedBlock(this = 0x1010c18d0, data = 0xffffffff7fffc1b4) (optimized), at 0x1004c8120 (line ~170) in "iterhash.h"
  [3] CryptoPP::IteratedHashBase<unsigned long,CryptoPP::HashTransformation>::HashMultipleBlocks(this = 0x1010c18d0, input = 0xffffffff7fffc1b4, length = <value unavailable>) (optimized), at 0x1005d834c (line ~91) in "iterhash.cpp"
  [4] CryptoPP::IteratedHashBase<unsigned long,CryptoPP::HashTransformation>::Update(this = 0x1010c18d0, input = 0xffffffff7fffc1b4 "aaaaaaaaaaaaaaa [snip]
  [5] CryptoPP::HashVerificationFilter::NextPutMultiple(this = 0xffffffff7fffd550, inString = 0xffffffff7fffc15d "aaaaaaaaaaa [snip]
  [6] CryptoPP::FilterWithBufferedInput::PutMaybeModifiable(this = 0xffffffff7fffd550, inString = 0xffffffff7fffc15d "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa [snip]
  [7] CryptoPP::FilterWithBufferedInput::Put2(this = 0xffffffff7fffd550, inString = 0xffffffff7fffc15d "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
  [8] CryptoPP::BufferedTransformation::ChannelPut2(this = 0xffffffff7fffd550, channel = CLASS, begin = 0xffffffff7fffc15d "aaaaaaaaaaaaaaaaaaaaaaaaaaa
  [9] RandomizedTransfer(source = CLASS, target = CLASS, finish = <value unavailable>, channel = CLASS) (optimized), at 0x1004e9e94 (line ~92) in "datatest.cpp"
  [10] PutDecodedDatumInto(data = CLASS, name = <value unavailable>, target = CLASS) (optimized), at 0x1004ea41c (line ~138) in "datatest.cpp"
  [11] TestDigestOrMAC(v = CLASS, testDigest = <value unavailable>) (optimized), at 0x1004ef674 (line ~603) in "datatest.cpp"
  [12] TestDataFile(filename = CLASS, overrideParameters = CLASS, totalTests = 11U, failedTests = 0) (optimized), at 0x1004f0c44 (line ~802) in "datatest.cpp"
  [13] RunTestDataFile(filename = 0x100afec60 "TestVectors/sha.txt", overrideParameters = CLASS, thorough = true) (optimized), at 0x1004f1168 (line ~243) in "string"
  [14] ValidateSHA() (optimized), at 0x1004c9228 (line ~212) in "validat3.cpp"
  [15] ValidateAll(thorough = false) (optimized), at 0x1004339e8 (line ~95) in "validat1.cpp"
  [16] Validate(alg = <value unavailable>, thorough = false, seedInput = <value unavailable>) (optimized), at 0x100380cdc (line ~899) in "test.cpp"
  [17] main(argc = <value unavailable>, argv = 0xffffffff7ffff7d6) (optimized), at 0x10037b690 (line ~364) in "test.cpp"
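
A note on the BUS signal: SPARC requires an 8-byte load to use an 8-byte-aligned address, and the data pointer shown in frame [2] above (0xffffffff7fffc1b4) ends in 0x4, so it appears to be only 4-byte aligned. The sketch below is not Crypto++ code. IsAligned8, LoadWord64 and LoadBlock are hypothetical helper names, and the code only illustrates how such a pointer can be detected and read safely with memcpy instead of being dereferenced as a 64-bit word.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Crypto++): true if p sits on an 8-byte boundary.
inline bool IsAligned8(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 8 == 0;
}

// Hypothetical helper: reads 8 bytes in native byte order without requiring
// any particular alignment of p. memcpy leaves the choice of load
// instructions to the compiler.
inline std::uint64_t LoadWord64(const unsigned char* p)
{
    std::uint64_t w;
    std::memcpy(&w, p, sizeof(w));
    return w;
}

// Copies n 64-bit words from a possibly misaligned buffer into W.
// This is the alignment-safe counterpart of W[i] = data[i].
inline void LoadBlock(std::uint64_t* W, const unsigned char* data, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        W[i] = LoadWord64(data + 8 * i);
}
```

On x86 a misaligned 64-bit load is usually tolerated, which may be one reason the problem is hard to reproduce elsewhere.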

The test program also crashes when the b (benchmark) option is used. Interestingly, the crash is in the same place where my own test program crashes, in CryptoPP::CountWords, due to a null pointer. Here is the dbx output:

signal SEGV (no mapping at the fault address) in CryptoPP::CountWords at line 9 in file "words.h"
    9   inline size_t CountWords(const word *X, size_t N)
(dbx) print X
X = (nil)
=>[1] CryptoPP::CountWords(X = (nil), N = 144U) (optimized), at 0x1005a4238 (line ~9) in "words.h"
  [2] CryptoPP::Integer::WordCount(this = 0x1010bfa98) (optimized), at 0x100596cd8 (line ~3298) in "integer.cpp"
  [3] CryptoPP::Integer::Integer(this = 0xffffffff7fffb090, t = CLASS) (optimized), at 0x10059573c (line ~2903) in "integer.cpp"
  [4] CryptoPP::RSAFunction::PreimageBound(this = 0x1010bfa80) (optimized), at 0x1006c3148 (line ~46) in "rsa.h"
  [5] CryptoPP::AssignFromHelper<CryptoPP::RSAFunction>(pObject = 0xffffffff7fffb000, source = CLASS) (optimized), at 0x1006c35f0 (line ~320) in "cryptlib.h"
  [6] CryptoPP::RSAFunction::AssignFrom(this = 0xffffffff7fffb000, source = CLASS) (optimized), at 0x1006b6854 (line ~93) in "rsa.cpp"
  [7] CryptoPP::PK_FinalTemplate<CryptoPP::TF_EncryptorImpl<CryptoPP::TF_CryptoSchemeOptions<CryptoPP::TF_ES<CryptoPP::OAEP<CryptoPP::SHA1,CryptoPP::P1363_MGF1>,CryptoPP::RSA,int>,CryptoPP::RSA,CryptoPP::OAEP<CryptoPP::SHA1,CryptoPP::P1363_MGF1> > > >::PK_FinalTemplate(this = 0xffffffff7fffafe8, algorithm = CLASS) (optimized), at 0x1003a41c8 (line ~2049) in "string"
  [8] BenchMarkCrypto<CryptoPP::RSAES<CryptoPP::OAEP<CryptoPP::SHA1,CryptoPP::P1363_MGF1> > >(filename = <value unavailable>, name = 0x1009b27e8 "RSA 1024", timeTotal = 1.0, x = <value unavailable>) (optimized), at 0x10041b550 (line ~248) in "bench2.cpp"
  [9] BenchmarkAll2(t = 1.0, hertz = <value unavailable>) (optimized), at 0x1003ea460 (line ~288) in "bench2.cpp"
  [10] BenchmarkAll(t = <value unavailable>, hertz = <value unavailable>) (optimized), at 0x1003e0250 (line ~381) in "bench1.cpp"
  [11] main(argc = <value unavailable>, argv = 0xffffffff7ffff558) (optimized), at 0x10037ade4 (line ~366) in "test.cpp"

Regards,

Andrew Marlow


Jeffrey Walton

Oct 14, 2016, 7:10:30 AM
to Andrew Marlow, Crypto++ Users
> I am sorry to report that cryptest.exe v still core dumps on solaris 11 when
> using the sun 12.4 compiler. The command I used to build cryptopp was:
> CXX=/opt/solarisstudio12.4/bin/CC make -j20
>
> The error is:
>
> Testing MessageDigest algorithm SHA-384.
> ..signal BUS (invalid address alignment) in CryptoPP::SHA512::Transform at
> line 34 in file "sha.cpp"
> 34 #define blk0(i) (W[i] = data[i])

This is the one we cannot duplicate. Unfortunately, there's nothing we
can do for this one until we can duplicate it.

Jeff

Andrew Marlow

Oct 17, 2016, 11:56:27 AM
to Crypto++ Users, marlow...@gmail.com, nolo...@gmail.com

Can someone, maybe Jeff, please let me know what command was tried to reproduce the problem on Solaris 11 SPARC. It may be that I am not building cryptopp properly.

Jeffrey Walton

Oct 18, 2016, 3:16:58 AM
to Crypto++ Users, marlow...@gmail.com, nolo...@gmail.com


> The error is:
>
> Testing MessageDigest algorithm SHA-384.
> ..signal BUS (invalid address alignment) in CryptoPP::SHA512::Transform at
> line 34 in file "sha.cpp"
>    34   #define blk0(i) (W[i] = data[i])

> This is the one we cannot duplicate. Unfortunately, there's nothing we
> can do for this one until we can duplicate it.

> The command I used to build cryptopp was: CXX=/opt/solarisstudio12.4/bin/CC make -j20

That's interesting. Is make linked to GNU's make? Or another make?

> Can someone, maybe Jeff, please let me know what command was tried to reproduce the problem on Solaris 11 SPARC. It may be that I am not building cryptopp properly.

The commands I run during smoke testing are either:

  (1) straight gmake    # default C++ compiler
  (2) CXX=.../CC gmake    # SunCC compiler

We added a section at https://cryptopp.com/wiki/Solaris_(Command_Line)#Default_Make. It shows the commands and the outputs we see when we run the commands.

Jeff

Andrew Marlow

Oct 19, 2016, 2:31:53 AM
to Crypto++ Users, marlow...@gmail.com, nolo...@gmail.com
On Tuesday, 18 October 2016 08:16:58 UTC+1, Jeffrey Walton wrote:
> The error is:
>
> Testing MessageDigest algorithm SHA-384.
> ..signal BUS (invalid address alignment) in CryptoPP::SHA512::Transform at
> line 34 in file "sha.cpp"
>    34   #define blk0(i) (W[i] = data[i])

> This is the one we cannot duplicate. Unfortunately, there's nothing we
> can do for this one until we can duplicate it.

> The command I used to build cryptopp was: CXX=/opt/solarisstudio12.4/bin/CC make -j20

> That's interesting. Is make linked to GNU's make? Or another make?

It is GNU make.

> Can someone, maybe Jeff, please let me know what command was tried to reproduce the problem on Solaris 11 SPARC. It may be that I am not building cryptopp properly.

> The commands I run during smoke testing are either:
>
>   (1) straight gmake    # default C++ compiler
>   (2) CXX=.../CC gmake    # SunCC compiler

Step 2 is just what I do.

> We added a section at https://cryptopp.com/wiki/Solaris_(Command_Line)#Default_Make. It shows the commands and the outputs we see when we run the commands.

That section is very helpful, thanks for writing it. It shows that the command is:

CC -DNDEBUG -g3 -xO2 -m64 -native -KPIC -template=no%extdef -c

whereas what I get is:

CC -DNDEBUG -g3 -xO2 -fPIC -pipe -m64 -native -KPIC -template=no%extdef -w -erroff=wvarhidemem -erroff=voidretw -c

So I get the extra options -pipe -w -erroff=wvarhidemem -erroff=voidretw

The -pipe option causes:

CC: Warning: Option -pipe passed to ld, if ld is invoked, ignored otherwise

to be emitted.

Maybe CountWords could check that the pointer X is not null and throw InvalidArgument if it is? I know that represents a coding error since the pointer should never be null but it is null in my case and this is crashing our test harness. Hopefully a nullity check will not be deemed too expensive.
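
The null check suggested above could look roughly like the following sketch. It is an illustration only, not a patch to the library: CountWordsChecked is a hypothetical name, the word typedef is assumed to be a 64-bit unsigned integer, the loop is assumed to count words after dropping trailing zeros, and std::invalid_argument stands in for CryptoPP::InvalidArgument so that the example is self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

typedef std::uint64_t word;  // assumption: stands in for the library's word type

// Hypothetical checked variant: returns the number of words in X[0..N)
// once trailing zero words are dropped, and rejects a null pointer
// paired with a nonzero length instead of dereferencing it.
inline std::size_t CountWordsChecked(const word* X, std::size_t N)
{
    if (X == nullptr && N != 0)
        throw std::invalid_argument("CountWordsChecked: null pointer with nonzero length");
    while (N && X[N - 1] == 0)
        N--;
    return N;
}
```

Such a check would turn the SEGV into an exception, but it would only report the bad Integer, not explain how it came to exist.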

Andrew Marlow

Oct 31, 2016, 6:18:59 AM
to Crypto++ Users
[snip]

I have spent some more time in the debugger and have some more information on this problem. Unfortunately I have no resolution at the moment.

The Integer copy ctor (integer.cpp:2903) is being given a bad integer to copy. The m_ptr is null, the m_size is 144. This integer is returned from RSAFunction::PreimageBound. The "this" pointer at this juncture seems to be bad. It turns out the pointer value is actually a pointer to a character string (char*) rather than to the expected object. The string is the value returned from GetThisObject.

Single stepping in dbx I get to RSAFunction::AssignFrom. At this point the "this" pointer is ok. I step into RSAFunction::GetThisObject which calls GetValue which calls GetVoidValue. This is where it calls RSAFunction::PreimageBound with m_n equal to that empty, troublesome Integer. But there is no way that the function should even have been called. GetVoidValue is a pure virtual on the base class (NameValuePairs) so it should have done a virtual function dispatch to the relevant GetVoidValue function. It didn't. On landing in RSAFunction::PreimageBound the "this" pointer is equal to the name value in GetVoidValue, a string pointing to "ThisObject:CryptoPP::RSAFunction". What could be causing the vptr dispatch to go wrong?

I note that the base class, NameValuePairs, is decorated with the macro CRYPTOPP_NO_VTABLE. It seems quite a coincidence that I am having vptr trouble and there is this macro with a name like that. The macro is a no-op though unless the Microsoft compiler is being used (I am on Solaris 11 SPARC with the Sun Studio 12.4 compiler). Can someone please explain what that macro is about?

So this kind of explains why the integer is bad but nothing can detect it and nothing assigns it this bad value. But what to do from here? I am stuck.
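
To make the expected behaviour concrete, here is a stripped-down model of the call path described above. It is not Crypto++ source: NameValuePairsModel, RsaFunctionModel and their members are hypothetical stand-ins with simplified signatures. With a correctly working compiler, the call through the base-class interface lands in the derived override, and the "this" pointer seen there is the object's address, not the name string passed as an argument.

```cpp
#include <cstring>

// Hypothetical stand-in for the NameValuePairs interface.
struct NameValuePairsModel
{
    virtual ~NameValuePairsModel() {}

    // Pure virtual: every call must be dispatched through the vtable.
    virtual bool GetVoidValue(const char* name, void* pValue) const = 0;

    // Non-virtual wrapper, loosely mirroring the chain
    // GetThisObject -> GetValue -> GetVoidValue.
    bool GetThisObject(void* pValue) const
    {
        return GetVoidValue("ThisObject:RsaFunctionModel", pValue);
    }
};

// Hypothetical stand-in for RSAFunction.
struct RsaFunctionModel : NameValuePairsModel
{
    mutable const void* lastThis;  // the this pointer seen by the override
    mutable const char* lastName;  // the name argument seen by the override

    RsaFunctionModel() : lastThis(nullptr), lastName(nullptr) {}

    bool GetVoidValue(const char* name, void* /*pValue*/) const override
    {
        lastThis = this;
        lastName = name;
        // Report success only for names with the "ThisObject:" prefix.
        return std::strncmp(name, "ThisObject:", 11) == 0;
    }
};
```

A small reproducer of this shape, built with the same compiler and flags, is one way to show a vendor that the dispatch itself is at fault.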



Jeffrey Walton

Oct 31, 2016, 6:49:16 AM
to Andrew Marlow, Crypto++ Users
Yuk... In a vacuum and absent a memory smash, a bad "this" pointer
is usually caused by C++ static initialization order problems. On
other OSes it usually surfaces as a null this pointer.

I have spent countless hours trying to wring out the problems. We
mostly have them solved on Windows and Linux. Solaris and MacPorts are
the holdouts because they don't respond to init_seg(lib) or
init_priority. Also see
https://cryptopp.com/wiki/Static_Initialization_Order_Fiasco.

How are you building the library? Are {cryptlib.o, cpu.o, integer.o}
the first three objects fed to the linker? The remaining *.o files are
"don't care's". However, if you have a file-scope static, then the
object file needs to be after {cryptlib.o, cpu.o, integer.o}.
{cryptlib.o, cpu.o, integer.o} must be initialized first:

# As an example:
$CXX $CXXFLAGS cryptlib.o cpu.o integer.o my_main.o -o prog.exe
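
The link-order requirement exists because C++ leaves the relative initialization order of file-scope objects in different translation units unspecified. A common general-purpose way to sidestep that is construct-on-first-use, shown here as a minimal sketch and not as a description of how Crypto++ handles it. The shared object lives in a function-local static, so it is built the first time anything asks for it, whatever the link order. Registry and Registrar are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical shared object. The function-local static is constructed
// on the first call, so callers never see it uninitialized.
inline std::vector<std::string>& Registry()
{
    static std::vector<std::string> instance;
    return instance;
}

// Hypothetical file-scope client: registers a name during static initialization.
struct Registrar
{
    explicit Registrar(const char* name) { Registry().push_back(name); }
};

// File-scope statics like these could live in any object file; each one
// triggers construction of the registry on demand.
static Registrar g_first("cryptlib");
static Registrar g_second("integer");
```

The trade-off is a function call on every access, and destruction order at exit still needs care.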

Something else that kind of sucks when trying to diagnose C++ static
initialization order problems: we don't have tools to audit for them.
I've been looking for them for years. Also see
http://stackoverflow.com/q/34144185.

Another workaround could be: (1) don't use libcryptopp.a, and (2) use
libcryptopp.so. I believe the module boundary ensures the C++ static
objects are initialized properly.

I could be completely wrong. I've also seen problems with the
initialized data segment on OS X. That's why we added
CRYPTOPP_SECTION_INIT. I wonder if Solaris is experiencing a similar
issue. Does -fno-common help the problem?

On early .Net 2002 and 2003 I saw some odd vtable problems. But I'm
pretty sure it was due to CRYPTOPP_NO_VTABLE on MS platforms. On
Unixes the macro is defined to nothing so it should not be
contributing to the issue you are experiencing.

Can you get me remote access to that machine?

Jeff

Andrew Marlow

Oct 31, 2016, 10:34:23 AM
to Crypto++ Users, marlow...@gmail.com, nolo...@gmail.com
On Monday, 31 October 2016 10:49:16 UTC, Jeffrey Walton wrote:
> Yuk... In a vacuum and absent a memory smash, a bad "this" pointer
> is usually caused by C++ static initialization order problems.

[snip]

> How are you building the library? Are {cryptlib.o, cpu.o, integer.o}
> the first three objects fed to the linker?

The supplied GNUmakefile has a link line that starts with the object files: cryptlib.o integer.o seed.o shacal2.o

> The remaining *.o files are
> "don't care's". However, if you have a file-scope static, then the
> object file needs to be after {cryptlib.o, cpu.o, integer.o}.
> {cryptlib.o, cpu.o, integer.o} must be initialized first:

[snip]

Remember, I am no longer investigating why my program goes wrong. I am looking at cryptest.exe b for running the benchmarks. This shows exactly the same issue and takes my program code out of the equation.


> Another workaround could be: (1) don't use libcryptopp.a, and (2) use
> libcryptopp.so.

Tried that. Didn't help.

> I could be completely wrong. I've also seen problems with the
> initialized data segment on OS X. That's why we added
> CRYPTOPP_SECTION_INIT. I wonder if Solaris is experiencing a similar
> issue. Does -fno-common help the problem?

Nope. Tried that, no change.

> On early .Net 2002 and 2003 I saw some odd vtable problems. But I'm
> pretty sure it was due to CRYPTOPP_NO_VTABLE on MS platforms. On
> Unixes the macro is defined to nothing so it should not be
> contributing to the issue you are experiencing.

Ok. Thanks for the explanation.

> Can you get me remote access to that machine?

No. It does not belong to me, it belongs to my client.

> Jeff

I tried making GetVoidValue non-pure and put a trace in it to see if the virtual function dispatch was maybe a pure virtual being called somehow. No luck.

I also tried indirecting the call via a new non-virtual function that is not template code and whose implementation is in the cpp file, just in case that might help. It didn't. The new function appears in the stack trace but the wrong function still gets called when GetVoidValue is invoked, even when the source code qualifies the function name with "this->".
 


> I have spent some more time in the debugger and have some more information
> on this problem. Unfortunately I have no resolution at the moment.
>
> The Integer copy ctor (integer.cpp:2903) is being given a bad integer to
> copy. The m_ptr is null, the m_size is 144. This integer is returned from
> RSAFunction::PreimageBound. The "this" pointer at this juncture seems to be
> bad. It turns out the pointer value is actually a pointer to char* rather
> than the object expected.
[snip]

Jeffrey Walton

Oct 31, 2016, 6:39:39 PM
to Crypto++ Users, marlow...@gmail.com, nolo...@gmail.com

> Another workaround could be: (1) don't use libcryptopp.a, and (2) use
> libcryptopp.so.

> Tried that. Didn't help.

That's a drag. I thought you might be compiling against one version, and then runtime linking against another version.

> Can you get me remote access to that machine?

> No. It does not belong to me, it belongs to my client.

Can you bring me on as a subcontractor?

 Jeff

Andrew Marlow

Dec 12, 2016, 10:36:11 AM
to Crypto++ Users, marlow...@gmail.com, nolo...@gmail.com

I'd love to but I am pleased to report that there's no need - I have got it working! It turns out it was indeed a compiler bug in version 12.4. I suspected as much from the stack trace where it showed a string address was being misinterpreted as a this pointer, causing an error in a virtual function dispatch.

After several calls to Rogue Wave support I eventually found out that in order to start using C++11 we would not only have to move from Source2016 to the fix release SourcePro2016.1, but we would also have to change compiler from 12.4 to 12.5.

I just tried cryptopp 565 with 12.5 using the same compiler options that were used for the build of SourcePro12.5 and it worked. The benchmarks now complete successfully.
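
For anyone who needs to guard a build against the affected compiler, a check along the following lines may help. It is a sketch under stated assumptions: the __SUNPRO_CC values 0x5130 for Studio 12.4 and 0x5140 for Studio 12.5 are assumed here and should be confirmed against the compiler documentation, and SUSPECT_SUNCC and UsesSuspectSunCC are hypothetical names.

```cpp
// Assumption: Studio 12.4 defines __SUNPRO_CC as 0x5130 and Studio 12.5
// defines it as 0x5140. Only the 12.4 range is flagged, because that is
// the only version this thread reports as failing.
#if defined(__SUNPRO_CC) && (__SUNPRO_CC >= 0x5130) && (__SUNPRO_CC < 0x5140)
# define SUSPECT_SUNCC 1
#else
# define SUSPECT_SUNCC 0
#endif

// Hypothetical helper: true when the code was built with the compiler
// version that showed the bad virtual dispatch.
inline bool UsesSuspectSunCC()
{
    return SUSPECT_SUNCC != 0;
}
```

A test harness could print a warning, or refuse to run the benchmarks, when this returns true.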

 


Jeffrey Walton

Dec 12, 2016, 12:39:15 PM
to Crypto++ Users, marlow...@gmail.com, nolo...@gmail.com


> Can you bring me on as a subcontractor?

> I'd love to but I am pleased to report that there's no need - I have got it working! It turns out it was indeed a compiler bug in version 12.4. I suspected as much from the stack trace where it showed a string address was being misinterpreted as a this pointer, causing an error in a virtual function dispatch.
>
> After several calls to Rogue Wave support I eventually found out that in order to start using C++11 we would not only have to move from Source2016 to the fix release SourcePro2016.1, but we would also have to change compiler from 12.4 to 12.5.
>
> I just tried cryptopp 565 with 12.5 using the same compiler options that were used for the build of SourcePro12.5 and it worked. The benchmarks now complete successfully.

Congrats on clearing it. I'm sorry you had to suffer it alone.

I'd like to get it written up at https://www.cryptopp.com/wiki/Solaris_(Command_Line) for the next unsuspecting person who comes along.

How would you write this up? Can you suggest a heading and some text?

Jeff