Segfault with core dump on 5.6.1 in Rijndael::Enc::AdvancedProcessBlocks

Brian Vincent

Mar 7, 2013, 1:41:37 AM
to cryptop...@googlegroups.com
I'm using CryptoPP's AES-256 encryption.  It's working for 99% of people just fine.  So far, 2 separate people are experiencing segfaults.  The seg fault seems to happen after successfully encrypting thousands of blocks, so even on their machines, it doesn't always fail.

Program terminated with signal 11, Segmentation fault.
#0  CryptoPP::Rijndael::Enc::AdvancedProcessBlocks (this=Cannot access memory at address 0x8
) at rijndael.cpp:1233
1233                    return length % BLOCKSIZE;
(gdb)

(gdb) bt
#0  CryptoPP::Rijndael::Enc::AdvancedProcessBlocks (this=Cannot access memory at address 0x8
) at rijndael.cpp:1233
Cannot access memory at address 0x4

(gdb) info registers
eax            0x7639370        123966320
ecx            0x0      0
edx            0xac64   44132
ebx            0x0      0
esp            0x76391f0        0x76391f0
ebp            0x0      0x0
esi            0x64     100
edi            0x8643434e       -2042412210
eip            0x83d45e8        0x83d45e8 <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2024>
eflags         0x10246  [ PF ZF IF RF ]
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0xc040007b       -1069547397
fs             0x0      0
gs             0x33     51

Since the ebp register is 0x0, I can't get a good stack trace.

It's important to note that both of these people are running the x86 version of this library (on an x64 machine), and their CPUs do not support AES-NI.  This means that they're executing the SSE2 code path.

(gdb) print g_hasAESNI
$1 = false
(gdb) print g_hasSSE2
$2 = true

If you look at the source, just before the seg fault, it executes an all-assembly function called Rijndael_Enc_AdvancedProcessBlocks.

01232                 Rijndael_Enc_AdvancedProcessBlocks(&locals, m_key);
01233                 return length % BLOCKSIZE;

That function sets up and manages its own stack space.  On x86, one of the first things that it does is push ebx and ebp on the stack.  One of the last things it does is pop them both off of the stack.  This matches up perfectly with the assembly code that I'm seeing.

   |0x83d45cd <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+1997> movaps %xmm0,0x30(%eax)
   |0x83d45d1 <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2001> movaps %xmm0,0x40(%eax)
   |0x83d45d5 <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2005> movaps %xmm0,0x50(%eax)
   |0x83d45d9 <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2009> movaps %xmm0,0x60(%eax)
   |0x83d45dd <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2013> mov    0x300(%esp),%esp
   |0x83d45e4 <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2020> emms
   |0x83d45e6 <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2022> pop    %ebp
   |0x83d45e7 <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2023> pop    %ebx
  >|0x83d45e8 <CryptoPP::Rijndael::Enc::AdvancedProcessBlocks(byte const*, byte const*, byte*, size_t, CryptoPP::Rijndael::Dec::word32) const+2024> andl   $0xf,0x18(%ebp)

Apparently my compiler has inlined the function.  It segfaults when it tries to access the variable "length" to perform % BLOCKSIZE (which is equivalent to a bitwise AND with 0xf).  "length" should be 0x18 bytes after the ebp register.  But my ebp register is 0x0, meaning that somewhere in between pushing it to the stack and popping it off the stack, something has probably overwritten it with 0x0.  It certainly looks like a buffer overflow in the assembly code.
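
The equivalence between `% BLOCKSIZE` and the `andl $0xf` mask holds because the block size is a power of two and "length" is unsigned. A minimal sketch, assuming a 16-byte block size as in AES (the constant and helper names are made up for illustration and are not taken from Crypto++):

```cpp
#include <cstddef>

// AES block size in bytes; assumed to match Crypto++'s BLOCKSIZE for Rijndael.
constexpr std::size_t kBlockSize = 16;

// Remainder as written in the C++ source: length % BLOCKSIZE.
constexpr std::size_t remainder_mod(std::size_t length) {
    return length % kBlockSize;
}

// Remainder as the compiler emitted it: a bitwise AND with 0xf.
constexpr std::size_t remainder_mask(std::size_t length) {
    return length & (kBlockSize - 1);
}

// Returns true if both forms agree for every length below `limit`.
bool forms_agree_below(std::size_t limit) {
    for (std::size_t n = 0; n < limit; ++n) {
        if (remainder_mod(n) != remainder_mask(n)) return false;
    }
    return true;
}
```

So the faulting instruction is doing nothing more exotic than reading "length" through ebp; the arithmetic itself cannot fault.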

All of my attempts to reproduce this problem or analyze the asm function Rijndael_Enc_AdvancedProcessBlocks have failed.

I haven't tried 5.6.2 yet, because getting someone else to reproduce this problem is hard.  Also, there is only one change in 5.6.2 that could possibly be related to this, and it supposedly only fixes a valgrind false-positive warning.


1.  Interestingly, valgrind will report an error on the exact same assembly instruction, when attempting to access "length", saying that it's uninitialized.
2.  Valgrind will report that error, even when "length" is perfectly initialized, supporting the claim that it really is a false-positive.
3.  I have no idea why the change in 5.6.2 (increasing the assembly function's stack space from 512 to 768) would fix the valgrind false-positive.

I don't have any good reason to believe this change in 5.6.2 will fix my problem.

Can anyone help?

Thanks

David Irvine

Mar 7, 2013, 1:05:47 PM
to Brian Vincent, Crypto++ Users

First off, what a great posting and very detailed. 
A couple of questions (I am aware this is horrible, as the fault is so random).

1. Is it possible for you to supply a minimal test case that shows this error? (I appreciate it's random, so possibly force many threads to run the test concurrently to speed up the failure.)
2. Can you please give a description of the machines, compiler, switches, etc.?

I am very interested and can possibly set this up and test on several platforms. 

David
 

Brian Vincent

Mar 25, 2013, 7:06:18 PM
to David Irvine, Crypto++ Users
I figured it out!  Well, mostly.  I can reproduce it now... sometimes.

These are the things that have to be present to reproduce it:

1.  You have to make sure that you're taking the SSE2 code path for AES.  So, you need to be running on an SSE2 processor, and you either need to disable AESNI when compiling or have a processor that doesn't support it.
2.  You have to be running in 32-bit mode.
3.  You have to be using signals.

Here is my test program that I can reproduce it with:  http://pastebin.com/y5f7hRUr

I'm compiling it with the flags -O2 and -g.

First, I run my test program in one terminal window (under gdb if you want).  In another terminal, I find the PID of the running test program and run:

while true; do kill -RTMIN <pid>; done

I don't fully expect any of you to be able to reproduce this.  It's been acting very picky for me.  Sometimes I can segfault it 5 times in a row, and then the next 20 times, I can't.  It seems like if it doesn't segfault in 10 seconds or so, it probably won't happen.

Here is me running it under gdb:

Program received signal SIG34, Real-time event 34.
0x081863a1 in gettimeofday ()
(gdb) handle SIG34 nostop
Signal        Stop      Print   Pass to program Description
SIG34         No        Yes     Yes             Real-time event 34
(gdb) handle SIG34 noprint
Signal        Stop      Print   Pass to program Description
SIG34         No        No      Yes             Real-time event 34
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
CryptoPP::Rijndael::Enc::AdvancedProcessBlocks (this=Cannot access memory at address 0x8
) at rijndael.cpp:1233
1233                    return length % BLOCKSIZE;
(gdb) q
A debugging session is active.




So, I'm not sure of what's causing this yet.  My guess is that the assembly code for the SSE2 AES code path is doing things that aren't signal-safe.  From what I've read, I think the stack for the signal handler is pushed on to the stack of the thread handling the exception.  I can't fully follow exactly what happens to the esp register in the code, so I can't determine if it increments esp above data that it needs.  If it does, maybe the signal handler stack is overwriting data.  I also noticed that the assembly code overwrites the ebp register, which seems like a weird thing to do, but I can't explain how it could cause this.
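
On Linux, a handler installed without SA_ONSTACK gets its frame built on the interrupted thread's stack, just below the current esp, and 32-bit x86 has no red zone to protect data sitting there. If the guess above is right, one way to test it might be to move the handler onto an alternate stack with sigaltstack(). The following is only a sketch of that idea; the function names are illustrative, it has not been tried against this crash, and sigaltstack() applies per thread, so it would have to be called in each thread that runs the AES code:

```cpp
#include <signal.h>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

namespace {
constexpr std::size_t kAltStackSize = 64 * 1024;  // arbitrary, well above MINSIGSTKSZ
void* g_alt_stack = nullptr;
volatile sig_atomic_t g_ran_on_alt_stack = 0;

// Records whether a local variable of the handler lives inside the
// alternate stack, i.e. whether the handler frame avoided the thread stack.
void on_signal(int) {
    char marker;
    const auto p = reinterpret_cast<std::uintptr_t>(&marker);
    const auto lo = reinterpret_cast<std::uintptr_t>(g_alt_stack);
    g_ran_on_alt_stack = (p >= lo && p < lo + kAltStackSize) ? 1 : 0;
}
}  // namespace

// Registers an alternate signal stack for the calling thread and installs
// a handler for `signo` that runs on it (SA_ONSTACK).
bool install_handler_on_alt_stack(int signo) {
    g_alt_stack = std::malloc(kAltStackSize);
    if (g_alt_stack == nullptr) return false;

    stack_t ss{};
    ss.ss_sp = g_alt_stack;
    ss.ss_size = kAltStackSize;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, nullptr) != 0) return false;

    struct sigaction sa {};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK | SA_RESTART;
    return sigaction(signo, &sa, nullptr) == 0;
}

// True once the handler has run and found itself on the alternate stack.
bool handler_ran_on_alt_stack() { return g_ran_on_alt_stack == 1; }
```

If the segfault disappears with the handler on an alternate stack, that would point at the signal frame overwriting something the assembly still needs; if it persists, the cause is elsewhere.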

Hopefully this helps you.  I think I'm done looking at this for now.  I'm either going to handle signals in a different thread, or disable the SSE2 codepath, or use OpenSSL.
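
The first option, handling signals in a different thread, can be sketched with pthread_sigmask() and sigwait(): block the signal everywhere and let one dedicated thread collect it synchronously, so no handler frame is ever pushed onto a thread that is inside the AES code. This is a minimal sketch with made-up function names, not what my program does today:

```cpp
#include <pthread.h>
#include <signal.h>
#include <unistd.h>
#include <thread>

// Blocks `signo` in the calling thread. Threads created afterwards inherit
// the mask, so this should run in main() before any worker is started.
bool block_signal(int signo) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return pthread_sigmask(SIG_BLOCK, &set, nullptr) == 0;
}

// Waits synchronously for `count` deliveries of `signo`. No handler runs,
// so nothing is written to any thread's stack asynchronously.
// Returns the number of deliveries received.
int wait_for_signals(int signo, int count) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    int received = 0;
    while (received < count) {
        int got = 0;
        if (sigwait(&set, &got) != 0) break;
        if (got == signo) ++received;
    }
    return received;
}

// Demonstration: block the signal, start the dedicated thread, send one
// process-directed signal, and report how many the thread collected.
int demo_dedicated_signal_thread(int signo) {
    if (!block_signal(signo)) return -1;
    int received = 0;
    std::thread waiter([&] { received = wait_for_signals(signo, 1); });
    kill(getpid(), signo);
    waiter.join();
    return received;
}
```

This only covers signals sent to the process as a whole; a signal aimed at a specific thread with pthread_kill() stays pending for that thread while it is blocked.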

Jeffrey Walton

Mar 28, 2013, 7:48:02 AM
to Crypto++ Users
Hi Brian,

On Mar 7, 2:41 am, Brian Vincent <bra...@gmail.com> wrote:
> I'm using CryptoPP's AES-256 encryption.  It's working for 99% of people
> just fine.  So far, 2 separate people are experiencing segfaults.  The seg
> fault seems to happen after successfully encrypting thousands of blocks, so
> even on their machines, it doesn't always fail.
>
> Program terminated with signal 11, Segmentation fault.
> #0  CryptoPP::Rijndael::Enc::AdvancedProcessBlocks (this=Cannot access
> memory at address 0x8
> ) at rijndael.cpp:1233
> 1233                    return length % BLOCKSIZE;
> (gdb)
>
> (gdb) bt
> #0  CryptoPP::Rijndael::Enc::AdvancedProcessBlocks (this=Cannot access
> memory at address 0x8
From this output, it looks like ECX (the 'this' pointer) is getting
blown away. I'm not sure what's at +8, but I'm not sure it matters
either. Finding that overwrite seems to be very relevant, though :)

> ) at rijndael.cpp:1233
> Cannot access memory at address 0x4
>
> (gdb) info registers
> eax            0x7639370        123966320
> ecx            0x0      0
> edx            0xac64   44132
> ebx            0x0      0
> esp            0x76391f0        0x76391f0
> ebp            0x0      0x0
> esi            0x64     100
> edi            0x8643434e       -2042412210
> eip            0x83d45e8        0x83d45e8
> ...

> 1. Interestingly, valgrind will report an error on the exact same assembly
> instruction, when attempting to access "length", saying that it's
> uninitialized.
> 2. Valgrind will report that error, even when "length" is perfectly
> initialized, supporting the claim that it really is a false-positive.
It's interesting that things look right in the sources, but Valgrind flags
it during dynamic analysis.

Perhaps it's a GCC or BinUtils problem? Have you tried another version
of the tools?

> Can anyone help?
Is AdvancedProcessBlocks using ECX? Is it preserving it (push/pop)? I
expect so, but it does not hurt to ask.

I assume the problem goes away when defining CRYPTOPP_DISABLE_ASM.

Jeff

Brian Vincent

Mar 28, 2013, 1:33:04 PM
to Crypto++ Users
On Thu, Mar 28, 2013 at 6:48 AM, Jeffrey Walton <nolo...@gmail.com> wrote:
> Hi Brian,
>
> On Mar 7, 2:41 am, Brian Vincent <bra...@gmail.com> wrote:
> > I'm using CryptoPP's AES-256 encryption.  It's working for 99% of people
> > just fine.  So far, 2 separate people are experiencing segfaults.  The seg
> > fault seems to happen after successfully encrypting thousands of blocks, so
> > even on their machines, it doesn't always fail.
> >
> > Program terminated with signal 11, Segmentation fault.
> > #0  CryptoPP::Rijndael::Enc::AdvancedProcessBlocks (this=Cannot access
> > memory at address 0x8
> > ) at rijndael.cpp:1233
> > 1233                    return length % BLOCKSIZE;
> > (gdb)
> >
> > (gdb) bt
> > #0  CryptoPP::Rijndael::Enc::AdvancedProcessBlocks (this=Cannot access
> > memory at address 0x8
> From this output, it looks like ECX (the 'this' pointer) is getting
> blown away. I'm not sure what's at +8, but I'm not sure it matters
> either. Finding that overwrite seems to be very relevant, though :)

I thought this was due to the ebp register being 0.  I think the saved eip register is saved at ebp+0x4, and the first arguments on the stack are saved at ebp+0x8.  So I think that error is when gdb is trying to look at the first argument, this.
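
The addresses in gdb's complaints line up with that reading. Here is a sketch of the 32-bit frame layout, assuming the usual GCC i386 convention in which "this" is passed on the stack as the first argument and the frame is addressed through ebp (the struct and field names are illustrative, not taken from Crypto++ or gdb):

```cpp
#include <cstddef>
#include <cstdint>

// Model of what sits at and above ebp in a 32-bit frame for
// AdvancedProcessBlocks(inBlocks, xorBlocks, outBlocks, length, flags) const.
// Every slot is 4 bytes wide on i386, so uint32_t stands in for pointers.
struct Ia32Frame {
    std::uint32_t saved_ebp;   // ebp + 0x00: caller's frame pointer
    std::uint32_t return_eip;  // ebp + 0x04: return address
    std::uint32_t this_ptr;    // ebp + 0x08: first (hidden) argument
    std::uint32_t in_blocks;   // ebp + 0x0c
    std::uint32_t xor_blocks;  // ebp + 0x10
    std::uint32_t out_blocks;  // ebp + 0x14
    std::uint32_t length;      // ebp + 0x18: operand of andl $0xf,0x18(%ebp)
    std::uint32_t flags;       // ebp + 0x1c
};

static_assert(offsetof(Ia32Frame, return_eip) == 0x04, "saved eip at ebp+0x4");
static_assert(offsetof(Ia32Frame, this_ptr) == 0x08, "this at ebp+0x8");
static_assert(offsetof(Ia32Frame, length) == 0x18, "length at ebp+0x18");

// Address a debugger would read for a slot when the frame pointer is `ebp`.
constexpr std::uint32_t slot_address(std::uint32_t ebp, std::size_t offset) {
    return ebp + static_cast<std::uint32_t>(offset);
}
```

With ebp equal to 0, the "this" slot is at address 0x8 and the return-address slot is at 0x4, which matches the two "Cannot access memory" messages, and the faulting andl reads address 0x18.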
 

> > ) at rijndael.cpp:1233
> > Cannot access memory at address 0x4
> >
> > (gdb) info registers
> > eax            0x7639370        123966320
> > ecx            0x0      0
> > edx            0xac64   44132
> > ebx            0x0      0
> > esp            0x76391f0        0x76391f0
> > ebp            0x0      0x0
> > esi            0x64     100
> > edi            0x8643434e       -2042412210
> > eip            0x83d45e8        0x83d45e8
> > ...
>
> > 1.  Interestingly, valgrind will report an error on the exact same assembly
> > instruction, when attempting to access "length", saying that it's
> > uninitialized.
> > 2.  Valgrind will report that error, even when "length" is perfectly
> > initialized, supporting the claim that it really is a false-positive.
> It's interesting that things look right in the sources, but Valgrind flags
> it during dynamic analysis.
>
> Perhaps it's a GCC or BinUtils problem? Have you tried another version
> of the tools?

Nope.
 

> > Can anyone help?
> Is AdvancedProcessBlocks using ECX? Is it preserving it (push/pop)? I
> expect so, but it does not hurt to ask.

At the beginning of the asm function, the ecx is a pointer to "locals".  But the asm function uses ecx, so it might be fine that it's 0 at this point.  I have trouble following the aes code in assembly.
 

> I assume the problem goes away when defining CRYPTOPP_DISABLE_ASM.

I would assume so too.  The crashes that people are getting with our program only happen when they take this SSE2 assembly code path, and not the C code path or the AESNI code path.
 

> Jeff

Zooko Wilcox-O'Hearn

Apr 8, 2013, 2:05:15 AM
to Brian Vincent, Crypto++ Users
Thanks to Brian Vincent for the really good bug report.

I can't yet tell if this bug could affect my use of Crypto++. My use
of Crypto++ is in the pycryptopp library:

https://pypi.python.org/pypi/pycryptopp

I think whether this bug can affect my users may depend on whether
Python sets signal handlers in the main thread.

Also, we already have this patch:
http://cryptopp.svn.sourceforge.net/viewvc/cryptopp?view=revision&revision=525
. Our version of it is:
https://github.com/tahoe-lafs/pycryptopp/commit/906683989eddadb9a5f4da17ce49867b3e27a24a

I can't tell if that patch would fix this bug. Apparently Brian
Vincent didn't try it (for the very good reason that he didn't see why
it would help).

It's a shame that the way to exercise this bug requires sending a
signal from a separate process. I really like testing
(https://tahoe-lafs.org/buildbot-pycryptopp/waterfall), but even I may
balk at writing a test that spawns a subprocess and then sends signals
at it.
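
One way to avoid the subprocess might be to send the signal from a helper thread in the same process, aimed at the thread under test with pthread_kill(). The sketch below assumes that approach; `run_under_signal_storm` and its parameters are hypothetical names, the workload would be the AES encryption loop, and I have not checked whether thread-directed signals trigger the crash as readily as `kill` from another process:

```cpp
#include <pthread.h>
#include <signal.h>
#include <atomic>
#include <functional>
#include <thread>

namespace {
std::atomic<unsigned long> g_signals_seen{0};

// Handler only bumps an atomic counter (assumed lock-free on this platform).
void count_signal(int) {
    g_signals_seen.fetch_add(1, std::memory_order_relaxed);
}
}  // namespace

// Runs `workload` `rounds` times on the calling thread while a helper
// thread keeps sending SIGRTMIN to it. Returns the number of signals handled.
unsigned long run_under_signal_storm(const std::function<void()>& workload,
                                     int rounds) {
    struct sigaction sa {};
    sa.sa_handler = count_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGRTMIN, &sa, nullptr) != 0) return 0;

    g_signals_seen.store(0);
    std::atomic<bool> done{false};
    const pthread_t target = pthread_self();
    std::thread sender([&done, target] {
        while (!done.load()) {
            pthread_kill(target, SIGRTMIN);
            std::this_thread::yield();
        }
    });

    // Make sure the storm has started before running the code under test.
    while (g_signals_seen.load() == 0) std::this_thread::yield();

    for (int i = 0; i < rounds; ++i) workload();

    done.store(true);
    sender.join();
    return g_signals_seen.load();
}
```

A hit would show up as the test process dying with SIGSEGV rather than as a failed assertion, so the test runner would still need to treat a crash as a failure.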

I'm seriously thinking of just defining CRYPTOPP_DISABLE_ASM always in
our build system. This would have eliminated about 90% of the bugs
we've had to deal with over the years. I honestly don't know if the
performance penalty would be detectable to my users, but I guess I
could experiment. (We have benchmarks as well as unit tests on that
buildbot.)

In any case, defining that would guarantee that my users are safe from
this bug (unless they link at runtime to a system-provided
libcryptopp.so instead of letting pycryptopp's build system build
Crypto++ for them).

In the future, I intend to work toward replacing Crypto++ entirely in
pycryptopp, for reasons of compilation, portability, deployment, etc.
The biggest single problem I have with Crypto++ is that it is written
in C++. Every couple of years this causes a deployment headache for
me. The most recent example is that the newest and best way to
interface native code to Python -- cffi
(http://cffi.readthedocs.org/en/latest/) doesn't support C++ at all. I
think I'd rather have the simplicity of using cffi and give up the
advantages of Crypto++. That means I have to adopt some other
implementation of AES and of RSA, most likely by relying on a future
release of pyOpenSSL which is itself based on cffi and which exposes
the lower-level API of OpenSSL to Python land.

Thanks!

Regards,

Zooko

Zooko O'Whielacronx

Apr 12, 2013, 5:05:30 AM
to Brian Vincent, Crypto++ Users
I'm making progress at disabling the ASM implementations and using
only the C++ implementations:

https://tahoe-lafs.org/trac/pycryptopp/ticket/85

Intriguingly this appears to have unexpectedly fixed a mysterious bug
that manifests only on NetBSD, and only when the /dev/urandom entropy
pool is "low on entropy":

https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1924

Regards,

Zooko