Crash on Cygwin i386 with -O3

28 views
Skip to first unread message

Jeffrey Walton

unread,
Jul 14, 2015, 4:31:21 AM7/14/15
to cryptop...@googlegroups.com
Oh, the irony....

Testing MessageDigest algorithm SHA-3-224.
......
Program received signal SIGSEGV, Segmentation fault.
0x004d7235 in CryptoPP::xorbuf (
    buf=0x8004dc0a "\276Y\255\064^|Lo!$\374զ\203\f\244\347bc\237\311xV4\367H\021\331\304\\\347\276\306(\254jŭ\233\300W\016\062m\274yk\371\375\274\377\241ڿ\276\220wc\360<Q\337/\266\231\267߸\274-\242\272\262i\251H\006\272b\037 \376\275\004eMk\261\200\004\210ֆ\362r\031\237\372v\f\256\320r\344\353\320=\212\246\023\365\347X\036\067ָö\220\271\302i\315U&",
    mask=0x28adf3 'a' <repeats 200 times>..., count=count@entry=0x46)
    at misc.cpp:41
41                              ((word32*)buf)[i] ^= ((word32*)mask)[i];
(gdb)

And its our old friend, this time showing its head on word32:

(gdb) disass
Dump of assembler code for function CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned int):
   0x004d7120 <+0>:     push   %ebp
   0x004d7121 <+1>:     mov    %esp,%ebp
   ...
   0x004d721a <+250>:   mov    0xc(%esp),%edx
   0x004d721e <+254>:   vmovdqu (%eax,%ebx,1),%xmm1
   0x004d7223 <+259>:   vinsertf128 $0x1,0x10(%eax,%ebx,1),%ymm1,%ymm0
   0x004d722b <+267>:   addl   $0x1,0x1c(%esp)
   0x004d7230 <+272>:   vxorps (%edx,%ebx,1),%ymm0,%ymm0
=> 0x004d7235 <+277>:   vmovdqa %ymm0,(%edx,%ebx,1)
   0x004d723a <+282>:   mov    0x1c(%esp),%edx
   0x004d723e <+286>:   add    $0x20,%ebx
   ...

We now have CRYPTOPP_NO_UNALIGNED_DATA_ACCESS. We can enable it (1) project wide. Or we can band-aide it and (2) enable it for Cygwin; maybe even (3) enable it for Cygwin i386.

Or, we can add yet another band-aide, and change the
IsAligned<word32> to IsStrictAligned<word32>.

How do you guys want to proceed? Any thing other than (1) is a band-aide that's going to lead to future trouble.

Jeff

Mobile Mouse

unread,
Jul 14, 2015, 12:25:54 PM7/14/15
to Jeffrey Walton, cryptop...@googlegroups.com
I think we should test the performance of IsStrictAligned<word32>, and if it is acceptable, enforce it throughout the project.

Sent from my iPad
--
--
You received this message because you are subscribed to the "Crypto++ Users" Google Group.
To unsubscribe, send an email to cryptopp-user...@googlegroups.com.
More information about Crypto++ and this group is available at http://www.cryptopp.com.
---
You received this message because you are subscribed to the Google Groups "Crypto++ Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cryptopp-user...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mobile Mouse

unread,
Jul 14, 2015, 12:27:19 PM7/14/15
to Jeffrey Walton, cryptop...@googlegroups.com
To explain my logic: IMHO the only reason NOT to enforce IsStrictAligned in the entire project would be if it kills performance (which I consider unlikely).

Sent from my iPad

On Jul 14, 2015, at 00:31, Jeffrey Walton <nolo...@gmail.com> wrote:

Jeffrey Walton

unread,
Jul 15, 2015, 12:54:41 AM7/15/15
to cryptop...@googlegroups.com, nolo...@gmail.com


On Tuesday, July 14, 2015 at 8:27:19 AM UTC-4, Mouse wrote:
To explain my logic: IMHO the only reason NOT to enforce IsStrictAligned in the entire project would be if it kills performance (which I consider unlikely).
But its still undefined behavior :) I guess its kind of like Jon Bentley said: "If it doesn't have to be correct, I can make it as fast as you'd like it to be".

Here's what we know about it:

 * Its undefined behavior
 * Windows tolerates it
 * Linux/x86_64 and GCC 4.8+ at -O3 does not tolerate it for word64
 * Linux/i386 and GCC 4.8+ at -O3 does not tolerate it for word32 or word64

To get the speed benefit, we want to hit the code path guarded by:

    if (IsAligned<word32>(ptr))
    {
        if (IsAligned<word64>(ptr))
        {
            ...
        }
    }

The way to [mostly] ensure we hit that code path is to define CRYPTOPP_NO_UNALIGNED_DATA_ACCESS. That ensures we get natural alignments from the allocator, and not the doctored value of 1 because the processor can tolerate it. And it removes all that undefined behavior that UBsan is complaining about :)

We can still use unaligned data. We just need to do it outside of C/C++ because that's where the aliasing and alignment rules are. To do it outside of C/C++, we need to drop into assembly language.

Here's an interim patch until you guys figure out how to handle it:

$ git diff config.h
diff --git a/config.h b/config.h
index d3bd692..e359729 100644
--- a/config.h
+++ b/config.h
@@ -33,6 +33,11 @@
 // define this to retain (as much as possible) old deprecated function and clas
 // #define CRYPTOPP_MAINTAIN_BACKWARDS_COMPATIBILITY
 
+// Cygwin requires aligned data acess. It vectorizes word32's on i386, too.
+#if defined(__CYGWIN__) || defined(__CYGWIN32__)
+# define CRYPTOPP_NO_UNALIGNED_DATA_ACCESS
+#endif
+
 #define GZIP_OS_CODE 0

I think its important to do it in the config.h file in case someone does a `sudo make install`. If we only did it in a makefile, then the define would be lost after install.

Jeff

Jeffrey Walton

unread,
Jul 16, 2015, 5:20:15 AM7/16/15
to cryptop...@googlegroups.com

Here's an interim patch until you guys figure out how to handle it:

$ git diff config.h
diff --git a/config.h b/config.h
index d3bd692..e359729 100644
--- a/config.h
+++ b/config.h
@@ -33,6 +33,11 @@
 // define this to retain (as much as possible) old deprecated function and clas
 // #define CRYPTOPP_MAINTAIN_BACKWARDS_COMPATIBILITY
 
+// Cygwin requires aligned data acess. It vectorizes word32's on i386, too.
+#if defined(__CYGWIN__) || defined(__CYGWIN32__)
+# define CRYPTOPP_NO_UNALIGNED_DATA_ACCESS
+#endif
+
 #define GZIP_OS_CODE 0

The patch was applied at:

 * https://github.com/weidai11/cryptopp/commit/06ea2d2952bd7439de53e7bf6d74da334e951162
 * https://sourceforge.net/p/cryptopp/code/591/

Eventually, we will need to address the undefined behavior that results from the unaligned accesses.

I'm going to modify config.h to do the same for the mobile and embedded targets (Android, iOS, Windows Phone, and Embedded).

I don't really consider the issue closed. Feel free to pick it up later.

Jeff

Jeffrey Walton

unread,
Oct 9, 2018, 6:58:01 AM10/9/18
to Crypto++ Users
Reply all
Reply to author
Forward
0 new messages