[Boost-users] Coroutine jump_fcontext crash (Boost 1.61)


Jason Mancini

Jun 21, 2016, 1:24:34 PM
to boost...@lists.boost.org

Hello,

Starting with Boost 1.61, if -DBOOST_USE_VALGRIND=1 is set, the Coroutine sample from the Boost docs crashes on the first stack switch.  Is this a known issue? (We're using GCC 6.1 on 64-bit Linux.)  I'm not posting a code sample because it's literally the exact one from the Coroutine web page: compile with the define, then run.  We have Valgrind 3.10.1 installed.  [Curiously, the Context code sample runs fine.]  I suspect a missing snippet on the Coroutine side that affects the size of the stack_context structure.

Thanks!

Jason


# gdb ./a.out
...
SIGSEGV, Segmentation fault.
(gdb) bt
#0 0x00002aaaaaab3ec0 in jump_fcontext () from .../lib/libboost_context.so.1.61.0
(gdb)

Oliver Kowalke

Jun 21, 2016, 2:01:18 PM
to boost-users
works for me
did you recompile boost.context with BOOST_USE_VALGRIND?

Jason Mancini

Jun 21, 2016, 2:43:41 PM
to boost...@lists.boost.org
Oliver Kowalke wrote:
> works for me
> did you recompile boost.context with BOOST_USE_VALGRIND?

Hi Oliver,

Well, we didn't do anything special like that for Boost 1.57/1.60, and had no problems.  Are you saying we need one set of Boost *.so/*.a library files built with the define, and another set built without it?  Specifically, how do I get the flag into Boost's bootstrap.sh/b2 when building the installed *.so and *.a files?  My only option at present is --with-libraries=all

Thanks!

Jason


Jason Mancini

Jun 21, 2016, 4:05:15 PM
to boost...@lists.boost.org

Here's a little more info:


jump_fcontext ()
libs/context/src/asm/jump_x86_64_sysv_elf_gas.S:45
45 popq %r12 /* restrore R12 */


rsp is 0x0 because rdi is 0x0; rdi was supposed to point to the context data.  I'm now adding printf calls at the 8 jump_fcontext call sites.


Jason Mancini

Jun 21, 2016, 5:06:02 PM
to boost...@lists.boost.org

Header file coroutine/detail/push_coroutine_impl.hpp:98

callee_.ctx_ is valid


But then, library file coroutine/src/detail/coroutine_context.hpp:68

other.ctx_ is null


Yeah, callee_ has

sp, size, sctx={size,sp,valgrind_stack_id}, ctx_


But other has

sp, size, sctx={size,sp}, ctx_


That's the disconnect.  Trying to figure out why 1.60 is not afflicted.
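A minimal sketch of the mismatch described above, assuming the member order seen in the debugger. The type names (stack_context_plain, stack_context_valgrind, coroutine_context_like) and the member types are hypothetical stand-ins, not the actual Boost declarations. It only shows that the extra valgrind_stack_id member moves ctx_ to a different offset, so code compiled with the define and code compiled without it disagree about where ctx_ lives.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the two layouts; the real Boost types differ.
struct stack_context_plain {
    std::size_t size;
    void*       sp;
};

struct stack_context_valgrind {
    std::size_t size;
    void*       sp;
    unsigned    valgrind_stack_id;  // only present with BOOST_USE_VALGRIND
};

// Same member order as reported in the debugger: sp, size, sctx, ctx_.
template <typename StackContext>
struct coroutine_context_like {
    void*        sp;
    std::size_t  size;
    StackContext sctx;
    void*        ctx_;
};

// What user code sees (define set) versus what the library sees (define unset).
using user_view    = coroutine_context_like<stack_context_valgrind>;
using library_view = coroutine_context_like<stack_context_plain>;

constexpr std::size_t user_ctx_offset    = offsetof(user_view, ctx_);
constexpr std::size_t library_ctx_offset = offsetof(library_view, ctx_);

// The two sides look for ctx_ at different offsets in the same object.
static_assert(user_ctx_offset != library_ctx_offset,
              "layouts unexpectedly agree on the position of ctx_");

// Runtime restatement of the same check, convenient under a debugger.
inline void check_layouts_disagree() {
    assert(user_ctx_offset != library_ctx_offset);
    assert(sizeof(user_view) != sizeof(library_view));
}
```

Reading ctx_ through the wrong view picks up whatever bytes sit at the other offset, which would be consistent with the null pointer seen in jump_fcontext.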

Jason


Jason Mancini

Jun 21, 2016, 8:21:09 PM
to boost...@lists.boost.org
Oliver,
Yeah, okay, I see what you mean.  I checked both 1.60 and 1.61 in the debugger with debug libraries.  Both are broken if BOOST_USE_VALGRIND is set only for the user code, but only 1.61 crashes: the mismatch is swapped around between the two versions, and one way leaves ctx_ in the right spot while the other doesn't.  Got it worked out.
Thanks for the hint this morning!

Jason


Gavin Lambert

Jun 21, 2016, 8:32:49 PM
to boost...@lists.boost.org
On 22/06/2016 06:00, Oliver Kowalke wrote:
> works for me
> did you recompile boost.context with BOOST_USE_VALGRIND?

You should probably have an ABI compatibility namespace if defining this
differently on each end causes ABI breakage.


_______________________________________________
Boost-users mailing list
Boost...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Jason Mancini

Jun 21, 2016, 9:16:18 PM
to boost...@lists.boost.org

> You should probably have an ABI compatibility namespace if defining this
> differently on each end causes ABI breakage.


For instance, we have 45 boost installations (boost version * gcc versions * 32/64 bit * options), and 30 valgrind installations.  That would require 1350 boost+valgrind installations to generate all possible debug cases, hahaha.


One ABI solution would be to leave the valgrind_stack_id declared in the struct at all times (or a field of equivalent size).  At least the ABI wouldn't change.


#if defined(BOOST_USE_VALGRIND)
  unsigned valgrind_stack_id;
#else
  unsigned unused_valgrind_stack_id;
#endif
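As a rough check of that idea, here is a sketch with two hypothetical structs that represent what the preprocessor would produce in each configuration. The struct names and the size/sp member types are assumptions for illustration, not the real Boost.Context definitions. Because the field is always present, only its name changes, and the size, alignment and member offsets stay the same.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result of preprocessing with BOOST_USE_VALGRIND defined.
struct stack_context_with_valgrind {
    std::size_t size;
    void*       sp;
    unsigned    valgrind_stack_id;
};

// Hypothetical result of preprocessing without the define:
// the field is kept as a placeholder of the same type.
struct stack_context_without_valgrind {
    std::size_t size;
    void*       sp;
    unsigned    unused_valgrind_stack_id;
};

// Same size and alignment in both configurations.
static_assert(sizeof(stack_context_with_valgrind) ==
              sizeof(stack_context_without_valgrind),
              "size differs between configurations");
static_assert(alignof(stack_context_with_valgrind) ==
              alignof(stack_context_without_valgrind),
              "alignment differs between configurations");

// The placeholder sits at the same offset as the real field.
inline void check_placeholder_offset() {
    assert(offsetof(stack_context_with_valgrind, valgrind_stack_id) ==
           offsetof(stack_context_without_valgrind, unused_valgrind_stack_id));
}
```

The cost is a few unused bytes per stack_context in non-valgrind builds; whether that trade-off is acceptable is a decision for the library author.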


My solution is to force the define in the 7 header files that use it, so that it is always enabled for the library/installation build and for all end-user code.  ABI class layout mismatches are horrible to debug; we've been through that before.


It would be nice if the installed *.so was not dependent on valgrind.h when BOOST_USE_VALGRIND is in play.


Jason


Gavin Lambert

Jun 21, 2016, 11:40:57 PM
to boost...@lists.boost.org
On 22/06/2016 13:16, Jason Mancini wrote:
>> You should probably have an ABI compatibility namespace if defining this
>> differently on each end causes ABI breakage.
>
> For instance, we have 45 boost installations (boost version * gcc
> versions * 32/64 bit * options), and 30 valgrind installations. That
> would require 1350 boost+valgrind installations to generate all possible
> debug cases, hahaha.
[...]
> My solution is to force the define in the 7 header files that use it,
> such that it is always enabled, for the library/installation build, and
> all end-user code. Because ABI class layout mismatches are horrible to
> debug, we've been through that before.

That's the point of defining ABI namespaces (in Boost.Context, not in
your code). If the settings are mismatched then you'll get a link error
instead of UB -- provided that the library author has correctly
identified all the settings that change the ABI, of course.
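One common way to get that effect is an inline namespace whose name depends on the ABI-affecting setting. The sketch below is an illustration under assumptions: demo_context, DEMO_USE_VALGRIND, DEMO_ABI_NAMESPACE and the functions are all made-up names, and nothing here is taken from Boost.Context itself. The inline namespace name becomes part of every mangled symbol declared inside it, so a caller built with one setting cannot resolve symbols from a library built with the other.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// DEMO_USE_VALGRIND is a hypothetical stand-in for an ABI-affecting define.
#if defined(DEMO_USE_VALGRIND)
#  define DEMO_ABI_NAMESPACE abi_valgrind
#else
#  define DEMO_ABI_NAMESPACE abi_plain
#endif

// Two-step stringification so the macro is expanded before being quoted.
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

namespace demo_context {
inline namespace DEMO_ABI_NAMESPACE {

struct stack_context {
    std::size_t size = 0;
    void*       sp   = nullptr;
#if defined(DEMO_USE_VALGRIND)
    unsigned    valgrind_stack_id = 0;  // changes the layout
#endif
};

// In a real library this would be defined in the shared object; its mangled
// name contains the inline namespace, so a mismatched build fails to link.
// It is defined inline here only to keep the sketch self-contained.
inline std::size_t stack_size(const stack_context& sctx) {
    return sctx.size;
}

// Reports which ABI namespace this translation unit was compiled against.
inline const char* abi_tag() {
    return DEMO_STR(DEMO_ABI_NAMESPACE);
}

inline bool built_with_valgrind_abi() {
    return std::strcmp(abi_tag(), "abi_valgrind") == 0;
}

}  // inline namespace DEMO_ABI_NAMESPACE
}  // namespace demo_context
```

User code still writes demo_context::stack_context as usual; only the linker sees the extra namespace component.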