crash during process shutdown grpc-0.12, c++

Yaz Saito

Jan 26, 2016, 1:18:04 AM

to grp...@googlegroups.com, Abhishek Parmar

I'm seeing occasional crashes of the below form:

  pure virtual method called
  terminate called without an active exception
  *** Aborted at 1453786187 (unix time) try "date -d @1453786187" if you are using GNU date ***
  PC: @ 0x7fb8067cfcc9 (unknown)
  *** SIGABRT (@0x3e8000008b2) received by PID 2226 (TID 0x7fb803de3700) from PID 2226; stack trace: ***
  @ 0x7fb8070d91bf (unknown)
  @ 0x47406d grpc::Server::RunRpc()
  @ 0x47d031 grpc::DynamicThreadPool::ThreadFunc()
  @ 0x47d1a8 grpc::DynamicThreadPool::DynamicThread::ThreadFunc()
  @ 0x7fb80712ba40 (unknown)
  @ 0x7fb807fec182 start_thread
  @ 0x7fb80689347d (unknown)
  Aborted (core dumped)

If we change the following code in cpp/server/server.cc,

  static void InitGlobalCallbacks() {
    if (g_callbacks == nullptr) {
      static DefaultGlobalCallbacks default_global_callbacks;
      g_callbacks = &default_global_callbacks;
    }
  }

to

  static void InitGlobalCallbacks() {
    if (g_callbacks == nullptr) {
      static DefaultGlobalCallbacks* default_global_callbacks =
          new DefaultGlobalCallbacks;
      g_callbacks = default_global_callbacks;
    }
  }

the problem disappears. Is the current code intentional? I don't see much benefit in destroying the callback object while the RPC system itself may still be live.
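A minimal sketch of the leak-on-purpose pattern proposed above, using hypothetical stand-in types (`Callbacks`, `CountingCallbacks`, and `LeakedCallbacks` are illustrative names, not gRPC's). The object is heap-allocated once and never deleted, so no destructor runs for it during static teardown. A worker thread that outlives `main` can therefore keep calling its virtual methods. With a function-local static object instead, the destructor would run at exit, and a concurrent virtual call on the half-destroyed object is a plausible source of the "pure virtual method called" abort.

```cpp
#include <cassert>

// Hypothetical stand-in for a callbacks interface; not the real gRPC type.
struct Callbacks {
  virtual ~Callbacks() = default;
  virtual int PreSynchronousRequest() { return 1; }
};

// Hypothetical implementation that counts live instances so the
// "never destroyed" property can be observed.
struct CountingCallbacks : Callbacks {
  static int live;  // number of currently constructed instances
  CountingCallbacks() { ++live; }
  ~CountingCallbacks() override { --live; }
};
int CountingCallbacks::live = 0;

// Returns a process-wide instance that is intentionally leaked.
// The pointer itself is a function-local static (initialized once, in a
// thread-safe way since C++11), but the pointee is never deleted, so
// its destructor does not run when static objects are torn down at exit.
Callbacks* LeakedCallbacks() {
  static Callbacks* const instance = new CountingCallbacks;
  assert(instance != nullptr);
  return instance;
}
```

The cost is one small allocation that leak checkers may report at exit; the benefit is that shutdown ordering no longer matters for this object.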

--

yaz

Abhishek Parmar

Jan 26, 2016, 12:49:59 PM

to Yaz Saito, grp...@googlegroups.com

In addition to this, after upgrading to 0.12 we are seeing server shutdown hang, with num_running_cb_ never going down to zero. This seems like a regression. Let us know if you need more information.

--

-Abhishek

Craig Tiller

Jan 26, 2016, 5:34:51 PM

to Abhishek Parmar, Yaz Saito, grp...@googlegroups.com

Hey... I would appreciate more information on the server shutdown problem.

Yaz - would something like this fix the problem you're seeing?

We're keen to have no memory leaks at the end, which is why there's some (implicit) cleanup path.


Abhishek Parmar

Jan 26, 2016, 6:13:31 PM

to Craig Tiller, Yaz Saito, grp...@googlegroups.com

On Tue, Jan 26, 2016 at 2:34 PM, Craig Tiller <cti...@google.com> wrote:

Hey... I would appreciate more information on the server shutdown problem.

We have some code like the below that seems to get stuck (after upgrading to 0.12) at server->Shutdown(), waiting on the condvar in server.cc for num_running_cb_ to become 0. It does not happen in all of our server-based tests, but it happens very frequently in one small test.

...

  grpc_init();
  grpc::ServerBuilder builder;
  address = c3d::StringPrintf("0.0.0.0:%d", kPort);
  builder.AddListeningPort(address, grpc::InsecureServerCredentials());
  builder.RegisterService(alerts);
  std::unique_ptr<grpc::Server> server(builder.BuildAndStart());
  LOG(INFO)  << "GRPC ALERTS listening on " << address;

  std::unique_ptr<ALERTS::Stub> stub = c3d::alerts::ALERTS::NewStub(
      grpc::CreateChannel(address, grpc::InsecureCredentials()));
  kStub = stub.get();

  int retval = RUN_ALL_TESTS();

  server->Shutdown();
  server->Wait();
  // Delete server and client before grpc_shutdown
  server.reset(nullptr);
  stub.reset(nullptr);
  grpc_shutdown();

...

Yaz - would something like this fix the problem you're seeing?

Yes something like that fixes the problem that Yaz mentioned.

--

-Abhishek

Craig Tiller

Jan 26, 2016, 8:10:53 PM

to Abhishek Parmar, Yaz Saito, grp...@googlegroups.com

We've merged that onto master. Will see that it's backported tomorrow to 0.12.

Craig Tiller

Jan 26, 2016, 8:11:33 PM

to Abhishek Parmar, Yaz Saito, grp...@googlegroups.com

I'll try to reproduce the other problem tomorrow also.

Yaz Saito

Jan 27, 2016, 1:19:48 PM

to Craig Tiller, Abhishek Parmar, grp...@googlegroups.com

On Tue, Jan 26, 2016 at 2:34 PM, Craig Tiller <cti...@google.com> wrote:

Hey... I would appreciate more information on the server shutdown problem.

Yaz - would something like this fix the problem you're seeing?

No, this won't work. The problem is that we can't have this object go away while the RPC system is still running. So we either need to force a clean shutdown of the RPC system before the destructor of global_callbacks runs, or tell people to call _exit rather than exit, or just leak global_callbacks. I personally think the last option is the best.

--

yaz

Craig Tiller

Jan 27, 2016, 1:31:15 PM

to Yaz Saito, Abhishek Parmar, grp...@googlegroups.com

So the shared pointer (held by server) prevents the actual callback object from going away, right?

The g_callbacks reference will drop during the atexit chain. It'll fail if you try to create a server after atexit, but otherwise we should be fine.
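The ownership argument above can be sketched as follows. This is an illustration under stated assumptions, not gRPC's actual code: `Callbacks`, `Server`, and `CallbacksSurviveGlobalReset` are hypothetical names, and the local `global` pointer stands in for g_callbacks. The server copies the `std::shared_ptr` at construction, so dropping the global reference (as the atexit chain would) does not destroy the callbacks object while a server still holds it.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical callbacks object that records its own destruction.
struct Callbacks {
  explicit Callbacks(bool* destroyed) : destroyed_(destroyed) {}
  ~Callbacks() { *destroyed_ = true; }
  bool* destroyed_;
};

// Hypothetical server that shares ownership of the callbacks.
class Server {
 public:
  explicit Server(std::shared_ptr<Callbacks> cb) : callbacks_(std::move(cb)) {}
  bool HasCallbacks() const { return callbacks_ != nullptr; }

 private:
  std::shared_ptr<Callbacks> callbacks_;
};

// Returns true if the callbacks object stays alive after the global
// reference is dropped and is destroyed only once the server is gone.
bool CallbacksSurviveGlobalReset() {
  bool destroyed = false;
  std::shared_ptr<Callbacks> global = std::make_shared<Callbacks>(&destroyed);
  {
    Server server(global);
    global.reset();  // stands in for the atexit chain dropping g_callbacks
    if (destroyed) return false;  // would mean the server's copy did not keep it alive
    assert(server.HasCallbacks());
  }
  return destroyed;  // the server released the last reference at scope exit
}
```

This only protects servers that took their copy before the global reference was dropped, which matches the caveat above about creating a server after atexit.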
