[C++] gRPC shutdown sequence

Arpit Baldeva

Sep 20, 2017, 5:35:13 PM
to grpc.io
Hi,

I see an occasional crash when shutting down my server (version 1.4.2, VS 2015, Windows).

My setup: I use the async API and have 2 threads.
1. Thread 1 processes the tags from the completion queue. It also executes the shutdown request.
2. Thread 2 pulls the tags from the completion queue and queues them for thread 1 to process.

After running for a while, on thread 1, I call the server shutdown followed by the completion queue shutdown. The code looks like this:

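    // Stop the server; calls still pending at the deadline are terminated.
    if (mServer != nullptr)
        mServer->Shutdown(gpr_time_add(
            gpr_now(GPR_CLOCK_MONOTONIC),
            gpr_time_from_micros((deadline - TimeValue::getTimeOfDay()).getMicroSeconds(), GPR_TIMESPAN)));

    // Then shut down the completion queue.
    if (mCQ != nullptr)
        mCQ->Shutdown();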

Meanwhile, thread 2 is doing the following:

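    while (true)
    {
        // Next() returns false once the queue is shut down and fully drained.
        if (!mCQ->Next((void**)&tagInfo.tagProcessor, &tagInfo.ok))
        {
            break;
        }
        // (the tag is queued for thread 1 here; omitted in this snippet)
    }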


So at some point, thread 2 notices that the completion queue has been shut down and exits the loop.
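As an illustration of the hand-off described above, here is a minimal sketch that uses only the standard library and no gRPC. `ModelQueue` and `RunModel` are hypothetical names invented for this example; `ModelQueue::Next` only imitates the documented behavior of `CompletionQueue::Next` (it returns false once the queue is shut down and fully drained). The sketch shows that tags forwarded by the polling thread can still be waiting for the processing thread after the polling loop has exited.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a completion queue (not the gRPC class).
class ModelQueue {
 public:
  void Push(int tag) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      items_.push_back(tag);
    }
    cv_.notify_one();
  }

  void Shutdown() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      shutdown_ = true;
    }
    cv_.notify_all();
  }

  // Blocks until a tag is available. Returns false only when the queue has
  // been shut down and every queued tag has already been handed out.
  bool Next(int* tag) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return shutdown_ || !items_.empty(); });
    if (items_.empty()) return false;
    *tag = items_.front();
    items_.pop_front();
    return true;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<int> items_;
  bool shutdown_ = false;
};

// Runs the two-thread hand-off and returns the tags that the processing
// side still had to handle after the polling thread had exited.
std::vector<int> RunModel(int tag_count) {
  ModelQueue cq;
  std::mutex handoff_mu;
  std::deque<int> handoff;  // tags forwarded from the poller ("thread 2")

  for (int i = 0; i < tag_count; ++i) cq.Push(i);

  std::thread poller([&] {
    int tag = 0;
    while (cq.Next(&tag)) {
      std::lock_guard<std::mutex> lock(handoff_mu);
      handoff.push_back(tag);
    }
  });

  cq.Shutdown();  // the processing side ("thread 1") requests shutdown
  poller.join();  // the polling loop has now seen Next() return false

  // Everything left in the hand-off buffer is processed after the poller
  // has exited, which is the situation described in the post.
  std::vector<int> processed_after_exit;
  std::lock_guard<std::mutex> lock(handoff_mu);
  while (!handoff.empty()) {
    processed_after_exit.push_back(handoff.front());
    handoff.pop_front();
  }
  return processed_after_exit;
}
```

In this model, any cleanup tied to a forwarded tag necessarily runs after the queue has reported that it is drained, so whatever that cleanup depends on has to stay alive until the hand-off buffer is empty as well.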

Thread 1 continues to process the tags that were received previously. One of them happened to be the 'done' tag of an earlier RPC, and in response to it I destroy the RPC object (my own class). This destroys the ServerContext held by that object, and I see the following crash:

  ntdll.dll!00007ffe6f7ebbdf() Unknown
  ntdll.dll!00007ffe6f7c4571() Unknown
  ntdll.dll!00007ffe6f7c4490() Unknown
  gpr_mu_lock(gpr_mu * mu) Line 53 C
  interned_slice_destroy(interned_slice_refcount * s) Line 94 C
  interned_slice_unref(grpc_exec_ctx * exec_ctx, void * p) Line 112 C
  grpc_slice_unref_internal(grpc_exec_ctx * exec_ctx, grpc_slice slice) Line 76 C
  destroy_channel_elem(grpc_exec_ctx * exec_ctx, grpc_channel_element * elem) Line 926 C
  grpc_channel_stack_destroy(grpc_exec_ctx * exec_ctx, grpc_channel_stack * stack) Line 166 C
  destroy_channel(grpc_exec_ctx * exec_ctx, void * arg, grpc_error * error) Line 383 C
  grpc_exec_ctx_flush(grpc_exec_ctx * exec_ctx) Line 85 C
  grpc_exec_ctx_finish(grpc_exec_ctx * exec_ctx) Line 100 C
  grpc_call_unref(grpc_call * c) Line 579 C
  grpc::ServerContext::~ServerContext() Line 157 C++
Is it possible that the shutdown sequence is just a red herring and the bug is something else? Or is my shutdown sequence wrong (that is, am I not allowed to delete a ServerContext after shutting down the server and the completion queue)?

Any tips appreciated. 

Thanks.


Yang Gao

Sep 20, 2017, 5:49:53 PM
to Arpit Baldeva, grpc.io
Did you destroy mServer and mCQ before destroying the RPC? If you keep either one alive until after all the RPCs are destroyed, does it still crash?



--
You received this message because you are subscribed to the Google Groups "grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email to grpc-io+unsubscribe@googlegroups.com.
To post to this group, send email to grp...@googlegroups.com.
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/2e7f038e-8d19-4989-8f2a-8e09536afb33%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Arpit Baldeva

Sep 20, 2017, 7:07:30 PM
to grpc.io
I destroyed both mServer and mCQ. The 'done' tag/event that caused the RPC destruction (and in turn the ServerContext destruction) came later.

Destroying all RPCs before the server and completion queue would be a fair bit of change for me, so I only want to do that if what I am doing now is actually wrong.

Thanks. 
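If it does turn out that per-RPC state has to be released before the server and completion queue (the thread does not settle this), one low-effort way to enforce the order is to rely on C++ destroying class members in reverse order of declaration. The sketch below is only a model: `FakeServer`, `FakeQueue`, `FakeRpc`, `ServiceOwner`, and `DestroyOwnerAndGetLog` are hypothetical stand-ins rather than gRPC types, and the relative order of the server and queue stand-ins is an assumption made for the illustration.

```cpp
#include <memory>
#include <string>
#include <vector>

// Records the order in which the stand-in objects are destroyed.
std::vector<std::string>& DestructionLog() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-ins; each one only logs its own destruction.
struct FakeQueue  { ~FakeQueue()  { DestructionLog().push_back("queue"); } };
struct FakeServer { ~FakeServer() { DestructionLog().push_back("server"); } };
struct FakeRpc    { ~FakeRpc()    { DestructionLog().push_back("rpc"); } };

// Members are destroyed in reverse declaration order, so rpcs_ (declared
// last) is released first, then server_, then queue_.
class ServiceOwner {
 public:
  ServiceOwner() : queue_(new FakeQueue), server_(new FakeServer) {
    rpcs_.push_back(std::unique_ptr<FakeRpc>(new FakeRpc));
    rpcs_.push_back(std::unique_ptr<FakeRpc>(new FakeRpc));
  }

 private:
  std::unique_ptr<FakeQueue> queue_;
  std::unique_ptr<FakeServer> server_;
  std::vector<std::unique_ptr<FakeRpc>> rpcs_;
};

// Creates and destroys an owner, then returns the recorded order.
std::vector<std::string> DestroyOwnerAndGetLog() {
  DestructionLog().clear();
  {
    ServiceOwner owner;
  }  // owner goes out of scope here and its members are destroyed
  return DestructionLog();
}
```

With this layout the owning class needs no explicit teardown code for the ordering itself; any RPC objects still held when the owner goes away are released before the server and queue stand-ins.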