Help with CompletionQueue closedown

9 views
Skip to first unread message

Colm Quinn

unread,
Jun 7, 2024, 2:28:36 PMJun 7
to grpc.io
I'm having an issue trying to close down a CQ.   Would appreciate some ideas or suggestions.  It looks similar to     Completion queue never shutdown · Issue #23764 · grpc/grpc · GitHub

This is a Win64 dev environment.  No rpc has been run.

I'm calling Shutdown and wait for the QUEUE_COMPLETE event to arrive.  Then I'm exiting the thread, and deleting the object containing the CQ.    However, the gRPC logging show that threads are continuing to run and a segfault with corrupted backtrace (looks like a branch to 0 if my rusty assembly holds).   I've played with delays - but the seg fault continues.

CQ Code is basically:

  while(true) {

    std::chrono::system_clock::time_point deadline =
         std::chrono::system_clock::now() + std::chrono::seconds(interval);

     ok = true;
     
    CompletionQueue::NextStatus status = m_cq->AsyncNext(&tag, &ok, deadline);

    if (status == CompletionQueue::NextStatus::SHUTDOWN) {
 /* Hummmmm ... ignore for now */
 inshutdown = true;
 log( "CQ shutdown received\n");
 break;    //  --> and return from thread
    }

    if (m_stop == true) {                     // set just before Shutdown all
 interval = 1;     /* Drain the CQ - pick up the pace */
 continue;
     }

    if (status == CompletionQueue::NextStatus::GOT_EVENT) {
         ... do stuff


The logs look like ....  I see the QUEUE_SHUTDOWN arriving but something is continuing to run a timer.

I0605 17:42:14.340000000 24608 connectivity_state.cc:182] ConnectivityStateTracker client_channel[000001E5713B5520]: get current state: READY
2024-06-05 17:42:14.336697 95.64% [NOTICE] mod_grpc.c:66 gRPC Channel 192.168.1.195:50051 state 2
2024-06-05 17:42:14.336697 95.64% [NOTICE] mod_grpc.c:263 gRPC channel created, profile->name routetest, uri 192.168.1.195:50051

Load dll and start a Completion Queue


I0605 17:42:14.826000000 24236 timer_manager.cc:207] wait ended: was_timed:1 kicked:0
D0605 17:42:14.827000000 24236 timer_generic.cc:696] TIMER CHECK BEGIN: now=3038 next=9223372036854775807 tls_min=2032 glob_min=3037
D0605 17:42:14.827000000 24236 timer_generic.cc:596]   .. shard[0]->min_deadline = 3037
D0605 17:42:14.827000000 24236 timer_generic.cc:512]   .. shard[0]: heap_empty=true
D0605 17:42:14.827000000 24236 timer_generic.cc:483]   .. shard[0]->queue_deadline_cap --> 4037
D0605 17:42:14.827000000 24236 timer_generic.cc:556]   .. shard[0] popped 0
D0605 17:42:14.827000000 24236 timer_generic.cc:615]   .. result --> 1, shard[0]->min_deadline 3037 --> 4038, now=3038
D0605 17:42:14.828000000 24236 timer_generic.cc:512]   .. shard[1]: heap_empty=true
D0605 17:42:14.828000000 24236 timer_generic.cc:483]   .. shard[1]->queue_deadline_cap --> 4038
D0605 17:42:14.828000000 24236 timer_generic.cc:556]   .. shard[1] popped 0
D0605 17:42:14.828000000 24236 timer_generic.cc:615]   .. result --> 1, shard[1]->min_deadline 3037 --> 4039, now=3038
D0605 17:42:14.828000000 24236 timer_generic.cc:512]   .. shard[2]: heap_empty=true
D0605 17:42:14.829000000 24236 timer_generic.cc:483]   .. shard[2]->queue_deadline_cap --> 4037
D0605 17:42:14.829000000 24236 timer_generic.cc:556]   .. shard[2] popped 0
D0605 17:42:14.829000000 24236 timer_generic.cc:615]   .. result --> 1, shard[2]->min_deadline 3037 --> 4038, now=3038
D0605 17:42:14.829000000 24236 timer_generic.cc:512]   .. shard[3]: heap_empty=true
D0605 17:42:14.830000000 24236 timer_generic.cc:483]   .. shard[3]->queue_deadline_cap --> 4038
D0605 17:42:14.830000000 24236 timer_generic.cc:556]   .. shard[3] popped 0
D0605 17:42:14.830000000 24236 timer_generic.cc:615]   .. result --> 1, shard[3]->min_deadline 3037 --> 4039, now=3038
D0605 17:42:14.830000000 24236 timer_generic.cc:512]   .. shard[4]: heap_empty=true
D0605 17:42:14.830000000 24236 timer_generic.cc:483]   .. shard[4]->queue_deadline_cap --> 4037
D0605 17:42:14.830000000 24236 timer_generic.cc:556]   .. shard[4] popped 0
D0605 17:42:14.830000000 24236 timer_generic.cc:615]   .. result --> 1, shard[4]->min_deadline 3037 --> 4038, now=3038
D0605 17:42:14.830000000 24236 timer_generic.cc:512]   .. shard[5]: heap_empty=true
D0605 17:42:14.830000000 24236 timer_generic.cc:483]   .. shard[5]->queue_deadline_cap --> 4037
D0605 17:42:14.830000000 24236 timer_generic.cc:556]   .. shard[5] popped 0
D0605 17:42:14.830000000 24236 timer_generic.cc:615]   .. result --> 1, shard[5]->min_deadline 3037 --> 4038, now=3038
D0605 17:42:14.830000000 24236 timer_generic.cc:512]   .. shard[6]: heap_empty=true
D0605 17:42:14.830000000 24236 timer_generic.cc:483]   .. shard[6]->queue_deadline_cap --> 4037
D0605 17:42:14.830000000 24236 timer_generic.cc:556]   .. shard[6] popped 0
D0605 17:42:14.830000000 24236 timer_generic.cc:615]   .. result --> 1, shard[6]->min_deadline 3037 --> 4038, now=3038
D0605 17:42:14.830000000 24236 timer_generic.cc:512]   .. shard[7]: heap_empty=true
D0605 17:42:14.830000000 24236 timer_generic.cc:483]   .. shard[7]->queue_deadline_cap --> 4037
D0605 17:42:14.831000000 24236 timer_generic.cc:556]   .. shard[7] popped 0
D0605 17:42:14.831000000 24236 timer_generic.cc:615]   .. result --> 1, shard[7]->min_deadline 3037 --> 4038, now=3038
D0605 17:42:14.831000000 24236 timer_generic.cc:722] TIMER CHECK END: r=1; next=4038 tls_min=3037 glob_min=4038

D0605 17:42:26.852000000 24236 timer_generic.cc:483]   .. shard[1]->queue_deadline_cap --> 16062
D0605 17:42:26.852000000 24236 timer_generic.cc:556]   .. shard[1] popped 0
D0605 17:42:26.852000000 24236 timer_generic.cc:615]   .. result --> 1, shard[1]->min_deadline 15062 --> 16063, now=15062
D0605 17:42:26.852000000 24236 timer_generic.cc:722] TIMER CHECK END: r=1; next=16047
I0605 17:42:26.852000000 24236 timer_manager.cc:191] sleep for a 985 milliseconds
I0605 17:42:31.028000000 24236 timer_manager.cc:207] wait ended: was_timed:1 kicked:0

Call CQ->Shutdown()

I0605 17:42:31.028000000 24608 completion_queue.cc:1398] grpc_completion_queue_shutdown(cq=000001E571FF0360)
D0605 17:42:31.029000000 24236 timer_generic.cc:696] TIMER CHECK BEGIN: now=19240 next=9223372036854775807 tls_min=15062 glob_min=16047
I0605 17:42:31.029000000 24608 src/core/lib/gprpp/ref_counted.h:85] completion_queue:000001E571FF0360 C:\Users\colm_\source\repos\grpc\src\core\lib\surface\completion_queue.cc:1101 ref 3 -> 4 shutting_down
D0605 17:42:31.029000000 24236 timer_generic.cc:596]   .. shard[3]->min_deadline = 16047

I0605 17:42:31.029000000 24608 src/core/lib/gprpp/ref_counted.h:181] completion_queue:000001E571FF0360 C:\Users\colm_\source\repos\grpc\src\core\lib\surface\completion_queue.cc:1116 unref 4 -> 3 shutting_down


D0605 17:42:31.029000000 25052 exec_ctx.cc:36] running closure 000001E571FF0460: created [C:\Users\colm_\source\repos\grpc\src\core\lib\surface\completion_queue.cc:556]: scheduled [C:\Users\colm_\source\repos\grpc\src\core\lib\iomgr\pollset_windows.cc:150]
D0605 17:42:31.029000000 24236 timer_generic.cc:512]   .. shard[3]: heap_empty=false
I0605 17:42:31.030000000 25052 src/core/lib/gprpp/ref_counted.h:181] completion_queue:000001E571FF0360 C:\Users\colm_\source\repos\grpc\src\core\lib\surface\completion_queue.cc:616 unref 3 -> 2 pollset_destroy
D0605 17:42:31.030000000 24236 timer_generic.cc:525]   .. check top timer deadline=16047 now=19240
D0605 17:42:31.030000000 25052 exec_ctx.cc:48] closure 000001E571FF0460 finished
D0605 17:42:31.030000000 24236 timer_generic.cc:532] TIMER 000001E57301ADB0: FIRE 3193ms late
I0605 17:42:31.030000000 25052 completion_queue.cc:1069] RETURN_EVENT[000001E571FF0360]: QUEUE_SHUTDOWN

D0605 17:42:31.030000000 24236 timer_generic.cc:512]   .. shard[3]: heap_empty=true
I0605 17:42:31.030000000 25052 src/core/lib/gprpp/ref_counted.h:181] completion_queue:000001E571FF0360 C:\Users\colm_\source\repos\grpc\src\core\lib\surface\completion_queue.cc:1070 unref 2 -> 1 next
D0605 17:42:31.030000000 24236 timer_generic.cc:483]   .. shard[3]->queue_deadline_cap --> 20240
D0605 17:42:31.030000000 24236 timer_generic.cc:556]   .. shard[3] popped 1
D0605 17:42:31.030000000 24236 timer_generic.cc:615]   .. result --> 2, shard[3]->min_deadline 16047 --> 20241, now=19240
D0605 17:42:31.030000000 24236 timer_generic.cc:512]   .. shard[7]: heap_empty=true
2024-06-05 17:42:31.028548 0000000000000000000000000000000000000000000000000000 [NOTICE] grpc_channel.cpp:291 Completion queue ending 0
D0605 17:42:31.030000000 24236 timer_generic.cc:483]   .. shard[7]->queue_deadline_cap --> 20240
D0605 17:42:31.031000000 24236 timer_generic.cc:556]   .. shard[7] popped 0
D0605 17:42:31.032000000 24236 timer_generic.cc:615]   .. result --> 2, shard[7]->min_deadline 16054 --> 20241, now=19240
D0605 17:42:31.032000000 24236 timer_generic.cc:512]   .. shard[6]: heap_empty=true
D0605 17:42:31.032000000 24236 timer_generic.cc:483]   .. shard[6]->queue_deadline_cap --> 20240
D0605 17:42:31.032000000 24236 timer_generic.cc:556]   .. shard[6] popped 0
D0605 17:42:31.032000000 24236 timer_generic.cc:615]   .. result --> 2, shard[6]->min_deadline 16054 --> 20241, now=19240
D0605 17:42:31.032000000 24236 timer_generic.cc:512]   .. shard[5]: heap_empty=true
D0605 17:42:31.033000000 24236 timer_generic.cc:483]   .. shard[5]->queue_deadline_cap --> 20240
D0605 17:42:31.033000000 24236 timer_generic.cc:556]   .. shard[5] popped 0
D0605 17:42:31.033000000 24236 timer_generic.cc:615]   .. result --> 2, shard[5]->min_deadline 16054 --> 20241, now=19240
D0605 17:42:31.033000000 24236 timer_generic.cc:512]   .. shard[4]: heap_empty=true
D0605 17:42:31.033000000 24236 timer_generic.cc:483]   .. shard[4]->queue_deadline_cap --> 20240
D0605 17:42:31.034000000 24236 timer_generic.cc:556]   .. shard[4] popped 0
D0605 17:42:31.034000000 24236 timer_generic.cc:615]   .. result --> 2, shard[4]->min_deadline 16054 --> 20241, now=19240
D0605 17:42:31.034000000 24236 timer_generic.cc:512]   .. shard[2]: heap_empty=true
D0605 17:42:31.034000000 24236 timer_generic.cc:483]   .. shard[2]->queue_deadline_cap --> 20240
D0605 17:42:31.034000000 24236 timer_generic.cc:556]   .. shard[2] popped 0
D0605 17:42:31.034000000 24236 timer_generic.cc:615]   .. result --> 2, shard[2]->min_deadline 16054 --> 20241, now=19240
D0605 17:42:31.035000000 24236 timer_generic.cc:512]   .. shard[0]: heap_empty=true
D0605 17:42:31.035000000 24236 timer_generic.cc:483]   .. shard[0]->queue_deadline_cap --> 20240
D0605 17:42:31.035000000 24236 timer_generic.cc:556]   .. shard[0] popped 0
D0605 17:42:31.035000000 24236 timer_generic.cc:615]   .. result --> 2, shard[0]->min_deadline 16054 --> 20241, now=19240
D0605 17:42:31.035000000 24236 timer_generic.cc:512]   .. shard[1]: heap_empty=true
D0605 17:42:31.035000000 24236 timer_generic.cc:483]   .. shard[1]->queue_deadline_cap --> 20240
D0605 17:42:31.036000000 24236 timer_generic.cc:556]   .. shard[1] popped 0
D0605 17:42:31.036000000 24236 timer_generic.cc:615]   .. result --> 2, shard[1]->min_deadline 16063 --> 20241, now=19240
D0605 17:42:31.036000000 24236 timer_generic.cc:722] TIMER CHECK END: r=2; next=20241
I0605 17:42:31.036000000 24236 timer_manager.cc:127] kick untimed waiter
I0605 17:42:31.036000000 24236 timer_manager.cc:135] flush exec_ctx
I0605 17:42:31.036000000 22224 timer_manager.cc:207] wait ended: was_timed:0 kicked:0
D0605 17:42:31.036000000 24236 exec_ctx.cc:36] running closure 000001E57301ADF0: created [C:\Users\colm_\source\repos\grpc\src\core\client_channel\backup_poller.cc:144]: scheduled [C:\Users\colm_\source\repos\grpc\src\core\lib\iomgr\timer_generic.cc:550]
D0605 17:42:31.036000000 22224 timer_generic.cc:696] TIMER CHECK BEGIN: now=19247 next=9223372036854775807 tls_min=11048 glob_min=20241
D0605 17:42:31.036000000 24236 timer_generic.cc:341] TIMER 000001E57301ADB0: SET 24247 now 19247 call 000001E57301ADF0[000001E575D3F920]
D0605 17:42:31.036000000 22224 timer_generic.cc:722] TIMER CHECK END: r=1; next=20241
D0605 17:42:31.036000000 24236 timer_generic.cc:377]   .. add to shard 3 with queue_deadline_cap=20240 => is_first_timer=false
I0605 17:42:31.036000000 22224 timer_manager.cc:191] sleep for a 994 milliseconds
D0605 17:42:31.036000000 24236 exec_ctx.cc:48] closure 000001E57301ADF0 finished
D0605 17:42:31.036000000 24236 timer_generic.cc:696] TIMER CHECK BEGIN: now=19247 next=9223372036854775807 tls_min=16047 glob_min=20241
D0605 17:42:31.036000000 24236 timer_generic.cc:722] TIMER CHECK END: r=1; next=20241
I0605 17:42:31.036000000 24236 timer_manager.cc:201] sleep until kicked
I0605 17:42:32.040000000 22224 timer_manager.cc:207] wait ended: was_timed:1 kicked:0
D0605 17:42:32.040000000 22224 timer_generic.cc:696] TIMER CHECK BEGIN: now=20251 next=9223372036854775807 tls_min=20241 glob_min=20241

D0605 17:42:32.040000000 22224 timer_generic.cc:596]   .. shard[1]->min_deadline = 20241
D0605 17:42:32.043000000 22224 timer_generic.cc:512]   .. shard[1]: heap_empty=true
D0605 17:42:32.044000000 22224 timer_generic.cc:483]   .. shard[1]->queue_deadline_cap --> 21251





Reply all
Reply to author
Forward
0 new messages