[Python] gRPC hello world example fails on HPC cluster


lee ming

Oct 1, 2021, 11:29:29 AM
to grpc.io

Hi,

I am running into a really odd issue where the following line (taken from the hello world greeter_server.py example) freezes when it is executed on an HPC cluster.

server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))

The following stack trace shows where the greeter_server program gets stuck inside the gRPC Python library:

Thread 204411 (active): "MainThread"
    __init__ (grpc/_server.py:958)
    create_server (grpc/_server.py:1003)
    server (grpc/__init__.py:2064)
    serve (greeter_server.py:31)
    <module> (greeter_server.py:40)

(Apologies for the poor formatting)

The weird thing about this issue is that
  • The hello world example (greeter_server + greeter_client) works fine locally.
  • I attempted to debug by setting the environment variables "GRPC_VERBOSITY=DEBUG GRPC_TRACE=all" (and confirmed that they work locally), but nothing gets printed while running on the HPC cluster. Furthermore, no exception is thrown; the program simply freezes with the above stack trace.
I'm wondering if anyone has run into this issue before. I have spoken to the administrator of the HPC cluster, and as far as I know there are no specific restrictions preventing gRPC servers from being hosted.
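
One way to rule out the job scheduler dropping the GRPC_VERBOSITY and GRPC_TRACE variables is to set them from inside the script itself. The snippet below is only a minimal sketch: the helper name `enable_grpc_debug_logging` is made up for illustration, and it assumes the variables need to be in place before `grpc` is first imported.

```python
import os


def enable_grpc_debug_logging(env=None):
    """Set gRPC's logging variables in `env` (os.environ by default)."""
    env = os.environ if env is None else env
    env["GRPC_VERBOSITY"] = "DEBUG"
    env["GRPC_TRACE"] = "all"
    return env


# Assumption: this runs at the very top of greeter_server.py, so the
# variables are already set when the gRPC library initializes.
enable_grpc_debug_logging()
# import grpc  # the example's normal imports would follow here
```

If the cluster run still prints nothing with this in place, the variables are probably not the problem and the hang happens before any tracing output is produced.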

Thanks in advance!
   

Lidi Zheng

Oct 6, 2021, 1:59:26 PM
to grpc.io
A gRPC Python server spawns worker threads, and each incoming request is routed to one of the workers. The main thread is expected to block in the `wait_for_termination` call.
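
For reference, the expected lifecycle looks roughly like the sketch below. It is modeled on the hello world example rather than copied from it: the servicer registration is left as a comment, and the default address is an assumption.

```python
from concurrent import futures


def serve(address="[::]:50051", max_workers=10):
    """Start a gRPC server and block the calling thread until it stops."""
    # Imported lazily so this sketch can be loaded without grpcio installed.
    import grpc

    # Worker threads from this pool execute the RPC handlers.
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=max_workers))
    # A real server registers its servicer here, e.g. the example's
    # helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server).
    server.add_insecure_port(address)
    server.start()  # returns immediately; requests run on the worker threads
    server.wait_for_termination()  # the main thread blocks here by design
```

A main thread parked in `wait_for_termination` is therefore normal. The stack trace in the original report, however, stops inside `grpc.server(...)`, before `start()` or `wait_for_termination()` is ever reached.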

On the other hand, I have little expertise with HPC clusters, so I'm not sure what is blocking here. If you have access to a debugger like gdb, you can peek at which function it was executing last.
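
As a rough sketch of what that could look like, the helper below attaches gdb to the hung process in batch mode and prints the native backtrace of every thread. The function names are made up for illustration, and it assumes gdb is installed on the compute node and that you are allowed to attach to your own process.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb invocation that dumps all thread backtraces and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_native_backtraces(pid):
    """Run gdb against `pid` and return whatever it printed."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero if attaching is not permitted
    )
    return result.stdout + result.stderr
```

Calling `dump_native_backtraces(pid)` with the process ID of the frozen greeter_server should show the C-level frames underneath the `grpc/_server.py` line where the Python trace stops.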