Python Rendezvous exception when running python server, python client and C++ client on the same PC


Alex

Feb 20, 2019, 8:22:34 AM
to grpc.io
Hi,

I've got a Python server and two clients (Python and C++) running on the same machine (all listening and writing to 127.0.0.1:50051) and I'm hitting the following Python exception on my client:

File "C:\Python27\lib\site-packages\grpc\_channel.py", line 547, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "C:\Python27\lib\site-packages\grpc\_channel.py", line 466, in _end_unary_response_blocking
raise _Rendezvous(state, None, None, deadline)
_Rendezvous: <_Rendezvous of RPC that terminated with: 
status = StatusCode.UNKNOWN
details = "Stream removed"

The Python grpc version is 1.17.1
The C++ grpc version is 1.14.1

The execution flow is as follows:

1.- Server starts
2.- Python client starts and creates an insecure_channel on 127.0.0.1:50051. It keeps that channel for the rest of the execution.

Then in a loop:

3.- Python client calls method on Server.
4.- Server runs method and returns.
5.- C++ client starts and creates an insecure_channel on 127.0.0.1:50051. It doesn't communicate with the server.
6.- C++ client finishes so the shared_ptr to the channel created above is released.
7.- Goes back to step 3

I get the exception on step 3/4 above but not every time. It usually happens after 3 or 4 loops from step 3 to step 6. It looks like it's timing dependent. 
When I get the exception, the server sometimes has managed to execute the call and sometimes it doesn't receive the request from the Python client.
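For readers trying to follow the flow, steps 3 to 7 can be sketched as a small harness. This is only an illustration under assumptions: `call_rpc` is a hypothetical zero-argument callable standing in for the real stub call on the long-lived channel, and `NOOP_CHILD` is a placeholder child process for trying the harness without the real C++ client.

```python
import subprocess
import sys
import time

# Placeholder child for trying the harness without the real C++ client.
NOOP_CHILD = [sys.executable, "-c", "pass"]


def run_loop(call_rpc, child_cmd, iterations=10, pause=1.0):
    """Alternate one RPC with one short-lived child process; return failures.

    call_rpc  -- hypothetical zero-argument callable that performs one unary
                 RPC on the channel kept open for the whole run
    child_cmd -- argument list for the child process (the C++ client)
    """
    failures = []
    for i in range(iterations):
        try:
            call_rpc()                              # steps 3-4
        except Exception as exc:                    # e.g. grpc.RpcError
            failures.append((i, exc))
        exit_code = subprocess.call(child_cmd)      # steps 5-6
        if exit_code != 0:
            failures.append((i, RuntimeError("child exited with %d" % exit_code)))
        time.sleep(pause)                           # step 7: go around again
    return failures
```

Recording the iteration index next to each failure makes it easier to see whether the exception really tends to appear after three or four loops.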

I've attached server and python client debug traces. The C++ one is a bit trickier to get as it's run as a subprocess and its output is not available but let me know if you need it and I'll try to get it.

My suspicion is that the Python client channel gets corrupted/closed when the C++ client finishes and presumably destroys its channel. Could that be possible?

Please let me know if you need any more info.

Thanks,
Alex.
CrashWithTraceServer.txt
CrashWithTraceClient.txt

Alex

Feb 20, 2019, 9:57:21 AM
to grpc.io
I should add that the Python client application which owns the Python grpc client is the one that runs the C++ grpc client as a subprocess in case that makes a difference.

Eric Gribkoff

Feb 20, 2019, 2:04:51 PM
to Alex, grpc.io
Can you post the code you're using to reproduce this error? If you're using subprocess.Popen (or otherwise using fork+exec) to start the C++ grpc client process, the C++ client itself cannot be interfering with the Python process. Something could be going wrong in the gRPC core fork handlers, however - you can try running with the environment variable `GRPC_ENABLE_FORK_SUPPORT=0` to disable this feature and see if it fixes the issue.
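A minimal sketch of one way to apply that setting from Python, assuming the variable has to be present in the process environment before gRPC initializes (so it is set for the whole child process rather than changed after `import grpc`). The helper names are hypothetical.

```python
import os
import subprocess
import sys


def env_without_fork_support(base_env=None):
    """Return a copy of the environment with gRPC fork support disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["GRPC_ENABLE_FORK_SUPPORT"] = "0"
    return env


def run_with_fork_support_disabled(cmd):
    """Run cmd (an argument list) under the modified environment."""
    return subprocess.call(cmd, env=env_without_fork_support())


def child_value(env):
    """Sanity check: spawn an interpreter and report the value it sees."""
    code = "import os; print(os.environ.get('GRPC_ENABLE_FORK_SUPPORT'))"
    out = subprocess.check_output([sys.executable, "-c", code], env=env)
    return out.decode().strip()
```

The same variable can also be set in the shell before launching the server and clients, which avoids any question about import order.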

Also, in your step 5 you note that the C++ client isn't communicating with the server. If you remove the fork+exec of a C++ subprocess altogether, do you still see this intermittent exception in the Python client?

Eric


Alex

Feb 21, 2019, 1:28:18 PM
to grpc.io
Hi Eric,

Thanks for your reply. Unfortunately I cannot post the code I'm running. I'm using subprocess.call but it's the same thing basically (changed it to Popen just in case and I still got the error).
I tried setting GRPC_ENABLE_FORK_SUPPORT to 0 in the Server and Clients environments, but that didn't make a difference.

If I remove the subprocess.call/Popen, I don't get the exception. 
I've been trying different combinations and I've found that the channel created in the C++ client doesn't have anything to do with the exception. If I open a C++ subprocess that does nothing (just returning 0 from its main method), I do get the exception. So, it seems that just creating the subprocess from the Python client is causing that exception to be thrown. 

I will try to reproduce the error with code that I can share tomorrow. Just wanted to give you an update before finishing today in case that could shed any light on what could be happening.

Thanks,
Alex.

Alex

Feb 22, 2019, 11:35:04 AM
to grpc.io
Hi Eric,

I have finally managed to reproduce the issue with code I can share. The attached "ReproduceGRPCIssue.zip" package contains the following:

ReproduceGRPCIssue:
  • BreakGRPC: Noddy C++ project that just prints "Hello World!". This is going to be the application run in a subprocess. Please compile it and put the "BreakGRPC.exe" in "ReproduceGRPCIssue\BreakGRPC\x64\Release".
  • Services
    • protos: contains the Test.proto file to generate the GRPC source files.
    • src: 
      • client.py: Simple CherryPy websocket client that talks to the CherryPy webserver.py on 127.0.0.1:9000. When opened, it sends 10 requests to the webserver.py to run the GRPCCall. It sleeps 1 second between requests.
      • GRPCServer.py: Implementation of the GRPCTestService running on [::]:50051. It has one rpc "GRPCCall" which just prints "In GRPCCall".
      • Test_pb2/Test_pb2_grpc.py: Autogenerated python files
      • webserver.py: CherryPy websocket server. It creates the GRPCTestStub with an insecure channel on 127.0.0.1:50051. It listens for websocket messages on 127.0.0.1:9000. When it receives a message, it creates a thread to deal with it (MessageHandler). This thread uses the grpc_stub to call GRPCCall and then run the BreakGRPC.exe in a subprocess. 
 
In order to run the above files you need to have installed cherrypy (https://cherrypy.org/) and ws4py (https://github.com/Lawouach/WebSocket-for-Python).
The webserver looks for the BreakGRPC.exe in "../../BreakGRPC/x64/Release/BreakGRPC.exe". Please update line 40 in webserver.py if you change the location of the exe.
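The handler thread described for webserver.py might look roughly like the sketch below. It is not the attached code: the constructor signature, the `call_rpc` callable (standing in for the stub's GRPCCall) and `STAND_IN_CHILD` (standing in for BreakGRPC.exe, which only prints "Hello World!") are assumptions made for illustration.

```python
import subprocess
import sys
import threading

# Stand-in for BreakGRPC.exe, which only prints "Hello World!".
STAND_IN_CHILD = [sys.executable, "-c", "print('Hello World!')"]


class MessageHandler(threading.Thread):
    """Worker thread: one RPC, then one child process."""

    def __init__(self, call_rpc, child_cmd=None):
        super(MessageHandler, self).__init__()
        self.daemon = True                 # do not block interpreter exit
        self.call_rpc = call_rpc           # hypothetical wrapper around the stub call
        self.child_cmd = child_cmd or STAND_IN_CHILD
        self.error = None
        self.exit_code = None

    def run(self):
        try:
            self.call_rpc()
            self.exit_code = subprocess.call(self.child_cmd)
        except Exception as exc:           # keep the failure for inspection
            self.error = exc
```

Storing the exception on the thread object, instead of letting it escape from `run`, makes it possible to compare the threaded and non-threaded variants with the same driver code.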

To reproduce the issue:
  1. Launch the GRPC server (python GRPCserver.py)
  2. Launch the webserver (python webserver.py)
  3. Launch the client (python client.py)
You'll see "Hello World!" on the webserver command prompt and after 2 "Hello World!" messages, you'll get the exception:

_Rendezvous: <_Rendezvous of RPC that terminated with:
 status = StatusCode.UNKNOWN
 details = "Stream removed"
 debug_error_string = "{"created":"@1550849396.521000000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1036,"grpc_message":"Stream removed","grpc_status":2}"
  
If you comment out line 40 in webserver.py:
subprocess.call('../../BreakGRPC/x64/Release/BreakGRPC.exe')
the exception won't be raised.

Also, if the webserver doesn't spawn a thread to deal with the websocket message, the exception won't be raised either. To see this, comment out lines 57-59 in webserver.py:

      msgHandler = MessageHandler(d)
      msgHandler.daemon = True
      msgHandler.start()

and uncomment lines 60-61 in webserver.py:

      #if d['cmd'] == 'grpc_call':
      #   cherrypy.engine.publish('grpc_call', d)


Please let me know if you have problems trying to reproduce it.

Thanks,
Alex.

ReproduceGRPCIssue.zip

Alex

Feb 22, 2019, 11:37:47 AM
to grpc.io
Sorry, forgot to add that I'm using Python 2.7

Alex

Mar 4, 2019, 5:36:27 AM
to grpc.io
Hi Eric,

Just wondering if you had time to run my attached example and managed to reproduce the problem?

Thanks,
Alex.


Eric Gribkoff

Mar 4, 2019, 12:04:11 PM
to Alex, Lidi Zheng, grpc.io
+Lidi Zheng, who will be available for any follow-up questions (it will be easier for him to notice your questions if you include his email address on the "to:" line)

Hi Alex,

Sorry for the delay. I was not able to reproduce the problem; it looks like you are running on Windows, in which case gRPC's fork handlers are not registered/run, so those shouldn't be the cause here. Since the reproduction example also uses CherryPy websockets, it's quite possible the issue stems from that software rather than the gRPC stack - we'd likely need a reproduction case that only uses gRPC, without the websockets, to be able to help debug this further.

Thanks,

Eric

Alejandro Villagrán

Mar 18, 2019, 7:09:28 AM
to Eric Gribkoff, Lidi Zheng, grpc.io
Hi Eric/Lidi,

Yes, I'm running on Windows. I have now removed the CherryPy code and I still get the exception.

Please follow these steps to reproduce the issue:
- Unzip ReproduceGRPCIssue.zip
- Go to the BreakGRPC folder and compile BreakGRPC.sln. Make sure BreakGRPC.exe is saved in BreakGRPC/x64/Release.
- Go to the Services/src folder and open two command prompts there.
- Run "python GRPCserver.py" in one command prompt.
- Run "python GRPCclient.py" in the other command prompt.

You should see the exception on the client command prompt.

Please let me know if you are still unable to reproduce the issue with this version of the code.

Thanks,
Alex.
ReproduceGRPCIssue.zip

Lidi Zheng

Mar 18, 2019, 9:06:28 PM
to Alejandro Villagrán, Eric Gribkoff, grpc.io
Hi Alex,

Thank you for providing the reproduction code. I will spin up a Windows machine to investigate this error.
If I'm able to find something useful, I'll let you know.

Lidi Zheng

Alejandro Villagrán

Mar 28, 2019, 6:47:10 AM
to Lidi Zheng, Eric Gribkoff, grpc.io
Hi Lidi,

Did you manage to reproduce the issue?

Thanks,
Alex.

Lidi Zheng

Mar 28, 2019, 12:55:26 PM
to Alejandro Villagrán, Eric Gribkoff, grpc.io
My apologies, Alex. Other things keep coming up and consuming my time. I'm not a Windows expert, and it would take me a long time to set up a compilation environment for gRPC on Windows with a debugger. I also failed to find a Windows expert to debug your issue.
Eric has mentioned that the breakage can be caused by either fork handler registration or CherryPy. Do you think you can migrate the reproduction case to Linux, if the root cause is the software? It would be much easier to debug.
Also, have you tried turning on the debug trace in gRPC by setting the environment variables "GRPC_VERBOSITY" to "DEBUG" and "GRPC_TRACE" to "api,channel,connectivity_state"? They might produce useful information for us to identify the problem.
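One way to collect such a trace, sketched under the assumption that gRPC writes its log output to stderr: launch the client (or server) as a child process with both variables set and stderr redirected to a file. `run_with_trace` and the log file name are hypothetical.

```python
import os
import subprocess

# Values taken from the message above.
TRACE_ENV = {
    "GRPC_VERBOSITY": "DEBUG",
    "GRPC_TRACE": "api,channel,connectivity_state",
}


def run_with_trace(cmd, log_path):
    """Run cmd with gRPC tracing enabled, sending its stderr to log_path."""
    env = dict(os.environ)
    env.update(TRACE_ENV)
    with open(log_path, "wb") as log:
        return subprocess.call(cmd, env=env, stderr=log)
```

For example, `run_with_trace(["python", "GRPCclient.py"], "clientlog.txt")` would leave the client trace in clientlog.txt while normal output stays on the console.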

Thanks,
Lidi Zheng

Alejandro Villagrán

Mar 28, 2019, 1:06:55 PM
to Lidi Zheng, Eric Gribkoff, grpc.io
Hi Lidi,

In the last version of the code I attached, I don't use CherryPy anymore so the issue cannot come from there. Eric said that fork handlers are not registered on Windows so he ruled out that option too.
Unfortunately I haven't got access to a Linux machine and I'm not familiar enough with Linux to be able to recreate the issue there.

I did turn on the debug trace a few weeks ago but I couldn't see anything obvious. I'll find time to enable it again and share the logs with you.

Thanks,
Alex.

Alejandro Villagrán

Apr 1, 2019, 10:55:46 AM
to Lidi Zheng, Eric Gribkoff, grpc.io
Hi Lidi,

I've set those environment variables you mentioned and I've attached the log files (serverlog.txt and clientlog.txt).
Since I reported this issue, I've upgraded the Python version of gRPC, so the attached log files were created with grpcio 1.19.0.

Do they give you any hint as to what could be wrong?

Thanks,
Alex.
serverlog.txt
clientlog.txt

Lidi Zheng

Apr 1, 2019, 5:32:16 PM
to Alejandro Villagrán, Eric Gribkoff, grpc.io
Hi Alex,

From the log, I found that the error is on the client side. The gRPC client sends a reset stream frame with error code 2, indicating an internal error, and its channel state somehow becomes SHUTDOWN.
Also, I tried your code on Linux, and it works fine without any error. I had another failed attempt to build the VS solution, with 400+ errors...

Although the root cause still remains unclear, the scope is reduced to the gRPC client behavior on Windows. To dig deeper into the bug, I think one has to trace whether the "subprocess" finishes and the life cycle of the channel.
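A small sketch of how the subprocess side of that tracing could be done with the standard library alone. `run_child_traced` is a hypothetical helper; it only reports when the child started, how it exited and how long it lived, so those values can be logged next to each RPC.

```python
import subprocess
import time


def run_child_traced(cmd):
    """Run cmd and return (pid, exit_code, seconds_alive)."""
    started = time.time()
    proc = subprocess.Popen(cmd)
    exit_code = proc.wait()            # blocks until the child has exited
    elapsed = time.time() - started
    return proc.pid, exit_code, elapsed
```

Printing the returned tuple immediately before the next RPC shows whether the child had fully exited by the time the failing call was made.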

Lidi Zheng

Alejandro Villagrán

Apr 2, 2019, 6:38:07 AM
to Lidi Zheng, Eric Gribkoff, grpc.io
Hi,

The subprocess is finishing OK, almost immediately as it's not doing anything apart from printing a message to the console. "subprocess.call" returns a status code 0. So everything should be fine on that side.
Not 100% sure what you meant with tracing the lifecycle of the channel. I called grpc.Channel.subscribe to print the different states of the channel used:

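import grpc
import Test_pb2_grpc  # generated from Test.proto

def stateChange(conn):
   # Called with the channel's new connectivity state on every change
   print(conn)

class GRPCStub(object):
   def __init__(self):
      self.channel = grpc.insecure_channel('127.0.0.1:50051')
      # Register the callback for connectivity state changes
      self.channel.subscribe(stateChange)
      self.stub = Test_pb2_grpc.GRPCTestStub(self.channel)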

With that call to subscribe, the exception is no longer thrown. The channel starts IDLE and then changes to READY as soon as the first grpc call is made and stays like that for the rest of the program.
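If a record of the transitions is more useful than console output, the callback can be a small object that stores what it receives. This is a sketch with a hypothetical `StateRecorder` class; the lock is there on the assumption that gRPC may invoke the callback from one of its own threads. Since subscribing by itself appeared to make the exception go away, a recorder like this may hide the problem rather than expose it.

```python
import threading
import time


class StateRecorder(object):
    """Callable that records every connectivity state it is handed."""

    def __init__(self):
        self._lock = threading.Lock()
        self.history = []                  # list of (timestamp, state)

    def __call__(self, state):
        # Guard the list in case the callback runs on another thread.
        with self._lock:
            self.history.append((time.time(), state))

    def states(self):
        """Return the recorded states in arrival order."""
        with self._lock:
            return [state for _, state in self.history]
```

It would be passed in place of `stateChange`, for example `recorder = StateRecorder()` followed by `self.channel.subscribe(recorder)`, and `recorder.states()` can then be printed when an RPC fails.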

It'd be interesting if you could reproduce it on your end to see what's really happening under the bonnet. Why would that call to grpc.Channel.subscribe prevent the exception from being raised?

Thanks,
Alex.

Lidi Zheng

Apr 2, 2019, 2:04:22 PM
to Alejandro Villagrán, grpc.io
Hi Alex,

I'm glad to see the problem is finally going away. Sorry that I can't track down this issue in Windows.
There are many possibilities that could cause that behavior; it might be related to the underlying file descriptors, garbage collection, or a fork+exec issue.
I don't have an answer.
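One possible experiment for the file descriptor hypothesis, not tried anywhere in this thread and not a confirmed fix: start the child with `close_fds=True`, which asks `subprocess` not to pass the parent's open handles to the child. The helper name is hypothetical, and whether this changes the behaviour reported here is unknown.

```python
import subprocess


def call_without_inherited_handles(cmd):
    """Run cmd without letting it inherit the parent's open handles."""
    # Python 2.7 defaults to close_fds=False, so it is requested explicitly.
    # On Windows with Python 2.7, close_fds=True cannot be combined with
    # redirected stdin/stdout/stderr.
    return subprocess.call(cmd, close_fds=True)
```

Swapping this in for the plain `subprocess.call` in the reproduction and re-running it would show whether handle inheritance plays any role.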

In addition, the details you posted here are more than enough to file a bug report.
It would be great if you could submit an issue on GitHub, where we can revisit it when we have resources.

Thanks,
Lidi Zheng


Alejandro Villagrán

Apr 3, 2019, 7:39:11 AM
to Lidi Zheng, grpc.io
Hi Lidi,

Thanks for looking into it. 
I've now submitted an issue on GitHub: https://github.com/grpc/grpc/issues/18626

Thanks,
Alex.