Why does it take a long time to connect sometimes in C++?


br...@forceconstant.com

Jan 17, 2019, 2:39:18 PM
to grpc.io

I have a gRPC streaming client that has to handle the server going up and down, so I connect in a while loop. Sometimes it works fine, but other times it takes 15 seconds to connect, even on the same machine. Is something wrong with my code, and how can I debug this? As you can see below, I print the channel state for debugging; it is mostly GRPC_CHANNEL_CONNECTING or GRPC_CHANNEL_TRANSIENT_FAILURE, but it can still take 15 seconds to connect. I haven't found a pattern. Can someone tell me how to get it to connect faster and more reliably? Thanks. Note that I am using a deadline so that I can shut everything down gracefully at the end and not block forever.



...

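channel = grpc::CreateChannel(asServerAddress, channel_creds);

// Poll until the channel is READY; GetState(true) asks an idle channel to start connecting.
while (channel->GetState(true) != GRPC_CHANNEL_READY)
{
    // Wait at most 1 second per iteration so that shutdown is never blocked forever.
    auto deadline = std::chrono::system_clock::now() + std::chrono::milliseconds(1000);

    channel->WaitForConnected(deadline);
    // Print the current state without triggering another connection attempt.
    std::cout << "." << channel->GetState(false) << std::flush;
}
std::cout << "Client Connected" << std::endl;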

....


robert engels

Jan 17, 2019, 2:46:44 PM
to br...@forceconstant.com, grpc.io
How are you testing the retry: pulling the plug? iptables?


br...@forceconstant.com

Jan 17, 2019, 2:54:28 PM
to grpc.io
I don't really understand the question, but I have tested the retry by just starting and stopping the server.

br...@forceconstant.com

Jan 17, 2019, 3:09:17 PM
to grpc.io
Another point: when I look at connections with netstat, I don't see gRPC even trying to connect until the connection actually happens. So I am not sure what it is waiting for.

robert engels

Jan 17, 2019, 3:09:55 PM
to br...@forceconstant.com, grpc.io
If you are running a tight loop with lots of connection attempts, there are a lot of reasons it can fail, usually resources (number of connections). While the OS is waiting to close the existing connections, future attempts will fail.

Brian Wagener

Jan 18, 2019, 9:18:40 AM
to grpc.io
I don't think it is either of those. I was able to capture gRPC debug output, and it shows that the actual fd_create for port 50074 doesn't even get called until 8 seconds after trying to connect. Can someone look at the debug file and see what gRPC is doing?

Brian
connect.txt

Robert Engels

Jan 18, 2019, 9:23:03 AM
to Brian Wagener, grpc.io
Is the debug file attached?


br...@forceconstant.com

Jan 18, 2019, 9:35:44 AM
to grpc.io
Yes, see the previous message: connect.txt

Robert Engels

Jan 18, 2019, 9:39:18 AM
to br...@forceconstant.com, grpc.io
Can you identify the time this occurs in the file? It’s fairly large. 

br...@forceconstant.com

Jan 18, 2019, 9:44:13 AM
to grpc.io
Yes. The start of the file is about when the server comes up, at the same time as the client software is trying to connect using the loop above, so 12:54:24.361342246. The fd_create in question is at 12:54:32.698523276, and the connection happens shortly after. Thanks for looking.

br...@forceconstant.com

Jan 18, 2019, 11:36:11 AM
to grpc.io
OK, I think I have a clue. Looking at the full log file for fd_create calls for that port, I see that the time between retries is monotonically increasing (roughly doubling), even though my WaitForConnected deadline is static. How can I disable this feature?
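
The growing gap between retries described above is what capped exponential backoff would produce. Below is a minimal stand-alone sketch of that schedule, not gRPC's actual implementation. The function name AttemptTimesMs is hypothetical. The parameter values in the note after it (1 s initial delay, 1.6 multiplier, 120 s cap) are assumptions based on the defaults described in the gRPC connection backoff document. Jitter is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns the start time (in ms, relative to the first
// attempt) of each reconnect attempt under capped exponential backoff.
// This is an illustration only; real gRPC also adds random jitter.
std::vector<double> AttemptTimesMs(double initial_ms, double multiplier,
                                   double max_ms, int attempts) {
  assert(initial_ms > 0.0 && multiplier >= 1.0 && max_ms >= initial_ms);
  std::vector<double> times;
  double now = 0.0;
  double backoff = initial_ms;
  for (int i = 0; i < attempts; ++i) {
    times.push_back(now);                              // attempt i starts here
    now += backoff;                                    // wait before next attempt
    backoff = std::min(backoff * multiplier, max_ms);  // grow, but cap
  }
  return times;
}
```

Under those assumed values, AttemptTimesMs(1000, 1.6, 120000, 5) puts the attempts at about 0, 1.0, 2.6, 5.16, and 9.26 seconds. A server that comes up just after the 5.16 s attempt is not retried until about 9.26 s, no matter how short the WaitForConnected deadline is. Capping the backoff at 1000 ms instead spaces the attempts one second apart.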

yas...@google.com

Jan 23, 2019, 2:20:47 PM
to grpc.io
I believe GRPC_ARG_MAX_RECONNECT_BACKOFF_MS is what you are looking for.
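
For illustration, here is a minimal sketch of how that channel argument might be applied when creating the channel, assuming a gRPC C++ version that provides grpcpp/grpcpp.h. The helper name MakeChannelWithCappedBackoff and the 1000 ms values are assumptions, not taken from the thread. The initial and minimum backoff arguments are set as well on the assumption that they should not exceed the new cap.

```cpp
#include <grpcpp/grpcpp.h>

#include <memory>
#include <string>

// Hypothetical helper: creates a channel whose reconnect backoff is capped,
// so retries stay roughly evenly spaced instead of growing.
std::shared_ptr<grpc::Channel> MakeChannelWithCappedBackoff(
    const std::string& address,
    const std::shared_ptr<grpc::ChannelCredentials>& creds) {
  grpc::ChannelArguments args;
  // Illustrative values (assumptions): keep every delay at about 1 second.
  args.SetInt(GRPC_ARG_INITIAL_RECONNECT_BACKOFF_MS, 1000);
  args.SetInt(GRPC_ARG_MIN_RECONNECT_BACKOFF_MS, 1000);
  args.SetInt(GRPC_ARG_MAX_RECONNECT_BACKOFF_MS, 1000);
  // CreateCustomChannel is the variant of CreateChannel that accepts arguments.
  return grpc::CreateCustomChannel(address, creds, args);
}
```

In the original loop this would replace the grpc::CreateChannel(asServerAddress, channel_creds) call. Very small caps make the client retry aggressively while the server is down, so the value is a trade-off rather than a recommendation.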