gRPC Java (1.16): RoundRobinLoadBalancer does not balance load to a newly added server


eleano...@gmail.com

Jan 9, 2019, 1:18:47 PM
to grpc.io
Hi, 

In my Java gRPC client, I pass my custom NameResolver when creating the ManagedChannel, and I use RoundRobinLoadBalancer. When my NameResolver is notified of a change to the server list (a new server was added), it calls Listener.onAddresses and passes the updated list.

From the log I can see that onAddresses is called from NameResolverListenerImpl (9097 is the address of the newly added server):

resolved address: [[addrs=[localhost/127.0.0.1:9096], attrs={}], [addrs=[localhost/127.0.0.1:9097], attrs={}]], config={}


However, no traffic reaches the new server. Did I miss anything?


Thanks a lot!





Kun Zhang

Jan 10, 2019, 7:00:28 PM
to grpc.io
Can you find logs from InternalSubchannel that mention the new server?
If the new server cannot be connected to, round-robin won't use it.

eleano...@gmail.com

Jan 10, 2019, 7:34:58 PM
to grpc.io
Hi Kun, 

Thanks for your reply. I did see that a new Subchannel gets created for the new server. Do you mean that as long as the new server's Subchannel has been created, it should take effect immediately, meaning the new server should also receive traffic?

Thanks a lot!

Kun Zhang

Jan 10, 2019, 8:37:55 PM
to grpc.io
A Subchannel being created for the new server means round-robin is aware of the new server and is trying to connect to it.
The creation log starts with the logId of the Subchannel. Do you see any other logs related to that logId?
My suspicion is that the Subchannel couldn't get connected.

eleano...@gmail.com

Jan 11, 2019, 5:43:24 PM
to grpc.io
Hi Kun, 

Please see the logs from the gRPC client below. Server1 (localhost:9095) was running first, and then the client started making requests. Afterward, I started server2 (localhost:9096) and saw the following logs, but no requests were sent to server2.

[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-12] Created with target localhost:9095
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-12] Exiting idle mode
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-12] resolved address: [[addrs=[localhost/127.0.0.1:9095], attrs={}], [addrs=[localhost/0:0:0:0:0:0:0:1:9095], attrs={}]], config={}
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-12] io.grpc.internal.InternalSubchannel-14 created for [[addrs=[localhost/127.0.0.1:9095], attrs={}], [addrs=[localhost/0:0:0:0:0:0:0:1:9095], attrs={}]]
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-12] shutdownNow() called
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-12] shutdown() called
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-12] Shutting down
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-16] Created with target localhost:9096
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-12] Terminated
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-16] Exiting idle mode
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-16] resolved address: [[addrs=[localhost/127.0.0.1:9096], attrs={}], [addrs=[localhost/0:0:0:0:0:0:0:1:9096], attrs={}]], config={}
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-16] io.grpc.internal.InternalSubchannel-18 created for [[addrs=[localhost/127.0.0.1:9096], attrs={}], [addrs=[localhost/0:0:0:0:0:0:0:1:9096], attrs={}]]

[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-16] shutdownNow() called
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-16] shutdown() called
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-16] Shutting down
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-4] resolved address: [[addrs=[localhost/127.0.0.1:9095], attrs={}], [addrs=[localhost/127.0.0.1:9096], attrs={}]], config={}
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-4] io.grpc.internal.InternalSubchannel-20 created for [[addrs=[localhost/127.0.0.1:9096], attrs={}]]
[io.grpc.internal.ManagedChannelImpl][io.grpc.internal.ManagedChannelImpl-16] Terminated

Eric Anderson

Jan 15, 2019, 11:12:16 AM
to Jin Yi, grpc.io
It looks like you are re-creating channels when the backends change. That is unfortunate; I would encourage you to instead create a NameResolver that will provide updated server addresses when they change. That will prevent needing to shut down perfectly good connections and avoids you having to deal with many races when swapping out the Channel.

Are you sure you are using RoundRobin? The last channel would likely only send RPCs to 9095 if it was using the default PickFirst.
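
To illustrate why pushing address updates into one long-lived channel is cheaper than re-creating channels, here is a minimal plain-Java sketch. It is a simplified model and not gRPC's implementation: the class AddressUpdateModel, its onAddresses method, and the string addresses are hypothetical names chosen for illustration. It only shows the idea that an updated list can be diffed against the existing connections, so that unchanged servers keep their connections and only new ones get a connection created.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of applying address updates to a long-lived channel. */
final class AddressUpdateModel {
  // Addresses that currently have a (modelled) connection.
  private final Set<String> connected = new LinkedHashSet<>();
  // Record of what happened, so the effect of each update is visible.
  final List<String> events = new ArrayList<>();

  /** Applies a newly resolved address list without touching unchanged entries. */
  void onAddresses(List<String> latest) {
    Set<String> latestSet = new LinkedHashSet<>(latest);

    Set<String> added = new LinkedHashSet<>(latestSet);
    added.removeAll(connected);          // in the new list, not yet connected

    Set<String> removed = new LinkedHashSet<>(connected);
    removed.removeAll(latestSet);        // connected, but no longer listed

    for (String address : added) {
      connected.add(address);
      events.add("created " + address);
    }
    for (String address : removed) {
      connected.remove(address);
      events.add("shutdown " + address);
    }
  }

  public static void main(String[] args) {
    AddressUpdateModel model = new AddressUpdateModel();
    model.onAddresses(Arrays.asList("localhost:9095"));
    // A second server appears: only the new address gets a connection.
    model.onAddresses(Arrays.asList("localhost:9095", "localhost:9096"));
    System.out.println(model.events);
    // prints [created localhost:9095, created localhost:9096]
  }
}
```

This matches the shape of the log above, where ManagedChannelImpl-4 resolved both 9095 and 9096 but created a new InternalSubchannel only for 9096.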


eleano...@gmail.com

Jan 15, 2019, 1:04:09 PM
to grpc.io
Hi Eric, 

Thanks a lot for the reply. Actually, I do have a custom NameResolver, and it is notified whenever the server list changes. I also have RoundRobinLoadBalancer configured; please see the code below.

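ManagedChannel channel = ManagedChannelBuilder.forTarget(...)
    .executor(channelExecutor)
    // Custom resolver that is notified when the server list changes.
    .nameResolverFactory(new Factory() {
      @Override
      public NameResolver newNameResolver(URI targetUri, Attributes params) {
        return new MyCustomNameResolver(...);
      }

      @Override
      public String getDefaultScheme() {
        return null;
      }
    })
    // Use round-robin instead of the default pick-first policy.
    .loadBalancerFactory(RoundRobinLoadBalancerFactory.getInstance())
    .usePlaintext()
    .enableRetry()
    .build();

// requestConnection = true: ask the channel to leave idle mode and start connecting.
channel.getState(true);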

eleano...@gmail.com

Jan 15, 2019, 2:09:17 PM
to grpc.io
Hi Eric, 

One more question: when the Subchannels of a channel are updated, what happens to the streams that were created from that channel? I assume that a stream is bound to a particular TCP connection, meaning a particular Subchannel?

Kun Zhang

Jan 15, 2019, 7:32:36 PM
to grpc.io
I only see logs from ManagedChannelImpl. Can you also enable FINE logging for io.grpc.internal.InternalSubchannel? We can find the connection states for each Subchannel from there.
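
One way to enable that logging programmatically is sketched below. It assumes the client logs through plain java.util.logging with no bridge to another logging framework; if a bridge or a logging.properties file is in use, the level would be set there instead. The class name SubchannelLogging is hypothetical. Only the logger name io.grpc.internal.InternalSubchannel comes from the request above.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper that turns on FINE logging for InternalSubchannel. */
final class SubchannelLogging {
  // Keep a strong reference: java.util.logging holds loggers weakly, so an
  // unreferenced logger could be garbage collected and lose its configuration.
  static final Logger SUBCHANNEL_LOGGER =
      Logger.getLogger("io.grpc.internal.InternalSubchannel");

  static void enableFineLogging() {
    SUBCHANNEL_LOGGER.setLevel(Level.FINE);

    // A ConsoleHandler defaults to INFO, so it must be lowered to FINE as well.
    ConsoleHandler handler = new ConsoleHandler();
    handler.setLevel(Level.FINE);
    SUBCHANNEL_LOGGER.addHandler(handler);

    // Do not also forward records to the root handlers, which would print
    // INFO-and-above records a second time.
    SUBCHANNEL_LOGGER.setUseParentHandlers(false);
  }

  public static void main(String[] args) {
    enableFineLogging();
    SUBCHANNEL_LOGGER.fine("FINE logging is enabled for InternalSubchannel");
  }
}
```

Call enableFineLogging() before building the channel so that the connection state changes of each Subchannel are captured from the start.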

eleano...@gmail.com

Jan 16, 2019, 3:50:04 PM
to grpc.io
Hi Kun, 
 
I did see that an InternalSubchannel gets created for the new server3 (listening on 9097):

 [io.grpc.internal.InternalSubchannel] (grpc-default-worker-ELG-3-9) [io.grpc.internal.InternalSubchannel-20] io.grpc.netty.NettyClientTransport-21 for localhost/127.0.0.1:9097 is ready


eleano...@gmail.com

Jan 16, 2019, 4:44:41 PM
to grpc.io
Hi Kun, 

I am trying to debug further. handleResolvedAddressGroups in io.grpc.util.RoundRobinLoadBalancerFactory is called whenever NameResolver.Listener::onAddresses is called.

Inside handleResolvedAddressGroups, it calls updateBalancingState(getAggregatedState(), getAggregatedError()); but it seems that getAggregatedState() does not return the Subchannel state as READY: sometimes it is CONNECTING, sometimes IDLE.

Then updateBalancingState() only puts Subchannels whose state is READY into the activeList.

So I wonder: is there any way to ensure the Subchannel is READY when updating the load balancer?

Kun Zhang

Jan 17, 2019, 5:35:54 PM
to grpc.io
You don't need to worry about the timing. As soon as the Subchannel becomes ready, RoundRobinLoadBalancer should notice that by yet another call to updateBalancingState() and add it to the round-robin list. If you continue debugging, you should be able to see that.
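
The behaviour described here can be sketched with a small plain-Java model. This is an illustration under stated assumptions, not gRPC's actual code: RoundRobinModel, addSubchannel, onStateChange, and pick are hypothetical names, and addresses are plain strings. The point it shows is that the ready list is rebuilt on every connectivity change, so a Subchannel that is IDLE or CONNECTING when the address list arrives joins the rotation as soon as it reports READY.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of round-robin over READY subchannels. */
final class RoundRobinModel {
  enum State { IDLE, CONNECTING, READY, TRANSIENT_FAILURE }

  private final Map<String, State> subchannels = new LinkedHashMap<>();
  private List<String> readyList = new ArrayList<>();
  private int next = 0;

  /** A newly resolved address starts out IDLE, so it is not picked yet. */
  void addSubchannel(String address) {
    subchannels.put(address, State.IDLE);
    rebuildReadyList();
  }

  /** Every state change rebuilds the ready list; no manual timing is needed. */
  void onStateChange(String address, State newState) {
    subchannels.put(address, newState);
    rebuildReadyList();
  }

  private void rebuildReadyList() {
    List<String> ready = new ArrayList<>();
    for (Map.Entry<String, State> entry : subchannels.entrySet()) {
      if (entry.getValue() == State.READY) {
        ready.add(entry.getKey());
      }
    }
    readyList = ready;
    next = 0;
  }

  /** Returns the next READY address in rotation, or null if none is READY. */
  String pick() {
    if (readyList.isEmpty()) {
      return null;
    }
    String picked = readyList.get(next);
    next = (next + 1) % readyList.size();
    return picked;
  }

  public static void main(String[] args) {
    RoundRobinModel lb = new RoundRobinModel();
    lb.addSubchannel("localhost:9095");
    lb.onStateChange("localhost:9095", State.READY);

    lb.addSubchannel("localhost:9096"); // known, but still IDLE
    System.out.println(lb.pick() + " " + lb.pick());
    // prints localhost:9095 localhost:9095

    lb.onStateChange("localhost:9096", State.READY); // now joins the rotation
    System.out.println(lb.pick() + " " + lb.pick());
    // prints localhost:9095 localhost:9096
  }
}
```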

eleano...@gmail.com

Jan 17, 2019, 7:35:29 PM
to grpc.io
Got it! Thanks a lot