grpclb and Empty address list error despite successful DNS resolve

1,047 views
Skip to first unread message

Thomas Sanchez

unread,
Jan 26, 2023, 11:53:57 AM1/26/23
to grpc.io
Hello folks,
I'm currently experimenting with the built in load balancer support,
specifically:

https://github.com/grpc/proposal/blob/ce4fe27b88eec315ec0d5fe1d40bb795c0ee94b1/A5-grpclb-in-dns.md

I'm encountering an error and before creating a bug report I'd like to make sure
I've not missed anything.

We've setup the DNS according to the doc:

    $> dig +noall  +answer SRV   _grpclb._tcp.server._tcp.internal.example.com
    _grpclb._tcp.server._tcp.internal.example.com. 120 IN SRV 0 0 2379 nodes.internal.example.com.

    $> dig +noall  +answer A   nodes.internal.example.com
    nodes.internal.example.com. 120 IN A 10.10.44.90
    nodes.internal.example.com. 120 IN A 10.10.19.222
    nodes.internal.example.com. 120 IN A 10.10.3.170

Using a Python client (3.10, grpcio 1.51.1):

    # [...]
    channel = grpc.secure_channel(
        "server._tcp.internal.example.com",
        credentials=grpc.ssl_channel_credentials(
            ca_cert_file,
            cert_key_file,
            cert_cert_file,
        ),
        options=(
            ("grpc.dns_enable_srv_queries", 1),
            ("grpc.lb_policy_name", "round_robin"),
        ),
    )
    server = rpc.ServerStub(channel)
    req = rpcpb.StatusRequest()
    server.Status(rep)

When I execute this code, I get the following exception:

    Traceback (most recent call last):
    File "/tmp/test.py", line 57, in <module>
        server.Status(streq)
    File "/....../grpc/_channel.py", line 946, in __call__
        return _end_unary_response_blocking(state, call, False, None)
    File "/....../grpc/_channel.py", line 849, in _end_unary_response_blocking
        raise _InactiveRpcError(state)
    grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "empty address list: "
        debug_error_string = "UNKNOWN:empty address list:  {created_time:"2023-01-26T17:10:23.423558027+01:00", grpc_status:14}"

When I enable the gRPC logs, I can see that ares properly resolves the
addresses:

    D0126 17:10:23.355144647 1897428 grpc_ares_wrapper.cc:726]             (c-ares resolver) request:0x888c20 on_srv_query_done_locked name=_grpclb._tcp.server._tcp.internal.example.com ARES_SUCCESS
    D0126 17:10:23.355170775 1897428 grpc_ares_wrapper.cc:731]             (c-ares resolver) request:0x888c20 ares_parse_srv_reply: 0
    D0126 17:10:23.355175243 1897428 grpc_ares_wrapper.cc:625]             (c-ares resolver) request:0x888c20 create_hostbyname_request_locked host:nodes.internal.example.com port:19209 is_balancer:1 qtype:AAAA
    D0126 17:10:23.355270526 1897428 grpc_ares_wrapper.cc:625]             (c-ares resolver) request:0x888c20 create_hostbyname_request_locked host:nodes.internal.example.com port:19209 is_balancer:1 qtype:A
    D0126 17:10:23.355363205 1897428 grpc_ares_wrapper.cc:191]             (c-ares resolver) request:0x888c20 Ref ev_driver 0x888d00
    D0126 17:10:23.355368326 1897428 grpc_ares_wrapper.cc:439]             (c-ares resolver) request:0x888c20 notify read on: c-ares fd: 6
    D0126 17:10:23.355388528 1897428 grpc_ares_wrapper.cc:202]             (c-ares resolver) request:0x888c20 Unref ev_driver 0x888d00
    D0126 17:10:23.422973484 1897428 grpc_ares_wrapper.cc:364]             (c-ares resolver) request:0x888c20 readable on c-ares fd: 6
    D0126 17:10:23.423048620 1897428 grpc_ares_wrapper.cc:710]             (c-ares resolver) request:0x888c20 on_hostbyname_done_locked: C-ares status is not ARES_SUCCESS qtype=AAAA name=nodes.internal.example.com is_balancer=1: DNS server returned answer with no data
    D0126 17:10:23.423217106 1897428 grpc_ares_wrapper.cc:655]             (c-ares resolver) request:0x888c20 on_hostbyname_done_locked qtype=A host=nodes.internal.example.com ARES_SUCCESS
    D0126 17:10:23.423241972 1897428 grpc_ares_wrapper.cc:698]             (c-ares resolver) request:0x888c20 c-ares resolver gets a AF_INET result:
    addr: 10.10.44.90
    port: 2379

    D0126 17:10:23.423245904 1897428 grpc_ares_wrapper.cc:698]             (c-ares resolver) request:0x888c20 c-ares resolver gets a AF_INET result:
    addr: 10.10.3.170
    port: 2379

    D0126 17:10:23.423249513 1897428 grpc_ares_wrapper.cc:698]             (c-ares resolver) request:0x888c20 c-ares resolver gets a AF_INET result:
    addr: 10.10.19.222
    port: 2379

    D0126 17:10:23.423252890 1897428 grpc_ares_wrapper.cc:202]             (c-ares resolver) request:0x888c20 Unref ev_driver 0x888d00
    D0126 17:10:23.423261751 1897428 grpc_ares_wrapper.cc:216]             (c-ares resolver) request:0x888c20 delete fd: c-ares fd: 6
    D0126 17:10:23.423266242 1897428 grpc_ares_wrapper.cc:202]             (c-ares resolver) request:0x888c20 Unref ev_driver 0x888d00
    D0126 17:10:23.423270617 1897428 grpc_ares_wrapper.cc:293]             (c-ares resolver) request:0x888c20 ev_driver=0x888d00 on_timeout_locked. driver->shutting_down=1. err=CANCELLED
    D0126 17:10:23.423272866 1897428 grpc_ares_wrapper.cc:202]             (c-ares resolver) request:0x888c20 Unref ev_driver 0x888d00
    D0126 17:10:23.423275996 1897428 grpc_ares_wrapper.cc:318]             (c-ares resolver) request:0x888c20 ev_driver=0x888d00 on_ares_backup_poll_alarm_locked. driver->shutting_down=1. err=CANCELLED
    D0126 17:10:23.423278521 1897428 grpc_ares_wrapper.cc:202]             (c-ares resolver) request:0x888c20 Unref ev_driver 0x888d00
    D0126 17:10:23.423303740 1897428 grpc_ares_wrapper.cc:205]             (c-ares resolver) request:0x888c20 destroy ev_driver 0x888d00
    I0126 17:10:23.423316475 1897428 grpc_ares_wrapper.cc:552]             (c-ares resolver) request:0x888c20 c-ares address sorting: input[0]=10.10.44.90:2379
    I0126 17:10:23.423320669 1897428 grpc_ares_wrapper.cc:552]             (c-ares resolver) request:0x888c20 c-ares address sorting: input[1]=10.10.3.170:2379
    I0126 17:10:23.423323482 1897428 grpc_ares_wrapper.cc:552]             (c-ares resolver) request:0x888c20 c-ares address sorting: input[2]=10.10.19.222:2379
    I0126 17:10:23.423375716 1897428 grpc_ares_wrapper.cc:552]             (c-ares resolver) request:0x888c20 c-ares address sorting: output[0]=10.10.44.90:2379
    I0126 17:10:23.423378492 1897428 grpc_ares_wrapper.cc:552]             (c-ares resolver) request:0x888c20 c-ares address sorting: output[1]=10.10.3.170:2379
    I0126 17:10:23.423380187 1897428 grpc_ares_wrapper.cc:552]             (c-ares resolver) request:0x888c20 c-ares address sorting: output[2]=10.10.19.222:2379
    D0126 17:10:23.423382614 1897428 dns_resolver_ares.cc:396]             (c-ares resolver) resolver:0x69ed30 OnResolved() proceeding
    I0126 17:10:23.423409387 1897428 polling_resolver.cc:137]              [polling resolver 0x8d7270] request complete
    I0126 17:10:23.423413325 1897428 polling_resolver.cc:142]              [polling resolver 0x8d7270] returning result: addresses=<0 addresses>, service_config=<null>

As we can see in the last line we return 0 addresses instead of the three.
When checking the code, dns_resolver_ares.cc, I wonder whether the code handling the
`balancer_addresses_` is missing:

- We should have only balancer addresses so we init `result.addresses` with an
  empty list: `ServerAddressList`:
  - https://github.com/grpc/grpc/blob/f99b8b5bc4a89d37f86bf063999ec22e44312663/src/core/ext/filters/client_channel/resolver/dns/c_ares/dns_resolver_ares.cc#L405
- Further down, after any config handling if
  present, we can see that addresses is not written, but instead `result.arg` is
  set with the addresses we resolved and put under the
  `grpc.grpclb_balancer_addresses` key whereas the policy expect it in the
  addresses field
  - https://github.com/grpc/grpc/blob/f99b8b5bc4a89d37f86bf063999ec22e44312663/src/core/ext/filters/client_channel/resolver/dns/c_ares/dns_resolver_ares.cc#L428
  - https://github.com/grpc/grpc/blob/50fbe51238e8932b180e5e18eaa9ab87d26ecc10/src/core/ext/filters/client_channel/lb_policy/grpclb/grpclb_balancer_addresses.cc#L82
  - https://github.com/grpc/grpc/blob/v1.51.1/src/core/ext/filters/client_channel/lb_policy/round_robin/round_robin.cc#L274


Here are the next logs following the polling_resolver:

    I0126 17:10:23.423448035 1897428 round_robin.cc:239]                   [RR 0x8da3f0] Created
    I0126 17:10:23.423454769 1897428 round_robin.cc:271]                   [RR 0x8da3f0] received update with 0 addresses
    I0126 17:10:23.423458411 1897428 subchannel_list.h:367]                [RoundRobinSubchannelList 0x8da3f0] Creating subchannel list 0x8c3f40 for 0 subchannels
    I0126 17:10:23.423471480 1897428 channel_stack.cc:114]                 CHANNEL_STACK: init DynamicFilters
    I0126 17:10:23.423474992 1897428 channel_stack.cc:116]                 CHANNEL_STACK:   filter retry_filter
    I0126 17:10:23.423491810 1897428 client_channel.cc:2169]               chand=0x908060 calld=0x950880: applying service config to call
    I0126 17:10:23.423495972 1897428 client_channel.cc:2140]               chand=0x908060 calld=0x950880: removing from resolver queued picks list
    I0126 17:10:23.423501202 1897428 polling_resolver.cc:172]              [polling resolver 0x8d7270] result status from channel: UNAVAILABLE: empty address list:
    I0126 17:10:23.423505125 1897428 polling_resolver.cc:197]              [polling resolver 0x8d7270] retrying in 1000 ms
    I0126 17:10:23.423511785 1897428 client_channel.cc:2372]               chand=0x908060 calld=0x950880: creating dynamic call stack on channel_stack=0x90dc20
    I0126 17:10:23.423519385 1897428 client_channel.cc:2068]               chand=0x908060 calld=0x950880: starting 1 pending batches on dynamic_call=0x9512b0


Is there anything I'm missing?

Thanks a lot,
Thomas Sanchez

Alexandre BECQUET

unread,
Feb 6, 2023, 3:49:18 AM2/6/23
to grpc.io
Hello, did you find the source of the problem ?

I have the same error : "empty address list". But there is no error with network or dns.

Yijie Ma

unread,
Feb 6, 2023, 2:51:40 PM2/6/23
to grpc.io
The `round_robin` policy does not use SRV record or `balancer_addresses`. Only `grpclb` policy utilizes that so you need to configure to use that policy. See: https://github.com/grpc/proposal/blob/master/A5-grpclb-in-dns.md.

Also the current DNS resolver might have a limitation that if there are no A or AAAA records, it will return an error instead of returning an empty address list, so it might not be actually possible to use `grpclb` configured via DNS without having at least one fallback address.

apo...@google.com

unread,
Feb 8, 2023, 1:21:05 PM2/8/23
to grpc.io
Also note https://github.com/grpc/proposal/blob/master/A26-grpclb-selection.md - this relates to what Yijie noted about how you'll need to configure the client to use the grpclb policy.

Alexandre BECQUET

unread,
Feb 15, 2023, 5:49:38 AM2/15/23
to grpc.io
We have no grpclb but we have the error "empty address list", do you no why ?

A bad configuration of DNS ?
Reply all
Reply to author
Forward
0 new messages