Envoy routes gRPC requests to unhealthy servers


su-...@indetail.co.jp

Apr 2, 2019, 3:31:31 AM
to envoy-users
Hi,

I couldn't find any information about the issue we ran into.
I'm sorry if it has already been answered.

We use Envoy as a gRPC/gRPC-web proxy in front of two upstream gRPC servers.
(We really appreciate that Envoy supports gRPC-web.)

These two gRPC servers do not yet implement the gRPC health-checking service, but we intentionally enabled Envoy's gRPC health checking to confirm that Envoy stops routing requests to them once the health checks fail, as described [here](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/service_discovery#on-eventually-consistent-service-discovery).

However, we found that Envoy still routes requests to them even after the health status of both gRPC servers has changed to `failed_active_hc`.

[debug][hc] [source/common/upstream/health_checker_impl.cc:630] [C0] hc grpc_status=12 (Method not found: grpc.health.v1.Health/Check) service_status=rpc_error health_flags=/failed_active_hc
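
One way to cross-check what Envoy believes about each host is the text output of the admin interface's `/clusters` endpoint, which prints a `health_flags` entry per upstream host. The sketch below only parses such output. The sample text, the IP addresses, and the helper name `parse_health_flags` are made up for illustration, and the config in this post defines no `admin` section, so having the admin listener enabled is an assumption.

```python
def parse_health_flags(clusters_text):
    """Return {(cluster, address): [flags]} parsed from /clusters text output.

    An empty list means the host is reported as healthy.
    """
    result = {}
    marker = "::health_flags::"
    for line in clusters_text.splitlines():
        line = line.strip()
        if marker not in line:
            continue  # skip per-cluster settings and other per-host stats
        left, value = line.rsplit(marker, 1)
        cluster, address = left.split("::", 1)
        if value == "healthy":
            result[(cluster, address)] = []
        else:
            # Flags are slash-separated, e.g. "/failed_active_hc"
            result[(cluster, address)] = [f for f in value.split("/") if f]
    return result


# Hypothetical sample resembling /clusters output for the "foo" cluster
SAMPLE = """\
foo::default_priority::max_connections::1024
foo::10.0.0.1:50051::cx_active::0
foo::10.0.0.1:50051::health_flags::/failed_active_hc
foo::10.0.0.2:50051::health_flags::healthy
"""

if __name__ == "__main__":
    for (cluster, address), flags in parse_health_flags(SAMPLE).items():
        print(cluster, address, flags or "healthy")
```

If every host in the cluster shows `failed_active_hc`, no healthy host is left for the load balancer to pick.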

This is our envoy.yaml.
I wonder whether we made a mistake or missed some configuration. Please advise. Thank you :)

static_resources:
 listeners:
 - name: listener_0
   address:
     socket_address:
       address: 0.0.0.0
       port_value: 50051
   filter_chains:
   - filters:
     - name: envoy.http_connection_manager
       config:
         codec_type: auto
         stat_prefix: ingress_http
         route_config:
           name: local_route
           virtual_hosts:
           - name: local_service
             domains:
             - "*"
             routes:
             - match:
                 prefix: "/"
               route:
                 cluster: foo
             cors:
               allow_origin:
               - "*"
               allow_methods: GET, PUT, DELETE, POST, OPTIONS
               allow_headers: keep-alive,user-agent,cache-control,content-type,content-transfer-encoding,custom-header-1,x-accept-content-transfer-encoding,x-accept-response-streaming,x-user-agent,x-grpc-web,grpc-timeout
               max_age: "1728000"
               expose_headers: custom-header-1,grpc-status,grpc-message
               enabled: true
         http_filters:
         - name: envoy.grpc_web
         - name: envoy.cors
         - name: envoy.router
 clusters:
 - name: foo
   connect_timeout: 0.25s
   type: strict_dns
   http2_protocol_options: {}
   lb_policy: round_robin
   health_checks:
   - grpc_health_check:
       service_name: 'foo'
     timeout:
       seconds: 2
     interval:
       seconds: 2
     unhealthy_threshold: 2
     healthy_threshold: 2
   hosts:
   - socket_address:
       address: foo-1
       port_value: 50051
   - socket_address:
       address: foo-2
       port_value: 50051

Matt Klein

Apr 2, 2019, 4:32:32 PM
to su-...@indetail.co.jp, envoy-users

--
You received this message because you are subscribed to the Google Groups "envoy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to envoy-users...@googlegroups.com.
To post to this group, send email to envoy...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/envoy-users/d8a942b7-5ba4-4bad-af6e-69c1b6b4972a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

su-...@indetail.co.jp

Apr 3, 2019, 3:26:14 AM
to envoy-users
Matt,

Thank you for the information :)

We tried setting the panic threshold to 0.0 to disable panic mode:
...
     lb_policy: round_robin
+    common_lb_config:
+      healthy_panic_threshold:
+        value: 0.0
     health_checks:
...

We expected this to keep Envoy out of panic mode (so that health-check results are not ignored).
However, it didn't work, and Envoy is still routing requests to the gRPC servers :(

Are there any other settings that I need to configure?

Thank you.

Matt Klein

Apr 8, 2019, 7:25:56 PM
to su-...@indetail.co.jp, envoy-users
I don't think so, but can't remember off the top of my head. Would need some more debugging.


su-...@indetail.co.jp

Apr 9, 2019, 4:37:15 AM
to envoy-users
Thank you very much, Matt.

I will also try to find the reason. Please let me know if there is anything we can help with.

Thanks.

su-...@indetail.co.jp

Apr 16, 2019, 3:24:56 AM
to envoy-users
Matt,

We checked the source code and realized this might be the correct behavior.
Envoy goes into panic mode anyway (even when the threshold is 0.0) when the normalized total availability is 0, which I guess is our case (all hosts are unhealthy).
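
The behavior described above can be illustrated with a small Python model. This is a simplified sketch, not Envoy's actual code: it assumes a single priority level, ignores the overprovisioning factor and degraded hosts, and the function name `is_panic` is hypothetical. It shows why a threshold of 0.0 does not help when every host is unhealthy: the zero-availability case is decided before the threshold is ever compared.

```python
def is_panic(healthy_hosts: int, total_hosts: int, panic_threshold_percent: float) -> bool:
    """Simplified model of the panic decision for a single priority level."""
    if total_hosts == 0:
        healthy_percent = 0.0
    else:
        healthy_percent = 100.0 * healthy_hosts / total_hosts

    # Nothing is available: panic mode is entered regardless of the threshold.
    if healthy_percent == 0.0:
        return True

    # Otherwise panic only if the healthy share is below the threshold.
    return healthy_percent < panic_threshold_percent


if __name__ == "__main__":
    print(is_panic(0, 2, 0.0))   # both hosts unhealthy, threshold 0.0
    print(is_panic(1, 2, 0.0))   # one healthy host, threshold 0.0
    print(is_panic(1, 4, 50.0))  # 25% healthy, threshold 50%
```

Under this model, with both `foo-1` and `foo-2` failing health checks, the first call reports panic mode even though the threshold is 0.0, which matches what we observed.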

Thank you.