I'm using Envoy proxy as a reverse proxy for my load-balanced application. We recently had a network outage which caused Envoy to return 503 to our downstream clients. Normally, once the network recovers, Envoy is able to connect to our upstream servers again and everyone is happy.
This time, however, Envoy keeps returning 503 with the LR response flag unless I restart the containers. Am I missing something? Can someone point me in the right direction on how to resolve this issue?
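While this is happening I've been poking at the admin interface (port 9901 in my config) to see what Envoy thinks of the cluster. Below is a rough sketch of the little script I use to poll it; the stat names I filter on are just the ones I assume are relevant to connection failures, so adjust as needed:

# Quick look at the service_api cluster through the Envoy admin interface
# (port 9901 from the config below).
import urllib.request

ADMIN = "http://127.0.0.1:9901"

def fetch(path):
    # Read an admin endpoint and return its body as text
    with urllib.request.urlopen(ADMIN + path) as resp:
        return resp.read().decode()

# Endpoint / DNS view of the clusters as Envoy currently sees them
print(fetch("/clusters"))

# Connection-failure counters for the upstream cluster
for line in fetch("/stats").splitlines():
    if "cluster.service_api.upstream_cx_connect" in line:
        print(line)

Here is my full envoy.yaml: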
admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          stat_prefix: ingress_http
          server_name: none
          route_config:
            name: local_route
            response_headers_to_remove:
            - x-envoy-upstream-service-time
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_api
          http_filters:
          - name: envoy.filters.http.router
          access_log:
          - name: envoy.access_loggers.file
            filter:
              not_health_check_filter: {}
            config:
              path: "/dev/stdout"
  clusters:
  - name: service_api
    connect_timeout: 30s
    type: LOGICAL_DNS
    dns_resolvers:
    - socket_address:
        address: 127.0.0.1
        port_value: 53
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    upstream_connection_options:
      tcp_keepalive: {}
    load_assignment:
      cluster_name: service_api
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config: