Thank you for the reply!
I've set healthy_panic_threshold with no luck:
clusters:
- name: postgres_cluster
  connect_timeout: 1s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  common_lb_config:
    healthy_panic_threshold:
      value: 0
Here is the clusters dump:
{
 "cluster_statuses": [
  {
   "name": "postgres_cluster",
   "host_statuses": [
    {
     "address": {
      "socket_address": {
       "address": "10.0.0.1",
       "port_value": 5432
      }
     },
     "stats": [
      {
       "name": "cx_connect_fail"
      },
      {
       "value": "9",
       "name": "cx_total"
      },
      {
       "name": "rq_error"
      },
      {
       "name": "rq_success"
      },
      {
       "name": "rq_timeout"
      },
      {
       "value": "9",
       "name": "rq_total"
      },
      {
       "type": "GAUGE",
       "name": "cx_active"
      },
      {
       "type": "GAUGE",
       "name": "rq_active"
      }
     ],
     "health_status": {
      "eds_health_status": "HEALTHY"
     },
     "weight": 1,
     "hostname": "postgres-0",
     "locality": {}
    },
    {
     "address": {
      "socket_address": {
       "address": "10.0.0.2",
       "port_value": 5432
      }
     },
     "stats": [
      {
       "name": "cx_connect_fail"
      },
      {
       "value": "8",
       "name": "cx_total"
      },
      {
       "name": "rq_error"
      },
      {
       "name": "rq_success"
      },
      {
       "name": "rq_timeout"
      },
      {
       "value": "8",
       "name": "rq_total"
      },
      {
       "type": "GAUGE",
       "name": "cx_active"
      },
      {
       "type": "GAUGE",
       "name": "rq_active"
      }
     ],
     "health_status": {
      "failed_active_health_check": true,
      "eds_health_status": "HEALTHY"
     },
     "weight": 1,
     "hostname": "postgres-1",
     "locality": {}
    },
    {
     "address": {
      "socket_address": {
       "address": "10.0.0.3",
       "port_value": 5432
      }
     },
     "stats": [
      {
       "name": "cx_connect_fail"
      },
      {
       "value": "8",
       "name": "cx_total"
      },
      {
       "name": "rq_error"
      },
      {
       "name": "rq_success"
      },
      {
       "name": "rq_timeout"
      },
      {
       "value": "8",
       "name": "rq_total"
      },
      {
       "type": "GAUGE",
       "name": "cx_active"
      },
      {
       "type": "GAUGE",
       "name": "rq_active"
      }
     ],
     "health_status": {
      "failed_active_health_check": true,
      "eds_health_status": "HEALTHY"
     },
     "weight": 1,
     "hostname": "postgres-2",
     "locality": {}
    }
   ],
   "circuit_breakers": {
    "thresholds": [
     {
      "max_connections": 1024,
      "max_pending_requests": 1024,
      "max_requests": 1024,
      "max_retries": 3
     },
     {
      "priority": "HIGH",
      "max_connections": 1024,
      "max_pending_requests": 1024,
      "max_requests": 1024,
      "max_retries": 3
     }
    ]
   },
   "observability_name": "postgres_cluster"
  }
 ]
}
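For a quicker read of the dump above, here is a minimal Python sketch that lists each host together with whether it carries the failed_active_health_check flag. The function name summarize_health is hypothetical, and the sample is a trimmed copy of the dump that keeps only the address and health_status fields; it assumes a missing flag means the host has not failed active health checking.

```python
import json


def summarize_health(dump: dict) -> dict:
    """Map "address:port" to True if the host failed active health checking."""
    result = {}
    for cluster in dump.get("cluster_statuses", []):
        for host in cluster.get("host_statuses", []):
            sock = host["address"]["socket_address"]
            addr = f'{sock["address"]}:{sock["port_value"]}'
            # Assumption: the flag is simply absent for hosts that pass the check.
            health = host.get("health_status", {})
            result[addr] = bool(health.get("failed_active_health_check", False))
    return result


# Trimmed copy of the clusters dump (stats and circuit breakers omitted).
sample = json.loads("""
{"cluster_statuses": [{"name": "postgres_cluster", "host_statuses": [
  {"address": {"socket_address": {"address": "10.0.0.1", "port_value": 5432}},
   "health_status": {"eds_health_status": "HEALTHY"}},
  {"address": {"socket_address": {"address": "10.0.0.2", "port_value": 5432}},
   "health_status": {"failed_active_health_check": true,
                     "eds_health_status": "HEALTHY"}},
  {"address": {"socket_address": {"address": "10.0.0.3", "port_value": 5432}},
   "health_status": {"failed_active_health_check": true,
                     "eds_health_status": "HEALTHY"}}
]}]}
""")

for addr, failed in summarize_health(sample).items():
    print(addr, "failed active HC" if failed else "ok")
```

With the dump above, only 10.0.0.1 is reported without the failure flag, which matches the health check logs.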
Here are some health check logs:
[2023-03-14 18:39:13.539][1][debug][client] [source/common/http/codec_client.cc:130] [C0] response complete
[2023-03-14 18:39:13.539][1][debug][hc] [source/common/upstream/health_checker_impl.cc:342] [C0] hc response=200 health_flags=healthy
[2023-03-14 18:39:13.859][1][debug][client] [source/common/http/codec_client.cc:130] [C1] response complete
[2023-03-14 18:39:13.859][1][debug][hc] [source/common/upstream/health_checker_impl.cc:342] [C1] hc response=404 health_flags=/failed_active_hc
[2023-03-14 18:39:14.218][1][debug][client] [source/common/http/codec_client.cc:130] [C2] response complete
[2023-03-14 18:39:14.218][1][debug][hc] [source/common/upstream/health_checker_impl.cc:342] [C2] hc response=404 health_flags=/failed_active_hc
And the logs for several connections:
[2023-03-14 18:40:53.137][49][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:397] [C61] Creating connection to cluster postgres_cluster
[2023-03-14 18:40:53.137][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:245] trying to create new connection
[2023-03-14 18:40:53.137][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:143] creating a new connection
[2023-03-14 18:40:53.137][49][debug][connection] [source/common/network/connection_impl.cc:864] [C62] connecting to 10.0.0.1:5432
[2023-03-14 18:40:53.137][49][debug][connection] [source/common/network/connection_impl.cc:880] [C62] connection in progress
[2023-03-14 18:40:53.137][49][debug][conn_handler] [source/server/active_tcp_listener.cc:332] [C61] new connection from 127.0.0.1:37112
[2023-03-14 18:40:53.185][49][debug][connection] [source/common/network/connection_impl.cc:669] [C62] connected
[2023-03-14 18:40:53.185][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:289] [C62] attaching to next stream
[2023-03-14 18:40:53.185][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:175] [C62] creating stream
[2023-03-14 18:40:53.185][49][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:660] [C61] TCP:onUpstreamEvent(), requestedServerName:
[2023-03-14 18:40:53.876][49][debug][connection] [source/common/network/connection_impl.cc:637] [C62] remote close
[2023-03-14 18:40:53.876][49][debug][connection] [source/common/network/connection_impl.cc:247] [C62] closing socket: 0
[2023-03-14 18:40:53.876][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:407] [C62] client disconnected, failure reason:
[2023-03-14 18:40:53.876][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:204] [C62] destroying stream: 0 remaining
[2023-03-14 18:40:53.876][49][debug][connection] [source/common/network/connection_impl.cc:137] [C61] closing data_to_write=0 type=0
[2023-03-14 18:40:53.876][49][debug][connection] [source/common/network/connection_impl.cc:247] [C61] closing socket: 1
[2023-03-14 18:40:53.876][49][debug][conn_handler] [source/server/active_tcp_listener.cc:76] [C61] adding to cleanup list
[2023-03-14 18:40:54.488][50][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:249] [C63] new tcp proxy session
[2023-03-14 18:40:54.488][50][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:397] [C63] Creating connection to cluster postgres_cluster
[2023-03-14 18:40:54.488][50][debug][pool] [source/common/conn_pool/conn_pool_base.cc:245] trying to create new connection
[2023-03-14 18:40:54.488][50][debug][pool] [source/common/conn_pool/conn_pool_base.cc:143] creating a new connection
[2023-03-14 18:40:54.488][50][debug][connection] [source/common/network/connection_impl.cc:864] [C64] connecting to 10.0.0.3:5432
[2023-03-14 18:40:54.488][50][debug][connection] [source/common/network/connection_impl.cc:880] [C64] connection in progress
[2023-03-14 18:40:54.488][50][debug][conn_handler] [source/server/active_tcp_listener.cc:332] [C63] new connection from 127.0.0.1:37122
[2023-03-14 18:40:54.535][50][debug][connection] [source/common/network/connection_impl.cc:669] [C64] connected
[2023-03-14 18:40:54.535][50][debug][pool] [source/common/conn_pool/conn_pool_base.cc:289] [C64] attaching to next stream
[2023-03-14 18:40:54.535][50][debug][pool] [source/common/conn_pool/conn_pool_base.cc:175] [C64] creating stream
[2023-03-14 18:40:54.535][50][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:660] [C63] TCP:onUpstreamEvent(), requestedServerName:
[2023-03-14 18:40:55.211][50][debug][connection] [source/common/network/connection_impl.cc:637] [C64] remote close
[2023-03-14 18:40:55.211][50][debug][connection] [source/common/network/connection_impl.cc:247] [C64] closing socket: 0
[2023-03-14 18:40:55.211][50][debug][pool] [source/common/conn_pool/conn_pool_base.cc:407] [C64] client disconnected, failure reason:
[2023-03-14 18:40:55.211][50][debug][pool] [source/common/conn_pool/conn_pool_base.cc:204] [C64] destroying stream: 0 remaining
[2023-03-14 18:40:55.211][50][debug][connection] [source/common/network/connection_impl.cc:137] [C63] closing data_to_write=0 type=0
[2023-03-14 18:40:55.211][50][debug][connection] [source/common/network/connection_impl.cc:247] [C63] closing socket: 1
[2023-03-14 18:40:55.211][50][debug][conn_handler] [source/server/active_tcp_listener.cc:76] [C63] adding to cleanup list
[2023-03-14 18:40:55.817][1][debug][main] [source/server/server.cc:215] flushing stats
[2023-03-14 18:40:56.311][49][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:249] [C65] new tcp proxy session
[2023-03-14 18:40:56.311][49][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:397] [C65] Creating connection to cluster postgres_cluster
[2023-03-14 18:40:56.311][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:245] trying to create new connection
[2023-03-14 18:40:56.311][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:143] creating a new connection
[2023-03-14 18:40:56.311][49][debug][connection] [source/common/network/connection_impl.cc:864] [C66] connecting to 10.0.0.2:5432
[2023-03-14 18:40:56.311][49][debug][connection] [source/common/network/connection_impl.cc:880] [C66] connection in progress
[2023-03-14 18:40:56.311][49][debug][conn_handler] [source/server/active_tcp_listener.cc:332] [C65] new connection from 127.0.0.1:37128
[2023-03-14 18:40:56.362][49][debug][connection] [source/common/network/connection_impl.cc:669] [C66] connected
[2023-03-14 18:40:56.362][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:289] [C66] attaching to next stream
[2023-03-14 18:40:56.362][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:175] [C66] creating stream
[2023-03-14 18:40:56.362][49][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:660] [C65] TCP:onUpstreamEvent(), requestedServerName:
[2023-03-14 18:40:57.089][49][debug][connection] [source/common/network/connection_impl.cc:637] [C66] remote close
[2023-03-14 18:40:57.089][49][debug][connection] [source/common/network/connection_impl.cc:247] [C66] closing socket: 0
[2023-03-14 18:40:57.089][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:407] [C66] client disconnected, failure reason:
[2023-03-14 18:40:57.089][49][debug][pool] [source/common/conn_pool/conn_pool_base.cc:204] [C66] destroying stream: 0 remaining
[2023-03-14 18:40:57.089][49][debug][connection] [source/common/network/connection_impl.cc:137] [C65] closing data_to_write=0 type=0
[2023-03-14 18:40:57.089][49][debug][connection] [source/common/network/connection_impl.cc:247] [C65] closing socket: 1
[2023-03-14 18:40:57.089][49][debug][conn_handler] [source/server/active_tcp_listener.cc:76] [C65] adding to cleanup list
As you can see, Envoy still distributes connections evenly across all cluster endpoints, regardless of their health check status.
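To back this up with numbers, here is a rough Python sketch that counts upstream connection attempts per endpoint in a debug log. The helper name count_upstreams is hypothetical, and the regex assumes the "connecting to <address>" message format seen in the excerpt above; the sample contains just the three relevant lines from it.

```python
import re
from collections import Counter

# Assumption: upstream attempts are logged as "connecting to <ip>:<port>".
CONNECT_RE = re.compile(r"connecting to (\S+)")


def count_upstreams(log_text: str) -> Counter:
    """Count how many times each upstream address was dialed."""
    return Counter(CONNECT_RE.findall(log_text))


sample_log = """\
[2023-03-14 18:40:53.137][49][debug][connection] [source/common/network/connection_impl.cc:864] [C62] connecting to 10.0.0.1:5432
[2023-03-14 18:40:54.488][50][debug][connection] [source/common/network/connection_impl.cc:864] [C64] connecting to 10.0.0.3:5432
[2023-03-14 18:40:56.311][49][debug][connection] [source/common/network/connection_impl.cc:864] [C66] connecting to 10.0.0.2:5432
"""

counts = count_upstreams(sample_log)
for addr, n in sorted(counts.items()):
    print(addr, n)
```

In this excerpt each of the three endpoints is dialed once, including the two that failed the active health check.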
On Tuesday, March 14, 2023 at 17:11:21 UTC+2, Stephan Zuercher wrote: