Unable to achieve FALLBACK POLICY with DEFAULT_SUBSET


mrs.sa...@gmail.com

Jun 6, 2018, 3:22:25 AM
to envoy-users
Hi,

I am trying to achieve the fallback policy with subset load balancing in the v2 API. In my case the request is not falling back to the default subset; I am getting a '503'.

I have the following route configuration, where I set "type"="dev3" and "path"="/v1/heartbeat", which do not match any of the existing subsets. The idea is to hit the default subset created with "type"="test".


    "routes": [
                {
                  "match": {
                    "prefix": "/"
                  },
                  "route": {
                    "cluster": "some_service",
                    "metadata_match": {
                     "filter_metadata": {
                       "envoy.lb": {
                         "type":"dev3",
                         "path":"/v1/heartbeat"
                     }
                    }
                   },
                   "timeout": "120s",
                   "retry_policy": {
                      "retry_on": "gateway-error",
                      "num_retries": 120
                    }
                  }
                }
              ]


and the EDS cluster configuration is as shown below:

      {
        "name" : "some_service",
        "type": "EDS",
        "eds_cluster_config": {
          "eds_config": {
            "api_config_source": {
              "api_type": "REST",
              "cluster_names": ["xds_cluster"],
              "refresh_delay": "1s"
            }
          }
        },
        "connect_timeout": "120s",
        "lb_policy": "ROUND_ROBIN",
        "lb_subset_config": {
         "fallback_policy": "DEFAULT_SUBSET",
         "default_subset": {
           "type": "test"
        },
        "subset_selectors": [
          { "keys": ["type", "path" ]}
        ]
       }
      }


I have a set of hosts from EDS with the following metadata:

{
   "version_info": "0",
   "resources": [
   {
     "@type": "type.googleapis.com/envoy.api.v2.ClusterLoadAssignment",
     "cluster_name": "some_service",
     "endpoints": [
     {
      "lb_endpoints": [
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "host_ip1",
          "port_value": 80
         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
           "type": "dev",
           "path" : "/v1/heartbeat"
         }
        }
       }
      },
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "host_ip2",
          "port_value": 80
         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
           "type": "dev",
           "path" : "/v1/heartbeat"
         }
        }
       }
      },
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "host_ip3",
          "port_value": 80
         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
           "type": "test",
           "path" : "/v1/heartbeat"
         }
        }
       }
      }

     ]
    }
   ]
  }
 ]
}




The Envoy trace clearly shows that it has created a fallback load balancer for type="test" and a subset load balancer for "path" and "type":


[2018-06-06 06:46:40.113][15824][debug][upstream] source/common/upstream/subset_lb.cc:147] subset lb: creating fallback load balancer for type="test"
[2018-06-06 06:46:40.114][15824][debug][upstream] source/common/upstream/subset_lb.cc:238] subset lb: creating load balancer for path="/v1/heartbeat", type="dev"


When I send a request for /v1/heartbeat it fails with a '503'. As per the documentation, it should select the default subset, which is created for metadata "type"="test".



[2018-06-06 06:48:57.761][15851][trace][http] source/common/http/http1/codec_impl.cc:361] [C2] headers complete
[2018-06-06 06:48:57.761][15851][trace][http] source/common/http/http1/codec_impl.cc:292] [C2] completed header: key=Accept value=*/*
[2018-06-06 06:48:57.761][15851][trace][http] source/common/http/http1/codec_impl.cc:543] [C2] message complete
[2018-06-06 06:48:57.761][15851][debug][http] source/common/http/conn_manager_impl.cc:785] [C2][S10979014826725386109] request end stream
[2018-06-06 06:48:57.761][15851][debug][http] source/common/http/conn_manager_impl.cc:451] [C2][S10979014826725386109] request headers complete (end_stream=true):
':authority', 'localhost:8080'
'user-agent', 'curl/7.47.0'
'accept', '*/*'
':path', '/v1/heartbeat'
':method', 'GET'

[2018-06-06 06:48:57.761][15851][debug][router] source/common/router/router.cc:237] [C2][S10979014826725386109] cluster 'some_service' match for URL '/v1/heartbeat'
[2018-06-06 06:48:57.761][15851][debug][upstream] source/common/upstream/cluster_manager_impl.cc:903] no healthy host for HTTP connection pool
[2018-06-06 06:48:57.761][15851][debug][http] source/common/http/conn_manager_impl.cc:970] [C2][S10979014826725386109] encoding headers via codec (end_stream=false):
':status', '503'
'content-length', '19'
'content-type', 'text/plain'
'date', 'Wed, 06 Jun 2018 06:48:57 GMT'
'server', 'envoy'


I have referred to the following documents:

https://www.envoyproxy.io/docs/envoy/v1.5.0/intro/arch_overview/load_balancing.html

https://github.com/envoyproxy/envoy/blob/master/source/docs/subset_load_balancer.md




Any help on this is greatly appreciated!

Matt Klein

Jun 7, 2018, 12:34:10 PM
to mrs.sa...@gmail.com, envoy-users
Sounds like there are no healthy hosts to route to. This is orthogonal to any subsets.


mrs.sa...@gmail.com

Jun 12, 2018, 6:54:30 AM
to envoy-users
The host which is selected for the default_subset is also part of another subset, and it is reachable; Envoy is able to route requests to it. However, if the same host is selected under the fallback policy, it fails with "no healthy host for HTTP connection pool". I also checked the host externally and am able to ping it.

Is this a bug, or am I missing something in my configuration?

Stephan Zuercher

Jun 12, 2018, 6:08:47 PM
to mrs.sa...@gmail.com, envoy-users
I reproduced your configuration locally and verified that it does work as expected.

Can you enable the admin listener (if you haven't already) and look at /clusters? Even without making any requests to Envoy you should be able to see entries for all the endpoints in your cluster. If they're missing (or if the one you expect to be in the default subset is missing) then the problem is with the result of the EDS query. I noticed that your endpoint config uses host names in the socket address. I don't believe that will work with EDS (or any cluster type besides strict-dns or logical-dns).
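(If the admin listener isn't enabled yet, a minimal admin block in the bootstrap config looks roughly like the sketch below; port 8001 is just an example. With it in place, "curl localhost:8001/clusters" lists every endpoint Envoy actually accepted from EDS.)

      "admin": {
        "access_log_path": "/dev/null",
        "address": {
          "socket_address": { "address": "127.0.0.1", "port_value": 8001 }
        }
      }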

Stephan



mrs.sa...@gmail.com

Jun 13, 2018, 1:10:08 AM
to envoy-users
Hi Stephan,

Thanks for looking into this. Just to mask the original IP addresses I wrote "host_ip1" etc.; in reality I am using the actual IP addresses.

So the original EDS response is something like the one below:



{
   "version_info": "0",
   "resources": [
   {
     "@type": "type.googleapis.com/envoy.api.v2.ClusterLoadAssignment",
     "cluster_name": "some_service",
     "endpoints": [
     {
      "lb_endpoints": [
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "172.33.209.224",
          "port_value": 16400

         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
           "type": "dev",
           "path" : "/v1/heartbeat"
         }
        }
       }
      },
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "172.33.233.238",
          "port_value": 16400
         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
           "type": "dev",
           "path" : "/v1/heartbeat"
         }
        }
       }
      },
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "172.33.233.238",           // Note: this IP is same as the above
          "port_value": 16400
         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
           "type": "test",                                  // metadata "type" is "test" which is my default subset
           "path" : "/v1/heartbeat"
         }
        }
       }
      }

     ]
    }
   ]
  }
 ]
}


output of /clusters

some_service::default_priority::max_connections::1024
some_service::default_priority::max_pending_requests::1024
some_service::default_priority::max_requests::1024
some_service::default_priority::max_retries::3
some_service::high_priority::max_connections::1024
some_service::high_priority::max_pending_requests::1024
some_service::high_priority::max_requests::1024
some_service::high_priority::max_retries::3
some_service::added_via_api::false
some_service::172.33.209.224:16400::cx_active::0
some_service::172.33.209.224:16400::cx_connect_fail::0
some_service::172.33.209.224:16400::cx_total::0
some_service::172.33.209.224:16400::rq_active::0
some_service::172.33.209.224:16400::rq_error::0
some_service::172.33.209.224:16400::rq_success::0
some_service::172.33.209.224:16400::rq_timeout::0
some_service::172.33.209.224:16400::rq_total::0
some_service::172.33.209.224:16400::health_flags::healthy
some_service::172.33.209.224:16400::weight::1
some_service::172.33.209.224:16400::region::
some_service::172.33.209.224:16400::zone::
some_service::172.33.209.224:16400::sub_zone::
some_service::172.33.209.224:16400::canary::false
some_service::172.33.209.224:16400::success_rate::-1
some_service::172.33.233.238:16400::cx_active::0
some_service::172.33.233.238:16400::cx_connect_fail::0
some_service::172.33.233.238:16400::cx_total::0
some_service::172.33.233.238:16400::rq_active::0
some_service::172.33.233.238:16400::rq_error::0
some_service::172.33.233.238:16400::rq_success::0
some_service::172.33.233.238:16400::rq_timeout::0
some_service::172.33.233.238:16400::rq_total::0
some_service::172.33.233.238:16400::health_flags::healthy
some_service::172.33.233.238:16400::weight::1
some_service::172.33.233.238:16400::region::
some_service::172.33.233.238:16400::zone::
some_service::172.33.233.238:16400::sub_zone::
some_service::172.33.233.238:16400::canary::false
some_service::172.33.233.238:16400::success_rate::-1



Both hosts have health_flags set to "healthy".


-- Showing the result of the success case where it selects the subset

-- resetting the counters first

# curl localhost:8001/reset_counters
OK


-- my route metadata_match is "type":"dev" and "path":"/v1/heartbeat"; this selects the subset and gives 200 OK as shown below


             "route": {
                    "cluster": "some_service",
                    "metadata_match": {
                     "filter_metadata": {
                       "envoy.lb": {
                         "type":"dev",
                         "path":"/v1/heartbeat"

                     }
                    }
                   }

-- sending request to envoy

# curl -X GET -v -k https://localhost:16600/v1/heartbeat

< HTTP/1.1 200 OK
< access-control-allow-origin: *
< access-control-allow-methods: GET
< access-control-allow-headers: Content-Type, Accept, X-Requested-With
< server: envoy
< date: Wed, 13 Jun 2018 04:50:31 GMT
< content-type: application/json
< content-length: 100
< x-envoy-upstream-service-time: 50
<

{"version":"0.1.111-201805020432","build":"2018-05-21T07:13:11Z","hostname":"dev-cluster-sahana-1"}


-- Snippet of /stats

# curl localhost:8001/stats      
 
cluster.some_service.bind_errors: 0
cluster.some_service.external.upstream_rq_200: 1
cluster.some_service.external.upstream_rq_2xx: 1

cluster.some_service.lb_healthy_panic: 0
cluster.some_service.lb_local_cluster_not_ok: 0
cluster.some_service.lb_recalculate_zone_structures: 0
cluster.some_service.lb_subsets_active: 17
cluster.some_service.lb_subsets_created: 0
cluster.some_service.lb_subsets_fallback: 0
cluster.some_service.lb_subsets_removed: 0
cluster.some_service.lb_subsets_selected: 1
cluster.some_service.lb_zone_cluster_too_small: 0
cluster.some_service.lb_zone_no_capacity_left: 0
cluster.some_service.lb_zone_number_differs: 0
cluster.some_service.lb_zone_routing_all_directly: 0
cluster.some_service.lb_zone_routing_cross_zone: 0
cluster.some_service.lb_zone_routing_sampled: 0

 

-- Showing the result of FALLBACK policy

If I change my route metadata_match as shown:

                       "metadata_match": {
                                "filter_metadata": {
                                      "envoy.lb": {
                                        "type": "dev3",                              // changed the type to "dev3" so that it falls back to the default subset
                                        "path": "/v1/heartbeat"
                                  }
                                }
                              }


-- reset the counters

# curl localhost:8001/reset_counters
OK



output of /clusters

some_service::default_priority::max_connections::1024
some_service::default_priority::max_pending_requests::1024
some_service::default_priority::max_requests::1024
some_service::default_priority::max_retries::3
some_service::high_priority::max_connections::1024
some_service::high_priority::max_pending_requests::1024
some_service::high_priority::max_requests::1024
some_service::high_priority::max_retries::3
some_service::added_via_api::false
some_service::172.33.209.224:16400::cx_active::0
some_service::172.33.209.224:16400::cx_connect_fail::0
some_service::172.33.209.224:16400::cx_total::0
some_service::172.33.209.224:16400::rq_active::0
some_service::172.33.209.224:16400::rq_error::0
some_service::172.33.209.224:16400::rq_success::0
some_service::172.33.209.224:16400::rq_timeout::0
some_service::172.33.209.224:16400::rq_total::0
some_service::172.33.209.224:16400::health_flags::healthy
some_service::172.33.209.224:16400::weight::1
some_service::172.33.209.224:16400::region::
some_service::172.33.209.224:16400::zone::
some_service::172.33.209.224:16400::sub_zone::
some_service::172.33.209.224:16400::canary::false
some_service::172.33.209.224:16400::success_rate::-1
some_service::172.33.233.238:16400::cx_active::0
some_service::172.33.233.238:16400::cx_connect_fail::0
some_service::172.33.233.238:16400::cx_total::0
some_service::172.33.233.238:16400::rq_active::0
some_service::172.33.233.238:16400::rq_error::0
some_service::172.33.233.238:16400::rq_success::0
some_service::172.33.233.238:16400::rq_timeout::0
some_service::172.33.233.238:16400::rq_total::0
some_service::172.33.233.238:16400::health_flags::healthy
some_service::172.33.233.238:16400::weight::1
some_service::172.33.233.238:16400::region::
some_service::172.33.233.238:16400::zone::
some_service::172.33.233.238:16400::sub_zone::
some_service::172.33.233.238:16400::canary::false
some_service::172.33.233.238:16400::success_rate::-1


-- Now sending the heartbeat request again

# curl -X GET -v -k https://localhost:16600/v1/heartbeat

< HTTP/1.1 503 Service Unavailable
< content-length: 19
< content-type: text/plain
< date: Wed, 13 Jun 2018 05:00:54 GMT
< server: envoy
<
* Connection #0 to host localhost left intact

no healthy upstream


# curl localhost:8001/stats

cluster.some_service.bind_errors: 0
cluster.some_service.external.upstream_rq_503: 1
cluster.some_service.external.upstream_rq_5xx: 1

cluster.some_service.lb_healthy_panic: 1
cluster.some_service.lb_local_cluster_not_ok: 0
cluster.some_service.lb_recalculate_zone_structures: 0
cluster.some_service.lb_subsets_active: 17
cluster.some_service.lb_subsets_created: 0
cluster.some_service.lb_subsets_fallback: 1
cluster.some_service.lb_subsets_removed: 0
cluster.some_service.lb_subsets_selected: 0
cluster.some_service.lb_zone_cluster_too_small: 0
cluster.some_service.lb_zone_no_capacity_left: 0
cluster.some_service.lb_zone_number_differs: 0
cluster.some_service.lb_zone_routing_all_directly: 0
cluster.some_service.lb_zone_routing_cross_zone: 0
cluster.some_service.lb_zone_routing_sampled: 0


When the host IP I am using is the same, why does it become unhealthy for the default subset alone?

Thanks
Sahana

Stephan Zuercher

Jun 13, 2018, 1:42:32 AM
to mrs.sa...@gmail.com, envoy-users
What's happening is that the second host with the same IP and port is being treated as a duplicate (in BaseDynamicClusterImpl::updateDynamicHostList) and it's being ignored.

The subset lb isn't involved in this -- it happens when the host list is loaded (or updated). Based on the comment in that function it's done on purpose in case a DNS-based cluster or a "bad SDS implementation" returns duplicates. The net result is that only the first 172.33.233.238:16400 host is actually in the cluster and the subset lb cannot find the second one during fallback, which results in an empty subset and the 503 error reporting no healthy hosts.

One way to work around this is to use a different metadata key to identify the host for fallback (e.g. add "fallback=true" to that host and use that as the default subset criteria).



mrs.sa...@gmail.com

Jun 13, 2018, 2:40:55 AM
to envoy-users
Hi,

I quickly tried setting fallback=true, but still no luck


      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "172.33.233.238",
          "port_value": 16500
         }
        }
       },
        "metadata": {
          "filter_metadata": {
            "envoy.lb" : {
              "fallback": "true"
          }
         }
        }

and the envoy configuration looks like this

    "lb_subset_config": {
          "fallback_policy": "DEFAULT_SUBSET",
          "default_subset": {
             "fallback" : "true"
          },

          "subset_selectors": [
            { "keys": [ "type", "path" ] }
          ]
        }

# curl localhost:8001/stats

cluster.some_service.bind_errors: 0
cluster.some_service.external.upstream_rq_503: 1
cluster.some_service.external.upstream_rq_5xx: 1
cluster.some_service.lb_healthy_panic: 1
cluster.some_service.lb_local_cluster_not_ok: 0
cluster.some_service.lb_recalculate_zone_structures: 0
cluster.some_service.lb_subsets_active: 17
cluster.some_service.lb_subsets_created: 0
cluster.some_service.lb_subsets_fallback: 1
cluster.some_service.lb_subsets_removed: 0
cluster.some_service.lb_subsets_selected: 0
cluster.some_service.lb_zone_cluster_too_small: 0
cluster.some_service.lb_zone_no_capacity_left: 0
cluster.some_service.lb_zone_number_differs: 0
cluster.some_service.lb_zone_routing_all_directly: 0
cluster.some_service.lb_zone_routing_cross_zone: 0
cluster.some_service.lb_zone_routing_sampled: 0




mrs.sa...@gmail.com

Jun 13, 2018, 2:44:24 AM
to envoy-users
Additional information...


I see the following fallback subset getting created in the envoy trace


[2018-06-13 06:35:36.958][8425][debug][upstream] source/common/upstream/subset_lb.cc:238] subset lb: creating load balancer for path="/v1/heartbeat", type="dev"
[2018-06-13 06:35:36.958][8435][debug][upstream] source/common/upstream/cluster_manager_impl.cc:682] adding TLS initial cluster some_service
[2018-06-13 06:35:36.958][8433][debug][upstream] source/common/upstream/subset_lb.cc:147] subset lb: creating fallback load balancer for fallback="true"
[2018-06-13 06:35:36.958][8426][debug][upstream] source/common/upstream/subset_lb.cc:238] subset lb: creating load balancer for path="/v1/heartbeat", type="dev"

Stephan Zuercher

Jun 13, 2018, 12:29:59 PM
to sahana Santhosh, envoy-users
You still can't have multiple hosts with the same IP:port -- only one will end up in the cluster. I meant for you to add the metadata to the first host:

           "fallback": "test"
         }
        }
       }
      }
     ]
    }
   ]
  }
 ]
}
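(Spelled out, a fuller sketch of that first endpoint entry: it keeps its existing type/path metadata and additionally carries the fallback key; the cluster's default_subset would then be { "fallback": "true" } as in the earlier attempt, and any value works as long as it matches default_subset.)

      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "172.33.209.224",
          "port_value": 16400
         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
           "type": "dev",
           "path" : "/v1/heartbeat",
           "fallback": "true"
         }
        }
       }
      }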




mrs.sa...@gmail.com

Jun 14, 2018, 12:08:24 AM
to envoy-users
Hi Stephan,

Yes, you are correct, we can't have multiple hosts with the same IP. Thank you so much for looking into this, it helped me a lot :)

mrs.sa...@gmail.com

Jul 11, 2018, 5:49:45 AM
to envoy-users
Hi Stephan,

I had to start this discussion again :)

As per our requirement we end up having the same IP address in multiple subsets. For example, I have an EDS response like the one below:



{
   "version_info": "0",
   "resources": [
   {
     "@type": "type.googleapis.com/envoy.api.v2.ClusterLoadAssignment",
     "cluster_name": "some_service",
     "endpoints": [
     {
      "lb_endpoints": [
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "172.03.88.173",
          "port_value": 16503
         }
        }
       },
        "metadata": {
          "filter_metadata": {
            "envoy.lb" : {
              "url" : "/v3/Mllib-try1"
            }
          }
        }
       },
       {
          "endpoint": {
            "address": {
              "socket_address": {
                "address": "172.03.94.255",
                "port_value": 16503
              }
            }
          },
          "metadata": {
            "filter_metadata": {
              "envoy.lb" : {
                "url" : "/v3/Mllib-try1"
              }
            }
          }
        },
       {
          "endpoint": {
            "address": {
              "socket_address": {
                "address": "172.03.88.173",               --> IP address as same as metadata "/v3/Mllib-try1"
                "port_value": 16503
              }
            }
          },
          "metadata": {
            "filter_metadata": {
              "envoy.lb" : {
                "url" : "/v3/Mllib-try2"
              }
            }
          }
        },
       {
          "endpoint": {
            "address": {
              "socket_address": {
                "address": "172.03.94.255",         IP address as same as metadata "/v3/Mllib-try1"
                "port_value": 16503
              }
            }
          },
          "metadata": {
            "filter_metadata": {
              "envoy.lb" : {
                "url" : "/v3/Mllib-try2"
              }
            }
          }
        },
        {
          "endpoint": {
            "address": {
              "socket_address": {
                "address": "172.03.69.1",
                "port_value": 16503
              }
            }
          },
          "metadata": {
            "filter_metadata": {
              "envoy.lb" : {
                "fallback" : "default_fallback"
              }
            }
          }
        },
        {
          "endpoint": {
            "address": {
              "socket_address": {
                "address": "172.03.112.14",
                "port_value": 16503
              }
            }
          },
          "metadata": {
            "filter_metadata": {
              "envoy.lb" : {
                "fallback" : "default_fallback"
              }
            }
          }
        }
      ]
     }
    ]
   }
  ]
}

Envoy config:


 "connect_timeout": "120s",
        "lb_policy": "ROUND_ROBIN",
        "lb_subset_config": {
          "fallback_policy": "DEFAULT_SUBSET",
          "default_subset": {
             "fallback" : "default_fallback"
          },
          "subset_selectors": [
            { "keys": [ "url" ] }
          ]
        }



Based on the "url" metadata key value we create multiple subsets. During testing we found that the same IP address ends up being part of multiple subsets.
However, from the Envoy logs I see that the subsets are not getting created when an IP address is duplicated (which you mentioned in the earlier conversation).

In the above, I was hoping it would form a first subset for url "/v3/Mllib-try1" containing {172.03.88.173, 172.03.94.255}, and a second subset for url "/v3/Mllib-try2". However, no subset was created at all for url value "/v3/Mllib-try2". It looks like this subset was eliminated because its IP addresses are duplicates.


envoy trace:

[2018-07-11 06:40:57.796][109][debug][upstream] source/common/upstream/subset_lb.cc:147] subset lb: creating fallback load balancer for fallback="default_fallback"
[2018-07-11 06:40:57.838][109][debug][upstream] source/common/upstream/subset_lb.cc:238] subset lb: creating load balancer for url="/v3/Mllib-try1"

Is there any way to allow the same IP address to be part of multiple subsets? The document below gives the impression that the same host can be part of multiple subsets based on metadata match criteria; I see that host "e7" is part of 3 subsets:

https://github.com/envoyproxy/envoy/blob/master/source/docs/subset_load_balancer.md

We are blocked by the elimination of duplicate IPs across subsets; it would help if we could find a way to overcome this.

Thanks
Sahana






Stephan Zuercher

Jul 23, 2018, 10:39:18 PM
to sahana Santhosh, envoy-users
Sorry for the long delay. Hosts are uniquely identified by their address, and this happens before the subset load balancer sees the hosts. Thus, repeating the same address with different metadata will not work, as you've discovered.

You could instead use the url as the metadata key, with some fixed value.
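(One reading of that suggestion, sketched below: each URL becomes a metadata key with a fixed value, so a single endpoint entry can belong to several subsets without repeating its address. Note that every such key still has to be listed in subset_selectors, which is the limitation raised in the next message.)

      "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
            "/v3/Mllib-try1": "true",
            "/v3/Mllib-try2": "true"
          }
        }
      }

      "subset_selectors": [
        { "keys": [ "/v3/Mllib-try1" ] },
        { "keys": [ "/v3/Mllib-try2" ] }
      ]

A route would then use metadata_match such as { "/v3/Mllib-try1": "true" } to select the corresponding subset.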

Stephan



sahana Santhosh

Jul 24, 2018, 12:30:50 AM
to Stephan Zuercher, envoy-users
Thanks for responding Stephan!

Yes, I can construct the url as a metadata key in the EDS response; however, I think in the Envoy configuration I would need support for setting the subset_selector "keys" dynamically.

mrs.sa...@gmail.com

Jul 31, 2018, 2:51:45 AM
to envoy-users
Hi Stephan,

I thought of giving you a little background on our requirement for v2 Envoy. I have posted the same in issue #2436.


[Attached screenshot: screen_shot_2018-07-19_at_4 11 05_pm, a diagram of the 3 endpoints and the URLs assigned to each]

Let's assume I have 3 endpoints in a cluster (ip:port) and multiple URLs assigned to each endpoint (you can think of a URL as some data/file which resides on the mentioned ip:port).

The number of URLs assigned to each endpoint grows at runtime, and the same URL can be part of multiple endpoints (we duplicate it across endpoints as a backup).

As shown in the diagram, Endpoint 1 contains URL1 and URL2, and URL2 is also part of Endpoint 2.

When a user sends a request to Envoy, URL1 or URL2 ... URLn is part of the HTTP request. We fetch this information via dynamic metadata matching and then route the request to the corresponding endpoint which contains the respective URL.

The issue here is that the number of URLs grows at runtime and is not fixed, so we cannot configure the exact metadata key names in subset_selectors.

We could think of 2 possible ways of solving this:

  1. Create a metadata key for every URL in the EDS response. For example:

Endpoint 1 {
ip: "1.2.3.4"
port: 1600
}
metadata {
"key1" : "URL 1",
"key2" : "URL 2" —> URLs can grow more than 10,000 and with that our metadata key count.
}

similarly …
Endpoint 2 {
ip: "1.2.3.5"
port: 1600
}
metadata {
"key2" : "URL 2",
"key3" : "URL 3"
}

However, in this approach we don't know how many keys will get generated at runtime; it could be around 10,000, which would be a huge number to statically configure in the Envoy config subset_selectors. We might also have to put a threshold on how far the number of keys can grow, but then we would lose the URLs that go beyond that threshold.

  2. The second approach is to create one cluster for each URL. For example, if I have URL1, URL2, URL3 ... URLn, we would construct that many clusters, each containing the corresponding endpoints.

In this approach we solve the issue of the duplicate IP:port (the same ip:port can appear in multiple clusters); however, the main disadvantage is that the number of clusters grows with the number of URLs, i.e. if there are 10,000 URLs we will end up creating 10,000 cluster entries.

Do you have any suggestions to overcome this situation? What is the best approach we can take in v2 Envoy to fit our requirement? Any help on this will be greatly appreciated.


-Sahana

Stephan Zuercher

Aug 7, 2018, 2:52:50 PM
to sahana Santhosh, envoy-users
I think option 2 is better suited to your use case. You'll probably want to avoid having a route per URL by writing a custom HTTP filter (or using the existing Lua filter) to generate a header containing the name of the cluster you want to use for a specific URL.
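(A rough sketch of the Lua approach, assuming a header name of x-target-cluster; the filter name "envoy.lua" and its "inline_code" field are the v2 Lua filter config, while the path-to-cluster mapping itself is purely illustrative and would be replaced by your own naming scheme.)

      "http_filters": [
        {
          "name": "envoy.lua",
          "config": {
            "inline_code": "function envoy_on_request(request_handle)\n  local path = request_handle:headers():get(\":path\")\n  request_handle:headers():add(\"x-target-cluster\", \"cluster_for\" .. path)\nend"
          }
        },
        { "name": "envoy.router" }
      ]

The matching route then uses "cluster_header": "x-target-cluster" instead of a fixed "cluster" name.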





sahana Santhosh

Aug 8, 2018, 1:01:28 AM
to Stephan Zuercher, envoy-users
Do we have options to route to a default cluster (similar to the default subset) if none of the URLs match an existing cluster name?

Stephan Zuercher

Aug 13, 2018, 5:50:15 PM
to sahana Santhosh, envoy-users
You can write a route that matches on the absence of the cluster header and route to any cluster you'd like.
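(A sketch of an equivalent arrangement, reusing the hypothetical x-target-cluster header and a hypothetical default_cluster: a header matcher with only a name matches whenever the header is present, and routes are evaluated in order, so requests without the header fall through to the default route.)

      "routes": [
        {
          "match": {
            "prefix": "/",
            "headers": [ { "name": "x-target-cluster" } ]
          },
          "route": { "cluster_header": "x-target-cluster" }
        },
        {
          "match": { "prefix": "/" },
          "route": { "cluster": "default_cluster" }
        }
      ]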