gRFC A1: HTTP CONNECT proxy support


Mark D. Roth

Jan 18, 2017, 5:12:10 PM
to grpc. io, Julien Boeuf, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li
I've created a gRFC describing how HTTP CONNECT proxies will be supported in gRPC:

https://github.com/grpc/proposal/pull/4

Please keep discussion in this thread.  Thanks!

--
Mark D. Roth <ro...@google.com>
Software Engineer
Google, Inc.

Julien Boeuf

Jan 18, 2017, 5:18:38 PM
to Mark D. Roth, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li
Thanks, I saw this. I'll comment on the doc.

BTW, I'm at an offsite today (and I was yesterday), but this is really high on my priority list.

Cheers,

    Julien.

Mark D. Roth

Jan 19, 2017, 11:08:02 AM
to Julien Boeuf, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li
Julien,

The gRFC process says that all discussion should happen in this thread, rather than in the PR.  So I'll reply to your comments here.

I agree with you that the proxy mapper could set the HTTP CONNECT argument to a server name instead of to an IP address.  However, that would not be enough to address the case where the servers' DNS information is not available, at least not in the general case, because the client still needs to know the set of server addresses in order to open the right set of connections to load-balance across.

As you and I have discussed, in the specific case where the grpclb load balancing policy is in use, then you could in principle make this work, because the set of server addresses will actually come from the load balancers instead of from the name resolver.  However, this would require a number of additional hacks:
  • The name resolver would somehow have to know that when you request a load balanced name, it should return the address of the proxy but with the "is_balancer" bit set.
  • The proxy mapper would need some way to differentiate between the connections to the load balancers and the connections to the backend servers, so that it could set the HTTP CONNECT argument to the server name for the load balancer connections and to the IP address for the backend server connections.
  • The proxy itself would have to know how to resolve the internal name of the load balancers.
And even once all of those hacks are implemented, this approach still only works for the case where the grpclb load balancing policy is in use.  If you want to use something like round_robin instead, it won't work at all.

I continue to believe that running a gRPC-level proxy is a better solution for this use-case.

Julien Boeuf

Jan 19, 2017, 6:06:29 PM
to Mark D. Roth, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li, Adam Stubblefield
+stubblefield since he expressed interest.

Thanks Mark for the reply. Please see inline.

Cheers,

     Julien.

On Thu, Jan 19, 2017 at 8:08 AM, Mark D. Roth <ro...@google.com> wrote:
Julien,

The gRFC process says that all discussion should happen in this thread, rather than in the PR.  So I'll reply to your comments here.
Ack. Sorry about that.
 

I agree with you that the proxy mapper could set the HTTP CONNECT argument to a server name instead of to an IP address.  However, that would not be enough to address the case where the servers' DNS information is not available, at least not in the general case, because the client still needs to know the set of server addresses in order to open the right set of connections to load-balance across.

As you and I have discussed, in the specific case where the grpclb load balancing policy is in use, then you could in principle make this work, because the set of server addresses will actually come from the load balancers instead of from the name resolver.  However, this would require a number of additional hacks:
  • The name resolver would somehow have to know that when you request a load balanced name, it should return the address of the proxy but with the "is_balancer" bit set.
Correct. This can be done using a naming convention, which is a reasonable thing to do.
 
  • The proxy mapper would need some way to differentiate between the connections to the load balancers and the connections to the backend servers, so that it could set the HTTP CONNECT argument to the server name for the load balancer connections and to the IP address for the backend server connections.
Yes, that is correct. A way to do that is to work hand in hand with the resolver, which would set a well-known invalid IP address for the balancer connection (e.g. a link-local address) so that it can be processed as a special case by the proxy mapper. It's not great, but it would certainly work.
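The sentinel idea could look something like this. This is a hypothetical sketch, not gRPC code: the function name and the choice of the IPv4 link-local range (169.254.0.0/16) as the sentinel are invented here for illustration.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch of the sentinel-address scheme: the resolver
 * returns a link-local IPv4 address (169.254.0.0/16) for the balancer
 * entry, and the proxy mapper special-cases any address in that range. */
static bool is_balancer_sentinel(const char *ip) {
  struct in_addr addr;
  if (inet_pton(AF_INET, ip, &addr) != 1) return false;
  uint32_t host = ntohl(addr.s_addr);
  return (host >> 16) == 0xA9FEu; /* 169.254.x.x */
}
```

The mapper would then set the HTTP CONNECT argument to the server name when the sentinel matches, and to the literal IP otherwise.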
 
  • The proxy itself would have to know how to resolve the internal name of the load balancers.
Yes, this is totally reasonable and is one of the benefits of using HTTP CONNECT. In fact, we are using that very feature for the http_proxy env var case today.
 
And even once all of those hacks are implemented, this approach still only works for the case where the grpclb load balancing policy is in use.  If you want to use something like round_robin instead, it won't work at all.
IMO, it is OK. I don't believe that round robin is very useful if you have grpclb at your disposal. If your client is not able to properly resolve names, then round-robin is out of the equation to begin with.
 

I continue to believe that running a gRPC-level proxy is a better solution for this use-case.
I agree that this could work. However, this is no silver bullet. Here are some issues I have with this scheme.
1. This requires the deployment of a full gRPC proxy in the path as opposed to a more standard HTTP CONNECT proxy.
2. More importantly, it requires the termination of the secure session at the proxy which means that the proxy has to be fully trusted.
3. Even if the proxy is fully trusted, you will need a way to:
- carry the whole authentication information of the client from the proxy to the backend (e.g. attributes, restrictions, etc.).
- depending on your transport security protocol, you may or may not have access to something like Server Name Indication (SNI: https://en.wikipedia.org/wiki/Server_Name_Indication), which would be needed in this kind of deployment.

Mark D. Roth

Jan 20, 2017, 10:17:18 AM
to Julien Boeuf, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li, Adam Stubblefield
On Thu, Jan 19, 2017 at 3:06 PM, 'Julien Boeuf' via grpc.io <grp...@googlegroups.com> wrote:
+stubblefield since he expressed interest.

Thanks Mark for the reply. Please see inline.

Cheers,

     Julien.

On Thu, Jan 19, 2017 at 8:08 AM, Mark D. Roth <ro...@google.com> wrote:
Julien,

The gRFC process says that all discussion should happen in this thread, rather than in the PR.  So I'll reply to your comments here.
Ack. Sorry about that.
 

I agree with you that the proxy mapper could set the HTTP CONNECT argument to a server name instead of to an IP address.  However, that would not be enough to address the case where the servers' DNS information is not available, at least not in the general case, because the client still needs to know the set of server addresses in order to open the right set of connections to load-balance across.

As you and I have discussed, in the specific case where the grpclb load balancing policy is in use, then you could in principle make this work, because the set of server addresses will actually come from the load balancers instead of from the name resolver.  However, this would require a number of additional hacks:
  • The name resolver would somehow have to know that when you request a load balanced name, it should return the address of the proxy but with the "is_balancer" bit set.
Correct. This can be done using a naming convention, which is a reasonable thing to do.
 
  • The proxy mapper would need some way to differentiate between the connections to the load balancers and the connections to the backend servers, so that it could set the HTTP CONNECT argument to the server name for the load balancer connections and to the IP address for the backend server connections.
Yes, that is correct. A way to do that is to work hand in hand with the resolver, which would set a well-known invalid IP address for the balancer connection (e.g. a link-local address) so that it can be processed as a special case by the proxy mapper. It's not great, but it would certainly work.
 
  • The proxy itself would have to know how to resolve the internal name of the load balancers.
Yes, this is totally reasonable and is one of the benefits of using HTTP CONNECT. In fact, we are using that very feature for the http_proxy env var case today.
 
And even once all of those hacks are implemented, this approach still only works for the case where the grpclb load balancing policy is in use.  If you want to use something like round_robin instead, it won't work at all.
IMO, it is OK. I don't believe that round robin is very useful if you have grpclb at your disposal. If your client is not able to properly resolve names, then round-robin is out of the equation to begin with.
 

I continue to believe that running a gRPC-level proxy is a better solution for this use-case.
I agree that this could work. However, this is no silver bullet. Here are some issues I have with this scheme.
1. This requires the deployment of a full gRPC proxy in the path as opposed to a more standard HTTP CONNECT proxy.

That's true, but I think that having this kind of proxy would be fairly useful in other scenarios as well.
 
2. More importantly, it requires the termination of the secure session at the proxy which means that the proxy has to be fully trusted.

Is this a significant problem, given that the proxy would be under the control of the same organization as the servers?
 
3. Even if the proxy is fully trusted, you will need a way to:
- carry the whole authentication information of the client from the proxy to the backend (e.g. attributes, restrictions etc...).
- depending on your transport security protocol, you may or may not have access to something like Server Name Indication (SNI: https://en.wikipedia.org/wiki/Server_Name_Indication) which would be needed in this kind of deployment.

Won't having those capabilities be useful in other scenarios too?


I certainly agree that there's some work that needs to be done for the gRPC-level proxy approach.  However, it seems like that work would yield a set of tools that would be generally useful in other situations -- in effect, we'd be creating new building blocks that we could compose in different ways in the future to solve other problems.  In contrast, the hacks described above that would be necessary to do the work on the gRPC client would only be useful in this particular scenario, and they would actually complicate the existing code instead of providing new building blocks that can be reused later.

I think we are in agreement that either approach could be made to work.  However, I think the gRPC-level proxy approach is cleaner and provides more long-term benefit.
 
 


Julien Boeuf

Jan 20, 2017, 12:23:24 PM
to Mark D. Roth, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li, Adam Stubblefield
This is not necessarily the case. And even if it is the same organization, such a proxy would be able to impersonate any of these connections, which basically makes it "root" on all gRPC backends that accept connections through it; this is something that we are trying hard to avoid.
On the other hand, even though an HTTP CONNECT proxy runs in a more privileged environment (since it has access to name resolution), it is not able to impersonate clients, and as such a compromised proxy has a limited blast radius.
Since a proxy lives on the edge of two security zones (e.g. less trusted on the client side, and more trusted on the backend side), it is very much subject to attacks, as it is exposed on the less trusted side.
 
 
3. Even if the proxy is fully trusted, you will need a way to:
- carry the whole authentication information of the client from the proxy to the backend (e.g. attributes, restrictions etc...).
- depending on your transport security protocol, you may or may not have access to something like Server Name Indication (SNI: https://en.wikipedia.org/wiki/Server_Name_Indication) which would be needed in this kind of deployment.

Won't having those capabilities be useful in other scenarios too?


I certainly agree that there's some work that needs to be done for the gRPC-level proxy approach.  However, it seems like that work would yield a set of tools that would be generally useful in other situations -- in effect, we'd be creating new building blocks that we could compose in different ways in the future to solve other problems.  In contrast, the hacks described above that would be necessary to do the work on the gRPC client would only be useful in this particular scenario, and they would actually complicate the existing code instead of providing new building blocks that can be reused later.
For me, the biggest 'hack' is the link-local IP address (or a marker that IP resolution did not work). For the rest, I don't believe that these are hacks. I also believe that the implications on the code are not that bad: the proxy mapper will have to return the parameters for the HTTP CONNECT (which it may have to do anyway if custom headers are needed in the CONNECT request), as opposed to returning just a new IP address and letting the framework do the HTTP CONNECT.
 

I think we are in agreement that either approach could be made to work.  However, I think the gRPC-level proxy approach is cleaner and provides more long-term benefit.
I don't think that these two approaches are equivalent in terms of security. While the gRPC-level proxy could be useful, it may not fulfill some security requirements, as I tried to explain above. On the other hand, the trust that we put in a TCP-level proxy is much more tunable.
 

 

Mark D. Roth

Jan 20, 2017, 2:36:58 PM
to Julien Boeuf, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li, Adam Stubblefield
That's a good point.  I guess the trade-off here is that in the gRPC-level proxy case, you would no longer expose individual servers to attacks, since they'd be hidden behind the proxy.  But a successful attack on any individual service would only compromise that service, not every service behind the proxy, so perhaps this is a worthwhile trade-off.
 
 
 
3. Even if the proxy is fully trusted, you will need a way to:
- carry the whole authentication information of the client from the proxy to the backend (e.g. attributes, restrictions etc...).
- depending on your transport security protocol, you may or may not have access to something like Server Name Indication (SNI: https://en.wikipedia.org/wiki/Server_Name_Indication) which would be needed in this kind of deployment.

Won't having those capabilities be useful in other scenarios too?


I certainly agree that there's some work that needs to be done for the gRPC-level proxy approach.  However, it seems like that work would yield a set of tools that would be generally useful in other situations -- in effect, we'd be creating new building blocks that we could compose in different ways in the future to solve other problems.  In contrast, the hacks described above that would be necessary to do the work on the gRPC client would only be useful in this particular scenario, and they would actually complicate the existing code instead of providing new building blocks that can be reused later.
For me, the biggest 'hack' is the link-local IP address (or a marker that IP resolution did not work). For the rest, I don't believe that these are hacks. I also believe that the implications on the code are not that bad: the proxy mapper will have to return the parameters for the HTTP CONNECT (which it may have to do anyway if custom headers are needed in the CONNECT request), as opposed to returning just a new IP address and letting the framework do the HTTP CONNECT.

A proxy mapper will pretty much always need to return the HTTP CONNECT argument anyway, so that's not a problem from my perspective.

I agree with you that the main code hack here is having some sort of "sentinel" address returned by the resolver, and that hack has to live in two places: both the resolver and the proxy mapper.  But in addition, this is still only a partial solution, because it will work with grpclb but not with round_robin, and it will not allow access to the service config information.

Actually, could we resolve this by externally publishing a DNS record for the service name that points to the proxy address and has the is_balancer bit set?  It wouldn't have to expose anything about the internal network architecture; it would just be an externally facing record to point the client to the proxy.  It could even include service config information.  If we did that, then the resolver would not need to do anything special; the only thing we'd need would be the proxy mapper to redirect requests for internal addresses through the proxy.  This would essentially reduce the problem so that this would look a lot more like case 2.  What do you think?

(This reminds me that I still need to put together a gRFC for how the is_balancer bit is going to be encoded in DNS.  But for the purposes of this discussion, let's assume that problem is solved.)
 
 

I think we are in agreement that either approach could be made to work.  However, I think the gRPC-level proxy approach is cleaner and provides more long-term benefit.
I don't think that these two approaches are equivalent in terms of security. While the gRPC-level proxy could be useful, it may not fulfill some security requirements, as I tried to explain above. On the other hand, the trust that we put in a TCP-level proxy is much more tunable.

You're right that there are trade-offs here.  I will update the gRFC to document this once we figure out the details of the client-side approach.

Julien Boeuf

Jan 20, 2017, 6:20:08 PM
to Mark D. Roth, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li, Adam Stubblefield
I guess that could work indeed. Just to make sure I understand correctly.
1. Load balanced name (lb_name) is resolved by DNS to (proxy_ip_addr, lb = true) (and maybe service config).
2. proxy mapper recognizes the proxy_ip_addr and returns instructions to form a CONNECT request to this proxy_ip_addr which would look like:
CONNECT <lb_name> HTTP/1.1
Host: <lb_name>
<Custom headers>
3. The proxy resolves the real name and talks to the load balancer. From now on, we have end to end communication between the client and the LB.
4. The client receives IP addresses of backends from the LB Channel.
5. The proxy mapper recognizes these IP addresses and issues a CONNECT request to the proxy_ip_addr (mapped) which would look like:
CONNECT <backend_ip> HTTP/1.1
Host: <backend_ip>
<Custom headers>
6. The proxy makes a connection to the <backend_ip>. From now on, we have end to end communication between the client and the backend.
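If that flow is right, the two CONNECT request heads in steps 2 and 5 differ only in their target. A minimal sketch of building them, illustrative only (not gRPC code; custom headers omitted, and the function name is invented here):

```c
#include <stdio.h>

/* Illustrative only: build the request head for the two CONNECT cases
 * in the steps above. `target` is the load-balanced name in step 2 and
 * a backend IP:port in step 5; custom headers are omitted. */
static int build_connect_head(char *buf, size_t len, const char *target) {
  return snprintf(buf, len,
                  "CONNECT %s HTTP/1.1\r\n"
                  "Host: %s\r\n"
                  "\r\n",
                  target, target);
}
```

From the proxy's point of view the two cases are indistinguishable except that step 2 requires it to resolve the name, while step 5 is a plain TCP connect to the given IP.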
 
 
 

I think we are in agreement that either approach could be made to work.  However, I think the gRPC-level proxy approach is cleaner and provides more long-term benefit.
I don't think that these two approaches are equivalent in terms of security. While the gRPC-level proxy could be useful, it may not fulfill some security requirements, as I tried to explain above. On the other hand, the trust that we put in a TCP-level proxy is much more tunable.

You're right that there are trade-offs here.  I will update the gRFC to document this once we figure out the details of the client-side approach.
Thanks! Very much appreciated.
 

Mark D. Roth

Jan 23, 2017, 10:26:21 AM
to Julien Boeuf, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li, Adam Stubblefield
Yes, that looks correct.

I'll update the gRFC doc accordingly.
 


Julien Boeuf

Jan 24, 2017, 7:08:14 PM
to Mark D. Roth, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li, Adam Stubblefield
Thanks much for the changes on the RFC.

I just see a small over-specification with the following sentence:
"""
there will be an external DNS record for the service name that
points to the IP address of the proxy and has the `is_balancer` bit set. 
"""

While this is certainly a way to achieve the desired behavior, I don't believe that we have to mandate the use of DNS records here. The default resolver (which may or may not be backed by DNS) just has to return the IP address of the proxy for the service name and set the is_balancer bit.
What do you think?

    Julien.


Mark D. Roth

Jan 25, 2017, 10:37:35 AM
to Julien Boeuf, grpc. io, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li, Adam Stubblefield
Thanks for pointing this out, Julien.  I've changed most of the references to DNS to instead say "name service", which is more generic.



Eric Anderson

Jan 25, 2017, 3:59:45 PM
to Mark D. Roth, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
The Java implementation is going to have hurdles. I sort of expect issues adhering to the design as precisely as it is defined. I've got to figure out where ProxySelector fits into all of this.

Any references to "load balancing" should say "client-side load balancing".

nit: s/internet/Internet/ (is this really where I'm supposed to make comments like this?)

Note that load-balancing within gRPC happens on a per-call basis, not a per-connection basis.  Because of this, use of a TCP-level proxy may limit the ability to do load balancing in some environments.

What is that talking about? If the first sentence said "client-side load balancing" I could agree. But I don't understand what that has to do with the second sentence.

It seems like this is getting at something different: we can't resolve the multiple backends. How about:

Note that client-side load-balancing within gRPC requires knowing the addresses of the backends. Because of this, use of a TCP-level proxy may limit the ability to do load balancing in some environments.

   - *All* requests must go through the proxy, both for internal and external servers.

This is not true. It only applies to external servers. It directly contradicts the earlier "outbound to the internet." I could maybe agree with it if it said "may" instead of "must."

such as the `http_proxy` environment variable or the Java registry

What is the Java registry? Do you mean Java system properties?

- For cases 2 and 3, in the subchannel, before we connect to the target address, we will call a new *proxy mapper* hook, which will allow selectively requesting the use of a proxy based on the address to which the subchannel is going to connect.

Yeah... that's very C-specific.

Could you instead define the requirements, like "before each connection is made, application code can dynamically choose whether to use CONNECT and to which proxy based on the address of the destination?" You then say, "In C core, this could be accomplished by..."
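Eric's requirement could be expressed as an interface like the following. This is a hypothetical sketch, not the actual C-core hook: the type names, buffer sizes, proxy name, and the 10.0.0.0/8 policy are all invented here for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Sketch of the requirement stated above: before each connection, a
 * callback inspects the destination address and may route it through a
 * proxy, supplying the argument for the HTTP CONNECT request. */
typedef struct {
  char proxy_address[64]; /* where to open the TCP connection */
  char connect_arg[64];   /* argument for the HTTP CONNECT request */
} proxy_decision;

typedef bool (*proxy_mapper_fn)(const char *destination,
                                proxy_decision *out);

/* Hypothetical policy: route all 10.0.0.0/8 destinations through an
 * assumed proxy; everything else connects directly. */
static bool example_mapper(const char *destination, proxy_decision *out) {
  if (strncmp(destination, "10.", 3) != 0) return false; /* direct */
  strcpy(out->proxy_address, "proxy.internal:3128"); /* assumed name */
  strcpy(out->connect_arg, destination);
  return true;
}
```

Defining the requirement this way keeps the design language-neutral; each implementation (C core, Java's ProxySelector, etc.) can then map it onto its native mechanism.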

Note that in case 1, because the client cannot know the set of server addresses, it is impossible to use the normal gRPC per-call load balancing.  It *is* possible to do load balancing on a per-connection basis, but that may not spread out the load evenly if different clients impose different amounts of load.
 
Note that it is possible to use multiple connections through the proxy. And I'm expecting we'll support such a feature for other use-cases. What can't happen is the client guaranteeing each connection goes to a different backend, or using a non-hard-coded number of connections.

In this case, there will be an external name service record for the server name that points to the IP address of the proxy and has the `is_balancer` bit set.  (Note: We have not yet designed how that bit will be encoded in DNS, but that will be the subject of a separate gRFC.) The proxy mapper implementation will then have to detect two types of addresses:
- When it sees the proxy address, it will set the HTTP CONNECT argument to the original server name.

Eww... So it actually changes the end-host. Note that this ability is not described earlier in the doc when the proxy mapper is introduced. Nor is it made clear here that it is an additional feature.

Why isn't the LB just made public? It can be behind some other type of load balancer. That's what I had expected when discussing earlier. Yes, that means there are more auth hurdles, but it seems more sound.


On Wed, Jan 18, 2017 at 2:12 PM, Mark D. Roth <ro...@google.com> wrote:

Mark D. Roth

Jan 25, 2017, 5:00:04 PM
to Eric Anderson, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
Thanks for the feedback, Eric.

On Wed, Jan 25, 2017 at 12:59 PM, 'Eric Anderson' via grpc.io <grp...@googlegroups.com> wrote:
The Java implementation is going to have hurdles. I sort of expect issues adhering to the design as precisely as it is defined. I've got to figure out where ProxySelector fits into all of this.

Is this just the concern you've mentioned about how authentication fits in, or is there something more here?
 

Any references to "load balancing" should say "client-side load balancing".

I think that "client-side load balancing" is misleading when talking about grpclb, since the actual load balancing code happens on the balancers instead of on the client.  But I do take your point here.  I've changed it to use the term "per-call load balancing".
 

nit: s/internet/Internet/ (is this really where I'm supposed to make comments like this?)

Done.

(I think the intent is to keep all significant design discussion in this thread, but it's probably fine to add comments about minor wording changes in the PR itself.)
 

Note that load-balancing within gRPC happens on a per-call basis, not a per-connection basis.  Because of this, use of a TCP-level proxy may limit the ability to do load balancing in some environments.

What is that talking about? If the first sentence said "client-side load balancing" I could agree. But I don't understand what that has to do with the second sentence.

It seems like this is getting at something different: we can't resolve the multiple backends. How about:

Note that client-side load-balancing within gRPC requires knowing the addresses of the backends. Because of this, use of a TCP-level proxy may limit the ability to do load balancing in some environments.

Thanks -- I like that wording much better.  Updated.
 

   - *All* requests must go through the proxy, both for internal and external servers.

This is not true. It only applies to external servers. It directly contradicts the earlier "outbound to the internet." I could maybe agree with it if it said "may" instead of "must."

My understanding is that if the http_proxy environment variable is set, then the proxy is used unconditionally for all servers, so I think this is accurate.  I've updated the wording in the description of this case to make it clear that this is not just for outbound traffic.
 

such as the `http_proxy` environment variable or the Java registry

What is the Java registry? Do you mean Java system properties?

Probably. :)  Updated.
 

- For cases 2 and 3, in the subchannel, before we connect to the target address, we will call a new *proxy mapper* hook, which will allow selectively requesting the use of a proxy based on the address to which the subchannel is going to connect.

Yeah... that's very C-specific.

Could you instead define the requirements, like "before each connection is made, application code can dynamically choose whether to use CONNECT and to which proxy based on the address of the destination?" You then say, "In C core, this could be accomplished by..."

Done.
 

Note that in case 1, because the client cannot know the set of server addresses, it is impossible to use the normal gRPC per-call load balancing.  It *is* possible to do load balancing on a per-connection basis, but that may not spread out the load evenly if different clients impose different amounts of load.
 
Note that it is possible to use multiple connections through the proxy. And I'm expecting we'll support such a feature for other use-cases. What can't happen is the client guaranteeing each connection goes to a different backend, or using a non-hard-coded number of connections.

I've added a note about the fact that we have no guarantee that multiple connections to the proxy would go to different backends.
 

In this case, there will be an external name service record for the server name that points to the IP address of the proxy and has the `is_balancer` bit set.  (Note: We have not yet designed how that bit will be encoded in DNS, but that will be the subject of a separate gRFC.) The proxy mapper implementation will then have to detect two types of addresses:
- When it sees the proxy address, it will set the HTTP CONNECT argument to the original server name.

Eww... So it actually changes the end-host. Note that this ability is not described earlier in the doc when the proxy mapper is introduced. Nor is it made clear here that it is an additional feature.

I'm not sure that I completely understand this comment.

It's definitely required that we be able to set the argument of the HTTP CONNECT request.  Even without case 3, we needed that anyway, because in case 1 we want to use the server name and have the proxy do the name resolution for us, but in case 2 we want to use the IP addresses of the individual backends.  (In case 3, we actually have both cases: we want the proxy to do the resolution for us when talking to the balancer, but we want to specify the individual IPs for the backend connections.)

In C-core, the channel arg is defined as follows:

/// Channel arg indicating the server in HTTP CONNECT request (string).
/// The presence of this arg triggers the use of HTTP CONNECT.
#define GRPC_ARG_HTTP_CONNECT_SERVER "grpc.http_connect_server"

In other words, the presence of this channel arg triggers the use of the HTTP CONNECT handshaker, and the value of the channel arg is the argument to be used in the HTTP CONNECT request.

I've attempted to clarify this wording in the document.
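To make the "presence triggers" semantics concrete: the channel arg key is the real constant quoted above, but the lookup below is a simplified stand-in (not C-core's actual grpc_channel_args API), just to illustrate how absence means "no CONNECT" while presence supplies the CONNECT target.

```c
#include <stddef.h>
#include <string.h>

/* Real C-core constant (quoted from the gRFC discussion above). */
#define GRPC_ARG_HTTP_CONNECT_SERVER "grpc.http_connect_server"

/* Simplified stand-in for a channel-args list, for illustration only. */
typedef struct { const char *key; const char *value; } channel_arg;

/* Returns the CONNECT argument if the arg is present, NULL otherwise;
 * NULL means the HTTP CONNECT handshaker is not engaged at all. */
static const char *http_connect_target(const channel_arg *args, size_t n) {
  for (size_t i = 0; i < n; i++)
    if (strcmp(args[i].key, GRPC_ARG_HTTP_CONNECT_SERVER) == 0)
      return args[i].value;
  return NULL;
}
```

This matches Mark's description: the proxy mapper decides per-connection whether to add the arg, and whatever value it supplies (server name or backend IP) becomes the CONNECT argument.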
 
 

Why isn't the LB just made public? It can be behind some other type of load balancer. That's what I had expected when discussing earlier. Yes, that means there are more auth hurdles, but it seems more sound.

If I'm understanding you right, that is essentially what is being proposed here.  The idea is that the grpclb balancer is accessed via the HTTP CONNECT proxy.
 


On Wed, Jan 18, 2017 at 2:12 PM, Mark D. Roth <ro...@google.com> wrote:
I've created a gRFC describing how HTTP CONNECT proxies will be supported in gRPC:

https://github.com/grpc/proposal/pull/4

Please keep discussion in this thread.  Thanks!

--
Mark D. Roth <ro...@google.com>
Software Engineer
Google, Inc.

--
You received this message because you are subscribed to the Google Groups "grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email to grpc-io+unsubscribe@googlegroups.com.
To post to this group, send email to grp...@googlegroups.com.
Visit this group at https://groups.google.com/group/grpc-io.

For more options, visit https://groups.google.com/d/optout.

Eric Anderson

unread,
Jan 25, 2017, 7:14:20 PM1/25/17
to Mark D. Roth, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
On Wed, Jan 25, 2017 at 2:00 PM, Mark D. Roth <ro...@google.com> wrote:
On Wed, Jan 25, 2017 at 12:59 PM, 'Eric Anderson' via grpc.io <grp...@googlegroups.com> wrote:
The Java implementation is going to have hurdles. I sort of expect issues adhering to the design as precisely as it is defined. I've got to figure out where ProxySelector fits into all of this.

Is this just the concern you've mentioned about how authentication fits in, or is there something more here?

It wasn't an auth issue. It's more of an issue of needing to work with pre-existing APIs and expectations. We will, for example, be able to support a mixed CONNECT usage in case 1. I wouldn't be surprised if you need to eventually as well. That is what wpad.dat solves, after all.

Any references to "load balancing" should say "client-side load balancing".

I think that "client-side load balancing" is misleading when talking about grpclb, since the actual load balancing code happens on the balancers instead of on the client.  But I do take your point here.  I've changed it to use the term "per-call load balancing".

Moving discussion to PR.

   - *All* requests must go through the proxy, both for internal and external servers.

This is not true. It only applies to external servers. It directly contradicts the earlier "outbound to the internet." I could maybe agree with it if it said "may" instead of "must."

My understanding is that if the http_proxy environment variable is set, then the proxy is used unconditionally for all servers, so I think this is accurate.  I've updated the wording in the description of this case to make it clear that this is not just for outbound traffic.

That's conflating two things: the environment and the configuration. Your description of the environment is not true. When in this environment we expect the http_proxy environment variable as the form of configuration, but that has no impact on how the environment actually behaves.

In this case, there will be an external name service record for the server name that points to the IP address of the proxy and has the `is_balancer` bit set.  (Note: We have not yet designed how that bit will be encoded in DNS, but that will be the subject of a separate gRFC.) The proxy mapper implementation will then have to detect two types of addresses:
- When it sees the proxy address, it will set the HTTP CONNECT argument to the original server name.

Eww... So it actually changes the end-host. Note that this ability is not described earlier in the doc, when the proxy mapper is introduced. Nor is it made clear here that it is an additional feature.

I'm not sure that I completely understand this comment.

It's definitely required that we be able to set the argument of the HTTP CONNECT request.  Even without case 3, we needed that anyway, because in case 1 we want to use the server name and have the proxy do the name resolution for us, but in case 2 we want to use the IP addresses of the individual backends.

I'm not certain there's a fundamental need for special behavior between case 1 and 2 concerning the CONNECT string, but in any case, I don't see why the proxy mapper must do it.

I'd expect the proxy mapper to return one of two things:
 - no proxy needed
 - use CONNECT with proxy IP x.x.x.x

That gives the mapper the control it needs without opening the ability to do outrageous things.
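A minimal sketch of that restricted interface, with illustrative names and an assumed example policy (routing 10.x.x.x backends through an assumed proxy at 192.0.2.1:8080), might look like:

```c
#include <string.h>
#include <stddef.h>

/* Sketch of the restricted proxy-mapper interface proposed here: the
 * mapper either reports "no proxy needed" or "use CONNECT via proxy IP
 * x.x.x.x".  All names and addresses are illustrative, not gRPC API. */
typedef enum { MAP_NO_PROXY, MAP_USE_CONNECT } map_result_kind;

typedef struct {
    map_result_kind kind;
    const char *proxy_addr; /* meaningful only when kind == MAP_USE_CONNECT */
} map_result;

/* Example policy: route 10.x.x.x backends through an assumed proxy. */
static map_result map_address(const char *server_addr) {
    map_result r = { MAP_NO_PROXY, NULL };
    if (strncmp(server_addr, "10.", 3) == 0) {
        r.kind = MAP_USE_CONNECT;
        r.proxy_addr = "192.0.2.1:8080";
    }
    return r;
}
```

Under this shape the mapper never touches the CONNECT string itself; the channel would always use the original resolved address as the CONNECT argument.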

I think "when it sees the proxy address" also has fundamental issues, like requiring the proxy to have a hard-coded stable IP. That means you couldn't add a new proxy to the rotation if experiencing too much load.

More likely, in your scheme, I'd expect the "proxy address" to become 100% fake. "Oh! It's 1.1.1.1! That's our secret code for proxy address."

Why isn't the LB just made public? It can be behind some other type of load balancer. That's what I had expected when discussing earlier. Yes, that means there are more auth hurdles, but it seems more sound.

If I'm understanding you right, that is essentially what is being proposed here.  The idea is that the grpclb balancer is accessed via the HTTP CONNECT proxy.

No. I'm proposing that case 3 is the same as case 2, but with a different server configuration.

Case 1 would use the hostname in CONNECT. Case 2 would use IP in CONNECT.

If I want Case 2, but don't want to expose internal IP addresses to unauthenticated clients, I'd just make GRPCLB public and connect to it directly, without CONNECT. DNS returns public IPs, and the GRPCLB communication can be authenticated.

Julien Boeuf

unread,
Jan 25, 2017, 8:40:45 PM1/25/17
to Eric Anderson, Mark D. Roth, grpc. io, Craig Tiller, Abhishek Kumar, Menghan Li
The main issue here I think, as you pointed out, is that naming and the mapper need to agree on something (e.g. the proxy address or a sentinel address as in your example) and this is not great. On the other hand, it allows us to support cases where the name of the LB (or direct backend connection in non-LB case) cannot be really resolved on the client but can be resolved on the proxy. This is a really nice feature of HTTP CONNECT so I'm willing to live with it.
 

Why isn't the LB just made public? It can be behind some other type of load balancer. That's what I had expected when discussing earlier. Yes, that means there are more auth hurdles, but it seems more sound.

If I'm understanding you right, that is essentially what is being proposed here.  The idea is that the grpclb balancer is accessed via the HTTP CONNECT proxy.

No. I'm proposing that case 3 is the same as case 2, but with a different server configuration.

Case 1 would use the hostname in CONNECT. Case 2 would use IP in CONNECT.

If I want Case 2, but don't want to expose internal IP addresses to unauthenticated clients, I'd just make GRPCLB public and connect to it directly, without CONNECT. DNS returns public IPs, and the GRPCLB communication can be authenticated.
I don't think that this would pass some security/deployment requirements (certainly not ours) as it would force the load balancer to run within the client's security zone boundary.
 

Mark D. Roth

unread,
Jan 26, 2017, 11:42:56 AM1/26/17
to Eric Anderson, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
Replies inline.  Please let me know if you want to chat about any of this in person.

On Wed, Jan 25, 2017 at 4:13 PM, 'Eric Anderson' via grpc.io <grp...@googlegroups.com> wrote:
On Wed, Jan 25, 2017 at 2:00 PM, Mark D. Roth <ro...@google.com> wrote:
On Wed, Jan 25, 2017 at 12:59 PM, 'Eric Anderson' via grpc.io <grp...@googlegroups.com> wrote:
The Java implementation is going to have hurdles. I sort of expect issues adhering to the design as precisely as it is defined. I've got to figure out where ProxySelector fits into all of this.

Is this just the concern you've mentioned about how authentication fits in, or is there something more here?

It wasn't an auth issue. It's more of an issue of needing to work with pre-existing APIs and expectations. We will, for example, be able to support a mixed CONNECT usage in case 1. I wouldn't be surprised if you need to eventually as well. That is what wpad.dat solves, after all.

I think that if/when we need to do that, it should be possible to add the logic in the same place where we currently look at the environment variable, so it's not clear to me that there are any design changes needed to leave room for this possibility.  Do you agree?  If not, can you point out what parts of the design may cause problems for this, and possibly suggest alternatives?
 

Any references to "load balancing" should say "client-side load balancing".

I think that "client-side load balancing" is misleading when talking about grpclb, since the actual load balancing code happens on the balancers instead of on the client.  But I do take your point here.  I've changed it to use the term "per-call load balancing".

Moving discussion to PR.

   - *All* requests must go through the proxy, both for internal and external servers.

This is not true. It only applies to external servers. It directly contradicts the earlier "outbound to the internet." I could maybe agree with it if it said "may" instead of "must."

My understanding is that if the http_proxy environment variable is set, then the proxy is used unconditionally for all servers, so I think this is accurate.  I've updated the wording in the description of this case to make it clear that this is not just for outbound traffic.

That's conflating two things: the environment and the configuration. Your description of the environment is not true. When in this environment we expect the http_proxy environment variable as the form of configuration, but that has no impact on how the environment actually behaves.

What we actually care about here is the configuration, which is that all connections go through the proxy.  It may be the case that this configuration was created as the simplest way to address a policy that said that only outbound connections must go through the proxy, but our code doesn't actually care about that; it just cares about what configuration we need to support.
 

In this case, there will be an external name service record for the server name that points to the IP address of the proxy and has the `is_balancer` bit set.  (Note: We have not yet designed how that bit will be encoded in DNS, but that will be the subject of a separate gRFC.) The proxy mapper implementation will then have to detect two types of addresses:
- When it sees the proxy address, it will set the HTTP CONNECT argument to the original server name.

Eww... So it actually changes the end-host. Note that this ability is not described earlier in the doc, when the proxy mapper is introduced. Nor is it made clear here that it is an additional feature.

I'm not sure that I completely understand this comment.

It's definitely required that we be able to set the argument of the HTTP CONNECT request.  Even without case 3, we needed that anyway, because in case 1 we want to use the server name and have the proxy do the name resolution for us, but in case 2 we want to use the IP addresses of the individual backends.

I'm not certain there's a fundamental need for special behavior between case 1 and 2 concerning the CONNECT string, but in any case, I don't see why the proxy mapper must do it.

Can you say more about why you think we could use the same CONNECT argument in both cases 1 and 2?  I don't see how this could work.  Using the IP address won't work in case 1, because the client can't do name resolution, so we don't actually know the IP address to use.  And using the hostname won't work in case 2, because the client has done the name resolution in this case, and it needs to create a separate subchannel for each server address in order for LB policies like round_robin to work.  If we use the hostname in the CONNECT request, then we have no guarantee that each subchannel will wind up at the right backend.  That's why I think we need a mechanism to specify what the CONNECT argument is in different cases.

Case 2 is triggered from the proxy mapper, which is why the proxy mapper needs to use this mechanism.  In addition, this allows the proxy mapper to set the CONNECT argument differently for the different situations in case 3.  And more generally, I also think it's a more flexible approach that may allow users to write proxy mappers in the future to do things that we're not thinking of right now.
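To make the case-3 behavior concrete, here is a sketch (illustrative names and addresses only, not the actual proxy mapper API) of how a mapper could choose the CONNECT argument differently for the balancer and backend connections:

```c
#include <string.h>

/* Sketch of the case-3 proxy mapper behavior described above: when the
 * resolver returns the (known) proxy address, use the original server
 * name as the CONNECT argument so the proxy does the resolution; for
 * backend IPs handed back by grpclb, use the IP itself so each
 * subchannel reaches the intended backend.  Names are illustrative. */
static const char *choose_connect_arg(const char *resolved_addr,
                                      const char *known_proxy_addr,
                                      const char *server_name) {
    if (strcmp(resolved_addr, known_proxy_addr) == 0) {
        return server_name;   /* balancer connection: proxy resolves name */
    }
    return resolved_addr;     /* backend connection: connect to this IP */
}
```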
 

I'd expect the proxy mapper to return one of two things:
 - no proxy needed
 - use CONNECT with proxy IP x.x.x.x

That gives the mapper the control it needs without opening the ability to do outrageous things.

I think "when it sees the proxy address" also has fundamental issues, like requiring the proxy to have a hard-coded stable IP. That means you couldn't add a new proxy to the rotation if experiencing too much load.

More likely, in your scheme, I'd expect the "proxy address" to become 100% fake. "Oh! It's 1.1.1.1! That's our secret code for proxy address."

We discussed the possibility of using a sentinel address value like this, but I think that's really ugly.  Using the proxy address seems cleaner, especially since the client needs to know what proxy address to use anyway in order to return that value from the proxy mapper.
 

Why isn't the LB just made public? It can be behind some other type of load balancer. That's what I had expected when discussing earlier. Yes, that means there are more auth hurdles, but it seems more sound.

If I'm understanding you right, that is essentially what is being proposed here.  The idea is that the grpclb balancer is accessed via the HTTP CONNECT proxy.

No. I'm proposing that case 3 is the same as case 2, but with a different server configuration.

Case 1 would use the hostname in CONNECT. Case 2 would use IP in CONNECT.

If I want Case 2, but don't want to expose internal IP addresses to unauthenticated clients, I'd just make GRPCLB public and connect to it directly, without CONNECT. DNS returns public IPs, and the GRPCLB communication can be authenticated.

I think Julien addressed this in his reply.  One of the requirements of case 2 is that the grpclb balancers are inside of the protected environment.
 


Eric Anderson

unread,
Jan 26, 2017, 2:58:33 PM1/26/17
to Mark D. Roth, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
On Thu, Jan 26, 2017 at 8:42 AM, 'Mark D. Roth' via grpc.io <grp...@googlegroups.com> wrote:
   - *All* requests must go through the proxy, both for internal and external servers.

This is not true. It only applies to external servers. It directly contradicts the earlier "outbound to the internet." I could maybe agree with it if it said "may" instead of "must."

My understanding is that if the http_proxy environment variable is set, then the proxy is used unconditionally for all servers, so I think this is accurate.  I've updated the wording in the description of this case to make it clear that this is not just for outbound traffic.

That's conflating two things: the environment and the configuration. Your description of the environment is not true. When in this environment we expect the http_proxy environment variable as the form of configuration, but that has no impact on how the environment actually behaves.

What we actually care about here is the configuration, which is that all connections go through the proxy.

But... that's not what it says. It says, "We are aware of the following use-cases for TCP-level proxying with gRPC" and then follows with "A corp environment where all traffic (especially traffic outbound to the Internet) must go through a proxy." I'm not aware of that use-case/environment.

As I said though, if you soften "must" to "may", I could get behind it. Otherwise I'm not aware of Case 1 existing at all in the world, so let's not support it.

Really though, I'm not sure how often the proxies are unable to load internal resources. And even if they are able to, the solution probably isn't going to be satisfactory for users, because of performance. If it were me, I'd frame it as: some application only needs to access external resources. http_proxy doesn't solve the mixed case, so let's just call that out up-front.

but our code doesn't actually care about that; it just cares about what configuration we need to support.

Then delete the use cases and just describe the configuration, if that's all that matters. That is to say, the use cases are important for people, not the code. And the document is for people.

I'm harping on this a bit hard because many people don't already understand the use cases.

I'm not certain there's a fundamental need for special behavior between case 1 and 2 concerning the CONNECT string, but in any case, I don't see why the proxy mapper must do it.

Can you say more about why you think we could use the same CONNECT argument in both cases 1 and 2?

Use a string of what to connect to for CONNECT. Sometimes it contains an IP, sometimes it contains a hostname.

Case 2 is triggered from the proxy mapper, which is why the proxy mapper needs to use this mechanism.

I'm not concerned about it using any mechanism. I wanted to reduce its power, which I don't think there should be any argument about whether that is possible.

In addition, this allows the proxy mapper to set the CONNECT argument differently for the different situations in case 3.

And only case 3 benefits. There is no need for application-provided overriding of the CONNECT string in cases 1 and 2. So that's why I was trying to figure out an alternative solution for case 3.

And more generally, I also think it's a more flexible approach that may allow users to write proxy mappers in the future to do things that we're not thinking of right now.

"more flexible approach" != "better". Does it not concern you that the proxy mapper may completely replace the decision of the name resolver? That could make for a painful debugging session. I'm fine with it tweaking the results, but in no way do I see complete overriding to be a good thing inherently. If we have to do it, so be it, but it's an anti-feature if we support it unnecessarily.

I'd expect the proxy mapper to return one of two things:
 - no proxy needed
 - use CONNECT with proxy IP x.x.x.x

That gives the mapper the control it needs without opening the ability to do outrageous things.

I think "when it sees the proxy address" also has fundamental issues, like requiring the proxy to have a hard-coded stable IP. That means you couldn't add a new proxy to the rotation if experiencing too much load.

More likely, in your scheme, I'd expect the "proxy address" to become 100% fake. "Oh! It's 1.1.1.1! That's our secret code for proxy address."

We discussed the possibility of using a sentinel address value like this, but I think that's really ugly.  Using the proxy address seems cleaner, especially since the client needs to know what proxy address to use anyway in order to return that value from the proxy mapper.

But you didn't address my concerns that the mapping code doesn't actually know the proxy addresses, unless it never changes. But if it never changes then you have stability issues.

Eric Anderson

unread,
Jan 26, 2017, 3:36:52 PM1/26/17
to Julien Boeuf, Mark D. Roth, grpc. io, Craig Tiller, Abhishek Kumar, Menghan Li
On Wed, Jan 25, 2017 at 5:40 PM, Julien Boeuf <jbo...@google.com> wrote:
I think "when it sees the proxy address" also has fundamental issues, like requiring the proxy to have a hard-coded stable IP. That means you couldn't add a new proxy to the rotation if experiencing too much load.

More likely, in your scheme, I'd expect the "proxy address" to become 100% fake. "Oh! It's 1.1.1.1! That's our secret code for proxy address."
The main issue here I think, as you pointed out, is that naming and the mapper need to agree on something (e.g. the proxy address or a sentinel address as in your example) and this is not great. On the other hand, it allows us to support cases where the name of the LB (or direct backend connection in non-LB case) cannot be really resolved on the client but can be resolved on the proxy. This is a really nice feature of HTTP CONNECT so I'm willing to live with it.

As another alternative, why not fix all of case 1 and support programmatic configuration, in addition to http_proxy?

What would be wrong with specifying configuration that lb.example.com should use proxy 1.2.3.4 (like in case 1, but only for the host lb.example.com)?

A more concrete flow:
  1. client wants to connect to service.example.com
  2. check whether service.example.com should use a proxy. It shouldn't
  3. do DNS SRV resolution for _grpclb._tcp.service.example.com to determine if it should use a LB. It should, and says to connect to lb.example.com
  4. check whether lb.example.com should use a proxy. It should
  5. use CONNECT to lb.example.com (as a hostname)
I recognize step 3 is speculative, but it seems any discussion of how GRPCLB+DNS works is speculative.
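As a sketch of the per-host proxy configuration that steps 2 and 4 of this flow assume, with an illustrative rule table rather than any real gRPC API:

```c
#include <string.h>
#include <stddef.h>

/* Sketch of per-host proxy configuration, as an alternative to the
 * all-or-nothing http_proxy variable: only specific hostnames are
 * routed through a proxy.  Hostnames, the proxy address, and the
 * table itself are illustrative assumptions. */
typedef struct { const char *host; const char *proxy; } proxy_rule;

static const proxy_rule kRules[] = {
    { "lb.example.com", "192.0.2.4:3128" },  /* only the balancer is proxied */
};

/* Returns the proxy to use for this host, or NULL to connect directly. */
static const char *proxy_for_host(const char *host) {
    for (size_t i = 0; i < sizeof(kRules) / sizeof(kRules[0]); i++) {
        if (strcmp(host, kRules[i].host) == 0) return kRules[i].proxy;
    }
    return NULL;
}
```

In the flow above, step 2 would find no rule for service.example.com (connect directly), while step 4 would find a rule for lb.example.com and issue CONNECT to that proxy with the hostname as the argument.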

If I want Case 2, but don't want to expose internal IP addresses to unauthenticated clients, I'd just make GRPCLB public and connect to it directly, without CONNECT. DNS returns public IPs, and the GRPCLB communication can be authenticated.
I don't think that this would pass some security/deployment requirements (certainly not ours) as it would force the load balancer to run within the client's security zone boundary.

How so? I'd use a L4 (if using client certs) or L7 (if using bearer tokens) reverse proxy instead of a forward proxy. The reverse proxy would be in the same network location as the forward proxy.

Now, you could complain that now there are two proxies instead of one. I would agree that is a shortcoming. There could be other issues, like maybe it is harder to configure clients. But I don't see any security/trust zone differences between the approaches.

Mark D. Roth

unread,
Jan 26, 2017, 3:46:07 PM1/26/17
to Eric Anderson, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
On Thu, Jan 26, 2017 at 11:58 AM, Eric Anderson <ej...@google.com> wrote:
On Thu, Jan 26, 2017 at 8:42 AM, 'Mark D. Roth' via grpc.io <grp...@googlegroups.com> wrote:
   - *All* requests must go through the proxy, both for internal and external servers.

This is not true. It only applies to external servers. It directly contradicts the earlier "outbound to the internet." I could maybe agree with it if it said "may" instead of "must."

My understanding is that if the http_proxy environment variable is set, then the proxy is used unconditionally for all servers, so I think this is accurate.  I've updated the wording in the description of this case to make it clear that this is not just for outbound traffic.

That's conflating two things: the environment and the configuration. Your description of the environment is not true. When in this environment we expect the http_proxy environment variable as the form of configuration, but that has no impact on how the environment actually behaves.

What we actually care about here is the configuration, which is that all connections go through the proxy.

But... that's not what it says. It says, "We are aware of the following use-cases for TCP-level proxying with gRPC" and then follows with "A corp environment where all traffic (especially traffic outbound to the Internet) must go through a proxy." I'm not aware of that use-case/environment.

As I said though, if you soften "must" to "may", I could get behind it. Otherwise I'm not aware of Case 1 existing at all in the world, so let's not support it.

Really though, I'm not sure how often the proxies are unable to load internal resources. And even if they are able to, the solution probably isn't going to be satisfactory for users, because of performance. If it were me, I'd frame it as: some application only needs to access external resources. http_proxy doesn't solve the mixed case, so let's just call that out up-front.

but our code doesn't actually care about that; it just cares about what configuration we need to support.

Then delete the use cases and just describe the configuration, if that's all that matters. That is to say, the use cases are important for people, not the code. And the document is for people.

I'm harping on this a bit hard because many people don't already understand the use cases.

I've attempted to modify the language in the doc to make it clear that the intent is for outbound traffic to go through the proxy, but that this is often implemented by having all traffic go through the proxy.  Please let me know if this addresses your concern.
 

I'm not certain there's a fundamental need for special behavior between case 1 and 2 concerning the CONNECT string, but in any case, I don't see why the proxy mapper must do it.

Can you say more about why you think we could use the same CONNECT argument in both cases 1 and 2?

Use a string of what to connect to for CONNECT. Sometimes it contains an IP, sometimes it contains a hostname.

Case 2 is triggered from the proxy mapper, which is why the proxy mapper needs to use this mechanism.

I'm not concerned about it using any mechanism. I wanted to reduce its power, which I don't think there should be any argument about whether that is possible.

In addition, this allows the proxy mapper to set the CONNECT argument differently for the different situations in case 3.

And only case 3 benefits. There is no need for application-provided overriding of the CONNECT string in cases 1 and 2. So that's why I was trying to figure out an alternative solution for case 3.

And more generally, I also think it's a more flexible approach that may allow users to write proxy mappers in the future to do things that we're not thinking of right now.

"more flexible approach" != "better". Does it not concern you that the proxy mapper may completely replace the decision of the name resolver? That could make for a painful debugging session. I'm fine with it tweaking the results, but in no way do I see complete overriding to be a good thing inherently. If we have to do it, so be it, but it's an anti-feature if we support it unnecessarily.

Let me try to make sure I'm understanding you right here.  It sounds like you're suggesting that instead of giving the proxy mapper the ability to control whether the CONNECT argument is a hostname or an IP address, we instead always assume that we should use the IP address in the CONNECT request whenever use of a proxy is indicated by a proxy mapper.  In other words, we would determine the CONNECT argument based on where the use of the proxy was triggered (i.e., from the client channel code vs. from a proxy mapper) instead of having the proxy mapper explicitly control it.  Is that right?

I do understand where you're coming from with wanting to limit the proxy mapper's control, but I am not actually bothered by allowing it to have that control.  In general, I'd rather provide more flexibility where we can and trust people to debug their own problems when they arise.  And it's not as though anyone who has access to add a resolver does not also have access to add a proxy mapper, so there's no security issue.

But that philosophical debate aside, I think that we should focus on case 3, because that's a concrete case that we do want to support.  So far, at least, I have not heard a workable proposal that does not require the proxy mapper to control the CONNECT argument (although I'm certainly still open to new proposals).

I can think of one possible middle-ground approach here, which is that instead of having the proxy mapper specify the CONNECT argument string, it just indicates whether the argument should be the original hostname or the IP address returned by the resolver.  That way, it can control what it needs to but can't completely override the results of the resolver.  I'm not super enthusiastic about this approach, since it seems like it actually makes the interface a bit harder to understand, but I'm curious what you think of it.
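A sketch of that middle-ground interface (names are illustrative, not a real gRPC API): the mapper picks which of two values the channel already has, rather than supplying an arbitrary string:

```c
#include <string.h>

/* Sketch of the middle-ground proposal: the proxy mapper does not
 * supply the CONNECT string itself; it only selects which existing
 * value (original hostname vs. resolver-provided IP) the channel
 * should use.  It cannot override the resolver's results outright. */
typedef enum { CONNECT_ARG_HOSTNAME, CONNECT_ARG_RESOLVED_IP } connect_arg_choice;

static const char *apply_choice(connect_arg_choice c,
                                const char *hostname,
                                const char *resolved_ip) {
    return (c == CONNECT_ARG_HOSTNAME) ? hostname : resolved_ip;
}
```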
 

I'd expect the proxy mapper to return one of two things:
 - no proxy needed
 - use CONNECT with proxy IP x.x.x.x

That gives the mapper the control it needs without opening the ability to do outrageous things.

I think "when it sees the proxy address" also has fundamental issues, like requiring the proxy to have a hard-coded stable IP. That means you couldn't add a new proxy to the rotation if experiencing too much load.

More likely, in your scheme, I'd expect the "proxy address" to become 100% fake. "Oh! It's 1.1.1.1! That's our secret code for proxy address."

We discussed the possibility of using a sentinel address value like this, but I think that's really ugly.  Using the proxy address seems cleaner, especially since the client needs to know what proxy address to use anyway in order to return that value from the proxy mapper.

But you didn't address my concerns that the mapping code doesn't actually know the proxy addresses, unless it never changes. But if it never changes then you have stability issues.

The whole point of the proxy mapper is to provide a hook for the logic that knows what proxy address to use.  It has to have some source of that data, whether it be hard-coded or read from a file or something else entirely.

In other words, what I'm saying is that it's up to the proxy mapper's author to decide how it will get this data, but that the author has to solve that problem anyway as an inherent part of writing the proxy mapper.  Therefore, it does not add any additional requirement to use the proxy address that it already has to know in its own logic.

Eric Anderson

unread,
Jan 27, 2017, 12:17:04 PM1/27/17
to Mark D. Roth, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
On Thu, Jan 26, 2017 at 12:46 PM, Mark D. Roth <ro...@google.com> wrote:
I've attempted to modify the language in the doc to make it clear that the intent is for outbound traffic to go through the proxy, but that this is often implemented by having all traffic go through the proxy.  Please let me know if this addresses your concern.

The modification looks great.

Let me try to make sure I'm understanding you right here.  It sounds like you're suggesting that instead of giving the proxy mapper the ability to control whether the CONNECT argument is a hostname or an IP address, we instead always assume that we should use the IP address in the CONNECT request whenever use of a proxy is indicated by a proxy mapper.  In other words, we would determine the CONNECT argument based on where the use of the proxy was triggered (i.e., from the client channel code vs. from a proxy mapper) instead of having the proxy mapper explicitly control it.  Is that right?

Yes. And that seems to agree with how the different pieces of proxy-choosing logic will work: the first primarily consumes hostnames and returns proxy hostnames (which is http_proxy in C), and the second primarily consumes IPs and returns proxy IPs.

But that philosophical debate aside, I think that we should focus on case 3, because that's a concrete case that we do want to support.  So far, at least, I have not heard a workable proposal that does not require the proxy mapper to control the CONNECT argument (although I'm certainly still open to new proposals).

I've provided two proposals, neither of which has been debunked as of yet. I could totally agree they may be worse than what you are proposing, but the discussion hasn't gotten to that point. The mentioned security issue with the first proposal seemed to ignore the fact that a reverse proxy could be used to "protect" the LB, in an identical fashion to any forward proxy.

I can think of one possible middle-ground approach here, which is that instead of having the proxy mapper specify the CONNECT argument string, it just indicates whether the argument should be the original hostname or the IP address returned by the resolver.  That way, it can control what it needs to but can't completely override the results of the resolver.  I'm not super enthusiastic about this approach, since it seems like it actually makes the interface a bit harder to understand, but I'm curious what you think of it.

Meh. It still requires coupling the proxy mapper with other parts of the system. And I'm not convinced that coupling works. It does make me feel a bit better about the predictability of the system though. And that is important to have an orthogonal system which helps as you try to add more features/refactor.

But you didn't address my concerns that the mapping code doesn't actually know the proxy addresses, unless it never changes. But if it never changes then you have stability issues.

The whole point of the proxy mapper is to provide a hook for the logic that knows what proxy address to use.  It has to have some source of that data, whether it be hard-coded or read from a file or something else entirely.

It has to provide an address. It doesn't have to have global knowledge of all possible addresses, which are even coming from a separate system. Will the two different systems be updated in concert, which may need to be simultaneously/atomically? Seems unlikely, in part because it is non-obvious they need to be. But let's say they do. Then we also have to deal with configuration changes during the time the name resolver runs and the proxy mapper runs. This is asking for bugs. The design is brittle and feels like it requires different parts of the system to be tied together with duct tape.

If you assume "one proxy" which has "one static IP" and everything is hard-coded, then the design is fine. But that seems unlikely to describe a productionized system. And that's why I would feel forced to use the "magic IP" that it seems you have previously rejected.

Mark D. Roth

Jan 27, 2017, 3:16:41 PM
to Eric Anderson, grpc.io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
On Fri, Jan 27, 2017 at 9:16 AM, Eric Anderson <ej...@google.com> wrote:
On Thu, Jan 26, 2017 at 12:46 PM, Mark D. Roth <ro...@google.com> wrote:
I've attempted to modify the language in the doc to make it clear that the intent is for outbound traffic to go through the proxy, but that this is often implemented by having all traffic go through the proxy.  Please let me know if this addresses your concern.

The modification looks great.

Let me try to make sure I'm understanding you right here.  It sounds like you're suggesting that instead of giving the proxy mapper the ability to control whether the CONNECT argument is a hostname or an IP address, we instead always assume that we should use the IP address in the CONNECT request whenever use of a proxy is indicated by a proxy mapper.  In other words, we would determine the CONNECT argument based on where the use of the proxy was triggered (i.e., from the client channel code vs. from a proxy mapper) instead of having the proxy mapper explicitly control it.  Is that right?

Yes. And that seems to agree with how the different proxy choosing logic will work; the first primarily consumes hostnames and returns proxy hostnames (which is http_proxy in C) and the second one primarily consumes IPs and returns proxy IPs.

I don't think that's actually entirely correct.  The first case doesn't consume anything; it unconditionally sets the hostname to be resolved.  And the second case can consume either the hostname or the IP.

This is more philosophical than practical, but I think the distinction between the two cases is not really about what they consume or provide; it's actually about whether we can do name resolution directly or whether we need to rely on the proxy to do it for us.
 

But that philosophical debate aside, I think that we should focus on case 3, because that's a concrete case that we do want to support.  So far, at least, I have not heard a workable proposal that does not require the proxy mapper to control the CONNECT argument (although I'm certainly still open to new proposals).

I've provided two proposals, neither of which seems debunked as of yet. I could totally agree they may be worse than what you are proposing, but the discussion hasn't gotten to that point. The mentioned security issue of the first proposal seemed to ignore the fact that a reverse proxy could be used to "protect" the LB, in an identical fashion to any forward proxy.

I don't quite understand the proposed reverse proxy approach.  Can you explain how that would work in more detail?
 

I can think of one possible middle-ground approach here, which is that instead of having the proxy mapper specify the CONNECT argument string, it just indicates whether the argument should be the original hostname or the IP address returned by the resolver.  That way, it can control what it needs to but can't completely override the results of the resolver.  I'm not super enthusiastic about this approach, since it seems like it actually makes the interface a bit harder to understand, but I'm curious what you think of it.

Meh. It still requires coupling the proxy mapper with other parts of the system. And I'm not convinced that coupling works. It does make me feel a bit better about the predictability of the system though. And that is important to have an orthogonal system which helps as you try to add more features/refactor.

If this doesn't actually address your concerns, then let's not consider it, since I'm not very enthusiastic about it either.
 

But you didn't address my concerns that the mapping code doesn't actually know the proxy addresses, unless it never changes. But if it never changes then you have stability issues.

The whole point of the proxy mapper is to provide a hook for the logic that knows what proxy address to use.  It has to have some source of that data, whether it be hard-coded or read from a file or something else entirely.

It has to provide an address. It doesn't have to have global knowledge of all possible addresses, which are even coming from a separate system. Will the two different systems be updated in concert, which may need to be simultaneously/atomically? Seems unlikely, in part because it is non-obvious they need to be. But let's say they do.
Then we also have to deal with configuration changes during the time the name resolver runs and the proxy mapper runs. This is asking for bugs. The design is brittle and feels like it requires different parts of the system to be tied together with duct tape.

I agree that case 3 requires different parts of the system to be coordinated.  For example, assuming that your proxy mapper implementation is getting the list of proxy addresses from a local file, you would need to first push an updated list that contains the new proxy address to all clients.  Then, once all clients have been updated, you can add the new proxy to DNS.

I agree that this is cumbersome, but I think it's an inherent problem with case 3, because you need some way to configure the clients.  I think the only way to completely avoid this would be to go with a gRPC-level proxy, as described in the "Rationale" section of the doc.
 

If you assume "one proxy" which has "one static IP" and everything is hard-coded, then the design is fine. But that seems unlikely to describe a productionized system. And that's why I would feel forced to use the "magic IP" that it seems you have previously rejected.

There are a couple of reasons that I don't like the "magic IP" approach.  First, it requires writing a custom resolver in addition to a custom proxy mapper, when it really should be fully possible to use the existing resolver.  And second, I'm not a big fan of "sentinel" values, since it's often hard to find a value that will never be used in real life.

Eric Anderson

Jan 27, 2017, 4:32:06 PM
to Mark D. Roth, grpc.io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
On Fri, Jan 27, 2017 at 12:16 PM, 'Mark D. Roth' via grpc.io <grp...@googlegroups.com> wrote:
Yes. And that seems to agree with how the different proxy choosing logic will work; the first primarily consumes hostnames and returns proxy hostnames (which is http_proxy in C) and the second one primarily consumes IPs and returns proxy IPs.

I don't think that's actually entirely correct.  The first case doesn't consume anything; it unconditionally sets the hostname to be resolved.

The first case will consume a hostname in Java. Observing the hostname is necessary to fix the mixed internal/external case in an expanded view of case 1. Since the Java APIs support that mixed case, Java ends up needing to support it. And if C ever needed to support the mixed case (which seems likely to me), then it would also need to use the hostname.
 
And the second case can consume either the hostname or the IP.

And I wouldn't be surprised if only IP were used. We're not aware of a user of it.

This is more philosophical than practical,

My further explanation there was meant to be more philosophical, as an explanation that this "special case" is pretty normal and sort of agrees with the rest of the design.

But that philosophical debate aside, I think that we should focus on case 3, because that's a concrete case that we do want to support.  So far, at least, I have not heard a workable proposal that does not require the proxy mapper to control the CONNECT argument (although I'm certainly still open to new proposals).

I've provided two proposals, neither of which seems debunked as of yet. I could totally agree they may be worse than what you are proposing, but the discussion hasn't gotten to that point. The mentioned security issue of the first proposal seemed to ignore the fact that a reverse proxy could be used to "protect" the LB, in an identical fashion to any forward proxy.

I don't quite understand the proposed reverse proxy approach.  Can you explain how that would work in more detail?

Case 3 as stated today (for contrasting)
1. client wants to connect to service.example.com
2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
4. ask the proxy mapper about IP 1.2.3.4, it recognizes the IP as the proxy and says to use "CONNECT service.example.com" via proxy IP 1.2.3.4
5. connect to proxy 1.2.3.4, it performs internal resolution of service.example.com and connects to one of the hosts

Case 3 using reverse proxy for LB
1. client wants to connect to service.example.com
2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
4. (different starting here) connect to 1.2.3.4, which is a transparent reverse proxy
5. Perform an RPC to 1.2.3.4. Host header is lb.example.com. The proxy performs internal mapping of lb.example.com to internal addresses and connects to one of the hosts, forwarding the RPC.
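For concreteness, the CONNECT step in the forward-proxy flow above is an ordinary HTTP CONNECT handshake. A minimal sketch, where the hostname and port are the illustrative values from this thread and the helper name is made up:

```python
def build_connect_request(target_host, target_port):
    """Build the request the client sends to the proxy to open a tunnel."""
    return (
        f"CONNECT {target_host}:{target_port} HTTP/1.1\r\n"
        f"Host: {target_host}:{target_port}\r\n"
        "\r\n"
    ).encode("ascii")

# After the proxy replies "200 Connection Established", the client speaks
# TLS/HTTP2 through the tunnel as if directly connected to the target.
print(build_connect_request("service.example.com", 443).decode())
```

In the reverse-proxy variant there is no such handshake; the proxy routes on the Host header of the RPC itself.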

      I agree that case 3 requires different parts of the system to be coordinated.  For example, assuming that your proxy mapper implementation is getting the list of proxy addresses from a local file, you would need to first push an updated list that contains the new proxy address to all clients.  Then, once all clients have been updated, you can add the new proxy to DNS.

And the file needs to contain old proxy addresses that are still recognized for detection but are no longer used.

      Okay. So we're on the same page there.

      I agree that this is cumbersome, but I think it's an inherent problem with case 3, because you need some way to configure the clients.

I agree you need to be able to configure the clients. I understand that something needs to tell the client what to do. My concern was the pain of updating the proxy mapping list in concert with name resolution. And because of that I would recommend that implementors use the magic IP, because it has less operational overhead and less likelihood of failing.

      If you assume "one proxy" which has "one static IP" and everything is hard-coded, then the design is fine. But that seems unlikely to describe a productionized system. And that's why I would feel forced to use the "magic IP" that it seems you have previously rejected.

      There are a couple of reasons that I don't like the "magic IP" approach.  First, it requires writing a custom resolver in addition to a custom proxy mapper,

      No, I'd just have DNS return the trash IP.

      I'm not a big fan of "sentinel" values, since it's often hard to find a value that will never be used in real life.

      I would gladly accept a magic value instead of needing to make sure two systems stay in sync and rollouts happen properly. And I would quickly recommend that to others. And if I started explaining the gotchas of the alternative, I'd expect them to quickly be thankful for the recommendation since it is less code to write and less operational complexity.

Mark D. Roth

Jan 27, 2017, 6:41:30 PM
to Eric Anderson, grpc.io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
      On Fri, Jan 27, 2017 at 1:31 PM, Eric Anderson <ej...@google.com> wrote:
      On Fri, Jan 27, 2017 at 12:16 PM, 'Mark D. Roth' via grpc.io <grp...@googlegroups.com> wrote:
      Yes. And that seems to agree with how the different proxy choosing logic will work; the first primarily consumes hostnames and returns proxy hostnames (which is http_proxy in C) and the second one primarily consumes IPs and returns proxy IPs.

      I don't think that's actually entirely correct.  The first case doesn't consume anything; it unconditionally sets the hostname to be resolved.

The first case will consume a hostname in Java. Observing the hostname is necessary to fix the mixed internal/external case in an expanded view of case 1. Since the Java APIs support that mixed case, Java ends up needing to support it. And if C ever needed to support the mixed case (which seems likely to me), then it would also need to use the hostname.
       
      And the second case can consume either the hostname or the IP.

      And I wouldn't be surprised if only IP were used. We're not aware of a user of it.

      This is more philosophical than practical,

      My further explanation there was meant to be more philosophical, as an explanation that this "special case" is pretty normal and sort of agrees with the rest of the design.

      But that philosophical debate aside, I think that we should focus on case 3, because that's a concrete case that we do want to support.  So far, at least, I have not heard a workable proposal that does not require the proxy mapper to control the CONNECT argument (although I'm certainly still open to new proposals).

I've provided two proposals, neither of which seems debunked as of yet. I could totally agree they may be worse than what you are proposing, but the discussion hasn't gotten to that point. The mentioned security issue of the first proposal seemed to ignore the fact that a reverse proxy could be used to "protect" the LB, in an identical fashion to any forward proxy.

      I don't quite understand the proposed reverse proxy approach.  Can you explain how that would work in more detail?

      Case 3 as stated today (for contrasting)
      1. client wants to connect to service.example.com
      2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
      3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
      4. ask the proxy mapper about IP 1.2.3.4, it recognizes the IP as the proxy and says to use "CONNECT service.example.com" via proxy IP 1.2.3.4
      5. connect to proxy 1.2.3.4, it performs internal resolution of service.example.com and connects to one of the hosts
      That's not actually an accurate representation of how case 3 is proposed to work in the current document.  The document is actually proposing the following:
1. client wants to connect to service.example.com
2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
4. ask the proxy mapper about IP 1.2.3.4; it recognizes the IP as the proxy and says to use "CONNECT lb.example.com" via proxy IP 1.2.3.4
5. connect to proxy 1.2.3.4 with "CONNECT lb.example.com"; proxy does internal name resolution and connects to one of the load balancers
6. send grpclb request; get response indicating that the backend server is 5.6.7.8
7. ask the proxy mapper about IP 5.6.7.8; it recognizes it as an internal IP address and says to use "CONNECT 5.6.7.8" via proxy IP 1.2.3.4
8. connect to proxy 1.2.3.4 with "CONNECT 5.6.7.8"; proxy connects to the specified backend server

Remember that the goal of case 3 is to allow client-side per-call load balancing, despite not being able to resolve the internal names of the backend servers.  Instead of getting those from DNS, we get them from the grpclb balancer.
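The two proxy-mapper lookups in this flow (the resolver-returned proxy IP vs. a balancer-returned backend IP) can be sketched as a small decision function. This is purely illustrative: the proxy address, the internal address range, and the function shape are assumptions, not gRPC's actual proxy-mapper API.

```python
import ipaddress

PROXY_IP = "1.2.3.4"                               # illustrative proxy address
INTERNAL_NET = ipaddress.ip_network("5.6.7.0/24")  # assumed internal backend range

def map_address(resolved_ip, original_name):
    """Return (proxy_address, connect_argument), or None for a direct connection."""
    if resolved_ip == PROXY_IP:
        # The resolver handed us the proxy itself: tunnel to the original
        # name (e.g. lb.example.com) and let the proxy resolve it internally.
        return (PROXY_IP, original_name)
    if ipaddress.ip_address(resolved_ip) in INTERNAL_NET:
        # A balancer-returned backend: tunnel to the literal internal IP.
        return (PROXY_IP, resolved_ip)
    return None  # no proxy needed

print(map_address("1.2.3.4", "lb.example.com"))  # CONNECT lb.example.com via the proxy
print(map_address("5.6.7.8", "lb.example.com"))  # CONNECT 5.6.7.8 via the proxy
```

Note how the mapper must already know both the proxy address and the internal range, which is exactly the coordination concern discussed below.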

         
        Case 3 using reverse proxy for LB
        1. client wants to connect to service.example.com
        2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
        3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
        4. (different starting here) connect to 1.2.3.4, which is a transparent reverse proxy
        5. Perform an RPC to 1.2.3.4. Host header is lb.example.com. The proxy performs internal mapping of lb.example.com to internal addresses and connects to one of the hosts, forwarding the RPC.
        The reverse proxy approach is essentially what I originally suggested for case 3, but Julien argued that it would be a security problem.

        Keep in mind that in case 3, the grpclb load balancers and the server backends are in the same internal domain, with the same access restrictions.  If we can't use a reverse proxy to access the server backends, I don't think we'll be able to do that for the grpclb balancers either.

        That having been said, as a security issue, Julien can address this directly.

         

        I agree that case 3 requires different parts of the system to be coordinated.  For example, assuming that your proxy mapper implementation is getting the list of proxy addresses from a local file, you would need to first push an updated list that contains the new proxy address to all clients.  Then, once all clients have been updated, you can add the new proxy to DNS.

And the file needs to contain old proxy addresses that are still recognized for detection but are no longer used.

        Okay. So we're on the same page there.

        I agree that this is cumbersome, but I think it's an inherent problem with case 3, because you need some way to configure the clients.

I agree you need to be able to configure the clients. I understand that something needs to tell the client what to do. My concern was the pain of updating the proxy mapping list in concert with name resolution. And because of that I would recommend that implementors use the magic IP, because it has less operational overhead and less likelihood of failing.

        If you assume "one proxy" which has "one static IP" and everything is hard-coded, then the design is fine. But that seems unlikely to describe a productionized system. And that's why I would feel forced to use the "magic IP" that it seems you have previously rejected.

        There are a couple of reasons that I don't like the "magic IP" approach.  First, it requires writing a custom resolver in addition to a custom proxy mapper,

        No, I'd just have DNS return the trash IP.

        This actually makes me even less happy with the sentinel-value approach, because now we wouldn't just be using the value internally in a particular piece of software; we'd actually be publishing it in a way that would be very confusing when people were trying to debug the system from an operational perspective.  ("Wait, why is the client even attempting to connect to the proxy, since DNS points it at this bogus IP address?")

         

        I'm not a big fan of "sentinel" values, since it's often hard to find a value that will never be used in real life.

        I would gladly accept a magic value instead of needing to make sure two systems stay in sync and rollouts happen properly. And I would quickly recommend that to others. And if I started explaining the gotchas of the alternative, I'd expect them to quickly be thankful for the recommendation since it is less code to write and less operational complexity.

        I do see your point, but I think that the sentinel-value approach has operational downsides of its own.  There are pros and cons here, so it basically boils down to a judgement call, and personally, I prefer the alternative that's currently outlined in the doc.


Just thinking out loud here about whether there's another alternative -- this is a purely brainstorming-level idea, so please feel free to shoot holes in it.  What if we had another type of SRV record specifically for HTTP CONNECT proxy use?  The presence of that record would tell the client to connect to that address and issue a CONNECT request using the originally looked up name.  With that, case 3 would look something like this:
1. client wants to connect to service.example.com
2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
3. do DNS SRV resolution for _grpc_proxy._tcp.lb.example.com; you find it is a proxy with name proxy.example.com
4. do DNS lookup for proxy.example.com; get IP 1.2.3.4
5. connect to proxy 1.2.3.4 with "CONNECT lb.example.com"; proxy does internal name resolution and connects to one of the load balancers
6. send grpclb request; get response indicating that the backend server is 5.6.7.8
7. ask the proxy mapper about IP 5.6.7.8; it recognizes it as an internal IP address and says to use "CONNECT 5.6.7.8" via proxy IP 1.2.3.4
8. connect to proxy 1.2.3.4 with "CONNECT 5.6.7.8"; proxy connects to the specified backend server

In this case, there's no proxy mapper involved in the grpclb connection, only for the backend connections, so the proxy mapper doesn't need to sync up with the resolver result (which would seem to ameliorate your concern).  The down-sides are that there are more DNS lookups involved, and that we might need to extend the resolver API so that it can pass down richer information (not sure about that -- would need to think about this more fully).

I'm not sure that this approach is really worth the additional complexity, but I figured I'd shoot it out there and see what you think.  Thoughts...?
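A rough sketch of that brainstormed lookup chain, with DNS stubbed out by a table. The `_grpc_proxy` record type is hypothetical (no such record exists today), and `resolve` stands in for real SRV/A lookups:

```python
FAKE_DNS = {  # stand-in for real SRV/A lookups, values are illustrative
    "_grpclb._tcp.service.example.com": [("lb.example.com", 443)],
    "_grpc_proxy._tcp.lb.example.com": [("proxy.example.com", 3128)],
    "proxy.example.com": ["1.2.3.4"],
}

def resolve(name):
    return FAKE_DNS.get(name)

def find_connect_target(service):
    """Walk the chain: service -> balancer (if any) -> proxy (if any)."""
    lb = resolve(f"_grpclb._tcp.{service}")
    target = lb[0][0] if lb else service
    proxy = resolve(f"_grpc_proxy._tcp.{target}")
    if proxy is None:
        return {"proxy": None, "connect_arg": target}
    proxy_host, proxy_port = proxy[0]
    proxy_ip = resolve(proxy_host)[0]
    # Connect to the proxy and issue "CONNECT <target>"; the proxy resolves
    # the target name internally.
    return {"proxy": (proxy_ip, proxy_port), "connect_arg": target}

print(find_connect_target("service.example.com"))
```

The proxy mapper only enters the picture afterwards, for the balancer-returned backend IPs, which is why the resolver and mapper no longer need to be kept in sync for the balancer connection.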

Mark D. Roth

Jan 27, 2017, 6:53:43 PM
to Eric Anderson, grpc.io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li, ♾ Bill Clarke
          (+ll...@google.com)

During the course of this discussion, it occurred to me that the proposed solution for case 3 may affect how the clients are distributed across the grpclb load balancers.

          Bill, in this scenario, the client will not know the IP addresses of the balancers directly; instead, it will just know the proxy address and will depend on the proxy to resolve the internal names of the balancers.  This means that even if there are many balancers, the client will not have connections to all of them, but every time it connects to the proxy, it's likely to get a different balancer.  So if the individual balancer that it happens to be talking to goes down, it will create a new connection to the proxy, which will pick a new balancer task to talk to.

          I think this is probably fine w.r.t. our conversations about properly balancing load across the balancers, but I want to make sure that there's no problem here that I might be missing.  Does this sound reasonable to you?

          Thanks!

Julien Boeuf

Jan 27, 2017, 7:20:05 PM
to Mark D. Roth, Eric Anderson, grpc.io, Craig Tiller, Abhishek Kumar, Menghan Li
          I agree: we don't want clients to be able to hit the load balancers (or the backends) before the proxy has a chance to filter things like source IP address and/or traffic inspection. You could argue that the transparent proxy could perform these functions, however this would mean that:
- we now have 2 proxies that perform the same function.
          - it is not possible to signal anything with regards to the shape of the traffic from the client to the transparent reverse proxy since there is no HTTP CONNECT request. This makes traffic inspection very difficult as it has to rely on heuristics as opposed to clear and unambiguous signalling.  

Mark D. Roth

Jan 30, 2017, 4:16:19 PM
to Eric Anderson, grpc.io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li, ♾ Bill Clarke
          FYI, I chatted with Bill about this today, and he agreed that this approach should be fine w.r.t. properly balancing load across the grpclb balancers.

The only down-side we can think of is that when a client is talking to a balancer that goes down, it will take the client longer to connect to a different balancer than it probably would if the client were not going through a proxy, because the client does not already have other subchannels configured that it can try.  Instead, it will need to reconnect to the proxy and wait for the proxy to connect to a balancer task that is up, which will likely be slower (especially if multiple attempts are needed to find a balancer task that is up).  However, this is probably acceptable.

Julien Boeuf

Jan 30, 2017, 4:44:56 PM
to Mark D. Roth, Eric Anderson, grpc.io, Craig Tiller, Abhishek Kumar, Menghan Li, ♾ Bill Clarke
          sgtm. Thanks much for checking.

               Julien.

Eric Anderson

Jan 31, 2017, 1:11:42 PM
to Mark D. Roth, grpc.io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          On Fri, Jan 27, 2017 at 3:41 PM, Mark D. Roth <ro...@google.com> wrote:
          On Fri, Jan 27, 2017 at 1:31 PM, Eric Anderson <ej...@google.com> wrote:
          Case 3 as stated today (for contrasting)
          1. client wants to connect to service.example.com
          2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
          3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
          4. ask the proxy mapper about IP 1.2.3.4, it recognizes the IP as the proxy and says to use "CONNECT service.example.com" via proxy IP 1.2.3.4
          5. connect to proxy 1.2.3.4, it performs internal resolution of service.example.com and connects to one of the hosts
          That's not actually an accurate representation of how case 3 is proposed to work in the current document.

Oh, sorry. #4 and 5 should have used lb.example.com instead of service.example.com. That seems to be the only change you made.

          Just thinking out loud here about whether there's another alternative -- this is a purely brainstorming-level idea, so please feel free to shoot holes in it.  What if we had another type of SRV record specifically for HTTP CONNECT proxy use?

          I considered something of that ilk, but wasn't very excited. I do agree it could work. I don't think we want a separate SRV for it, because that means another lookup for all clients for this one rare case.

          But maybe we could shoe-horn it somewhere. Like service config. Probably icky.

This does have the advantage that it can be rolled out seamlessly, without an update to the client.

          1. client wants to connect to service.example.com
          2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
          3. do DNS SRV resolution for _grpc_proxy._tcp.lb.example.com; you find it is a proxy with name proxy.example.com
          Whoa. So do a SRV for the LB. I didn't quite expect that as it goes a bit against normal SRV, but it makes sense. It's interesting. Again, for reasons above, I don't think we want to go with this, but it does seem like it could work.

Mark D. Roth

Jan 31, 2017, 1:25:20 PM
to Eric Anderson, grpc.io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          On Tue, Jan 31, 2017 at 10:11 AM, Eric Anderson <ej...@google.com> wrote:
          On Fri, Jan 27, 2017 at 3:41 PM, Mark D. Roth <ro...@google.com> wrote:
          On Fri, Jan 27, 2017 at 1:31 PM, Eric Anderson <ej...@google.com> wrote:
          Case 3 as stated today (for contrasting)
          1. client wants to connect to service.example.com
          2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
          3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
          4. ask the proxy mapper about IP 1.2.3.4, it recognizes the IP as the proxy and says to use "CONNECT service.example.com" via proxy IP 1.2.3.4
          5. connect to proxy 1.2.3.4, it performs internal resolution of service.example.com and connects to one of the hosts
          That's not actually an accurate representation of how case 3 is proposed to work in the current document.

Oh, sorry. #4 and 5 should have used lb.example.com instead of service.example.com. That seems to be the only change you made.

          Well, I also added a few additional steps after step 5, but perhaps they are less relevant to the current discussion.
           

          Just thinking out loud here about whether there's another alternative -- this is a purely brainstorming-level idea, so please feel free to shoot holes in it.  What if we had another type of SRV record specifically for HTTP CONNECT proxy use?

          I considered something of that ilk, but wasn't very excited. I do agree it could work. I don't think we want a separate SRV for it, because that means another lookup for all clients for this one rare case.

          But maybe we could shoe-horn it somewhere. Like service config. Probably icky.

This does have the advantage that it can be rolled out seamlessly, without an update to the client.

          1. client wants to connect to service.example.com
          2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you find it is a LB with name lb.example.com
          3. do DNS SRV resolution for _grpc_proxy._tcp.lb.example.com; you find it is a proxy with name proxy.example.com
          Whoa. So do a SRV for the LB. I didn't quite expect that as it goes a bit against normal SRV, but it makes sense. It's interesting. Again, for reasons above, I don't think we want to go with this, but it does seem like it could work.

          Yeah, it seems a bit unnecessarily complex to me, too.


          So I think this leaves us with the current design.  I've added a note to the doc explaining the need for coordination between the resolver and the proxy mapper for case 3.

Eric Anderson

Jan 31, 2017, 3:41:18 PM
to Julien Boeuf, Mark D. Roth, grpc.io, Craig Tiller, Abhishek Kumar, Menghan Li
          On Fri, Jan 27, 2017 at 4:20 PM, Julien Boeuf <jbo...@google.com> wrote:
          Keep in mind that in case 3, the grpclb load balancers and the server backends are in the same internal domain, with the same access restrictions.  If we can't use a reverse proxy to access the server backends, I don't think we'll be able to do that for the grpclb balancers either.

          That having been said, as a security issue, Julien can address this directly.
          I agree: we don't want clients to be able to hit the load balancers (or the backends) before the proxy has a chance to filter things like source IP address and/or traffic inspection. You could argue that the transparent proxy could perform these functions, however this would mean that:
          - we have now 2 proxies that perform the same function.

          Yep. I agree. 

          - it is not possible to signal anything with regards to the shape of the traffic from the client to the transparent reverse proxy since there is no HTTP CONNECT request. This makes traffic inspection very difficult as it has to rely on heuristics as opposed to clear and unambiguous signalling.
           
          I don't understand this. "Traffic inspection" should be much easier with reverse proxy, since it actually sees the traffic vs an encrypted blob. Are you suggesting that we'll support putting extra request headers in the CONNECT? I didn't think we were going to do that, other than for auth, and we already know how to handle that sort of auth in the reverse proxy case.

          We can put this part of the conversation on hold, since the double proxy is a pain.

          Eric Anderson

          unread,
          Jan 31, 2017, 3:44:05 PM1/31/17
          to Mark D. Roth, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          On Tue, Jan 31, 2017 at 10:25 AM, Mark D. Roth <ro...@google.com> wrote:
          So I think this leaves us with the current design.

          This solution wasn't shown to be lacking:

          As another alternative, why not fix all of case 1 and support programmatic configuration, in addition to http_proxy?

          What would be wrong with specifying configuration that lb.example.com should use proxy 1.2.3.4 (like in case 1, but only for the host lb.example.com)?

          A more concrete flow:
          1. client wants to connect to service.example.com
          2. check whether service.example.com should use a proxy. It shouldn't
          3. do DNS SRV resolution for _grpclb._tcp.service.example.com to determine if it should use a LB. It should, and says to connect to lb.example.com
          4. check whether lb.example.com should use a proxy. It should
          5. use CONNECT to lb.example.com (as a hostname)
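That flow could be sketched as follows (the per-host proxy table and its shape are purely illustrative, as is the proxy address):

```python
# Sketch of the per-host proxy check, plus the HTTP CONNECT request
# that would be sent to the proxy. PROXY_CONFIG is illustrative.
PROXY_CONFIG = {"lb.example.com": "1.2.3.4:3128"}

def proxy_for(host):
    """Return the configured proxy for a host, or None for direct connect."""
    return PROXY_CONFIG.get(host)

def connect_request(host, port):
    """Build the HTTP CONNECT request for the tunnel target (by hostname)."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n"
```

Because the CONNECT target is a hostname rather than an IP address, the proxy does the final name resolution, which is what makes this work when the client cannot resolve lb.example.com itself.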

          Mark D. Roth

          unread,
          Feb 1, 2017, 11:57:39 AM2/1/17
          to Eric Anderson, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          This approach would require an additional hook in the client channel code, separate from what we're already providing via the proxy mapper hook.  I'd prefer to avoid providing two separate hooks for this.

          Eric Anderson

          unread,
          Feb 1, 2017, 1:04:22 PM2/1/17
          to Mark D. Roth, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          We will need it eventually. Otherwise we can't truly support the mixed internal/external communication when DNS is unavailable for external hosts (the full form of case 1).

          So the argument can be "we don't want to support it now" but I don't think I could accept "we don't ever want to support it."

          Mark D. Roth

          unread,
          Feb 1, 2017, 1:16:39 PM2/1/17
          to Eric Anderson, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          Do we know that there are cases where we'll need to support this?  I don't doubt that there are cases where only connections to external servers should go through the proxy, but I wonder how many cases there are where users will need to use both internal and external servers with gRPC.

          If/when we do support this, do we have some idea how this would be configured?  Is there some standard way of configuring which server names are internal vs. which are external, or would it be custom code for each environment?

          Eric Anderson

          unread,
          Feb 1, 2017, 1:37:39 PM2/1/17
          to Mark D. Roth, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          On Wed, Feb 1, 2017 at 10:16 AM, Mark D. Roth <ro...@google.com> wrote:
          Do we know that there are cases where we'll need to support this?  I don't doubt that there are cases where only connections to external servers should go through the proxy, but I wonder how many cases there are where users will need to use both internal and external servers with gRPC.

          So let me get this straight: we think there are users who can't use DNS for external hostnames and need to use a proxy for external services. We know this is most likely in corporate and server environments (for security). But we wouldn't support those users jumping further onto the gRPC bandwagon, because it either wouldn't work or performance and overhead would be poor.

          Or conversely: if you are using gRPC for your own communication, you cannot use gRPC to communicate with "the cloud" unless you accept a performance drop for the existing traffic and increased overhead.

          If/when we do support this, do we have some idea how this would be configured?  Is there some standard way of configuring which server names are internal vs. which are external, or would it be custom code for each environment?

          Probably custom code.

          Mark D. Roth

          unread,
          Feb 2, 2017, 11:10:21 AM2/2/17
          to Eric Anderson, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          Eric and I chatted about this at lunch yesterday.  This message is intended mainly to document what we agreed upon and the logic we used to get there.  (Eric, please correct anything I might get wrong here.)

          My understanding of Eric's concern is that he believes that we will ultimately need to support the situation in case 1 where the client needs to talk to both internal and external servers, where only the latter should go through the proxy, and he was trying to drive us toward a solution that would address both that case and case 3 in a way that does not suffer from the drawback he pointed out for case 3 (i.e., needing to update the list of proxies known to the proxy mapper before you can add them to DNS).

          For case 1 with both internal and external servers, we agreed that we would need a hook right before the resolver is called to conditionally override the name being passed to the resolver.  That way, we could insert logic that would ignore internal names but override external names to use the proxy name instead.
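A minimal sketch of such a pre-resolver hook, assuming internal names can be distinguished by a suffix (the suffix and proxy name here are hypothetical, and a real deployment would plug in its own custom logic):

```python
# Hypothetical pre-resolver hook: internal names are resolved as-is,
# while external names are replaced by the proxy's name so that only
# the proxy gets resolved and dialed. Suffix and proxy are illustrative.
INTERNAL_SUFFIX = ".corp.example.com"
PROXY_NAME = "proxy.example.com:3128"

def override_resolver_target(name):
    if name.endswith(INTERNAL_SUFFIX):
        return name      # internal: no proxy, resolve directly
    return PROXY_NAME    # external: resolve the proxy instead
```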

          Unfortunately, that approach doesn't really help address the problem Eric wanted to solve for case 3, because there we would want to use this hook to select the proxy only for the load balancers, not for the public name of the server.  However, as per the proposed mechanism in gRFC A5 (https://github.com/grpc/proposal/pull/10) for using SRV records to indicate load balancers, we would only pass the public name of the server to the resolver; the name for the load balancers will only be seen internally in the resolver, so it would not be visible to a hook right before the resolver is called.

          We could only think of two alternatives to this, both of which are unappealing: we could put the hook inside of the resolver itself, in which case every resolver implementation would need to support it, or we could change the resolver API such that we have to exit the resolver and then re-enter each time one name lookup spawns another one, which makes the API more complex.  Neither of those seems worthwhile, given that we have a simpler solution to the case 1 situation that Eric was worried about and that the primary customer for case 3 (Julien) is fine with the currently proposed solution, despite the drawback that Eric has pointed out.

          So, we agreed to go ahead and add the hook for case 1 with both internal and external servers, but to leave the currently proposed solution for case 3 as-is.

          I've sent out a PR adding the new hook for case 1 to C-core (https://github.com/grpc/grpc/pull/9557).

          I will update the gRFC and update this thread when that's done.

          Mark D. Roth

          unread,
          Feb 2, 2017, 11:37:22 AM2/2/17
          to Eric Anderson, grpc. io, Julien Boeuf, Craig Tiller, Abhishek Kumar, Menghan Li
          The gRFC has now been updated with the latest changes.

          Mark D. Roth

          unread,
          Mar 13, 2017, 4:53:19 PM3/13/17
          to grpc. io, Julien Boeuf, Eric Anderson, Craig Tiller, Abhishek Kumar, Menghan Li
          There was a question in the PR about how we're handling authentication, but in an attempt to keep all design discussion in this thread, I'm going to answer it here.

          The current proposal does not attempt to address proxy authentication.  I know that this is something that Eric has been considering for Java, since there are issues there related to finding the proxy via Java system properties that may assume that you are also using Java system properties for authentication to the proxy.  In the other languages, we have not attempted to address this at all, but it's probably something that we should have a cross-language design for.

          I don't want to hold up the current proposal on this feature, but we can probably try to put together a follow-up proposal to address this sometime in the next couple of quarters.

          On Wed, Jan 18, 2017 at 2:12 PM, Mark D. Roth <ro...@google.com> wrote:
          I've created a gRFC describing how HTTP CONNECT proxies will be supported in gRPC:

          https://github.com/grpc/proposal/pull/4

          Please keep discussion in this thread.  Thanks!
          --
          Mark D. Roth <ro...@google.com>
          Software Engineer
          Google, Inc.

          Abhishek Kumar

          unread,
          Mar 17, 2017, 6:17:00 PM3/17/17
          to grpc.io, jbo...@google.com, ej...@google.com, cti...@google.com, abhi...@google.com, meng...@google.com
          I am ready to approve and merge this proposal. We can address proxy authentication in a separate proposal as needed.

          Thanks
          -Abhishek

          Mark D. Roth

          unread,
          Mar 20, 2017, 11:27:10 AM3/20/17
          to Abhishek Kumar, grpc.io, Julien Boeuf, Eric Anderson, Craig Tiller, Menghan Li
          I did a little bit of final wordsmithing and added a couple of links.  I think this gRFC is ready to be merged.

          Abhishek, please feel free to merge.  Thanks!
