So I need a bit of help figuring out how to handle X-Forwarded-For, and specifically what to do in the presence of multiple IPs.
Django's SetRemoteAddrFromForwardedFor middleware used to take the *first* item in the X-F-F header, but after http://code.djangoproject.com/ticket/3872 was filed we changed it to take the *last* IP.
Now we're getting reports that the IP we want is, in fact, the first IP after all (a fact confirmed by http://en.wikipedia.org/wiki/X-Forwarded-For -- if Wikipedia is capable of actually confirming anything :)
Is there anyone on this group who's got a pretty good knowledge of all the various HTTP proxies and can provide some advice? Obviously we've got to pick one IP; which should it be?
On 9/20/07, Jacob Kaplan-Moss <jacob.kaplanm...@gmail.com> wrote:
> Django's SetRemoteAddrFromForwardedFor middleware used to take the *first* item in the X-F-F header, but after http://code.djangoproject.com/ticket/3872 was filed we changed it to take the *last* IP.
as its source for information. That article is about reverse proxying rather than proxying and the author of that article in the last comment on the article says:
"The use case here is to determine the external IP address of the service, in a local reverse-proxy style load balancer configuration. The goal is not to find out any information whatsoever about the client. "
Given that, I believe all of your sources for information on X-F-F agree and the change should be reverted.
> So I need a bit of help figuring out how to handle X-Forwarded-For, and specifically what to do in the presence of multiple IPs.
> Django's SetRemoteAddrFromForwardedFor middleware used to take the *first* item in the X-F-F header, but after http://code.djangoproject.com/ticket/3872 was filed we changed it to take the *last* IP.
> Now we're getting reports that the IP we want is, in fact, the first IP after all (a fact confirmed by http://en.wikipedia.org/wiki/X-Forwarded-For -- if Wikipedia is capable of actually confirming anything :)
> Is there anyone on this group who's got a pretty good knowledge of all the various HTTP proxies and can provide some advice? Obviously we've got to pick one IP; which should it be?
Hi, Jacob.
In the comments for the article Simon cited on the ticket, Bob confirms that leftmost is the client and rightmost is the last proxy, but he was trying for the most trustworthy IP in the chain, not the client's IP. I'm pretty sure this is the norm. Squid and mod_proxy do this for sure; I believe perlbal does too, but you could confirm that with your guys. I think it's pretty safe to revert, at least for the major proxies.
Granted, a proxy could do what it wants with the ordering, but there's no way to avoid that, IMHO.
On 9/20/07, Jacob Kaplan-Moss <jacob.kaplanm...@gmail.com> wrote:
> Howdy folks --
> So I need a bit of help figuring out how to handle X-Forwarded-For, and specifically what to do in the presence of multiple IPs.
> Django's SetRemoteAddrFromForwardedFor middleware used to take the *first* item in the X-F-F header, but after http://code.djangoproject.com/ticket/3872 was filed we changed it to take the *last* IP.
> Now we're getting reports that the IP we want is, in fact, the first IP after all (a fact confirmed by http://en.wikipedia.org/wiki/X-Forwarded-For -- if Wikipedia is capable of actually confirming anything :)
Wikipedia isn't confirming that the first IP should be taken. It says that the first entry is the "farthest downstream client". But if you are going to believe it, you are blindly trusting every downstream client that provides some part of that list.
What stops a client from setting X-Forwarded-For to a false address? When the request passes through the reverse proxy, X-Forwarded-For will be "false-address, the-real-address".
So, the only case where you really want to use the X-Forwarded-For middleware is when you have exactly _one_ trusted reverse proxy in front of your server. In that case, the proxy will append the address of its client at the _end_ of X-Forwarded-For (because if the header already exists, the proxy's client is not supposed to be the "farthest downstream client").
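To make the difference concrete, here is a minimal sketch (not the actual middleware code) of the two selection rules applied to the spoofed header from the example above. The function names are made up for illustration, and the header values are the same placeholders used in the text.

```python
def first_forwarded_ip(header):
    """Return the left-most entry: whatever the original sender claimed."""
    return header.split(",")[0].strip()


def last_forwarded_ip(header):
    """Return the right-most entry: the address appended by the nearest proxy."""
    return header.split(",")[-1].strip()


# Header as seen behind one trusted reverse proxy, after the client sent
# its own "X-Forwarded-For: false-address" (placeholder values).
spoofed = "false-address, the-real-address"

print(first_forwarded_ip(spoofed))  # false-address
print(last_forwarded_ip(spoofed))   # the-real-address
```

With exactly one trusted proxy, only the last entry was written by something you control; everything to its left is whatever the client chose to send.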
I can't explain why, according to the ticket, Chris Bennett gets "HTTP_X_FORWARDED_FOR: 66.162.32.x, 127.0.0.1". But, using Wikipedia again as reference, it would indicate that _two_ proxies are involved.
Anyway, please *do not* revert it. Such a change would make it easy to fake the remote address when using that middleware. If people are _really_ using more than one trusted proxy (a transparent Squid getting in the way, maybe?), the middleware could have a setting to let the user indicate how many values of X-Forwarded-For are known to be good.
-- Leo Soto M.
On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
> In the comments for the article Simon cited on the ticket, Bob confirms left most is client, right most is last proxy but he was trying for the most trust worthy IP in the chain, not the client's IP. I'm pretty sure this is the norm. Squid and mod_proxy do this for sure; I believe perlbal, too, but you could confirm that with your guys. I think it's pretty safe to revert, at least for the major proxies.
OK, I can confirm that perlbal indeed appends to X-F-F, so I'm going to assume that if Squid, mod_proxy, and perlbal do it, it's as "correct" as it's gonna get. I'm going to revert this change and go back to assuming that the client IP is the first one.
On 9/20/07, Craig Ogg <craig....@gmail.com> wrote:
> On 9/20/07, Jacob Kaplan-Moss <jacob.kaplanm...@gmail.com> wrote:
> > Django's SetRemoteAddrFromForwardedFor middleware used to take the *first* item in the X-F-F header, but after http://code.djangoproject.com/ticket/3872 was filed we changed it to take the *last* IP.
> Wikipedia isn't confirming that the first IP should be taken. It says that the first entry is the "farthest downstream client". But if you are going to believe it, you are blindly trusting on every downstream client who is providing some part of such list.
> What stops the client who wants to set X-Forwarded-For to a false address? When it passes through the reverse proxy, X-Forwarded-For will be "false-address, the-real-address".
> So, the only case when you really want to use the X-Forwarded-For middleware is when you have exactly _one_ trusted reverse proxy in front of your server. In such case, the proxy will append the address of its client at the _end_ of X-Forwarded-For (because if the header already exists, the proxy's client is not supposed to be the "farthest downstream client").
> I can't explain why, according to the ticket, Chris Bennett gets "HTTP_X_FORWARDED_FOR: 66.162.32.x, 127.0.0.1". But, using Wikipedia again as reference, it would indicate that _two_ proxies are involved.
> Anyway, please *do not* revert it. Such change would make it easy to fake the remote address when using that middleware. If people are _really_ using more than one trusted proxy (a transparent Squid getting in the way maybe?), the middleware could have a setting to let the user indicate how many values of X-Forwarded-For are known to be good.
I completely agree you shouldn't use this middleware unless you know and trust the proxy setup, but I can easily imagine a situation (large corporate networks) where there could be multiple proxies. Seems to me it's better to be clear about the dangers in the docs rather than trying to prevent someone from using this with multiple proxies.
On 9/20/07, Leo Soto M. <leo.s...@gmail.com> wrote:
> Anyway, please *do not* revert it. Such change would make it easy to fake the remote address when using that middleware. If people are _really_ using more than one trusted proxy (a transparent Squid getting in the way maybe?), the middleware could have a setting to let the user indicate how many values of X-Forwarded-For are known to be good.
*sigh*
Already did. Mostly because in a typical squid -> perlbal -> apache setup the take-the-last behavior was broken.
I really don't know how to both take the first *and* the last, so I'm going to go with what appears to work for me over what doesn't.
On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
> I completely agree you shouldn't use this middleware unless you know and trust the proxy setup, but I can easily imagine (large corporate networks) a situation where there could be multiple proxies. Seems to me it's better to be clear of the dangers in the docs rather than trying to prevent someone using this with multiple proxies.
No. With the patch applied, the middleware is *secure* for people who trust _one_ reverse proxy. Without it, the middleware is *insecure* for anybody who uses it. The use case of reverse proxy users who want a *reliable* remote address is not merely hard: it is impossible. No documentation can fix it.
On 9/20/07, Leo Soto M. <leo.s...@gmail.com> wrote:
> On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
> > I completely agree you shouldn't use this middleware unless you know and trust the proxy setup, but I can easily imagine (large corporate networks) a situation where there could be multiple proxies. Seems to me it's better to be clear of the dangers in the docs rather than trying to prevent someone using this with multiple proxies.
> No. With the patch applied, the Middleware is *secure* for people who trust in _one_ reverse proxy. Without it, the Middleware is *insecure* for anybody who uses it. The use case for reverse proxy users who want to have a *reliable* remote address is not even hard: is impossible. No documentation can fix it.
But what about the case of multiple trusted proxies (not the case of the client acting as a proxy)? Or what if the proxy sends the XFF header as [CLIENTIP, PROXYIP], which is what I believe the major ones do and what caused the patch to break existing setups?
On 9/20/07, Deryck Hodge <der...@samba.org> wrote: [...]
> But what about the case of multiple trusted proxies (not the case of the client acting as a proxy)? Or what about if the proxy sends the XFF header as [CLIENTIP, PROXYIP] which is what I believe the major ones do and what cause the patch to break existing setups?
Exactly. We have to fix these cases without breaking security. On the other hand, maybe a reliable remote IP address is not that important. Then the docs should be fixed, because currently they somehow imply that you can trust HTTP_X_FORWARDED_FOR in some cases. You can't.
Now, if having a reliable remote IP address is important, then a setting (NUMBER_OF_TRUSTED_PROXY_SERVERS?) specifying how many values you can trust is the only thing that occurs to me. (I'm not that creative).
Then, you get the right remote IP using x_forwarded_for.split(",")[-NUMBER_OF_TRUSTED_PROXY_SERVERS].strip().
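A rough sketch of how that indexing could look as a helper, assuming each trusted proxy appends the address of whoever connected to it. NUMBER_OF_TRUSTED_PROXY_SERVERS is only the setting name floated above, not an existing Django setting, and the function name is invented for illustration. Returning None when the header is shorter than expected is one possible choice, not something the proposal specifies.

```python
# Hypothetical setting, as proposed above; not a real Django setting.
NUMBER_OF_TRUSTED_PROXY_SERVERS = 2


def client_ip_from_xff(x_forwarded_for, trusted=NUMBER_OF_TRUSTED_PROXY_SERVERS):
    """Pick the entry appended by the outermost trusted proxy.

    Counting from the right, the last `trusted` entries were written by
    proxies we control; anything further left came from the client side.
    """
    parts = [p.strip() for p in x_forwarded_for.split(",") if p.strip()]
    if len(parts) < trusted:
        # Fewer hops than expected: the request did not pass through
        # every trusted proxy, so there is no entry we can rely on.
        return None
    return parts[-trusted]


# A client spoofing "6.6.6.6" in front of two appending proxies:
print(client_ip_from_xff("6.6.6.6, 203.0.113.7, 10.0.0.1"))  # 203.0.113.7
```

Note that with this counting, the value of the setting is the number of header entries written by trusted machines, which may be one less than the number of proxies if the last hop is only visible as REMOTE_ADDR.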
On 9/20/07, Leo Soto M. <leo.s...@gmail.com> wrote:
> Now, if having a reliable remote IP address is important, then a setting (NUMBER_OF_TRUSTED_PROXY_SERVERS?) specifying how many values you can trust is the only thing that occurs to me. (I'm not that creative).
Doh. That was a complicated proposal. White-listing trusted proxy servers seems more intuitive from the sysadmin point of view, and more reliable. How about that?
> On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
> [...]
> > But what about the case of multiple trusted proxies (not the case of the client acting as a proxy)? Or what about if the proxy sends the XFF header as [CLIENTIP, PROXYIP] which is what I believe the major ones do and what cause the patch to break existing setups?
> Exactly. We have to fix these cases, without breaking security. On the other hand, maybe a reliable remote IP address is not that important. Then, the doc should be fixed, because currently it somehow implies that you can trust HTTP_X_FORWARDED_FOR in some cases. You can't.
> Now, if having a reliable remote IP address is important, then a setting (NUMBER_OF_TRUSTED_PROXY_SERVERS?) specifying how many values you can trust is the only thing that occurs to me. (I'm not that creative).
> Then, you get the right remote IP using x_forwarded_for.split(",")[-NUMBER_OF_TRUSTED_PROXY_SERVERS].strip().
> What do you think?
I'll let someone else speak to providing a configurable option for this. It feels a bit much for me, but certainly provides more flexibility. But it's also not hard to write a custom middleware if your proxy setup isn't the common case.
I guess I would challenge the notion, too, that you can't trust the client IP when you trust the proxy or proxies, at least in the sense of knowing trusted proxies versus untrusted. For example, if my setup has proxies p1 and p2:
client (untrusted) --> p1 --> p2 --> django
Can't I trust p1 and p2 to set up the client IP appropriately in XFF between the two of them? It's not like p1 or p2 are going to read the XFF header from the untrusted client. If they do, the problem is in proxy trust, and I don't think Django can be asked to account for this.
Since there seem to be two use cases, might I suggest forking the secondary use case into a separate middleware class?
Whether or not the trusted reverse proxy scenario is more common (though I believe it is), it's best to avoid breaking existing functionality, especially when the SetRemoteAddrFromForwardedFor docstring already stated not to use it with untrusted sources.
Perhaps SetRemoteAddrFromUntrustedForwardedFor middleware, with the secondary implementation?
Chris
On Sep 20, 8:58 am, "Jacob Kaplan-Moss" <jacob.kaplanm...@gmail.com> wrote:
> So I need a bit of help figuring out how to handle X-Forwarded-For, and specifically what to do in the presence of multiple IPs.
> Django's SetRemoteAddrFromForwardedFor middleware used to take the *first* item in the X-F-F header, but after http://code.djangoproject.com/ticket/3872 was filed we changed it to take the *last* IP.
> Now we're getting reports that the IP we want is, in fact, the first IP after all (a fact confirmed by http://en.wikipedia.org/wiki/X-Forwarded-For -- if Wikipedia is capable of actually confirming anything :)
> Is there anyone on this group who's got a pretty good knowledge of all the various HTTP proxies and can provide some advice? Obviously we've got to pick one IP; which should it be?
On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
> I guess I would challenge the notion, too, that you can't trust the client IP when you trust the proxy or proxies, at least in the sense of knowing trusted proxies versus untrusted. For example, if my setup has proxies p1 and p2:
> client (untrusted) --> p1 --> p2 --> django
> Can't I trust p1 and p2 to setup client IP appropriately in XFF between the two of them? It's not like p1 or p2 are going to read the XFF header from the untrusted client.
Yes, of course they *are* going to read it. Otherwise, how would they assemble the XFF header? (Yeah, proxies could have an option to whitelist known downstream proxies, but do they?)
> If they do, the problem is in proxy trust, and I don't think Django can be asked to account for this.
And as I already said, I think that most proxies out there just trust what the client sends, because otherwise you'd never end up with more than one IP in the XFF header.
To give a bit of perspective: IMHO, the problem is that we are giving users the illusion that this middleware will always set up the right remote IP when they trust their proxy server. It won't.
And now, as it seems that the case I was advocating for isn't the common case, I think you were right: documentation can fix it, saying loudly that, *when using this middleware, you can't trust the remote IP, even if you trust the proxy server*. That's all.
> On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
> > I guess I would challenge the notion, too, that you can't trust the client IP when you trust the proxy or proxies, at least in the sense of knowing trusted proxies versus untrusted. For example, if my setup has proxies p1 and p2:
> > client (untrusted) --> p1 --> p2 --> django
> > Can't I trust p1 and p2 to setup client IP appropriately in XFF between the two of them? It's not like p1 or p2 are going to read the XFF header from the untrusted client.
> Yes, of course they *are* going to read it. Otherwise, how would they assemble the XFF header? (Yeah, proxies could have an option to white-list known proxies downstream, but they do?).
A quick Google search turns up that this is indeed easily configurable for both Squid and mod_proxy and the defaults look sane. I'd guess the same for most any decent proxy, but I'm not willing to do the research on every proxy I can think of. :-) This is why I say it's an issue of trusting the proxies, not Django, to do the right thing in this case. If your proxy blindly follows X-Forwarded-For for untrusted clients, you've got it configured wrong, and there's nothing Django can do about that.
On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
> A quick Google search turns up that this is indeed easily configurable for both Squid and mod_proxy and the defaults look sane.
What are those defaults?
My Google-fu is very low today, and I only arrived at the Squid FAQ[1], which says "We must note that access controls based on this header are extremely weak and simple to fake. Anyone may hand-enter a request with any IP address whatsoever[...]".
And the mod_proxy page didn't help either; it just says: "Be careful when using these headers on the origin server, since they will contain more than one (comma-separated) value if the original request already contained one of these headers."
As an aside, is anyone talking about seriously using this for access control? We've established that using X-F-F is a bad idea for that, in fact, I'd say that even known REMOTE_ADDR based auth is a bad idea, so why does it matter whether it is "trustworthy"?
Anyway - I use X-F-F for IP geolocation, so I want the address farthest to the left.
Just to illustrate the spoofing scenario: I sent X-Forwarded-For: 66.66.66.66 to my server (LB, then mod_proxy), so it receives the following in the end (unless configured to do otherwise):
X-Forwarded-For: 66.66.66.66, 24.152.161.x, 127.0.0.1
Otherwise it would have resulted in:
X-Forwarded-For: 24.152.161.x, 127.0.0.1
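The two headers above can be reproduced with a small simulation of proxies that blindly append the address of whoever connected to them. This is only a model of the behaviour described in this thread, not code from any real proxy, and the function names are made up.

```python
def append_forwarded_for(existing_header, peer_address):
    """Mimic a proxy that blindly appends the address of its peer."""
    if existing_header:
        return existing_header + ", " + peer_address
    return peer_address


def through_chain(client_header, peer_addresses):
    """Pass a request through each proxy in order.

    peer_addresses[i] is the address that proxy i sees as the machine
    connecting to it.
    """
    header = client_header
    for peer in peer_addresses:
        header = append_forwarded_for(header, peer)
    return header


# The LB sees the real client; mod_proxy sees the LB on localhost.
hops = ["24.152.161.x", "127.0.0.1"]

print(through_chain("66.66.66.66", hops))  # spoofed header sent by the client
print(through_chain(None, hops))           # no header sent by the client
```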
Obviously IP geolocation is very unreliable, but I definitely want the farthest left IP. If it's private, I still want it so I don't bother to geolocate - because the geolocation will be worthless anyway as the proxy is often located somewhere else entirely and I don't want to show the user that incorrect location. Likewise, if it's faked, I really don't care either.
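Roughly what I mean, as a sketch: take the left-most entry and skip the lookup when it is a private or unparseable address. The function name is made up, and the actual geolocation call is left out since that depends on whatever service you use. It relies only on the standard library ipaddress module.

```python
import ipaddress


def geolocation_candidate(x_forwarded_for):
    """Return the left-most address if it is worth geolocating, else None."""
    leftmost = x_forwarded_for.split(",")[0].strip()
    try:
        address = ipaddress.ip_address(leftmost)
    except ValueError:
        # Garbage or a masked value such as "unknown": nothing to look up.
        return None
    if address.is_private:
        # Private or loopback address: any lookup would be meaningless.
        return None
    return str(address)


print(geolocation_candidate("8.8.8.8, 10.0.0.1"))     # 8.8.8.8
print(geolocation_candidate("10.1.2.3, 8.8.8.8"))     # None
print(geolocation_candidate("not-an-ip, 127.0.0.1"))  # None
```

A faked public address will still come through here, which is fine for this use: the worst case is showing the wrong location.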
If I wanted what my _proxy_ saw as the REMOTE_ADDR, then this middleware would need to be implemented using the x_forwarded_for.split(",")[-NUMBER_OF_TRUSTED_PROXY_SERVERS].strip() method suggested by Leo.
After rethinking this, I believe the interpretation of this header is very much application-specific, so I'd suggest something with the following effect to maintain compatibility and satisfy those that want the client to the farthest out LB/rev proxy:
> On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
> > A quick Google search turns up that this is indeed easily configurable for both Squid and mod_proxy and the defaults look sane.
> What are those defaults?
> My Google-fu is very low today, and I only arrived at the Squid FAQ[1], which says "We must note that access controls based on this header are extremely weak and simple to fake. Anyone may hand-enter a request with any IP address whatsoever[...]".
> And the mod_proxy page didn't help either, it just says: "Be careful when using these headers on the origin server, since they will contain more than one (comma-separated) value if the original request already contained one of these headers."
Squid doesn't append to an existing x-f-f header by default, which seems sane. Turns out mod_proxy does blindly append and it's not configurable (I asked in #apache on irc.freenode.net and looked at the source). Personally, since x-forwarded-for is a de facto standard because of Squid, I would consider Squid's implementation to be the one to follow and call this a bug in Apache. I didn't check perlbal or any other implementation.
Of course, this is straying a bit from the original topic. I still think the middleware as reverted by Jacob is correct. Whether or not you trust REMOTE_ADDR to be the actual client IP after using the middleware is a matter of which proxy or proxies you use, how you have said proxies configured, and what you consider the client (machine connecting to trusted proxies or actual person surfing the web).
If people feel this needs clarifying in the docs, I would be happy to work up a patch for the section on this middleware going into a bit more detail, but this could be a little overkill, given a warning already exists there.
On 9/20/07, Chris Bennett <chrisrbenn...@gmail.com> wrote:
> As an aside, is anyone talking about seriously using this for access control? We've established that using X-F-F is a bad idea for that, in fact, I'd say that even known REMOTE_ADDR based auth is a bad idea, so why does it matter whether it is "trustworthy"?
Access control and auth are not the same.
django.contrib.comments uses REMOTE_ADDR to log the ip address of someone submitting a comment. I've seen other Django apps do similar things (i.e. throttling poll submissions per ip address). This is a form of weak access control and useful if you can reasonably trust REMOTE_ADDR. See my last post to Leo's comment about not following x-forwarded-for headers for better reliability on this.
On 9/21/07, Deryck Hodge <der...@samba.org> wrote:
[...]
> Of course, this is straying a bit from the original topic. I still think the middleware as reverted by Jacob is correct. Whether or not you trust REMOTE_ADDR to be the actual client IP after using the middleware is a matter of which proxy or proxies you use, how you have said proxies configured, and what you consider the client (machine connecting to trusted proxies or actual person surfing the web).
OK, I buy it. It's a matter of proxy configuration. It's potentially dangerous, but it's not Django's business.
-- Leo Soto M.
Leo Soto M. wrote:
> On 9/20/07, Deryck Hodge <der...@samba.org> wrote:
>> I completely agree you shouldn't use this middleware unless you know and trust the proxy setup, but I can easily imagine (large corporate networks) a situation where there could be multiple proxies. Seems to me it's better to be clear of the dangers in the docs rather than trying to prevent someone using this with multiple proxies.
> No. With the patch applied, the Middleware is *secure* for people who trust in _one_ reverse proxy. Without it, the Middleware is *insecure* for anybody who uses it. The use case for reverse proxy users who want to have a *reliable* remote address is not even hard: is impossible. No documentation can fix it.
Hi. I think I was the original author of this mess ;-)
my 2c's.
X-Forwarded-For can be spoofed, and if you want to be really secure you want to get the IP just after the trusted one, as this is the one that the trusted one put in and can't be spoofed. You would need to add a configuration setting somewhere to specify a pool of trusted IPs or a network range, and then go through the header for each request. For me, I just skipped this and used the first one, and I never use IP authentication for anything, as we have user IDs and passwords for security ;-) I just used the IP more for detection of bots and multiple users coming in from the same place (i.e. for informational purposes, not security).
There is no use getting the last one, as in most cases this will be *your* machine and will always be the same. (and is also insecure as who the heck cares that your machine in your network passed the request through?)
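The "IP just after the trusted one" idea described above could be sketched like this, walking the header from the nearest hop outwards. TRUSTED_PROXY_NETWORKS and both function names are hypothetical, made up for this example; nothing like them exists in the middleware. It uses only the standard library ipaddress module.

```python
import ipaddress

# Hypothetical setting: networks whose machines are our own proxies.
TRUSTED_PROXY_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("127.0.0.0/8"),
]


def is_trusted(address_text):
    """True if the text parses as an IP inside one of the trusted networks."""
    try:
        address = ipaddress.ip_address(address_text)
    except ValueError:
        return False
    return any(address in network for network in TRUSTED_PROXY_NETWORKS)


def client_ip(remote_addr, x_forwarded_for):
    """Return the first address, nearest hop first, that is not a trusted proxy."""
    if not is_trusted(remote_addr):
        # The connection did not come from one of our proxies, so the
        # header was written entirely by someone we do not trust.
        return remote_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every listed hop is one of ours; fall back to the direct peer.
    return remote_addr


print(client_ip("10.0.0.2", "66.66.66.66, 8.8.4.4, 10.0.0.1"))  # 8.8.4.4
```

In the example, 66.66.66.66 is ignored because it sits to the left of an address that a trusted proxy recorded, so it could have been supplied by the client.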