On 3/13/07, SmileyChris <smileych...@gmail.com> wrote:
> Anton Khalikov opened a new ticket pointing out a hole in the current > session framework which can cause session id collisions.
The proposed solution for collision (more on that in a moment) seems good -- the bug seems to happen solely because we don't immediately instantiate and save a Session, so that'd work around it.
I'm more wary of tying to an IP address, because I don't think that really buys any security; there are still ways to spoof or set up man-in-the-middle attacks, so the only real solution would be to tell paranoid people to use encrypted connections (which they should be doing anyway if they'd paranoid :)).
Though I have to say that I'm rather shocked by the ease of session collision -- it requires the same pair of random numbers (out of a range of sys.maxint) to come up, in the same order, within a fairly short period of time, something that just doesn't seem likely to happen a lot (given sys.maxint on my MacBook, such a collision should happen roughly once for every 4.6 *quintillion* trials). I know the 'random' module isn't industrial-strength, but surely it's better than that? And if it's not, then let's start salting it with more stuff.
-- "Bureaucrat Conrad, you are technically correct -- the best kind of correct."
as far as I see, newsessions checks REMOTE_IP. Is this a good idea?
a) the client can sit behind a NAT that might hand out different IPs b) the server can be behind NAT and not see the true IP at all. It might see different IPs for the same client over time. c) a crazy load balancer might get in the way
I wouldn't use REMOTE_IP.
In Ticket #1180, Anton said that users reported that they were able to get someone else's session. I wonder if that might be related to something quite different. Anton, was this by chance on a threaded server with a python-mysql < 1.2.1p2?
Here's the (old) code that generates a session:
def get_new_session_key(self): "Returns session key that isn't being used." # The random module is seeded when this Apache child is created. # Use person_id and SECRET_KEY as added salt. while 1: session_key = md5.new(str(random.randint(0, sys.maxint - 1)) + str(random.randint(0, sys.maxint - 1)) + settings.SECRET_KEY).hexdigest() try: self.get(session_key=session_key) except self.model.DoesNotExist: break return session_key
The chance that two consecutive session creations generate the same session id is 1 : sys.maxint^2, on my system sys.maxint is 2^31. So we're at 2^62. If the internal state of the random generator is big enough, it might make sense to hash 4 randints instead of only 2. Or to check for collisions at the same time as you save the session. But the chances are good that Anton experiences a different problem.
On 3/13/07, Michael Radziej <m...@noris.de> wrote:
> a) the client can sit behind a NAT that might hand out different IPs > b) the server can be behind NAT and not see the true IP at all. It > might see different IPs for the same client over time. > c) a crazy load balancer might get in the way
(or any proxy, in which case REMOTE_IP will be the same for every request and you'd want to be looking at the 'X-Forwarded-For' header -- this is one of the problems with IP-based sessions)
-- "Bureaucrat Conrad, you are technically correct -- the best kind of correct."
I actually passed this onto the security mailing address, as I thought it was better to be safe than sorry. Adrian's response was that tying things to the IP address is not a good idea (for the reasons that others have stated in this thread).
One thing that I would like to suggest is that we do link the SID to a user-agent (see Chris Shiflett's excellent article on this at [1]), as this provides some protection (-via-obscurity) against these collisions, since the attacker not only needs to get a valid SID, but match it to the correct UA string. There's also no reason for a "normal" user to change UA strings without needing to login again.
This can easily be added to the existing session framework by adding it to the hashing
SessionManager could then calculate a session and try to save an object with this session-key (allowing only INSERT, not UPDATE). If the key exists the DB-backend will raise an Exception as the key is unique (primary_key). If it is free, the INSERT will ensure that no other session may use the key.
If I get this right the newsession-patch is trying to do so, but will fail, beuase s.save() (django/contrib/newsessions/models.py, line 29) will overwrite (SQL UPDATE) existing sessions instead of failing creating it. One tiny advantage remains: There is less code (=time) between calculation the session key and saving it, so less chances to get a race condition. But I don't see a real fix there.
Simon G. wrote: > One thing that I would like to suggest is that we do link the SID to a > user-agent [...]
I heard about some proxies or anonymizer services that actually do change the user agent. But I haven't seen this so far, nor searched for it. ;-)
> There's also no reason for a > "normal" user to change UA strings without needing to login again.
UA strings change on every minor rev of some browsers, plus various tool versions (.net, etc.). Having those users logged out would suck for my purposes.
Guys I would like to explain some things and decisions in my code.
1. REMOTE_IP. I think it is a good idea to check it. Load balancers acts as proxy servers. Every web server has a module that allow to restore REMOTE_IP from X-FORWARDED-FOR, for example apache's module is called mod_rpaf. And ofcourse it's very easy to make REMOTE_IP checking code to be disabled or enabled via settings.py, just like it is in php, such modification is quite simple. 2. In my installation I use lighttpd and django as preforked fastcgi server, python 2.4. It's hard to reproduce such things but they _do_ happen. One can say my installation is broken but I think it's rather a hole in django if it allows such things to be ever happened. 3. Perhaps, current sessions may be patched too, what is needed is to force django to try INSERT into table but it looks like dirty hack to me because documentation says: if object's pk evaluates to something else than True, django does checking if such row already exists in db and while django does it, another process may get the same session id and try it too and we get collision again. My way works rock stable because of different table design and this is the only reason why i used another package name (newsession) istead of just patching existing package.
Even more on remote ip checking: it can be done in a flexible way when user is able to set either don't ever use it, check remote ip, check user agent name or ever both remote ip and user agent for paranoids :) I think everyone understands that it's about 2-3 more lines of code.
> 1. REMOTE_IP. I think it is a good idea to check it. Load balancers > acts as proxy servers. Every web server has a module that allow to > restore REMOTE_IP from X-FORWARDED-FOR, for example apache's module is > called mod_rpaf. And ofcourse it's very easy to make REMOTE_IP > checking code to be disabled or enabled via settings.py, just like it > is in php, such modification is quite simple.
I'm still wary of this -- again, it doesn't do anything to prevent man-in-the-middle, and it introduces complexity and -- if we're worried about session collisions -- introduces a whole new class of bugs when you have a bunch of people behind a NAT or firewall who all, from the POV of your server, have the same IP address.
> 2. In my installation I use lighttpd and django as preforked fastcgi > server, python 2.4. It's hard to reproduce such things but they _do_ > happen. One can say my installation is broken but I think it's rather > a hole in django if it allows such things to be ever happened.
I think we definitely need to tighten up the time between selecting a session key and shoving the session into the DB -- I'm thinking along the lines of using get_or_create and retrying until you get back a True creation value from it (that way you don't have to do anything else -- the session is already in the DB when get_or_create returns.
But I'm still worried about the apparent fact that this is all coming from key collisions, because what it points to is not a race condition at the database level -- what it points to is bad randomness.
Python's random-number generator uses the Mersenne Twister, and in Python 2.3 it seeds itself with the current system time (and in 2.4 it tries system-supplied randomness facilities first, then falls back to system time).
I'm not an expert on randomness by any means, but the problem we have here is the same pair of random numbers being generated at the same time or within a very short time of one another, and my first inclination is to look at the fact that we're using a deterministic RNG seeded -- by default -- with the system time.
So instead of trying to ensure safety at the database level (because, frankly, unless we go to extremes we're not going to manage that), why not look at better seeds for the RNG and solve the real problem?
-- "Bureaucrat Conrad, you are technically correct -- the best kind of correct."
> I'm still wary of this -- again, it doesn't do anything to prevent > man-in-the-middle, and it introduces complexity and -- if we're > worried about session collisions -- introduces a whole new class of > bugs when you have a bunch of people behind a NAT or firewall who all, > from the POV of your server, have the same IP address.
Well I don't agree with you here: 1. This option can be disabled by default and only enabled optionally by developer (like in php, as i said) 2. I have an ethernet connection @home and I sometimes log in to our private web apps from home. Any 'c00l hacker' is able to scan network traffic, get my session id and use it to join to my session too just because there is absolutely no checking who uses session. We added ip checking at our application level code but why don't implement this at django side as an option ? Noone says this is 100% guarantee of getting rid of man-in-the-middle, but at least it prevents from successful hacks from kiddies when using untrusted networks (ie public wi-fi, ethernet, etc).
> I think we definitely need to tighten up the time between selecting a > session key and shoving the session into the DB -- I'm thinking along > the lines of using get_or_create and retrying until you get back a > True creation value from it (that way you don't have to do anything > else -- the session is already in the DB when get_or_create returns.
> But I'm still worried about the apparent fact that this is all coming > from key collisions, because what it points to is not a race condition > at the database level -- what it points to is bad randomness.
Let's separate problems because there are 2 problems: 1. Somehow same ids are generated in a short period of time. So id generation needs to be improved. 2. Django allows session collisions.
I am not an expert in the first problem but I definitely think that second problem must be solved in the way I suggested, instead of shortening time between checking and storing new record to db. Just imagine: you have a gun, that sometimes fires to it's holder. I suggest to change construction to make it impossible in any way, you suggest to change the way how holder should pull the trigger, hold the gun in arms etc but if we don't change the construction, whatever we do there still IS a possibility to shoot yourself :) Hope I explained well enough.
> 2. I have an ethernet connection @home and I sometimes log in to our > private web apps from home. Any 'c00l hacker' is able to scan network > traffic, get my session id and use it to join to my session too just > because there is absolutely no checking who uses session. We added ip > checking at our application level code but why don't implement this at > django side as an option ? Noone says this is 100% guarantee of > getting rid of man-in-the-middle, but at least it prevents from > successful hacks from kiddies when using untrusted networks (ie public > wi-fi, ethernet, etc).
I guess I just feel that someone who's determined enough to sit around and sniff for your session ID is also going to be determined enough to defeat IP checking (which wouldn't be that hard -- again, the real solution to that is encrypted connections, not increasing numbers "sessions are tied to x, y and z"), so it doesn't get us any real improvement in security. But I'll bow out of that debate.
> Let's separate problems because there are 2 problems: > 1. Somehow same ids are generated in a short period of time. So id > generation needs to be improved. > 2. Django allows session collisions.
Problem 2 is a direct consequence of Problem 1. In fact, Problem 2 is, I think, just a symptom of problem 2 -- you can't have a DB race condition with sessions unless you're generating identical session keys for different peopl