Due to the birthday paradox, sqrt(n) is roughly the number you need to
have a 50% collision chance when picking items at random from an
inexhaustible set of n items. (0, sys.maxint - 1) is currently the
random range. On 32-bit platforms the collision bound is thus quite
low as sqrt(2^31) is about 46 000 keys.
Although get_or_create() is used to guard against collisions, they can
still occur in threaded environments that have millions of sessions in
session store (multiple threads enter get_or_create(), collision
probability is very high when the number of stored sessions is, say,
10 times 46 000). This has been reported by at least one user.
There's no problem on 64-bit platforms as the collision bound is
sqrt(2^63) ~ 3,000,000,000.
Proposal: use 63 bits of randomness regardless of architecture.
(Trivial) patch attached to #1180. According to reports, this fixes
the problem on 32-bit systems.
Currently sessions are not related to users. As discussed before, one-
way relation is useful, i.e. users are related to sessions and
sessions are cleared if they logout or if they login and a different
user was logged in previously. This depends on #7515.
On Mon, 2008-07-07 at 06:48 -0700, mrts wrote: > I'd like to get some feedback for the following major tickets > regarding sessions, all of which are in scope for 1.0.
> Due to the birthday paradox, sqrt(n) is roughly the number you need to > have a 50% collision chance when picking items at random from an > inexhaustible set of n items. (0, sys.maxint - 1) is currently the > random range. On 32-bit platforms the collision bound is thus quite > low as sqrt(2^31) is about 46 000 keys.
This isn't correct. The birthday paradox models a situation of choices *with replacement* from a set. That is, when multiple entities can have the same value. We don't have replacement. You have a set of X active sessions, all unique (since that's a database constraint) and somebody is trying to guess one of them at random. So the probability is X/N, not something approaching sqrt(N)/N.
Once a session has been created and saved successfully to the database, it's known to be unique amongst all active sessions.
I'm not saying there's no issue here, but accuracy is important.
> Currently sessions are not related to users. As discussed before, one- > way relation is useful, i.e. users are related to sessions and > sessions are cleared if they logout or if they login and a different > user was logged in previously. This depends on #7515.
As discussed in the earlier thread, deleting sessions on logout is almost certainly the right solution to this.
On Mon, Jul 7, 2008 at 8:48 AM, mrts <m...@mrts.pri.ee> wrote: > I'd like to get some feedback for the following major tickets > regarding sessions, all of which are in scope for 1.0.
I don't really know a nice way of saying this, so I'll trust you to understand that I don't mean to be a dick.
You don't get to decide what's in scope for 1.0 and what isn't.
You're completely right that these tickets fall into the "bug" category (though there's some feature-creep with #7515), which makes them likely candidates. Still, you don't get to make a unilateral decision here. I appreciate that you think these features are vital, but you must understand that *everyone* has his or her own "must have" list, and if we're to meet deadlines some people are inevitably going to be disappointed.
So: please feel free to *suggest* that tickets be closed for 1.0; feel free to *argue* for them... but stop demanding.
Further, if you've looked at the roadmap you'll see that we're two weeks out from 1.0 alpha, which is a major deadline for newforms-admin and a couple of other thorny issues. This means that all eyes are going to be on finishing those features, and *not* on bug fixes -- we'll be focusing on those after feature freeze.
As I said, though, these are very likely candidates for 1.0. So the best thing for you to do is tag them with the 1.0 milestone and wait a couple weeks until we start focusing on bug fixes.
On Jul 7, 5:00 pm, "Jacob Kaplan-Moss" <ja...@jacobian.org> wrote:
> On Mon, Jul 7, 2008 at 8:48 AM, mrts <m...@mrts.pri.ee> wrote:
> > I'd like to get some feedback for the following major tickets
> > regarding sessions, all of which are in scope for 1.0.
> I don't really know a nice way of saying this, so I'll trust you to
> understand that I don't mean to be a dick.
> You don't get to decide what's in scope for 1.0 and what isn't.
That was not my intention (and it appears that somehow many of my
posts create a bit of friction :) -- their relevance looks to be
mostly accepted but wording not... oh well). What I wanted to say is
that robust sessions are a vital component of a web framework and all
the issues above need some discussion and decisions before 1.0. So,
indeed, I wanted to argue, not demand something.
> > Due to the birthday paradox, sqrt(n) is roughly the number you need to
> > have a 50% collision chance when picking items at random from an
> > inexhaustible set of n items. (0, sys.maxint - 1) is currently the
> > random range. On 32-bit platforms the collision bound is thus quite
> > low as sqrt(2^31) is about 46 000 keys.
> This isn't correct. The birthday paradox models a situation of choices
> *with replacement* from a set. That is, when multiple entities can have
> the same value. We don't have replacement. You have a set of X active
> sessions, all unique (since that's a database constraint) and somebody
> is trying to guess one of them at random. So the probability is X/N, not
> something approaching sqrt(N)/N.
random.randint(0, n) *is* a set with choices with replacements.
Multiple picks can, should and, as demonstrated at
http://code.djangoproject.com/ticket/1180#comment:26 , do create
collisions at around sqrt(n) picks and less.
I entirely agree that collisions in picks from the random value set
shouldn't directly manifest in collisions in the session key set as
1) time.time() is used as another varying input
2) collision is explicitly checked for in a loop until a unique value
is created
So, for a session key collision to occur, multiple threads should
simultaneously enter the function, get the same time value and a
colliding random value (and negative result from the uniqueness
check). Admittedly this looks very unlikely when random values collide
at around 46 000 picks.
To properly solve the problem, the particular scenario that leads to
the collision should be pinpointed. Then we can make an informed
decision about the amount of randomness needed.
Until that, 63 bits can be used as it is safer than 31, does not cause
much performance penalty and has proven to be enough to fix the
problem.
On Mon, 2008-07-07 at 08:28 -0700, mrts wrote: > > > Due to the birthday paradox, sqrt(n) is roughly the number you need to > > > have a 50% collision chance when picking items at random from an > > > inexhaustible set of n items. (0, sys.maxint - 1) is currently the > > > random range. On 32-bit platforms the collision bound is thus quite > > > low as sqrt(2^31) is about 46 000 keys.
> > This isn't correct. The birthday paradox models a situation of choices > > *with replacement* from a set. That is, when multiple entities can have > > the same value. We don't have replacement. You have a set of X active > > sessions, all unique (since that's a database constraint) and somebody > > is trying to guess one of them at random. So the probability is X/N, not > > something approaching sqrt(N)/N.
> random.randint(0, n) *is* a set with choices with replacements. > Multiple picks can, should and, as demonstrated at > http://code.djangoproject.com/ticket/1180#comment:26 , do create > collisions at around sqrt(n) picks and less.
That comment has no bearing.
(1) We pick a random session key. (2) We save it to the database, where it should be unique, otherwise an error is raised. (3) We use that session key to pass back to the user.
At this point, the birthday paradox no longer applies. (3) should not happen before (2). If it does, it's a bug that should be fixed, but the database should be being used to guarantee uniqueness here. It's the same database across all threads/processes/machines.
Your comment in that ticket avoids step (2), which is a necessary uniqueness constraint.
On Jul 7, 6:32 pm, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:
> That comment has no bearing.
> (1) We pick a random session key.
> (2) We save it to the database, where it should be unique, otherwise an
> error is raised.
> (3) We use that session key to pass back to the user.
> At this point, the birthday paradox no longer applies. (3) should not
> happen before (2). If it does, it's a bug that should be fixed, but the
> database should be being used to guarantee uniqueness here. It's the
> same database across all threads/processes/machines.
> Your comment in that ticket avoids step (2), which is a necessary
> uniqueness constraint.
Agreed, but increasing the range from (0, 2^31) to (0, 2^63) should
have no effect on collisions then. The report at #1180 seems to
indicate otherwise (unless it is invalid).
As one of the people experiencing issues with the session collisions I
will attempt to explain how it manifests and my setup.
My server is a shared host, running Apache, python2.5 and Django is
configured though fast-cgi (My host wont let me use mod-python). The
way it seems to work is a new thread is spawned on each page request
and terminated when the request is finished.
The issues manifests when the sessions system generates a duplicate
session id for two different users. Only 1 entry is entered into the
DB, so both users are using the same session id, thus can (and do)
have incorrect permissions and data. I am detecting the collisions by
setting a setting a cookie value with the username they logged in with
and checking it against the username in the session data. With the SVN
implementation of session generation I get about 10% of my users
getting an incorrect session when they login, with the latest patch I
have gotten 0 collisions.
The issue may actually be with the get_or_create() method of blocking
collisions since if it generates a duplicate id it just grabs that
data from the table and runs with it (I think), but whatever the
actual cause of the issue the latest patch prevents it on my setup.
Thus, looks there are two problems -- collisions happen too often on
32-bit machines (fixed by my patch) and db session backend doesn't
handle them properly (no fix so far).
On Mon, 2008-07-07 at 10:09 -0700, digitalxero wrote:
[...]
> The issue may actually be with the get_or_create() method of blocking > collisions since if it generates a duplicate id it just grabs that > data from the table and runs with it (I think), but whatever the > actual cause of the issue the latest patch prevents it on my setup.
Yes. It's important to view things like this thread in the context of the earlier thread about session fixes. There are a number of inter-related things here and a fix that solves the general problem instead of playing whack-a-mole is in order. Part of that, and implicit in this whole thing, is ensuring that the new session id is saved (and *created*) immediately to avoid collisions. The database is the only way to ensure uniqueness here and there is a bug at that level. Increasing some value from 32 bits to 64 bits is only changing some probabilities; it's not actually solving the problem, just moving the goalposts to make it harder to score an own goal. The rest of the conversation should proceed on the assumption that the bug about creating unique database entries will be fixed first.
Malcolm Tredinnick wrote: > The rest of the conversation should > proceed on the assumption that the bug about creating unique database > entries will be fixed first.
Now I think that the problem is only exists if one uses non-transactional DB setup. In this case due to race conditions one of the two simultaneous get_or_create calls will "get" instead of "create". In a transactional setup one of the transactions will fail to commit eventually since both get_or_create will try to do INSERT not seeing each other. May be the fix then will be to always use explicit INSERT instead of get_or_create and let it fail on unique constraint?
P.S. Did I miss it in the thread or nobody talks about non-DB session storages?
On Tue, 2008-07-08 at 10:30 +0400, Ivan Sagalaev wrote: > Malcolm Tredinnick wrote: > > The rest of the conversation should > > proceed on the assumption that the bug about creating unique database > > entries will be fixed first.
> Now I think that the problem is only exists if one uses > non-transactional DB setup. In this case > due to race conditions one of the two simultaneous get_or_create calls > will "get" instead of "create". In a transactional setup one of the > transactions will fail to commit eventually since both get_or_create > will try to do INSERT not seeing each other.
Not quite. The root problem is that get_or_create() is simply not designed at the moment to enforce the "create" side. It assumes that if another process creates the object during the call, then it's fine to get the result and carry on. The idea is that after you call it, the object exists in the database and you have a reference to the copy, not that *only* get or *only* create will be called. This is why two people can end up retrieving the same session object. There needs to be a path that says "this must create if it isn't already there". That's a bug that will be fixed.
> May be the fix then will be > to always use explicit INSERT instead of get_or_create and let it fail > on unique constraint?
That is the fix. It was discussed in the thread a couple of months ago. When I've done my high priority tasks, I'll get around to adding the ability to force insert or update at the model save level and the rest falls out naturally. It's easy and I have a patch that mostly works. But I need to review it again and it's in the queue.
It's almost like we need some kind of ticket tracking system so that these things don't get lost and people won't start threads saying "look at mine! look at mine!" Oh, wait ... we do have one of those. :-(
> P.S. Did I miss it in the thread or nobody talks about non-DB session > storages?
It's more or less a side-issue that we'll work out as we go along. If non-db storages have a way of enforcing uniqueness, we'll use it (and that does require some effort to do -- I'm not dismissing them). However, the storage is the only reliable place to enforce uniqueness, so if the storage cannot guarantee it, Django can only work with what it's given. So, for example, file storage might have problems on some filesystems that don't have good locking semantics. Then it's a case of "go buy yourself a real filesystem".
On Tue, 2008-07-08 at 19:27 +1000, Malcolm Tredinnick wrote:
[...]
> It's almost like we need some kind of ticket tracking system so that > these things don't get lost and people won't start threads saying "look > at mine! look at mine!" Oh, wait ... we do have one of those. :-(
Sorry. That was uncalled for on my part. I'm feeling buried under too many active threads, but I didn't need to take it out on the people in this one.
On Jul 8, 5:27 am, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:
> Increasing
> some value from 32 bits to 64 bits is only changing some probabilities;
> it's not actually solving the problem, just moving the goalposts to make
> it harder to score an own goal. The rest of the conversation should
> proceed on the assumption that the bug about creating unique database
> entries will be fixed first.
Agreed that the uniqueness bug takes precedence. But the uniform 63-
bit randomness patch is also important. Once collisions happen, their
handling is quite expensive, so they should be better avoided in the
first place.
Moreover, 63 bits are already used on 64-bit machines and I don't see
any reason not to use it on 32-bit machines as well.
Though better than manually erasing all session keys in a loop, this
is not robust enough. Consider this scenario:
* session with key X is created for user A
* A "logs out" and session data is cleared, session key X will not be
reset
* session db backend tries remove the empty session, but database
connection fails, exception is raised
* cookie with old session key X remains in browser
* user B accesses the site using the same browser, cookie with key X
will be sent to the site
* database has come online, session exists and A's sensitive
information will be happily served to B
Thus, reliable session clearing should assure that the session cookie
is removed or updated with a different session key even when
exceptions occur in the session storage backend.
No hurry in replying, but discussing this is important in my humble
opinion.
I've updated the patch at #7515 [1] by adding an exception-
safe .destroy() method to session objects that
* clears the session dict
* removes the old session from session store
* generates a new session key
I propose the following roadmap for fixing session problems:
1. review and commit #1180 [2]. This depends indirectly on Malcolm's
work on unique db entries, whether the following queue should wait for
that is up to the core developers.
2. review and commit #7515 [1] as this subtly depends on #1180 (the
rare corner case where exists() throws and key uniqueness can not be
checked, collision probability should be low in this case).
3. review and commit # 6941 [3], this depends on #7515.