Pluggable encryption for django auth (design proposal)


Christophe Pettus

Nov 27, 2010, 10:14:42 PM
to django-d...@googlegroups.com
Hi, all,

Right now, Django's auth system pretty much uses sha1 hardwired in (literally, in the case of User.set_password) for the hash. For a discussion of why a general-purpose hash function is not the best idea in the world for password encryption, see:

http://codahale.com/how-to-safely-store-a-password/

I'd like to propose a backwards-compatible method of allowing different hash algorithms to be used, while not adding new dependencies on external libraries to the core.

1. Add a setting DEFAULT_PASSWORD_HASH. This contains the code for the algorithm to use; if it is absent, 'sha1' is assumed.

2. Add a setting PASSWORD_HASH_FUNCTIONS. This is a map of algorithm codes to callables; each callable takes the same parameters as auth.models.get_hexdigest and returns the hex digest of its parameters (the algorithm parameter to get_hexdigest is retained so that a single function can handle multiple algorithms). For example:

PASSWORD_HASH_FUNCTIONS = { 'bcrypt': 'myproject.myapp.bcrypt_hex_digest' }

3. auth.models.get_hexdigest is modified such that if the algorithm isn't one of the ones it knows about, it consults PASSWORD_HASH_FUNCTIONS and uses the matching function, if present. If there's no match, it fails as it does currently.

4. User.set_password() is modified to check the value of DEFAULT_PASSWORD_HASH and use that algorithm if specified; otherwise, it uses 'sha1' as it does now. (Optional: add the algorithm as a default parameter to User.set_password().)
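
As a rough illustration of the four steps above, here is a minimal, Django-free sketch. It is not the actual auth.models code: the two settings are modeled as plain module-level variables, the registry holds callables directly rather than dotted paths, and `sha256_hexdigest` and `make_password` are hypothetical names introduced only for this example.

```python
import hashlib

# Stand-ins for the two proposed settings (assumed to be plain globals here).
DEFAULT_PASSWORD_HASH = 'sha256'
PASSWORD_HASH_FUNCTIONS = {}


def sha256_hexdigest(algorithm, salt, raw_password):
    """Hypothetical plug-in taking the same parameters as get_hexdigest."""
    return hashlib.sha256((salt + raw_password).encode('utf-8')).hexdigest()


PASSWORD_HASH_FUNCTIONS['sha256'] = sha256_hexdigest


def get_hexdigest(algorithm, salt, raw_password):
    """Step 3: try the built-in algorithms first, then consult the registry."""
    data = (salt + raw_password).encode('utf-8')
    if algorithm == 'md5':
        return hashlib.md5(data).hexdigest()
    if algorithm == 'sha1':
        return hashlib.sha1(data).hexdigest()
    plugin = PASSWORD_HASH_FUNCTIONS.get(algorithm)
    if plugin is None:
        # No match anywhere: fail, as the unmodified function would.
        raise ValueError('Unknown password algorithm: %r' % algorithm)
    return plugin(algorithm, salt, raw_password)


def make_password(raw_password, salt, algorithm=None):
    """Step 4: fall back to DEFAULT_PASSWORD_HASH and build algo$salt$hash."""
    algorithm = algorithm or DEFAULT_PASSWORD_HASH
    digest = get_hexdigest(algorithm, salt, raw_password)
    return '%s$%s$%s' % (algorithm, salt, digest)
```

Because the algorithm code is stored alongside the salt and digest, existing 'sha1' entries would keep verifying while new passwords pick up the configured default.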

Comments?

--
-- Christophe Pettus
x...@thebuild.com

Tom X. Tobin

Nov 27, 2010, 11:05:28 PM
to django-d...@googlegroups.com
On Sat, Nov 27, 2010 at 10:14 PM, Christophe Pettus <x...@thebuild.com> wrote:
> Right now, Django's auth system pretty much uses sha1 hardwired in (literally, in the case of User.set_password) for the hash.  For a discussion of why a general-purpose hash function is not the best idea in the world for password encryption, see:
>
>        http://codahale.com/how-to-safely-store-a-password/

Completely leaving aside the potential benefit of allowing different
hash algorithms, I think the specific argument made by the author of
that article, along with their proposed solution of an "intentionally
slow" algorithm, is the wrong approach. Your application ends up just
as hobbled by such an algorithm as a potential attacker. If you're
choosing a slowdown factor based on your worst-case attacker, you're
likely to significantly impair the ability of a website running on
hardware that's not as fast, especially if you're authenticating users
all the time.

I think there are better potential solutions to concerns about
password cracking. Django already salts the hashes, which is
asymmetrical in a good way: it helps complicate brute force attacks
without slowing down Django's ability to test a given password.
Better password policies can also help; e.g., each additional letter
you require in your users' passwords exponentially increases the space
of passwords that need to be brute-forced. In cases where your
attacker doesn't have direct access to the database, you can greatly
slow them down by only allowing a certain amount of login attempts in
a given time period.

Christophe Pettus

Nov 27, 2010, 11:47:18 PM
to django-d...@googlegroups.com

On Nov 27, 2010, at 8:05 PM, Tom X. Tobin wrote:

> Your application ends up just
> as hobbled by such an algorithm as a potential attacker.

Actually, no, the situations are really quite asymmetrical. In order to brute-force a password, an attacker has to be able to try many, many thousands of combinations per second. To log in a user, an application has to do it exactly once. A hash computation time of, say, 10ms is probably unnoticeable in a login situation, unless you have tens of thousands of users logging in per minute (and if this is the case, then you probably have other problems than the speed of your password hash algorithm). But that would pretty much slam the door down on any brute force attempt at a password recovery.

> Django already salts the hashes, which is
> asymmetrical in a good way: it helps complicate brute force attacks
> without slowing down Django's ability to test a given password.

A salt is of no benefit on a brute force attack; its function is to prevent dictionary attacks, which are a different animal.

And if you are willing to assume that no attacker can ever get access to your database, then you don't have to hash the password at all.

But, as you point out, that's a separate discussion from the value of pluggable encryption algorithms. There was a time that MD5 was the perfect answer; now, it's SHA-1. Different applications will have different needs as far as how they write the passwords to disk, and having an architecture to handle this seems like a good idea.

Tom X. Tobin

Nov 28, 2010, 12:01:03 AM
to django-d...@googlegroups.com
On Sat, Nov 27, 2010 at 11:47 PM, Christophe Pettus <x...@thebuild.com> wrote:
> Actually, no, the situations are really quite asymmetrical.  In order to brute-force a password, an attacker has to be able to try many, many thousands of combinations per second.  To log in a user, an application has to do it exactly once.  A hash computation time of, say, 10ms is probably unnoticeable in a login situation, unless you have tens of thousands of users logging in per minute (and if this is the case, then you probably have other problems than the speed of your password hash algorithm).  But that would pretty much slam the door down on any brute force attempt at a password recovery.

But how far are you willing to go in your assumption of the worst-case
computational ability of your attacker? Would tuning the hash to
(say) a 10ms delay for your web server's modest hardware translate
into a significant delay for an attacker with far more resources?
(This isn't a rhetorical question; I honestly don't know.)


> A salt is of no benefit on a brute force attack; its function is to prevent dictionary attacks, which are a different animal.

It does in fact slow down brute force attacks against multiple
encrypted passwords; each password with a different salt is within an
entirely different space that needs to be brute forced separately from
the other passwords.


> And if you are willing to assume that no attacker can ever get access to your database, then you don't have to hash the password at all.

Sure, but my point was that there are various walls you can throw up
against attackers to slow them down that don't involve slowing down
your hash algorithm.


> But, as you point out, that's a separate discussion from the value of pluggable encryption algorithms.

Right; I didn't mean to dissent from (or concur with) that proposal.

Christophe Pettus

Nov 28, 2010, 12:19:49 AM
to django-d...@googlegroups.com

On Nov 27, 2010, at 9:01 PM, Tom X. Tobin wrote:
> But how far are you willing to go in your assumption of the worst-case
> computational ability of your attacker? Would tuning the hash to
> (say) a 10ms delay for your web server's modest hardware translate
> into a significant delay for an attacker with far more resources?
> (This isn't a rhetorical question; I honestly don't know.)

Let's do the math. The space of eight alphanumeric character passwords is 2.8e12. Even assuming you can cut two orders of magnitude off of that with good assumptions about the kind of passwords that people are picking, this means that the attacker has to run about 28 billion times more computations than you do. At 10ms per password, it would take them about 447.8 years to crack a single password, assuming hardware of equivalent speed.
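
The arithmetic can be checked with a short script. This is only a sketch under stated assumptions: the 2.8e12 figure is taken to mean 36^8 (case-insensitive letters plus digits), a year is 365.25 days, and `years_to_search` is a name made up for the example. Under those assumptions, the 447.8-year figure is close to the average time to search the full space, while the reduced 28-billion-guess space works out to roughly nine years.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_search(candidates, seconds_per_hash=0.010):
    """Worst-case time, in years, to hash every candidate once."""
    return candidates * seconds_per_hash / SECONDS_PER_YEAR


full_space = 36 ** 8                 # about 2.8e12 eight-character passwords
reduced_space = full_space // 100    # two orders of magnitude fewer guesses

print(years_to_search(full_space))       # whole space, worst case
print(years_to_search(full_space) / 2)   # whole space, average case
print(years_to_search(reduced_space))    # reduced space, worst case
```

Either way, the ratio that matters is the one in the message: the application hashes one password per login, while the attacker hashes billions.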

> It does in fact slow down brute force attacks against multiple
> encrypted passwords; each password with a different salt is within an
> entirely different space that needs to be brute forced separately from
> the other passwords.

Remember how a brute force attack works. Given a hash x, the attacker does:

hash('00000000' + salt) = x? No, then,
hash('00000001' + salt) = x? No, then,
...

The only benefit of the salt here is that it makes the string to be hashed a bit longer, but the benefit is linear, not exponential.

A dictionary attack works by consulting a precomputed set of passwords and their hashes, (pwd, hash(pwd)). The attacker then runs down the dictionary, comparing hashes; if they get a hit, they know the password. The salt defeats this by making the pwd -> hash(pwd) mapping incorrect.
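
The two attacks described above can be contrasted in a small sketch. It assumes a toy space of four-digit passwords and SHA-1 over `password + salt`, as in the pseudocode above; `sha1_hex`, `brute_force`, and the sample salt are illustrative names and values, not anything from Django.

```python
import hashlib
from itertools import product

DIGITS = '0123456789'


def sha1_hex(text):
    return hashlib.sha1(text.encode('utf-8')).hexdigest()


def brute_force(target_hash, salt, length=4):
    """Try every candidate in order; the attacker already knows the salt."""
    for attempt, chars in enumerate(product(DIGITS, repeat=length), start=1):
        candidate = ''.join(chars)
        if sha1_hex(candidate + salt) == target_hash:
            return candidate, attempt
    return None, 10 ** length


# Precomputed hash -> password table, built without any salt.
precomputed = {sha1_hex(''.join(c)): ''.join(c)
               for c in product(DIGITS, repeat=4)}

salt = 'a1b2c'
stored = sha1_hex('4821' + salt)

print(precomputed.get(stored))     # None: the unsalted table no longer matches
print(brute_force(stored, salt))   # ('4821', 4822): same guess count as unsalted
```

The salt invalidates the precomputed table but leaves the number of brute-force guesses unchanged; each guess just hashes a slightly longer string.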

Christophe Pettus

Nov 28, 2010, 12:28:22 AM
to django-d...@googlegroups.com
I wrote:
> A dictionary attack works by consulting a precomputed set of passwords and their hashes, (pwd, hash(pwd)). The attacker then runs down the dictionary, comparing hashes; if they get a hit, they know the password. The salt defeats this by making the pwd -> hash(pwd) mapping incorrect.

I'm being slightly inaccurate here; what I'm describing above is a rainbow dictionary attack, rather than just a plain dictionary attack (which is a brute force attempt on the password over a limited range of input values). Anyway, a salt isn't helpful for a plain dictionary attack, either, for the same reason as a brute force attack.

Anyway, back to the discussion of the actual proposal. :)

Tom X. Tobin

Nov 28, 2010, 1:29:00 AM
to django-d...@googlegroups.com
On Sun, Nov 28, 2010 at 12:19 AM, Christophe Pettus <x...@thebuild.com> wrote:
> Let's do the math.  The space of eight alphanumeric character passwords is 2.8e12.  Even assuming you can cut two orders of magnitude off of that with good assumptions about the kind of passwords that people are picking, this means that the attacker has to run about 28 billion times more computations than you do.  At 10ms per password, it would take them about 447.8 years to crack a single password, assuming hardware of equivalent speed.

The point is that I'm *not* assuming hardware of equivalent speed.
I'm assuming that a worst-case attacker has hardware significantly
faster than your webserver at their disposal, so I was curious if the
purported benefit still held in that case. Maybe it does; I don't
know.


>> It does in fact slow down brute force attacks against multiple
>> encrypted passwords; each password with a different salt is within an
>> entirely different space that needs to be brute forced separately from
>> the other passwords.
>
> Remember how a brute force attack works.  Given a hash x, the attacker does:
>
> hash('00000000' + salt) = x? No, then,
> hash('00000001' + salt) = x? No, then,
> ...
>
> The only benefit of the salt here is that it makes the string to be hashed a bit longer, but the benefit is linear, not exponential.

I'm not arguing that a salt helps against brute-forcing a *single*
password (it doesn't), but it does in fact help against someone trying
to brute-force your entire password database (or any subset of more
than one password), since each password with a different salt lies
within an entirely different space that must be brute-forced
separately from the rest.


> Anyway, back to the discussion of the actual proposal. :)

Sure, I didn't mean to veer things too far off course here; even
assuming the bcrypt argument doesn't hold, it's entirely possible that
someone may want to easily plug in SHA512/SHA3/whatever into their
password encryption.

Christophe Pettus

Nov 28, 2010, 12:11:11 PM
to django-d...@googlegroups.com

On Nov 27, 2010, at 10:29 PM, Tom X. Tobin wrote:
> The point is that I'm *not* assuming hardware of equivalent speed.
> I'm assuming that a worst-case attacker has hardware significantly
> faster than your webserver at their disposal, so I was curious if the
> purported benefit still held in that case. Maybe it does; I don't
> know.

Well, yes, it does, for exactly the reason described: The application has to encode exactly one password; the attacker has to try billions in order to brute-force one. If you assume, say, one password per week is the slowest practical attack, and if it takes 10ms to hash one password, the attacker's hardware has to be about 46,654 times more powerful than your web server.

> I'm not arguing that a salt helps against brute-forcing a *single*
> password (it doesn't), but it does in fact help against someone trying
> to brute-force your entire password database (or any subset of more
> than one password), since each password with a different salt lies
> within an entirely different space that must be brute-forced
> separately from the rest.

I'm not sure what you mean by the "space"; I think you are thinking of a rainbow dictionary attack, where the hashes are precomputed; a salt does indeed help (and probably blocks) that kind of attack. In the case of a straight brute-force attack or a standard dictionary attack without precomputing, the only benefit of the salt is that it makes computing the candidate hash a bit longer, based on the length of the salt. It's a trivial amount of time.

Remember, it's extremely inexpensive to brute-force a single MD5 or SHA1 hash, and the salt does not make it appreciably more expensive. If a CUDA application can brute force 700 million MD5s per second, doubling the length is not really going to make it any more secure.

Tom X. Tobin

Nov 28, 2010, 1:26:19 PM
to django-d...@googlegroups.com
On Sun, Nov 28, 2010 at 12:11 PM, Christophe Pettus <x...@thebuild.com> wrote:
>> I'm not arguing that a salt helps against brute-forcing a *single*
>> password (it doesn't), but it does in fact help against someone trying
>> to brute-force your entire password database (or any subset of more
>> than one password), since each password with a different salt lies
>> within an entirely different space that must be brute-forced
>> separately from the rest.
>
> I'm not sure what you mean by the "space"; I think you are thinking of a rainbow dictionary attack, where the hashes are precomputed; a salt does indeed help (and probably blocks) that kind of attack.  In the case of a straight brute-force attack or a standard dictionary attack without precomputing, the only benefit of the salt is that it makes computing the candidate hash a bit longer, based on the length of the salt.  It's a trivial amount of time.
>
> Remember, it's extremely inexpensive to brute-force a single MD5 or SHA1 hash, and the salt does not make it appreciably more expensive.  If a CUDA application can brute force 700 million MD5s per second, doubling the length is not really going to make it any more secure.

No, I'm not thinking of rainbow tables. The key word here is
*single*. As I said before, a salt *does* help against an attacker
trying to brute-force multiple passwords from your database, since he
can't simply test each brute-force result against all your passwords
at once; he has to start all over from scratch for every single
password that has a different salt. If he only cares about one
*particular* account, the salt doesn't help, no.
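
A toy count of hash computations may make the multi-account case concrete. This is a sketch under simplifying assumptions (three-digit passwords, SHA-1 over `password + salt`, no early exit); `crack_all` and the sample salts are hypothetical.

```python
import hashlib
from itertools import product

DIGITS = '0123456789'


def sha1_hex(text):
    return hashlib.sha1(text.encode('utf-8')).hexdigest()


def crack_all(accounts, length=3):
    """accounts is a list of (salt, stored_hash) pairs.

    Returns (recovered passwords, number of hashes computed).
    """
    recovered, hashes_computed = {}, 0
    # One full sweep of the candidate space per distinct salt.
    for salt in sorted({s for s, _ in accounts}):
        targets = {h for s, h in accounts if s == salt}
        for chars in product(DIGITS, repeat=length):
            candidate = ''.join(chars)
            digest = sha1_hex(candidate + salt)
            hashes_computed += 1
            if digest in targets:
                recovered[(salt, digest)] = candidate
    return recovered, hashes_computed


passwords = ['123', '777', '042']
shared = [('', sha1_hex(p)) for p in passwords]
unique = [(s, sha1_hex(p + s)) for s, p in zip(['aa', 'bb', 'cc'], passwords)]

print(crack_all(shared)[1])   # 1000: one sweep tests every account at once
print(crack_all(unique)[1])   # 3000: one full sweep per distinct salt
```

The work grows with the number of distinct salts, so the extra cost is proportional to the number of accounts attacked rather than exponential in anything.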

But regardless, I apologize for derailing this conversation so far off.

Christophe Pettus

Nov 28, 2010, 2:39:01 PM
to django-d...@googlegroups.com

On Nov 28, 2010, at 10:26 AM, Tom X. Tobin wrote:
> No, I'm not thinking of rainbow tables. The key word here is
> *single*. As I said before, a salt *does* help against an attacker
> trying to brute-force multiple passwords from your database, since he
> can't simply test each brute-force result against all your passwords
> at once; he has to start all over from scratch for every single
> password that has a different salt. If he only cares about one
> *particular* account, the salt doesn't help, no.

Even in your scenario, it only helps as much as the entropy in the password selection. If everyone has a unique password, it doesn't help at all (admittedly unlikely). Again, it's a linear benefit, but not an exponential one.

Right. So, about that proposal... :)

Paul McMillan

Nov 29, 2010, 6:31:49 PM
to django-d...@googlegroups.com
I'm not going to get into the arguments about security of the various
hashing methods, other than to observe that there have been some
fairly misleading statements here.

As far as the proposal goes, I think this is a perfectly reasonable
feature request (and you should open a ticket about it if one does not
already exist).

I'd favor a solution where your setting mapped the algo name to the
actual function used:

PASSWORD_HASH_FUNCTIONS = {
    'bcrypt': myproject.myapp.bcrypt_hexdigest,
    'sha1': django.utils.hashcompat.sha_constructor,
    # etc.
}

Then we could put the existing hash functions (sha1, md5, etc.) in
that setting as the default, and get rid of the algo-checking code
that currently lives in auth.models. When we do a password comparison,
we simply pull the hash name, lookup the function, and away we go.
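
A registry-only lookup along these lines might look like the sketch below. It is an illustration rather than a patch: the two-argument `(salt, raw_password)` signature, the `_digest_with` helper, and the default registry contents are assumptions made for the example, and the stored value is assumed to use an algo$salt$hash layout.

```python
import hashlib
import hmac


def _digest_with(constructor):
    """Wrap a hashlib constructor as a (salt, raw_password) -> hex digest function."""
    def hexdigest(salt, raw_password):
        return constructor((salt + raw_password).encode('utf-8')).hexdigest()
    return hexdigest


# Hypothetical default: every algorithm, built-in or not, lives in the mapping.
PASSWORD_HASH_FUNCTIONS = {
    'md5': _digest_with(hashlib.md5),
    'sha1': _digest_with(hashlib.sha1),
    'sha256': _digest_with(hashlib.sha256),
}


def check_password(raw_password, encoded):
    """Pull the hash name out of algo$salt$hash, look up the function, compare."""
    algorithm, salt, expected = encoded.split('$', 2)
    hexdigest = PASSWORD_HASH_FUNCTIONS[algorithm]   # KeyError for unknown names
    return hmac.compare_digest(hexdigest(salt, raw_password), expected)


encoded = 'sha1$ab$' + PASSWORD_HASH_FUNCTIONS['sha1']('ab', 'secret')
print(check_password('secret', encoded))   # True
print(check_password('wrong', encoded))    # False
```

With the defaults living in the mapping, no per-algorithm branching remains in the comparison code.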

I don't think this will make it into 1.3, but it's a reasonable thing
to do and I think it would help improve all the special-case code that
currently lives in auth.models. The patch itself wouldn't be too hard,
and I'd be willing to write it myself if nobody else will.

-Paul

Tom Evans

Nov 30, 2010, 4:22:17 AM
to django-d...@googlegroups.com


First comment is that Django already has a pluggable authentication
stack, which already allows for this - simply define a new auth
backend that tests the password in the manner you wish.

It doesn't allow for this with the default authenticator, but it is
doable. I have a django project with >100k users, and none of them
have a sha1 hash as their password.

Cheers

Tom

Christopher Petrilli

Nov 30, 2010, 9:30:08 PM
to django-d...@googlegroups.com
On Tue, Nov 30, 2010 at 4:22 AM, Tom Evans <teva...@googlemail.com> wrote:

> First comment is that Django already has a pluggable authentication
> stack, which already allows for this - simply define a new auth
> backend that tests the password in the manner you wish.

My understanding of the pluggable authentication system is that it's
for situations where you need a totally different authentication
mechanism, such as LDAP. Simply replacing the crypto mechanism for the
default authentication system should not require developing a lot of
pieces. It is something that needs to be upgraded on an ongoing basis
for everyone. It's simply best practices.

The federal government already forbids use of SHA-1 after 2010.

> It doesn't allow for this with the default authenticator, but it is
> doable. I have a django project with >100k users, and none of them
> have a sha1 hash as their password.

I won't comment on the wisdom of this, but I'd not use it as an
example of why we don't need to provide flexibility to improve
security.

Chris
--
| Chris Petrilli
| petr...@amber.org

Tom Evans

Dec 1, 2010, 5:15:37 AM
to django-d...@googlegroups.com
On Wed, Dec 1, 2010 at 2:30 AM, Christopher Petrilli <petr...@amber.org> wrote:
> On Tue, Nov 30, 2010 at 4:22 AM, Tom Evans <teva...@googlemail.com> wrote:
>
>> First comment is that Django already has a pluggable authentication
>> stack, which already allows for this - simply define a new auth
>> backend that tests the password in the manner you wish.
>
> My understanding of the pluggable authentication system is that it's
> for situations where you need a totally different authentication
> mechanism, such as LDAP. Simply replacing the crypto mechanism for the
> default authentication system should not require developing a lot of
> pieces. It is something that needs to be upgraded on an ongoing basis
> for everyone. It's simply best practices.

It doesn't 'require developing a lot of pieces'. Have you even tried
implementing this in the current stack?

At the moment, a typical setup has AUTHENTICATION_BACKENDS set to
('django.contrib.auth.backends.ModelBackend',). Changing how passwords
are tested simply requires a different backend, typically derived from
ModelBackend, that overrides the authenticate method.

Is that a lot of pieces, or one small one?

>
> The federal government already forbids use of SHA-1 after 2010.
>
>> It doesn't allow for this with the default authenticator, but it is
>> doable. I have a django project with >100k users, and none of them
>> have a sha1 hash as their password.
>
> I won't comment on the wisdom of this, but I'd not use it as an
> example of why we don't need to provide flexibility to improve
> security.
>
> Chris

Wow, that's a thing to say. Your federal government forbids SHA-1, I
don't use SHA-1, but you "won't comment on the wisdom of this"? Let's
try to keep it civil without casting FUD and aspersions around, eh.

We already have flexibility to implement security in any manner that
you can think of. I'm looking for the argument that says 'This current
flexibility is not enough, and we need to re-architect', and I
don't think that has been made.

Cheers

Tom

William Ratcliff

Feb 11, 2011, 12:59:52 AM
to django-d...@googlegroups.com
Hi!  I'm new to the list and have started to look into authentication.   I find that I will need to patch it for my own needs, but would like to ask the opinions of others who are more familiar with the code-base than I am.  I apologize if I make any mistakes in the protocol of the list in matters such as including too much code.

SHA1 is not secure.  This is not a nationalism issue.  For example:

Or, from NIST:

where the relevant excerpt is:
March 15, 2006: The SHA-2 family of hash functions (i.e., SHA-224, SHA-256, SHA-384 and SHA-512) may be used by Federal agencies for all applications using secure hash algorithms. Federal agencies should stop using SHA-1 for digital signatures, digital time stamping and other applications that require collision resistance as soon as practical, and must use the SHA-2 family of hash functions for these applications after 2010. After 2010, Federal agencies may use SHA-1 only for the following applications: hash-based message authentication codes (HMACs); key derivation functions (KDFs); and random number generators (RNGs). Regardless of use, NIST encourages application and protocol designers to use the SHA-2 family of hash functions for all new applications and protocols.

I have also seen discussions in other venues from people who are worried about the security of SHA1: in the event that their system is compromised, can an attacker recover the passwords in the database?  The appearance of modules such as django-bcrypt:
shows that this issue is becoming of more general concern.

Current solutions:
1)  Monkey patch:
put a top level installed_app that has a listener to the class_prepared signal that performs monkey patching throughout user.
This is rather ugly and it feels very fragile to me if the auth module changes internally.
2)  Rewrite the Backend as Tom suggests by subclassing ModelBackend:
Again, it's not sufficient.  Why?

If we look at the Model Backend, we see that yes, the authenticate method currently authenticates against User--but the problem is NOT the authentication per se, but rather that the User class has several methods such as:
check_password, set_password, etc. that have encryption hard coded.   There are admin commands associated with the User class which refer to methods with a particular encryption method chosen.

3) For users of Django who cannot (say US government agencies, people who have tight security concerns, etc.) use the current module, ignore the auth module and roll their own:
This has been attempted before.  However, it is easy to make mistakes doing this. Most of the functionality in the auth module is very good, and simply copying most of it to make a few changes to User--and then maintaining those changes against modules which use the User model--seems rather excessive.

My first suggestion was:
Move the encryption related portions of the code that are hard coded to the authentication backend.  Make a default which follows best practices (I would suggest moving to SHA2 in a backwards compatible fashion) that most people will use.  However, for those that want to use bcrypt for example, it would be easy for them to simply write their own backend.   

However, there are also merits to Paul's approach of having a mapping in the settings file.   What I like about the backend approach is that it allows for graceful fallbacks as a function of Python version, but gives the user the ability to change the algorithm in a simple way.
Also, it saves one from distinguishing between sha1 and sha1 with stretching. Perhaps a compromise would be to have a backend
which looks to the settings file for the location of a method?

But, if I write a patch that works and maintains backwards compatibility would it be accepted?  Is it better to email it here, or to submit a ticket, claim the ticket, and then add the patch?


Also, would it be better for me to work from the trunk, or from 1.2.5?  This is important because of:


Thanks,
William

Eduardo Cereto Carvalho

Feb 11, 2011, 8:10:20 AM
to django-d...@googlegroups.com
I'm not an expert on the subject. 

But I think that the hashes' security issues are solved by the use of a "salt"; salted hashes are known to be a very secure way to store data.

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-d...@googlegroups.com.
To unsubscribe from this group, send email to django-develop...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.



--
Eduardo Cereto Carvalho

Clemens-O. Hoppe

Feb 11, 2011, 9:50:18 AM
to django-d...@googlegroups.com
That's a subject which comes up every few months, sadly.

In a nutshell, if something requires python >= 2.5 or a lib for older
versions of Python, forget about adding it.

See f. e. http://code.djangoproject.com/ticket/5600 which was closed as
a no-fix 3 years ago (full disclosure: I'm coh in that bug report).
There was also a discussion on this mailing list a few weeks ago about
increasing the salt length, but afaik it had no code-change as a result.

I apologize if I sound a bit grumpy, but I've spent the last 5 days
monkey-patching a local branch of the auth lib up to the latest in
security (SHA512, 128-bit salt, pre-stretching, pbkdf2, stronger random
token generation (salt, csrf, default-password)); now it spreads into
other areas of the django-lib as well (currently SECRET_KEY in the
startproject script).

Of course I would very much welcome such a proposal, yet I just believe
the odds for it to happen are (very) low.

Cheers,

coh

Clemens-O. Hoppe

Feb 11, 2011, 9:56:04 AM
to django-d...@googlegroups.com
Dear Eduardo,

the idea of a salt is only to make certain that two users who happen to
use the same password (123456, anyone?) don't end up with the same hash,
in order to make a pre-computation (password lists or rainbow tables)
infeasible. Yet given the short salts in Django, it's not really
unlikely that two users will share the same salt as well as the same
password. Also keep in mind that, due to the birthday paradox, a value
with N bits becomes likely to collide after roughly 2^(N/2) samples
rather than 2^N.

Hope that clears up things a little bit :)

coh

william ratcliff

Feb 11, 2011, 9:57:41 AM
to django-d...@googlegroups.com
Hi!  The scenario that I am considering is when the attacker has your database--can they compromise the passwords in it?  While I believe that a salt will protect you against a "Rainbow Table" attack, I'm not convinced that it will protect you against the brute-force attacks which are now possible.  I will try to consult some experts today and see if they are willing to go on record.

William

Russell Keith-Magee

Feb 11, 2011, 10:04:46 AM
to django-d...@googlegroups.com

On Friday, 11 February 2011 at 10:50 PM, Clemens-O. Hoppe wrote:
> That's a subject which comes up every few months, sadly.
>
> In a nutshell, if something requires python >= 2.5 or a lib for older
> versions of Python, forget about adding it.

That's not true at all.

If an idea is important enough, we will include compatibility options for older Python versions. For example, we ship copies of unittest2, dictconfig logging, and a number of importlib and functional programming utilities in order to support the Python versions in which those facilities aren't available natively.

Of course, we will balance the value of a change against the cost of maintaining a local copy of that library, but to say that we won't do this at all is patently and demonstrably incorrect.

Yours,
Russell Keith-Magee

Peter Landry

Feb 11, 2011, 10:06:25 AM
to django-d...@googlegroups.com
I'm not an expert, but that's correct. A too-fast or broken hash function
will still be "vulnerable" to a brute force attack [1]. Salting doesn't
prevent this.

Peter

[1] http://stacksmashing.net/2010/11/15/cracking-in-the-cloud-amazons-new-ec2-gpu-instances/

Clemens-O. Hoppe

Feb 11, 2011, 10:20:40 AM
to django-d...@googlegroups.com

To quote ticket 5600:
> So I think the only thing we can do here is increase the salt size.
> I think anyone who feels they need more security will have to implement a
> custom authentication backend; building this into Django is just too
> fraught with danger.

Yet the patch that only increases the salt size, which was added not 24 hours
after that, still hasn't made its way into any release as far as I'm
aware.

Given the current 20-bit length (5 hex chars), salt-collisions will happen.
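
How quickly collisions show up can be estimated with the usual birthday-bound approximation. The sketch below assumes salts are drawn uniformly at random from a 20-bit space; `collision_probability` is a name introduced for the example.

```python
import math


def collision_probability(num_users, salt_bits=20):
    """Approximate P(at least two users share a salt) via the birthday bound."""
    space = 2 ** salt_bits
    pairs = num_users * (num_users - 1) / 2
    return 1 - math.exp(-pairs / space)


for users in (100, 1000, 10000):
    print(users, round(collision_probability(users), 3))
```

Under these assumptions, a site with around a thousand users already has better than a one-in-three chance of at least one shared salt, and at ten thousand users a collision is all but certain.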

On 02/11/2011 04:04 PM, Russell Keith-Magee wrote:
> If an idea is important enough, we will include compatibility options
> for older Python versions.

>> In a nutshell, if something requires python >= 2.5 or a lib for older
>> versions of Python, forget about adding it.
> That's not true at all.

> ... but to say that we won't do
> this at all is patently and demonstrably incorrect.

Sorry if it came along as too harsh --

> I apologize if I sound a bit grumpy, but I've spent the last 5 days
> monkey-patching a local branch of the auth lib...

Once again, I didn't mean to insult any dev (running a few projects
myself, so I know how much work it is) and I appreciate the work that is
done.

> Yours,
> Russell Keith-Magee

Cheers,

coh

Tyler Mulligan

Feb 11, 2011, 10:44:16 AM
to Django developers
I agree, it seems like a lot of work for individual developers to be
patching django themselves for secure auth. I'd be extremely grateful
to see this merged into the core.

On Feb 11, 10:20 am, "Clemens-O. Hoppe"

william ratcliff

Feb 11, 2011, 11:54:54 AM
to django-d...@googlegroups.com
Wow--you're ahead of me!  Is your custom auth public source?  If so, may I see the repo?  Also, for increasing the length of the salt, are you referring to:
http://code.djangoproject.com/attachment/ticket/13969/better_salting.diff

I thought it was marked as accepted.  But I just checked out SVN and you are correct that it is not using gen_salt.

Does anyone know when it will be included?

Thanks,
William

Clemens-O. Hoppe

Feb 11, 2011, 4:09:41 PM
to django-d...@googlegroups.com
On 02/11/2011 05:54 PM, william ratcliff wrote:
> Wow--you're ahead of me! Is your custom auth public source? If so, may
> I see the repo? Also, for increasing the length of the salt, are you
> referring to:
> http://code.djangoproject.com/attachment/ticket/13969/better_salting.diff
>
> Thanks,
> William

Wasn't, but now is -- https://bitbucket.org/coh/django_sec_mod/ . It's
against svn rev 15488. Please note that any sort of backwards compatibility is
broken on purpose and I haven't really tested the changes yet.

As for the salt length, I was actually referring to ticket 5600.

Have fun,

coh

poswald

Feb 12, 2011, 8:02:55 AM2/12/11
to Django developers
There are a lot of ideas and opinions, and a fair amount of confusion
floating around here. Please allow me to summarize the questions and
add my commentary:

1.) Should Django ship using SHA1 (with the current salt length or
even with more bits added)?

- I don't think so. SHA2 (256 or 512) is stronger and part of the
python hashlib (http://docs.python.org/library/hashlib.html). SHA1 is
not recommended for hashing passwords by the cryptographic community
and is also prohibited by NIST. Most developers will simply trust
Django to make the right choice by default and SHA1 is not the right
choice anymore.


2.) Will increasing the size of the salt help?

- Without a salt, a user's password of "password" can be easily looked
up in a pre-computed rainbow table. By adding a salt, you force the
attacker to have a rainbow table for each salt value: they need an
entry for each "password" + "saltvalue" combination. Increasing the
size of the salt increases the size of the set of rainbow tables the
attacker would have to have. A set of rainbow tables for a 12-bit salt
is probably storable. A 128-bit salt blows up the size of the tables to
"practically impossible". Since the attacker has the salt, at this
point they will usually switch to brute-forcing the passwords. The
lesson is: a large enough salt forces the attacker to switch to brute
force. They will run through all of the possible passwords + salt as
fast as they can. Here is where having a fast general-purpose hashing
function hurts you. If your attacker can do tens of thousands of checks
per second (and they can), you're in trouble. Slowing them down 10,000x
means the crack goes from hours to years.

http://en.wikipedia.org/wiki/Rainbow_table
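
To make the salt discussion concrete, here is a minimal sketch of salted hashing in an `algorithm$salt$hash` layout. The helper names (`make_salted_hash`, `verify`) and the 16-byte salt are assumptions chosen for illustration, not Django's actual code, and SHA1 appears only to mirror the scheme being criticised:

```python
import hashlib
import hmac
import os


def make_salted_hash(password, salt=None):
    """Return 'sha1$<salt>$<hexdigest>' (hypothetical helper)."""
    if salt is None:
        # 16 random bytes = 128 bits of salt, hex-encoded for storage.
        salt = os.urandom(16).hex()
    digest = hashlib.sha1((salt + password).encode("utf-8")).hexdigest()
    return "sha1$%s$%s" % (salt, digest)


def verify(password, stored):
    """Recompute the hash with the stored salt and compare in constant time."""
    _algorithm, salt, _digest = stored.split("$")
    candidate = make_salted_hash(password, salt)
    return hmac.compare_digest(candidate, stored)


a = make_salted_hash("password")
b = make_salted_hash("password")
# Same password, different random salts: the stored values differ, so one
# precomputed table cannot cover both rows.
print(a != b, verify("password", a), verify("wrong", a))
```

Note that the salt is stored in the clear next to the digest, which is why it defeats precomputed tables but does nothing against a per-row brute-force attack.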


2.) Should Django ship using SHA2 (256 or 512)?

- While using a general-purpose hash function such as SHA2 for
password hashing is certainly better than SHA1, it is probably not
ideal. SHA2 is designed to be fast because it's used for things
besides password hashing. A password hashing function is a better
choice: Bcrypt, PBKDF2 or scrypt. They have a "parameterized cost" and
allow you to tune the speed of the hashing by adding rounds to slow
down brute force attacks as hardware gets faster. This is called "Key
Strengthening" or "Key Stretching". Ulrich Drepper created an
implementation of crypt that uses repeated rounds of SHA2 for people
who needed to stay with a NIST approved function. Scrypt in particular
makes it difficult even for custom built hardware to compute quickly.
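
The "parameterized cost" idea can be illustrated with PBKDF2. This sketch uses `hashlib.pbkdf2_hmac`, which was only added to Python's standard library later (Python 3.4), so it was not an option when this was written. The iteration counts are arbitrary values for illustration, not recommendations, and `stretch` is a hypothetical helper name:

```python
import hashlib
import os
import time


def stretch(password, salt, iterations):
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 at the given cost."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = os.urandom(16)
for iterations in (1000, 100000):
    start = time.perf_counter()
    key = stretch("password", salt, iterations)
    elapsed = time.perf_counter() - start
    # Raising the iteration count raises the cost of every guess; the site
    # pays it once per login, the attacker pays it once per candidate.
    print(iterations, "iterations:", round(elapsed * 1000, 2), "ms")
```

The iteration count is the knob that can be turned up as hardware gets faster, which a single pass of SHA2 does not offer.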

Some relevant reading/projects:

http://www.daemonology.net/blog/2009-06-11-cryptographic-right-answers.html
http://www.akkadia.org/drepper/sha-crypt.html
http://www.tarsnap.com/scrypt.html
http://en.wikipedia.org/wiki/PBKDF2
http://pypi.python.org/pypi/py-bcrypt/
http://pypi.python.org/pypi/scrypt/
http://pypi.python.org/pypi/pbkdf2.py/
http://en.wikipedia.org/wiki/Key_strengthening


3.) Should Django allow a new mechanism for people to swap in their
own choice of hashing libraries?

- Since people have varying requirements and varying restrictions, I
think it would be helpful. If you look at how this problem has been
approached by the community to date (to my knowledge) it is mostly by
monkey patching, not by writing a new auth backend. Something like
this:

https://github.com/fwenzel/django-sha2/
https://github.com/dwaiter/django-bcrypt/

I know Mozilla recently opted for the SHA2/monkeypatch approach above
in one of their projects (although, I'm trying to convince them to
upgrade that to bcrypt or scrypt).

I think an improved default and the ability to upgrade (to scrypt) or
downgrade (to SHA2) depending on the developer's requirements would be
ideal.


4.) How should this pluggable capability be provided?

- Assuming we can all agree on the other three points, this is the
open question that needs to be focused on. Here, I don't have much of
an opinion yet. I'd defer to someone with more experience in Django's
API design. If I can swap in bcrypt or scrypt with a few lines of
configuration, I'll be happy.

I'm hoping this background material is useful and gets everyone on the
same page.

Clemens-O. Hoppe

Feb 12, 2011, 9:15:32 AM2/12/11
to django-d...@googlegroups.com
Nice read, though I would like to add one link:

http://www.f-secure.com/weblog/archives/00002095.html

And referenced from that, http://www.golubev.com/hashgpu.htm with the quote:

> Recovery speed on ATI HD 5970 peaks at 5600M/s MD5 hashes and 2300M/s SHA1 hashes.

That means, 2,300,000,000 SHA1 hashes per second.
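
A back-of-the-envelope calculation shows what that rate means in practice. The password space (8 characters drawn from a-z and 0-9) and the 10,000x slowdown factor (the figure mentioned earlier in the thread) are assumptions picked for illustration:

```python
SHA1_PER_SECOND = 2300000000  # rate quoted above for one ATI HD 5970

# Assumed password space: 8 characters drawn from a-z and 0-9.
keyspace = 36 ** 8

# Time to try every candidate with a single SHA1 per guess.
seconds_fast = keyspace / SHA1_PER_SECOND
# Assumed stretching factor: each guess costs 10,000 times as much work.
seconds_slow = seconds_fast * 10000

print("plain SHA1: %.1f minutes" % (seconds_fast / 60))
print("10,000x slower: %.1f days" % (seconds_slow / 86400))
```

Under these assumptions the whole space falls in roughly 20 minutes on one card, versus roughly 140 days once each guess is 10,000 times more expensive.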

On 02/12/2011 02:02 PM, poswald wrote:
> I'm hoping this background material is useful and gets everyone on the
> same page.

fullack.

Cheers,

coh

poswald

Feb 14, 2011, 1:37:29 AM2/14/11
to Django developers
Here is an overview of issues on this subject opened over the years.
Some have existing code:

http://code.djangoproject.com/ticket/3316 (Adding `crypt' to list of
password hashes for legacy apps. - closed: fixed)
http://code.djangoproject.com/ticket/5600 (Patch to enhance
cryptography on django.contrib.auth - closed: wontfix)
http://code.djangoproject.com/ticket/5787 (BCrypt password hashing
support in Django - closed: duplicate)
http://code.djangoproject.com/ticket/6028 (add compatibility with
glibc2 MD5-based crypt passwords - new )
http://code.djangoproject.com/ticket/9101 (Improved salt generation
for django.contrib.auth - closed: wontfix)
http://code.djangoproject.com/ticket/9194 (Allow additional hashing
algorithms for passwords - closed: duplicate)
http://code.djangoproject.com/ticket/13969 (auth module should use
longer salt for hashing - new)

Some of the arguments being made for this feature have been a bit
misleading and most of them pre-date the NIST requirements changeover.
I think the summary I made before gives the most clear overview of the
current situation: as it currently stands, if you get access to the
contents of a Django user table, you can recover the passwords very
cheaply/rapidly.

Looking at the code of existing solutions to this it seems like the
following would be a reasonable approach:

* Django ships with SHA2-256, SHA2-512 or PBKDF2 by default. SHA2 is
python 2.5 compatible (due to hashlib being added in python 2.5) and
PBKDF2 is short enough that it could be included into the project.
This satisfies NIST/US Gov requirements.
* SHA1 is maintained for backwards compatibility
* More secure hashing algorithms can be specified by defining the
functions to be used for 'User.set_password' and 'User.check_password'
as suggested above.

To use SHA2-512 by default requires a larger password db column. That
might be reasonable and would be a better choice. Additionally we
could look into using Drepper's key strengthening algorithm of SHA2 by
default to make django brute-force resilient out of the box. It should
also be noted that an algorithm like bcrypt stores the salt in with
the hash and therefore the salt column is not used.

It seems like we have people motivated to do the work. We need a
design decision made by a core dev that this is an acceptable
approach.

-Paul

william ratcliff

Feb 14, 2011, 2:04:24 AM2/14/11
to django-d...@googlegroups.com
Excellent summary!  If the core developers agree to this, I'm happy to contribute.

William

Carl Meyer

Feb 15, 2011, 1:24:43 AM2/15/11
to Django developers
Hi Paul,

On Feb 14, 1:37 am, poswald <paulosw...@gmail.com> wrote:
> * Django ships with SHA2-256, SHA2-512 or PBKDF2 by default. SHA2 is
> python 2.5 compatible (due to hashlib being added in python 2.5) and
> PBKDF2 is short enough that it could be included into the project.
> This satisfies NIST/US Gov requirements.
> * SHA1 is maintained for backwards compatibility
> * More secure hashing algorithms can be specified by defining the
> functions to be used for 'User.set_password' and 'User.check_password'
> as suggested above.

I'm only one core dev, and not a crypto expert, but I've read the
linked material and followed previous conversations, and here's my
take:

I don't think it's OK for Django to continue shipping with a default
password hashing scheme which no crypto expert, as far as I've seen,
considers adequate. People I trust to know their crypto, e.g. Thomas
Ptacek of Matasano, consider PBKDF2 to be significantly better than
salted SHA1 for password storage, if not quite as good as bcrypt. [1]
PBKDF2 is simple enough (essentially HMAC-SHA1 iterated many times) that including
an existing pure-Python implementation in Django seems reasonable,
removing the concerns about cross-platform and Python version
compatibility. (It would still be best if we could get the PBKDF2
implementation reviewed by a cryptographer.) So I'm +1 on switching
Django's default password hashing to PBKDF2.
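
To give a sense of how little code is involved, here is a sketch of PBKDF2 with HMAC-SHA1 as the pseudorandom function, written from the description in RFC 2898 section 5.2. It is an illustration, not the implementation proposed for Django and not an audited one; the function name `pbkdf2_sha1` is made up, and the cross-check relies on `hashlib.pbkdf2_hmac`, which only exists in later Python versions (3.4+):

```python
import hashlib
import hmac
import struct


def pbkdf2_sha1(password, salt, iterations, dklen=20):
    """PBKDF2 using HMAC-SHA1, following RFC 2898 section 5.2."""
    hlen = hashlib.sha1().digest_size
    blocks = -(-dklen // hlen)  # ceiling division
    derived = b""
    for index in range(1, blocks + 1):
        # U_1 = PRF(password, salt || INT_32_BE(index))
        u = hmac.new(password, salt + struct.pack(">I", index), hashlib.sha1).digest()
        block = int.from_bytes(u, "big")
        for _ in range(iterations - 1):
            # U_n = PRF(password, U_{n-1}); the block is the XOR of all U values.
            u = hmac.new(password, u, hashlib.sha1).digest()
            block ^= int.from_bytes(u, "big")
        derived += block.to_bytes(hlen, "big")
    return derived[:dklen]


# Cross-check against the standard library's implementation.
ours = pbkdf2_sha1(b"password", b"salt", 1000)
theirs = hashlib.pbkdf2_hmac("sha1", b"password", b"salt", 1000)
print(ours == theirs)
```

A real candidate would also be checked against the published PBKDF2 test vectors before anyone relied on it.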

As for the broader configurability question, I'm just fine with
requiring a custom auth backend, which really isn't that hard, as a
condition for customizing password hashing. So I'm not particularly
tempted by proposals to add a new setting for this. The hardcoded
stuff in the User model does bug me, though; I'm interested in the
proposal to make the User model delegate that to new methods on an
authentication backend (with backwards-compatibility fallback for old
auth backends that don't have the new methods).
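
A rough sketch of what that delegation could look like, in plain Python rather than real Django classes. Everything here is hypothetical: the `make_password` hook, the stand-in `User` class and the two backends are invented for illustration, and `hashlib.pbkdf2_hmac` comes from later Python versions (3.4+):

```python
import hashlib
import os


def legacy_sha1(raw_password):
    """Stand-in for the hard-coded salted SHA1 behaviour (illustrative only)."""
    salt = os.urandom(8).hex()
    digest = hashlib.sha1((salt + raw_password).encode("utf-8")).hexdigest()
    return "sha1$%s$%s" % (salt, digest)


class OldBackend(object):
    """An existing backend that knows nothing about password hashing."""


class StretchingBackend(object):
    """A backend that implements the hypothetical new hook."""

    def make_password(self, raw_password):
        salt = os.urandom(16)
        key = hashlib.pbkdf2_hmac("sha256", raw_password.encode("utf-8"), salt, 10000)
        return "pbkdf2$10000$%s$%s" % (salt.hex(), key.hex())


class User(object):
    def __init__(self, backends):
        self.backends = backends
        self.password = None

    def set_password(self, raw_password):
        for backend in self.backends:
            # Delegate to the first backend that offers the new method.
            if hasattr(backend, "make_password"):
                self.password = backend.make_password(raw_password)
                return
        # Backwards-compatible fallback for backends without the hook.
        self.password = legacy_sha1(raw_password)


old = User([OldBackend()])
old.set_password("secret")
new = User([OldBackend(), StretchingBackend()])
new.set_password("secret")
print(old.password.split("$")[0], new.password.split("$")[0])
```

The `hasattr` check is what keeps old backends working unchanged; only backends that opt in take over the hashing.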

Carl

[1] http://news.ycombinator.com/item?id=2005182

Russell Keith-Magee

Feb 15, 2011, 2:03:18 AM2/15/11
to django-d...@googlegroups.com

As with Carl -- I'm only one core dev, and I'm not a crypto expert,
here's my take:

I agree that it's less than ideal for us to continue to use SHA1 given
its known inadequacies.

My concern with this approach is that it requires us to either
maintain or adopt an encryption algorithm. We're not just using
something from the standard library, we're taking responsibility for
the holes in a specific implementation. Even if we just adopt code
from an existing implementation, we are accepting responsibility for
finding and fixing any holes in that implementation. This is a
responsibility that can't be taken lightly, and I'm not completely
convinced that we should pick up that particular gauntlet.

For this reason alone, I could be convinced that a configuration item
may be called for here -- e.g., registering a user-crypto library that
the default User object will use, in the same way that you can
currently register serialization libraries.

However, that said:

> As for the broader configurability question, I'm just fine with
> requiring a custom auth backend, which really isn't that hard, as a
> condition for customizing password hashing. So I'm not particularly
> tempted by proposals to add a new setting for this. The hardcoded
> stuff in the User model does bug me, though; I'm interested in the
> proposal to make the User model delegate that to new methods on an
> authentication backend (with backwards-compatibility fallback for old
> auth backends that don't have the new methods).

One of the things that I want to tackle in the 1.4 timeframe is the
general problem of a 'pluggable' User model. Allowing for customizable
authentication schemes is one (of many) parts of this problem. Right
now, my focus is on getting the 1.3 release out the door; once the 1.4
feature phase starts, I'll have a lot more time to discuss this sort
of thing.

Yours,
Russ Magee %-)

william ratcliff

Feb 15, 2011, 2:29:06 AM2/15/11
to django-d...@googlegroups.com
Carl and Russ,

Thanks for the response!  Would you prefer that those of us interested in working on this (pluggable user crypto-system) proceed from the trunk, or from 1.2?

Thanks,
William


Russell Keith-Magee

Feb 15, 2011, 2:34:25 AM2/15/11
to django-d...@googlegroups.com
On Tue, Feb 15, 2011 at 3:29 PM, william ratcliff
<william....@gmail.com> wrote:
> Carl and Russ,
> Thanks for the response!  Would you prefer that those of us interested in
> working on this (pluggable user crypto-system) proceed from the trunk, or
> from 1.2?

New features are always applied to trunk, so if you're developing new
code, that's what you should be developing against.

Yours,
Russ Magee %-)

poswald

Feb 21, 2011, 8:23:57 AM2/21/11
to Django developers
Russ, Carl, thanks for your feedback. Russ, I understand what you say
about not wanting to adopt crypto code because of the additional
responsibility. Unfortunately, there aren't very good options. Django
contrib.auth already makes the recommendation of SHA1 which we all
agree is less than ideal. There is simply no acceptable choice in the
python standard library. I also agree with Carl that PBKDF2 is
probably the most conservative option that qualifies as sufficient.

It seems like the canonical implementation of PBKDF2 in python is
Dwayne Litzenberger's. I think it is simple enough to audit for flaws
and stable enough not to cause too much trouble maintaining:

http://www.dlitz.net/software/python-pbkdf2/
http://ftp.dlitz.net/pub/dlitz/crypto/pkcs5-pbkdf2/1.2/PBKDF2.py
http://en.wikipedia.org/wiki/PBKDF2

I understand that everyone has their hands full with the 1.3 release
so I've gone ahead and opened a new ticket to track contributions to
this issue off-list. Anyone interested can track contributions there:

http://code.djangoproject.com/ticket/15367

Perhaps once the authentication methods are decoupled from the User
object as you plan, it will become sufficiently easy for third party
libraries to replace the hashing algorithm. If that happens, then this
default hashing can be ported to that technique. I do think it is
important to make it "easy enough" for a developer to upgrade to a
different library of choice. For now though, I'm ok with working on a
conservative (but improved) default for Django 1.4.

-Paul

Jacob Kaplan-Moss

Feb 21, 2011, 9:26:29 AM2/21/11
to django-d...@googlegroups.com
On Mon, Feb 21, 2011 at 3:23 PM, poswald <paulo...@gmail.com> wrote:
> Russ, Carl, thanks for your feedback. Russ, I understand what you say
> about not wanting to adopt crypto code because of the additional
> responsibility. Unfortunately, there aren't very good options. Django
> contrib.auth already makes the recommendation of SHA1 which we all
> agree is less than ideal. There is simply no acceptable choice in the
> python standard library. I also agree with Carl that PBKDF2 is
> probably the most conservative option that qualifies as sufficient.

I've been desperately trying to get up to speed on this stuff over the
past few weeks. Crypto's very far from my strong suit, but I think I
know enough now to agree. It seems to me we need two things:

1. A new, updated default for Django's password hashing. PBKDF2,
perhaps, but whatever as long as it meets some basic requirements.
2. A mechanism to make swapping this hashing algorithm out easy(-ier).
Again, details don't matter, requirements do.

#1's a blocker for 1.4, I think, but if for some reason #2 can't be
figured out I think it's ok to punt there for a bit longer. Ideally
though they'd both go in at once.

Now, I want to make very explicit my requirements here since we've
gone 'round on this one a few times, so I'll lay out exactly what I'm
going to want to see to get on board with any proposal. So:

Requirements for a new password hash:
* As little crypto code in Django as possible. We're not security
experts, and we shouldn't try to be. The ideal would be something that
leaves all of the dangerous parts to the stdlib. Perhaps we relax our
dependency policy (we need to some day, I think, but that's a bigger
argument maybe we shouldn't have now).
* Any code we distribute gets audited by people who know what they're
talking about.
* Those people have reputations sufficient to convince me (or other
core devs) that they know what they're doing. This is sorta a "who
watches the watchers" moment, but we can't just trust someone who says
they're a crypto expert; we have to believe them, too.

Requirements for pluggable hashing algorithms:
* The big one is cross-installation password compatibility. If I
upgrade from Python 2.4 to 2.7 my passwords have to keep working. If I
install django-bcrypt my old passwords have to keep working. If I then
decide to switch to pbkdf2 my bcrypt passwords have to keep working.
We already have an in-place upgrade mechanism for md5; we probably
need something similar as a generic thing.
* Failures need to be clear - I shouldn't get mysterious login
failures if I accidentally uninstall bcrypt (i.e. I should get a loud,
clear, failure quickly).
* We need an internal upgrade path that *we* can use when a few years
from now everyone starts complaining that PBKDF2 is fundamentally
flawed and that we're total idiots for clinging to it.
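
One way to picture these requirements is a small sketch, assuming an `algorithm$cost$salt$hash` storage layout. The registry, the function names and the exception are all hypothetical and not an existing Django API, and `hashlib.pbkdf2_hmac` is from later Python versions (3.4+):

```python
import hashlib
import hmac
import os


def _sha1(password, salt, _cost):
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()


def _pbkdf2(password, salt, cost):
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt.encode("utf-8"), cost)
    return key.hex()


# Hypothetical registry: algorithm code -> (hash function, current cost).
HASHERS = {"sha1": (_sha1, 1), "pbkdf2": (_pbkdf2, 10000)}
PREFERRED = "pbkdf2"


class UnknownAlgorithm(Exception):
    """Raised loudly instead of silently failing the login."""


def encode(password, algorithm=PREFERRED):
    func, cost = HASHERS[algorithm]
    salt = os.urandom(16).hex()
    return "%s$%d$%s$%s" % (algorithm, cost, salt, func(password, salt, cost))


def check(password, stored):
    """Return (is_valid, replacement); replacement is a re-encoded value or None."""
    algorithm, cost, salt, digest = stored.split("$")
    if algorithm not in HASHERS:
        raise UnknownAlgorithm("no hasher installed for %r" % algorithm)
    func, current_cost = HASHERS[algorithm]
    valid = hmac.compare_digest(func(password, salt, int(cost)), digest)
    outdated = algorithm != PREFERRED or int(cost) != current_cost
    # Only a successful login gives us the plaintext needed to upgrade.
    return valid, (encode(password) if valid and outdated else None)


legacy = encode("secret", "sha1")
ok, upgraded = check("secret", legacy)
print(ok, upgraded.split("$")[0])
```

Because the algorithm code travels with each stored value, old rows keep verifying after the default changes, a missing hasher fails with an explicit error, and the same upgrade-on-login path can be reused when the preferred algorithm changes again.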

[It occurs to me that, with the right mentor, this would make a
fantastic SoC project...]

Jacob

Jeff Balogh

Feb 22, 2011, 4:18:03 PM2/22/11
to django-d...@googlegroups.com, Michael Coates

At Mozilla we've been trying to work out our ideal password storage
scheme for a little while. Spoiler alert: it doesn't involve bcrypt.

There's no code yet, but we have a little bit of documentation here:
https://wiki.mozilla.org/WebAppSec/Secure_Coding_Guidelines#Password_Storage.
This is the outline:

1. sha512 hashing
2. Per-user salt stored with the hashed password
3. Private system salt stored outside the database
4. The system salt can be deprecated if it gets leaked
5. Required minimum password length
6. Common passwords are blocked
7. Migrations towards more security are possible as long as you have
code to unwrap the migration
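
Since there's no code yet, the following is only a sketch of how points 1, 2, 3, 5 and 6 might fit together. Combining the system salt via HMAC is an assumption (the outline does not say how the two salts are mixed), and the key value, minimum length, blocked-password list and function names are all invented for illustration:

```python
import hashlib
import hmac
import os

# Assumption: in a real deployment this value lives outside the database
# (for example in a config file); it is hard-coded here only for the demo.
SYSTEM_SALT = b"example-system-salt-not-for-production"
MIN_LENGTH = 8  # assumed minimum, not taken from the guidelines
COMMON_PASSWORDS = {"password", "12345678", "qwertyui"}  # tiny sample list


def validate(password):
    """Reject passwords that are too short or on the blocked list."""
    if len(password) < MIN_LENGTH:
        raise ValueError("password too short")
    if password.lower() in COMMON_PASSWORDS:
        raise ValueError("password too common")


def hash_password(password, user_salt=None):
    """Return 'sha512$<user_salt>$<hexdigest>' keyed with the system salt."""
    if user_salt is None:
        user_salt = os.urandom(16).hex()
    # HMAC-SHA512 keyed with the system salt over per-user salt + password.
    mac = hmac.new(SYSTEM_SALT, (user_salt + password).encode("utf-8"), hashlib.sha512)
    return "sha512$%s$%s" % (user_salt, mac.hexdigest())


def check_password(password, stored):
    _algorithm, user_salt, _digest = stored.split("$")
    return hmac.compare_digest(hash_password(password, user_salt), stored)


validate("correct horse battery")
stored = hash_password("correct horse battery")
print(check_password("correct horse battery", stored))
```

With this layout a database dump alone is not enough to start guessing, because the system salt is not in it.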

I've cc'd Michael Coates (our security guy) since he can provide more
background on why we're moving towards this strategy. Basically, if
people are using weak passwords, switching hashing schemes is not
going to provide much more protection. If it takes 0.3 seconds to
encrypt "pa$$word" then it took 0.3 seconds, but I still have your
user's password. We're trying to structure a system that provides more
entropy and prevents weak passwords altogether.
