Regarding the use of sha in contrib.auth


Rob Hudson

Feb 8, 2007, 12:16:16 PM
to Django developers
Since SHA-1 was recently found to have some collisions, and since sha
is deprecated in Python 2.5 in favor of hashlib, should an attempt to
import hashlib be added to contrib.auth.models (both check_password
and set_password) so when Python 2.5 becomes more mainstream, this
will be picked up by default?

Hashlib docs:
http://docs.python.org/lib/module-hashlib.html

Both md5 and sha have deprecation warnings:
http://docs.python.org/lib/module-md5.html
http://docs.python.org/lib/module-sha.html
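
A minimal sketch of the import fallback being proposed, assuming nothing about how contrib.auth.models is actually laid out. The helper name sha1_hexdigest is made up for illustration. On any interpreter that ships hashlib the first branch is taken; the sha branch only matters on Python 2.4 and earlier.

```python
# Prefer hashlib (Python 2.5+); fall back to the old sha module otherwise.
try:
    import hashlib
    _sha1 = hashlib.sha1
except ImportError:
    import sha  # only reached on interpreters without hashlib
    _sha1 = sha.new


def sha1_hexdigest(data):
    """Return the SHA-1 hex digest of a byte string (hypothetical helper)."""
    return _sha1(data).hexdigest()
```

Callers would use sha1_hexdigest() everywhere and never import sha or hashlib directly, so the choice of module stays in one place.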

James Bennett

Feb 8, 2007, 12:36:03 PM
to django-d...@googlegroups.com
On 2/8/07, Rob Hudson <trebor...@gmail.com> wrote:
> Since SHA-1 was recently found to have some collisions, and since sha
> is deprecated in Python 2.5 in favor of hashlib, should an attempt to
> import hashlib be added to contrib.auth.models (both check_password
> and set_password) so when Python 2.5 becomes more mainstream, this
> will be picked up by default?

Using hashlib to generate SHA1 when it's available is something I
could get behind. Deprecating SHA1 hashes, not so much -- *every* hash
algorithm, inevitably, will have collisions (think about it -- you
have a fixed number of digits in the final hash and, hence, a fixed
number of distinct possible permutations of digits. That number of
permutations will always be smaller than the number of possible inputs
to the algorithm, so there will always be at least some sets of inputs
which all yield the same hash).
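
The counting argument can be seen on a toy scale. This is only an illustration, assuming a made-up "hash" that keeps the first byte of SHA-1, so it has 256 possible outputs; feeding it 257 distinct inputs has to produce a repeat. The names tiny_hash and find_collision are hypothetical.

```python
import hashlib


def tiny_hash(data):
    """First byte of SHA-1: a toy hash with only 256 possible outputs."""
    return hashlib.sha1(data).digest()[:1]


def find_collision(limit=257):
    """Hash `limit` distinct inputs and return the first pair that collides.

    With 257 inputs and 256 possible outputs, a pair must exist.
    """
    seen = {}
    for i in range(limit):
        msg = str(i).encode("ascii")
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
    return None


first, second = find_collision()
print(first, second, tiny_hash(first) == tiny_hash(second))
```

Real SHA-1 has 160 output bits instead of 8, so the same guarantee holds but stumbling on a pair this way is not practical.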

And collisions, by themselves, don't make an algorithm useless for
what we want out of it, which is a roughly unique representation of a
password that isn't the password itself. Generating a collision
wouldn't mean you could log in as someone else, it'd mean you could
have two users with the same password hash, and that -- since auth
lookups start with the username and not the password hash, and require
a match on *both* columns -- doesn't cause a problem either. Just as
two users in a non-hashing system could both have the password
"secret123" without interfering with one another, two users in a
hashing system can have the same password hash without interfering
with one another.
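
A rough sketch of the lookup order described above, assuming an in-memory dict in place of the users table. The function names and the alice/bob accounts are invented for illustration and are not Django's API.

```python
import hashlib


def make_hash(salt, raw_password):
    """Build a 'sha1$salt$hash' string in the format used in this thread."""
    digest = hashlib.sha1((salt + raw_password).encode("utf-8")).hexdigest()
    return "sha1$%s$%s" % (salt, digest)


def authenticate(users, username, raw_password):
    """Find the row by username first, then compare hashes for that row only."""
    stored = users.get(username)
    if stored is None:
        return False
    _algo, salt, _digest = stored.split("$")
    return make_hash(salt, raw_password) == stored


# Two accounts that happen to share one stored hash (same salt, same password).
shared = make_hash("cb374", "secret123")
users = {"alice": shared, "bob": shared}
```

Both accounts authenticate independently with their own username; the shared hash value never acts as a lookup key.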


--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

Rob Hudson

Feb 8, 2007, 2:14:00 PM
to Django developers
On Feb 8, 9:36 am, "James Bennett" <ubernost...@gmail.com> wrote:
> Using hashlib to generate SHA1 when it's available is something I
> could get behind. Deprecating SHA1 hashes, not so much

Totally agree... the news about SHA-1 made me want to look up Python's
cryptographic functions as a curiosity. SHA-1 isn't going away any
time soon. A new hashing method won't even be picked by NIST until
2011.

It's still early, but I think adding support for hashlib is good,
especially since Python appears to be deprecating anything else.

As an extra option, could we also add support for sha-256 in
contrib.auth if folks would prefer that?
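
One way an optional algorithm could look, as a sketch only: the function name make_password and its algo parameter are assumptions, not existing contrib.auth code. It relies on hashlib.new() accepting the algorithm name as a string and keeps the algo$salt$hash layout so the stored value says which algorithm produced it.

```python
import binascii
import hashlib
import os

SUPPORTED = ("sha1", "sha256")


def make_password(raw_password, algo="sha1"):
    """Return 'algo$salt$hash' using the requested hashlib algorithm."""
    if algo not in SUPPORTED:
        raise ValueError("unsupported algorithm: %r" % algo)
    # Five hex characters of random salt, matching the examples in this thread.
    salt = binascii.hexlify(os.urandom(3)).decode("ascii")[:5]
    digest = hashlib.new(algo, (salt + raw_password).encode("utf-8")).hexdigest()
    return "%s$%s$%s" % (algo, salt, digest)
```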

-Rob


TR

Feb 8, 2007, 4:21:56 PM
to Django developers
On 8 Feb., 18:36, "James Bennett" <ubernost...@gmail.com> wrote:

> On 2/8/07, Rob Hudson <treborhud...@gmail.com> wrote:
>
> > Since SHA-1 was recently found to have some collisions, and since sha
> > is deprecated in Python 2.5 in favor of hashlib, should an attempt to
> > import hashlib be added to contrib.auth.models (both check_password
> > and set_password) so when Python 2.5 becomes more mainstream, this
> > will be picked up by default?
>
> Using hashlib to generate SHA1 when it's available is something I
> could get behind. Deprecating SHA1 hashes, not so much -- *every* hash
> algorithm, inevitably, will have collisions

Yes. That's why they're called hashes. What's bad now is if you can
generate collisions faster than by brute force, which is exactly what
was happening. This is very different. Basically your hash doesn't
mean anything anymore if Joe Random Cracker can present you any data
he wants and your hash algorithm still says "Yes, correct".
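
For scale: a brute-force collision search on an n-bit hash is expected to need on the order of 2^(n/2) hashes (the birthday bound), so roughly 2^80 for SHA-1's 160 bits. An attack counts as "faster than brute force" when it beats that figure. The sketch below runs the brute-force search on a deliberately truncated 24-bit digest so it finishes quickly; BITS, truncated and brute_force_collision are names invented for this example.

```python
import hashlib

BITS = 24  # toy digest size; real SHA-1 has 160 bits


def truncated(data):
    """Keep only the first BITS bits of SHA-1 so brute force is feasible."""
    return hashlib.sha1(data).digest()[: BITS // 8]


def brute_force_collision():
    """Hash successive counter values until two share a truncated digest."""
    seen = {}
    attempts = 0
    while True:
        msg = str(attempts).encode("ascii")
        attempts += 1
        h = truncated(msg)
        if h in seen:
            return seen[h], msg, attempts
        seen[h] = msg


a, b, attempts = brute_force_collision()
print("collision after %d hashes; birthday estimate is about 2**%d = %d"
      % (attempts, BITS // 2, 2 ** (BITS // 2)))
```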

> And collisions, by themselves, don't make an algorithm useless for
> what we want out of it, which is a roughly unique representation of a
> password that isn't the password itself. Generating a collision
> wouldn't mean you could log in as someone else, it'd mean you could
> have two users with the same password hash, and that -- since auth
> lookups start with the username and not the password hash, and require
> a match on *both* columns -- doesn't cause a problem either. Just as
> two users in a non-hashing system could both have the password
> "secret123" without interfering with one another, two users in a
> hashing system can have the same password hash without interfering
> with one another.


Which would be right if a broken hash algorithm couldn't be used to
log in without the right password, using something that just generates
the same hash - in other words, knowing the hash (poking at the db,
SQL injection, anything) you don't need the password. It's like
storing a clear text password, and you wouldn't argue that's a good
idea, no?

Alas, the current situation with SHA-1 isn't that bad, there are still
enough bits left, but any algorithm with one successful attack has
historically been taken apart. Could happen again. Right now, there is
no real alternative, the larger SHAs are probably vulnerable to the
same attack vector and WHIRLPOOL's still young (but looks good so
far).

Regards,
Thomas (nitpicker)

James Bennett

Feb 8, 2007, 4:45:29 PM
to django-d...@googlegroups.com
On 2/8/07, TR <dtr...@gmail.com> wrote:
> Which would be right if a broken hash algorithm couldn't be used to
> log in without the right password, using something that just generates
> the same hash - in other words, knowing the hash (poking at the db,
> SQL injection, anything) you don't need the password. It's like
> storing a clear text password, and you wouldn't argue that's a good
> idea, no?

Well, the important thing here is that in order to take over a user's
account by generating a hash collision, an attacker has to know *in
advance* the hash to generate the collision for. And if your attacker
has enough access to get that information out of your database, I
don't really see how choosing a different hash algorithm could help
you out -- if the attacker can retrieve password hashes, it's likely
she no longer needs to generate collisions in order to impersonate
people (and, since the DB entries contain the salt used to generate
the hash, a standard dictionary attack is likely to be a much more
efficient use of the attacker's resources if she does need to do
that).
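
To make the dictionary-attack point concrete, here is a small sketch, assuming the attacker has already read an algo$salt$hash entry out of the database. The word list and the function names are invented for the example. Because the salt sits in the stored entry, each guess costs one hash.

```python
import hashlib


def make_hash(salt, raw_password):
    """Build a 'sha1$salt$hash' string in the format used in this thread."""
    digest = hashlib.sha1((salt + raw_password).encode("utf-8")).hexdigest()
    return "sha1$%s$%s" % (salt, digest)


def dictionary_attack(stored, wordlist):
    """Try each candidate using the salt read from the stored entry itself."""
    _algo, salt, _digest = stored.split("$")
    for candidate in wordlist:
        if make_hash(salt, candidate) == stored:
            return candidate
    return None


stolen = make_hash("cb374", "turing")  # stands in for a row read from the DB
guess = dictionary_attack(stolen, ["password", "secret123", "turing"])
print(guess)
```

This finds the real password, not a colliding input, and it does not depend on any weakness in SHA-1 itself.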

Rob Hudson

Feb 8, 2007, 6:35:26 PM
to Django developers
Should I file a bug to eventually use hashlib for >= Python 2.5?
Should I provide a patch which attempts to import hashlib and use it
if available, but otherwise falls back on md5/sha1?

Some general confusion about what's going on in contrib.auth.models...

There's 2 check_password methods in there. 1 in the global namespace
and 1 in the User class. User.check_password is there mainly to check
for an md5 password (by absence of a '$') and if it is an md5
password, it converts it to the sha1 password and passes handling to
the global check_password.
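
For readers following along, a sketch of what a module-level check of this kind might look like. It is loosely modeled on the description above and is not the actual Django source; it assumes the stored value has the form algo$salt$hash and that the hash is taken over salt + password.

```python
import hashlib


def check_password(raw_password, enc_password):
    """Verify a password against an 'algo$salt$hash' string (md5 or sha1)."""
    algo, salt, hsh = enc_password.split("$")
    data = (salt + raw_password).encode("utf-8")
    if algo == "md5":
        return hsh == hashlib.md5(data).hexdigest()
    if algo == "sha1":
        return hsh == hashlib.sha1(data).hexdigest()
    raise ValueError("unknown password algorithm: %r" % algo)
```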

But set_password will only set a sha1 password. So why would the
global check_password need to check if the algo is 'md5' if
set_password could never use md5?

Could Django remove the BC check prior to 1.0 to clean this up? I
guess for applications that are in active use with real users this
would be bad, since the only way to migrate to sha1 would be to
know the actual password.

Maybe I answered my own question. :)

Gary Wilson

Feb 8, 2007, 10:31:23 PM
to Django developers
On Feb 8, 5:35 pm, "Rob Hudson" <treborhud...@gmail.com> wrote:
> Should I file a bug to eventually use hashlib for >= Python 2.5?
> Should I provide a patch which attempts to import hashlib and use it
> if available, but otherwise falls back on md5/sha1?

Yes, file a bug so the idea is not forgotten. Patches are always
welcome.

> Some general confusion about what's going on in contrib.auth.models...
>
> There's 2 check_password methods in there. 1 in the global namespace
> and 1 in the User class. User.check_password is there mainly to check
> for an md5 password (by absence of a '$') and if it is an md5
> password, it converts it to the sha1 password and passes handling to
> the global check_password.
>
> But set_password will only set a sha1 password. So why would the
> global check_password need to check if the algo is 'md5' if
> set_password could never use md5?

Because Django used to use md5 hashes.

> Could Django remove the BC check prior to 1.0 to clean this up? I
> guess for applications that are in active use with real users this
> would be bad, since the only way to migrate to sha1 would be to
> know the actual password.

Or a collision :)

Malcolm Tredinnick

Feb 8, 2007, 10:33:50 PM
to django-d...@googlegroups.com
On Thu, 2007-02-08 at 23:35 +0000, Rob Hudson wrote:
> Should I file a bug to eventually use hashlib for >= Python 2.5?
> Should I provide a patch which attempts to import hashlib and use it
> if available, but otherwise falls back on md5/sha1?

Be careful to ensure backwards compatibility. Otherwise an
inconsequential Python upgrade (to 2.5) will mean all your previously
recorded passwords are now unusable. You need to at least be able to
check for SHA1-style hashes and use those if necessary no matter which
version of Python you are using.

Malcolm


Rob Hudson

Feb 8, 2007, 11:46:40 PM
to django-d...@googlegroups.com
Malcolm Tredinnick wrote:
> Be careful to ensure backwards compatibility. Otherwise an
> inconsequential Python upgrade (to 2.5) will mean all your previously
> recorded passwords are now unusable. You need to at least be able to
> check for SHA1-style hashes and use those if necessary no matter which
> version of Python you are using.

Good point. I did a quick test and the SHA-1 hashes are equivalent...

Python 2.4.3 (#1, Nov 3 2006, 21:03:52)
[GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import random
>>> rand = str(random.random())
>>> rand
'0.55628289848'
>>> import sha
>>> salt = sha.new(rand).hexdigest()[:5]
>>> raw_pass = 'turing'
>>> hsh = sha.new(salt+raw_pass).hexdigest()
>>> '%s$%s$%s' % ('sha1', salt, hsh)
'sha1$cb374$bd6289a5f976888b532141483391c108656edfb5'

Python 2.5 (r25:51908, Nov 3 2006, 20:49:30)
[GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> rand = '0.55628289848'
>>> import hashlib
>>> salt = hashlib.sha1(rand).hexdigest()[:5]
>>> raw_pass = 'turing'
>>> hsh = hashlib.sha1(salt+raw_pass).hexdigest()
>>> '%s$%s$%s' % ('sha1', salt, hsh)
'sha1$cb374$bd6289a5f976888b532141483391c108656edfb5'

Lawrence Oluyede

Feb 9, 2007, 4:09:58 AM
to django-d...@googlegroups.com
> Good point. I did a quick test and the SHA-1 hashes are equivalent...

I also tried with Python 2.3

rhymes@groove ~ % python2.3
Python 2.3.5 (#1, Jan 13 2006, 20:13:11)
[GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> rand = '0.55628289848'
>>> import sha
>>> salt = sha.new(rand).hexdigest()[:5]
>>> raw_pass = 'turing'
>>> hsh = sha.new(salt+raw_pass).hexdigest()
>>> '%s$%s$%s' % ('sha1', salt, hsh)
'sha1$cb374$bd6289a5f976888b532141483391c108656edfb5'


--
Lawrence, oluyede.org - neropercaso.it
"It is difficult to get a man to understand
something when his salary depends on not
understanding it" - Upton Sinclair

James Bennett

Feb 9, 2007, 7:29:14 AM
to django-d...@googlegroups.com
On 2/8/07, Rob Hudson <trebor...@gmail.com> wrote:
> But set_password will only set a sha1 password. So why would the
> global check_password need to check if the algo is 'md5' if
> set_password could never use md5?

Django used to use MD5 hashes; that function is in there so that an
old installation which was using MD5 can be upgraded and get switched
over to SHA1 without needing to manually go through and reset
passwords.
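
A sketch of that upgrade path, under stated assumptions: the Account class is a hypothetical stand-in for a user row, a legacy entry is taken to be a bare unsalted MD5 hex digest (recognized by the absence of '$'), and the salt is fixed only to keep the example deterministic.

```python
import hashlib


class Account(object):
    """Hypothetical stand-in for a user row; only the password field matters."""

    def __init__(self, password):
        self.password = password

    def set_password(self, raw_password, salt="cb374"):
        # Fixed salt keeps the sketch deterministic; real code would randomize it.
        digest = hashlib.sha1((salt + raw_password).encode("utf-8")).hexdigest()
        self.password = "sha1$%s$%s" % (salt, digest)

    def check_password(self, raw_password):
        if "$" not in self.password:
            # Legacy entry: a bare, unsalted MD5 hex digest.
            legacy = hashlib.md5(raw_password.encode("utf-8")).hexdigest()
            ok = self.password == legacy
            if ok:
                # The raw password is in hand right now, so re-store it as SHA-1.
                self.set_password(raw_password)
            return ok
        _algo, salt, digest = self.password.split("$")
        expected = hashlib.sha1((salt + raw_password).encode("utf-8")).hexdigest()
        return digest == expected
```

The conversion can only happen at login, because that is the one moment the plain password is available; accounts that never log in keep their MD5 entry.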
