Questions pertaining to the dubious value and the origin of "common" passwords


Yo-Yo Ma

Dec 20, 2015, 12:03:21 PM
to Django developers (Contributions to Django itself)
Today, I decided to check out Django's new password validation functionality (which is a great feature, btw).

I noticed there was a CommonPasswordValidator, which mentions "1000 common passwords"...

Part 1.

The first thing that came to mind was: how would one compile a list of 1000 common passwords, unless they maintained a rainbow table of millions of possible passwords AND had access to a large corpus of leaked password hashes (or databases of plain text passwords)?

Here's where it's worth noting the "This is Facebook, so I'll create a real password" vs "This is just some forum I'll probably never come back to, so I'll just use hunter2" phenomenon.

Now, given the second part of the question (large corpus of data), or even more so, the plain text case, where does intuition tell you that the majority of this kind of data would likely come from? Facebook / Twitter / online banks? Or forums and defunct websites?

I think with that, I've established that the notion of "N most common passwords" is unlikely to be even remotely accurate.

Part 2.

So, with the above thoughts in mind, I decided to have a look at the passwords Django is using and find their origin (did they come from a compiled list of "leaked" databases or something else?).

The list (plain text: https://gist.github.com/anonymous/59e9eb2935165d7b0fa9) is, as I found after a quick search, copied wholesale from a website called passwordrandom.com.

The website appears to be owned by one Dmitriy Koshevoy in Ukraine, but other than that I know nothing about it.

The list that Django uses is from this page specifically http://www.passwordrandom.com/most-popular-passwords - purporting to have the 10,000 most commonly used passwords (in order!), but says nothing about where they came from.

I figured, maybe this website is quite popular for password validation / generation, and Dmitriy has compiled... seems like a pretty bad idea to give them your password, but oh well.

Except that passwordrandom.com has basically no traffic, according to SimilarWeb, Compete, and Alexa.

Side note: passwordrandom also features this strange and suspicious joke http://www.passwordrandom.com/password-database. Hopefully nobody has entered their real password there or anywhere else on the website, or used the site to generate a password, lest they lose it to the public domain, since the website doesn't even employ TLS.

Conclusion.

With all that, I'm now wondering how this list of "common" passwords made it into Django's code base. Perhaps it should be removed since, as I've argued above, it provides no verifiable value or security. It could just as well be replaced with a configuration option (list setting or file path setting), to maintain backwards compatibility (and warm fuzzies for those who think *they* know the most common passwords?).

Tim Graham

Dec 20, 2015, 2:14:34 PM
to Django developers (Contributions to Django itself)
Hi, the way I try to answer these types of design decision questions is by using git blame to find the ticket that introduced the feature. From there, you can often find links to the pull request or mailing list discussions which might explain the decision.

As noted in the documentation [1], the password list came from:
https://web.archive.org/web/20150315154609/https://xato.net/passwords/more-top-worst-passwords/#.Vnb8FV7L9z0

If you don't think the included list is appropriate, as the documentation for CommonPasswordValidator also notes, "The password_list_path can be set to the path of a custom file of common passwords. This file should contain one password per line and may be plain text or gzipped."

[1] https://docs.djangoproject.com/en/dev/topics/auth/passwords/#django.contrib.auth.password_validation.CommonPasswordValidator
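
For anyone who wants to swap in their own list, here is a rough sketch of what that could look like. The settings entry uses the documented password_list_path option; the file path itself is made up for illustration. The two helper functions (load_common_passwords and is_common) are hypothetical, standard-library-only stand-ins meant to show what "one password per line, plain text or gzipped" means in practice. They are not Django's actual implementation, and the lowercase/strip normalisation is an assumption of this sketch.

```python
import gzip

# Hypothetical location of a custom list; substitute your own file.
CUSTOM_LIST_PATH = "/path/to/my-common-passwords.txt.gz"

# Settings fragment: point CommonPasswordValidator at the custom file.
AUTH_PASSWORD_VALIDATORS = [
    {
        "NAME": "django.contrib.auth.password_validation.CommonPasswordValidator",
        "OPTIONS": {"password_list_path": CUSTOM_LIST_PATH},
    },
]


def load_common_passwords(path):
    """Read one password per line from a gzipped or plain-text file.

    Illustrative helper only: try gzip first, then fall back to plain text.
    """
    try:
        with gzip.open(path, "rt", encoding="utf-8") as f:
            lines = f.read().splitlines()
    except OSError:
        # Not a gzip file (gzip.BadGzipFile is a subclass of OSError).
        with open(path, encoding="utf-8") as f:
            lines = f.read().splitlines()
    # Assumption: compare case-insensitively and ignore blank lines.
    return {line.strip().lower() for line in lines if line.strip()}


def is_common(password, common_passwords):
    """Return True if the normalised password appears in the loaded set."""
    return password.strip().lower() in common_passwords
```

With a file like this in place, the question of where the bundled list came from matters less, since a project can supply whatever list it trusts.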