
Entropy of a password


asli

Dec 26, 2009, 9:52:53 PM
Hello all,

I want to calculate the strength of a password, but my question is
really about the entropy of its characters. I have a program that
calculates the frequencies of symbols: single characters, bigrams,
and word-starting and word-ending characters.

I want to calculate the entropy of the given password based on these
character probabilities.

I know that the entropy is defined as:
H(X)= - Sum [P(x_i).logP(x_i) ]
for a random variable X with n outcomes { x_i : i = 1, ..., n }.

If I want to calculate the entropy of a single character, how do I
use this formula? Could I calculate the entropy of the password
"password" as:
H(password) = - (P(p).logP(p) + P(a).logP(a) + ... + P(d).logP(d))
which would be equivalent to:
H(password) = H(p) + H(a) + ... + H(d)
That would mean I can create a table that stores the entropy of each
character, and whenever the user enters a password to measure its
strength, I would just look up the table and sum the entropies of the
characters in the given password.

Is that how the entropy of a password is calculated?

But if I take conditional probabilities into account, the formula gets
more complicated:
H(password) = H(p) + H(a|p) + H(s|pa) + ... + H(d|passwor)
where H(y|x) is the conditional entropy of y given x.
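
A minimal sketch of the two approaches, in JavaScript like the code
quoted later in this thread. What can be computed for one specific
string is its surprisal, the sum of -log2 P(c) over its characters,
rather than the P.logP sum; a per-character lookup table only works
when the characters are treated as independent. The tables `unigram`
and `bigram` and their values are made up for illustration, and both
functions assume every character of the input appears in them. The
bigram version conditions only on the previous character, which merely
approximates the full chain H(s|pa), ..., H(d|passwor).

```javascript
// Hypothetical probability tables; real values would come from a frequency
// count over a corpus, such as the one the post describes.
const unigram = { a: 0.5, b: 0.25, c: 0.25 };
// bigram[prev][next] = P(next | prev)
const bigram = {
  a: { a: 0.1, b: 0.8, c: 0.1 },
  b: { a: 0.7, b: 0.1, c: 0.2 },
  c: { a: 0.6, b: 0.2, c: 0.2 },
};

// Surprisal in bits, treating every character as independent.
// This is the "table lookup and sum" idea.
function unigramSurprisal(s) {
  let bits = 0;
  for (const ch of s) bits += -Math.log2(unigram[ch]);
  return bits;
}

// Surprisal in bits when each character is conditioned on the previous one.
// The first character has no predecessor, so it falls back to the unigram table.
function bigramSurprisal(s) {
  let bits = 0;
  let prev = null;
  for (const ch of s) {
    const p = prev === null ? unigram[ch] : bigram[prev][ch];
    bits += -Math.log2(p);
    prev = ch;
  }
  return bits;
}

console.log(unigramSurprisal("ab"), bigramSurprisal("ab"));
```

With these made-up tables, "ab" scores lower under the bigram model
than under the unigram model because "b" is likely after "a"; the two
models can disagree in either direction.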

So which formula should I use to calculate the entropy?

I hope I am clear enough.

Thanks a lot in advance.

ASLI

unruh

Dec 26, 2009, 10:12:34 PM

None. The question has no unique or even well-defined answer.
That formula is only useful if the probabilities are independent. The
entropy is also relative to the "search procedure". It may be that
the searcher has a special love for the phrase "ajkl&T)(Salkelkap7uy70 ",
in which case the entropy of that phrase would be very low for him.
Another searcher might not. A common Russian word might have high entropy for
an English speaker, but clearly not for a Russian.
I.e., your question is ill-defined. Given a search method you could make
an estimate of the "entropy". E.g., for an exhaustive search that ran
through the alphabet (all strings with fewer than 15 letters, starting
with "a", then "b", etc.), the word "zoo" would have a very high entropy,
while if the search did 1 letter, then 2 letters, then 3, etc., it would be
low.
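
The "zoo" example can be made concrete by computing where the word
falls in each guessing order and taking log2 of that position as a
rough "entropy relative to the search". This is only a sketch under
assumptions: a lowercase a-z alphabet, a maximum length of 14 (reading
"less than 15 letters" literally), and plain dictionary order for the
first search. The function names are invented for illustration.

```javascript
const ALPHABET_SIZE = 26n;

// Letter index as a BigInt: 'a' -> 0n, ..., 'z' -> 25n. Assumes lowercase a-z.
function letterIndex(ch) {
  return BigInt(ch.charCodeAt(0) - 97);
}

// Number of strings of length 0..m over the alphabet: 1 + 26 + ... + 26^m.
function countUpTo(m) {
  let total = 0n;
  let power = 1n;
  for (let j = 0; j <= m; j++) {
    total += power;
    power *= ALPHABET_SIZE;
  }
  return total;
}

// 1-based position of `word` when guesses go by length first (all 1-letter
// strings, then all 2-letter strings, ...), alphabetically within each length.
function lengthFirstRank(word) {
  const shorter = countUpTo(word.length - 1) - 1n; // strings of length 1..n-1
  let offset = 0n;
  for (const ch of word) offset = offset * ALPHABET_SIZE + letterIndex(ch);
  return shorter + offset + 1n;
}

// 1-based position of `word` when all strings of length 1..maxLen are tried
// in plain dictionary order ("a", "aa", "aaa", ..., "b", ...).
function dictionaryRank(word, maxLen) {
  let rank = 1n;
  for (let i = 0; i < word.length; i++) {
    // Strings sharing the first i letters but with a smaller letter at i come first.
    rank += letterIndex(word[i]) * countUpTo(maxLen - i - 1);
    // So does the non-empty proper prefix made of the first i letters.
    if (i > 0) rank += 1n;
  }
  return rank;
}

// Bits of work implied by a guess position. Converting to Number loses
// precision for huge ranks, which does not matter for a logarithm.
function bitsForRank(rank) {
  return Math.log2(Number(rank));
}

console.log(lengthFirstRank("zoo"), bitsForRank(lengthFirstRank("zoo")));
console.log(dictionaryRank("zoo", 14), bitsForRank(dictionaryRank("zoo", 14)));
```

In the length-first order "zoo" is guess number 17981 (about 14 bits),
while in dictionary order it sits near the very end of the search
space, so the same word gets very different figures.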

rossum

Dec 27, 2009, 7:28:43 AM
On Sat, 26 Dec 2009 18:52:53 -0800 (PST), asli <koks...@gmail.com>
wrote:

>Hello all,
>
>I want to calculate the strength of the password. But my question is
>related to the entropy of the characters.

You might find it easier to pick the strength that you require and
then generate a password/passphrase with that amount of entropy.

I would suggest Diceware:

http://world.std.com/~reinhold/diceware.html

as one possibility.
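
As a rough sketch of "pick the strength first, then generate": if each
word is drawn uniformly and independently from a list of N words, it
contributes log2(N) bits, so the word count follows from the target.
Diceware itself uses physical dice and a 7776-word list; the code below
substitutes Node's `crypto.randomInt` for the dice, and `demoList` is a
tiny placeholder, not a real Diceware list. The function names are
invented for illustration.

```javascript
const crypto = require("crypto");

// Bits contributed by each word drawn uniformly and independently from a list.
function bitsPerWord(listSize) {
  return Math.log2(listSize);
}

// Smallest number of words whose combined entropy reaches targetBits.
function wordsNeeded(targetBits, listSize) {
  return Math.ceil(targetBits / bitsPerWord(listSize));
}

// Draw `count` words uniformly at random using a cryptographic RNG
// (a stand-in for rolling dice).
function generatePassphrase(wordList, count) {
  const words = [];
  for (let i = 0; i < count; i++) {
    words.push(wordList[crypto.randomInt(wordList.length)]);
  }
  return words.join(" ");
}

// Placeholder list for illustration only; a real Diceware list has 7776 entries.
const demoList = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel"];

console.log(wordsNeeded(64, 7776)); // 5 words reach 64 bits with a 7776-word list
console.log(generatePassphrase(demoList, wordsNeeded(10, demoList.length)));
```

The entropy figure here describes the generation procedure, not any
particular passphrase it happens to produce.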

rossum

asli

Dec 27, 2009, 7:57:21 AM
On Dec 27, 5:12 am, unruh <un...@wormhole.physics.ubc.ca> wrote:

Thanks a lot for your reply. That is the reason why everything gets
complicated. The link below has a strength checker. The important part
for me is the area that shows the entropy.

http://www.certainkey.com/demos/password/

I really wonder how they calculate it. The code is:

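function calcEntropy(pswd){
  // Count how many times each character code occurs in the password.
  var ai=new Array();
  for(var i=0;i<pswd.length;i++){
    var c=pswd.charCodeAt(i);
    if(ai[c]==undefined)
      ai[c]=0;
    ai[c]++;
  }
  // Sum d*log(1/d) over the relative frequency d of each character.
  entropy=0;
  for(var i=0;i<ai.length;i++){
    if(ai[i]!=undefined &&ai[i]!=0){
      var d=ai[i]/ pswd.length;
      entropy+=d * Math.log(1.0 / d);
    }
  }
  // Convert from natural-log units to bits.
  entropy /=Math.log(2);
  // Build a string from the integer part and the first two decimal digits.
  var p=entropy,v=0;
  var ret="";
  p-=v=Math.floor(p);
  p *=10;
  ret+=v+".";
  p-=v=Math.floor(p);
  p *=10;
  ret+=v;
  p-=v=Math.floor(p);
  p *=10;
  ret+=v;
  return ret;
}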


Thanks a lot, rossum. I will check Diceware and comment as soon as
possible.

Greets,
ASLI

unruh

Dec 27, 2009, 1:58:16 PM
On 2009-12-27, rossum <ross...@coldmail.com> wrote:
> On Sat, 26 Dec 2009 18:52:53 -0800 (PST), asli <koks...@gmail.com>
> wrote:
>
>>Hello all,
>>
>>I want to calculate the strength of the password. But my question is
>>related to the entropy of the characters.
> You might find it easier to pick the strength that you require and
> then generate a password/passphrase with that amount of entropy.

Or you could try wgen (www.theory.physics.ubc.ca/wgen/wgen.c), a crypto
password generator that generates "English" (or whatever language you
choose) type words (i.e., they seem to follow the pronunciation style of
English) with an entropy estimate. It uses a dictionary to derive the
trigram and quadrigram frequencies of the letters in the words, and then
randomly generates strings of letters with the same frequencies, together
with an estimate of the probability of getting that particular string of
letters if one generated those lists many, many times.
By default it uses /usr/share/dict/words on Linux.
Any large English word list would do.
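
The general idea can be sketched in a few lines. This is not wgen's
code: it uses bigrams instead of the trigrams and quadrigrams mentioned
above, the function names and `demoWords` are invented, and the tiny
word list stands in for something like /usr/share/dict/words. Along
with each generated string it reports -log2 of the probability of
producing exactly that string under the model.

```javascript
const crypto = require("crypto");

// Count, for each letter (and "^" for word start), how often each next
// letter (or "$" for word end) follows it in the word list.
function buildBigramCounts(words) {
  const counts = new Map();
  for (const word of words) {
    const symbols = ["^", ...word.toLowerCase(), "$"];
    for (let i = 0; i + 1 < symbols.length; i++) {
      const prev = symbols[i];
      const next = symbols[i + 1];
      if (!counts.has(prev)) counts.set(prev, new Map());
      const row = counts.get(prev);
      row.set(next, (row.get(next) || 0) + 1);
    }
  }
  return counts;
}

// Sample one successor of `prev` in proportion to its count. Returns the
// symbol and the probability with which it was chosen.
function sampleNext(counts, prev) {
  const row = counts.get(prev);
  let total = 0;
  for (const n of row.values()) total += n;
  let r = crypto.randomInt(total);
  for (const [symbol, n] of row) {
    if (r < n) return { symbol, probability: n / total };
    r -= n;
  }
  throw new Error("unreachable: r is always below the total count");
}

// Generate one pseudo-word and the surprisal (in bits) of that exact
// string under the bigram model. Generation stops at "$" or at maxLen.
function generateWord(counts, maxLen = 12) {
  let prev = "^";
  let word = "";
  let bits = 0;
  while (word.length < maxLen) {
    const { symbol, probability } = sampleNext(counts, prev);
    bits += -Math.log2(probability);
    if (symbol === "$") break;
    word += symbol;
    prev = symbol;
  }
  return { word, bits };
}

// Placeholder training list for illustration only.
const demoWords = ["password", "entropy", "random", "letter", "string", "cipher"];
console.log(generateWord(buildBigramCounts(demoWords)));
```

With such a small training list the output has very little entropy; a
large word list is what makes the estimate meaningful.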

Ilmari Karonen

Dec 28, 2009, 11:02:34 PM
On 2009-12-27, asli <koks...@gmail.com> wrote:
> On Dec 27, 5:12 am, unruh <un...@wormhole.physics.ubc.ca> wrote:
>> On 2009-12-27, asli <koksa...@gmail.com> wrote:
>>
>> > I want to calculate the strength of the password. But my question is
>> > related to the entropy of the characters. So I have the program that
>> > calculates the frequencies of the symbols, single character, bigrams,
>> > word starting and ending chars.
>>
>> > I want to calculate the entropy of the given password based on these
>> > character probabilities.
>>
>> > I know that the entropy is defined as:
>> > H(X)= - Sum [P(x_i).logP(x_i) ]
>> > for a random variable X, with n, outcomes { x_i : i = 1,... ,n}.
>>
>> > If I want to calculate the entropy of a single character, how will I
>> > use this formula?

As unruh noted, entropy in this sense ("Shannon entropy") is a
property of a probability distribution. It does not make sense to
talk about the entropy of a single, fixed value (except to state that
it is zero, which is technically true, if trivial).

When we speak of "the entropy of a password", that's really shorthand
for the entropy of the probability distribution according to which the
password was randomly chosen. That shorthand makes little sense for
user-chosen passwords, since we generally cannot know the distribution
according to which a given user chooses their passwords.
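
To illustrate the point that entropy belongs to a distribution rather
than to a value, here is a small sketch. The function name
`shannonEntropy` and the example distributions are made up for
illustration; probabilities are assumed to sum to 1.

```javascript
// Shannon entropy in bits of a discrete distribution given as an array of
// probabilities. Zero-probability outcomes contribute nothing.
function shannonEntropy(probabilities) {
  let bits = 0;
  for (const p of probabilities) {
    if (p > 0) bits -= p * Math.log2(p);
  }
  return bits;
}

// A single fixed value: one outcome with probability 1.
console.log(shannonEntropy([1]));

// A uniform choice among four outcomes: log2(4) bits.
console.log(shannonEntropy([0.25, 0.25, 0.25, 0.25]));

// A skewed choice among the same four outcomes has less entropy.
console.log(shannonEntropy([0.7, 0.1, 0.1, 0.1]));
```

A password chosen uniformly from N possibilities has log2(N) bits by
this definition; for a user-chosen password the array of probabilities
is exactly what is unknown.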

[snip]


> Thanks a lot for your reply. That is the reason why everything gets
> complicated. If you check the below link, there exists a strength
> checker. The important part for me is the area that shows the entropy.
>
> http://www.certainkey.com/demos/password/
>
> I really wonder how they calculate it. The code is:
>
> function calcEntropy(pswd){
> var ai=new Array();
> for(var i=0;i<pswd.length;i++){
> var c=pswd.charCodeAt(i);
> if(ai[c]==undefined)
> ai[c]=0;
> ai[c]++;
> }
> entropy=0;
> for(var i=0;i<ai.length;i++){
> if(ai[i]!=undefined &&ai[i]!=0){
> var d=ai[i]/ pswd.length;
> entropy+=d * Math.log(1.0 / d);
> }
> }
> entropy /=Math.log(2);

What this code calculates, if I'm reading it correctly, is the entropy
of picking a single random character from the password. (The rest,
which I snipped, just seems to truncate the result to two decimal
places, Rube Goldberg style. It could all be replaced with a simple
"return entropy.toFixed(2);" statement, apart from toFixed rounding
rather than truncating.)

Anyway, I wouldn't consider this method at all useful as an indicator
of password strength. For example, it returns the same value for both
"abcdefghijklmnopqrstuvwxyz" and "poskvlqbtacynmxwfgirdjuhze", even
though the latter is obviously a stronger password.
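
The comparison is easy to check with a compact rewrite of the same
calculation. This is a sketch rather than the site's code: the name
`charFrequencyEntropy` is invented, and it iterates by code point where
the quoted code uses charCodeAt, which makes no difference for ASCII
input.

```javascript
// Shannon entropy (bits) of the character frequencies within one string:
// the quantity the quoted calcEntropy computes before formatting it.
function charFrequencyEntropy(s) {
  const counts = new Map();
  for (const ch of s) counts.set(ch, (counts.get(ch) || 0) + 1);
  const length = [...s].length;
  let bits = 0;
  for (const n of counts.values()) {
    const p = n / length;
    bits += p * Math.log2(1 / p);
  }
  return bits;
}

const ordered = "abcdefghijklmnopqrstuvwxyz";
const shuffled = "poskvlqbtacynmxwfgirdjuhze";
console.log(charFrequencyEntropy(ordered).toFixed(2));  // "4.70"
console.log(charFrequencyEntropy(shuffled).toFixed(2)); // "4.70"
```

Both strings use each of the 26 letters exactly once, so the result is
log2(26) in either case; the order of the characters never enters the
calculation.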

--
Ilmari Karonen
To reply by e-mail, please replace ".invalid" with ".net" in address.
