OpenOffice encryption insecure? (repost)

Markus Jansson

Feb 14, 2005, 1:46:03 PM

I posted this issue before, but only received a couple of answers. The way I
see it, however, this is a very important issue. I hope someone can give
a good and clear answer to this one...

Here they talk about OpenOffice encryption
http://xml.openoffice.org/servlets/ReadMsg?list=dev&msgId=80006

And here is the issue that bothers me:

It's Blowfish-128-CFB with key information generated from the passphrase.
However, there is one possible vulnerability.

With CFB, the IV must be unique for every encryption, but the key does
not need to be. If the key and the IV are the same, it is easy to
recover plaintext. Now, in the case of OpenOffice, they say: "I suggest
to use a Password Based Key Derivation Function (PBKDF2 from PKCS #5,
RFC 2898) to generate the necessary keymaterial for the Key (16
bytes) and IV (8 bytes) given the user's password..." If this is the way
it has been implemented in OpenOffice, then there is a serious security
vulnerability: if the key and IV are created from the user's
passphrase, then using the same passphrase on two different
documents creates the same key and IV for both of them, making them both
breakable!

--
My computer security & privacy related homepage
http://www.markusjansson.net
Use HushTools or GnuPG/PGP to encrypt any email
before sending it to me to protect our privacy.

Peter Pearson

Feb 14, 2005, 4:05:56 PM

Markus Jansson wrote:

[snip]

> It's Blowfish-128-CFB with key information generated from the passphrase.
> However, there is one possible vulnerability.
>
> With CFB, the IV must be unique on every encryption, but the key does
> not need to be. If the key and the IV are the same, it is easy to
> recover plaintext.

I think you exaggerate. Identical IVs with identical plaintexts give
identical ciphertexts, which is a weakness because it leaks information
to an adversary, namely that the plaintexts were identical. It does
not in general make it easy to recover plaintext.

Also, once the two plaintexts diverge, the ciphertext soon becomes
just as inscrutable as if the IVs had been different. There will be
a stretch in which the XOR of the ciphertexts equals the XOR of the
plaintexts -- a very bad thing -- but the length of this stretch
varies, depending on the CFB implementation, between 1 bit and the
blocksize of the cipher.
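
The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_block_encrypt` is a hypothetical stand-in for Blowfish built from HMAC-SHA1 (the standard library has no Blowfish), and the mode shown is full-block CFB, which is the case where the leaking stretch is a whole block.

```python
import hashlib
import hmac

BLOCK = 8  # Blowfish has a 64-bit (8-byte) block


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for Blowfish (assumption): a keyed function on 8-byte blocks.

    CFB only uses the cipher in the encrypt direction, so a keyed
    pseudorandom function is enough to show the structure.
    """
    return hmac.new(key, block, hashlib.sha1).digest()[:BLOCK]


def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Full-block CFB: C[i] = P[i] XOR E(key, C[i-1]), with C[-1] = IV."""
    feedback = iv
    out = bytearray()
    for start in range(0, len(plaintext), BLOCK):
        keystream = toy_block_encrypt(key, feedback)
        chunk = plaintext[start:start + BLOCK]
        feedback = bytes(p ^ k for p, k in zip(chunk, keystream))
        out += feedback
    return bytes(out)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key, iv = b"K" * 16, b"I" * 8  # the same key and IV for both documents
p1 = b"HEADER.." + b"AAAAAAAA" + b"tail-one"  # shared | differing | differing
p2 = b"HEADER.." + b"BBBBBBBB" + b"tail-two"
c1 = cfb_encrypt(key, iv, p1)
c2 = cfb_encrypt(key, iv, p2)

# Block 0: identical plaintext gives identical ciphertext.
same_prefix = c1[:8] == c2[:8]
# Block 1: first differing block, same keystream, so the XORs match.
leaks_xor = xor(c1[8:16], c2[8:16]) == xor(p1[8:16], p2[8:16])
# Block 2: the feedback blocks now differ, so the keystreams diverge.
still_leaks = xor(c1[16:], c2[16:]) == xor(p1[16:], p2[16:])
print(same_prefix, leaks_xor, still_leaks)
```

With these inputs the first two checks hold and the third does not, which matches the claim that the damage is confined to the matching prefix plus one block.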

[snip]
> ... since if the key and IV are created from the user
> passphrase, that means that using the same passphrase on two different
> documents creates the same key and IV for both of them, making them both
> breakable!!!

Your concern is legitimate, but the vulnerability is limited to the
attacker's being able to recognize how far the two plaintexts match,
and at best to determine one block of bytes where they begin to differ.

Experimenting with OpenOffice under Linux, I find that saving
the same file twice with the same "password" results in very
different ciphertexts, strongly suggesting that there is some
source of variability other than the password. Perhaps a timestamp
is prefixed to the plaintext before encryption, which would be
just as effective as changing the IV. (On the other hand, my lame
test doesn't eliminate the possibility that only the low-order
byte of the timestamp is used, so that in a few dozen trials
I would probably find a pair of matching ciphertexts.)

--
Peter Pearson
To get my email address, substitute:
nowhere -> spamcop, invalid -> net

Henrick Hellström

Feb 14, 2005, 5:17:56 PM

Markus Jansson wrote:

> With CFB, the IV must be unique on every encryption, but the key does
> not need to be. If the key and the IV are the same, it is easy to
> recover plaintext. Now, in the case of OpenOffice, they say:"I suggest
> to use a Password Based Key Derivation Function (PBKDF2 from PKCS #5,
> RFC 2898) to generate the necessary keymaterial for the Key (16
> bytes) and IV (8 bytes) given the user's password..." If this is the way
> it has been implemented in OpenOffice, then there is serious security
> vulnerability, since if the key and IV are created from the user
> passphrase, that means that using the same passphrase on two different
> documents creates the same key and IV for both of them, making them both
> breakable!!!

You are wrong on two counts.

Firstly, deriving both the key and the IV from a passphrase does *not*
result in the kind of vulnerability you describe in the second sentence
above. You do *not* get an IV identical to the key if you use PBKDF2 for
generating 192 bits you parse as a 128 bit key and a 64 bit IV. Finding
any kind of correlation between the key and the IV generated this way,
is at least as hard as finding a preimage for the underlying PRF (which
is HMAC-SHA1).

Secondly, I would guess that the key and IV are not derived from the
passphrase alone, but from the combination of a passphrase and a random
per-message salt/nonce (since this is how PBKDF2 is specified in PKCS#5).
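
A minimal sketch of both points, using Python's `hashlib.pbkdf2_hmac`. The function name `derive_key_and_iv`, the 16-byte salt length and the iteration count of 1024 are assumptions made for illustration; only the 192-bit output split into a 128-bit key and a 64-bit IV comes from the text above.

```python
import hashlib
import os


def derive_key_and_iv(passphrase: bytes, salt: bytes, iterations: int = 1024):
    """Derive 24 bytes with PBKDF2-HMAC-SHA1 and split them into key and IV."""
    material = hashlib.pbkdf2_hmac("sha1", passphrase, salt, iterations, dklen=24)
    return material[:16], material[16:]  # 128-bit key, 64-bit IV


# Same passphrase, two documents, each with its own random salt (assumed).
salt_a = os.urandom(16)
salt_b = os.urandom(16)
key_a, iv_a = derive_key_and_iv(b"same passphrase", salt_a)
key_b, iv_b = derive_key_and_iv(b"same passphrase", salt_b)

print(len(key_a), len(iv_a))         # 16 8
print((key_a, iv_a) == (key_b, iv_b))
```

The key and IV are different slices of the PBKDF2 output, and with a per-document salt the same passphrase should not yield the same key/IV pair twice.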

Daniel Vogelheim

Feb 14, 2005, 6:05:26 PM

Hi Markus,

Markus Jansson wrote:
>Here they talk about OpenOffice encryption
>http://xml.openoffice.org/servlets/ReadMsg?list=dev&msgId=80006
>
>And here is the issue that bothers me:
>
>
>It's Blowfish-128-CFB with key information generated from the passphrase.
>However, there is one possible vulnerability.
>
>With CFB, the IV must be unique on every encryption, but the key does
>not need to be. If the key and the IV are the same, it is easy to
>recover plaintext. Now, in the case of OpenOffice, they say:"I suggest
>to use a Password Based Key Derivation Function (PBKDF2 from PKCS #5,
>RFC 2898) to generate the necessary keymaterial for the Key (16
>bytes) and IV (8 bytes) given the user's password..."

Well, it's Open Source, so we can have a look: In the OOo source, IV
and salt are both initialized from a 'random pool'; the key is
generated from salt + password using 'PBKDF2'. So everything should be
fine (assuming the 'random pool' and 'PBKDF2' do what their names
suggest).

(The cited message presumably doesn't accurately reflect the
implementation since it was written before said implementation was
finished.)

>If this is the way
>it has been implemented in OpenOffice, then there is serious security
>vulnerability, since if the key and IV are created from the user
>passphrase, that means that using the same passphrase on two different
>documents creates the same key and IV for both of them, making them both
>breakable!!!

Due to the random salt in PBKDF2 that would not have been the case
anyway.

Hope this helps.

Sincerely,
Daniel Vogelheim

Markus Jansson

Feb 14, 2005, 6:35:09 PM

Henrick Hellström wrote:
> Firstly, deriving both the key and the IV from a passphrase does *not*
> result in the kind of vulnerability you describe in the second sentence
> above. You do *not* get an IV identical to the key if you use PBKDF2 for
> generating 192 bits you parse as a 128 bit key and a 64 bit IV.

IF. But to my understanding, OpenOffice does NOT do this. It just
creates the IV and key from passphrases directly.

> Secondly, I would guess that the key and IV are not derived from the
> passphrase alone, but from the combination of a passphrase and a random
> per-message salt/nonce (since this is how PBKDF2 is specified in PKCS#5).

From what I found, there is no record of anything such. Of course, this
is the way it SHOULD BE done. Simply take some 256 bits of salt and use
that + passphrase to create the key and IV, and voila! But is it done? Where
is the salt created from, and how? Is it random?

Personally, I'm really surprised that the encryption in OpenOffice is so
poorly documented.

--
My computer security & privacy related homepage

Markus Jansson

Feb 14, 2005, 6:39:11 PM

Daniel Vogelheim wrote:
> (The cited message presumably doesn't accurately reflect the
> implementation since it was written before said implementation was
> finished.)

Exactly my point. Where is the documentation about this issue? :( Has
anyone actually checked the source and encryption systems in OO?

This just reminds me of Microsoft's total mess-up of encryption in OfficeXP
(it's in Finnish; maybe someone can find a proper English translation of
this one?). It's a very similar issue: if you encrypt two OfficeXP
documents with the same passphrase, it's much, much, much easier to crack
them open.
http://www.digitoday.fi/showPage.php?page_id=14&news_id=39810

--
My computer security & privacy related homepage

David Wagner

Feb 14, 2005, 6:46:09 PM

Peter Pearson wrote:

>Markus Jansson wrote:
>> ... since if the key and IV are created from the user
>> passphrase, that means that using the same passphrase on two different
>> documents creates the same key and IV for both of them, making them both
>> breakable!!!
>
>Your concern is legitimate, but the vulnerability is limited to the
>attacker's being able to recognize how far the two plaintexts match,
>and at best to determine one block of bytes where they begin to differ.

There is another concern with deriving both the IV and the key as a
function of the password: such a system may be susceptible to time-space
precomputation attacks (e.g., Hellman's time-space tradeoff). These
attacks allow an adversary to do a lengthy one-time precomputation,
and thereafter the process of cracking of each individual password is
dramatically sped up.

>Experimenting with OpenOffice under Linux, I find that saving
>the same file twice with the same "password" results in very
>different ciphertexts, strongly suggesting that there is some
>source of variability other than the password. Perhaps a timestamp
>is prefixed to the plaintext before encryption, which would be
>just as effective as changing the IV. (On the other hand, my lame
>test doesn't eliminate the possibility that only the low-order
>byte of the timestamp is used, so that in a few dozen trials
>I would probably find a pair of matching ciphertexts.)

Ah-hah. That is a useful insight, and one that would need to be
investigated further before one could assess the risk of time-space
tradeoffs.

David Wagner

Feb 14, 2005, 6:54:24 PM

Daniel Vogelheim wrote:
>Well, it's Open Source, so we can have a look: In the OOo source, IV
>and salt are both initialized from a 'random pool'; the key is
>generated from salt + password using 'PBKDF2'.

That's helpful; thanks. Now where does the salt come from?
Is it predictable? Is it from a cryptographically secure source?

Daniel Vogelheim

Feb 14, 2005, 6:59:30 PM

Hello Markus,

Markus Jansson wrote:
>Daniel Vogelheim wrote:
>> (The cited message presumably doesn't accurately reflect the
>> implementation since it was written before said implementation was
>> finished.)
>
>Exactly my point.

Errr, I much rather understood your point to be a suspicion about a
specific weakness in the file format implementation, based on an older
message to one of the mailing lists.

>Where is the documentation about this issue? :(

In the "OpenOffice.org XML File Format Specification", Chapter 11.3
(11. Package Format; 11.3 Encryption):

Quote:
--------------------
11.3 Encryption

The encryption process takes place in the following multiple stages:
1. A 20-byte SHA1 digest of the user entered password is created and
passed to the package component.
2. The package component initializes a random number generator with
the current time.
3. The random number generator is used to generate a random 8-byte
initialization vector and 16-byte salt for each file.
4. This salt is used together with the 20-byte SHA1 digest of the
password to derive a unique 128-bit key for each file. The algorithm
used to derive the key is the PBKDF2 (see RFC 2898) with an iteration
count of 1024.
5. The derived key is used together with the initialisation vector to
encrypt the file using the Blowfish algorithm in cipher-feedback (CFB)
mode.
----------------------

Which appears to be exactly what is implemented...
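
Steps 1 to 4 of the quoted section can be sketched roughly as follows. This is a reading of the specification text, not the OOo code: `file_encryption_params` is a hypothetical name, Python's `random.Random` stands in for whatever generator the package component uses, HMAC-SHA1 is assumed as the PBKDF2 PRF, and UTF-8 is assumed for the password encoding. Step 5 (Blowfish in CFB mode) is left out because the Python standard library has no Blowfish.

```python
import hashlib
import random
import time


def file_encryption_params(password: str, rng: random.Random) -> dict:
    """Per-file IV, salt and key, following steps 1, 3 and 4 of the quote."""
    # Step 1: 20-byte SHA-1 digest of the user-entered password.
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    # Step 3: 8-byte IV and 16-byte salt for this file.
    iv = bytes(rng.getrandbits(8) for _ in range(8))
    salt = bytes(rng.getrandbits(8) for _ in range(16))
    # Step 4: PBKDF2 with 1024 iterations, 128-bit (16-byte) key.
    key = hashlib.pbkdf2_hmac("sha1", digest, salt, 1024, dklen=16)
    return {"iv": iv, "salt": salt, "key": key}


# Step 2: one generator per package, seeded with the current time.
rng = random.Random(time.time())
content = file_encryption_params("secret", rng)
styles = file_encryption_params("secret", rng)

print(content["salt"] != styles["salt"], content["key"] != styles["key"])
```

Each file in the package gets its own salt and IV, so the same password leads to a different key per file. How unpredictable those values are depends entirely on the generator and its seed.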

>Has anyone actually checked the source and encryption systems in OO?

Well, for the specific issue you mentioned: I did, ca. an hour ago. I
didn't bother to check the PBKDF2 or random pool implementations,
however.

>This just reminds me of Microsoft's total mess-up of encryption in OfficeXP
>(it's in Finnish; maybe someone can find a proper English translation of
>this one?). It's a very similar issue: if you encrypt two OfficeXP
>documents with the same passphrase, it's much, much, much easier to crack
>them open. [...]

Uh oh. VERY different story. The OOo method is 1) fully documented, 2)
standards based, 3) fully Open Source. Just witness your question and
the time frame in which you received competent answers from different
people! Really, a very different story.

Sincerely,
Daniel Vogelheim

Markus Jansson

Feb 14, 2005, 8:01:00 PM

Daniel Vogelheim wrote:
> Errr, I much rather understood your point to be a suspicion about a
> specific weakness in the file format implementation, based on an older
> message to one of the mailing lists.

File format implementation?

>>Where is the documentation about this issue? :(
>
> In the "OpenOffice.org XML File Format Specification", Chapter 11.3
> (11. Package Format; 11.3 Encryption):

You made it sound very easy to find. Yet I couldn't find it anywhere.
Thank you. :)

> Which appears to be exactly what is implemented...

Then there apparently is no problem with the encryption itself, right?

> Uh oh. VERY different story. The OOo method is 1) fully documented, 2)
> standards based, 3) fully Open Source. Just witness your question and
> the time frame in which you received competent answers from different
> people! Really, a very different story.

Well, not in my opinion. They both have (or at least OfficeXP has) a
problem related to the key and IV information being derived directly
from the passphrase.

Mark Borgerding

Feb 14, 2005, 10:43:06 PM

I'm curious what you mean by "cryptographically secure".

Do you mean uncontrollable by an attacker, i.e. tamper-proof? Or
cryptographically strong (high entropy, unbiased, intractable to
predict, etc.)?

If you mean the former, I see your concern. If the latter, can you
expand on your concern for the strength of the IV and salt?

-- Mark

David Wagner

Feb 14, 2005, 10:58:29 PM

Mark Borgerding wrote:
>David Wagner wrote:
>> Daniel Vogelheim wrote:
>>>Well, it's Open Source, so we can have a look: In the OOo source, IV
>>>and salt are both initialized from a 'random pool'; the key is
>>>generated from salt + password using 'PBKDF2'.
>>
>> That's helpful; thanks. Now where does the salt come from?
>> Is it predictable? Is it from a cryptographically secure source?
>
>I'm curious what you mean by "cryptographically secure".
>
>Do you mean uncontrollable by an attacker, i.e. tamper-proof? Or
>cryptographically strong ( high entropy, unbiased, intractable to
>predict, etc. ).

I had in mind the latter. But both questions seem relevant.
Do you know the answer to either question?

>If you mean the former, I see your concern. If the latter, can you
>expand on your concern for the strength of the IV and salt?

It is relevant to whether precomputation attacks are possible.
If the salt is predictable, then precomputation attacks may be
possible. If the salt is unpredictable and has enough entropy,
then I believe that will defeat precomputation attacks.
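
A toy illustration of why the salt matters for precomputation. It uses a plain lookup table rather than Hellman's actual time-space tradeoff, and for brevity the table is indexed by the derived key; a real attack would index by something the attacker can observe, such as the ciphertext of a known header. `WORDLIST`, `FIXED_SALT` and the function names are made up for the example.

```python
import hashlib
import os

ITERATIONS = 1024
WORDLIST = ["letmein", "password", "openoffice", "blowfish"]  # hypothetical
FIXED_SALT = bytes(16)  # models a constant or predictable salt


def derive(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha1", password.encode(), salt, ITERATIONS, dklen=16)


# One-time precomputation, reusable against every document with this salt.
table = {derive(word, FIXED_SALT): word for word in WORDLIST}


def lookup(key: bytes):
    """Return the password for a key if it was precomputed, else None."""
    return table.get(key)


predictable = lookup(derive("blowfish", FIXED_SALT))
unpredictable = lookup(derive("blowfish", os.urandom(16)))
print(predictable, unpredictable)
```

With the fixed salt the lookup finds the password immediately; with a fresh random salt the precomputed table is of no use, and the attacker would need a separate table per salt value.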

Bill Unruh

Feb 15, 2005, 2:41:10 AM

Markus Jansson <seemyh...@katsokotisivuilta.ni> writes:

>I posted this issue before, but only received couple answers. The way I
>see it, however, this is very important issue. I hope someone could give
>out good and clear answer to this one...

>Here they talk about OpenOffice encryption
>http://xml.openoffice.org/servlets/ReadMsg?list=dev&msgId=80006

>And here is the issue that bothers me:

>It's Blowfish-128-CFB with key information generated from the passphrase.
>However, there is one possible vulnerability.

>With CFB, the IV must be unique on every encryption, but the key does
>not need to be. If the key and the IV are the same, it is easy to
>recover plaintext. Now, in the case of OpenOffice, they say:"I suggest
>to use a Password Based Key Derivation Function (PBKDF2 from PKCS #5,
>RFC 2898) to generate the necessary keymaterial for the Key (16
>bytes) and IV (8 bytes) given the user's password..." If this is the way
>it has been implemented in OpenOffice, then there is serious security
>vulnerability, since if the key and IV are created from the user
>passphrase, that means that using the same passphrase on two different
>documents creates the same key and IV for both of them, making them both
>breakable!!!

Well, test it. take the same file. Encrypt it twice with the same password.
Is the encryption the same?
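
One way to run that test, sketched in Python on the assumption that the two saved documents are ZIP packages (as OpenOffice.org files are). `differing_members` is a hypothetical helper, and the two in-memory packages below are dummies that only exercise it; they are not real encrypted documents.

```python
import io
import zipfile


def differing_members(package_a, package_b):
    """Names of members present in both ZIP packages whose bytes differ."""
    with zipfile.ZipFile(package_a) as za, zipfile.ZipFile(package_b) as zb:
        common = sorted(set(za.namelist()) & set(zb.namelist()))
        return [name for name in common if za.read(name) != zb.read(name)]


def make_dummy_package(content: bytes) -> io.BytesIO:
    """Build a tiny in-memory ZIP with a fixed member and a varying one."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", b"application/vnd.sun.xml.writer")
        package.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


result = differing_members(make_dummy_package(b"ciphertext one"),
                           make_dummy_package(b"ciphertext two"))
print(result)  # ['content.xml']
```

For the real experiment, pass the paths of the two saved files instead of the dummy packages. If the encrypted streams differ on every save, something other than the password is feeding the encryption.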

Henrick Hellström

Feb 15, 2005, 2:48:49 AM

Markus Jansson wrote:

> Henrick Hellström wrote:
>
>> Firstly, deriving both the key and the IV from a passphrase does *not*
>> result in the kind of vulnerability you describe in the second
>> sentence above. You do *not* get an IV identical to the key if you use
>> PBKDF2 for generating 192 bits you parse as a 128 bit key and a 64 bit
>> IV.
>
>
> IF. But for my understanding, OpenOffice does NOT do this. It just
> creates the IV and key from passphrases directly.

Exactly what do you mean by "creates the IV and key from passphrases
directly"? Exactly how would such a function be consistent with the
paragraph you quoted?

>> Secondly, I would guess that the key and IV are not derived from the
>> passphrase alone, but from the combination of a passphrase and a
>> random per-message salt/nonce (since this is how PBKDF2 is specified
>> in PKCS#5).
>
>
> From what I found, there is no record of anything such.

There is in PKCS#5. If they say that they implement PKCS#5 I would
expect them to follow the specification and all applicable
recommendations contained within the document.

Henrick Hellström

Feb 15, 2005, 2:52:52 AM

David Wagner wrote:

> There is another concern with deriving both the IV and the key as a
> function of the password: such a system may be susceptible to time-space
> precomputation attacks (e.g., Hellman's time-space tradeoff). These
> attacks allow an adversary to do a lengthy one-time precomputation,
> and thereafter the process of cracking of each individual password is
> dramatically sped up.

Even in the case of PKCS#5 PBKDF2? Did I misunderstand?

tomst...@gmail.com

Feb 15, 2005, 2:58:35 AM

I think he was strictly talking about the case without a salt. The
"space" requirement grows considerably when a salt [of decent length]
has been used.

Tom

Paul Rubin

Feb 15, 2005, 3:04:15 AM

Probably. Consider the case where there's so little entropy that there
are only three possible salts.

Henrick Hellström

Feb 15, 2005, 3:22:17 AM

Paul Rubin wrote:

That would not conform with PKCS#5. Quoted from section 4.1.:

"If there is no concern about interactions between multiple uses of
the same key (or a prefix of that key) with the password-based
encryption and authentication techniques supported for a given
password, then the salt may be generated at random and need not be
checked for a particular format by the party receiving the salt. It
should be at least eight octets (64 bits) long.

Otherwise, the salt should contain data that explicitly distinguishes
between different operations and different key lengths, in addition
to a random part that is at least eight octets long, and this data
should be checked or regenerated by the party receiving the salt. For
instance, the salt could have an additional non-random octet that
specifies the purpose of the derived key. Alternatively, it could be
the encoding of a structure that specifies detailed information about
the derived key, such as the encryption or authentication technique
and a sequence number among the different keys derived from the
password. The particular format of the additional data is left to the
application."

Markus Jansson

Feb 15, 2005, 7:59:13 AM

Bill Unruh wrote:
> Well, test it. take the same file. Encrypt it twice with the same password.
> Is the encryption the same?

It might not be the same, but with bad luck the IVs might differ only
slightly (if poorly implemented), which means it could still be cracked
by brute force etc.

David Wagner

Feb 15, 2005, 12:31:01 PM

tomst...@gmail.com wrote:

>Henrick Hellström wrote:
>> Even in the case of PKCS#5 PBKDF2? Did I misunderstand?
>
>I think he was strictly talking about the case without a salt.

Yes, sorry, that was what I was referring to.
(Or, equivalently, where the salt is predictable or constant across
all files encrypted by a single user.)

Daniel Vogelheim

Feb 15, 2005, 7:39:53 PM

Hello David,

The salt and IV come from said 'random pool' which, as far as I
understand the code, works as follows:

The 'random pool' is essentially a buffer with two operations: add
random bytes, and read random bytes. For encryption, first a timestamp
is added to the pool, and then the salt and IV are retrieved. The
timestamp is the only source of entropy I can see. As this is probably
rather predictable, I assume the aim was more for uniqueness than
actual randomness. During both adding and retrieving bytes from the
pool, a digest (MD5) is run on the pool's buffer and XORed into the
data stream.

Admittedly, from just looking at the code I'm having a hard time
figuring out exactly what is being digested when. If I understand
things correctly, MD5 is run on and XORed to the buffer at least three
times, namely after adding the timestamp, and each time before getting
IV and salt.

Basically, the call sequence is:
- rtl_random_createPool - create 'random pool'
- rtl_random_addBytes( <timestamp> ) - add timestamp for randomness
- <salt> = rtl_random_getBytes( ) - get salt, 16 bytes
- <iv> = rtl_random_getBytes( ) - get IV, 8 bytes
- <key> = rtl_digest_PBKDF2( <salt>, <password>, 1024 iterations ) - generate key
- ... encrypt data using <iv> and <key>
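
To make the "uniqueness rather than randomness" point concrete, here is a toy model of a pool whose only input is a timestamp. It is not the rtl_random code: `ToyPool`, its MD5-based mixing and the 64-bit packing of the timestamp are all invented for the sketch, and only the order of operations follows the call sequence above.

```python
import hashlib
import struct


class ToyPool:
    """Hypothetical pool: MD5-mixed state with add and read operations."""

    def __init__(self) -> None:
        self._state = bytes(16)

    def add_bytes(self, data: bytes) -> None:
        self._state = hashlib.md5(self._state + data).digest()

    def get_bytes(self, count: int) -> bytes:
        out = b""
        while len(out) < count:
            out += hashlib.md5(self._state + b"out").digest()
            self._state = hashlib.md5(self._state + b"next").digest()
        return out[:count]


def salt_and_iv(timestamp: int):
    """Create a pool, add the timestamp, then read a 16-byte salt and 8-byte IV."""
    pool = ToyPool()
    pool.add_bytes(struct.pack(">Q", timestamp))
    salt = pool.get_bytes(16)
    iv = pool.get_bytes(8)
    return salt, iv


first = salt_and_iv(1108500000)
again = salt_and_iv(1108500000)
later = salt_and_iv(1108500001)
print(first == again, first == later)  # True False
```

In this model, anyone who knows or guesses the timestamp can reproduce the salt and IV, while two saves at different times still get different values.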

If you want to examine the code yourself, the 'random pool' code can
be found at:
http://porting.openoffice.org/source/browse/porting/sal/rtl/source/random.c?rev=1.4&content-type=text/vnd.viewcvs-markup

Sincerely,
Daniel Vogelheim

David Wagner

Feb 15, 2005, 8:33:22 PM

Daniel Vogelheim wrote:
>The 'random pool' is essentially a buffer with two operations: add
>random bytes, and read random bytes. For encryption, first a timestamp
>is added to the pool, and then the salt and IV are retrieved. The
>timestamp is the only source of entropy I can see. As this is probably
>rather predictable, I assume the aim was more for uniqueness than
>actual randomness.

Cool. Thanks. Actually, uniqueness may be sufficient to deter
precomputation attacks, as long as there are enough possible timestamp
values. I can't think of any precomputation attack that would exploit
this in any practical way.

For non-precomputation attacks, though, you really would like to have
unguessable randomness. A timestamp is not ideal for this purpose.

The best attack I can think of is a straightforward dictionary attack.
Guess the pair (timestamp, password), and do trial decryption (or check
the inferred salt value) for each guess. Each guess costs 1024 PBKDF2
iterations, so there will be a marked slowdown.

If people use passwords with a huge amount of entropy, the dictionary
attack is defeated. On the other hand, few people can remember passwords
of sufficient entropy. Perhaps such a dictionary attack would succeed on
a non-negligible fraction of documents, given sufficient computing power.
For instance, if we imagine some people will be using fairly low entropy
passwords (say, 20 bits) combined with timestamps that don't leave the
attacker with much uncertainty (perhaps 10 bits?), then we're getting
somewhere in the vicinity of 2^40 hash operations to decrypt such
a document. That's not great.
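
The arithmetic behind that estimate, written out as a small Python sketch. It counts each PBKDF2 iteration as one hash operation, as the estimate above does; the function name and the attacker speed are hypothetical.

```python
import math


def dictionary_attack_cost(password_bits: int, timestamp_bits: int,
                           iterations: int = 1024) -> int:
    """Hash operations needed to try every (timestamp, password) pair."""
    guesses = 2 ** (password_bits + timestamp_bits)
    return guesses * iterations


cost = dictionary_attack_cost(password_bits=20, timestamp_bits=10)
print(cost, math.log2(cost))  # 1099511627776 40.0

HASHES_PER_SECOND = 1_000_000  # hypothetical attacker speed
days = cost / HASHES_PER_SECOND / 86400
print(round(days, 1))
```

That is 2^30 guesses times 2^10 iterations, or 2^40 hash operations. At the assumed rate of a million hash operations per second this comes to just under 13 days for an exhaustive search.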

In general, using passwords as cryptographic keys tends to place
some pretty severe limits on the amount of cryptographic strength you
can get. For high-security applications, I'd personally tend to avoid
password-keyed cryptosystems.