SQLCipher v2 Beta

Stephen Lombardo

Feb 18, 2011, 2:12:00 PM

to sqlc...@googlegroups.com

Hi All,

Some of the SQLCipher enhancements that I've recently alluded to on this list are now available in the "v2beta" branch of the SQLCipher repository.

https://github.com/sjlombardo/sqlcipher/tree/v2beta

This beta includes the following improvements

Per page HMAC

Every database page now includes a MAC so that individual pages are non-malleable. This change prevents potential attackers who have write access to a database file from making subtle changes to an encrypted page to introduce errors or attempt attacks.

The new page format expands the reserved space so that it includes both the IV and an HMAC to authenticate the page ciphertext, IV, and page number. The HMAC key is derived from the database encryption key using a second iteration of PBKDF2, so the two keys are different. SQLCipher now checks the HMAC before decrypting each page; thus, if a page has been tampered with or resequenced, the library will immediately return an error.
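A minimal sketch of how such a reserved region could be computed, to make the layout concrete. This is not taken from the SQLCipher source: the sizes (16-byte IV, 20-byte HMAC-SHA1 tag, rounding to a 16-byte cipher block), the IV-before-HMAC ordering, and every name prefixed with sketch_ are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only (assumptions, not read from the SQLCipher source). */
#define SKETCH_IV_SZ    16  /* one AES block for the per-page IV */
#define SKETCH_HMAC_SZ  20  /* HMAC-SHA1 output length */
#define SKETCH_BLOCK_SZ 16  /* cipher block size used for rounding */

typedef struct {
    size_t reserve;     /* bytes reserved at the end of each page */
    size_t payload_sz;  /* bytes of encrypted page content */
    size_t iv_off;      /* offset of the IV within the page */
    size_t hmac_off;    /* offset of the HMAC within the page (0 if unused) */
} sketch_page_layout;

/* Lay a page out as [ciphertext][IV][HMAC][padding]. */
sketch_page_layout sketch_layout(size_t page_sz, int use_hmac)
{
    sketch_page_layout l;
    size_t raw = SKETCH_IV_SZ + (use_hmac ? SKETCH_HMAC_SZ : 0);

    /* Round the reserve up to a whole number of cipher blocks so the
       encrypted payload stays block-aligned. */
    l.reserve = ((raw + SKETCH_BLOCK_SZ - 1) / SKETCH_BLOCK_SZ) * SKETCH_BLOCK_SZ;
    assert(page_sz > l.reserve);

    l.payload_sz = page_sz - l.reserve;
    l.iv_off = l.payload_sz;
    l.hmac_off = use_hmac ? l.iv_off + SKETCH_IV_SZ : 0;
    return l;
}
```

Under these assumptions, turning the HMAC on changes the reserve size, which is why pages written with it cannot be read by a build expecting the smaller reserve.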

Note that this change is incompatible with the database format from SQLCipher 1.1.x so you can't open a database created by this version using an older version of SQLCipher. Likewise, if you want to open a 1.1.x database using this new beta build you must disable the feature using PRAGMA cipher_use_hmac = OFF immediately after setting the database key:

PRAGMA KEY = 'testkey';

PRAGMA cipher_use_hmac = OFF;

As with encrypting a new database the first time, the best way to convert from the old version to the new format permanently is to open the old database, attach a new database, and copy the data into it. There is a good example of this process in the crypto.test suite, in the test named "attach-and-copy-1.1.8".

Custom Page Sizes

This new version introduces a pragma, cipher_page_size, that you can use to adjust the page size for the encrypted database. This is quite useful for applications where a larger page size is desirable to increase performance. For instance, some of our recent testing has shown that increasing the page size can noticeably improve performance (5-30%) for certain queries that manipulate a large number of pages (e.g. selects without an index, large inserts in a transaction, big deletes).

To adjust the page size, call the pragma immediately after setting the key, both the first time and each subsequent time that you open the database, i.e.:

PRAGMA KEY = 'testkey';

PRAGMA cipher_page_size = 4096;
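Since the pragma has to be issued on every open, an application will probably build the statement from a configured value. A small sketch of that follows. SQLite itself only accepts page sizes that are powers of two from 512 through 65536; that cipher_page_size follows the same rule is an assumption here, and the sketch_ helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* SQLite page sizes must be a power of two between 512 and 65536.
   Assumption: cipher_page_size is subject to the same constraint. */
int sketch_valid_page_size(long n)
{
    return n >= 512 && n <= 65536 && (n & (n - 1)) == 0;
}

/* Write "PRAGMA cipher_page_size = N;" into buf.
   Returns 0 on success, -1 for an invalid size or a too-small buffer. */
int sketch_page_size_pragma(char *buf, size_t buf_sz, long page_size)
{
    int written;

    assert(buf != NULL);
    if (!sketch_valid_page_size(page_size))
        return -1;

    written = snprintf(buf, buf_sz, "PRAGMA cipher_page_size = %ld;", page_size);
    return (written < 0 || (size_t)written >= buf_sz) ? -1 : 0;
}
```

The resulting string would be executed right after the PRAGMA KEY statement, before any other access to the database.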

SQLite 3.7.5

The beta version is based on the latest release of SQLite recommended for new development.

Future changes

There are a few more changes planned before an official release. These are not present in the current beta, but will hopefully be added soon.

A refactoring of the extensions to separate the codec hooks used by SQLite from the encryption implementation. This will eventually make the code easier to understand and audit, and potentially allow customizations to the crypto implementation.
A library or sample code for attach-and-copy conversion to make it easier to move from an unencrypted database to an encrypted database, or between 1.1.8 databases and the new format.

If you have a few moments, please take some time to test this beta with your applications. Your feedback is most welcome. Thanks!

Cheers,

Stephen

Michael Stephenson

Feb 18, 2011, 3:21:08 PM

to sqlc...@googlegroups.com

Nice!

Haven’t looked yet, but some quick (not-well-thought-out) thoughts:

1) Have you compiled any stats on how the performance is using HMAC vs. not?

2) You say you’re using OpenSSL’s PBKDF2 for the HMAC. Is the number of rounds configurable? The whole reason for PBKDF2 is to “slow down” an attacker in reverse engineering a key. Not sure if we want to pay that high of a price for an HMAC on every page.

3) Any possibility of making the MAC configurable or adding additional options? For example, if I recall UMAC has better performance than HMAC on 32bit platforms. Or, I might want a simple CRC or single-pass hash instead of a full-blown MAC.

Stephen Lombardo

Feb 18, 2011, 4:24:11 PM

to sqlc...@googlegroups.com

Hi Michael,

Great questions, I've answered them in line below.

On Friday, February 18, 2011 at 3:21 PM, Michael Stephenson wrote:

> Nice!
> Haven’t looked yet, but some quick (not-well-thought-out) thoughts:
> 1) Have you compiled any stats on how the performance is using HMAC vs. not?

Yes, I've done some preliminary testing using a modified crypto-speedtest.tcl. For most statements in my testing the impact is around 5% or less. The impact on operations that touch a large number of pages is greater (big deletes, huge inserts), but those are hopefully less frequent. We can also compensate using larger page sizes now too.

Overall I think the performance impact is acceptable considering the security benefits, so it's enabled by default, but it can always be turned off at runtime using the pragma to get the absolute best performance.

> 2) You say you’re using OpenSSL’s PBKDF2 for the HMAC. Is the number of rounds configurable? The whole reason for PBKDF2 is to “slow down” an attacker in reverse engineering a key. Not sure if we want to pay that high of a price for an HMAC on every page.

Sorry for the confusion there, my original mail may not have been clear; the MAC is actually a garden variety HMAC-SHA1.

Since it would be undesirable to re-use the database encryption key as the HMAC key, PBKDF2 is used to derive a completely separate HMAC key from the encryption key. This happens only once, when the database is initialized for the first time. Thus, there is no per-page performance impact for PBKDF2 once the database is open, just a small performance impact for HMAC-SHA1, which is quite fast.

The number of rounds used with PBKDF2 defaults to 4000 but can be configured at runtime via "PRAGMA kdf_iter=N" immediately after the key is set.

> 3) Any possibility of making the MAC configurable or adding additional options? For example, if I recall UMAC has better performance than HMAC on 32bit platforms. Or, I might want a simple CRC or single-pass hash instead of a full-blown MAC.

I think it may be troublesome to allow the MAC to be configurable at runtime, but I'm in the process of reorganizing the code in SQLCipher so that the encryption implementation is separated from the codec code. This refactoring should make it much easier to swap out the MAC to meet specific requirements in the future.

Let me know what you think based on these clarifications.

Cheers,

Stephen

Michael Stephenson

Feb 20, 2011, 12:47:07 AM

to sqlc...@googlegroups.com

Hi Stephen,

That all sounds like great stuff.

I think you might have some complaints about lack of backward compatibility. I think you could make the MAC optional and maintain backward compatibility as long as it’s set at runtime before the codec is attached. In my personal implementation, I store the size of the MAC in the codec_ctx in the nMac member:

// this is the container for our codec state.
// not all items are necessarily used, depending on the crypto provider
typedef struct _codec_ctx
{
    int initialized;        // whether CodecInitialize should be called in sqlite3CodecAttach
    unsigned char* kdfSalt; // salt for kdf, always created, may or may not be used, FILE_HEADER_SIZE bytes
    unsigned char* key;     // encryption key
    unsigned char* newKey;  // used only during a rekey of the database
    int nKey;               // length of encryption key
    int nNewKey;
    unsigned char* buffer;  // buffer used for encrypt/decrypt, SQLITE_DEFAULT_PAGE_SIZE bytes
    int nIV;                // size of initialization vector to use on each database page (0 = don't use IVs)
    int nMac;               // size of message authentication code to use on each database page (0 = don't use MACs)
    void* pUserData;        // user-defined state, if additional state is needed
} codec_ctx;

As I’ve mentioned previously, I use a boilerplate crypto.h with callbacks to whichever cryptoimpl.h I’m using in that build of the library, but in my crypto.h, the XCodec function (which is what SQLite will call to encrypt/decrypt a page), does a calculation for the size of the encryption buffer based on whether nMac > 0 (and same for nIV > 0), so it’s somewhat flexible:

// modes 0,2,3 = decrypt, modes 6,7 = encrypt
switch(mode)
{
    // decrypt - return value is not used other than checking for null; thus, decrypted data must
    // go into pData. If CodecDecrypt returns a different result pointer than pData, we'll
    // copy the data back to pData ourselves
    case 0: // Undo a "case 7" journal file encryption
    case 2: // Reload a page
    case 3: // Load a page
        // if first page, copy file header to the first 16 bytes of pData
        if(pgNo == 1) memcpy(pData, SQLITE_FILE_HEADER, FILE_HEADER_SIZE);

        if(nIV > 0)
            pIV = pData + pgSize - nIV;
        if(nMac > 0)
            pMac = pData + pgSize - nIV - nMac;

        if((pResult = CodecDecrypt(pCtx->key, pCtx->nKey, pIV, nIV, pMac, nMac,
                pData+offset, pgSize-offset-nIV-nMac, pBuffer+offset, pCtx->pUserData)) == NULL)
            return NULL; // this will cause a crash if not caught later (e.g., by an exception handler in client code)

        if(pResult != pData+offset)
            memcpy(pData+offset, pResult, pgSize-offset-nIV-nMac);

        return pData;
        break;

    // encrypt - return value points to encrypted data. CodecEncrypt should return pointer to
    // pBuffer passed in. If it does not, we'll copy data to pBuffer ourselves
    case 6: // Encrypt a page for main database file
    case 7: // Encrypt a page for the journal file
        // if first page, save key derivation salt to first 16 bytes of the page
        if(pgNo == 1) memcpy(pBuffer, pCtx->kdfSalt, FILE_HEADER_SIZE);

        if(nIV > 0)
            pIV = pBuffer + pgSize - nIV;
        if(nMac > 0)
            pMac = pBuffer + pgSize - nIV - nMac;

        // generate a new IV for the encryption
        if(pIV)
            CodecRandomness(nIV, pIV);

        if((pResult = CodecEncrypt(pCtx->newKey ? pCtx->newKey : pCtx->key, pCtx->nKey, pIV, nIV, pMac, nMac,
                pData+offset, pgSize-offset-nIV-nMac, pBuffer+offset, pCtx->pUserData)) == NULL)
            return NULL; // this will cause a crash if not caught later (e.g., by an exception handler in client code)

        if(pResult != pBuffer+offset)
            memcpy(pBuffer+offset, pResult, pgSize-offset-nIV-nMac);

        return pBuffer;
        break;

    default:
        assert(FALSE);
        return NULL;
}

I think you could do something like this as well, though not 100% sure.

By the way, I think I made some progress towards keying an unencrypted database and unkeying an encrypted database a few weeks ago (these days, can’t remember what I’ve done in the past few days let alone weeks). As you’ve pointed out, the main problem is the page reserve size. The new MAC functionality presents a similar problem, if a client programmer or user wanted to change it at runtime on an existing database. If I do manage to come up with something that seems to work, I’ll pass it along. My recollection was that it was looking about 50/50 whether it would work before I put it aside.

~Mike

Stephen Lombardo

Feb 21, 2011, 1:15:28 AM

to sqlc...@googlegroups.com, Michael Stephenson

Hi Mike,

On Sun, Feb 20, 2011 at 12:47 AM, Michael Stephenson <domeh...@gmail.com> wrote:

> I think you might have some complaints about lack of backward compatibility. I think you could make the MAC optional and maintain backward compatibility as long as it’s set at runtime before the codec is attached. In my personal implementation, I store the size of the MAC in the codec_ctx in the nMac member:
>
> ....
>
> I think you could do something like this as well, though not 100% sure.

The approach you describe is actually very similar to the way the new code already operates. The context determines whether HMAC is in use at any point in time. It is possible to configure whether HMAC will be used at run time, so you can open an existing database with the new version (just use "pragma cipher_use_hmac = OFF" to tell SQLCipher to disable HMAC when opening a database). Here is an example from the test suite of the new code opening a 1.1.8 database:

https://github.com/sjlombardo/sqlcipher/blob/v2beta/test/crypto.test#L681

As to the question of whether HMAC should be on or off to start with, I lean strongly towards enabling it by default. If a developer grabs a clone of SQLCipher for their application, the default behavior should be as secure as possible. For any existing applications it will be a one-line change to disable HMAC at runtime. Plus, we're hoping to have an improved method to convert databases soon.

That said, you raise a fair point that some folks might want to disable it by default for their custom environments for convenience. Therefore, I've pushed a minor change that will allow you to simply override the behavior at compile time. Just define DEFAULT_USE_HMAC=0 and the library will no longer use HMAC by default.

Also, as an aside, I'm in the process of re-organizing the SQLCipher code to use a separate encryption implementation file in a manner similar to what we've discussed in the past. This should make it much easier to tweak in the future.

> By the way, I think I made some progress towards keying an unencrypted database and unkeying an encrypted database a few weeks ago (these days, can’t remember what I’ve done in the past few days let alone weeks). As you’ve pointed out, the main problem is the page reserve size. The new MAC functionality presents a similar problem, if a client programmer or user wanted to change it at runtime on an existing database. If I do manage to come up with something that seems to work, I’ll pass it along. My recollection was that it was looking about 50/50 whether it would work before I put it aside.

I'm definitely interested. We are probably going to start this week to develop a new method to do this. The current plan is to use an approach similar to how vacuum works, but a bit more generic, i.e. attach a new database, replicate the table schema, copy data, then re-apply non-table schema. If you've already made some progress on this or have a different approach I'd love to hear about it.

Let me know what you think. Thanks!

Cheers,

Stephen

Michael Stephenson

Feb 21, 2011, 9:17:54 AM

to Stephen Lombardo, sqlc...@googlegroups.com

“It is possible to configure whether HMAC will be used at run time, so you can open an existing database with the new version (just use "pragma cipher_use_hmac = OFF" to tell SQLCipher to disable HMAC when opening a database).”

Great! So it is backwards compatible with the old format, just requires a one-line code change.

“We are probably going to start this week to develop a new method to do this. The current plan is to use an approach similar to how vacuum works, but a bit more generic, i.e. attach a new database, replicate the table schema, copy data, then re-apply non-table schema.”

That’s exactly the approach I’ve been working on, taking sqlite3RunVacuum from vacuum.c as a starting point. sqlite3RunVacuum does what you described and I would assume is an example of “best practice” for doing a database copy. It does a lot of the background stuff to make sure this works transactionally, and it appears to copy all database objects (including things like sequences) over to the new database.

Michael Stephenson

Feb 24, 2011, 4:36:45 PM

to Stephen Lombardo, sqlc...@googlegroups.com

Hi Stephen,

Just wanted to let you know that I’ve got a working implementation of in-place rekeying of a database.

As I see it, there are three cases:

1) Encrypt an unencrypted database.

2) Decrypt an encrypted database.

3) Change the encryption on an encrypted database.

I have cases 1 and 2 working right now. Haven’t visited case 3 (that theoretically should be the simplest case, but we’ll see; it might require a different approach than cases 1 and 2).

What I did originally was just copy sqlite3RunVacuum, execSql, execExecSql, and vacuumFinalize from vacuum.c to my crypto.h file, rename the functions, and set about getting it to work for rekeying. I had this working on Tuesday but have been having a bit of frustration since then because in the end the difference between sqlite3RunVacuum and my “new” function is… 1 line of code. But, that 1 line of code makes all the difference apparently.

Since then, I’ve been trying to figure out how I can “trick” sqlite3RunVacuum to think the page reserves on the source and temporary rekeyed databases are different so that I can just throw away my “new” function and call sqlite3RunVacuum directly. No luck on that front thus far; not 100% sure it’s possible or possible without some undesirable hackery :o).

On the other hand, I figured it might be good to keep the “new” function, let’s call it “sqlite3_rekey_ex”, and add some parameters to it so that you can “trans-crypt” a database to a new file rather than in-place if you like.

Working on this the last couple of days has also led me to change some of the fundamentals of what I was doing previously. For example, every attached database now gets its own codec, which can be created before the database is even attached and then will be attached when the database is attached. I’m also thinking towards making the codec_ctx more C++-like, similar to the way the Pager works with function pointers set for the “member functions”. This would make a codec_ctx more self-contained, independent, and flexible, which could prove useful or simpler to understand if one were dealing with potentially multiple attached databases, each with different encryption parameters. Probably not something that would ever be used, but if it makes things simpler to understand that would be benefit enough.

Have to run for now, I’ll post more info soon…

~Mike


Billy Gray

Feb 24, 2011, 4:45:49 PM

to sqlc...@googlegroups.com, Michael Stephenson, Stephen Lombardo

Nice work, Mike!

Just so you know, case 3 should already be taken care of by sqlite3_rekey(). We've used that to upgrade some of our customers' databases in-place to adjust the KDF iterations (we bumped it from 1000 to 4000). There's some documentation on it here:

http://sqlcipher.net/documentation/api#rekey

Obviously, you need the key for that, but if your program has unlocked the database successfully, then the key is already known.

Cheers,

Billy

--
Team Zetetic
http://zetetic.net

Michael Stephenson

Feb 27, 2011, 12:47:42 AM

to Billy Gray, sqlc...@googlegroups.com, Stephen Lombardo

Okay, here are what I think are the “tricks” to rekeying a database and adding or removing encryption from it. The critical item is changing the page reserve during the rekey process. If we’re not changing the page reserve, then we can just walk the database pages at the Btree level, read them in via a read cipher_ctx and then write them back out via a write cipher_ctx. However, if we need to change the page reserve (e.g., because we are adding or removing encryption or changing the encryption such that the page reserve should change), then something like the following steps are needed:

1) A codec must be attached to the main database, and its write cipher_ctx must be initialized to whatever we want the rekeyed database to look like. Its read cipher_ctx should be initialized so that it can read the current database before rekeying.

2) A new database must be attached. Let’s call this attached database “rekey_db”.

3) After rekey_db is attached, but before any copying starts, it must have its page reserve set to the desired amount of reserved space as needed by the encryption. For example, an encrypted database would typically reserve 16 bytes for a per-page IV, whereas a plain database would reserve 0 bytes. If the page reserve is set properly, rekey_db does not actually have to be encrypted, but it should be because if it exceeds the size of the in-memory cache, pages will be written to disk in its journal file during the rekeying process.

4) All database objects are copied from main db over to rekey_db via SQL statements. This copy from main => rekey_db must be performed via SQL statements to give rekey_db a chance to reorganize its Btree pages based on its new page reserve (and more space or less space on each page for actual data).

5) Rekeying in place.

a. If we are rekeying “in-place”, then a Btree-level block copy of pages back to main db can be done after main has been SQL-copied to rekey_db. This block-copy is why the page reserve on rekey_db must be whatever it needs to be to match the desired page reserve at the end of the rekey operation.

b. After the block copy is performed, the write cipher_ctx on the main db must be copied to the read cipher_ctx so that the database pages can be read correctly.

6) Rekeying to a new database.

a. For rekeying to a new database, we can just give rekey_db a filename and just skip the block copy of rekey_db back to the main db.

There are a few different use cases that should be considered:

1) Encrypting a plain database.

a. In place.

b. As a new database.

2) Decrypting an encrypted database.

a. In place.

b. As a new database.

3) Changing the encryption with a different page reserve.

a. In place.

b. As a new database.

4) Changing the encryption but with the same page reserve.

a. In place.

b. As a new database.

The basic steps outlined at the top of this e-mail can be used for all of the use cases (SQL-copy to rekey_db, block copy back if in-place). For case 4a, a Btree-level read-rewrite of each database page can be performed instead if desired (SQLCipher already has this functionality, I think).

It turns out that SQLCipher v1.1.8 is very close to handling case 1a already. As I mentioned previously, I took sqlite3RunVacuum and made some minor changes to it and called it sqlite3RekeyImpl. The changes are in a file I’ve named rekey.h, which is attached to this message. With rekey.h included in SQLCipher’s crypto.c, the only changes we need to make are in the sqlite3_rekey function, about 30 lines into that function. Here are the changes:

/*** begin changes ***/
//db->nextPagesize = SQLITE_DEFAULT_PAGE_SIZE;
pDb->pBt->pBt->pageSizeFixed = 0; /* required for sqlite3BtreeSetPageSize to modify pagesize setting */
//sqlite3BtreeSetPageSize(pDb->pBt, db->nextPagesize, EVP_MAX_IV_LENGTH, 0);
//sqlite3RunVacuum(&error, db);
sqlite3RekeyImpl(&error, db, ctx->write_ctx->iv_sz, NULL);
cipher_ctx_copy(ctx->read_ctx, ctx->write_ctx);
return SQLITE_OK;
/*** end changes ***/

This simply comments out trying to set the page size/reserve, and comments out calling sqlite3RunVacuum, and instead it calls sqlite3RekeyImpl, copies the write cipher_ctx over to the read cipher_ctx, and then returns (skipping the SQLCipher attempt at block copying the database pages). In crypto.c, the code block where these changes are made is only entered if the IV size is changing as part of the rekey. If the IV size is not changing, this code block will be skipped, and SQLCipher’s block page read/rewrite can be used.

Note that the last parameter to sqlite3RekeyImpl is a char pointer to a filename. If this is NULL, then the rekey is done in place. If it is not null, then the rekey is done to a new database with the given filename. SQLCipher could be made to exercise this option without much work. My personal implementation now includes a function named sqlite3_rekey_ex that is similar to sqlite3_rekey except a filename can be passed in. In my implementation, sqlite3_rekey is now a one-liner that just calls sqlite3_rekey_ex and passes NULL for the filename. sqlite3_rekey_ex then attaches codecs, etc., figures out the desired page reserve, and calls sqlite3RekeyImpl to do the grunt work.

I’ve also added a new pragma that is used in the form: “pragma rekey_ex ‘passphrase=>filename’;”. The code that parses this pragma looks for the first => it finds starting at the end of “zRight” and then calls sqlite3_rekey_ex, passing in the passphrase, its length, and the filename for the rekeyed database. This could be tweaked as well to allow passing an encryption key instead of a passphrase, similar to what SQLCipher already does with “pragma key”.
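The split described above, searching backwards from the end of the argument for "=>", could look roughly like the following. This is only a sketch of the idea, not the parsing code from my implementation; the function name and its signature are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Split "passphrase=>filename" at the LAST "=>" so that a passphrase
   containing "=>" itself still parses. On success returns 0, stores the
   passphrase length in *pass_len and a pointer to the filename (inside
   arg) in *filename. Returns -1 if no separator or no filename is found. */
int sketch_split_rekey_arg(const char *arg, size_t *pass_len, const char **filename)
{
    size_t len, i;

    assert(arg != NULL && pass_len != NULL && filename != NULL);
    len = strlen(arg);
    if (len < 2)
        return -1;

    /* i runs from len-2 down to 0 */
    for (i = len - 1; i-- > 0; ) {
        if (arg[i] == '=' && arg[i + 1] == '>') {
            if (arg[i + 2] == '\0')
                return -1;          /* separator present but filename empty */
            *pass_len = i;
            *filename = arg + i + 2;
            return 0;
        }
    }
    return -1;
}
```

Searching from the end means the filename cannot contain "=>", but the passphrase can, which seems the better trade-off.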

SQLCipher (even with the change noted above) does not handle any of the use cases other than 1a and 4a. Attempting to unencrypt a database by passing an empty encryption key “pragma rekey = ‘’;” apparently does nothing, and SQLCipher does not currently have the notion of rekeying to a new database, though I hope my suggestions in this message might lead to that functionality being added.

I’ve attached the file, rekey.h, which may be of help reworking rekeying in SQLCipher. This file contains four functions copied from vacuum.c and renamed, and I basically changed anything that said “vacuum” to “rekey”. From rekey.h:

** This code is almost entirely copied from code in vacuum.c. Basically
** the words vacuum, Vacuum, and VACUUM have been changed to "rekey",
** and the following functions have been copied from vacuum.c and renamed:
**
** vacuumFinalize => rekeyFinalize
** execSql => rekeyExecSql
** execExecSql => rekeyExecExecSql
** sqlite3RunVacuum => sqlite3RekeyImpl

The function sqlite3RekeyImpl is just sqlite3RunVacuum with a few minor changes:

1) The function signature is changed. I’ve added two new parameters: nRes is the desired page reserve for the rekeyed database, and zNewFile is an optional filename for the rekeyed database in case we want to rekey the existing database to a new file instead of doing it in place.

int sqlite3RunVacuum(char **pzErrMsg, sqlite3 *db)

is changed to:

int sqlite3RekeyImpl(char **pzErrMsg, sqlite3 *db, int nRes, const char* zNewFile){

2) The line from sqlite3RunVacuum that sets the page reserve on the attached database to match the main database is commented out. It is no longer needed since nRes is passed in as a function parameter. In fact, this line of code in sqlite3RunVacuum is primarily what prevents a rekey from working in place by just calling sqlite3RunVacuum.

/* commented out, we're now passing nRes in as a parameter */

/* nRes = sqlite3BtreeGetReserve(pMain); */

3) If zNewFile is not null, then the attached database (rekey_db) is given a filename instead of being a temporary database.

4) If zNewFile is not null, then rekey_db is not “block copied” back to the main db near the end of the function (sqlite3BtreeCopyFile is skipped).

rekey.h

Stephen Lombardo

Mar 2, 2011, 3:03:08 PM

to sqlc...@googlegroups.com

Hi Folks,

I've just pushed a substantial code reorganization out to the v2beta branch in the SQLCipher repository that more cleanly separates the SQLite codec interface from the underlying encryption code, as follows:

crypto.h - header file now contains prototypes for several sqlcipher_codec* functions that must be implemented for specific cryptography functions

crypto.c - the codec logic and manipulation functions, now generic enough to call externally defined methods for all cryptography functions (i.e. page cipher, codec context manipulation, etc)

crypto_impl.c - default implementation of the sqlcipher "internals" including the codec and cipher contexts, and all crypto code.

This reorganization is intended to make the code easier to understand and audit - for instance, the main crypto.c file no longer directly references OpenSSL. In the long term, it should also make it easier for users with very specific needs to make tightly-scoped changes (e.g. to use a different HMAC algorithm or a different crypto provider).

I'd like to get some early feedback on stability. Please feel free to pull the latest v2beta and put it through its paces.