PRAGMA KEY = 'testkey'; PRAGMA cipher_use_hmac = OFF;
PRAGMA KEY = 'testkey'; PRAGMA cipher_page_size = 4096;
Nice!
Haven’t looked yet, but some quick (not-well-thought-out) thoughts:
1) Have you compiled any stats on how the performance is using HMAC vs. not?
2) You say you’re using OpenSSL’s PBKDF2 for the HMAC. Is the number of rounds configurable? The whole reason for PBKDF2 is to “slow down” an attacker in reverse engineering a key. Not sure if we want to pay that high of a price for an HMAC on every page.
3) Any possibility of making the MAC configurable or adding additional options? For example, if I recall correctly, UMAC has better performance than HMAC on 32-bit platforms. Or, I might want a simple CRC or single-pass hash instead of a full-blown MAC.
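To make the "simple CRC" option in point 3 concrete, here is a minimal sketch of a per-page CRC-32. The function name page_crc32 is hypothetical and not part of SQLCipher or SQLite. Note that a CRC only detects accidental corruption; unlike a keyed MAC, it offers no protection against deliberate tampering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a SQLCipher API: bitwise CRC-32 using the
   reflected polynomial 0xEDB88320 (the variant used by zlib and PNG). */
static uint32_t page_crc32(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;   /* standard initial value */
    size_t i;
    int bit;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            /* shift right one bit; fold in the polynomial if a 1 fell off */
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;     /* standard final inversion */
}
```

A table-driven version would be faster; the bitwise form is shown only because it is short.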
On Friday, February 18, 2011 at 3:21 PM, Michael Stephenson wrote:
Hi Stephen,
That all sounds like great stuff.
I think you might have some complaints about lack of backward compatibility. I think you could make the MAC optional and maintain backward compatibility as long as it’s set at runtime before the codec is attached. In my personal implementation, I store the size of the MAC in the codec_ctx in the nMac member:
// this is the container for our codec state.
// not all items are necessarily used, depending on the crypto provider
typedef struct _codec_ctx
{
    int initialized;        // whether CodecInitialize should be called in sqlite3CodecAttach
    unsigned char* kdfSalt; // salt for kdf, always created, may or may not be used, FILE_HEADER_SIZE bytes
    unsigned char* key;     // encryption key
    unsigned char* newKey;  // used only during a rekey of the database
    int nKey;               // length of encryption key
    int nNewKey;
    unsigned char* buffer;  // buffer used for encrypt/decrypt, SQLITE_DEFAULT_PAGE_SIZE bytes
    int nIV;                // size of initialization vector to use on each database page (0 = don't use IVs)
    int nMac;               // size of message authentication code to use on each database page (0 = don't use MACs)
    void* pUserData;        // user-defined state, if additional state is needed
} codec_ctx;
As I’ve mentioned previously, I use a boilerplate crypto.h with callbacks to whichever cryptoimpl.h I’m using in that build of the library, but in my crypto.h, the XCodec function (which is what SQLite will call to encrypt/decrypt a page), does a calculation for the size of the encryption buffer based on whether nMac > 0 (and same for nIV > 0), so it’s somewhat flexible:
    // modes 0,2,3 = decrypt, modes 6,7 = encrypt
    switch(mode)
    {
        // decrypt - return value is not used other than checking for null; thus, decrypted data must
        // go into pData. If CodecDecrypt returns a different result pointer than pData, we'll
        // copy the data back to pData ourselves
        case 0: // Undo a "case 7" journal file encryption
        case 2: // Reload a page
        case 3: // Load a page
            // if first page, copy file header to the first 16 bytes of pData
            if(pgNo == 1) memcpy(pData, SQLITE_FILE_HEADER, FILE_HEADER_SIZE);
            if(nIV > 0)
                pIV = pData + pgSize - nIV;
            if(nMac > 0)
                pMac = pData + pgSize - nIV - nMac;
            if((pResult = CodecDecrypt(pCtx->key, pCtx->nKey, pIV, nIV, pMac, nMac,
                    pData+offset, pgSize-offset-nIV-nMac, pBuffer+offset, pCtx->pUserData)) == NULL)
                return NULL; // this will cause a crash if not caught later (e.g., by an exception handler in client code)
            if(pResult != pData+offset)
                memcpy(pData+offset, pResult, pgSize-offset-nIV-nMac);
            return pData;
            break;
        // encrypt - return value points to encrypted data. CodecEncrypt should return pointer to
        // pBuffer passed in. If it does not, we'll copy data to pBuffer ourselves
        case 6: // Encrypt a page for main database file
        case 7: // Encrypt a page for the journal file
            // if first page, save key derivation salt to first 16 bytes of the page
            if(pgNo == 1) memcpy(pBuffer, pCtx->kdfSalt, FILE_HEADER_SIZE);
            if(nIV > 0)
                pIV = pBuffer + pgSize - nIV;
            if(nMac > 0)
                pMac = pBuffer + pgSize - nIV - nMac;
            // generate a new IV for the encryption
            if(pIV)
                CodecRandomness(nIV, pIV);
            if((pResult = CodecEncrypt(pCtx->newKey ? pCtx->newKey : pCtx->key, pCtx->nKey, pIV, nIV, pMac, nMac,
                    pData+offset, pgSize-offset-nIV-nMac, pBuffer+offset, pCtx->pUserData)) == NULL)
                return NULL; // this will cause a crash if not caught later (e.g., by an exception handler in client code)
            if(pResult != pBuffer+offset)
                memcpy(pBuffer+offset, pResult, pgSize-offset-nIV-nMac);
            return pBuffer;
            break;
        default:
            assert(FALSE);
            return NULL;
    }
}
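The pointer arithmetic in the snippet above implies a page layout of payload, then MAC, then IV, with the IV in the last nIV bytes. The following is a small sketch that isolates just that calculation. The names page_layout and compute_page_layout are hypothetical, and treating "offset" as a count of leading bytes to skip (for example the header bytes on page 1) is an assumption, since the snippet does not show where offset is set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of SQLite or SQLCipher: describes where the
   payload, MAC, and IV sit inside one page buffer. */
typedef struct page_layout {
    size_t payload_off;  /* first byte that is encrypted/decrypted */
    size_t payload_len;  /* number of payload bytes */
    size_t mac_off;      /* start of the MAC; meaningful only if nMac > 0 */
    size_t iv_off;       /* start of the IV; meaningful only if nIV > 0 */
} page_layout;

/* Layout assumed from the snippet: [payload][MAC][IV]. The IV occupies the
   last nIV bytes of the page and the MAC sits immediately before it.
   offset is assumed to be the number of leading bytes left untouched. */
static page_layout compute_page_layout(size_t pgSize, size_t offset,
                                       size_t nIV, size_t nMac)
{
    page_layout l;

    assert(offset + nIV + nMac <= pgSize);
    l.payload_off = offset;
    l.payload_len = pgSize - offset - nIV - nMac;
    l.iv_off      = pgSize - nIV;
    l.mac_off     = pgSize - nIV - nMac;
    return l;
}
```

With nMac set to 0 the MAC region collapses to zero bytes and the payload simply grows by the same amount, which is why a single code path can serve databases with and without a MAC.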
I think you could do something like this as well, though not 100% sure.
By the way, I think I made some progress towards keying an unencrypted database and unkeying an encrypted database a few weeks ago (these days, can’t remember what I’ve done in the past few days let alone weeks). As you’ve pointed out, the main problem is the page reserve size. The new MAC functionality presents a similar problem, if a client programmer or user wanted to change it at runtime on an existing database. If I do manage to come up with something that seems to work, I’ll pass it along. My recollection was that it was looking about 50/50 whether it would work before I put it aside.
~Mike
“It is possible to configure whether HMAC will be used at run time, so you can open an existing database with the new version (just use "pragma cipher_use_hmac = OFF" to tell SQLCipher to disable HMAC when opening a database).”
Great! So it is backwards compatible with the old format, just requires a one-line code change.
“We are probably going to start this week to develop a new method to do this. The current plan is to use an approach similar to how vacuum works, but a bit more generic, i.e. attach a new database, replicate the table schema, copy data, then re-apply non-table schema.”
That’s exactly the approach I’ve been working on, taking sqlite3RunVacuum from vacuum.c as a starting point. sqlite3RunVacuum does what you described and I would assume is an example of “best practice” for doing a database copy. It does a lot of the background stuff to make sure this works transactionally, and it appears to copy all database objects (including things like sequences) over to the new database.
Hi Stephen,
Just wanted to let you know that I’ve got a working implementation of in-place rekeying of a database.
As I see it, there are three cases:
1) Encrypt an unencrypted database.
2) Decrypt an encrypted database.
3) Change the encryption on an encrypted database.
I have cases 1 and 2 working right now. Haven’t visited case 3 (that theoretically should be the simplest case, but we’ll see; it might require a different approach than cases 1 and 2).
What I did originally was just copy sqlite3RunVacuum, execSql, execExecSql, and vacuumFinalize from vacuum.c to my crypto.h file, rename the functions, and set about getting it to work for rekeying. I had this working on Tuesday but have been having a bit of frustration since then because in the end the difference between sqlite3RunVacuum and my “new” function is… 1 line of code. But, that 1 line of code makes all the difference apparently.
Since then, I’ve been trying to figure out how I can “trick” sqlite3RunVacuum to think the page reserves on the source and temporary rekeyed databases are different so that I can just throw away my “new” function and call sqlite3RunVacuum directly. No luck on that front thus far; not 100% sure it’s possible or possible without some undesirable hackery :o).
On the other hand, I figured it might be good to keep the “new” function, let’s call it “sqlite3_rekey_ex”, and add some parameters to it so that you can “trans-crypt” a database to a new file rather than in-place if you like.
Working on this the last couple of days has also led me to change some of the fundamentals of what I was doing previously. For example, every attached database now gets its own codec, which can be created before the database is even attached and is then attached along with it. I’m also thinking about making the codec_ctx more C++-like, similar to the way the Pager works, with function pointers set for the “member functions”. This would make a codec_ctx more self-contained, independent, and flexible, which could prove useful or simpler to understand if one were dealing with multiple attached databases, each with different encryption parameters. Probably not something that would ever be used, but if it makes things simpler to understand, that would be benefit enough.
Have to run for now, I’ll post more info soon…
~Mike
From: Stephen Lombardo
[mailto:sjlom...@zetetic.net]
Sent: Monday, February 21, 2011 1:15 AM
To: sqlc...@googlegroups.com
Cc: Michael Stephenson
Subject: Re: SQLCipher v2 Beta
Hi Mike,
Okay, here are what I think are the “tricks” to rekeying a database and adding or removing encryption from it. The critical item is changing the page reserve during the rekey process. If we’re not changing the page reserve, then we can just walk the database pages at the Btree level, read them in via a read cipher_ctx and then write them back out via a write cipher_ctx. However, if we need to change the page reserve (e.g., because we are adding or removing encryption or changing the encryption such that the page reserve should change), then something like the following steps are needed:
1) A codec must be attached to the main database, and its write cipher_ctx must be initialized to whatever we want the rekeyed database to look like. Its read cipher_ctx should be initialized so that it can read the current database before rekeying.
2) A new database must be attached. Let’s call this attached database “rekey_db”.
3) After rekey_db is attached, but before any copying starts, it must have its page reserve set to the desired amount of reserved space as needed by the encryption. For example, an encrypted database would typically reserve 16 bytes for a per-page IV, whereas a plain database would reserve 0 bytes. If the page reserve is set properly, rekey_db does not actually have to be encrypted, but it should be because if it exceeds the size of the in-memory cache, pages will be written to disk in its journal file during the rekeying process.
4) All database objects are copied from main db over to rekey_db via SQL statements. This copy from main => rekey_db must be performed via SQL statements to give rekey_db a chance to reorganize its Btree pages based on its new page reserve (and more space or less space on each page for actual data).
5) Rekeying in place.
a. If we are rekeying “in-place”, then a Btree-level block copy of pages back to main db can be done after main has been SQL-copied to rekey_db. This block-copy is why the page reserve on rekey_db must be whatever it needs to be to match the desired page reserve at the end of the rekey operation.
b. After the block copy is performed, the write cipher_ctx on the main db must be copied to the read cipher_ctx so that the database pages can be read correctly.
6) Rekeying to a new database.
a. For rekeying to a new database, we can just give rekey_db a filename and just skip the block copy of rekey_db back to the main db.
There are a few different use cases that should be considered:
1) Encrypting a plain database.
a. In place.
b. As a new database.
2) Decrypting an encrypted database.
a. In place.
b. As a new database.
3) Changing the encryption with a different page reserve.
a. In place.
b. As a new database.
4) Changing the encryption but with the same page reserve.
a. In place.
b. As a new database.
The basic steps outlined at the top of this e-mail can be used for all of the use cases (SQL-copy to rekey_db, block copy back if in-place). For case 4a, a Btree-level read-rewrite of each database page can be performed instead if desired (SQLCipher already has this functionality, I think).
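One way to read the steps and use cases above is as a three-way decision driven by the page reserve and by whether a target filename is given. The sketch below encodes that reading. The enum and function names are hypothetical and do not exist in SQLite or SQLCipher, and choosing the page rewrite for case 4a is only one of the options described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
typedef enum rekey_strategy {
    /* read each page with the read ctx, write it back with the write ctx */
    REKEY_PAGE_REWRITE,
    /* SQL-copy main => rekey_db, then block-copy rekey_db back over main */
    REKEY_SQL_COPY_AND_BLOCK_COPY_BACK,
    /* SQL-copy main => rekey_db stored under a filename; no copy back */
    REKEY_SQL_COPY_TO_NEW_FILE
} rekey_strategy;

/* old_reserve/new_reserve: page reserve in bytes before and after the rekey
   (e.g. 0 for a plain database, 16 for a per-page IV).
   new_file: NULL for an in-place rekey, otherwise the target filename. */
static rekey_strategy choose_rekey_strategy(int old_reserve, int new_reserve,
                                            const char *new_file)
{
    assert(old_reserve >= 0 && new_reserve >= 0);
    if (new_file != NULL)
        return REKEY_SQL_COPY_TO_NEW_FILE;          /* cases 1b, 2b, 3b, 4b */
    if (old_reserve == new_reserve)
        return REKEY_PAGE_REWRITE;                  /* case 4a */
    return REKEY_SQL_COPY_AND_BLOCK_COPY_BACK;      /* cases 1a, 2a, 3a */
}
```

For example, encrypting a plain database in place with a 16-byte IV corresponds to choose_rekey_strategy(0, 16, NULL), which selects the SQL copy followed by the block copy back.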
It turns out that SQLCipher v1.1.8 is very close to handling case 1a already. As I mentioned previously, I took sqlite3RunVacuum, made some minor changes to it, and called it sqlite3RekeyImpl. The changes are in a file I’ve named rekey.h, which is attached to this message. With rekey.h included in SQLCipher’s crypto.c, the only changes we need to make are in the sqlite3_rekey function, about 30 lines into that function. Here are the changes:
/*** begin changes ***/
//db->nextPagesize = SQLITE_DEFAULT_PAGE_SIZE;
pDb->pBt->pBt->pageSizeFixed = 0; /* required for sqlite3BtreeSetPageSize to modify pagesize setting */
//sqlite3BtreeSetPageSize(pDb->pBt, db->nextPagesize, EVP_MAX_IV_LENGTH, 0);
//sqlite3RunVacuum(&error, db);
sqlite3RekeyImpl(&error, db, ctx->write_ctx->iv_sz, NULL);
cipher_ctx_copy(ctx->read_ctx, ctx->write_ctx);
return SQLITE_OK;
/*** end changes ***/
This simply comments out trying to set the page size/reserve, and comments out calling sqlite3RunVacuum, and instead it calls sqlite3RekeyImpl, copies the write cipher_ctx over to the read cipher_ctx, and then returns (skipping the SQLCipher attempt at block copying the database pages). In crypto.c, the code block where these changes are made is only entered if the IV size is changing as part of the rekey. If the IV size is not changing, this code block will be skipped, and SQLCipher’s block page read/rewrite can be used.
Note that the last parameter to sqlite3RekeyImpl is a char pointer to a filename. If this is NULL, then the rekey is done in place. If it is not null, then the rekey is done to a new database with the given filename. SQLCipher could be made to exercise this option without much work. My personal implementation now includes a function named sqlite3_rekey_ex that is similar to sqlite3_rekey except a filename can be passed in. In my implementation, sqlite3_rekey is now a one-liner that just calls sqlite3_rekey_ex and passes NULL for the filename. sqlite3_rekey_ex then attaches codecs, etc., figures out the desired page reserve, and calls sqlite3RekeyImpl to do the grunt work.
I’ve also added a new pragma that is used in the form: “pragma rekey_ex ‘passphrase=>filename’;”. The code that parses this pragma looks for the first => it finds searching backwards from the end of “zRight” and then calls sqlite3_rekey_ex, passing in the passphrase, its length, and the filename for the rekeyed database. This could be tweaked as well to allow passing an encryption key instead of a passphrase, similar to what SQLCipher already does with “pragma key”.
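The pragma argument parsing described above, searching backwards for "=>", might look roughly like the following. This is a sketch only: split_rekey_ex_arg is a hypothetical name, and the actual parsing code is not shown in this message. Searching from the end means the passphrase may itself contain "=>", while the filename may not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. Finds the last "=>" in arg.
   On success returns 1, sets *pass_len to the number of passphrase bytes
   (everything before the separator) and *filename to the text after it.
   Returns 0 and leaves the outputs untouched if there is no separator. */
static int split_rekey_ex_arg(const char *arg, size_t *pass_len,
                              const char **filename)
{
    size_t n, i;

    assert(arg != NULL && pass_len != NULL && filename != NULL);
    n = strlen(arg);
    if (n < 2)
        return 0;
    /* i indexes the candidate '>' character, scanning right to left */
    for (i = n - 1; i >= 1; i--) {
        if (arg[i - 1] == '=' && arg[i] == '>') {
            *pass_len = i - 1;
            *filename = arg + i + 1;
            return 1;
        }
    }
    return 0;
}
```

The passphrase is returned as a length rather than a copied string, which matches the description of passing the passphrase and its length on to sqlite3_rekey_ex.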
SQLCipher (even with the change noted above) does not handle any of the use cases other than 1a and 4a. Attempting to unencrypt a database by passing an empty encryption key “pragma rekey = ‘’;” apparently does nothing, and SQLCipher does not currently have the notion of rekeying to a new database, though I hope my suggestions in this message might lead to that functionality being added.
I’ve attached the file, rekey.h, which may be of help reworking rekeying in SQLCipher. This file contains four functions copied from vacuum.c and renamed, and I basically changed anything that said “vacuum” to “rekey”. From rekey.h:
** This code is almost entirely copied from code in vacuum.c. Basically
** the words vacuum, Vacuum, and VACUUM have been changed to "rekey",
** and the following functions have been copied from vacuum.c and renamed:
**
** vacuumFinalize => rekeyFinalize
** execSql => rekeyExecSql
** execExecSql => rekeyExecExecSql
** sqlite3RunVacuum => sqlite3RekeyImpl
The function sqlite3RekeyImpl is just sqlite3RunVacuum with a few minor changes:
1) The function signature is changed. I’ve added two new parameters: nRes is the desired page reserve for the rekeyed database, and zNewFile is an optional filename for the rekeyed database in case we want to rekey the existing database to a new file instead of doing it in place.
int sqlite3RunVacuum(char **pzErrMsg, sqlite3 *db)
is changed to:
int sqlite3RekeyImpl(char **pzErrMsg, sqlite3 *db, int nRes, const char* zNewFile){
2) The line from sqlite3RunVacuum that sets the page reserve on the attached database to match the main database is commented out. It is no longer needed since nRes is passed in as a function parameter. In fact, this line of code in sqlite3RunVacuum is primarily what prevents a rekey from working in place by just calling sqlite3RunVacuum.
/* commented out, we're now passing nRes in as a parameter */
/* nRes = sqlite3BtreeGetReserve(pMain); */
3) If zNewFile is not null, then the attached database (rekey_db) is given a filename instead of being a temporary database.
4) If zNewFile is not null, then rekey_db is not “block copied” back to the main db near the end of the function (sqlite3BtreeCopyFile is skipped).