Okay, here is what I think are the “tricks” to rekeying a database and adding or removing encryption from it. The critical item is whether the page reserve changes during the rekey process. If we’re not changing the page reserve, then we can just walk the database pages at the Btree level, read them in via a read cipher_ctx, and then write them back out via a write cipher_ctx. However, if we need to change the page reserve (e.g., because we are adding or removing encryption, or changing the encryption such that the page reserve should change), then something like the following steps is needed:
1) A codec must be attached to the main database, and its write cipher_ctx must be initialized to whatever we want the rekeyed database to look like. Its read cipher_ctx should be initialized so that it can read the current database before rekeying.
2) A new database must be attached. Let’s call this attached database “rekey_db”.
3) After rekey_db is attached, but before any copying starts, it must have its page reserve set to the amount of reserved space needed by the encryption. For example, an encrypted database would typically reserve 16 bytes for a per-page IV, whereas a plain database would reserve 0 bytes. If the page reserve is set properly, rekey_db does not actually have to be encrypted, but it should be: if it exceeds the size of the in-memory cache, its pages will be written to disk in its journal file during the rekeying process.
4) All database objects are copied from main db over to rekey_db via SQL statements. This copy from main => rekey_db must be performed via SQL statements to give rekey_db a chance to reorganize its Btree pages based on its new page reserve (and more space or less space on each page for actual data).
5) Rekeying in place.
a. If we are rekeying “in-place”, then a Btree-level block copy of pages back to main db can be done after main has been SQL-copied to rekey_db. This block-copy is why the page reserve on rekey_db must be whatever it needs to be to match the desired page reserve at the end of the rekey operation.
b. After the block copy is performed, the write cipher_ctx on the main db must be copied to the read cipher_ctx so that the database pages can be read correctly.
6) Rekeying to a new database.
a. For rekeying to a new database, we can just give rekey_db a filename and just skip the block copy of rekey_db back to the main db.
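To see why a change in page reserve forces the SQL-level copy in step 4, it may help to look at the arithmetic. In SQLite the usable part of a page is the page size minus the reserved bytes, so a page with a 16-byte reserve holds less Btree content than one with a 0-byte reserve. A minimal sketch follows; the function names and the 1024-byte page size used in the note are illustrative assumptions, not anything taken from SQLite or SQLCipher:

```c
#include <assert.h>

/* Hypothetical helper (not part of SQLite or SQLCipher): bytes of each
** page left for Btree content once the reserve has been set aside. */
int usable_page_bytes(int page_size, int reserve){
  assert( reserve>=0 && reserve<page_size );
  return page_size - reserve;
}

/* Hypothetical helper: returns 1 when the two layouts hold different
** amounts of data per page, in which case pages cannot simply be
** re-encrypted one for one and the content has to be reorganized. */
int layout_differs(int page_size, int old_reserve, int new_reserve){
  return usable_page_bytes(page_size, old_reserve)
      != usable_page_bytes(page_size, new_reserve);
}
```

With a 1024-byte page, a plain database has 1024 usable bytes per page and one with a 16-byte reserve has 1008, so a full plaintext page has nowhere to put its last 16 bytes unless the Btree is rebuilt.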
There are a few different use cases that should be considered:
1) Encrypting a plain database.
a. In place.
b. As a new database.
2) Decrypting an encrypted database.
a. In place.
b. As a new database.
3) Changing the encryption with a different page reserve.
a. In place.
b. As a new database.
4) Changing the encryption but with the same page reserve.
a. In place.
b. As a new database.
The basic steps outlined at the top of this e-mail can be used for all of the use cases (SQL-copy to rekey_db, block copy back if in-place). For case 4a, a Btree-level read-rewrite of each database page can be performed instead if desired (SQLCipher already has this functionality, I think).
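The mapping from the eight use cases to a copy strategy can be sketched as a small decision function. This is only an illustration of the reasoning above: the enum, the function name, and the choice to prefer the page rewrite for case 4a (the SQL-copy route would work there too) are assumptions, and none of these identifiers exist in SQLCipher.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative names only; none of these exist in SQLCipher. */
typedef enum {
  REKEY_PAGE_REWRITE,        /* case 4a: read each page, write it back */
  REKEY_SQL_THEN_BLOCKCOPY,  /* in place: SQL-copy to rekey_db, block-copy back */
  REKEY_SQL_TO_NEW_FILE      /* new database: SQL-copy to a named rekey_db only */
} RekeyStrategy;

/* old_reserve/new_reserve are the page reserves before and after the
** rekey; zNewFile is NULL for an in-place rekey. */
RekeyStrategy choose_rekey_strategy(int old_reserve, int new_reserve,
                                    const char *zNewFile){
  assert( old_reserve>=0 && new_reserve>=0 );
  if( zNewFile!=NULL ) return REKEY_SQL_TO_NEW_FILE;
  if( old_reserve!=new_reserve ) return REKEY_SQL_THEN_BLOCKCOPY;
  return REKEY_PAGE_REWRITE;
}
```

Under these assumptions, cases 1a, 2a, and 3a all land on the SQL-copy followed by a block copy, every "b" case skips the block copy, and only case 4a takes the page-level shortcut.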
It turns out that SQLCipher v1.1.8 is very close to handling case 1a already. As I mentioned previously, I took sqlite3RunVacuum and made some minor changes to it and called it sqlite3RekeyImpl. The changes are in a file I’ve named rekey.h, which is attached to this message. By including rekey.h in SQLCipher’s crypto.c, the only changes we need to make are in the sqlite3_rekey function, about 30 lines into that function. Here are the changes:
/*** begin changes ***/
//db->nextPagesize = SQLITE_DEFAULT_PAGE_SIZE;
pDb->pBt->pBt->pageSizeFixed = 0; /* required for sqlite3BtreeSetPageSize to modify pagesize setting */
//sqlite3BtreeSetPageSize(pDb->pBt, db->nextPagesize, EVP_MAX_IV_LENGTH, 0);
//sqlite3RunVacuum(&error, db);
sqlite3RekeyImpl(&error, db, ctx->write_ctx->iv_sz, NULL);
cipher_ctx_copy(ctx->read_ctx, ctx->write_ctx);
return SQLITE_OK;
/*** end changes ***/
This simply comments out trying to set the page size/reserve, and comments out calling sqlite3RunVacuum, and instead it calls sqlite3RekeyImpl, copies the write cipher_ctx over to the read cipher_ctx, and then returns (skipping the SQLCipher attempt at block copying the database pages). In crypto.c, the code block where these changes are made is only entered if the IV size is changing as part of the rekey. If the IV size is not changing, this code block will be skipped, and SQLCipher’s block page read/rewrite can be used.
Note that the last parameter to sqlite3RekeyImpl is a char pointer to a filename. If this is NULL, then the rekey is done in place. If it is not null, then the rekey is done to a new database with the given filename. SQLCipher could be made to exercise this option without much work. My personal implementation now includes a function named sqlite3_rekey_ex that is similar to sqlite3_rekey except a filename can be passed in. In my implementation, sqlite3_rekey is now a one-liner that just calls sqlite3_rekey_ex and passes NULL for the filename. sqlite3_rekey_ex then attaches codecs, etc., figures out the desired page reserve, and calls sqlite3RekeyImpl to do the grunt work.
I’ve also added a new pragma that is used in the form “pragma rekey_ex ‘passphrase=>filename’;”. The code that parses this pragma looks for the first “=>” it finds scanning backward from the end of “zRight” and then calls sqlite3_rekey_ex, passing in the passphrase, its length, and the filename for the rekeyed database. This could be tweaked as well to allow passing an encryption key instead of a passphrase, similar to what SQLCipher already does with “pragma key”.
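The backward scan for “=>” might look something like the following. This is a hypothetical helper written for illustration, not the parser from my implementation; the function name and its return convention are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. Finds the LAST "=>" in zRight. On success it
** returns 1, sets *pnPass to the passphrase length (the bytes before
** the separator) and *pzFile to the filename that follows it.
** Returns 0 if there is no "=>" or nothing follows the last one. */
int split_rekey_ex_arg(const char *zRight, int *pnPass, const char **pzFile){
  size_t n;
  size_t i;
  assert( zRight!=NULL && pnPass!=NULL && pzFile!=NULL );
  n = strlen(zRight);
  if( n<2 ) return 0;
  /* Scan backward so that a "=>" inside the passphrase is not
  ** mistaken for the separator. */
  for(i=n-1; i>=1; i--){
    if( zRight[i-1]=='=' && zRight[i]=='>' ){
      if( zRight[i+1]=='\0' ) return 0;   /* nothing after "=>" */
      *pnPass = (int)(i-1);
      *pzFile = &zRight[i+1];
      return 1;
    }
  }
  return 0;
}
```

Scanning from the end means a passphrase may itself contain “=>”, at the cost that a filename containing “=>” would be split in the wrong place.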
SQLCipher (even with the change noted above) does not handle any of the use cases other than 1a and 4a. Attempting to unencrypt a database by passing an empty encryption key “pragma rekey = ‘’;” apparently does nothing, and SQLCipher does not currently have the notion of rekeying to a new database, though I hope my suggestions in this message might lead to that functionality being added.
I’ve attached the file, rekey.h, which may be of help reworking rekeying in SQLCipher. This file contains four functions copied from vacuum.c and renamed, and I basically changed anything that said “vacuum” to “rekey”. From rekey.h:
** This code is almost entirely copied from code in vacuum.c. Basically
** the words vacuum, Vacuum, and VACUUM have been changed to "rekey",
** and the following functions have been copied from vacuum.c and renamed:
**
** vacuumFinalize => rekeyFinalize
** execSql => rekeyExecSql
** execExecSql => rekeyExecExecSql
** sqlite3RunVacuum => sqlite3RekeyImpl
The function sqlite3RekeyImpl is just sqlite3RunVacuum with a few minor changes:
1) The function signature is changed. I’ve added two new parameters: nRes is the desired page reserve for the rekeyed database, and zNewFile is an optional filename for the rekeyed database in case we want to rekey the existing database to a new file instead of doing it in place.
int sqlite3RunVacuum(char **pzErrMsg, sqlite3 *db)
is changed to:
int sqlite3RekeyImpl(char **pzErrMsg, sqlite3 *db, int nRes, const char* zNewFile){
2) The line from sqlite3RunVacuum that sets the page reserve on the attached database to match the main database is commented out. It is no longer needed since nRes is passed in as a function parameter. In fact, this line of code in sqlite3RunVacuum is primarily what prevents a rekey from working in place by just calling sqlite3RunVacuum.
/* commented out, we're now passing nRes in as a parameter */
/* nRes = sqlite3BtreeGetReserve(pMain); */
3) If zNewFile is not null, then the attached database (rekey_db) is given a filename instead of being a temporary database.
4) If zNewFile is not null, then rekey_db is not “block copied” back to the main db near the end of the function (sqlite3BtreeCopyFile is skipped).
From: Billy Gray [mailto:wg...@zetetic.net]
Sent: Thursday, February 24, 2011 4:46 PM
To: sqlc...@googlegroups.com
Cc: Michael Stephenson; Stephen Lombardo
Subject: Re: SQLCipher v2 Beta
Nice work, Mike!
Just so you know, case 3 should already be taken care of by sqlite3_rekey(). We've used that to upgrade some of our customers' databases in-place to adjust the KDF iterations (we bumped it from 1000 to 4000). There's some documentation on it here:
Obviously, you need the key for that, but if your program has unlocked the database successfully, then the key is already known.
Cheers,
Billy
On Thu, Feb 24, 2011 at 4:36 PM, Michael Stephenson <domeh...@gmail.com> wrote:
Hi Stephen,
Just wanted to let you know that I’ve got a working implementation of in-place rekeying of a database.
As I see it, there are three cases:
1) Encrypt an unencrypted database.
2) Decrypt an encrypted database.
3) Change the encryption on an encrypted database.
I have cases 1 and 2 working right now. Haven’t visited case 3 (that theoretically should be the simplest case, but we’ll see; it might require a different approach than cases 1 and 2).
What I did originally was just copy sqlite3RunVacuum, execSql, execExecSql, and vacuumFinalize from vacuum.c to my crypto.h file, rename the functions, and set about getting it to work for rekeying. I had this working on Tuesday but have been having a bit of frustration since then because in the end the difference between sqlite3RunVacuum and my “new” function is… 1 line of code. But, that 1 line of code makes all the difference apparently.
Since then, I’ve been trying to figure out how I can “trick” sqlite3RunVacuum to think the page reserves on the source and temporary rekeyed databases are different so that I can just throw away my “new” function and call sqlite3RunVacuum directly. No luck on that front thus far; not 100% sure it’s possible or possible without some undesirable hackery :o).
On the other hand, I figured it might be good to keep the “new” function, let’s call it “sqlite3_rekey_ex”, and add some parameters to it so that you can “trans-crypt” a database to a new file rather than in-place if you like.
Working on this the last couple of days has also led me to change some of the fundamentals of what I was doing previously. For example, every attached database now gets its own codec, which can be created before the database is even attached and then will be attached when the database is attached. I’m also thinking towards making the codec_ctx more C++-like, similar to the way the Pager works with function pointers set for the “member functions”. This would make a codec_ctx more self-contained, independent, and flexible, which could prove useful or simpler to understand if one were dealing with potentially multiple attached databases each with different encryption parameters. Probably not something that would ever be used, but if it makes things simpler to understand that would be benefit enough.
Have to run for now, I’ll post more info soon…
~Mike
From: Stephen Lombardo [mailto:sjlom...@zetetic.net]
Sent: Monday, February 21, 2011 1:15 AM
To: sqlc...@googlegroups.com
Cc: Michael Stephenson
Subject: Re: SQLCipher v2 Beta
Hi Mike,
On Sun, Feb 20, 2011 at 12:47 AM, Michael Stephenson <domeh...@gmail.com> wrote:
I think you might have some complaints about lack of backward compatibility. I think you could make the MAC optional and maintain backward compatibility as long as it’s set at runtime before the codec is attached. In my personal implementation, I store the size of the MAC in the codec_ctx in the nMac member:
....
I think you could do something like this as well, though not 100% sure.
The approach you describe is actually very similar to the way the new code already operates. The context determines whether HMAC is in use at any point in time. It is possible to configure whether HMAC will be used at run time, so you can open an existing database with the new version (just use "pragma cipher_use_hmac = OFF" to tell SQLCipher to disable HMAC when opening a database). Here is an example from the test suite of the new code opening a 1.1.8 database:
As to the question of whether HMAC should be on or off to start with, I lean strongly towards enabling it by default. If a developer grabs a clone of SQLCipher for their application, the default behavior should be as secure as possible. For any existing applications it will be a one-line change to disable HMAC at runtime. Plus, we're hoping to have an improved method to convert databases soon.
That said, you raise a fair point that some folks might want to disable it by default for their custom environments for convenience. Therefore, I've pushed a minor change that will allow you to simply override the behavior at compile time. Just define DEFAULT_USE_HMAC=0 and the library will no longer use HMAC by default.
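For readers unfamiliar with this kind of override, a compile-time default is commonly wired up along the following lines. This is a generic sketch, assuming the usual #ifndef fallback pattern; SQLCipher's actual source may structure it differently, and the accessor function is hypothetical.

```c
#include <assert.h>

/* Building with -DDEFAULT_USE_HMAC=0 replaces the fallback value
** below. The fallback of 1 mirrors "HMAC enabled by default". */
#ifndef DEFAULT_USE_HMAC
# define DEFAULT_USE_HMAC 1
#endif

/* Hypothetical accessor: the setting a new codec context would start
** with before any "pragma cipher_use_hmac" is applied. */
int default_use_hmac(void){
  assert( DEFAULT_USE_HMAC==0 || DEFAULT_USE_HMAC==1 );
  return DEFAULT_USE_HMAC;
}
```

The runtime pragma would still take precedence in either build; the define only changes the starting value.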
Also, as an aside, I'm in the process of re-organizing the SQLCipher code to use a separate encryption implementation file in a manner similar to what we've discussed in the past. This should make it much easier to tweak in the future.
By the way, I think I made some progress towards keying an unencrypted database and unkeying an encrypted database a few weeks ago (these days, can’t remember what I’ve done in the past few days let alone weeks). As you’ve pointed out, the main problem is the page reserve size. The new MAC functionality presents a similar problem, if a client programmer or user wanted to change it at runtime on an existing database. If I do manage to come up with something that seems to work, I’ll pass it along. My recollection was that it was looking about 50/50 whether it would work before I put it aside.
I'm definitely interested. We are probably going to start this week to develop a new method to do this. The current plan is to use an approach similar to how vacuum works, but a bit more generic, i.e. attach a new database, replicate the table schema, copy data, then re-apply non-table schema. If you've already made some progress on this or have a different approach I'd love to hear about it.
Let me know what you think. Thanks!
Cheers,
Stephen
--
Team Zetetic
http://zetetic.net
Ah, that generally sounds like a very smart approach!
I’ve also been concerned about the vacuum approach being brittle in the face of SQLite code changes; but your approach seems that it would be much less open to problems from such changes. After all, you’re just attaching a database and running some valid SQL statements; hopefully the SQLite code base doesn’t ever change to disallow doing those things :o).
Could you expand on #5 (Swap the rekey database for the main database when complete)? Not sure if you’re talking about closing the database connection and renaming files, or something from within SQLite since you mentioned not using a page-based copy any longer.
I think there still might be some benefit to the page-based copy if/once the page reserve on the rekey database matches the main database, and I don’t really think it makes things much more complicated.
I’m looking forward to the new release!
$ ./sqlite3 plaintext.db
sqlite> ATTACH DATABASE 'encrypted.db' AS encrypted KEY 'testkey';
sqlite> PRAGMA encrypted.cipher_page_size = 4096;
sqlite> PRAGMA encrypted.cipher = 'aes-128-cbc';
sqlite> PRAGMA encrypted.kdf_iter = 1000;
sqlite> PRAGMA encrypted.cipher_use_hmac = OFF;
sqlite> SELECT sqlcipher_export('encrypted');
sqlite> DETACH DATABASE encrypted;
Is there a recommended page size for iOS apps for best performance?