RavenHQ encryption and data at rest

Chris Marisic

Apr 2, 2012, 9:59:22 AM
to rav...@googlegroups.com
When using RavenHQ, is the data stored on an encrypting file system, so that if someone broke into Amazon's data centers and stole hard drives from the servers, my data could not be compromised by directly accessing the Esent storage directory?

Oren Eini (Ayende Rahien)

Apr 2, 2012, 7:50:57 PM
to rav...@googlegroups.com
Hi,
No, the RavenHQ data is not encrypted at this time.

Chris Marisic

Apr 3, 2012, 9:21:08 AM
to rav...@googlegroups.com
Is there a timeline for supporting this? I don't require the docs themselves to be encrypted in the database, only that the data be on an encrypting partition. Basically, running BitLocker on the data directory or the entire host.

Oren Eini (Ayende Rahien)

Apr 3, 2012, 11:41:57 AM
to rav...@googlegroups.com
I don't know. It would result in major complications for backups, for example.

Chris Marisic

Apr 3, 2012, 11:59:16 AM
to rav...@googlegroups.com
If the backups are zipped using encryption before being moved to S3 storage that would eliminate the exposure of plain text data.

Oren Eini (Ayende Rahien)

Apr 4, 2012, 4:25:38 AM
to rav...@googlegroups.com
We can't really do that. We handle backups by snapshotting the drive, not by actually doing a full backup.

Chris Marisic

Apr 4, 2012, 10:18:17 AM
to rav...@googlegroups.com
Well, this is unfortunate. It means it is impossible for me to host in RavenHQ.

Chris Marisic

Apr 4, 2012, 10:22:33 AM
to rav...@googlegroups.com
This is going to severely limit your ability to reach adoption in the business world. I suppose full document encryption would be a "workaround"; however, I don't want to pay the price of dealing with encryption at the document level, unless you can make it so optimized that there's no noticeable difference between an encrypting file system and document encryption.

Oren Eini (Ayende Rahien)

Apr 4, 2012, 10:29:38 AM
to rav...@googlegroups.com, Jonathan Matheus
Chris,
The major problem in handling encryption is a simple question: what is it that you are trying to do?
Another aspect is key management, but we will deal with that a bit later.

We can have RavenDB store the data internally in an encrypted format quite easily. The question here is what is it that you are trying to do? Obviously RavenDB would have to have a way to decrypt it. And then the question is who holds the key?
In this scenario, RavenHQ would have to hold the key, and anyone who could gain access to the machine to examine the raw files would also very likely get access to the key as well.

Other alternatives, like DPAPI and friends, all rely on assuming that you have access to the actual machine, which falls down when you realize that we also need to be able to set up a new machine if the old one just died.

It would be the simplest thing in the world from our perspective to say something like: "Oh, of course it is encrypted" by simply turning on FS encryption. But that would be irresponsible to do without actually considering all of these factors.

This is something that we have put some thought into, but I won't feel comfortable actually doing this without having good answers to all of those problems.

Chris Marisic

Apr 4, 2012, 10:49:19 AM
to rav...@googlegroups.com, Jonathan Matheus
"We can have RavenDB store the data internally in an encrypted format quite easily. The question here is what is it that you are trying to do? "

I want the data encrypted, without the performance cost of full document encryption, unless, as I said, the performance cost of full document encryption (including its effects on indexes and searching) can be made negligible compared to file system encryption.

Oren Eini (Ayende Rahien)

Apr 4, 2012, 4:30:19 PM
to rav...@googlegroups.com, Jonathan Matheus
Chris,
Let us ignore the actual cost of encryption; that is usually a CPU-bound issue, and CPU cycles are pretty much free from our point of view.
Our problems with this feature (which is actually something that we are building for RavenDB Enterprise) are different.

For example, what do you care about? That the data is never on disk? Is this ever an issue that the data (decrypted in memory) can end up in the page file?
What about the key management? Is it a concern that RavenDB itself can get this data? How do you handle backups? Can you do an export?

There are a LOT of issues that aren't as simple as simply stating "I need it encrypted"

Chris Marisic

Apr 5, 2012, 8:58:59 AM
to rav...@googlegroups.com, Jonathan Matheus
inline


On Wednesday, April 4, 2012 4:30:19 PM UTC-4, Oren Eini wrote:
> That the data is never on disk?

I don't want an in-memory database.

> Is this ever an issue that the data (decrypted in memory) can end up in the page file?

No, government regulation does not specifically address any requirements on this. Of course, if the paging file is on an encrypting file system, that covers it at minimal cost.

> What about the key management?

Government regulation does not specify how this is implemented. A certificate would probably be most secure, but even a crypto string that is safeguarded on a best-effort basis would be sufficient.

> Is it a concern that RavenDB itself can get this data?

No, it's entirely fine that the database can read the data.

> How do you handle backups?

As long as the data is never physically on the disk as plain text, any solution is applicable.

> Can you do an export?

As long as the data is never physically on the disk as plain text, any solution is applicable.

> There are a LOT of issues that aren't as simple as simply stating "I need it encrypted"

Government regulation generally isn't interested in reality and many times is written purely to be "feel good" legislation. I personally find the concept of needing data encrypted at rest on a cloud platform that is hosted in ultra secure data centers to be a joke, but that doesn't change the legislation.

Matt Johnson

Apr 5, 2012, 9:33:13 AM
to ravendb
Chris, I am curious which legislation in particular you are concerned
with? What jurisdiction are we speaking of? Are you thinking about
USA / Massachusetts 201 CMR 17, or something else?

-Matt

Chris Marisic

Apr 5, 2012, 11:24:53 AM
to rav...@googlegroups.com
HIPAA

Oren Eini (Ayende Rahien)

Apr 5, 2012, 11:24:38 AM
to rav...@googlegroups.com, Jonathan Matheus
Chris,
As I understand from your answer, you don't care about the actual encryption, right? Just about being compliant with the regulator?

Chris Marisic

Apr 5, 2012, 11:35:30 AM
to rav...@googlegroups.com, Jonathan Matheus
Yes, those are my primary concerns.

Oren Eini (Ayende Rahien)

Apr 5, 2012, 11:36:46 AM
to rav...@googlegroups.com, Jonathan Matheus
Would something like this suffice?
http://daniellang.net/document-level-encryption-in-ravendb/

(Note that indexes aren't encrypted here)

Chris Marisic

Apr 5, 2012, 11:45:39 AM
to rav...@googlegroups.com, Jonathan Matheus
Indexes have to be protected, as the information that HIPAA requires to be protected is some of the most common information staff will want to search on to find users of the system.

Oren Eini (Ayende Rahien)

Apr 5, 2012, 12:28:31 PM
to rav...@googlegroups.com, Jonathan Matheus
Okay, that means that it would have to wait until we complete the encryption bundle for RavenDB Enterprise.

Chris Marisic

Apr 5, 2012, 12:29:56 PM
to rav...@googlegroups.com, Jonathan Matheus
Is there a target date for that yet?

Oren Eini (Ayende Rahien)

Apr 5, 2012, 12:31:25 PM
to rav...@googlegroups.com, Jonathan Matheus
A couple of months from now.

Matt Johnson

Apr 5, 2012, 12:56:40 PM
to ravendb
I'll be looking forward to this also. I'm not directly affected by
HIPAA, but 201 CMR 17 also has some encryption requirements. I don't
think they're as strict as HIPAA, but better safe than sorry.
Document encryption, or even field-level encryption would suffice for
most of it. I'm a bit concerned about indexes being in plaintext.

Maybe they would work for simple scenarios, like if you were
encrypting social security numbers and you needed an index of them so
you could look up a person by their ssn. I could just build an index
of hash values and use that. But if I needed to encrypt their names
and I wanted a fulltext search with the "suggested results" feature, I
don't see how that could work if the fields were encrypted.
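
The hash-index idea can be sketched roughly as follows. This is only an illustration in Python, not RavenDB code: `INDEX_KEY`, `normalize_ssn`, `ssn_index_value`, `find_by_ssn` and the in-memory `index` dictionary are hypothetical names, and a keyed hash (HMAC-SHA256) is swapped in for a bare hash on the assumption that nine-digit SSNs would otherwise be easy to brute-force.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret for the index; assumed to be stored apart from the data.
INDEX_KEY = secrets.token_bytes(32)

def normalize_ssn(ssn: str) -> str:
    """Keep only ASCII digits so '123-45-6789' and '123 45 6789' match."""
    return "".join(ch for ch in ssn if ch in "0123456789")

def ssn_index_value(ssn: str, key: bytes = INDEX_KEY) -> str:
    """Return a keyed hash to store in the index instead of the SSN itself."""
    digits = normalize_ssn(ssn).encode("ascii")
    return hmac.new(key, digits, hashlib.sha256).hexdigest()

# Toy index: keyed hash -> document id (the documents would be encrypted elsewhere).
index = {
    ssn_index_value("123-45-6789"): "people/1",
    ssn_index_value("987-65-4321"): "people/2",
}

def find_by_ssn(ssn: str):
    """Exact-match lookup: hash the query the same way, then look it up."""
    return index.get(ssn_index_value(ssn))
```

This only supports exact-match lookups. It does nothing for full-text search or suggestions, which is the limitation described above.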

-Matt

On Apr 5, 9:31 am, "Oren Eini (Ayende Rahien)" <aye...@ayende.com>
wrote:
> couple of months from now
>
> On Thu, Apr 5, 2012 at 7:29 PM, Chris Marisic <ch...@marisic.com> wrote:
> > Is there a target date for that yet?
>
> > On Thursday, April 5, 2012 12:28:31 PM UTC-4, Oren Eini wrote:
>
> >> Okay, that means that it would have to wait until we complete the
> >> encryption bundle for RavenDB enterprise.
>
> >> On Thu, Apr 5, 2012 at 6:45 PM, Chris Marisic <ch...@marisic.com> wrote:
>
> >>> Indexes have to be protected as the information HIPAA requires protected
> >>> is some of the most common information staff will want to search on to find
> >>> users of the system.
>
> >>> On Thursday, April 5, 2012 11:36:46 AM UTC-4, Oren Eini wrote:
>
> >>>> Would something like this suffice?
> >>>> http://daniellang.net/document-level-encryption-in-ravendb/

Oren Eini (Ayende Rahien)

Apr 5, 2012, 12:57:50 PM
to rav...@googlegroups.com
Matt,
The reason we don't offer an encryption bundle yet is that we are also working on encrypting indexes.

Hermano Cabral

Apr 5, 2012, 1:01:23 PM
to rav...@googlegroups.com
Any estimates on the performance drop of the encryption bundle?

Oren Eini (Ayende Rahien)

Apr 5, 2012, 1:03:08 PM
to rav...@googlegroups.com
Not until we have it in testing.
I wouldn't expect it to be too much. We already have caching available there to handle most of this, and CPU is cheap compared to IO.

Chris Marisic

Apr 5, 2012, 1:09:54 PM
to rav...@googlegroups.com
If everything is encrypted, the search terms would get encrypted too, but I assume that becomes a nightmare for dealing with case sensitivity?

Oren Eini (Ayende Rahien)

Apr 5, 2012, 1:15:49 PM
to rav...@googlegroups.com
No, encrypting for the index will be done at the file level, not the term level.
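
For illustration only, here is what the case-sensitivity worry would look like if terms were protected one at a time. This is a Python sketch and not how RavenDB does it (per the reply above, index encryption happens at the file level); `KEY`, `term_token` and `normalized_token` are hypothetical names, and a keyed hash stands in for whatever per-term transform one might pick.

```python
import hashlib
import hmac

# Hypothetical demo key; a real key would not live in source code.
KEY = b"demo-key-for-illustration-only"

def term_token(term: str, key: bytes = KEY) -> str:
    """Opaque per-term token: any change in the input changes the whole token."""
    return hmac.new(key, term.encode("utf-8"), hashlib.sha256).hexdigest()

def normalized_token(term: str, key: bytes = KEY) -> str:
    """Case-fold before tokenizing so 'Smith' and 'SMITH' share one token."""
    return term_token(term.casefold(), key)

# Without normalization, differently cased queries miss each other.
case_sensitive_match = term_token("Smith") == term_token("smith")

# With the same normalization at index time and query time, they match.
normalized_match = normalized_token("Smith") == normalized_token("SMITH")

# Prefix queries still fail: the token for "Smi" is unrelated to the one for "Smith".
prefix_match = normalized_token("Smith").startswith(normalized_token("Smi"))
```

Doing the protection at the file level sidesteps all of this, because the index terms are only ever compared after decryption.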

Itamar Syn-Hershko

Apr 6, 2012, 5:09:51 AM
to rav...@googlegroups.com, Jonathan Matheus
Out of sheer curiosity: those standards require you to encrypt everything that is on disk, but don't require anything of in-memory caches?

Oren Eini (Ayende Rahien)

Apr 6, 2012, 6:10:03 AM
to rav...@googlegroups.com, Jonathan Matheus
That is the crazy part, because from my point of view, if it is in plain text in memory, it can get to the page file, or we can just trigger a dump and read it from memory.

Itamar Syn-Hershko

Apr 6, 2012, 6:26:48 AM
to rav...@googlegroups.com, Jonathan Matheus
Exactly

Oren Eini (Ayende Rahien)

Apr 6, 2012, 6:28:30 AM
to rav...@googlegroups.com, Jonathan Matheus
That is the major issue with regulations, and with having to obey them.
You aren't trying to avoid leaking info; you are trying to be compliant with the regs, regardless of whether they make sense.

Itamar Syn-Hershko

Apr 6, 2012, 6:36:20 AM
to rav...@googlegroups.com, Jonathan Matheus
Even if you encrypt the data that is stored in memory, the asymmetric key will always be available on that machine one way or another. How do you get past that?