--
To post to this group, send email to zfs-...@googlegroups.com
To visit our Web site, click on http://zfs-fuse.net/
Always the subtle analysis guy :) You might want to think that over just
once (especially regarding the encryption bit)
...
Encrypted content is not supposed to be 'known', so by definition it cannot be deduped before encryption. And it wouldn't exactly make sense to dedup a block against an identical block that already exists in the encrypted case, because

(a) the block will (possibly) have been written with another encryption key;
(b) it would be possible to detect the presence of a file in encrypted space (by simply adding the same file elsewhere and seeing whether it gets deduplicated). This would be quite the security hole in, say, VPS environments... not to mention the privacy implications.

Similarly, would you be willing to 'dedup' against an existing compressed block even if the existing block used another compression algorithm than the current 'active' filesystem mandates?

Also, would you need to rewrite the dedup master blocks to be compressed as soon as a single write is done in compression mode? How would this behave in the case of dedup=on (or more specific) and copies=5? Would all copies have to have the same compression, and have to be rewritten?
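Point (b), the presence-confirmation side channel, can be made concrete with a toy content-addressed store. This is a minimal sketch (all names here are hypothetical illustrations, not ZFS code): if dedup is visible across tenants in any way, writing a candidate block and watching whether space usage grows tells you whether that block already exists in the pool.

```python
import hashlib

class DedupStore:
    """Toy content-addressed block store: identical blocks are stored once."""
    def __init__(self):
        self.blocks = {}  # sha256 digest -> block data

    def write(self, data: bytes) -> bool:
        """Store a block; return True if it deduplicated (already present)."""
        digest = hashlib.sha256(data).digest()
        existed = digest in self.blocks
        self.blocks[digest] = data
        return existed

    def used_blocks(self) -> int:
        return len(self.blocks)

store = DedupStore()
store.write(b"secret config v1.2.3")   # some other tenant wrote this block

# Attacker probes for a suspected block: if the block count does not grow,
# the content must already exist somewhere in the shared pool.
before = store.used_blocks()
store.write(b"secret config v1.2.3")
leak = store.used_blocks() == before   # True -> presence confirmed
```

In a real system the signal would be indirect (free-space deltas, write latency), but the principle is the same.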
ZFS is versatile and combines several layers of the classic storage stack. This means that many more subtle combinations can arise, precisely _because_ the layers are not isolated anyway. Probably for that reason, some 'counterintuitive' restrictions do apply here and there that you would label 'stupid' at first sight.
That said, the encryption scenario is exactly the same with dm-crypt: if you ran any deduping FS on dm-crypt, obviously the encryption happens at the lowest level, i.e. _after_ deduplication. But then, it would simply be impossible to _also_ store blocks without encryption (or with different encryption) on the same volume; I prefer the zpool way of working by a mile or ten!
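The key-placement point can be illustrated directly: once identical plaintext blocks are encrypted under different keys, no content-hash dedup layer sitting below the cipher can ever match them. A sketch using a toy keystream cipher (illustration only, not real cryptography, and not how ZFS or dm-crypt actually encrypt):

```python
import hashlib

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy keystream cipher: XOR the block with a SHA-256-based stream."""
    stream = b""
    counter = 0
    while len(stream) < len(block):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(block, stream))

plain = b"identical block content"
c1 = toy_encrypt(b"tenant-A-key", plain)
c2 = toy_encrypt(b"tenant-B-key", plain)
# Same plaintext, different keys -> different ciphertexts, so a
# content-hash dedup table below the encryption layer finds no match.
```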
On 04/06/2011 11:37 AM, sgheeren wrote:
> Ok, I'll elaborate (as for the longer version of thinking about it):
> ...
Ok for encryption... But anyway, for this to work you would need to have a decrypted version of the files. In that case, encryption becomes mainly useless, since you already have the decrypted version! (Except for testing the presence of a given file, which is really a very specific situation.)
So all in all it's not a big risk.
Nope and nope, it's just for reading, not for writing.
The idea: it tries to write a compressed block, then notices the same content already exists in another form (uncompressed, or compressed with another method), and then it simply dedups against the version that already exists, that's all. No rewriting of anything is needed.
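That read-side proposal could be sketched roughly like this (hypothetical names, not the actual ZFS on-disk logic): the dedup table keys on the hash of the *uncompressed* content, so a match is found regardless of which codec stored the existing copy, and the new write just references it.

```python
import hashlib
import zlib

class Pool:
    """Sketch: dedup keys on uncompressed content, so a match is found
    even if the existing copy was stored with a different compression."""
    def __init__(self):
        self.by_content = {}  # sha256(uncompressed) -> (stored bytes, codec)

    def write(self, data: bytes, codec: str) -> bool:
        key = hashlib.sha256(data).digest()
        if key in self.by_content:
            return True       # reuse the existing block as-is; no rewrite
        stored = zlib.compress(data) if codec == "zlib" else data
        self.by_content[key] = (stored, codec)
        return False

pool = Pool()
pool.write(b"x" * 4096, codec="off")         # first copy stored uncompressed
hit = pool.write(b"x" * 4096, codec="zlib")  # dedup hit despite codec change
```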
On 04/06/2011 01:49 PM, Emmanuel Anne wrote:
> Ok for encryption... But anyway, for this to work you would need to have a decrypted version of the files. In that case, encryption becomes mainly useless, since you already have the decrypted version! (Except for testing the presence of a given file, which is really a very specific situation.)
> So all in all it's not a big risk.
I'm sure there are many, many people who would love to disagree with you here.
If my life depended on it, I wouldn't like people to be able to find out, just by 'dedup-counting', exactly which version of ssh is installed on my system... And I wouldn't exactly be thrilled to find that a malicious government was able to detect the presence of forbidden media on my encrypted volumes...
In the VPS situation, one might devise an attack to find out whether your host runs VPSes that still have the default configuration for, say, passwd, the email server, the database server, etc., and then target those hosts. Or, as with SQL injection attacks (specifically those timing error responses), it could be used to externally check whether an attack over another vector had the intended result (by checking for the existence of a resulting special block on disk).
The risk in this type of attack has been published before. For reference, I just dug up this quote from the IBM Tivoli whitepapers:
One drawback of data deduplication is that it cannot be used effectively with client-side encryption. For example, encrypted data does not deduplicate well (new extents are not likely to match extents stored on the server), and encrypting data after deduplication requires all users to share the same encryption key. However, you can secure data while using deduplication.
http://www.ibm.com/developerworks/wikis/display/tivolistoragemanager/Data+deduplication+in+Tivoli+Storage+Manager+V6.2+and+V6.1#DatadeduplicationinTivoliStorageManagerV6.2andV6.1-Security
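As an aside on that last sentence ("you can secure data while using deduplication"): one well-known way to reconcile the two is convergent encryption, where each block's key is derived from the block's own content, so identical plaintexts encrypt identically and stay dedupable. A minimal sketch, again with a toy keystream cipher (illustration only, not real cryptography):

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def convergent_encrypt(block: bytes) -> bytes:
    """Key is derived from the plaintext itself, so equal blocks encrypt
    identically and a content-hash dedup table below still matches."""
    key = hashlib.sha256(b"convergent:" + block).digest()
    return bytes(a ^ b for a, b in zip(block, keystream(key, len(block))))

c1 = convergent_encrypt(b"the same 4K block")
c2 = convergent_encrypt(b"the same 4K block")
# c1 == c2: dedup works below the encryption layer -- but note that anyone
# holding a candidate plaintext can still confirm its presence, which is
# exactly the attack discussed earlier in this thread.
```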
HTH
Encryption after deduplication forces all users to share the same encryption key??? That must be a different context; there is nothing forcing that here.
I remind you I said ok for encryption! You could make this an option, but it would probably be too much of a hassle, so if the paranoid say it's better this way, then leave it this way and avoid dedup if you can (in most situations it can be avoided anyway!).