host-proof hosting vs. instant data liberation


Michiel de Jong

Aug 5, 2011, 8:10:53 AM
to unhosted
In the original Proof-of-Concept that this whole project started with, there was a lot of emphasis on end-to-end encryption, as a way to accomplish host-proof hosting.

Since then, the idea of instant data liberation (using more than one application on the same data) has entered.

Ever since, the two goals have been competing with each other, because you cannot encrypt data in one app, and then use it in another app. So what's more important, host-proof hosting, or instant data liberation?

One option to combine the two concepts, is to introduce three layers instead of two: app (pluggable), identity (a trusted host where encryption keys are stored), storage (encrypted). This is possible, but becomes less plausible. If you have a server where you host your identity (e.g. a WordPress, your indieweb homestead), then why not install ownCloud or similar on that server, and host your data yourself? well, 'yourself', relatively, because it would still be on a hosting account.

Another view that has been discussed is that this indieweb unhosted storage could implement our protocol for small files, including your encryption keys, and link to one or more secondary storage hosts for (chunks of) bigger files.
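
To make this concrete, here is a minimal Python sketch of how a small-file host could hold a manifest that links to chunks on secondary storage hosts. Everything in it is an assumption for illustration: the manifest layout, the function names `split_and_store` and `reassemble`, the `put`/`fetch` callbacks, and the example host URLs are hypothetical and not part of the current protocol. Chunks are treated as opaque bytes; in the scheme described above they would be encrypted before upload.

```python
import hashlib


def split_and_store(blob, hosts, put, chunk_size=1024):
    """Split blob into chunks, hand each to put(), and return a small manifest."""
    chunks = []
    for index, start in enumerate(range(0, len(blob), chunk_size)):
        piece = blob[start:start + chunk_size]
        digest = hashlib.sha256(piece).hexdigest()
        host = hosts[index % len(hosts)]  # spread chunks round-robin over hosts
        put(host, digest, piece)
        chunks.append({"host": host, "sha256": digest})
    return {"length": len(blob), "chunks": chunks}


def reassemble(manifest, fetch):
    """Fetch every chunk named in the manifest and verify it before use."""
    out = bytearray()
    for entry in manifest["chunks"]:
        piece = fetch(entry["host"], entry["sha256"])
        if hashlib.sha256(piece).hexdigest() != entry["sha256"]:
            raise ValueError("chunk from %s failed its integrity check" % entry["host"])
        out += piece
    if len(out) != manifest["length"]:
        raise ValueError("reassembled size does not match the manifest")
    return bytes(out)


# In-memory stand-in for the (hypothetical) secondary storage hosts.
secondary = {}
manifest = split_and_store(
    b"hello unhosted world",
    ["https://blobs-a.example", "https://blobs-b.example"],
    lambda host, digest, piece: secondary.__setitem__((host, digest), piece),
    chunk_size=4,
)
restored = reassemble(manifest, lambda host, digest: secondary[(host, digest)])
```

The manifest itself stays small, so it could live next to the encryption keys on the primary host, while only the bulky chunks go to the secondary hosts.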

So how to do the encryption is a bit of an open question still. In the meantime, there's the question of what we can achieve without using encryption.

Encryption is hard, and makes the system more susceptible to catastrophe. If there's a bug in the code, and encrypted blobs are publicly available, or at least available to people who are not allowed to decrypt them, then this would endanger all data of all users that have been using the buggy code. If you don't use encryption, and use a trusted server instead to store all your data, then the consequences of a bug would likely be solved by patching the host that runs the buggy code currently.

So how can we liberate users if not by encryption?

Two ways:
- providing choice. as a power user, i can choose who i trust, instead of having to trust specifically the company who offers a website that i want to use my data on
- pushing for user-readable terms of service and/or a 'green label' or five-star rating on different scales, i.e. something that makes this choice accessible to non-power users.

If we get rid of end-to-end encryption, we'll need access control on the storage protocol. From what people have been saying, this would also be a lot more developer-friendly, anyway. Multiple people (among them Daniel, Timothee, and Edwin, I think?) never really agreed with this model of leaving the encrypted data public for all to see. Even if end-to-end encryption works, it might not be the model that developers can easily work with without having to think extra about it. And developer experience is probably almost as important for us as end user experience.
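
A minimal sketch of what access control on the storage side could look like, in Python. It is only an illustration under stated assumptions: the `Storage` class, the idea of one token per app limited to a single category, and the permission names `read` and `write` are all hypothetical and not taken from the current protocol.

```python
import secrets


class Storage:
    """Toy per-user store where every request must carry a token."""

    def __init__(self):
        self._data = {}    # (category, key) -> value
        self._grants = {}  # token -> (category, frozenset of permissions)

    def grant(self, category, permissions):
        """The user hands an app a token limited to one category."""
        token = secrets.token_urlsafe(16)
        self._grants[token] = (category, frozenset(permissions))
        return token

    def _check(self, token, category, permission):
        granted_category, permissions = self._grants.get(token, (None, frozenset()))
        if granted_category != category or permission not in permissions:
            raise PermissionError("token does not allow %s on %s" % (permission, category))

    def put(self, token, category, key, value):
        self._check(token, category, "write")
        self._data[(category, key)] = value

    def get(self, token, category, key):
        self._check(token, category, "read")
        return self._data[(category, key)]


store = Storage()
calendar_app = store.grant("calendar", {"read", "write"})
viewer_app = store.grant("calendar", {"read"})
contacts_app = store.grant("contacts", {"read", "write"})

store.put(calendar_app, "calendar", "2011-08-05", "mailing list discussion")
```

With this shape, two different apps can work on the same calendar data, while an app that was only given access to contacts gets a `PermissionError` when it asks for calendar entries.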

Encryption /sounds/ good, and probably a lot of our twitter followers who are not necessarily programmers, but rather general proponents of freedom, will have associated us with cool projects that use encryption in impressive ways. So we would probably disappoint/confuse/lose that part of our audience a little bit if we stop talking so much about cryptography. Everything else being equal, having an audience is important to help adoption. But if there's a choice between being an interesting project and being a useful project, I think we should obviously switch to being more useful, even if that makes us less interesting. Also, I think the important public we should try to sound interesting to, should be the "html5" or "jsconf" audience. Web developers. They will probably care (or if not, then probably should care) a lot more about freeing a user's data by adding instant data liberation, and building interoperable apps on top of unhosted data, than about host-proof hosting.

The status of the above text is: braindump. So please react and contribute. I hope I'm not saying any strange things here, and there might be important aspects that I haven't thought about here.


Cheers!
Michiel

Bjarni Rúnar Einarsson

Aug 5, 2011, 2:49:06 PM
to unho...@googlegroups.com
Hey Michiel,

Interesting thoughts! Some comments follow in-line.

On Fri, Aug 5, 2011 at 12:10 PM, Michiel de Jong <mic...@unhosted.org> wrote:
> In the original Proof-of-Concept that this whole project started with, there was a lot of emphasis on end-to-end encryption, as a way to accomplish host-proof hosting.
>
> Since then, the idea of instant data liberation (using more than one application on the same data) has entered.

I actually think this is a really important concept - there is, as always and as we hopefully learned from the Windows desktop monopoly, a huge risk to user freedom from data lock-in and proprietary formats.

The Unhosted architecture can really do very little to prevent such things (effectively closed-source Unhosted apps will be able to create obfuscated data formats), but we certainly don't want to make the mistake of making such lock-in a *mandatory* feature of the system due to how it uses cryptography.

I want to be able to switch calendar apps and editors without losing my data.  If I can't, I don't really consider a platform viable, except maybe as a toy.

Encrypting all data also means you prevent the data storage server from helping out with processing of the data.  For some apps this does not matter, for others it is critical.  So the decisions made here will in some sense determine what kind of apps can be made "unhosted" and what kind can't.  This isn't necessarily bad (in some senses it is the point, after all), but it's worth keeping in mind.

> One option to combine the two concepts, is to introduce three layers instead of two: app (pluggable), identity (a trusted host where encryption keys are stored), storage (encrypted). This is possible, but becomes less plausible. If you have a server where you host your identity (e.g. a WordPress, your indieweb homestead), then why not install ownCloud or similar on that server, and host your data yourself? well, 'yourself', relatively, because it would still be on a hosting account.

This has always been my vision for PageKite, and is part of the vision of the FreedomBox as well. :-)  Self-hosting makes many privacy and security issues moot.

The main problems have to do with availability and reliability - things like backups and not losing everything when hardware malfunctions.  These have proven to be hard problems, which is why the cloud has so much appeal. One of the advantages of encrypting the data as you originally envisioned is that it can be more widely replicated, and strangers can help each other with backups and availability without sacrificing privacy.

> So how to do the encryption is a bit of an open question still. In the meantime, there's the question of what we can achieve without using encryption.
>
> Encryption is hard, and makes the system more susceptible to catastrophe. If there's a bug in the code, and encrypted blobs are publicly available, or at least available to people who are not allowed to decrypt them, then this would endanger all data of all users that have been using the buggy code. If you don't use encryption, and use a trusted server instead to store all your data, then the consequences of a bug would likely be solved by patching the host that runs the buggy code currently.

Another failure mode is when you lose the encryption keys, which is equivalent to losing all data at once.  Related to this is the problem of bit rot: if a bit gets flipped in an encryption key or an encrypted file, all subsequent bits usually become unreadable, whereas if a bit gets flipped in a plaintext file, most of the contents can usually be recovered.
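
The difference can be illustrated with a small Python sketch. Note the assumptions: `toy_xor` is a deliberately simple keystream cipher built from SHA-256 purely for this demonstration (it is not a cipher anyone should use, and not what Unhosted uses), and the message and key are made up. The sketch only models a damaged key versus damaged plaintext; how far damage inside the ciphertext itself spreads depends on the cipher and mode, which is not modelled here.

```python
import hashlib


def keystream(key, length):
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_xor(key, data):
    """XOR data with the keystream; applying it twice with the same key decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def flip_bit(data, bit_index):
    """Return a copy of data with a single bit inverted, simulating bit rot."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


def differing_bytes(a, b):
    return sum(x != y for x, y in zip(a, b))


message = b"Meeting on Friday at 10:00, bring the FreedomBox."
key = b"0123456789abcdef"

# Bit rot in a plaintext file: exactly one byte is damaged.
plain_damage = differing_bytes(message, flip_bit(message, 3))

# Bit rot in the key: decryption with the damaged key yields garbage throughout.
ciphertext = toy_xor(key, message)
recovered = toy_xor(flip_bit(key, 3), ciphertext)
key_damage = differing_bytes(message, recovered)
```

Here `plain_damage` is a single byte, while `key_damage` covers nearly the whole message, which is the "losing all data at once" failure mode.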

So encryption of data at rest is absolutely not without significant costs and some new risks.

On the other hand, encryption of data in transit (SSL etc.) is relatively well understood and solved, and there are few good reasons not to encourage and use it as much as possible.
 
> Two ways:
> - providing choice. as a power user, i can choose who i trust, instead of having to trust specifically the company who offers a website that i want to use my data on
> - pushing for user-readable terms of service and/or a 'green label' or five-star rating on different scales, i.e. something that makes this choice accessible to non-power users.

I like both of these ideas a lot.

I don't really have any concrete answers (I probably just added to the confusion, sorry), but I like the fact that you're discussing and revisiting these things.  Encryption is like any other tool, it should be used when appropriate and with full awareness of its limitations and costs.

--
Bjarni R. Einarsson
Founder, lead developer of PageKite.

Make localhost servers visible to the world: http://pagekite.net/

Thad Guidry

Aug 5, 2011, 3:17:42 PM
to unho...@googlegroups.com
With regard to access control and encryption, here is my personal take, as it stands and has stood for over 30 years in computing...

In the Enterprise, the convention of http://en.wikipedia.org/wiki/Principle_of_Least_Authority is widely used.  Tahoe-LAFS implements this convention.  Windows Active Directory, mostly out of the box, also implements this philosophy fairly well, as does user space in Linux for the most part.  Why reinvent a wheel for your cart when a sled (or someone else's sled) will do just fine?

If I were a user, who was given a choice, I would be the user who picks a vendor, service, or partner that understands and utilizes the POLA convention heavily. (If I were a user that was worried about risks).

"Special 2 month offer for new users only, that subscribe to our WeStoreYourStuff.org FliksMixPics service that uses POLA technology with our Partner Base of Apps."

kevin.granade

Aug 5, 2011, 4:29:31 PM
to unhosted
Replies inline.

On Aug 5, 7:10 am, Michiel de Jong <mich...@unhosted.org> wrote:
> In the original Proof-of-Concept that this whole project started with, there
> was a lot of emphasis on end-to-end encryption, as a way to accomplish
> host-proof hosting.
>
> Since then, the idea of instant data liberation (using more than one
> application on the same data) has entered.
>
> Ever since, the two goals have been competing with each other, because you
> cannot encrypt data in one app, and then use it in another app. So what's
> more important, host-proof hosting, or instant data liberation?

I might be missing something obvious, but you most certainly CAN
encrypt data in one app, and then use it in another app. An app
obviously needs to know the algorithm used to encrypt the data (which
is non-sensitive, can appear in a plaintext header), and the key used
to encrypt it, so what you are missing is a mechanism for key
exchange. I'm not saying this is easy, but it is surely not
intractable.
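
The plaintext-header idea can be sketched in Python as follows. The envelope layout (a 4-byte length, a JSON header, then the ciphertext), the field names `alg` and `key_id`, and the functions `pack` and `unpack` are assumptions made up for this example, not an existing Unhosted format. No encryption happens here; the ciphertext is just opaque bytes.

```python
import json
import struct


def pack(alg, key_id, ciphertext):
    """Prefix opaque ciphertext with a non-sensitive, plaintext header."""
    header = json.dumps({"alg": alg, "key_id": key_id}, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(header)) + header + ciphertext


def unpack(blob):
    """Read the header back without needing any key."""
    (header_len,) = struct.unpack(">I", blob[:4])
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    return header, blob[4 + header_len:]


# App one writes the blob; app two only needs the header to know what to ask for.
blob = pack("AES-256-GCM", "calendar-2011", b"\x00\x01opaque-bytes")
header, body = unpack(blob)
```

A second app reading this blob learns which algorithm was used and which key it has to obtain, so the only missing piece is the key exchange itself.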

>
> One option to combine the two concepts, is to introduce three layers instead
> of two: app (pluggable), identity (a trusted host where encryption keys are
> stored), storage (encrypted). This is possible, but becomes less plausible.
> If you have a server where you host your identity (e.g. a WordPress, your
> indieweb homestead), then why not install ownCloud or similar on that
> server, and host your data yourself? well, 'yourself', relatively, because
> it would still be on a hosting account.

If you are encrypting the data at all, the keys certainly need to be
stored somewhere trusted already! Obviously the traditional trusted
storage location of "on my computer" is out due to your insistence on
not maintaining state in the browser (which I'm not criticizing, I
agree). So there must be a remote location (perhaps self-hosted, but
not necessarily on your person) that is 1. trusted and 2. can
authenticate you. Wherever that trusted location is, it can hand out
keys to apps as you direct in order to access the stored data.
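
As a rough Python sketch of such a trusted location: the `KeyHost` class, its method names, and the example app origins are hypothetical, and user authentication is assumed to have happened before `authorize` or `revoke` is called.

```python
import secrets


class KeyHost:
    """Toy trusted host that releases keys only to apps the user approved."""

    def __init__(self):
        self._keys = {}           # key_id -> raw key bytes
        self._authorized = set()  # (app_origin, key_id) pairs the user approved

    def create_key(self, key_id):
        self._keys[key_id] = secrets.token_bytes(32)

    def authorize(self, app_origin, key_id):
        """Called on behalf of the (already authenticated) user."""
        if key_id not in self._keys:
            raise KeyError(key_id)
        self._authorized.add((app_origin, key_id))

    def revoke(self, app_origin, key_id):
        self._authorized.discard((app_origin, key_id))

    def key_for(self, app_origin, key_id):
        """Called by an app that wants to read or write the stored data."""
        if (app_origin, key_id) not in self._authorized:
            raise PermissionError("%s may not use key %s" % (app_origin, key_id))
        return self._keys[key_id]


host = KeyHost()
host.create_key("calendar")
host.authorize("https://calendar-one.example", "calendar")
host.authorize("https://calendar-two.example", "calendar")
```

Both authorized apps receive the same key, so data encrypted by one can be used by the other, while any other origin is refused.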

>
> Another view that has been discussed is that this indieweb unhosted storage
> could implement our protocol for small files, including your encryption
> keys, and link to one or more secondary storage hosts for (chunks of) bigger
> files.

I think this is more or less what I'm thinking, the keys have to live
somewhere. In practice, if you trust your storage provider, you can
store the keys there or leave the data unencrypted. Storing the keys
at the host and still performing the encryption lets you migrate the
data to an untrusted host if necessary (as a backup for example).

>
> So how to do the encryption is a bit of an open question still. In the
> meantime, there's the question of what we can achieve without using
> encryption.
>
> Encryption is hard, and makes the system more susceptible to catastrophe. If
> there's a bug in the code, and encrypted blobs are publicly available, or at
> least available to people who are not allowed to decrypt them, then this
> would endanger all data of all users that have been using the buggy code.

Encryption IS hard, but I was assuming you would be leveraging well-
known encryption technologies so this would be much less of an issue.
Regardless of how much you trust your host to not be abusive, there is
also a question of how much you can trust them to be secure (both from
hacking and business takeover). You might absolutely trust the
management of some host, only to have them encounter financial or life
difficulty, and your data is suddenly managed by different management,
who you do not trust. (this obviously applies to the trusted key host
as well, but is perhaps more amenable to being self-hosted or cheaply
hosted, such that this sort of thing is less of a risk)

> If
> you don't use encryption, and use a trusted server instead to store all your
> data, then the consequences of a bug would likely be solved by patching the
> host that runs the buggy code currently.

Er, the consequences of SOME bugs. You certainly can't retroactively
fix a data leak.

>
> So how can we liberate users if not by encryption?
>
> Two ways:
> - providing choice. as a power user, i can choose who i trust, instead of
> having to trust specifically the company who offers a website that i want to
> use my data on
> - pushing for user-readable terms of service and/or a 'green label' or
> five-star rating on different scales, i.e. something that makes this choice
> accessible to non-power users.
>
> If we get rid of end-to-end encryption, we'll need access control on the
> storage protocol. From what people have been saying, this would also be a
> lot more developer-friendly, anyway. Multiple people (among them Daniel,
> Timothee, and Edwin, I think?) never really agreed with this model of
> leaving the encrypted data public for all to see. Even if end-to-end
> encryption works, it might not be the model that developers can easily work
> with without having to think extra about it. And developer experience is
> probably almost as important for us as end user experience.
>

You might be thinking of this at a lower level than I am. I tend to
assume the application developers will be using either part of the
framework or perhaps importing a library/module to perform encryption/
decryption, it's not something application developers should be
messing with in general.

> Encryption /sounds/ good, and probably a lot of our twitter followers who
> are not necessarily programmers, but rather general proponents of freedom,
> will have associated us with cool projects that use encryption in impressive
> ways. So we would probably disappoint/confuse/lose that part of our audience
> a little bit if we stop talking so much about cryptography. Everything else
> being equal, having an audience is important to help adoption. But if
> there's a choice between being an interesting project and being a useful
> project, I think we should obviously switch to being more useful, even if
> that makes us less interesting. Also, I think the important public we should
> try to sound interesting to, should be the "html5" or "jsconf" audience. Web
> developers. They will probably care (or if not, then probably should care) a
> lot more about freeing a user's data by adding instant data liberation, and
> building interoperable apps on top of unhosted data, than about host-proof
> hosting.
>
> The status of the above text is: braindump. So please react and
> contribute. I hope I'm not saying any strange things here, and there might
> be important aspects that I haven't thought about here.

You are saying strange things, but they are strange things that need
to be said.

Kevin Granade

>
> Cheers!
> Michiel

Daniel Lidstrom

Aug 5, 2011, 5:21:26 PM
to unho...@googlegroups.com
> Encryption /sounds/ good, and probably a lot of our twitter followers who are not necessarily programmers, but rather general proponents of freedom, will have associated us with cool projects that use encryption in impressive ways. So we would probably disappoint/confuse/lose that part of our audience a little bit if we stop talking so much about cryptography. Everything else being equal, having an audience is important to help adoption. But if there's a choice between being an interesting project and being a useful project, I think we should obviously switch to being more useful, even if that makes us less interesting. Also, I think the important public we should try to sound interesting to, should be the "html5" or "jsconf" audience. Web developers. They will probably care (or if not, then probably should care) a lot more about freeing a user's data by adding instant data liberation, and building interoperable apps on top of unhosted data, than about host-proof hosting.

Would it simplify setting up a plug server to serve all your separate web-app data?  The FreedomBox guys would like that.

Daniel Lidstrom

Aug 5, 2011, 5:31:09 PM
to unho...@googlegroups.com
> Would it simplify setting up a plug server to serve all your separate web-app data?  The FreedomBox guys would like that.

I forgot to add: because encryption doesn't matter here.

...

What I'm trying to say is that if the job is made easier for the FreedomBox guys, then they'll have an easier time lowering the barrier to people serving their own data.

Edwin Chu

Aug 7, 2011, 5:01:12 AM
to unho...@googlegroups.com
Hi

I have to admit that I was attracted by the end-to-end encryption
property of Unhosted at first. I thought the cool guys behind Unhosted
had invented some clever mechanism to encrypt data on the client side
securely without adding a burden to the user experience. However, I
soon found that this innovative mechanism does not exist yet, and what
we have now is an imperfect solution.

Why is the current one imperfect? The problem mainly lies in storing
and sharing the secret key among devices securely. By secure, I mean
that the key is accessible only to authorized persons and trusted
entities, and also that the risk of losing the key is low. To many
users, losing the key (and hence instantly losing all their data) is
no less a disaster than some unknown person seeing their personal
photos.

It is possible to have both encryption and instant data liberation,
but we couldn't do it without sacrificing usability and simplicity. As
Michiel mentioned, we can introduce a third layer to the Unhosted
architecture to satisfy both end-to-end encryption and instant data
liberation: app, identity, and storage. But adding another service
requires the user to make more decisions (finding and choosing the
service providers) and makes it harder to understand the relationship
between service providers (which the user ought to know in order to
choose the providers wisely). Also, this architecture assumes that the
user authenticates to the identity and storage nodes independently and
that the encrypted data in storage is inaccessible to the identity
node. If this assumption does not hold, there is little reason to
separate identity and storage, because in essence we are allowing the
identity node to access all data. Giving users a false sense of
security may even be worse than disclosing the fact that using a bad
storage provider with Unhosted carries risks, and telling users to
choose their service providers wisely.

You may argue that users could store their keys on a trusted host and
their encrypted data on an untrusted host. But for most average users,
if they can trust some host to store their private key, why not put
all their data on that trusted host? It is much easier for users and
developers if fewer services are involved. In reality, because users
are more willing to pay for storage than for someone who simply stores
a key, it is more likely that I could find a storage provider that I
trust (in terms of privacy and continuity of service).

Encrypting the data does not make backups easier, because you still
have to back up your key. If you lose the key, all the backups become
rubbish.

My opinion is that, until we find a perfect end-to-end encryption
mechanism which is both secure and easy to use, we should make
encryption an optional feature of Unhosted software. Serious users who
want to encrypt their data may tolerate the need for a separate
passphrase to unlock their key, or even use a USB drive to carry their
key along.

I think introducing standardized terms of service (simple, like
Creative Commons) or some kind of labeling/rating system would help
users a lot in choosing their service provider. It could also be part
of the campaign to promote Unhosted software. We could label/rate the
popular cloud services to educate the public about how bad their ToS
are, similar to Greenpeace's Guide to Greener Electronics
http://www.greenpeace.org/international/en/campaigns/toxics/electronics/Guide-to-Greener-Electronics/.

I share your worry that it may sound less "interesting" without
end-to-end encryption as a core part of Unhosted, as I am one of the
people attracted by it :). But I believe that making the project
useful is a much more important goal, and Unhosted is no less
interesting without end-to-end encryption. Unhosted is still about
"freeing" the cloud. I think the "freedom fighters" following Unhosted
are smart people who would understand us.

Edwin

Michiel de Jong

Aug 8, 2011, 8:49:40 AM
to unho...@googlegroups.com
Hi! Thanks all for your replies. I think we can agree that some of us are in favour of moving towards a more unencrypted approach, and some are not.

It made me realise that the end-to-end encryption is not just 'interesting', it is actually part of the spirit of the project. And that's important to preserve, because our peer group will be built upon that spirit. And that peer group (like drinkers of fair trade coffee) will drive adoption. In fact, I now see that we could also experiment with even more advanced technologies that have been named several times on this list, like data storage on .p2p and .onion hosts (since we ajax from javascript, we could maybe implement a custom pre-ajax DNS lookup).

So I think we should cater for both the 'web' school and the 'geek' school (I use the term geek in the positive sense here). We could make the protocol usable without encryption, but also keep thinking about how we can implement end-to-end encryption. If we decide that end-to-end encryption requires a freedombox, then so be it.

So maybe first hit an unencrypted host, which might be your freedombox, and then if requested, from there hit a host-proof host for bigger data blobs.

It's true that you will not take away the need for good backup of your freedombox. But at least you take away the need for high capacity, both on the freedombox CPU and on your home uplink.

So at this point we would try to cover points 2 and 3 of what would sort of become a four-step vision:

1) data stored per-user instead of per-app
2) user controls location of their data (instant data liberation)
3) only user sees the data (host-proof hosting)
4) p2p routing experiments

Step 1 would be web apps that use an architecture with external per-user storage, but offer only a limited choice of compatible storage providers (for instance only dropbox or only github). They are well on their way to reaching point 2, so we should applaud them, but our goals are points 2 and 3.

This is different from the way I've been explaining the project lately, which was a focus almost entirely on point 2, and it's also different from how the current version of the protocol is designed, which is exclusively for use as per point 3. So my proposal is that going forward, we simply keep both objectives in mind. Is anyone against that?


Cheers!
Michiel