The starting point I would suggest is "I own my data". This would
correspond more or less to the default copyright on documents - even
if you don't say anything explicit on something you write, you have
What happens when we sign up for a service is we allow that party
access to (some parts of) our own data, currently usually by filling
in forms. When we connect to friends within social networking systems,
we likewise allow them access to (some parts of) our own data. In both cases
this seems an implicit licensing of that data for subsequent use.
However, not everyone sees things that way.
Dare Obasanjo draws a distinction between information exposed on
the service's web pages and that exposed through the API. While the
quality of the data may differ significantly, I'd suggest that in
terms of licensing this distinction is bogus. If I can copy & paste
from one app to another, that can have the same net result as scraping.
As Paul Downey put it, good web APIs are just web sites.
A more extreme view can be found in a comment on Scoble's blog -
"...you stole my personal details...". While this seems a knee-jerk
reaction, it's clear how such a perception could arise.
Right now the service providers generally allow connection with a
vague "he my friend", and bury any details deep within their Terms of
Service. But if the terms of the connection were made explicit, not
only for signup with the service, but with every connection event, any
ambiguity would be removed. Hence:
"Robert is my friend...I'd like to grant him access to my data"
Which leads on to the question of what form such a license might take.
Many of the options are already visible in copyright and software
licenses, though I don't believe (m)any are directly suitable for use
with data. The difficulties arise with data derived from the original
data - along the lines of software extension and modification, but
twistier (e.g. attention profiles which might only contain derived
data, but couldn't exist without the original).
Anyhow, possible examples would be:
1. open license - anyone can use my data (with/without attribution)
2. reciprocal open license - anyone can use my data, but whatever they
use it with must also be exposed under this license
3. trust license - the person to whom I license this data may use it
as they please
4. silo license - the person to whom I license this data may use it as
they please within the local system
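
To make the four options a little more concrete, here is a minimal
sketch in Python of how a service might represent them and check a
proposed use. Everything in it is my own assumption for illustration:
the names License, Grant and may_use, the idea that "the local system"
can be identified by a plain string, and the reading of option 2 as
"the reuse must carry the same license". It says nothing about derived
data, which is the twisty part.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class License(Enum):
    """The four hypothetical license types from the list above."""
    OPEN = "open"
    RECIPROCAL_OPEN = "reciprocal-open"
    TRUST = "trust"
    SILO = "silo"


@dataclass(frozen=True)
class Grant:
    """One explicit licensing event, e.g. made when a connection is created."""
    owner: str
    license: License
    licensee: Optional[str] = None  # unused for the two open licenses
    system: Optional[str] = None    # only meaningful for SILO


def may_use(grant, user, system, reuse_license=None):
    """Return True if `user` may use the granted data inside `system`.

    `reuse_license` is the license under which the user would expose
    whatever they combine the data with (only checked for option 2).
    """
    if grant.license is License.OPEN:
        return True
    if grant.license is License.RECIPROCAL_OPEN:
        return reuse_license is License.RECIPROCAL_OPEN
    if user != grant.licensee:
        return False
    if grant.license is License.TRUST:
        return True
    # SILO: the licensee may use the data, but only within the local system.
    return system == grant.system


# "Robert is my friend...I'd like to grant him access to my data"
silo_grant = Grant(owner="me", license=License.SILO,
                   licensee="robert", system="site-a")
inside = may_use(silo_grant, "robert", "site-a")   # use within the silo
outside = may_use(silo_grant, "robert", "site-b")  # export to another site
```

With a silo grant, the same friend gets a different answer depending on
where the data would end up, which is the distinction the scraping
argument turns on.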
1. is likely to be impractical in the context of social networking
sites without fine granularity of data access - e.g. I'm happy for my
name and homepage to be associated together in public, but would
rather my email and geographic address were restricted. Long term I
believe we will need that granularity.
2. is ideal for Open Data; in fact this is essentially what the new
Open Data Commons license looks like (disclosure - I work for
Talis, the company that got together with Creative Commons/Science
Commons to produce this license). But the copyleft nature of the
license probably wouldn't appeal to many social networking services
who see their data garden as business value.
3. seems naive, but I can't think of a better way of approaching things
4. would in effect be a formalisation of the current Facebook position
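
The fine granularity mentioned under option 1 could be sketched along
the following lines. This is only an illustration under my own
assumptions: the field names, the PUBLIC/RESTRICTED labels, the
example.org values and the visible_fields helper are all hypothetical,
and a real system would need something richer than two levels.

```python
PUBLIC = "public"
RESTRICTED = "restricted"

# Hypothetical profile data and per-field visibility settings.
profile = {
    "name": "A. Person",
    "homepage": "http://example.org/",
    "email": "person@example.org",
    "address": "1 Example Street",
}
visibility = {
    "name": PUBLIC,
    "homepage": PUBLIC,
    "email": RESTRICTED,
    "address": RESTRICTED,
}


def visible_fields(profile, visibility, requester, granted):
    """Return the subset of `profile` that `requester` may see.

    `granted` is the set of parties the owner has explicitly licensed
    to see restricted fields. Fields with no visibility setting are
    treated as restricted, so nothing leaks by omission.
    """
    return {
        field: value
        for field, value in profile.items()
        if visibility.get(field, RESTRICTED) == PUBLIC or requester in granted
    }


public_view = visible_fields(profile, visibility, "stranger", {"robert"})
friend_view = visible_fields(profile, visibility, "robert", {"robert"})
```

Here a stranger sees only the name and homepage, while a party I have
explicitly granted access to sees the whole profile.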
So I think it might be worth considering what 3 & 4 might look like in
Please bear in mind with the above that IANAL, but then again I doubt
many lawyers have a particularly sophisticated view of data. This
stuff may be best driven by the folks getting their hands dirty.
(thanks too, Elias)