Data licenses

Danny Ayers

Jan 4, 2008, 8:18:04 AM
I think the good Mr. Scoble's recent experience with Facebook offers
some good pointers to areas which need work. One which seems prominent
is data ownership and licensing. Here are some sketchy thoughts on
the matter.

The starting point I would suggest is "I own my data". This would
correspond more or less to the default copyright on documents - even
if you don't say anything explicit on something you write, you have
the copyright.

What happens when we sign up for a service is we allow that party
access to (some parts of) our own data, currently usually by filling
in forms. When we connect to friends within social networking systems,
we allow them access to (some parts of) our own data. In both cases
this seems an implicit licensing of that data for subsequent use.
However, not everyone sees things that way.

Dare Obasanjo [1] draws a distinction between information exposed on
the service's web pages and that exposed through the API. While the
quality of the data may differ significantly, I'd suggest that in
terms of licensing this distinction is bogus. If I can copy & paste
from one app to another, that can have the same net result as scraping.
As Paul Downey put it [2], good web APIs are just web sites.

A more extreme view can be found in a comment on Scoble's blog [3] -
" stole my personal details...". While this seems a kneejerk
reaction, it's clear how such a perception could arise.

Right now the service providers generally allow connection with a
vague "he my friend", and bury any details deep within their Terms of
Service. But if the terms of the connection were made explicit, not
only for signup with the service, but with every connection event, any
ambiguity would be removed. Hence:

"Robert is my friend...I'd like to grant him access to my data"

Which leads onto the question of what form such a license might take.
Many of the options are already visible in copyright and software
licenses, though I don't believe (m)any are directly suitable for use
with data. The difficulties arrive with data derived from the original
data - along the lines of software extension and modification, but
twistier (e.g. attention profiles which might only contain derived
data, but couldn't exist without the original).

Anyhow, possible examples would be:

1. open license - anyone can use my data (with/without attribution)
2. reciprocal open license - anyone can use my data, but whatever they
use it with must also be exposed under this license
3. trust license - the person to whom I license this data may use it
as they please
4. silo license - the person to whom I license this data may use it as
they please within the local system

1. is likely to be impractical in the context of social networking
sites without fine granularity of data access - e.g. I'm happy for my
name and homepage to be associated together in public, but would
rather my email and geographic address are restricted. Long term I
believe we will need this.

2. is ideal for Open Data, in fact this is essentially what the new
Open Data Commons [4] license looks like (disclosure - I work for
Talis, the company who got together with Creative Commons/Science
Commons to produce this license). But the copyleft nature of the
license probably wouldn't appeal to many social networking services
who see their data garden as business value.

3. seems naive, but I can't think of a better way of approaching things

4. would in effect be a formalisation of the current Facebook position

So I think it might be worth considering what 3 & 4 might look like in
more detail.
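
To make that a little more concrete, here is a minimal sketch in Python of how the four license types above might be recorded per field and checked at the point of use. Everything in it is an assumption for illustration: the names License, Grant, may_use and license_for_derived are hypothetical, and nothing here corresponds to an existing service API or to any legal text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class License(Enum):
    OPEN = "open"              # 1. anyone can use the data
    RECIPROCAL = "reciprocal"  # 2. open, but whatever it is used with carries the same license
    TRUST = "trust"            # 3. the named licensee may use it as they please
    SILO = "silo"              # 4. as TRUST, but only within the local system


@dataclass(frozen=True)
class Grant:
    """One licensing decision about one field of one person's data (hypothetical model)."""
    owner: str
    field: str                      # e.g. "name", "email"
    license: License
    licensee: Optional[str] = None  # unused for OPEN / RECIPROCAL
    system: Optional[str] = None    # only meaningful for SILO


def may_use(grant: Grant, who: str, where: str) -> bool:
    """Return True if `who` may use the granted field inside system `where`."""
    if grant.license in (License.OPEN, License.RECIPROCAL):
        return True
    if who != grant.licensee:
        return False
    if grant.license is License.SILO:
        return where == grant.system
    return True  # TRUST: no restriction on where the licensee uses it


def license_for_derived(grant: Grant) -> Optional[License]:
    """Only the reciprocal license dictates terms for derived data.

    None means this sketch leaves the question open, which is exactly
    the twisty part mentioned above.
    """
    return License.RECIPROCAL if grant.license is License.RECIPROCAL else None


email = Grant("danny", "email", License.SILO, licensee="robert", system="facebook")
print(may_use(email, "robert", "facebook"))  # True
print(may_use(email, "robert", "plaxo"))     # False
```

Keeping one Grant per field is what gives the fine granularity: name and homepage could be OPEN while email stays under a SILO or TRUST grant.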

Please bear in mind with the above that IANAL, but then again I doubt
many lawyers have a particularly sophisticated view of data. This
stuff may be best driven by the folks getting their hands dirty.





Chris Saad

Jan 4, 2008, 8:23:12 AM
to DataPortability Workgroup
Danny it's very interesting that you raise this - Ian
Forrester and I were just discussing it on Skype and he was
talking about creating some sort of "DP Reference Policy" which would
act as a sort of boilerplate legal document (or at least a
philosophical framework for a legal document) that services could
adopt right alongside the technical DP Reference Design.

At a high level, it might also be based on the broader documents that
have gone around such as the Social Bill of Rights and/or Media 2.0
Best Practices.

Your post here could very well act as a great starting point!


Elias Bizannes

Jan 5, 2008, 2:43:44 PM
to DataPortability Workgroup

Chris Saad

Jan 5, 2008, 7:43:03 PM
to DataPortability Workgroup
Danny - I started this page for you :)

Would love you to re-word your post as an outline there...

It seems to be coming up a lot atm, so seems really timely.

I also linked it from the front page for you


Danny Ayers

Jan 6, 2008, 1:16:44 PM
Er...ok, cool!!

(thanks too, Elias)


Danny Ayers

Jan 6, 2008, 3:17:10 PM
PS. I pasted my email over on the Talis "Nodalities" blog, along with
some afterthoughts and a plea for feedback...


Chris Saad

Jan 6, 2008, 8:19:24 PM
to DataPortability Workgroup
Great Danny - I added it into the DP RSS feed

Look forward to seeing how the policy develops!

I think Ian Forrester may have some great input too - he has some
experience from the BBC in this too I believe.


Eran Hammer-Lahav

Jan 6, 2008, 10:30:36 PM
to DataPortability Workgroup
This is very interesting.

I think there are two key questions when trying to address this topic.
The first is: what is this data people are so emotional about, and
care so much about keeping private? The second: how much of this is
coming out of an American culture of identity that might not be
applicable to other cultures?

If the Plaxo script only took Scoble's friends' names and birthdays,
would people be so pissed off? What about just their school
affiliation? I think the scars of dealing with spam, having to change
email addresses every couple of months, and just wasting time have made
people scared of their email being harvested. But the reality is, spam
is no longer such a big problem. Most people who complain about email
overload talk about valid email from people they trust, just too much
of it. In a world with no spam, would we be looking for other
identifiers than email?

When we say "share our data", email is the first thing that pops into
people's minds and that tends to frame the conversation. Most of the
people who posted against Scoble's stealing their data, publish all
their contacts on their blog, emails, groups, etc. They hand out their
cards with full information on them like candy.

So answering the first question, what is this data we are trying to
protect, is key. Why? Because while many think this should be an
abstract exercise that can fit any data we have now or might have in
the future, for most of what we have today, we already have pretty
good solutions. As Danny mentioned, we have software licenses and
copyrights where the law can help define rights. We have many years of
dealing with file systems to define read/write/execute access rights
for resources. So listing the kind of data we want to protect is key.
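
As a rough illustration of borrowing the file-system model, here is a sketch in Python that lists a few kinds of profile data and gives each a Unix-style mode per audience. All of it is assumed for the example: the field names, the owner/friends/others audiences (by analogy with owner/group/other), and the RESHARE bit, which stands in for "execute" since data isn't run.

```python
from enum import Flag


class Access(Flag):
    NONE = 0
    READ = 1
    WRITE = 2
    RESHARE = 4  # assumption: permission to pass the data on to another service


FULL = Access.READ | Access.WRITE | Access.RESHARE

# Hypothetical listing of the data we want to protect, one mode per audience.
PROFILE_MODES = {
    "name":     {"owner": FULL, "friends": Access.READ | Access.RESHARE, "others": Access.READ},
    "birthday": {"owner": FULL, "friends": Access.READ, "others": Access.NONE},
    "email":    {"owner": FULL, "friends": Access.READ, "others": Access.NONE},
}


def allowed(field: str, audience: str, wanted: Access) -> bool:
    """True if every bit in `wanted` is set for this field and audience."""
    mode = PROFILE_MODES[field].get(audience, Access.NONE)
    return (mode & wanted) == wanted


print(allowed("email", "friends", Access.READ))                   # True
print(allowed("email", "friends", Access.READ | Access.RESHARE))  # False
print(allowed("email", "others", Access.READ))                    # False
```

In this picture a friend running an export script against your email address is asking for READ plus RESHARE, which is a different request from just seeing it on a profile page.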

The second question is what's behind the emotional reaction to "losing
one's privacy" and how is privacy defined. I think our conversation is
framed by an American point of view of privacy that is very much
unique. In Israel, your friends will ask you how much you make, people
on the street will ask you how much you paid for something, and if you
don't answer, you might be considered rude. And at the same time, this
is the same country where I can find out exactly how much you paid for
your house, internet or no internet. People post photos of their kids,
sometimes wearing very little, on their family website, and they don't
worry about it because their site gets like 2 hits a month, so for
them it is private. But when they post the same photos on Flickr,
where they can be searched, they freak out that perverts might do bad
things with them.

Some of this is just basic anti-government American culture, and some
comes from the way the credit history system is set up where there is
no strong identity provided by the government, making it easier to
steal your identity, hence making information about you very valuable.
Most accounts you have are happy to verify your identity on the phone
with your address and mother's maiden name. In 20 years, mom's maiden
name will be a quick Facebook lookup away, because your mom and dad
met on a social network.

I think the 'why' people want to keep their data private, and the
'what' this data is, are critical to our solution of 'how' to
partition it and protect it. Copyrights work because the government
enforces them. Being banned from Facebook doesn't seem to be a strong
enough reason not to run scripts if the data is of value.

My approach has always been to ask, what *can't* we do today because
we are afraid of losing our privacy. Social networks are successful
because at the end, most people just don't care if you can find their
birthday on Google or see their drunken photos. They might live to
regret it but it will never stop them from doing it in the first
place. But there is plenty that we would like to do but can't. To me,
this is the interesting part.

So back to Danny's post, what kind of framework would actually enable
new kinds of services that today are not possible?

> [1]
> [2]
> [3]

Chris Saad

Jan 6, 2008, 10:43:07 PM
to DataPortability Workgroup
Eran that's a great set of questions and background. At the risk of
killing the conversation with a pithy comment - Phil and I were
saying on Skype the other day that while the issue of 'Privacy'
always fires the imagination - some of the debate may be a little

If I email you, you get my email address.

If I friend you, you get my email address.

Big deal.

Also, as I always say, Privacy is just a subset of the broader theme
of Control. And DP is all about control - far more than FB gives you.

And the subset conversation of Privacy needs to move on... to quote
Chris Messina:

"Our notions of privacy must adapt. That starts by developing the
language to discuss privacy in a way that's obvious and salient. Simply
demanding the protection of one's privacy is now a hollow and
unrealistic demand. Our systems should protect our privacy by allowing
us to fine-tune the trust we put in our friends."

For the life of me I can't find the link to the original post he made
- but the gist was that we need to start talking about permissions,
trust, brokering etc - rather than just 'Privacy'

Also consider: if you send anything to someone's computer, then
they have it, forever. Sure you can cut it off at the source, but you
can't stop people from copying digital bits.

Why can't the same be true for a website/social network?

Anything else is just not practical or necessary.


Eran Hammer-Lahav

Jan 7, 2008, 9:03:13 AM
to DataPortability Workgroup
Personally I have no problem with that, but a lot of it has to evolve
organically in order for both users and companies to support it.
People hate DRM but it was absolutely necessary. The music industry had
to try it out and see that it doesn't work to feel better about
trusting people. To try and build technology to protect your data is
just like DRM, only I hope people will learn from that experience and
not expect us to waste another 10 years building it then throwing it

But at the same time, just because you tell them "all is lost" when it
comes to privacy, doesn't mean they are going to adjust and just let
go. There are still no compelling reasons for people to want all this
data portability. Yes, I agree it will allow for great things but
until you show them to the non-geek community, the other 98% of
Facebook users, I don't expect much to change.


Elias Bizannes

Jan 7, 2008, 11:37:26 AM
to DataPortability Workgroup
EHL - good analysis there.

And of course privacy is a form of control. Subset maybe, but what
else is a sibling of the subset? I think privacy is a pretty broad

Privacy, in my eyes, is person specific. My two best friends don't
want to reveal their salaries, whereas my third friend openly admits
his 300k sales salary. The point about data like this is that
everyone has their own interpretation. I am slowly getting pedantic
about giving out my email address, because by controlling it, I
prevent spam. Others don't care. It is interesting you mention email
addresses, because when I researched this whole issue a few years
back, I noticed all the major sites seem to exclude e-mail as
personally identifiable information in their terms of service (and
which I recently had a whinge about re Facebook [1]).

To me, privacy is the right to control
1) who sees 2) what data about you, 3) when you want them to. I am
more than happy to share my birth date with, say, members of the
workgroup. With a woman in a nightclub, I wouldn't as freely, because
it's an automatic determinant of your status before you even get to
flirt to build rapport. People have their own bias as to how they share
their information, and just like they have the freedom to spend their
money as freely as possible, so let them.
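
For what it's worth, the who / what / when definition maps quite directly onto a small rule table. The sketch below is only an illustration under assumptions of my own: the Rule and visible names, the audience labels and the dates are all made up, and anything not explicitly shared is treated as private.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    audience: str                          # 1) who sees
    field: str                             # 2) what data
    not_before: Optional[datetime] = None  # 3) when: start of window, None = open
    not_after: Optional[datetime] = None   #    end of window, None = open


def visible(rules: List[Rule], audience: str, field: str, at: datetime) -> bool:
    """Default deny: a field is visible only if some rule covers who, what and when."""
    for rule in rules:
        if rule.audience != audience or rule.field != field:
            continue
        after_start = rule.not_before is None or at >= rule.not_before
        before_end = rule.not_after is None or at < rule.not_after
        if after_start and before_end:
            return True
    return False


# Each person writes their own rules; nothing is standardised for everyone.
my_rules = [
    Rule("workgroup", "birth_date"),
    Rule("workgroup", "email", not_after=datetime(2009, 1, 1)),
]

now = datetime(2008, 1, 7)
print(visible(my_rules, "workgroup", "birth_date", now))  # True
print(visible(my_rules, "nightclub", "birth_date", now))  # False
```

The rule list belongs to the individual, so two people can hold opposite views on the same field without the system having to decide which of them is right.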

So when we talk about privacy, let's not standardise what information
we deem as private or not for the entire population. It's not just
impossible to get it right, but the wrong approach.
