Difference between storing social data and getting social data to store it?

victorc

Mar 17, 2008, 6:21:04 PM

to DataPortability.Public.General

Hi,

Just a quick question I have in mind.
Some social networks (like Facebook and MySpace) explain in their TOS that
a web site cannot retrieve and store social data (name and picture, for
example) for more than 24h, for example; but how to deal with this
constraint if a social network user wants to take his social data with
him and store it somewhere else?
Is it possible? Not legal? Are there some thoughts around this in the
DataPortability community?
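
To make the 24h constraint concrete, here is a minimal sketch of what it means on the storing side. It assumes a hypothetical in-memory cache; the class name, the field names and the 24-hour figure are illustrative only and are not taken from any network's actual API or TOS.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 24-hour limit mentioned above; real TOS limits vary by network.
MAX_AGE = timedelta(hours=24)


class ExpiringProfileCache:
    """Hypothetical cache: keeps fetched profile fields and forgets them after max_age."""

    def __init__(self, max_age=MAX_AGE):
        self.max_age = max_age
        self._entries = {}  # user_id -> (fetched_at, profile dict)

    def put(self, user_id, profile, now=None):
        now = now or datetime.now(timezone.utc)
        self._entries[user_id] = (now, dict(profile))

    def get(self, user_id, now=None):
        now = now or datetime.now(timezone.utc)
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        fetched_at, profile = entry
        if now - fetched_at > self.max_age:
            # Too old to keep: drop it, so the caller has to re-fetch from the network.
            del self._entries[user_id]
            return None
        return profile
```

A store like this is only a short-lived copy: once the entry expires, the data has to come from the original network again, which is exactly why it does not help a user who wants to leave with his data.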

Thanks,
Victor
http://www.osocial.net/network

Improbulus

Mar 17, 2008, 6:58:22 PM

to DataPortability.Public.General

I think this is one of the key issues for the Policy Group to work
out. The TOS/licences of sites that claim to be data-portable need to
be truly compatible with that concept, in terms of what they allow you
to do or not do with your own data (e.g. let other sites you use and
permit have access to that data).

Another related issue, perhaps more important in some ways from a
control/privacy viewpoint, is what they say in their TOS they're
entitled to do with your data (e.g., in some cases, they say they have
the right to use all the info you add to their site in any of their
other businesses/sites as they wish!).

Imp
http://www.consumingexperience.com/

Julian Bond

Mar 18, 2008, 3:14:56 AM

to dataportabi...@googlegroups.com

Improbulus <impro...@gmail.com> Mon, 17 Mar 2008 15:58:22

>Another related issue, perhaps more important in some ways from a
>control/privacy viewpoint, is what they say in their TOS they're
>entitled to do with your data (e.g., in some cases, they say they have
>the right to use all the info you add to their site in any of their
>other businesses/sites as they wish!).

I really don't mind this as long as it's two way. And it's why I hedged
round it in the Philosophy section here.
http://wiki.dataportability.org/pages/viewpage.action?pageId=1114398

IMHO, both the customer and the site owner have some ownership and
copyright on content posted on the site. If the site wants the permanent
right to do what ever it likes with it, then the customer ought really
to have the same right. Where it gets murky is content that is generated
as a result of community usage rather than direct input. And where
"your" data was generated by somebody else. To make that clearer, how
much right do I have to your profile data because you friended me?

My overriding view though is that if data is shown publicly in HTML then
I should have the right to scrape it or collect it in machine readable
form. And normal copyright should apply where I can do what I like with
it up to the point where I republish. Then things like fair use kick in.
The catch here of course is that almost nothing in Facebook is shown
publicly in HTML. It's virtually all behind the login wall.

--
Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173
Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433
Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat
Regional Office Use Only

Improbulus

Mar 18, 2008, 7:34:41 PM

to DataPortability.Public.General

On Mar 18, 7:14 am, Julian Bond <julian_b...@voidstar.com> wrote:
> Improbulus <improbu...@gmail.com> Mon, 17 Mar 2008 15:58:22

>
> >Another related issue, perhaps more important in some ways from a
> >control/privacy viewpoint, is what they say in their TOS they're
> >entitled to do with your data (e.g., in some cases, they say they have
> >the right to use all the info you add to their site in any of their
> >other businesses/sites as they wish!).
>
> I really don't mind this as long as it's two way. And it's why I hedged
> round it in the Philosophy section here. http://wiki.dataportability.org/pages/viewpage.action?pageId=1114398

Could you please clarify what you mean by "two way" - in what sense?

> IMHO, both the customer and the site owner have some ownership and
> copyright on content posted on the site. If the site wants the permanent
> right to do what ever it likes with it, then the customer ought really
> to have the same right. Where it gets murky is content that is generated
> as a result of community usage rather than direct input. And where
> "your" data was generated by somebody else. To make that clearer, how
> much right do I have to your profile data because you friended me?

There is of course a distinction between "ought to be" and "is".
Although I'm no copyright expert (and copyright laws differ from
country to country), I believe that in their TOS a number of sites try
to claim total ownership of any content you post to their site, to the
exclusion of the poster. If you have agreed to their TOS and you post
the content, then your content is theirs, not yours, and you can't
even re-post it elsewhere without their permission. Whether that bit
of their TOS can be challenged as unfair depends on the country
concerned. That's the "is".

I agree completely that the "ought to be" is that you ought to be able
to retain rights to the original content that you post, in that you
should be able to use it elsewhere yourself - even if the site you've
posted it on has the right to display your content where you
originally posted it. If that's the limit of what you mean by their
ownership, that's fine.

But what if the site wants to reuse your content in a totally
different context from what you (and they) originally envisaged when
you posted it? Example: Blogger Play - you post pics of your kids on
your personal blog; but Google then produce a slideshow stream of all
pics uploaded to Blogger, that anyone who gets the widget can see -
would you be happy with that re-use? (more on that at http://tinyurl.com/24uhec).

And what if the site says you can use your content yourself privately,
but claims the rights in their TOS to stop you posting your own
content, that you originally posted with them, on a competitor site
(because you've given up all copyright to them by agreeing to their
TOS)?

There are questions about whether the DataPortability movement can, or
indeed should, define the limits of what rights a site can claim and
allow its users to retain/have, before the site can call itself
"DataPortable".

In the context of what you've said in the wiki under "Philosophy", I
guess the question is whether "ownership" can or should be defined,
and to what extent "ownership" rights of site and user can overlap or
co-exist.

And I haven't even tried to start thinking about the murky bits you
mention yet, except to think that yes, I agree, they're murky! Would
you please elaborate on what sort of content you have in mind that's
generated by community usage? An example perhaps?

On your profile data example, I think that's actually one of the
easier ones. If you befriend me I should only have the right to view
your profile data, or whatever profile data you've chosen to show to
the class of "friend" I'm in, on the site you befriended me on. If I
choose to leave that service I don't think I should have the right to
copy your profile data and take it with me; though maybe I should be
able to have my new service request access to your profile data on the
old service. If you've opted not to allow other sites to access your
profile data, I ought to be forced to respect that; it was my choice
to move to another service, and it's your profile data not mine. Of
course if you yourself choose to leave that service, you should be
able to take your profile data with you - but that's an issue between
you and that service as to its data portability; just because you can
do that as regards your own profile data, doesn't mean I should be
able to do that as between you and me.

> My overriding view though is that if data is shown publicly in HTML then
> I should have the right to scrape it or collect it in machine readable
> form. And normal copyright should apply where I can do what I like with
> it up to the point where I republish. Then things like fair use kick in.
> The catch here of course is that almost nothing in Facebook is shown
> publicly in HTML. It's virtually all behind the login wall.

Again this is partly an issue of "is", "ought to be" - and perhaps
also "can".

Just because you CAN do something, doesn't mean you have the legal
right to: just because you have the technical means to scrape data
publicly shown in HTML, it doesn't mean you are legally entitled to do
so, or that you should do it.

In fact, applying normal copyright rules, in the UK at least (other
countries may be different), strictly, if I publish stuff on my blog,
it's automatically my copyright, even though it's public. You may
physically be able to scrape it easily, but that ability in itself
doesn't give you the right to do it - not unless I give you
permission. (I'd be the first to say I'm no copyright expert, but I
did a detailed write up of an excellent talk I went to on copyright,
trademark etc in the UK as it affects internet users and website
owners, and the author - a UK IP lawyer - confirmed its accuracy, see
http://tinyurl.com/334byt).

It's the concept of fair use which allows (limited) copying/scraping,
and again it's not in fact called "fair use" in the UK, that's a US
concept - in the UK it's "fair dealing", which is much narrower in
terms of what it allows. The government are in fact currently
consulting on widening it, and about time too.

So, I don't think we can try to change or unify copyright laws in
different countries - we have to live with what they are (though more
users responding to the current consultation at http://tinyurl.com/2odgmm
by 8 April could help make the UK laws that bit more digital-
friendly!) - but we can perhaps focus on what the TOS or licensing
terms of sites ought to be, so that they're compatible with the goals
of DataPortability, but also are realistically within the limits of
what's feasible under local laws.

Brady Brim-DeForest

Mar 18, 2008, 8:45:07 PM

to dataportabi...@googlegroups.com

Perhaps the Policy group could investigate a boiler plate TOS clause
that we could offer to vendors that wish to support DataPortability?
Similar in form to a CreativeCommons license?

This could be a great deliverable.

-Brady

Julian Bond

Mar 19, 2008, 3:58:39 AM

to dataportabi...@googlegroups.com

Improbulus <impro...@gmail.com> Tue, 18 Mar 2008 16:34:41

>> IMHO, both the customer and the site owner have some ownership and
>> copyright on content posted on the site. If the site wants the permanent
>> right to do what ever it likes with it, then the customer ought really
>> to have the same right. Where it gets murky is content that is generated
>> as a result of community usage rather than direct input. And where
>> "your" data was generated by somebody else. To make that clearer, how
>> much right do I have to your profile data because you friended me?
>
>There is of course a distinction between "ought to be" and "is".
>Although I'm no copyright expert (and copyright laws differ from
>country to country), I believe that in their TOS a number of sites try
>to claim total ownership of any content you post to their site, to the
>exclusion of the poster. If you have agreed to their TOS and you post
>the content, then your content is theirs, not yours, and you can't
>even re-post it elsewhere without their permission. Whether that bit
>of their TOS can be challenged as unfair depends on the country
>concerned. That's the "is".

DataPortability is all about "Ought to be".
- Regardless of where I post my content, I ought to retain rights to it
and be allowed to post it elsewhere.
- The site I post it to ought to have some rights too. So for instance
if they produced a book of selected bits of content posted on their site
they shouldn't have to get permission from each person in turn.

There are obvious exceptions to this. For instance where people are
being paid to write.

>On your profile data example, I think that's actually one of the
>easier ones. If you befriend me I should only have the right to view
>your profile data, or whatever profile data you've chosen to show to
>the class of "friend" I'm in, on the site you befriended me on.

Note here. Some sites (eg LinkedIn) make a benefit of allowing this. I
can download a CSV of my contacts containing their profile data. Or at
least the profile data they allow me to see. If this data is publicly
viewable, then I don't see the problem and I think this should be
encouraged. And if we're going to talk about exporting social graph data
then at the very least we need an identifier and a URL of a more
complete profile. If that more complete profile is publicly viewable,
within limits set by its owner, then I don't see the problem.
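
As a rough illustration of that kind of export, here is a minimal sketch, assuming made-up contact records and a made-up per-contact "visible" set standing in for whatever limits the owner has chosen. It is not LinkedIn's actual export format. Every row carries an identifier and a profile URL, and any field the owner has not exposed is left blank.

```python
import csv
import io

# Hypothetical contact records; "visible" lists the fields each owner lets me see.
CONTACTS = [
    {"id": "julian", "profile_url": "http://example.com/people/julian",
     "name": "Julian", "email": "julian@example.com", "visible": {"name"}},
    {"id": "imp", "profile_url": "http://example.com/people/imp",
     "name": "Imp", "email": "imp@example.com", "visible": {"name", "email"}},
]

OPTIONAL_FIELDS = ["name", "email"]


def export_contacts_csv(contacts):
    """Return CSV text: always id + profile_url, plus only the permitted fields."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["id", "profile_url"] + OPTIONAL_FIELDS,
        lineterminator="\n",
    )
    writer.writeheader()
    for c in contacts:
        row = {"id": c["id"], "profile_url": c["profile_url"]}
        for field in OPTIONAL_FIELDS:
            # Leave the cell empty when the owner has not exposed this field.
            row[field] = c[field] if field in c["visible"] else ""
        writer.writerow(row)
    return out.getvalue()
```

Calling export_contacts_csv(CONTACTS) gives one header line and one line per contact; the importing site can follow profile_url to whatever fuller profile the owner has chosen to make public.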

>> My overriding view though is that if data is shown publicly in HTML then
>> I should have the right to scrape it or collect it in machine readable
>> form. And normal copyright should apply where I can do what I like with
>> it up to the point where I republish. Then things like fair use kick in.
>> The catch here of course is that almost nothing in Facebook is shown
>> publicly in HTML. It's virtually all behind the login wall.
>
>Again this is partly an issue of "is", "ought to be" - and perhaps
>also "can".
>
>Just because you CAN do something, doesn't mean you have the legal
>right to: just because you have the technical means to scrape data
>publicly shown in HTML, it doesn't mean you are legally entitled to do
>so, or that you should do it.

Try and stop me. My browser is scraping html. It's not the scraping that
is the issue, it's the doing something with it after. Scraping doesn't
become a problem until you republish. Barring when it starts looking
like a DoS attack. After all, generally, we allow search engines to
scrape our sites.

Note here that Facebook is the exception that raised the issue. Almost
all other sites have a public non-member, logged out view. Anything
visible like that ought to be accessible and available for private use.
I ought to be able to use it for public use but managing permissions
here can be tricky. We don't typically have a problem displaying RSS in
aggregators. We do have a problem with sites that simply duplicate the
data and redisplay it with their own ads. Usually the touchstone is
whether the data has been transformed in some way.

>So, I don't think we can try to change or unify copyright laws in
>different countries

I don't think copyright law is the place for DataPortability to focus
for all the reasons you give. Much, much easier to focus on getting the
sites and individuals to be rational.

ShaiD...@gmail.com

Mar 19, 2008, 11:57:08 AM

to DataPortability.Public.General

On Mar 18, 7:14 am, Julian Bond <julian_b...@voidstar.com> wrote:

> Improbulus <improbu...@gmail.com> Mon, 17 Mar 2008 15:58:22

> IMHO, both the customer and the site owner have some ownership and
> copyright on content posted on the site. If the site wants the permanent
> right to do what ever it likes with it, then the customer ought really
> to have the same right. Where it gets murky is content that is generated
> as a result of community usage rather than direct input.

I commented on this a while ago
http://shaidorsai.wordpress.com/2008/01/11/trust-openid-vrm-data-portablity-and-how-does-it-hang-together/
and while I have no clear conclusions, it feels like community content
belongs to, and is for use by that community alone.

> And where
> "your" data was generated by somebody else. To make that clearer, how
> much right do I have to your profile data because you friended me?

This is one of the things I find hardest.
If I say "Julian is my friend" that's my assertion.
I can take my assertion where I like, and wherever I can find you.

If I say "Julian is my friend", and you reply with "Yes, Steve's my
friend" (hey, or even confirm it in your FOAF file) that's now *our*
data.

tbh, unless you have in some way tokenised the data ["only for
Facebook purposes"] I think I *should* be able to take our data with
me.

> My overriding view though is that if data is shown publicly in HTML then
> I should have the right to scrape it or collect it in machine readable
> form. And normal copyright should apply where I can do what I like with
> it up to the point where I republish. Then things like fair use kick in.

as mentioned elsewhere, fair use and fair dealing depend upon where
you are.
Of course, if the data is shown in HTML, you're pretty likely to be
collecting it.
If you don't publish it... who would be likely to know? If you're
using it to populate a "friends" list in another social network... in
my view that would be publishing it.

> The catch here of course is that almost nothing in Facebook is shown
> publicly in HTML. It's virtually all behind the login wall.

Yep, which is why we have to look under the HTML

Improbulus

Mar 19, 2008, 4:51:09 PM

to dataportabi...@googlegroups.com

Indeed, as you say DP/DaPo is very much about "ought to be". My point is that realistically the exact specifics and limits of the
things that "ought to be" should be considered within, and should take account of, real life constraints: what is possible/permitted
under the laws of the relevant countries, in this case initially mainly the USA, UK and Europe I suspect. I agree with you that
copyright law is not the place DP should focus on, but it can't be ignored either.

Your LinkedIn example is fine, if everyone knows that if they join LinkedIn their profile data will be easily exportable, then
that's part of the deal. But not if they didn't. Perhaps there is also a transparency/awareness point here, i.e. that sites should
be explaining clearly upfront what could happen to your data, what you can do with it, what they can do with it, what other users of
that site will be able to do with it. And sites which want to be able to claim they're DataPortable must display a clear explanation
of that nature when people first sign up.

Again, trying to make clearer the distinction between "physically can" and "legally allowed", take the example of ripping CDs to
MP3. Many people can do it, and they do it. But in the UK, technically that's illegal, though hopefully that'll change soon. And I
believe it's the act of ripping itself that's illegal, even if you don't copy it to your MP3 player (the analogy with
scraping/republishing). Even if in practice people get away with it and don't get caught ("can" do it, and also "can" get away with
it!), it's still illegal. So it may be too with HTML (and, more commonly, feed) scraping. Search engine indexing isn't necessarily
legal just because it's technically possible, in fact that's what Google got sued on in (I think) Belgium, and Google lost.

So when you say "Anything visible like that ought to be accessible and available for private use", that may be an "ought to be" -
but try to tell any corporate site that all of their publicly accessible pages should be legally available for private use so people
can reproduce them all they like, and there will be problems! Defining the details and limits of what ought to be accessible and
what ought not to be, and how, for sites which want to sport a DP/DaPo badge, is one of the issues that needs to be worked out.

Mark Neely

Mar 19, 2008, 6:20:16 PM

to dataportabi...@googlegroups.com

Hi all,

Surely the key objective in any data portability standard is to make
*explicit* the implicit assumptions we - both individuals and businesses -
all operate under?

I have little doubt that less than 10% of service users actually read TOS
and Privacy documents. I suspect many businesses use this fact to their
advantage, and bury highly objectionable terms deep within these documents.
Nonetheless, individuals cannot claim ignorance - they know all these
services come with limitations and conditions, and they ought to inform
themselves of their terms.

Any data portability standard must have as a primary goal the requirement to
disclose upfront, in plain language (i.e. non-legalese) who has what rights
over which data, and when and how those rights may be exercised. That, to
me, is what data portability is about - removing any ambiguity. The
underlying technology is merely an enabler.

Disclosure (or, at least, simplified discovery of data rights/restrictions)
is key because, as I mentioned, we all make assumptions when we sign up for
services that store/hold personal data (e.g. MySpace, LinkedIn etc.). Many
of these assumptions are implicit (i.e. we don't actively make those
assumptions when we're signing up, they're just part of our 'world view'
with respect to how things are). My assumptions - implicit and explicit -
will differ from yours, which in turn will be different from the next person
etc.

Data portability, then, is about allowing everyone to make informed choices.
Once a standard for this kind of disclosure exists, we will start seeing
some normalising behaviour, because individuals will be able to make better
informed decisions about which services to use, and the risks v. reward
associated with doing so.

This will result in a form of 'flight to (e)quality' - consumers will, in
the exercise of their judgement, choose those services that provide the
optimum balance of risk v. reward, which will create significant pressure
for those who have an imbalance between risk v. reward to alter their
approach to data collection, storage and use.

Thoughts?

Mark
-----
Mark Neely
Master Strategist
Infolution Pty Ltd

e: m...@infolution.com.au
m: +61 (0)412 0417 29
skype: mark.neely

Read my blogs --> www.infolution.com.au
www.neelyready.com
Connect on LinkedIn --> www.linkedin.com/in/markneely

Elias Bizannes

Mar 19, 2008, 11:53:53 PM

to DataPortability.Public.General

+1 Mark: That's an excellent point.

DataPortability, as a brand, represents some fundamental things. A
user can associate those concepts by virtue of the brand and its
representation, as opposed to the current practice where they blindly
accept a ToS.
