I really don't mind this as long as it's two way. And it's why I hedged
round it in the Philosophy section here.
http://wiki.dataportability.org/pages/viewpage.action?pageId=1114398
IMHO, both the customer and the site owner have some ownership and
copyright on content posted on the site. If the site wants the permanent
right to do whatever it likes with it, then the customer ought really
to have the same right. Where it gets murky is content that is generated
as a result of community usage rather than direct input. And where
"your" data was generated by somebody else. To make that clearer, how
much right do I have to your profile data because you friended me?
My overriding view though is that if data is shown publicly in HTML then
I should have the right to scrape it or collect it in machine readable
form. And normal copyright should apply where I can do what I like with
it up to the point where I republish. Then things like fair use kick in.
The catch here of course is that almost nothing in Facebook is shown
publicly in HTML. It's virtually all behind the login wall.
--
Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173
Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433
Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat
Regional Office Use Only
This could be a great deliverable.
-Brady
DataPortability is all about "Ought to be".
- Regardless of where I post my content, I ought to retain rights to it
and be allowed to post it elsewhere.
- The site I post it to ought to have some rights too. So for instance
if they produced a book of selected bits of content posted on their site
they shouldn't have to get permission from each person in turn.
There are obvious exceptions to this. For instance where people are
being paid to write.
>On your profile data example, I think that's actually one of the
>easier ones. If you befriend me I should only have the right to view
>your profile data, or whatever profile data you've chosen to show to
>the class of "friend" I'm in, on the site you befriended me on.
Note here: some sites (e.g. LinkedIn) make a benefit of allowing this. I
can download a CSV of my contacts containing their profile data. Or at
least the profile data they allow me to see. If this data is publicly
viewable, then I don't see the problem and I think this should be
encouraged. And if we're going to talk about exporting social graph data
then at the very least we need an identifier and a URL of a more
complete profile. If that more complete profile is publicly viewable,
within limits set by its owner, then I don't see the problem.
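
To make the export idea concrete, here is a minimal sketch of a contact
export that always carries an identifier and a profile URL, and only
includes the other profile fields that each contact has chosen to make
visible. The `Contact` class, its `profile` and `public` attributes, and
the `export_contacts` function are all hypothetical names made up for
illustration; this is not how LinkedIn or any other site actually does it.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Contact:
    identifier: str                               # stable id for the contact (assumed)
    profile_url: str                              # URL of the fuller profile (assumed)
    profile: dict = field(default_factory=dict)   # all profile data the site holds
    public: set = field(default_factory=set)      # field names the owner allows to be seen


def export_contacts(contacts):
    """Return CSV text with identifier, profile_url and owner-permitted fields only."""
    # Columns are the union of fields that at least one contact has made visible.
    extra = sorted({name for c in contacts for name in c.public if name in c.profile})
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["identifier", "profile_url", *extra], lineterminator="\n"
    )
    writer.writeheader()
    for c in contacts:
        row = {"identifier": c.identifier, "profile_url": c.profile_url}
        for name in extra:
            # Leave the cell empty when this contact has not made the field visible.
            row[name] = c.profile.get(name, "") if name in c.public else ""
        writer.writerow(row)
    return buf.getvalue()
```

The design choice here is that the owner's visibility setting is checked
per contact and per field, so a field one person has made public never
leaks the same field for somebody who has not.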
>> My overriding view though is that if data is shown publicly in HTML then
>> I should have the right to scrape it or collect it in machine readable
>> form. And normal copyright should apply where I can do what I like with
>> it up to the point where I republish. Then things like fair use kick in.
>> The catch here of course is that almost nothing in Facebook is shown
>> publicly in HTML. It's virtually all behind the login wall.
>
>Again this is partly an issue of "is", "ought to be" - and perhaps
>also "can".
>
>Just because you CAN do something, doesn't mean you have the legal
>right to: just because you have the technical means to scrape data
>publicly shown in HTML, it doesn't mean you are legally entitled to do
>so, or that you should do it.
Try and stop me. My browser is scraping HTML. It's not the scraping that
is the issue, it's the doing something with it after. Scraping doesn't
become a problem until you republish. Barring when it starts looking
like a DoS attack. After all, generally, we allow search engines to
scrape our sites.
Note here that Facebook is the exception that raised the issue. Almost
all other sites have a public non-member, logged out view. Anything
visible like that ought to be accessible and available for private use.
I ought to be able to use it for public use but managing permissions
here can be tricky. We don't typically have a problem displaying RSS in
aggregators. We do have a problem with sites that simply duplicate the
data and redisplay it with their own ads. Usually the touchstone is
whether the data has been transformed in some way.
>So, I don't think we can try to change or unify copyright laws in
>different countries
I don't think copyright law is the place for DataPortability to focus
for all the reasons you give. Much, much easier to focus on getting the
sites and individuals to be rational.
Your LinkedIn example is fine: if everyone knows when they join LinkedIn that their profile data will be easily exportable, then
that's part of the deal. But not if they didn't know. Perhaps there is also a transparency/awareness point here, i.e. that sites should
explain clearly upfront what could happen to your data, what you can do with it, what they can do with it, and what other users of
that site will be able to do with it. And sites which want to be able to claim they're DataPortable must display a clear explanation
of that nature when people first sign up.
Again, trying to make clearer the distinction between "physically can" and "legally allowed", take the example of ripping CDs to
MP3. Many people can do it, and they do it. But in the UK, technically that's illegal, though hopefully that'll change soon. And I
believe it's the act of ripping itself that's illegal, even if you don't copy it to your MP3 player (the analogy with
scraping/republishing). Even if in practice people get away with it and don't get caught ("can" do it, and also "can" get away with
it!), it's still illegal. So it may be too with HTML (and, more commonly, feed) scraping. Search engine indexing isn't necessarily
legal just because it's technically possible; in fact that's what Google got sued over in (I think) Belgium, and Google lost.
So when you say "Anything visible like that ought to be accessible and available for private use", that may be an "ought to be" -
but try to tell any corporate site that all of their publicly accessible pages should be legally available for private use so people
can reproduce them all they like, and there will be problems! Defining the details and limits of what ought to be accessible and
what ought not to be, and how, for sites which want to sport a DP/DaPo badge, is one of the issues that needs to be worked out.
Surely the key objective in any data portability standard is to make
*explicit* the implicit assumptions we - both individuals and businesses -
all operate under?
I have little doubt that fewer than 10% of service users actually read TOS
and Privacy documents. I suspect many businesses use this fact to their
advantage, and bury highly objectionable terms deep within these documents.
Nonetheless, individuals cannot claim ignorance - they know all these
services come with limitations and conditions, and they ought to inform
themselves of their terms.
Any data portability standard must have as a primary goal the requirement to
disclose upfront, in plain language (i.e. non-legalese) who has what rights
over which data, and when and how those rights may be exercised. That, to
me, is what data portability is about - removing any ambiguity. The
underlying technology is merely an enabler.
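
As a rough illustration of what such a disclosure could look like if it
were also machine readable, the sketch below holds each right as a small
record (who, what action, which data, under what condition) and renders
the set as plain-language sentences. The `RightsStatement` record, its
field names and the `render` function are assumptions invented for this
example; no existing standard or site format is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RightsStatement:
    holder: str          # who holds the right, e.g. "You" or "The site" (hypothetical)
    action: str          # what they may do, e.g. "export"
    data: str            # which data the right covers
    condition: str = ""  # when or how the right may be exercised, if limited


def render(statements):
    """Turn rights records into one plain-language sentence per line."""
    lines = []
    for s in statements:
        sentence = f"{s.holder} may {s.action} {s.data}"
        if s.condition:
            sentence += f" {s.condition}"
        lines.append(sentence + ".")
    return "\n".join(lines)
```

Given one record for the user and one for the site, `render` produces
lines such as "You may export your contact list at any time." that could
be shown at sign-up, while the same records stay comparable across
services.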
Disclosure (or, at least, simplified discovery of data rights/restrictions)
is key because, as I mentioned, we all make assumptions when we sign up for
services that store/hold personal data (e.g. MySpace, LinkedIn etc.). Many
of these assumptions are implicit (i.e. we don't actively make those
assumptions when we're signing up, they're just part of our 'world view'
with respect to how things are). My assumptions - implicit and explicit -
will differ from yours, which in turn will be different from the next person
etc.
Data portability, then, is about allowing everyone to make informed choices.
Once a standard for this kind of disclosure exists, we will start seeing
some normalising behaviour, because individuals will be able to make better
informed decisions about which services to use, and the risk v. reward
associated with doing so.
This will result in a form of 'flight to (e)quality' - consumers will, in
the exercise of their judgement, choose those services that provide the
optimum balance of risk v. reward, which will create significant pressure
for those who have an imbalance between risk v. reward to alter their
approach to data collection, storage and use.
Thoughts?
Mark
-----
Mark Neely
Master Strategist
Infolution Pty Ltd
e: m...@infolution.com.au
m: +61 (0)412 0417 29
skype: mark.neely
Read my blogs --> www.infolution.com.au
www.neelyready.com
Connect on LinkedIn --> www.linkedin.com/in/markneely