schema for open data licenses


Herb Lainchbury

Jun 20, 2012, 4:17:02 PM
to opend...@googlegroups.com, Kent Mewhort, Mike Linksvayer
Hi Folks,

I am in the middle of a discussion at od-discuss about a schema for open data licenses:



I am trying to make the case that rights clearing should be part of a schema of open data licenses as it's an important concept for publishers and consumers of open data.  I have clearly exhausted my modest knowledge of copyright law but would appreciate some input.


Basically, my argument is that licenses that explicitly disclaim responsibility for rights clearing are inferior to licenses that do not, because without that disclaimer, my sense is that the publisher is responsible by default.

I could be wrong on this, but if I am, then why bother disclaiming that responsibility at all?  It seems to me that if a party attempts to license something that they have no right to license, whether they know it or not, they have some responsibility, unless they disclaim that responsibility in the license.

Thus, licenses like PDDL ( http://opendatacommons.org/licenses/pddl/1-0/ ), which do not disclaim this responsibility, are superior from a consumer perspective to licenses that do, such as CC0 ( ref 4. c.  http://creativecommons.org/publicdomain/zero/1.0/legalcode ), because as a consumer I can have some confidence that the publisher, the party most equipped to do rights clearing, has done their homework.

It seems to me this is an important quality of an open data license.


To be clear, I am a big fan of both licenses, and I am not recommending one over the other, but I think they are worth distinguishing in this way, especially when it comes to publishing open data.

Does this make sense?  

Comments?  Suggestions?

Herb

David Eaves

Jun 20, 2012, 4:19:26 PM
to opend...@googlegroups.com
Herb - this makes a ton of sense.

d

James McKinney

Jun 20, 2012, 4:25:00 PM
to opend...@googlegroups.com
I share Kent's opinion from the linked response: http://lists.okfn.org/pipermail/od-discuss/2012-June/000160.htm which is that omitting a disclaimer is too much risk for a publisher. It's unlikely publishers will knowingly choose to omit a disclaimer, and I think it's unrealistic to expect them to do all the rights clearing (which is a TON of work which will delay publication of datasets).

So, yes, from a consumer perspective, a publisher that does all the rights clearing is superior to one that doesn't. However, given the Promethean task of rights clearing any significant number/size of datasets, it's not a realistic expectation.

James

Kevin McArthur

Jun 20, 2012, 4:26:15 PM
to opend...@googlegroups.com
Hi Herb,

I would agree with your interpretation. Licenses like the PDDL I believe
will require a jurisdiction to ensure they are not licensing other
people's work or will be open to misrepresentation/negligence tort
claims where a developer relies on this license and is subsequently sued
by the legitimate rightsholder.

The clause in CC0 would seem to shift that responsibility to developers,
who have neither the capacity nor the positional ability to do that rights
clearing. Same goes for the BC-OGL if I'm not mistaken.

--

Kevin

Kevin McArthur

Jun 20, 2012, 4:29:23 PM
to opend...@googlegroups.com
James,

So you think developers should go through rights clearing processes
where they have no ability to do this? (We have no idea where the data
comes from, just the provider's assurances that they own it and are
licensing it for our use.)

This is especially troublesome for geographic data where the sources are
subject to rights enforcement by third parties. Could the province just
release some data with NavTeq sources in it and expect us to know we
don't have a right to it under the OGL, even though it's an OGL dataset?

--

Kevin

James McKinney

Jun 20, 2012, 4:33:56 PM
to opend...@googlegroups.com, Kent Mewhort
Have you read Kent's response on the od-discuss list? I'm cc'ing Kent here as I'm just parroting (perhaps poorly) his argument.

It's not that I believe developers should be saddled with a ton of work. It's that I don't believe publishers should be saddled with a ton of work. I want data, first and foremost. If publishers have to do weeks of due diligence and still put themselves at high risk (which if that risk realizes itself, may discourage them from publishing at all), it's going to significantly limit what data gets published. Are you thinking about that?

Herb Lainchbury

Jun 20, 2012, 4:34:09 PM
to opend...@googlegroups.com
James:  Good point.  Maybe my use of the word "superior" is what is problematic.  It's not superior if publishers refrain from publishing.

--
Herb Lainchbury
Dynamic Solutions Inc.
www.dynamic-solutions.com
http://twitter.com/herblainchbury

Kevin McArthur

Jun 20, 2012, 4:36:54 PM
to opend...@googlegroups.com
I'm thinking about that, but also realizing that it makes the data
essentially unusable in a copyleft / developer sense. From the sounds
of it, I could apply the CC0 license to Microsoft Windows without
consequence?

--

K

Herb Lainchbury

Jun 20, 2012, 4:37:35 PM
to opend...@googlegroups.com, Kent Mewhort, Mike Linksvayer
James:  I copied both Kent and Mike L. so they could join in if they wanted to.  I just didn't want to spam them while we discuss.  ( not really sure what etiquette applies here )

H

Luke Closs

Jun 20, 2012, 4:39:38 PM
to opend...@googlegroups.com
I don't think it would make the data "essentially unusable" to
developers. It may indeed make it "essentially unusable" to Kevin, or
other people very worried about this liability. But it's a mistake to
assume everyone else is worried about that liability (which is very
possibly a bad decision!).
--
Best,
luk.ec

David Eaves

Jun 20, 2012, 4:40:42 PM
to opend...@googlegroups.com
I take the long view on this. I'm actually not interested in getting a
lot of data from government that, in the end, I'm not legally allowed to
use. So claiming that putting this burden on government will limit the
amount of data that gets opened is a false claim - if there are restrictive
rights on the data... then it isn't open. The government has simply
released the data but placed the burden and risk of truly assessing its
openness on the user - in effect undermining the very point of the license.

I actually think government should take on this burden since a) they are
better positioned to do so and b) it is frankly, the responsibility of
the publisher to do so.

More importantly, creating this expectation is a good thing since, over
time, government will devise policies and rules that ensure these rights
have been cleared. If we don't set that expectation, then they never
will - and that would be a terrible outcome.

Dave

Kevin McArthur

Jun 20, 2012, 4:44:30 PM
to opend...@googlegroups.com
Can you think of an example of a person or company who is concerned
about licensed data, but that isn't concerned that the data doesn't belong
to someone else? Read-only uses?

--

Kevin

James McKinney

Jun 20, 2012, 4:53:39 PM
to opend...@googlegroups.com
Kevin, I am not a lawyer, but if you distribute Microsoft products, Microsoft's copyright will still apply, no matter what license you apply to it. You will be liable to Microsoft, as will all the people you distributed the products to. If you did not put a disclaimer on the license under which you distributed those Microsoft products, all those people would be able to sue you, so that you're paying for your copyright breach and everyone else's. Even with a disclaimer though, in this case you'd probably still be liable to your consumers, because you clearly knew that you were distributing a copyrighted work.

The point of the disclaimer for governments is this. They may be quite confident that the data they are publishing is cleared (unlike your Microsoft example). But if some copyrighted data gets in (Kent gives excellent examples of how this can happen), then without a disclaimer, they are liable not only for their own breach of copyright (or whatever other law) but also to every single consumer of that dataset. That's a huge risk!

It's important to stress that the government isn't publishing data willy-nilly, because even with a disclaimer, they would still be liable to whoever's rights they are infringing, and penalties/damages may scale according to however many consumers they have. I'm confident that government is doing a good job and is setting a high bar, because they want to avoid that risk. But to remove a disclaimer, the government would have to set the bar so impossibly high that even in David's far future this would still be a deterrent to publishing data.

It's an exaggeration (or misunderstanding) to claim that the government is placing the whole burden and risk on the consumer. They still have some risk no matter what.

James McKinney

Jun 20, 2012, 4:58:36 PM
to opend...@googlegroups.com
On 2012-06-20, at 4:44 PM, Kevin McArthur wrote:

> Can you think of an example of a person or company who is concerned
> about licensed data, but that isn't concerned that the data doesn't belong
> to someone else? Read-only uses?

There is rarely certainty. The vast majority of people are used to, and comfortable, assuming some risk. When you use a dataset, you think to yourself, "What's the risk that I do not have a right to use this data?" If it's low enough, you use it. Very few people ask themselves, "Do I know with 100% certainty that I have a right to use this data?"

Are people on this list familiar with due diligence work? From friends who work in copyright law, I know that it takes a long time to figure out who owns what when transferring title, doing mergers, who has liens on the property, etc. It's not a super-simple exercise.

James McKinney

Jun 20, 2012, 4:59:28 PM
to James McKinney, opend...@googlegroups.com
Meant to say "corporate law" not "copyright law".

Herb Lainchbury

Jun 20, 2012, 5:21:33 PM
to Kent Mewhort, opend...@googlegroups.com, Mike Linksvayer
Hi Kent,

"what you're really arguing for is that the licensor should provide a warranty"

No.  I am not arguing that.  I am arguing that this is an important attribute of a license and should be considered for the schema.

I get that it's fraught with difficulties, and publishers may have great reasons for not ever wanting to be responsible for rights clearing, and that's exactly why I think folks should be aware that different licenses deal with it differently.


The difference between PDDL and CC0 as I read it (and I could be wrong) is that CC0 explicitly disclaims any responsibility for rights clearing, while PDDL does not.  Maybe I missed something.

That seems like a big difference to me.  Maybe it's not.  If not, why would the CC0 bother to include it?


Thanks for your work on this Kent.

H



On Wed, Jun 20, 2012 at 2:03 PM, Kent Mewhort <kmew...@cippic.ca> wrote:
[I'm not certain if this reply will clear through to the list (I just
subscribed), but here it is...]

Herb, I'll reiterate my point from od-discuss that the PDDL and CC0 are,
in practice, no different in the respect you're talking about.  Both
include explicit disclaimers of liability.

However, from what I took away from your post on od-discuss, what you're
really arguing for is that the licensor should provide a warranty that
she or he has properly cleared all copyright in the works (and perhaps
even provide an indemnity to the license user).  Neither the CC0, nor
the PDDL, nor any other open license that I'm aware of -- other than the
obsolete CC 1.0 licenses -- do this.

For the benefit of everyone on this list, I'll cross-post the reason why:

I also doubt we'll see new licenses take this approach.  I agree that
the publishers of data are almost always in the best position to analyze
whether they actually hold the rights; however, I still don't think it's
feasible to shift the legal responsibility to them.  This is essentially
asking the publisher to offer a warranty or copyright indemnity of
almost limitless scope (given that an open license offers the data to
users for ANY use whatsoever, including in business contexts which could
involve numerous copies with very high damages for an accounting of
profits or statutory infringements where a work turns out to be
infringing).

It's simply too much unknown risk and would deter publication (or, more
likely, would deter use of such a license).  Keep in mind that
copyrighted content could easily creep into a dataset without the
publisher being aware of it.  For example, photographs could
inadvertently include substantial amounts of other works that turn out
to infringe copyright.

In general, I probably tend towards a pro-licensee side of the fence,
if anything, but I just don't see this one happening.  IMO, the most
reasonable balance is that licensees and licensors each have to fight
their own legal battles: no warranty or indemnity going either way.

Now, having said this, I think data providers could do a better job of
describing -- and using metadata to describe -- the source of data, the
accuracy of data, and any quality control and rights clearing measures
that they have already taken.  This way, data users will be better
informed of the legal risks that they're taking on when they use data
for a particular purpose.
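
As a rough illustration of the kind of metadata described above, here is a minimal Python sketch. Nothing in it comes from an existing standard or from either license: the class name, the field names, and the example dataset are all hypothetical, chosen only to show how source, accuracy, and rights-clearing notes could travel with a dataset.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetProvenance:
    """Hypothetical per-dataset metadata; every field name is an assumption."""
    title: str
    license_id: str
    sources: list = field(default_factory=list)                # where the data came from
    accuracy_note: str = ""                                    # known accuracy limits
    rights_clearing_steps: list = field(default_factory=list)  # checks already done

    def missing_fields(self):
        """List the descriptive fields the publisher left empty."""
        missing = []
        if not self.sources:
            missing.append("sources")
        if not self.accuracy_note:
            missing.append("accuracy_note")
        if not self.rights_clearing_steps:
            missing.append("rights_clearing_steps")
        return missing


# Made-up example record, for illustration only.
example = DatasetProvenance(
    title="Example road network",
    license_id="CC0-1.0",
    sources=["in-house survey"],
)
print(example.missing_fields())
```

A data user could look at the empty fields to judge how much is known about a dataset before relying on it; this does not change who carries the legal risk.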

Kent
--
Kent Mewhort
Staff Lawyer
CIPPIC, the Samuelson-Glushko Canadian Internet Policy & Public Interest Clinic
University of Ottawa, Faculty of Law
57 Louis Pasteur St.
Ottawa, Ontario  K1N 6N5

Ph:  (613)562-5800 (ext.2556)
Fax: (613)562-5417

CONFIDENTIALITY CAUTION AND DISCLAIMER
This message is intended only for the use of the individual or entity to which it is addressed and contains information that is privileged and confidential.

Kent Mewhort

Jun 20, 2012, 5:07:38 PM
to opend...@googlegroups.com

Kevin McArthur

Jun 20, 2012, 5:21:56 PM
to opend...@googlegroups.com
These are licenses intended for reproduction downstream by third
parties. If the implication is that by simply reproducing an OGL dataset
(say in the OpenDataBC catalogue) that we're taking on significant
rights-related liabilities, well, yes, I'd say that's a big problem.

If the govt distributed a dataset to 5 people, and those 5 people on to
100,000... and so forth... how would liability be apportioned? Only the
original publisher err'd, but the rest are liable (and by my read,
-more- liable), because they redistributed a copyrighted work more times?

I expect the publisher of content to not license content they don't own
or are unsure if they own. I'm also not aware of any of the GPL/code
style licensing allowing for this third-party-rights exemption, but
maybe the CIPPIC folks could do a comparison to the GPL/MIT/BSD licenses?

--

K

James McKinney

Jun 20, 2012, 5:39:11 PM
to opend...@googlegroups.com
On 2012-06-20, at 5:21 PM, Kevin McArthur wrote:

> These are licenses intended for reproduction downstream by third
> parties. If the implication is that by simply reproducing an OGL dataset
> (say in the OpenDataBC catalogue) that we're taking on significant
> rights-related liabilities, well, yes, I'd say that's a big problem.
>
> If the govt distributed a dataset to 5 people, and those 5 people on to
> 100,000... and so forth... how would liability be apportioned? Only the
> original publisher err'd, but the rest are liable (and by my read,
> -more- liable), because they redistributed a copyrighted work more times?

I have no idea, but I would guess that if an intermediary added (or inherited) a disclaimer in its license, they would significantly reduce their exposure to lawsuits. If an intermediary removes a disclaimer from an upstream publisher's license, then they are foolish to assume all that responsibility.


> I expect the publisher of content to not license content they don't own
> or are unsure if they own. I'm also not aware of any of the GPL/code
> style licensing allowing for this third-party-rights exemption, but
> maybe the CIPPIC folks could do a comparison to the GPL/MIT/BSD licenses?

As in David's post, you're being a little black and white. The truth is governments may be 99% sure that they have every right to distribute a dataset, but they will keep the disclaimer for that 1% risk. It's not that they're thinking, "This dataset is high value. Can we distribute it? Who cares! Pass the liability onwards." It doesn't work like that. They have strong legal incentives to distribute only data that they have a right to distribute, no matter what license they choose.

The MIT license explicitly offers no warranties of noninfringement, so actually code licenses do contain this sort of third-party-rights exemption: http://en.wikipedia.org/wiki/MIT_License The language of GPL and BSD is less clear, so I will defer to a real lawyer (or to a clever user of Google).

Kevin McArthur

Jun 20, 2012, 5:46:16 PM
to opend...@googlegroups.com
Interesting, you're right that the MIT license specifically disclaims
non-infringement. The inherited disclaimer would cover downstream users
of course, but would do nothing about your own first-party infringements
as I understand the issue.

Curious.

--

Kevin

Wrate, David LCTZ:EX

Jun 20, 2012, 5:48:39 PM
to opend...@googlegroups.com
I'll weigh in on this discussion from a licensor's perspective.

Clauses 7 (Exemptions), 8 and 9 (Warranty) in the BC-OGL are there to essentially protect people. All people. It's important to respect the fact that publishers work very hard to ensure their data does not ever put anyone (including the publisher) into a situation where these clauses kick in. Because people are involved, mistakes can happen. And if that ever occurs, everyone needs to be protected.

I can tell you we are sitting on a lot of data because we cannot secure the rights to the third-party IP. And I can also tell you that we have come close to publishing data that would have fallen under an exemption. But we do have policy<http://www.cio.gov.bc.ca/local/cio/kis/pdfs/open_data.pdf> and procedural checks in place to be very sure that never happens, to make sure that data published meets the license conditions.

Kent and James are quite right in that without these safeguards, publishing would grind to a trickle because govt would become obsessed with making sure there is no chance of error as there would be no safeguards in place. For anyone.

Take the privacy clause for example. Take it out, and personal data is licensed, intentionally or not.

Suppose it's my personal information in the data. I'm suing the govt for releasing it, and I'm suing the person(s) using the data. That person then counter-sues the govt for having created a situation where they become liable.

Isn't it easier to acknowledge that jurisdictions work very hard to make sure no one is ever placed in a situation where the data they use contravenes those exemptions and that the larger organizational risk has to be recognized?

My two cents.
David

Kevin McArthur

Jun 20, 2012, 6:00:07 PM
to opend...@googlegroups.com
Thanks David for the two cents and I appreciate the other perspective,
but here's my side on it.

So I'm using BC-OGL data to run www.proactivedisclosure.ca, which has a
lot of what people could consider personal information in it. I rely on
the province to have done their due diligence with regards to whether
this information is privacy-appropriate for release. Once the OGL is
slapped on it, that's my 'go point', where I can be sure the data is safe
for me to publish without drawing privacy ire.

With these exemptions, if a person sues the government for publishing
the data, ok, that's fine, someone made a policy decision and is liable
for that. But then they come after me. I've relied on the government to
make a call only they can make about the data, but now I find myself
liable for the government's policy decision and privacy issues.

The way I would think it should work would be that I could recover any
damages/liabilities for the government's mistake/policy decision, by
simply forwarding the suit their way under standard tort-style claims.
E.g. the govt told me it was ok by licensing the data, I relied on that
license, I suffered a loss, I'm entitled to compensation equivalent to
that loss. This is standard tort stuff.

These exemptions break that chain of liability don't they? So I guess I
should shut my website down in case someone has a privacy issue with
this data given I'm not willing/able to take that liability?

--

Kevin

Kent Mewhort

Jun 20, 2012, 5:39:31 PM
to opend...@googlegroups.com, Herb Lainchbury, Kent Mewhort, Mike Linksvayer
Okay, perhaps we're talking past one another then.  I agree that it's important for users to be aware of the different levels of disclaimers.

However, I think PDDL's disclaimer is actually stronger than CC0 (that is, the user takes on slightly more risk with PDDL).  Taking a close look at the disclaimers, CC0 disclaims all warranty (s. 4(b)) and disclaims responsibility for rights clearance (s. 4(c)):

b. Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law.


c. Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work.

On the other hand, PDDL disclaims all warranty (s. 5.1) and then disclaims ALL liability (s. 5.2) -- not just liability for rights clearance:

5.1 The Work is provided by the Rightsholder “as is” and without any warranty of any kind, either express or implied, whether of title, of accuracy or completeness, of the presence of absence of errors, of fitness for purpose, or otherwise. Some jurisdictions do not allow the exclusion of implied warranties, so this exclusion may not apply to You.

5.2 Subject to any liability that may not be excluded or limited by law, the Rightsholder is not liable for, and expressly excludes, all liability for loss or damage however and whenever caused to anyone by any use under this Document, whether by You or by anyone else, and whether caused by any fault on the part of the Rightsholder or not. This exclusion of liability includes, but is not limited to, any special, incidental, consequential, punitive, or exemplary damages. This exclusion applies even if the Rightsholder has been advised of the possibility of such damages.

Thus, the difference is that CC0 doesn't disclaim the licensor's liability outside of copyright, whereas PDDL does.  Both disclaim liability for copyright clearance.

I concur this is important to reflect in a machine-readable schema -- perhaps an indication of whether the license includes a "Disclaimer for tort / extra-contractual liability" and whether the license includes a "Disclaimer for copyright clearance".  In this case, PDDL would have a check-mark on both, whereas CC0 would only have a check-mark on the latter.
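
To make that concrete, here is a minimal Python sketch of what two such flags could look like in a machine-readable form. It is only an illustration: the class name, the field names, and the use of "PDDL-1.0" and "CC0-1.0" as identifiers are assumptions, and the true/false values simply restate the reading given above rather than a settled legal analysis.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LicenseDisclaimers:
    """Hypothetical schema entry; field names are illustrative only."""
    license_id: str
    disclaims_tort_liability: bool       # "Disclaimer for tort / extra-contractual liability"
    disclaims_copyright_clearance: bool  # "Disclaimer for copyright clearance"


# Values restate the reading in this thread, not an authoritative analysis.
LICENSES = [
    LicenseDisclaimers("PDDL-1.0", True, True),
    LicenseDisclaimers("CC0-1.0", False, True),
]


def to_json(entries):
    """Serialize schema entries to a JSON string keyed by license id."""
    return json.dumps({e.license_id: asdict(e) for e in entries},
                      indent=2, sort_keys=True)


def licenses_where(entries, **flags):
    """Return ids of licenses whose flags match every keyword given."""
    return [e.license_id for e in entries
            if all(getattr(e, name) == value for name, value in flags.items())]


if __name__ == "__main__":
    print(to_json(LICENSES))
    print(licenses_where(LICENSES, disclaims_tort_liability=False))
```

With flags like these, a catalogue could filter or label datasets by disclaimer type, for example listing every license that leaves tort liability undisclaimed.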

Kent

Kevin McArthur

Jun 20, 2012, 6:08:26 PM
to opend...@googlegroups.com
Also, here's some interesting info on CC0 from their FAQ:

"How can I be sure that I have all the rights I need to use the work?

CC0 contains a disclaimer of warranties just like our licenses, so there
is no assurance whatsoever that the affirmer (the person who applied CC0
to the work) has all the necessary rights to grant permission to use the
CC0'd work. The person applying CC0 to their work is not guaranteeing
anything about it, including whether she owns the copyright or has
cleared any uses of third-party content that her work may be based on or
incorporate. If you are in doubt, then we strongly recommend you not use
the work until you have taken all the steps and precautions you feel you
need to before doing so, which may include contacting the person who
applied CC0 to the work and consulting legal counsel. "

http://wiki.creativecommons.org/CC0_FAQ

So CC0 seems to be pretty clear in that it's a bit of an un-license and
more of a rights waiver from one 'affirmer' owner.

--

Kevin

Herb Lainchbury

Jun 20, 2012, 6:18:47 PM
to Kent Mewhort, opend...@googlegroups.com, Kent Mewhort, Mike Linksvayer

Maybe I am reading too much into the word Rightsholder used in PDDL.  As I read it, clauses 5.1 and 5.2 disclaim responsibility for Rightsholders, not for publishers that think they are Rightsholders but, as it turns out, are not.

Whereas, CC0 uses the word Affirmer, which covers folks whether or not they actually turn out to be rights holders.

In any case, all I was hoping for in this discussion is to see if it was worthy of consideration in a schema.  

Herb

Kent Mewhort

Jun 20, 2012, 6:19:15 PM
to opend...@googlegroups.com, Kevin McArthur
But I think it's a different story when we step out of copyright (and
away from statutory damages where every party is liable to a minimum,
stackable amount).

The chain of liability isn't necessarily broken in your example. If
someone sued both you and the government -- and successfully made out a
breach of a tort of invasion of privacy -- then the court would most
likely apportion liability depending on each of your responsibilities
for the breach. You'd only be liable for the government's fault in such
an action if there was an indemnity clause -- which, thankfully, is not
in most open licenses.

--
Kent Mewhort
Staff Lawyer
CIPPIC, the Samuelson-Glushko Canadian Internet Policy & Public Interest Clinic
University of Ottawa, Faculty of Law
57 Louis Pasteur St.
Ottawa, Ontario K1N 6N5

Ph: (613)562-5800 (ext.2556)
Fax: (613)562-5417

DISCLAIMER
Although this message may provide information regarding legal issues, it is not intended to constitute legal advice and is not a substitute for legal advice from a qualified lawyer.

Kevin McArthur

Jun 20, 2012, 6:34:23 PM
to Kent Mewhort, opend...@googlegroups.com
Interesting. That's good to know and takes away my main concern.

What about other forms of harm, like loss of business interest? E.g. if
someone invests a lot of money in an opendata startup that relies on
published data and then the data's license is revoked under one of these
exemptions, is there any liability on the part of the province there?
Should there be? (Think something like a company dependent on geo-data
like a mining-informatics startup for example, and then think Streisand's
privacy lawsuit about aerial photography)

I'm only concerned about the liability side of things myself, but others
might be concerned about investment/reliability as this sector grows and
moves forward...

--

Kevin