opinions on open source license for our spectral data?


Jeffrey Warren

Aug 14, 2012, 4:59:24 PM
to publicla...@googlegroups.com
Hello all -- spectral folks especially!

About a year ago when we started the SpectralWorkbench.org website, this seemed like an academic question, but as our database of shared spectral data grows, I think it's time to ask what people think about how we license that data.

Right now, all spectra are public domain, so there is no requirement to attribute the data if you use it. I was persuaded to start it so liberally by Ethan Zuckerman, who made the case that SafeCast is doing the same thing, and that you just really don't want to limit what people can do with the data so it has the maximum potential to do good. We took a similar position when we did most maps of the Deepwater Horizon oil spill, and many folks continue to put their MapKnitter maps in the public domain (it's the default).

However, I want to point out that other open source projects such as OpenStreetMap and Wikipedia require attribution to their community ("ODbL OpenStreetMap contributors, 2012" for example) and OSM's license in particular is ShareAlike, which requires that "downstream" users release any additions under the same license.

Much has been made of this ShareAlike provision, and I wonder if we would do well to adopt it -- or at least a Creative Commons Attribution license -- for SpectralWorkbench, so that our work is properly attributed. If we went as far as a ShareAlike license, any additions to the dataset would be required to be released under the same terms.

Maybe this isn't important to you, but I feel like we ought to at least consider encouraging people to "give back" as they use our data, and at a minimum, to attribute it to our community. I'd love to hear your thoughts.

Jeff

Alex Mandel

Aug 15, 2012, 3:02:07 AM
to publicla...@googlegroups.com, Jeffrey Warren
This is an interesting question. The actual spectral readouts from
devices aren't subject to copyright (at least in the US). Any value add,
like specifically comparing 2 spectra, could be considered a creative work.

Remember too in this case OSM was Creative Commons and changed to ODbL
as outlined here... http://wiki.openstreetmap.org/wiki/Open_Database_License
Mostly to deal with the fact that Creative Commons doesn't address
databases and data well. If you look into it, there's actually a
sub-license for the database and a separate one for the data. If I'm
understanding correctly, it's the data (not a creative work) we're talking
about, not the database structure for storing it (which is considered a
creative work).

The other question that comes up is how reuse would occur?
Example: if someone downloads spectra, would they be likely to modify it
and resubmit or repost it somewhere?

If we're talking about a library of data from a device, the only use case I
see is downloading the library, comparing your data to the library and
publishing the results or the comparison. There's no modification of the
data in that, so would a share-alike clause even come into play?
If someone writes an academic paper on such an analysis, they'd have to
cite where the data came from anyway, so attribution should be covered
by standard publishing requirements.

I think Public Domain or CC0 makes more sense for this use case. If
people find the library useful maybe they will contribute spectra of
other things or more samples of similar things to enrich the library.

Thanks,
Alex

Ned Horning

Aug 15, 2012, 9:27:23 AM
to publicla...@googlegroups.com
Alex did a good job summarizing data licensing, and I agree with his thoughts that Public Domain or CC0 would be appropriate for the spectral data. I have a slight preference for CC0 since I think supporting Creative Commons licensing is a good thing in general, but that's just a personal preference.

Here are some sites with commentary on data licensing - the science commons links are a bit dated:
http://creativecommons.org/tag/data/
http://wiki.creativecommons.org/Data
http://sciencecommons.org/resources/faq/database-protocol/
http://sciencecommons.org/projects/publishing/open-access-data-protocol/
http://wiki.creativecommons.org/CC0_FAQ

Ned

-- Post to this group at publicla...@googlegroups.com. To unsubscribe, email publiclaborato...@googlegroups.com. Options at https://groups.google.com/d/forum/publiclaboratory?hl=en


Jeffrey Warren

Aug 15, 2012, 1:38:32 PM
to te...@wildintellect.com, publicla...@googlegroups.com
The question of whether spectral data is subject to copyright is a good one -- well, is GPS data? Some of this is simply not tested in court, so people (OpenStreetMap) are just being overly careful, and trying to preempt any case law through good license-writing.

That said, the two reasons for a more restrictive license are:

- legal requirement of attribution, esp. if there is ever a commercial use of it (think of Apple's use of OpenStreetMap data in the new iPhone OS)
- (maybe more importantly) ShareAlike licenses can help build communities, because any re-users or adders are more obligated to contribute additions back to the main database. This is both a legal and a social obligation, in a sense, and this is the main reason I think OSM and many projects use such licenses. 

The latter reason is the use case I'm thinking of, Alex -- and it's very related to the question of when ShareAlike would come up. If we assume we *can* find or create a binding license, I would hope anyone who downloads or uses our data -- to calibrate against, or to do analysis -- would contribute any new data they collect.

In OSM, what triggers the obligation to share? Any specific use of the data? Or is it only when you upload it to OSM, and they rely on the fact that most geodata is not that useful unless it's uploaded and combined with the rest of the OSM database?

Reading it more closely, the ODbL looks pretty good; presumably if you "modify, transform [or] build upon the database", you are obligated to release your changes under the ODbL. Not 100% sure what that means in this context, however. It seems like individual spectra could be separately licensed on a one-by-one basis... at least if there is copyright for them *at all*.

Jeff

Mathew Lippincott

Aug 15, 2012, 1:50:22 PM
to publicla...@googlegroups.com, te...@wildintellect.com
The issue of database rights is an international one -- they're recognized in the EU but not in the US. My impression is that the ODbL wouldn't cover the database in the US -- we might not be able to compel someone to share additions back to our data set, only modifications to specific works. We may need two licenses -- a content license (international scope) and a database license (EU only).

Alex

Aug 17, 2012, 5:07:02 PM
to Mathew Lippincott, publicla...@googlegroups.com, te...@wildintellect.com
If you read closely, GPS data on OSM is not under copyright, it's under contract law. The most common use of OSM data is rendering of maps, hence attribution. The ODbL is about database structure and tagging semantics; the data records, from what I see, are mostly covered by a sublicense that's a contract, not copyright. From what I've seen, GPS data is not copyrightable in the USA.

Also recall that share-alike usually only affects redistribution, not usage, so users of the data have 0 obligation. So maybe attribution is the most important thing for us to focus on.

It may be worth contacting Creative Commons and some other organizations (NSF?).

Alex

Mathew Lippincott <mat...@publiclaboratory.org> wrote:

>the issue of database rights is an international one-- they're recognized
>in the EU but not in the US <http://en.wikipedia.org/wiki/Database_right>.
