License to use when republishing parts of the WDC data.

Djellel Eddine Difallah

28 Apr 2016, 08:45:01
to Web Data Commons
Hi Everyone,

We are working on a dataset extracted from CommonCrawl (but including a subset of webpages) and partly from your RDFa data. We were wondering which licence to use, as we want to republish the data. Is the following enough?
"The extracted data is provided according the same terms of use, disclaimer of warranties and limitation of liabilities that apply to the Common Crawl corpus."

We had doubts, since the CC Terms of Use state:
"We grant you a non-assignable, non-transferable, non-sublicensable, limited license to use our site and data in accordance with the terms of the ToU."

We are also contacting CC for their input, but we were wondering whether you already have some experience with this.

Thanks
Djellel Difallah
Exascale Infolab
University of Fribourg, CH




Robert Meusel

28 Apr 2016, 11:33:56
to Web Data Commons
Hi,

No, unfortunately we have no experience here. It would be best to settle this with CC.

Cheers,
Robert

Tom Morris

1 May 2016, 11:53:29
to web-data...@googlegroups.com, common...@googlegroups.com
On Thu, Apr 28, 2016 at 8:45 AM, Djellel Eddine Difallah <difa...@gmail.com> wrote:

We are working on a dataset extracted from CommonCrawl (but including a subset of webpages) and partly from your RDFa data. We were wondering which licence to use, as we want to republish the data. Is the following enough?
"The extracted data is provided according the same terms of use, disclaimer of warranties and limitation of liabilities that apply to the Common Crawl corpus."

We had doubts, since the CC Terms of Use state:
"We grant you a non-assignable, non-transferable, non-sublicensable, limited license to use our site and data in accordance with the terms of the ToU."

Yes, that's a little ironic. By my reading (although I'm not a lawyer), unless WDC was separately granted the right to sub-license by the Common Crawl Foundation, they're not actually following the CC Terms of Use.

The current Common Crawl Terms of Use don't seem like they're aligned with the goals of the CC Foundation since they effectively prohibit the distribution of any derivative works such as the WDC. I'd be interested in learning what you hear back from Common Crawl. The right to sub-license is key to anyone who wants to publish derivative works.

Tom

p.s. The full Terms of Use are actually here: http://commoncrawl.org/terms-of-use/full/

Alexander Panchenko

12 Feb 2018, 12:06:12
to Web Data Commons
Hello, Tom,

Did you get any answer from CC? What is the best way to license a derivative dataset based on CC?