Exclude readme and include license in report


scott...@twinehealth.com

Jan 21, 2015, 2:05:17 PM
to license...@googlegroups.com
I'm using license_finder-1.2.

For the generated HTML report, I'm seeing the text of the readme file along with each component. I would like to use license_finder to generate a credits or licenses page, similar to what Google Chrome shows in chrome://credits/. Is this possible? And if so, how do I configure license_finder to include the license text instead of the readme text in the report?

Jacob Maine

Jan 21, 2015, 4:53:40 PM
to scott...@twinehealth.com, license...@googlegroups.com
The license_finder-2.0 code allows a little bit of configuration of CSV reports.  You could consider adding something there.  But I want to stay away from adding many configuration options for the other types of reports, especially HTML.  Better to build something on top of license_finder which outputs reports in the style you need.  You can use the existing HTML report code as a superclass, or just a guideline.  I imagine your code would look something like this: https://gist.github.com/mainej/b190d2f138c2b9e2e20a.
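
A minimal sketch of what such a standalone credits report could look like, assuming you already have each dependency's name, version, license name, and (optionally) license text. It deliberately uses no license_finder API; `Credit`, `esc`, and `render_credits` are hypothetical names made up for illustration.

```ruby
require 'erb'

# Hypothetical record for one dependency; this is not a license_finder class.
Credit = Struct.new(:name, :version, :license_name, :license_text)

# Escape a value for safe inclusion in HTML (nil becomes an empty string).
def esc(value)
  ERB::Util.html_escape(value)
end

# Render a chrome://credits-style HTML page from a list of Credit records.
def render_credits(credits)
  sections = credits.sort_by { |c| c.name.downcase }.map do |c|
    body =
      if c.license_text.to_s.strip.empty?
        "<p>License text not available.</p>"
      else
        "<pre>#{esc(c.license_text)}</pre>"
      end
    <<~HTML
      <section>
        <h2>#{esc(c.name)} #{esc(c.version)}</h2>
        <p>License: #{esc(c.license_name)}</p>
        #{body}
      </section>
    HTML
  end
  <<~HTML
    <!DOCTYPE html>
    <html>
    <head><meta charset="utf-8"><title>Credits</title></head>
    <body>
    <h1>Credits</h1>
    #{sections.join}</body>
    </html>
  HTML
end
```

Something like `File.write("credits.html", render_credits(list_of_credits))` would then produce the page. How `list_of_credits` gets populated is the part that depends on license_finder or on your own tooling.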

One problem you'll face is that `license_finder` doesn't have the full license text of every license.  Instead, it has URLs which point to web pages with the license text.  I think it's better to keep it that way, so you'll have to figure out how to get the license text yourself.
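
One rough way to get the text yourself, for Ruby gems at least, is to look for a license-like file at the top of each installed gem's directory. This is only a sketch: `license_text_in` and `installed_gem_license_texts` are hypothetical helpers, the file-name pattern is an assumption and not exhaustive, and many packages ship no license file at all, in which case you get `nil`.

```ruby
require 'rubygems'

# File names commonly used for license files, e.g. LICENSE, LICENSE.txt,
# MIT-LICENSE, COPYING. This pattern is an assumption, not a complete list.
LICENSE_FILE_PATTERN = /\A(.*[-_.])?(licen[sc]e|copying|unlicense)([-_.].*)?\z/i

# Return the text of the first license-like file at the top level of
# package_dir, or nil if the directory or file does not exist.
def license_text_in(package_dir)
  return nil unless Dir.exist?(package_dir)

  path = Dir.children(package_dir).sort
            .select { |name| name.match?(LICENSE_FILE_PATTERN) }
            .map { |name| File.join(package_dir, name) }
            .find { |candidate| File.file?(candidate) }
  path && File.read(path)
end

# Map "name version" => license text (or nil) for every installed gem.
def installed_gem_license_texts
  texts = {}
  Gem::Specification.each do |spec|
    texts["#{spec.name} #{spec.version}"] = license_text_in(spec.full_gem_path)
  end
  texts
end
```

Other package managers would need their own way of locating the package directory; only the `Gem::Specification` part is specific to RubyGems.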

scott...@twinehealth.com

Jan 21, 2015, 5:32:26 PM
to license...@googlegroups.com, scott...@twinehealth.com
Thanks, that helps.

Yes, the deficiency of not pulling in the full license text of every license is a problem for me. I appreciate that the primary purpose of license_finder is to find and evaluate the licenses of many dependencies. And it could be argued that a link to the license is generally preferable when evaluating many dependencies which use the same license. However, if the license is custom, wouldn't it be appropriate/helpful to include the text of the license, so that it can be evaluated without going and searching for it? That is, including the full text of each license, at least in the case of unknown/uncategorized licenses, seems important for a tool which is supposed to help users evaluate the licenses of dependencies.

Also, I would argue that turning license_finder into a tool that can help developers comply with the licenses of the dependencies they are using might be an appropriate goal, since this is a closely related task to evaluating licenses, and only minor changes (I hope) would be required to achieve this goal. Most licenses include a copyright notice which names specific authors, and it is common for the license to require that redistributions (binary or source code) retain the copyright notice. By capturing and including the full text of the license file included with each dependency, license_finder could become a tool for license compliance.

I'm curious, if developers are not using license_finder for license compliance, what tool are they using?

Jacob Maine

Jan 21, 2015, 7:53:21 PM
to license...@googlegroups.com, scott...@twinehealth.com
The original gist I provided seemed verbose, so I just pushed a bit of code to make LicenseFinder easier to use without its CLI.  See the updated gist: https://gist.github.com/mainej/b190d2f138c2b9e2e20a.  You'll have to run off the master branch of pivotal/LicenseFinder for it to work.

To address your follow up questions: most of our users would rather NOT read the licenses... they just want to know "this package uses the BSD license".  Just knowing the name of the license lets them say "yes" or "no" in 99% of the cases.  In the remaining 1%, or if license_finder can't detect a license, users will do a bunch of manual research to decide whether to approve a package.  I agree, license_finder could provide more assistance here, but it's not its primary goal.

Regarding compliance, license_finder was built to help find licenses and record approvals, but not to comply with the terms of licenses.  The compliance use case is intriguing, but it would probably be better as a separate project, if there isn't one already.  license_finder might be able to point to a license file as one part of the puzzle, but there's a lot of trickiness from there on.  As you pointed out, every license will have different requirements.  Sometimes you'll have to link to original material from your own web pages.  Sometimes you'll have to reference copyright holders, or include copies of the licenses in your code.  Or really anything else the package and/or license stipulate.  It would be hard to extract all these requirements from every license, including custom licenses, and instruct a user on how to comply.  license_finder is pretty far from being able to assist with all that.

One way license_finder might be able to help is that we've heard a few times that the output from `license_finder --debug` would be useful in other places.  If you haven't seen the --debug output, it reports *why* license_finder has decided a package has a particular license.  That can essentially be one of three things: either "you, the user, told me it's MIT and Ruby", or "the package definition says it's MIT and Ruby", or "I found files in the package's source that look like MIT and Ruby".  But that actually brings up another question.  That last case is the only one in which license_finder actually has the full text of the license.  On most of my projects, that's by far the minority of packages.  That throws another wrench into copyright extraction and similar issues.
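
To make the three cases concrete, here is a rough Ruby model of them. It is not license_finder's code: `LicenseEvidence` and `resolve_license` are hypothetical names, and the precedence order (user decision, then package definition, then detected files) is an assumption made for the sketch. The point it illustrates is that only the last source carries any license text.

```ruby
# Hypothetical model of where a package's license information came from.
# `text` is only ever set when license files were found in the package source.
LicenseEvidence = Struct.new(:licenses, :source, :text)

# Choose evidence in an assumed precedence order.
#   user_decision:      license names the user recorded by hand
#   package_definition: license names declared by the package (e.g. a gemspec)
#   detected_files:     hash of license name => text of the matching file
def resolve_license(user_decision: [], package_definition: [], detected_files: {})
  if !user_decision.empty?
    LicenseEvidence.new(user_decision, :user_decision, nil)
  elsif !package_definition.empty?
    LicenseEvidence.new(package_definition, :package_definition, nil)
  elsif !detected_files.empty?
    LicenseEvidence.new(detected_files.keys, :detected_files,
                        detected_files.values.join("\n\n"))
  else
    LicenseEvidence.new([], :unknown, nil)
  end
end
```

Under this model, a package whose gemspec says "MIT" resolves to `:package_definition` with no text, even if a license file also exists on disk, which is why the full text is so often missing.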

Anyway, I'm also curious: what tools do people use for compliance?

Mike Dalessio

Jan 27, 2015, 5:23:12 PM
to scott...@twinehealth.com, license...@googlegroups.com
I've actually created a story in the LicenseFinder tracker backlog to track the actual text if we have it available:


The TL;DR here is that Pivotal's lawyers have indicated that for some licenses (e.g. MIT) the copyright dates and authors are part of the license, and hence meaningful for some definition of the term "meaningful". This isn't intended for compliance directly, but so that we can republish the license in applications that use MIT-licensed libraries.
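
As a rough illustration of what "copyright dates and authors" means in practice, the following heuristic pulls copyright lines out of a license file's text. It is a sketch only: `copyright_notices` is a hypothetical helper, and the pattern is an assumption that will miss unusual notices (for example ones without a four-digit year).

```ruby
# Matches lines that start with "Copyright" or "(c)" and contain a
# four-digit year, such as "Copyright (c) 2015 Jane Doe".
COPYRIGHT_LINE = /\A\s*(?:copyright\b|\(c\)).*\d{4}/i

# Return the distinct copyright lines found in license_text, in order.
def copyright_notices(license_text)
  license_text.to_s.each_line
              .map(&:strip)
              .select { |line| line.match?(COPYRIGHT_LINE) }
              .uniq
end
```

Sentences inside the license body that merely mention "the above copyright notice" are skipped, since they do not start with "Copyright" followed by a year.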

But the logical next step is to provide some hooks to make this information available to compliance software ... but I'm getting a bit off topic. Just wanted to note that we're thinking about some of this here at Pivotal.


Jacob Maine

Jan 28, 2015, 5:03:16 PM
to Mike Dalessio, scott...@twinehealth.com, license...@googlegroups.com
I've also looked into this a bit more, and made it easier to extract what information license_finder actually has.  See this gist for details on how to extract license text, and caveats about when it's reasonable to do that.