func Cover(input []byte, opts Options) (Coverage, bool) in
licensecheck currently reports len(input)/len(one of the licenses) for
each known license. What I need instead, for every known license, is
len(known license)/len(license reference in input).
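One way to read the ratio I'm asking for is "what share of a known license's words also shows up, in order, in the input". A minimal sketch of that reading follows. The function name licensePercent and the word-level longest-common-subsequence metric are my own assumptions for illustration; this is not how licensecheck computes its numbers.

```go
package main

import (
	"fmt"
	"strings"
)

// licensePercent returns the percentage of the words of license that also
// appear, in the same order, in input. It is a naive stand-in metric
// (longest common subsequence over lowercased words) and NOT the
// algorithm used by licensecheck.
func licensePercent(license, input string) float64 {
	lw := strings.Fields(strings.ToLower(license))
	iw := strings.Fields(strings.ToLower(input))
	if len(lw) == 0 {
		return 0
	}
	// Classic LCS dynamic program, keeping only two rows of the table.
	prev := make([]int, len(iw)+1)
	cur := make([]int, len(iw)+1)
	for i := 1; i <= len(lw); i++ {
		for j := 1; j <= len(iw); j++ {
			switch {
			case lw[i-1] == iw[j-1]:
				cur[j] = prev[j-1] + 1
			case prev[j] >= cur[j-1]:
				cur[j] = prev[j]
			default:
				cur[j] = cur[j-1]
			}
		}
		prev, cur = cur, prev
	}
	// After the final swap, prev holds the last computed row.
	return 100 * float64(prev[len(iw)]) / float64(len(lw))
}

func main() {
	// Hypothetical ten-word "license" used only for this demonstration.
	license := "permission is hereby granted free of charge to any person"

	full := "package foo permission is hereby granted free of charge to any person"
	partial := "package foo permission is hereby granted free of charge"

	fmt.Printf("%.0f%%\n", licensePercent(license, full))    // prints 100%
	fmt.Printf("%.0f%%\n", licensePercent(license, partial)) // prints 70%
}
```

The denominator is the length of the known license, not the length of the input, so a large source file with a complete license header still scores 100%.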
I'd like to scan >100000 files (possibly a lot more), where some of
them (<0.1%) contain full or partial known license texts.
An example scenario for a tree /src containing >100000 files:
$ listlicenses /src # to get an overview of 100% matching license references
LGPL-2.1
MIT
$ listlicenses -details /src # same tree, more detailed output
/src/license refers 100% MIT # the bytes in /src/license correspond
  # one for one to the MIT license
/src/fonts/LICENSE refers 100% MIT # the bytes in /src/fonts/LICENSE
  # correspond one for one to the MIT license
/src/a/Notice refers 100% LGPL-2.1 # same as above with LGPL-2.1
/src/a/b/whatever.go refers 94% GPL2 # most probably a broken
  # license reference in whatever.go; maybe someone inadvertently
  # deleted the last word from the lines containing the GPL2 license
  # text. Needs human inspection to check the license situation of
  # whatever.go
/src/c/ConfusingLicenseReferences.c refers 7% ZLIB # most probably a
  # false positive report for a reference to ZLIB
/src/c/ConfusingLicenseReferences.c refers 65% MIT # the file has only
  # 65% of MIT; the author intended to refer to MIT, but some
  # inadvertent later edit broke the license reference
The listlicenses command iterates over all files in the subtree and
gathers all full or partial (broken) license references. To check each
file it would use functionality similar to
github.com/google/licensecheck.
thanks!