I've run across a few things while playing with rules_license integration. I'm not sure whether these are worthy of their own issues in the repository. I've got the ball rolling on my end for a corporate CLA approval, so I didn't think I should open my own issues until that is completed. In some cases, these are POCs and are expected to have shortcomings. But should there be an issue opened for the final design/implementation?
a. If there are no licenses applied in the dependency chain, then gathering fails with a backtrace and error message:
...
File ".../external/rules_license/rules_gathering/gather_metadata.bzl", line 275, column 35, in metadata_info_to_json
for mi in sorted(metadata_info.other_metadata.to_list(), key = lambda x: x.label):
Error: 'TransitiveMetadataInfo' value has no field or method 'other_metadata'
Available attributes: deps, licenses, traces
I wouldn't necessarily expect default_applicable_licenses to extend beyond the current BUILD file, so this means a license must be applied in the BUILD file of any desired dependency of a generate_sbom rule. Note that applying a package_info (but not a license) is insufficient as well and results in the same backtrace and error message.
b. The sample POC script write_sbom.py only deals with the first dependency of the generate_sbom rule (I'm presuming the json array is emitted in the same order as the deps in the generate_sbom rule). Furthermore, if multiple package_infos are defined within the same bazel_package, the last one wins as far as the generate_sbom output is concerned. This seems to be caused by write_sbom.py aggregating by bazel_package, when it should perhaps be aggregating by the actual defined package_info (as identified by the tuple package_name, package_version). This problem is seen whether or not there are multiple licenses in the bazel_package. Note that the source json <blah>.sbom_licenses_info.json seems to have all the distinct data available.
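To illustrate what I mean by aggregating on the tuple rather than on bazel_package, here is a minimal Python sketch. The record layout is hypothetical (I'm only assuming each entry carries bazel_package, package_name and package_version keys; the real <blah>.sbom_licenses_info.json layout may well differ), and the function names are mine, not anything from write_sbom.py.

```python
# Hypothetical records; the real sbom_licenses_info.json layout may differ.
records = [
    {"bazel_package": "//third_party/foo", "package_name": "libfoo", "package_version": "1.0"},
    {"bazel_package": "//third_party/foo", "package_name": "libbar", "package_version": "2.3"},
    {"bazel_package": "//third_party/foo", "package_name": "libbar", "package_version": "2.3"},
]


def by_bazel_package(entries):
    """Key on bazel_package alone: a later entry overwrites an earlier one."""
    out = {}
    for entry in entries:
        out[entry["bazel_package"]] = entry
    return out


def by_package_identity(entries):
    """Key on (package_name, package_version): each distinct package_info survives."""
    out = {}
    for entry in entries:
        key = (entry["package_name"], entry["package_version"])
        # setdefault keeps the first occurrence and drops exact repeats.
        out.setdefault(key, entry)
    return out


lossy = by_bazel_package(records)
distinct = by_package_identity(records)
```

With these made-up records, the first function keeps a single entry for the bazel_package (the last one seen), while the second keeps one entry per distinct package and only collapses true duplicates.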
c. I have played around with adding a package_sbom rule, modeled after package_info. Some further tweaks are required to allow for a lack of license application (since in this case, the license is within the external sbom). There is a short-circuit return in gather_metadata_info_common when there are neither licenses nor trans_licenses. I'm not sure, however, whether a separate rule should be used in this case, versus supporting multiple behaviors within package_info.
d. I am wondering how to integrate with applicable_licenses when a native rule does not support that keyword. In particular, I would like to define an external sbom and then apply it to a source file, such that whoever uses that source file inherits the external sbom as well (actually, it could be an internally applied license or package_info just as easily). The point is that I know the license applies to a file (or set of files), and I don't necessarily want to find all the rules using those files to add the appropriate entry to *their* applicable_licenses. For example, the exports_files rule does not support applicable_licenses. It does support "licenses", and it seems there is a way to use an "exception=" prefix on values in the licenses list to pass things on, but then how would aggregation work? In general, is there a way to apply the license/package_info to a source file such that it can then be gathered by any rule that uses that source file?
Basically, I'm looking for something like the following, but with the ability to gather the license info.
package_sbom(
    name = "some-docker.sbom",
    srcs = ["some-docker.spdx.json"],
)

exports_files(
    [
        "some-docker.tgz",
    ],
    licenses = [
        "exception=//...:some-docker.sbom",
    ],
)
(ideally simply using applicable_licenses = [ ":some-docker.sbom" ])
If I simply used this source file in the same BUILD file, then it wouldn't be an issue for me to apply the license on the rule that uses the source file. However, in my case we can generate BUILD.bazel files with rules that use these files (e.g. packaging into an rpm or iso), and the generator doesn't have access to the license information.