Hi Luis!
> On Thu, Oct 23, 2025, 2:04 PM 'Luis Villa' via clearlydefined <clearly...@googlegroups.com> wrote:
>>
>> Is there a good way (via API or otherwise) to get a list of packages where concluded is not null, and where declared != concluded? The purpose is to create a testing data set to help understand when/how those differ - doesn't need to be super-comprehensive, a random sampling of 50-100 would be completely fine.
Your best bet would be to look at the curations in
https://github.com/clearlydefined/curated-data/tree/master/curations
In the common case, the curation would be the concluded license, and
the declared license would be the one detected by ScanCode.
Note that the difference could be:
- a bug in ScanCode (that could already be fixed)
- missing license info
- lack of clarity in the declared license, even if detected correctly
- ambiguity with complex licenses
- all of the above!
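Once declared/concluded pairs have been collected (from the curations or elsewhere), picking the sample could look like the minimal Python sketch below. The record shape (a dict with "coordinates", "declared" and "concluded" keys) and the function name sample_mismatches are hypothetical, made up for illustration; they are not a ClearlyDefined API or data format.

```python
import random


def sample_mismatches(records, k=50, seed=None):
    """Return up to k records where concluded is set and differs from declared.

    `records` is assumed to be an iterable of dicts with hypothetical
    "declared" and "concluded" keys holding license expression strings
    (or None when missing).
    """
    mismatches = [
        r for r in records
        if r.get("concluded") is not None
        and r.get("declared") != r.get("concluded")
    ]
    # Seedable RNG so that the same sample can be reproduced later.
    rng = random.Random(seed)
    return rng.sample(mismatches, min(k, len(mismatches)))
```

This compares the expressions as plain strings, so "MIT OR Apache-2.0" and "Apache-2.0 OR MIT" would count as different; normalizing the expressions first would avoid that kind of false mismatch.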
If you can share your experiment results, that would be awesome!
--
Cordially
Philippe Ombredanne
AboutCode.org
Package URL (PURL), ScanCode, DejaCode, PurlDB and VulnerableCode
Book a call at
https://cal.com/pombreda