The requirement is real but, as Josh says, the task is challenging. IIRC there are two main challenges: automation and namespacing. Right now most of the scanners have some set of regular expressions, templates, or some such that they use to identify given text as a particular license. For arbitrary text, that's hard to do. In some scenarios you can rely on just having a discoverable identifier (no need for matching); in others, not so much, and automation is essential.
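To make the two scenarios concrete, here is a rough sketch (not how any particular scanner works): look for a discoverable identifier first, and only fall back to pattern matching when there isn't one. The TEMPLATES table and the detect function are made up for illustration, and the single MIT pattern is a toy; real scanners carry far larger and more careful rule sets.

```python
import re

# Discoverable identifier: the "SPDX-License-Identifier:" tag convention.
# Simplified: takes the rest of the line as the identifier or expression.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")

# Hypothetical, deliberately tiny template table (assumption for illustration).
TEMPLATES = {
    "MIT": re.compile(r"permission is hereby granted, free of charge", re.IGNORECASE),
}


def detect(text):
    """Return a license identifier for text, or None if nothing is recognized."""
    tag = SPDX_TAG.search(text)
    if tag:
        # No matching needed: the file declares its own license.
        return tag.group(1).strip()
    for license_id, pattern in TEMPLATES.items():
        if pattern.search(text):
            return license_id
    # Arbitrary text that no template recognizes: the hard case.
    return None
```

The None case is where the rest of this discussion applies.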
Namespace management is just hard. Some have proposed using internet domains (à la Java package naming), but many licenses are, or evolve to be, not "owned" by a particular organization.
In the past a few folks have discussed using the SPDX "LicenseRef-" syntax and some auto-generated hashing of unrecognized license text. That combined with an alias registry gets you the ability to have automated detection and a human-readable, manageable namespace.
The idea is that unrecognized license text is hashed and then referenced as "LicenseRef-XYZABC123" (or some such). Off the bat, all packages licensed under that text are correlated and so can be "cleared" by legal teams together and collaboratively. Over time curators may come to see that hash as the "FooBar" license and register an alias for the hash. "LicenseRef-XYZABC123" and "LicenseRef-FooBar" are then interchangeable (with the latter being more user-friendly). Variants of FooBar with different hashes can also be aliased to FooBar. It is even possible that FooBar ends up being a recognized SPDX license. If it retains that name, all's good; if it gets a new name, register an alias for that too.
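A minimal sketch of the hash-plus-alias idea, to show the moving parts. Everything here is an assumption for illustration: the whitespace/case normalization, the use of SHA-256, the 12-character truncation, and the AliasRegistry class are not from any existing tool or from the SPDX spec. The only thing borrowed from SPDX is the shape of the identifier ("LicenseRef-" followed by letters, digits, "." and "-").

```python
import hashlib
import re

# Shape of a user-defined SPDX license reference.
LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.-]+")


def normalize(text):
    # Assumed normalization: collapse whitespace and ignore case so that
    # trivial reformatting of the same text yields the same hash.
    return " ".join(text.split()).lower()


def license_ref_for(text, length=12):
    """Derive a LicenseRef identifier from unrecognized license text."""
    digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
    return "LicenseRef-" + digest[:length]


class AliasRegistry:
    """Hypothetical registry mapping hash-based refs to curated names."""

    def __init__(self):
        self._alias_of = {}

    def register(self, hash_ref, name):
        alias = "LicenseRef-" + name
        if not LICENSE_REF.fullmatch(alias):
            raise ValueError("not a valid LicenseRef identifier: " + alias)
        self._alias_of[hash_ref] = alias
        return alias

    def resolve(self, ref):
        # Unaliased refs resolve to themselves.
        return self._alias_of.get(ref, ref)

    def same_license(self, a, b):
        return self.resolve(a) == self.resolve(b)


# Two packages carrying the same text correlate with no human involved.
ref_a = license_ref_for("The FooBar terms:  do as you like.\n")
ref_b = license_ref_for("the foobar terms: do as you like.")
ref_variant = license_ref_for("The FooBar terms: do as you like, mostly.")

# Later, a curator names the license and folds in a variant.
registry = AliasRegistry()
registry.register(ref_a, "FooBar")
registry.register(ref_variant, "FooBar")
```

After the two register calls, the hash-based refs and "LicenseRef-FooBar" all resolve to the same thing, and a later SPDX-assigned name could be handled with one more alias entry.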
On the practical user side, sticking with valid SPDX syntax allows for the continued use of SPDX tooling and integrations.
Jeff