I probably completely misrepresented the concept :-)
The idea was to generate a hash for every file on your hard disk and upload each hash, along with meta-information about the file, to a server somewhere.
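To make the client side a bit more concrete, here is a minimal sketch of the scanning step. It assumes SHA-256 as the hash and a made-up record layout (`sha256`, `name`, `size`); the function names `hash_file` and `scan` are mine. The upload itself is left out, since no server or protocol exists yet.

```python
import hashlib
import os


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root):
    """Yield one record per readable file under root.

    The record fields are an assumption for illustration only.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                yield {
                    "sha256": hash_file(path),
                    "name": name,
                    "size": os.path.getsize(path),
                }
            except OSError:
                # Skip files that vanish or cannot be read.
                continue
```

Something like `list(scan("/some/dir"))` would then give the batch of records a client might send.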
Eventually, there would be an aggregate consensus on what a particular file hash "means", and on how common it is out there.
So I could, for example, easily identify the 99.99% of files on my system that are valid Windows files, perhaps find out what they are supposed to be doing, and get suspicious about the 0.01% of items that no one else has ever seen before and which might therefore be infected with a virus.
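As a rough sketch of that check: assume the server hands back, for each hash, the number of other machines that have reported it. The `prevalence` mapping, the `classify` name and the `min_seen` cutoff are all hypothetical, just to show the shape of the idea.

```python
def classify(local_hashes, prevalence, min_seen=1):
    """Split local hashes into (known, suspicious).

    prevalence maps a hash to the number of OTHER machines that
    reported it (an assumed server response). A hash seen by fewer
    than min_seen other machines is treated as suspicious.
    """
    known, suspicious = [], []
    for file_hash in local_hashes:
        if prevalence.get(file_hash, 0) >= min_seen:
            known.append(file_hash)
        else:
            suspicious.append(file_hash)
    return known, suspicious
```

A rare hash is only a hint, of course: a file I wrote myself would also be one that no one else has seen.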
It could also have as a convenient side effect the ability to tell you that you have umpteen copies of the same file on your file system.
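The duplicate detection needs nothing from the server. A minimal sketch, again assuming SHA-256 and using a hypothetical `find_duplicates` helper, groups paths by content hash and keeps the groups with more than one member:

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root, chunk_size=1 << 20):
    """Map each content hash to the sorted paths sharing it (2+ only)."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            try:
                with open(path, "rb") as fh:
                    for chunk in iter(lambda: fh.read(chunk_size), b""):
                        digest.update(chunk)
            except OSError:
                # Unreadable file: leave it out of the report.
                continue
            by_hash[digest.hexdigest()].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```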
With some additional specialized file-type information, the clients could glean more specific things (e.g. this is the same MP3 file as that one, but someone modified the ID3 tags).
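For the MP3 case, one possible approach is to hash only the audio payload, skipping a leading ID3v2 tag and a trailing ID3v1 tag. This is a simplified sketch under that assumption: the names `strip_id3` and `audio_hash` are mine, and it ignores other tag formats (APE tags, for instance) and takes the whole file as bytes in memory.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return data without a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)
    # ID3v2: 10-byte header, "ID3" magic, 4-byte synchsafe size at offset 6.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)
        start = 10 + size
        if data[5] & 0x10:  # footer-present flag adds another 10 bytes
            start += 10
        start = min(start, end)
    # ID3v1: fixed 128-byte block at the end, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def audio_hash(data: bytes) -> str:
    """SHA-256 of the payload only, so retagged copies hash the same."""
    return hashlib.sha256(strip_id3(data)).hexdigest()
```

A client could report both the plain file hash and this payload hash, so the server can tell "identical file" apart from "same audio, different tags".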
Or they could look inside archives.
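Looking inside archives could be as simple as hashing each member separately. A minimal sketch for ZIP files using Python's standard `zipfile` module (the `hash_zip_members` name is mine, and other archive formats would need their own handling):

```python
import hashlib
import zipfile


def hash_zip_members(path, chunk_size=1 << 20):
    """Map each file inside a ZIP archive to the SHA-256 of its contents."""
    hashes = {}
    with zipfile.ZipFile(path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            digest = hashlib.sha256()
            # Members are decompressed on the fly while reading.
            with archive.open(info) as member:
                for chunk in iter(lambda: member.read(chunk_size), b""):
                    digest.update(chunk)
            hashes[info.filename] = digest.hexdigest()
    return hashes
```

Since the hash covers the uncompressed content, a file inside an archive would match the same file sitting loose on disk.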
Open to suggestions as to similar existing projects, or uses for it.
Philipp