You can also extract the malware itself for analysis (e.g. how long
does a URL stay active?), reverse-engineer the gathered samples, or
match on the path (the host can vary from one malware infection to
the next, while the path often stays fixed or only gains extra
parameters such as a password; this is also a risk with the current
method of building suffixes and prefixes [1]).
In other words, with the full list of URLs you can do security
research and maybe help people to be better protected. That's just
another perspective...
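
To make the path-matching idea concrete, here is a minimal sketch in
Python. It assumes you have the full URLs available as plain strings;
the function name and the example URLs are hypothetical and only there
for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_hosts_by_path(urls):
    """Return {path: sorted hosts} for every path seen on more than one host.

    The query string is ignored on purpose, so extra parameters
    (e.g. a password) do not hide a path that is otherwise fixed.
    """
    hosts_by_path = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        if parts.hostname:
            hosts_by_path[parts.path or "/"].add(parts.hostname)
    # Keep only paths shared by at least two different hosts.
    return {
        path: sorted(hosts)
        for path, hosts in hosts_by_path.items()
        if len(hosts) > 1
    }


# Hypothetical sample data, not taken from any real list.
sample = [
    "http://a.example/load/x.exe?pw=1",
    "http://b.example/load/x.exe?pw=2",
    "http://c.example/other",
]
shared = group_hosts_by_path(sample)
```

With only hashes of host/path combinations this kind of grouping is
not possible, which is the trade-off I was pointing at.
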
I tend to agree that working with hashes is much safer. I just hope
that the current canonicalization works properly and that we don't
miss hits in the hash list because of IDN and encoding evasion or
other attacks.
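
To show the kind of evasion I mean, here is a rough sketch in Python.
It is a deliberately simplified canonicalization of my own, not the
algorithm from the developer's guide, and the choice of SHA-256 is an
assumption made only for the example. The point is that equivalent
spellings of a URL must end up with the same hash, otherwise a lookup
silently misses.

```python
import hashlib
from urllib.parse import quote, unquote, urlsplit


def canonicalize(url):
    """Simplified, illustrative canonicalization (NOT the official one)."""
    parts = urlsplit(url.strip())
    host = unquote(parts.hostname or "")
    try:
        # Convert an internationalized host name to its ASCII (punycode) form.
        host = host.encode("idna").decode("ascii")
    except UnicodeError:
        pass  # leave the host untouched if it cannot be converted
    host = host.lower().strip(".")

    path = parts.path or "/"
    # Unescape repeatedly so nested encodings (%2561 -> %61 -> a) collapse.
    while True:
        decoded = unquote(path)
        if decoded == path:
            break
        path = decoded
    # Re-escape once so every variant ends in the same form.
    return host + quote(path, safe="/")


def url_hash(url):
    """Hash of the canonical form; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(canonicalize(url).encode("utf-8")).hexdigest()


plain = url_hash("http://example.com/admin")
double_encoded = url_hash("http://EXAMPLE.com/%2561dmin")
idn_unicode = url_hash("http://b\u00fccher.example/a")
idn_ascii = url_hash("http://xn--bcher-kva.example/a")
```

Here `plain` equals `double_encoded` and `idn_unicode` equals
`idn_ascii`. If any of these steps were skipped, the variants would
hash differently and the malicious URL would not be found in the list.
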
adulau
[1] http://code.google.com/apis/safebrowsing/developers_guide.html#SuffixPrefix
--
-- Alexandre Dulaunoy (adulau) --
http://www.foo.be/
--
http://www.foo.be/cgi-bin/wiki.pl/Diary
-- "Knowledge can create problems, it is not through ignorance
-- that we can solve them" Isaac Asimov