Adding SimHash implementation to com.google.common.hash

Patrick Spiegel

Jul 13, 2021, 7:19:30 AM
to guava-discuss
Hi all,

My team is implementing SimHash for web page deduplication as part of security scanning, and we have also seen it used in other contexts. Since the algorithm originated at Google and still seems to be in use there, we were wondering whether it would make sense to include an implementation in the com.google.common.hash package. As of now, there does not seem to be a standard implementation for Java.
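
For context, here is a minimal sketch of the kind of implementation we have in mind. It is an illustration only, not an existing or proposed Guava API: the class name `SimHashSketch`, its method names, the 64-bit fingerprint width, the unweighted per-token voting, and the use of FNV-1a as the token hash are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not part of com.google.common.hash.
public final class SimHashSketch {
    private SimHashSketch() {}

    // FNV-1a 64-bit, used only as a stand-in token hash (an assumption;
    // any well-mixed 64-bit hash could be swapped in).
    public static long fnv1a64(String token) {
        long h = 0xcbf29ce484222325L;          // FNV offset basis
        for (byte b : token.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xffL);
            h *= 0x100000001b3L;               // FNV prime
        }
        return h;
    }

    // Each token votes +1 or -1 on every bit position, depending on whether
    // that bit is set in the token's hash. A fingerprint bit is set when the
    // votes for that position sum to a positive value.
    public static long simHash(Iterable<String> tokens) {
        int[] votes = new int[64];
        for (String token : tokens) {
            long h = fnv1a64(token);
            for (int i = 0; i < 64; i++) {
                votes[i] += ((h >>> i) & 1L) == 1L ? 1 : -1;
            }
        }
        long fingerprint = 0L;
        for (int i = 0; i < 64; i++) {
            if (votes[i] > 0) {
                fingerprint |= 1L << i;
            }
        }
        return fingerprint;
    }

    // Number of differing bits between two fingerprints.
    public static int hammingDistance(long a, long b) {
        return Long.bitCount(a ^ b);
    }
}
```

Two pages would then be treated as near-duplicates when the Hamming distance between their fingerprints falls below some threshold; how to tokenize a page and which threshold to pick depend on the application.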

What do you think? Would that be a proper fit?

Cheers,
Patrick