to duke
Dear Team,
First of all, thanks for this wonderful library. I am using KeyValueDB to deduplicate 150 million records on name, address, and country, with minRelevance set to 0.5 and maxSearchHits set to 100. It was very fast initially, but after about 10 million records it became slow: processing 0.1 million records took nearly 1 hour. Could you suggest how to improve the performance?
JP de Vooght
Nov 5, 2019, 4:49:13 AM
Hi Venkatesh, did you eventually solve this?
If I understand correctly, blocking is possible only with IDs. I was wondering whether locality-sensitive hashing over several attributes could be used for blocking. I am still getting familiar with Duke, so perhaps I missed something...
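To make the idea concrete, here is a rough sketch of MinHash-based LSH blocking over several attributes. It is plain Python and does not use Duke's API; all function names (`shingles`, `minhash_signature`, `lsh_block_keys`, `candidate_pairs`) and the parameter values are assumptions made up for illustration. The field names are the ones from the original post. Records that share at least one band key become candidate pairs, so only those would need to go through the expensive comparison.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Return the set of character k-grams of a lowercased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    if len(s) < k:
        return {s}
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token, varied by an integer seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=32):
    """MinHash signature: the minimum hash value of the token set under each seed."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(num_hashes))


def lsh_block_keys(record, fields=("name", "address", "country"),
                   num_hashes=32, bands=8):
    """Split the signature into bands; each (band index, band values) is a block key."""
    tokens = set()
    for field in fields:
        # Prefix each shingle with its field name so attributes do not mix.
        tokens |= {f"{field}:{g}" for g in shingles(record.get(field, ""))}
    sig = minhash_signature(tokens, num_hashes)
    rows = num_hashes // bands
    return [(b, sig[b * rows:(b + 1) * rows]) for b in range(bands)]


def candidate_pairs(records, **kwargs):
    """Group record ids by block key and return all pairs sharing at least one key."""
    buckets = defaultdict(list)
    for rid, rec in records.items():
        for key in lsh_block_keys(rec, **kwargs):
            buckets[key].append(rid)
    pairs = set()
    for ids in buckets.values():
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                pairs.add(tuple(sorted((ids[i], ids[j]))))
    return pairs


records = {
    1: {"name": "John Smith", "address": "12 Main Street", "country": "US"},
    2: {"name": "john  smith", "address": "12 main street", "country": "us"},
    3: {"name": "Maria Garcia", "address": "5 Calle Mayor", "country": "ES"},
}
pairs = candidate_pairs(records)
```

More bands with fewer rows per band makes blocking more permissive (more candidates, fewer missed duplicates); fewer bands with more rows makes it stricter. Whether something like this can be plugged into Duke's database layer is a separate question that I have not checked.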