Please suggest how to improve performance in KeyValueDB

Venkatesh T

Jul 5, 2017, 9:13:56 AM
to duke
Dear Team,

First of all, thanks for this wonderful library. I used KeyValueDB to deduplicate 150 million records, comparing on name, address, and country. I configured minRelevance as 0.5 and maxSearchHits as 100. It was very fast initially, but after reaching 10 million records it became slow: processing 0.1 million (100,000) records took nearly 1 hour. Please suggest how to improve the performance.

JP de Vooght

Nov 5, 2019, 4:49:13 AM
to duke
Hi Venkatesh, did you eventually solve this?
If I understand correctly, blocking is possible only with IDs. I was wondering whether locality-sensitive hashing could be used for blocking by looking at several attributes. I am still getting familiar with Duke, so perhaps I missed something...
JP
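
To make the locality-sensitive hashing idea concrete, here is a minimal, library-independent sketch of MinHash-based blocking over several attributes. It is not Duke's API and says nothing about what Duke supports; every function name, the field names (name, address, country, taken from the original question), and the parameters (3-character shingles, 32 hashes, 16 bands) are assumptions chosen for illustration only.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Return the set of character k-grams of a lowercased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    if len(s) <= k:
        return {s}
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def _hash(seed, token):
    # Deterministic 64-bit hash of (seed, token), so results are stable across runs.
    digest = hashlib.blake2b(f"{seed}:{token}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(tokens, num_hashes=32):
    """MinHash signature: for each seeded hash function, keep the minimum over all tokens."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def lsh_block_keys(record, fields=("name", "address", "country"), num_hashes=32, bands=16):
    """Split the signature into bands; each (band index, band values) pair is one block key."""
    text = " ".join(str(record.get(f, "")) for f in fields)
    sig = minhash_signature(shingles(text), num_hashes)
    rows = num_hashes // bands
    return [(b, tuple(sig[b * rows:(b + 1) * rows])) for b in range(bands)]


def build_blocks(records):
    """Map each block key to the set of record IDs that produced it."""
    blocks = defaultdict(set)
    for rec_id, rec in records.items():
        for key in lsh_block_keys(rec):
            blocks[key].add(rec_id)
    return blocks


def candidate_pairs(blocks):
    """Records sharing at least one block key become candidates for full comparison."""
    pairs = set()
    for ids in blocks.values():
        ordered = sorted(ids)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                pairs.add((a, b))
    return pairs
```

Records whose combined attributes are identical always share every block key, and similar records share at least one key with a probability that rises with their shingle overlap. Only the candidate pairs would then go through the expensive field-by-field comparison. Fewer rows per band catches more near-duplicates at the cost of more candidates, so the band settings would need tuning on real data.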