alter nearest neighbor - levenshtein method code


Petros Liveris

Oct 16, 2020, 3:19:02 AM
to openref...@googlegroups.com
Hello,

For my specific needs, I need to make a small adjustment to the way the nearest neighbor (Levenshtein) algorithm works. In one step of creating the clusters, the algorithm removes all punctuation and control characters. In that same step, I also need to remove some stop words, for example "&", "and", "co", and "the".

Where in the source code should I make this change?

Thank you in advance for your help

The knn folder contains many files, and I need to find the place where the string normalisation happens, so that I can also temporarily remove the stop words there.
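
To make the requested behaviour concrete, here is a minimal Python sketch of the idea: drop stop words in the same normalisation pass that strips punctuation and control characters, then compare the normalised strings with Levenshtein distance. It does not reproduce OpenRefine's Java code. The names `STOP_WORDS`, `normalize`, `levenshtein` and `distance_ignoring_stop_words` are hypothetical, and the lowercasing and whitespace tokenisation are assumptions made for the illustration.

```python
import unicodedata

# Hypothetical stop-word list, taken from the examples in the question.
STOP_WORDS = {"&", "and", "co", "the"}


def strip_punct_and_control(text):
    """Remove characters in Unicode punctuation (P*) and control/other (C*) categories."""
    return "".join(
        ch for ch in text if unicodedata.category(ch)[0] not in ("P", "C")
    )


def normalize(value, stop_words=STOP_WORDS):
    """Lowercase, split on whitespace, drop stop words, strip punctuation."""
    kept = []
    for token in value.lower().split():
        cleaned = strip_punct_and_control(token)
        # Check the raw token too, so a pure-punctuation stop word like "&"
        # is recognised before it is stripped away.
        if token in stop_words or cleaned in stop_words:
            continue
        if cleaned:
            kept.append(cleaned)
    return " ".join(kept)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def distance_ignoring_stop_words(a, b):
    """Edit distance computed on the normalised forms only."""
    return levenshtein(normalize(a), normalize(b))
```

With this sketch, "The Smith & Co." and "Smith and Co" both normalise to "smith", so their distance is 0. The original values are left untouched, which matches the "temporarily" requirement: the stop words are removed only for the comparison.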


Martin Magdinier

Oct 19, 2020, 11:55:04 AM
to openref...@googlegroups.com
Our documentation page "Clustering in-depth" provides links to the code, so you can review the implementation of each method.

Martin

