When doing a spell correction, pyenchant presumably tries several kinds of edits, such as splitting the word or adding letters.
I find that in many cases the suggestion it gives involves a split, and I am wondering if there is a way to influence it so that it doesn't suggest splits so readily.
I could check the returned word for a ' ' and, if it has one, move on to the next suggestion, but I was hoping there is a way to control pyenchant itself. Here is the code I am using:
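    import enchant
    # edit_distance is assumed to be NLTK's Levenshtein implementation
    from nltk.metrics import edit_distance

    class SpellingReplacer(object):
        def __init__(self, dict_name='en', max_dist=2):
            self.spell_dict = enchant.Dict(dict_name)
            # largest edit distance accepted for a replacement
            self.max_dist = max_dist

        def replace(self, word):
            # correctly spelled words are returned unchanged
            if self.spell_dict.check(word):
                return word
            suggestions = self.spell_dict.suggest(word)
            # only the top suggestion is considered
            if suggestions and edit_distance(word, suggestions[0]) <= self.max_dist:
                return suggestions[0]
            else:
                return word

    replacer = SpellingReplacer()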
So then I run a ton of words through it and get results like this:
# it should have selected hmm, by first trying to reduce repetitive letters
hmmmm is probably hmm mm
# should be interrupts
interupts is probably int erupts
# regretted
regreted is probably regret ed
# offer
offfer is probably off fer
So you can see it gets some of these wrong. Is there a way to change the order of the strategies it tries? I realize you can modify the edit distance, but that doesn't change the order of the methods it is attempting to use.
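For reference, the workaround I mentioned (skipping any suggestion that looks like a split) would be roughly the sketch below. The names `first_unsplit` and `levenshtein` are made up for illustration and are not part of pyenchant. The function takes the suggestion list as a plain argument, so it can be fed the output of `suggest()`. Treating a hyphen as a split marker is my own assumption.

```python
def levenshtein(a, b):
    """Plain dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def first_unsplit(word, suggestions, max_dist=2):
    """Return the first suggestion that is not a split and is close enough.

    Hypothetical helper: falls back to the original word if nothing qualifies.
    """
    for candidate in suggestions:
        # Assumption: a space or a hyphen means the word was split in two.
        if ' ' in candidate or '-' in candidate:
            continue
        if levenshtein(word, candidate) <= max_dist:
            return candidate
    return word
```

Inside `replace`, this would be called as `first_unsplit(word, self.spell_dict.suggest(word), self.max_dist)` in place of looking only at `suggestions[0]`. It works around the ordering rather than changing it, which is why I would still prefer a setting in pyenchant itself.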