An Approximate Search Program

26 views
Skip to first unread message

Thomas Passin

unread,
Aug 13, 2025, 8:46:36 AMAug 13
to leo-editor
Sometimes it's useful to run a search that will find inexact matches. Many years ago I implemented an algorithm that matches when the target string contains at most one error.  This is probably the most common case. The algorithm was published by Muth and Manber.  See

https://robert.muth.org/Papers/1996-approx-multi.pdf

This version is the "slow" version of algorithm, known as Algorithm 00. There were several faster implementation of the idea but #00 is the simplest and easiest to implement.  Especially for small strings it's plenty fast.

The idea is that you build a dictionary of N variants of the pattern, each with one letter missing (and including the unchanged pattern). Then you take the first N characters of the target and change them in the same way.  For each change, check the dictionary for a hit.  If no hit, slide the pattern one step to the right and repeat.

I've attached a Leo outline with my code. It can be run with CTRL-B to run a test case:  find the misspelled word "enginser" in the string "the very best product engineer agent". The result will print to the console. Or you can add @clean to the start of the headline and save the outline to create an external Python file.
approx_search.zip

Edward K. Ream

unread,
Aug 13, 2025, 11:25:49 AMAug 13
to leo-e...@googlegroups.com
On Wed, Aug 13, 2025 at 7:46 AM Thomas Passin <tbp1...@gmail.com> wrote:
Sometimes it's useful to run a search that will find inexact matches. Many years ago I implemented an algorithm that matches when the target string contains at most one error. 

Thanks, Thomas.

Edward
Reply all
Reply to author
Forward
0 new messages