So, the first concept to understand is that the primary mechanism of search is exact matching of a set of search terms against a set of indexed terms.
Starting with a single field, consider a document with name: Los Angeles International Airport. When we index that document, we first analyze it. The analyzer typically breaks the text into tokens (tokenization), then runs those tokens through token filters. Filters do basic things like lowercasing the letters, but can also perform more advanced transformations like stemming.
The standard analyzer would produce the following tokens:
- los
- angeles
- international
- airport
Next, when a user runs a search, we take their search input and run it through an analyzer as well (usually, but not always, the same analyzer used for indexing). So, if a user searches for "angel", the standard analyzer produces just one token:
- angel
In this case, the search term does not match any of the indexed terms, so the search fails to find a match.
With that as the background, the first approach to consider is a custom analyzer which produces tokens that do match. One way to do this is to use an n-gram analyzer. An n-gram analyzer is configured with a minimum and maximum length; it computes the substrings of each token within that length range and indexes those as well. Here is an example (I've shortened the input text to just 'Los Angeles' for brevity, but you get the idea):
- los
- ang
- ange
- angel
- nge
- ngel
- ngele
- gel
- gele
- geles
- ele
- eles
- les
Now, if the user searches for angel, we actually search for:
- ang
- ange
- angel
- nge
- ngel
- gel
Good news: now some of these tokens match, so the search would succeed in finding your document. The bad news is that your index is substantially larger, so this approach may or may not work for your use case. Second, this still wouldn't find a match when a user types a single letter. Could you index n-grams with the minimum set to 1? Yes, but for most use cases this will simply create an index that is too large and returns undesirable search results. There is also an "edge" n-gram variant that anchors the generated tokens at one edge (usually the front). That way you can match prefixes only, but again that doesn't seem to match your requirements.
Finally, you mentioned doing fuzzy search. Fuzzy works differently than the technique I described because it attempts to find terms that are close to, but not exact matches with, terms in the index. Fuzzy works using a metric called the Levenshtein edit distance. Bleve only supports finding matches with an edit distance of 1 or 2.
In some cases it could be useful to combine these two techniques.
Beyond fuzzy, there are things like wildcard/regexp queries, but it's important to remember that you're basically doing a brute-force search at that point. If you search for '*a*' to match anything containing the letter 'a', you will find the index doesn't really help; you would be better off using grep on a text file.
marty