One way to accomplish that is to set up a text factory with three Canonize steps.
The first Canonize step, say a DoubleWordSanitizerIgnores.txt file, will have one or more patterns for the double words that you want to exclude from the double word finding and fixing step. Each pattern marks its match with a fix-up add-on so that the next step ignores it.
The second Canonize step, say a DoubleWordSanitizer.txt file, will have one or more patterns that find the double words you want to replace with single words and that don't match the fixed-up excludes from the first Canonize step.
The third and final Canonize step, say a DoubleWordSanitizerCleanup.txt file, will have the pattern that finds all the fixed-up ignores/excludes from the first Canonize step and removes the fix-up add-ons, returning those bits of text to the original.
For the first Canonize list of ignore/exclude patterns, it would probably be best to start out with a fairly simple list of double word patterns. Starting out simple helps in debugging and easily understanding what you're ignoring/excluding. Something simple like:
(\s)(many,\smany)(\s)
\1%%\2%%\3
(\s)(very,\svery)(\s)
\1%%\2%%\3
where I'm using %% as the fix-up add-on that makes the next double word sanitizing step ignore/exclude those double word occurrences.
(I'm capturing the leading and trailing white space to handle edge cases like line feeds, and I'm not using any word boundaries, \b, to avoid having to handle non-word characters concatenated with word characters.)
For the second Canonize list of patterns, which finds double words and replaces them with single words, use one or more grep patterns like:
(\s)(\w+)\s\2
\1\2
(\s)(\w+),\s\2
\1\2
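To see why the %% add-ons protect a phrase from these two patterns, here is a minimal sketch in Python. It uses Python's re module as a stand-in for BBEdit's grep engine (an assumption on my part; these particular patterns use only \s, \w and back-references, which I'd expect to behave the same way), and the sample sentence is made up.

```python
import re

# The two double-word patterns from above, as (search, replace) pairs.
STEP_TWO = [
    (r"(\s)(\w+)\s\2", r"\1\2"),
    (r"(\s)(\w+),\s\2", r"\1\2"),
]

def collapse_doubles(text):
    """Apply each pattern in order, roughly like one Canonize pass."""
    for search, replace in STEP_TWO:
        text = re.sub(search, replace, text)
    return text

# Hypothetical sample: one phrase already marked with %%, plus unmarked doubles.
sample = "It was %%very, very%% late and the the cat cat, cat slept.\n"
result = collapse_doubles(sample)
print(result, end="")
```

The marked phrase survives because neither "very" is both preceded by white space and followed by its repeat. One thing to look for in your own sample text: since the patterns have no trailing check, a word followed by a longer word that starts with it (for example "the then") also matches and gets collapsed. If that turns up, appending something like (?!\w) to each search pattern is one possible guard.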
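For the final Canonize step, which cleans up the fix-up add-ons added in the first ignore/exclude step, use a grep pattern like the following (non-greedy, so that two fixed-up phrases on the same line are cleaned up separately instead of leaving the inner %% behind):
%%(.+?)%%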
\1
I suggest you start out with sample text and run each Canonize step on it individually (from Text -> Canonize...) to check that the patterns in each canon file do what they're supposed to do. After that, combine them into a text factory and recheck the combined operation.
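If you want to sanity-check the patterns before touching BBEdit, the three steps can be sketched in Python as below. This is only an approximation under stated assumptions: Python's re module stands in for BBEdit's grep engine, each list stands in for one canon file, the function names and sample text are hypothetical, and I've written the cleanup pattern with a non-greedy (.+?) so that two marked phrases on one line are unwrapped separately.

```python
import re

# Each list stands in for one canon file: (search, replace) pairs.
IGNORES = [
    (r"(\s)(many,\smany)(\s)", r"\1%%\2%%\3"),
    (r"(\s)(very,\svery)(\s)", r"\1%%\2%%\3"),
]
SANITIZER = [
    (r"(\s)(\w+)\s\2", r"\1\2"),
    (r"(\s)(\w+),\s\2", r"\1\2"),
]
# Non-greedy (.+?) so each %%...%% pair is unwrapped on its own.
CLEANUP = [
    (r"%%(.+?)%%", r"\1"),
]

def canonize(text, pairs):
    """Apply every (search, replace) pair in order, like one Canonize step."""
    for search, replace in pairs:
        text = re.sub(search, replace, text)
    return text

def text_factory(text):
    """Run the three steps in sequence, like the combined text factory."""
    for step in (IGNORES, SANITIZER, CLEANUP):
        text = canonize(text, step)
    return text

sample = "There are many, many ways, and it is is very, very easy to to slip.\n"

# Check each step individually, then the combined operation.
marked = canonize(sample, IGNORES)
collapsed = canonize(marked, SANITIZER)
restored = canonize(collapsed, CLEANUP)
print(marked, collapsed, restored, sep="")
```

Printing the intermediate strings mirrors running each Canonize step by hand: the first line shows the %% add-ons in place, the second shows "is is" and "to to" collapsed while the marked phrases are untouched, and the third shows the add-ons removed.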