If you want to target browsers that do not support Wasm multithreading, then compiling any code with -pthread will cause issues, since those browsers won't recognize the atomic operations that get emitted.
Also note that the Wasm linker will complain when linking a set of object files that have a *mixed* use of the -pthread flag. So if you go down that route, you'll have to be careful to compile all object files consistently with that flag enabled. The linker will probably also complain if some objects were compiled with -pthread but the link step is run without -pthread (not 100% sure on that). So you will likely need to pass -pthread at the link stage as well if you had it at the compile stage.
Building with multithreading will cause performance issues if you are also using the -sALLOW_MEMORY_GROWTH setting, though how much depends on how much Wasm<->JS interop your program has.
Passing -sPTHREAD_POOL_SIZE=0 is a good idea, since that will avoid prewarming background threads.
The overall performance impact of the added atomics depends on where your codebase ends up using them. The LLVM compiler will not spontaneously generate atomics in your code beyond what you instruct, except in a few special cases (e.g. one-off initialization of the global data section at startup). This will probably be one of those "have to profile and see" things.
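To make the "only where you instruct" point concrete, here is a minimal sketch in C11. The two counters and both function names are made up for illustration. The assumption is that only the access to the `_Atomic` variable is lowered to an atomic instruction in a -pthread build, while the plain counter keeps using ordinary loads and stores.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Plain global: accessed with ordinary loads and stores. */
static uint32_t plain_counter = 0;

/* Explicitly atomic global: the only place this file asks for atomics. */
static _Atomic uint32_t shared_counter = 0;

/* Hypothetical helper: non-atomic increment, returns the new value. */
uint32_t bump_plain(void) {
  plain_counter += 1;
  return plain_counter;
}

/* Hypothetical helper: atomic increment, returns the new value.
   atomic_fetch_add returns the value held before the addition. */
uint32_t bump_shared(void) {
  return atomic_fetch_add(&shared_counter, 1) + 1;
}
```

Comparing the disassembly of the two functions in a -pthread build is a quick way to see where atomics actually end up, before profiling.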
Finally, if you are building with multithreading enabled, you will need to set up the COOP and COEP web server headers (Cross-Origin-Opener-Policy: same-origin and Cross-Origin-Embedder-Policy: require-corp), even if your app will not spawn any threads.
Using MAIN_THREAD_EM_ASM and MAIN_THREAD_ASYNC_EM_ASM is safe when building without -pthread: in that case they are direct aliases of regular EM_ASM. Good notes about missing documentation. Some of those APIs may indeed be best documented in the header files themselves. If you find glaring omissions, filing bugs against the documentation pages is welcome.
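As a rough illustration of the MAIN_THREAD_EM_ASM point above, the sketch below uses the same source for both a -pthread and a non-pthread build. The helper names `built_with_pthreads` and `report_progress` are hypothetical, and the `#else` branch is a no-op stand-in added only so the file also compiles with a native compiler; it is not part of Emscripten.

```c
#include <stdio.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#else
/* Assumption for illustration only: a no-op stand-in for native builds. */
#define MAIN_THREAD_EM_ASM(code, ...) ((void)0)
#endif

/* Returns 1 when compiled by Emscripten with -pthread, otherwise 0. */
int built_with_pthreads(void) {
#ifdef __EMSCRIPTEN_PTHREADS__
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: logs a value from JS on the main browser thread.
   Without -pthread the macro behaves like a regular EM_ASM. */
void report_progress(int percent) {
  MAIN_THREAD_EM_ASM({ console.log($0); }, percent);
  printf("progress: %d%% (pthreads build: %d)\n",
         percent, built_with_pthreads());
}
```

Calling `report_progress(50)` from `main` should work unchanged in either build, which is what makes it practical to keep one code path for both configurations.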