One possibility is to add the option to the backend instead and pass it with '-mllvm -no_nop_optimise' on the clang command line. This approach works well for options you don't intend end users to use.
Just to check: the NOPs you're trying to eliminate aren't related to branch delay slots, are they? I ask because removing those without putting another instruction in their place will change the behaviour of your code.
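To illustrate the delay-slot concern, here is a minimal sketch of a toy machine with a single branch delay slot. Everything in it (the `Op` instruction set, `run`, and the two example programs) is hypothetical and made up for this illustration; it is not modelled on any real target or on LLVM's code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy instruction set, invented purely for this illustration.
enum class Op { Nop, AddImm, Jump, Halt };

struct Insn {
    Op op;
    int arg;  // AddImm: value added to the accumulator; Jump: target index
};

// Runs a program on a toy machine with one branch delay slot: the instruction
// immediately after a Jump always takes effect before control reaches the
// jump target. For simplicity only Nop and AddImm are honoured in the slot.
int run(const std::vector<Insn>& prog) {
    int acc = 0;
    std::size_t pc = 0;
    while (pc < prog.size()) {
        const Insn& in = prog[pc];
        switch (in.op) {
        case Op::Nop:
            ++pc;
            break;
        case Op::AddImm:
            acc += in.arg;
            ++pc;
            break;
        case Op::Halt:
            return acc;
        case Op::Jump:
            // Execute the delay-slot instruction, then transfer control.
            if (pc + 1 < prog.size() && prog[pc + 1].op == Op::AddImm) {
                acc += prog[pc + 1].arg;
            }
            pc = static_cast<std::size_t>(in.arg);
            break;
        }
    }
    return acc;
}

// Original code: the NOP fills the delay slot, so the AddImm 100 is skipped.
int runWithNop() {
    return run({
        {Op::Jump, 3},
        {Op::Nop, 0},       // delay slot
        {Op::AddImm, 100},  // jumped over
        {Op::AddImm, 1},    // jump target
        {Op::Halt, 0},
    });
}

// Same code with the NOP deleted and the jump target adjusted: the AddImm 100
// slides into the delay slot and now executes.
int runWithoutNop() {
    return run({
        {Op::Jump, 2},
        {Op::AddImm, 100},  // now in the delay slot
        {Op::AddImm, 1},    // jump target
        {Op::Halt, 0},
    });
}
```

In this toy model runWithNop() returns 1 and runWithoutNop() returns 101, so simply deleting the NOP changes the result even though the jump target was fixed up.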