Cameron, I think this is a useful proposal. Elixir has means to check validity (String.valid?/1) and a mechanism to split valid and invalid code points (String.chunk/2 with the :valid trait). But there isn't, to my knowledge, a means to coerce validity. A couple of thoughts:
1. Since Elixir strings are, by definition, UTF-8, I don't know that special handling of UTF-16 and UTF-32 code points makes much sense, although I accept this may be more Unicode compliant.
2. What would the function be called? Since we have `String.valid?/1`, maybe `String.validate/2` with an option `replace_invalid: utf8_string`. The default for `:replace_invalid` could be U+FFFD or it could be `nil`. If the default is `nil`, then there could also be a `String.validate!/2` that raises if invalid data is found and there is no `:replace_invalid` option.
3. I think the implementation could build on the code of `String.chunk/2`, which uses `String.next_codepoint/1`. That would simplify the implementation and keep the code style consistent.
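
To make points 2 and 3 concrete, here is a rough sketch of what I have in mind. The module name `ValidateSketch` and the function `validate/2` are hypothetical and not part of Elixir. I have assumed U+FFFD as the default for `:replace_invalid`. It leans on `String.chunk/2` with the `:valid` trait rather than reimplementing the scan, so each run of invalid bytes collapses into a single replacement, which may not match what the Unicode recommendations ask for.

```elixir
defmodule ValidateSketch do
  # Hypothetical module for discussion only; not part of Elixir's String API.

  # Assumed default replacement: U+FFFD REPLACEMENT CHARACTER.
  @default_replacement "\uFFFD"

  # Returns `string` with every run of invalid bytes replaced by the
  # `:replace_invalid` option (a UTF-8 string).
  def validate(string, opts \\ []) when is_binary(string) do
    replacement = Keyword.get(opts, :replace_invalid, @default_replacement)

    string
    # String.chunk/2 splits the binary into alternating valid/invalid chunks.
    |> String.chunk(:valid)
    |> Enum.map_join(fn chunk ->
      if String.valid?(chunk), do: chunk, else: replacement
    end)
  end
end
```

For example, `ValidateSketch.validate("a" <> <<0xFF>> <> "b")` should give `"a\uFFFDb"`, and `ValidateSketch.validate(<<0xFF>>, replace_invalid: "?")` should give `"?"`. A `nil` default plus a raising `validate!/2` would sit on top of the same chunking.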