Perhaps this is so that invalid character streams (e.g. mismatched or orphaned surrogate pairs) can survive encoding and decoding (I haven't tested)? Strictly speaking, not every CharSequence is validly encodable as UTF-8; Java just quietly hides this. For example, this is a reversed surrogate pair (or two orphaned surrogates, take your pick):
(mapv #(Integer/toHexString (int %)) (String. (.getBytes "\uDC00\uD800" "UTF-8") "UTF-8"))
=> ["3f" "3f"]
Note that Java's UTF-8 encoder translates each of these to "?" (0x3f), losing all information about the original char values.
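To make the lossiness explicit, here is a small Java sketch (class and method names are mine, for illustration). `String.getBytes` uses the charset's default substitution behavior, while a `CharsetEncoder` configured with `CodingErrorAction.REPORT` refuses unpaired surrogates outright instead of silently replacing them:

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.StringJoiner;

public class SurrogateDemo {
    // Hex dump of the default (lossy) UTF-8 encoding, which substitutes
    // '?' (0x3f) for malformed input such as unpaired surrogates.
    static String lossyUtf8Hex(String s) {
        StringJoiner sj = new StringJoiner(" ");
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sj.add(String.format("%02x", b));
        }
        return sj.toString();
    }

    // Strict encoding: CodingErrorAction.REPORT makes the encoder throw
    // on invalid UTF-16 instead of substituting a replacement character.
    static boolean strictUtf8Encodable(String s) {
        try {
            StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String reversed = "\uDC00\uD800"; // low then high surrogate: invalid UTF-16
        System.out.println(lossyUtf8Hex(reversed));              // 3f 3f
        System.out.println(strictUtf8Encodable(reversed));       // false
        System.out.println(strictUtf8Encodable("\uD800\uDC00")); // true: a proper pair (U+10000)
    }
}
```

So the JDK itself agrees that such strings are not validly UTF-8-encodable; the default path just papers over it with replacement characters.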
That said, if that is the rationale, it makes more sense for fressian to say "we have a custom encoding that is mostly UTF-8 except that it preserves invalid UTF-16" than "this is UTF-8". I wonder whether other fressian implementations handle this the same way? JavaScript shares Java's UTF-16 string type, but not every platform does.