[Proposal] String.replace_invalid: optionally treat <<0>> as invalid

Daniel Kukula

Dec 17, 2023, 4:43:19 AM
to elixir-lang-core
Is there any chance of adding an option to String.replace_invalid that also replaces the null byte?

In UTF-8, <<0>> is a valid character, but it causes problems:
- it's invalid in Modified UTF-8: https://en.wikipedia.org/wiki/UTF-8#Modified_UTF-8
- Postgres does not accept text containing it; you get a "character not in repertoire" error
- it's sometimes used as a string terminator
- in CLI apps it's problematic when you try to compute the actual display width of a string:
iex> String.length(<<0>>)
1
iex> IO.puts("-" <> <<0>> <> "-")
-^ - <- it has a width of 2 characters when printed in Elixir. Some specifications also say not to print it at all.

José Valim

Dec 17, 2023, 4:47:28 AM
to elixir-l...@googlegroups.com
You can do a pass after the fact replacing <<0>> by the replacement character.

Daniel Kukula

Dec 17, 2023, 5:07:06 AM
to elixir-lang-core
I know. I currently have logic that does this, similar to the implementation of replace_invalid. When I saw the announcement, I thought I could replace this custom logic with the new function.
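
The custom logic itself is not shown in the thread. As a rough guess at what a single-pass version might look like, here is a sketch that walks the binary once and replaces both null bytes and invalid bytes. The module and function names are hypothetical, and this is not the actual `String.replace_invalid/2` implementation: it replaces each invalid byte individually, which may differ from how the standard library groups truncated sequences.

```elixir
defmodule SinglePass do
  # Hypothetical single-pass sanitizer: null bytes and invalid UTF-8 bytes
  # are both replaced in one traversal.
  def replace_invalid_and_null(binary, replacement \\ "\uFFFD")
      when is_binary(binary) and is_binary(replacement) do
    binary
    |> do_replace(replacement, [])
    |> IO.iodata_to_binary()
  end

  # The null byte must be matched before the ::utf8 clause,
  # because <<0>> is itself a valid UTF-8 code point.
  defp do_replace(<<0, rest::binary>>, rep, acc),
    do: do_replace(rest, rep, [acc, rep])

  # A valid code point is copied through unchanged.
  defp do_replace(<<char::utf8, rest::binary>>, rep, acc),
    do: do_replace(rest, rep, [acc, <<char::utf8>>])

  # Any other byte is invalid: drop it and emit the replacement.
  defp do_replace(<<_byte, rest::binary>>, rep, acc),
    do: do_replace(rest, rep, [acc, rep])

  # End of input: return the accumulated iodata.
  defp do_replace(<<>>, _rep, acc), do: acc
end
```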