Proposal: Add String.from_codepoint/1


Nathan Long

Jul 18, 2016, 11:58:27 AM
to elixir-lang-core
Hi! I propose adding String.from_codepoint/1

Example uses:

String.from_codepoint(97)      # => "a"
String.from_codepoint(128_518) # => "😆"
String.from_codepoint("1F916") # => "🤖"


I've personally wanted this while playing around with Elixir's unicode support.

I'm not sure whether this would ever be useful in production code, but it seems nice to me to have something symmetrical to `?a`.

The question has at least been posed before.

José has said "I am not a big fan of it exactly because we can write it today as `<<cp::utf8>>`" when I made a pull request, but he suggested I float the idea here.

String.from_codepoint would be more discoverable and easier to use; it would handle hexadecimal, which is how codepoints are usually referenced.
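To make the proposal concrete, here is a minimal sketch of how such a function might behave. The module name `CodepointSketch` is hypothetical, and nothing here exists in Elixir's `String` module; it only wraps `<<cp::utf8>>` and `String.to_integer/2`, assuming hex input arrives as a plain string like "1F916".

```elixir
defmodule CodepointSketch do
  # Hypothetical module: String.from_codepoint/1 is only a proposal in this thread.

  # Integer codepoint: encode it as UTF-8 using the bitstring syntax.
  # Raises ArgumentError for integers that are not valid codepoints.
  def from_codepoint(cp) when is_integer(cp), do: <<cp::utf8>>

  # Hexadecimal string such as "1F916": parse it as base 16, then encode.
  def from_codepoint(hex) when is_binary(hex) do
    hex |> String.to_integer(16) |> from_codepoint()
  end
end

CodepointSketch.from_codepoint(97)      # => "a"
CodepointSketch.from_codepoint("1F916") # => "🤖"
```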

Thoughts?

Ben Wilson

Jul 18, 2016, 1:19:19 PM
to elixir-lang-core
I too think that <<x ::utf8>> is perfectly adequate.

Obviously the proposed function also lets you do "1F916" but I can't think of a use case I've ever seen where that's necessary.

Martin Svalin

Jul 18, 2016, 2:55:53 PM
to elixir-l...@googlegroups.com
While a function in the String module would be more discoverable, I think that rather points to a need for a really good way of discovering all you can do with bitstrings. They're underused exactly because folks don't think to read the Kernel.SpecialForms.<<>> docs when they have a problem that bitstrings would solve.

As for taking a hex value, <<0x1F916::utf8>> does what you'd expect.

You can also `0x1F916 |> List.wrap |> to_string`, if you really care about pipelining. (That'd fall into "overusing pipelining" imho, though.)
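For comparison, the two existing approaches mentioned above can be tried side by side. This is only an illustrative sketch; the variable names are made up for the example.

```elixir
# Encode a codepoint written as a hex literal with the bitstring syntax.
robot = <<0x1F916::utf8>>

# The pipeline variant: wrap the integer in a list, then convert to a string.
piped = 0x1F916 |> List.wrap() |> to_string()

robot == piped # => true

# The same syntax works in a pattern match, giving the codepoint back.
<<cp::utf8>> = robot
cp == 0x1F916 # => true
```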

I think it has limited use. When are we working with individual codepoints as opposed to a collection of them? And if we have a list of codepoints, we have a CharList. I wouldn't want to see this out there:

    codepoints |> Enum.map(&String.from_codepoint) |> Enum.join
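For a whole list of codepoints, a sketch of the alternatives that already exist today (the sample list is made up for illustration):

```elixir
# A list of integer codepoints, i.e. a charlist.
codepoints = [104, 105, 0x1F916]

# Convert the whole list in one call.
List.to_string(codepoints) # => "hi🤖"

# Or build the binary with a comprehension and the bitstring syntax.
for cp <- codepoints, into: "", do: <<cp::utf8>> # => "hi🤖"
```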

- Martin
