[Proposal] Add ß --upcase--> ẞ


Tobias Pfeiffer

Apr 9, 2025, 4:46:05 AM
to elixir-lang-core
Hello all and thanks for all your work.

This isn't super important, but I got curious about how ß, a German character that notoriously has no capital version, is treated. That led me down a bit of a rabbit hole: apparently it _does_ have a capital version, "ẞ", which since 2024 is the preferred one. [1]

I wanted to suggest adding this to Elixir. I'm not super familiar with the Unicode support in Elixir, but I'd hope that an update to `UnicodeData.txt` plus some updated tests [2] might do the job. I'm happy to do the work myself if it's not _too_ complex and would appreciate pointers.

The current behavior (1.18.3 @ OTP 27.3.2) is:

iex(1)> String.capitalize("ß")
"Ss"
iex(2)> String.upcase("ß")
"SS"

I don't know how other programming languages treat this in general, but the latest Ruby release behaves the same as Elixir.

Thanks so much, nothing urgent - just a fun exploration that came out of a conversation I had at Alchemyconf in Braga last week :)

Cheers,
Tobi

[1] https://en.wikipedia.org/wiki/%C3%9F#Development_of_a_capital_form (yes I know wikipedia is my only source so far, I can check the German sources as well if wished)

José Valim

Apr 9, 2025, 5:15:35 AM
to elixir-l...@googlegroups.com
Hi Tobias,

We strictly follow the Unicode rules here. So if new Unicode versions include this update, then it will automatically work in the future. If Unicode does not update, I don't believe there is anything we can do, because then we would no longer be Unicode compliant and diverge from all other tools and languages out there.

There is typically a new Unicode version every year, around September, so maybe it will be addressed once it is out and we update Elixir files!
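For anyone who needs the capital ẞ before (or regardless of whether) the Unicode default mapping changes, one possible application-level workaround is to swap the character before calling `String.upcase/1`. This is only a minimal sketch: the module name `GermanCase` and its `upcase/1` function are hypothetical and not part of Elixir, and it assumes that mapping every ß to ẞ is acceptable for the text being processed.

```elixir
defmodule GermanCase do
  # Hypothetical helper, NOT part of Elixir's standard library.
  # Replaces ß (U+00DF) with ẞ (U+1E9E) first, so the default Unicode
  # mapping ß -> "SS" never gets a chance to apply. String.upcase/1 then
  # handles every other character as usual and leaves ẞ untouched.
  def upcase(string) when is_binary(string) do
    string
    |> String.replace("ß", "ẞ")
    |> String.upcase()
  end
end

IO.puts(GermanCase.upcase("straße"))
# prints "STRAẞE"
```

Keeping this in application code leaves `String.upcase/1` itself Unicode compliant, which is the constraint described above.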



Tobias Pfeiffer

Apr 9, 2025, 5:23:52 AM
to elixir-l...@googlegroups.com
Heyo José,

that's what I figured - just thought we might have missed a unicode update or so but probably not. Thanks for the quick response as always and I'll see what happens.

I checked and yeah the official unicode still goes to S apparently: https://www.compart.com/en/unicode/U+00DF - should have checked before, sorry!
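The same check can be done from Elixir itself. The snippet below is a small sketch that builds both characters from their code points (U+00DF for ß, U+1E9E for ẞ) and prints how the standard `String` functions map them. The variable names are just for illustration, and the results for ẞ are what the Unicode data files imply rather than something reported in this thread.

```elixir
# Build both characters from their code points to avoid any encoding ambiguity.
sharp_s = <<0x00DF::utf8>>          # "ß", LATIN SMALL LETTER SHARP S
capital_sharp_s = <<0x1E9E::utf8>>  # "ẞ", LATIN CAPITAL LETTER SHARP S

# Uppercasing ß follows the Unicode special-casing rule and yields "SS",
# matching the iex output reported earlier in the thread.
IO.inspect(String.upcase(sharp_s), label: "upcase(ß)")

# The capital form does lowercase to ß, so the mapping is asymmetric:
# ẞ -> ß works, but ß -> ẞ is not the default.
IO.inspect(String.downcase(capital_sharp_s), label: "downcase(ẞ)")

# A round trip therefore does not return the original character.
round_trip = capital_sharp_s |> String.downcase() |> String.upcase()
IO.inspect(round_trip, label: "upcase(downcase(ẞ))")
```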

Have a great week!
