Seeking community feedback: keep or remove some Unicode symbols?

44 views

Skip to first unread message

Andrew Helwer

unread,

Jun 11, 2024, 6:04:41 PMJun 11

to tlaplus

Here is the list of Unicode symbols that are currently supported by SANY: https://github.com/tlaplus/tlaplus-standard/blob/main/unicode/tla-unicode.csv

The symbols under discussion very closely resemble their ASCII counterparts, and are as follows (name, ASCII, Unicode):

label_as, ::, ∷
assign, :=, ≔
bnf_rule, ::=, ⩴
dots_2, .., ‥
dots_3, ..., …
vertver, ||, ‖
excl, !!, ‼
qq, ??, ⁇

At the monthly community meeting this morning we discussed how some Unicode symbols do not render very well across several fonts such as those used in GitHub, VS Code, and the Toolbox. The assign and bnf_rule symbols are especially bad. Here is the PR which precipitated this discussion. I am seeking peoples' feedback on whether these Unicode symbols should be (a) kept, (b) removed from the standard but accepted by the parser, or (c) both removed from the standard and rejected by the parser. I will summarize the arguments as:

(a) Pro-keep: this is fundamentally a client font issue; examples exist where these symbols are rendered nicely. As Unicode support grows across fonts, font designers will put more effort into ensuring a wider array of symbols are rendered pleasingly. Furthermore, parser support for these symbols simply offers people a choice to use them; if they wish to use the ASCII equivalents that will always be available to them.

(b)/(c) Pro-remove: at the moment these symbols do not provide a good user experience in the event that they are using Unicode specs in the Toolbox, VS Code, or GitHub (note: neither the Toolbox nor VS Code extension currently have a functioning as-you-type Unicode translator extension). Furthermore, the symbols under discussion do not meaningfully improve the presentation over their ASCII equivalents; they simply resemble contracted forms of the ASCII code string.

Look forward to your thoughts!

Andrew Helwer

C. M. Sperberg-McQueen

unread,

Jun 13, 2024, 5:58:50 PMJun 13

to tla...@googlegroups.com, Andrew Helwer

Andrew Helwer <andrew...@gmail.com> writes:

> .... I am seeking peoples' feedback on whether these Unicode symbols

> should be (a) kept, (b) removed from the standard but accepted by the
> parser, or (c) both removed from the standard and rejected by the
> parser. I will summarize the arguments as:
>

> (a) Pro-keep: ...
>
> (b)/(c) Pro-remove: ...

I'm not currently a heavy user of TLA+, and I expect the voices of those
who are heavy users should probably weigh more heavily. But for what
it's worth, the arguments for (a) seem to me much stronger, and to
support (a) as a conclusion.

The arguments brought forward in favor of (b) and (c) are weaker by
nature, and do not in fact support conclusion (b) or (c). (Some users
use software with disappointing fonts. Therefore (?!), the software
should forbid other users from using the characters whose rendering is
thought to be poor, or perhaps the parser and the standard should just
be brought out of alignment with each other.) The idea that
semantically precise Unicode encodings of operators should be eliminated
in favor of less precise ASCII encodings would be a hard one to sell me,
even if the primary argument in its favor were stronger than the
observation that some users don't think highly of some Unicode fonts.

That's my two cents (Use only as directed, void where prohibited by law,
past performance is not a guarantee of future results, if condition
persists consult a qualified font designer).

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Reply all

Reply to author

Forward

0 new messages