On Wed, Nov 29, 2023 at 02:01:29PM +0100, Grégory Vanuxem wrote:
>
> On Sun, Nov 26, 2023 at 18:00, Waldek Hebisch <de...@fricas.org> wrote:
> >
> > On Sun, Nov 26, 2023 at 05:20:11PM +0100, Grégory Vanuxem wrote:
> > >
> > so, as you see Unicode is supported.
>
> Thanks for making me remember greek letter support. In fact I also no
> longer remember where this is implemented.
Well, one thing is that we query Lisp to check whether something is a
letter, and if so we allow it in identifiers. In a Unicode-enabled
Lisp this means that we allow Unicode letters.
Another thing is strings, where things work in a natural way. And
there are 'ucodeToString' and its dual, which depend on the implementation.
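
To illustrate the "ask the runtime whether it is a letter" rule, here is a
minimal sketch in Python rather than Lisp. It is not FriCAS code: the
function names are made up for the example, and the rule (Unicode general
category starting with "L") is only an assumed stand-in for whatever the
underlying Lisp reports.

```python
import unicodedata


def is_letter(ch: str) -> bool:
    """True if the runtime's Unicode tables classify ch as a letter."""
    # General categories Lu, Ll, Lt, Lm, Lo all start with "L".
    return unicodedata.category(ch).startswith("L")


def is_plain_identifier(name: str) -> bool:
    """Hypothetical rule: a non-empty run of letters is an identifier."""
    return len(name) > 0 and all(is_letter(c) for c in name)


print(is_plain_identifier("β"))   # Greek letter, category Ll
print(is_plain_identifier("∈"))   # math symbol, category Sm
```

Under this rule β is accepted as it stands, while ∈ is a symbol and not a
letter, so it would need an escape to be read as an identifier.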
> But that's a good thing
> this is supported. Since I sometimes interact with Julia I like the
> way it supports Unicode in a terminal, for example \in plus <TAB> will
> replace \in with ∈ automatically. It even completes functions or
> unicode commands so \empt plus two <TAB> will complete to \emptyset
> and after ∅. It could be interesting I think to add this type of
> support in terminal supporting unicode in FriCAS. And even for Jupyter
> notebook, Jfricas, why not.
Such things have their place in Clef (which can now handle UTF-8) and,
in the case of Jfricas, in the Jupyter frontend. There is the question of
how exactly this should work. In FriCAS '_' is an escape character,
so by analogy we should use it. But in FriCAS '_' means that
operators lose their special syntactic properties, and you would
like almost the opposite. I guess that as a user-controllable option
this would be OK.
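
A rough sketch of how such completion could behave, in Python. This is not
Clef or Jfricas code. The SYMBOLS table, the function name and the
'escape' parameter are all assumptions for the example; the escape
character is a parameter precisely because the choice between '\' and '_'
is open.

```python
# Hypothetical name -> character table; a real one would be much larger.
SYMBOLS = {"in": "∈", "emptyset": "∅", "alpha": "α", "beta": "β"}


def complete(text: str, escape: str = "\\") -> str:
    """Return the input line as it would look after one TAB press."""
    pos = text.rfind(escape)
    if pos < 0:
        return text                       # nothing to complete
    head, name = text[:pos], text[pos + len(escape):]
    if name in SYMBOLS:
        return head + SYMBOLS[name]       # full name: insert the character
    matches = sorted(n for n in SYMBOLS if n.startswith(name))
    if len(matches) == 1:
        return head + escape + matches[0]  # unique prefix: finish the name
    return text                           # ambiguous or unknown: no change


line = complete("\\empt")   # first TAB completes the name
line = complete(line)       # second TAB replaces it with the character
print(line)
```

With escape="_" the same function turns "_in" into "∈", which is the
user-controllable variant mentioned above.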
> > But FriCAS has no definition
> > for ∈, so
> > (7) -> _∈ + β
> >
> > (7) ∈ + 2
> > Type: Polynomial(Integer)
> > works because the leading _ instructs FriCAS to treat ∈ as an identifier,
> > but FriCAS has no idea that you want ∈ to be an infix operator.
>
> Yes, and it's a pity I think. But when I speak of Unicode support in
> terminal or spad/input file I do not think about all Unicode
> characters, just, say, Greek letters and, roughly, mathematics-related
> characters.
Yes, that is natural. My point was that we need a table of the characters
that we want and their properties. In particular, for operators we
need priorities. And we need a corresponding function/keyword for each, so
that in situations where Unicode is unavailable we can still use
all functions implemented in FriCAS.
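
As a sketch of what such a table could look like (in Python, with every
entry invented for illustration: the ASCII names and the priority numbers
are not taken from FriCAS):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CharInfo:
    char: str                # the Unicode character
    ascii_name: str          # spelling to use when Unicode is unavailable
    kind: str                # "infix" or "letter"
    priority: Optional[int]  # binding strength for operators, else None


# Illustrative entries only.
TABLE = [
    CharInfo("≤", "<=", "infix", 400),
    CharInfo("≠", "~=", "infix", 400),
    CharInfo("∈", "member", "infix", 400),
    CharInfo("α", "alpha", "letter", None),
]
BY_CHAR = {info.char: info for info in TABLE}


def to_ascii(text: str) -> str:
    """Rewrite text using only the ASCII spellings from the table."""
    return "".join(BY_CHAR[c].ascii_name if c in BY_CHAR else c for c in text)


print(to_ascii("α ≤ 1"))
```

The point of keeping the ASCII spelling in the same table is that every
Unicode operator is then just another name for something reachable without
Unicode.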
> After, some "look-alike" characters can be
> introduced. For example, Chrome, Firefox etc. refused to add Unicode
> URL support to their browsers because of security concerns. There is a
> Cyrillic character that looks very much like the 'l' (L), so gmail, google,
> apple can easily be spoofed.
Well, IIUC several Cyrillic glyphs are considered "the same" as
Latin glyphs. But Cyrillic-using countries traditionally had a
distinct code block for Cyrillic characters, so Unicode decided
that Unicode Cyrillic character codes should also be different from
Latin character codes. More generally, I was an early enthusiast of
Unicode, but now I have serious doubts about many Unicode decisions.
One promise of Unicode was that one should be able to simply
work with character codes; that is not true. There are combining
characters, so to properly interpret text one needs to look at
surrounding characters. There are normalization forms and
characters essentially serving as abbreviations. And largish
tables of character properties. All this is error prone.
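
These points are easy to reproduce with Python's standard 'unicodedata'
module. The helper 'same_text' is a name made up for this example; the
library calls are the standard ones.

```python
import unicodedata

composed = "\u00e9"      # é as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT


def same_text(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Same text for a reader, different sequences of character codes.
print(composed == decomposed, same_text(composed, decomposed))

# A character that essentially abbreviates two others (the "fi" ligature).
print(unicodedata.normalize("NFKC", "\ufb01"))

# Look-alikes that stay distinct even after normalization.
latin_a, cyrillic_a = "a", "\u0430"
print(unicodedata.name(latin_a), "/", unicodedata.name(cyrillic_a))
print(same_text(latin_a, cyrillic_a))
```

So a plain comparison of codes is wrong for the first pair, and
normalization does nothing about the last one.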
> But what do you think of adding support to some mathematical operators
> in Unicode notation?
Well, ATM the interpreter parser is hand-written and essentially
hardcodes the syntax. This makes it hard to add new operators.
OTOH the old parser, a modified version of which is used by the Spad
compiler, is table driven: you add a new operator to the tables
and the parser can handle it.
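
To show what "table driven" buys, here is a toy precedence-climbing parser
in Python. It is only a sketch of the general technique, not the Spad
parser: the OPERATORS table, its priorities and all function names are
assumptions for the example.

```python
import re

# operator -> (priority, right_associative); values are illustrative.
OPERATORS = {"+": (10, False), "*": (20, False), "^": (30, True)}


def tokenize(text):
    # The token pattern is rebuilt from the table on every call, so new
    # operators are recognized. Unknown characters are silently skipped.
    ops = sorted(OPERATORS, key=len, reverse=True)
    pattern = "|".join([re.escape(op) for op in ops] + [r"\w+", r"[()]"])
    return re.findall(pattern, text)


def parse_atom(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok, rest = tokens[0], tokens[1:]
    if tok == "(":
        tree, rest = parse_expr(rest, 0)
        if not rest or rest[0] != ")":
            raise SyntaxError("missing ')'")
        return tree, rest[1:]
    if tok == ")" or tok in OPERATORS:
        raise SyntaxError("unexpected token " + tok)
    return tok, rest


def parse_expr(tokens, min_prio):
    left, rest = parse_atom(tokens)
    while rest and rest[0] in OPERATORS:
        op = rest[0]
        prio, right_assoc = OPERATORS[op]
        if prio < min_prio:
            break
        # Left-associative operators require a strictly tighter right side.
        next_min = prio if right_assoc else prio + 1
        right, rest = parse_expr(rest[1:], next_min)
        left = (op, left, right)
    return left, rest


def parse(text):
    tree, rest = parse_expr(tokenize(text), 0)
    if rest:
        raise SyntaxError("trailing tokens: " + " ".join(rest))
    return tree


# Adding a new infix operator is a single table entry.
OPERATORS["∈"] = (5, False)
print(parse("x ∈ S + T"))
```

No parsing function changes when ∈ is added; only the table does. With a
hand-written parser the same change means editing the code that hardcodes
the syntax.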
--
Waldek Hebisch