Tip: easy input of math/unicode symbols

Nicolas M. Thiery

Feb 21, 2020, 11:09:36 AM
to sage-...@googlegroups.com
Hi,

Now that Sage uses Python 3, we can finally use Unicode symbols in variable names:

sage: Φ = lambda λ: λ + 1

But how to input them? I just accidentally discovered that IPython (and
thus Jupyter) makes it super easy. Type:

sage: \Phi<tab>

And you get:

sage: Φ

https://ipython.readthedocs.io/en/stable/api/generated/IPython.core.completer.html#forward-latex-unicode-completion

Reverse conversion from symbol to latex name works too:

sage: \Φ<tab>

And you get:

sage: \Phi

Not all symbols are available by default (\otimes, for example), but
it's possible to add more:

import IPython.core.completer
IPython.core.completer.latex_symbols[r'\otimes'] = '⊗'

(It is best to add it to reverse_latex_symbol as well.)
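
A minimal sketch of registering a symbol in both directions, assuming the two tables are plain dicts as in the snippet above. The helper name register_symbol is hypothetical, and the empty fallback dicts are only there so that the sketch also runs where IPython is not installed:

```python
def register_symbol(forward, reverse, latex_name, char):
    """Record latex_name -> char and char -> latex_name in the two tables."""
    forward[latex_name] = char
    reverse[char] = latex_name


try:
    # Assumption: both tables are reachable from IPython.core.completer,
    # as in the messages of this thread.
    from IPython.core import completer
    forward, reverse = completer.latex_symbols, completer.reverse_latex_symbol
except (ImportError, AttributeError):
    # Stand-in tables so the sketch still runs without IPython.
    forward, reverse = {}, {}

register_symbol(forward, reverse, r"\otimes", "⊗")
print(forward[r"\otimes"], reverse["⊗"])
```

Keeping the two updates in one helper avoids the situation where \otimes<tab> works but the reverse lookup from ⊗ does not.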


A rather big caveat though: one can easily input all the usual math
glyphs (mathbb, mathfrak, ...) of a character. However they all are
considered as equal by Python itself:

sage: \mbfN<tab>
sage: 𝐍
<function numerical_approx at 0x7f712339f6a8>
sage: N
<function numerical_approx at 0x7f712339f6a8>

Cheers,
Nicolas
--
Nicolas M. Thiéry "Isil" <nth...@users.sf.net>
http://Nicolas.Thiery.name/

Emmanuel Charpentier

Feb 21, 2020, 2:39:31 PM
to sage-devel
A similar bag of tricks is available to users of sage-shell-mode in emacs, using TeX Input Method or one of its possible customizations.

But such variable names may cause havoc in a \LaTeX output. Consider:

sage: var("λ")
λ

This works

sage: latex(λ^2)
λ^{2}

Ahem: the rendition of λ is ... "λ", which pdflatex doesn't accept :

sage: view(λ^2)
An error occurred.

[ pdflatex dumps its bowels : Snip... ]

Latex error

xelatex accepts this :

sage: view(λ^2, engine="xelatex")

but the λ is not rendered, probably because the default font used by xelatex has no λ glyph, or because the preamble doesn't load the "right" set of packages (maybe \usepackage{utf8}?).

Paradoxically, if your program may someday have to produce \LaTeX output, it is better not to use Emacs' TeX input method and to stick to "classical" input. Or customize the preambles used by Sage (for view) and SageTeX, which is probably not trivial...

What happens when you want to export a Jupyter notebook containing such symbols to PDF?

HTH,

Simon King

Feb 21, 2020, 4:26:55 PM
to sage-...@googlegroups.com
Hi Emmanuel,

On 2020-02-21, Emmanuel Charpentier <emanuel.c...@gmail.com> wrote:
> sage: latex(λ^2)
> λ^{2}

Couldn't we modify the latex() function, by exploiting
IPython.core.completer.reverse_latex_symbol, to automatically translate
all unicode symbols that aren't understood by pdflatex?

Best regards,
Simon

Simon King

Feb 21, 2020, 4:46:54 PM
to sage-...@googlegroups.com
Such as:
sage: import IPython
sage: from IPython.core.completer import reverse_latex_symbol
sage: def replfunc(match):
....:     return reverse_latex_symbol[match.group(0)]
....:
sage: import re
sage: regex = re.compile('|'.join(re.escape(x) for x in reverse_latex_symbol))
sage: f(λ) = sin(λ)
sage: latex(f)
λ \ {\mapsto}\ \sin\left(λ\right)
sage: print(regex.sub(replfunc,latex(f)))
\lambda \ {\mapsto}\ \sin\left(\lambda\right)
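
For readers without a Sage session at hand, the same idea can be sketched in plain Python. The small REVERSE table is a hand-written stand-in for IPython's reverse_latex_symbol (an assumption for illustration), and to_ascii_latex is a hypothetical name. The extra space after a macro that is directly followed by a letter is an addition that is not part of the session above:

```python
import re

# Stand-in for IPython's reverse_latex_symbol: a tiny hand-written subset.
REVERSE = {"λ": r"\lambda", "Φ": r"\Phi", "⊗": r"\otimes"}

# One alternation matching any known symbol.
PATTERN = re.compile("|".join(re.escape(ch) for ch in REVERSE))


def to_ascii_latex(source: str) -> str:
    """Replace each known Unicode symbol in source by its LaTeX macro."""
    def repl(match):
        macro = REVERSE[match.group(0)]
        following = match.string[match.end():match.end() + 1]
        # A macro directly followed by a letter would merge with it
        # (\lambdax), so separate the two with a space.
        return macro + " " if following.isalpha() else macro
    return PATTERN.sub(repl, source)


print(to_ascii_latex(r"λ \ {\mapsto}\ \sin\left(λ\right)"))
# prints: \lambda \ {\mapsto}\ \sin\left(\lambda\right)
```

Symbols missing from the table pass through unchanged, so this only helps for the characters that the table knows about.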

Best regards,
Simon

Emmanuel Charpentier

Feb 21, 2020, 6:22:15 PM
to sage-devel
Who would do that?
You?
And which 73 others? Unicode is Huuuuge...

Seriously: such a proposal couldn't stop at "just" the "customary" single-letter Greek variable names we are used to and think of first. There is no a priori valid reason to accept them and refuse, for example, Hebrew or Cyrillic characters (and names), katakana, sinograms or polytonic Greek multi-letter names...

The real question(s) would be "Where do you stop? (And why?)"

Furthermore, the question remains of what happens to those characters in \LaTeX math mode (not obvious to me), what happens to spacing, alignment and possibly kerning, etc.


On Friday, February 21, 2020 at 22:46:54 UTC+1, Simon King wrote:

Eric Gourgoulhon

Feb 22, 2020, 8:35:49 AM
to sage-devel
Hi Emmanuel,


On Friday, February 21, 2020 at 20:39:31 UTC+1, Emmanuel Charpentier wrote:
> But such variable names may cause havoc in a \LaTeX output. Consider:
>
> sage: var("λ")
> λ
>
> This works
>
> sage: latex(λ^2)
> λ^{2}
>
> Ahem: the rendition of λ is ... "λ", which pdflatex doesn't accept :
>
> sage: view(λ^2)
> An error occurred.

A solution here is to declare explicitly the LaTeX name of the symbolic variable at creation, as we did when using only ASCII names:

sage: λ = var('λ', latex_name=r'\lambda')
sage: view(λ^2)

works perfectly.

Best regards,

Eric.

Emmanuel Charpentier

Feb 22, 2020, 4:46:27 PM
to sage-devel


On Saturday, February 22, 2020 at 14:35:49 UTC+1, Eric Gourgoulhon wrote:
> Hi Emmanuel,
>
> On Friday, February 21, 2020 at 20:39:31 UTC+1, Emmanuel Charpentier wrote:
>> But such variable names may cause havoc in a \LaTeX output. Consider:
>>
>> sage: var("λ")
>> λ
>>
>> This works
>>
>> sage: latex(λ^2)
>> λ^{2}
>>
>> Ahem: the rendition of λ is ... "λ", which pdflatex doesn't accept :
>>
>> sage: view(λ^2)
>> An error occurred.
>
> A solution here is to declare explicitly the LaTeX name of the symbolic variable at creation, as we did when using only ASCII names:

Of course, but this somehow defeats the purpose of "easy use of Unicode characters"...

BTW: without this addition, what happens when you try to export a Jupyter notebook containing such a Unicode character to PDF?

Cordially,

Simon King

Feb 22, 2020, 5:24:06 PM
to sage-...@googlegroups.com
On 2020-02-22, Emmanuel Charpentier <emanuel.c...@gmail.com> wrote:
> On Saturday, February 22, 2020 at 14:35:49 UTC+1, Eric Gourgoulhon wrote:
>> A solution here is to declare explicitly the LaTeX name of the symbolic
>> variable at creation, as we did when using only ASCII names:
>>
>
> Of course. but this somehow defeats the purpose of "easy use of Unicode
> characters"...

And that's why I believe it makes sense to exploit what IPython has to
offer concerning automatic transition between latex and unicode. Some
typical cases would work out of the box, one wouldn't reinvent the wheel
(because it makes use of existing functionality), and for those cases
that still don't work the user can provide the latex name explicitly.

Best regards,
Simon

Nicolas M. Thiery

Feb 23, 2020, 4:38:04 PM
to sage-...@googlegroups.com

> A rather big caveat though: one can easily input all the usual math
> glyphs (mathbb, mathfrak, ...) of a character. However they all are
> considered as equal by Python itself:
>
> sage: \mbfN<tab>
> sage: 𝐍
> <function numerical_approx at 0x7f712339f6a8>
> sage: N
> <function numerical_approx at 0x7f712339f6a8>

For the curious: Python uses the so-called NFKC normalization for its
identifiers:

https://docs.python.org/3.3/reference/lexical_analysis.html#identifiers
https://en.wikipedia.org/wiki/Unicode_equivalence

sage: import unicodedata
sage: unicodedata.normalize('NFKC', '𝐍')
'N'
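
A small plain-Python sketch of the consequence, for illustration only: since identifiers are normalized when the source is parsed, a name typed as 𝐍 ends up stored under the plain name N. The variable names bold_n and namespace are made up for this example:

```python
import unicodedata

bold_n = "𝐍"  # the bold N from the session above

# The compatibility normalization maps the bold letter to plain "N".
print(unicodedata.normalize("NFKC", bold_n) == "N")   # True

# Assigning to the identifier 𝐍 actually binds the name N.
namespace = {}
exec(bold_n + " = 42", namespace)
print("N" in namespace, bold_n in namespace)          # True False
```

So the math variants of a letter cannot be used to tell two variables apart; they are just different spellings of one name.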

Nicolas M. Thiery

Feb 23, 2020, 4:51:35 PM
to sage-...@googlegroups.com
On Fri, Feb 21, 2020 at 11:39:31AM -0800, Emmanuel Charpentier wrote:
> A similar bag of tricks is available to users of sage-shell-mode in emacs,
> using TeX Input Method <https://www.emacswiki.org/emacs/TeXInputMethod> or
> one of its possible customizations.

Cool! I really need to update my sage-shell-mode :-)

> Ahem: the rendition of λ is ... "λ", which pdflatex doesn't accept :

Argl. And exporting the notebook to pdf fails for the same reason.

Gosh. In 2020, more than 20 years into unicode, our tool chain really
ought to support unicode ... Time to move on to xelatex for building
our pdf?

Emmanuel Charpentier

Feb 25, 2020, 2:41:13 AM
to sage-devel
On Sunday, February 23, 2020 at 22:51:35 UTC+1, Nicolas M. Thiéry wrote:

[ Snip... ]

> Gosh. In 2020, more than 20 years into unicode, our tool chain really
> ought to support unicode ... Time to move on to xelatex for building
> our pdf?

The current "line of the Party" seems to be that the future of TeX is LuaTeX. Even if there are currently a tad more XeTeX-compatible LaTeX packages than LuaTeX-compatible ones.

Introducing LuaTeX compatibility in our tools would be a nice addition, but it seems like a lot of work (if only to find the places that need updating...).

HTH,

William

Feb 26, 2020, 1:16:19 AM
to sage-devel
Make sure to benchmark the speed of luatex for this application, since in my experience it can be significantly slower than the other tex engines...

Emmanuel Charpentier

Feb 26, 2020, 4:23:33 AM
to sage-devel


On Wednesday, February 26, 2020 at 07:16:19 UTC+1, William wrote:
> Make sure to benchmark the speed of luatex for this application, since in my experience it can be significantly slower than the other tex engines...

Indeed. But this slowdown seems to be due in large part to the building of the fonts' dimensional parameters, and *this* can be avoided by pre-computation (the details escape me at the moment, but I remember having looked that up...).

Dima Pasechnik

Feb 26, 2020, 5:06:49 AM
to sage-devel
IIRC, one can prebuild a format for the document, in particular
speeding up font loading.

Also, Lua is an interpreted language; we all know how one can write
slow code in such a scenario, and how it can be sped up.


Dima

E. Madison Bray

Mar 6, 2020, 10:01:16 AM
to sage-devel
I opened a ticket about this some weeks ago:
https://trac.sagemath.org/ticket/28966

rjf

Mar 7, 2020, 7:03:28 PM
to sage-devel
Consider the consequences to a parser when there are multiple forms
of what appear to be (say) "+" or "space" or "A" (is that a capital alpha?)...
Selecting Greek or other symbols from a palette might be cute. Freeform
input of Unicode is probably a bag of worms.
RJF

Matthias Koeppe

Jul 11, 2020, 3:04:07 PM
to sage-devel
I have created Meta-ticket https://trac.sagemath.org/ticket/30111 for keeping track of Unicode issues.


On Friday, February 21, 2020 at 8:09:36 AM UTC-8, Nicolas M. Thiéry wrote: