
LaTeX Unicode entry issues


Haines Brown

Aug 24, 2023, 4:40:06 PM
I'm necessarily working with a pdfLaTeX document.

I seem to recall from long ago that I could enter Unicode
in this way:

^^^^002B

Now it returns the error: "not set up for use with LaTeX." What does
this error imply?

This code works to produce an asterisk:

\char"002A

but \char"20AE fails although the character ₮ is accessible in my
font. The error I get is "Bad character code (8366)... character
number must be between 0 and 255." Does "character number" refer to
002A? If so, why does this sequence of characters not fall between 0
and 255?

My TeX Live setup runs in Emacs, which automatically changes " to ``.
This makes it annoying to enter Unicode, for I have to paste it from
the terminal. Is there an easy way around this?

--

Haines Brown

David

Aug 24, 2023, 5:00:30 PM
I just use Texmaker (occasionally TeXstudio) with Unicode enabled in
the editor settings.
Every now and again I get a bad character reference, but that is
always because the font foundry concerned does not supply that
character or character size.
It has nothing to do with the editor or TeX Live.
Cheers!

--
A Kiwi in Australia,
doing my bit toward raising the national standard.

to...@tuxteam.de

Aug 25, 2023, 12:30:07 AM
On Fri, Aug 25, 2023 at 06:23:00AM +0200, to...@tuxteam.de wrote:

[...]

> Here [1] [...]

Gah. The ref:

[1] https://tex.stackexchange.com/questions/377613/solve-unicode-char-is-not-set-up-for-use-with-latex-without-special-handling-o

--
t

to...@tuxteam.de

Aug 25, 2023, 12:30:07 AM
The short answer: use a Unicode-capable TeX/LaTeX engine (LuaTeX/LuaLaTeX
is said to be the most complete at present; XeTeX is perhaps another
option). Enter the characters as Unicode (most probably UTF-8 encoded).

Here [1] is a longer answer, which explains that the "classical" TeX input
engine (and pdfLaTeX is in that category) is not quite up to the task
of "doing" full Unicode.

I'd go with LuaLaTeX. And stop entering funny things like \char"XXXX
and use straight UTF-8 instead.
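
As a minimal sketch of what that looks like in practice, assuming the
file is saved as UTF-8 and compiled with lualatex. The file name is
hypothetical, and the commented-out DejaVu Sans line is an assumption
about an installed font, needed only for characters the default Latin
Modern font lacks:

```latex
% Compile with: lualatex example.tex   (hypothetical file name)
\documentclass{article}
\usepackage{fontspec}           % font selection for LuaLaTeX/XeLaTeX
% \setmainfont{DejaVu Sans}     % assumption: an installed system font
\begin{document}
An em dash — a dagger † and a capital Omega Ω, typed directly as UTF-8.
\end{document}
```

Neither inputenc nor \DeclareUnicodeCharacter is involved here; whether
a glyph shows up depends on the selected font containing it.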

Cheers
--
t

Michel Verdier

Aug 25, 2023, 2:30:07 AM
On 2023-08-24, Haines Brown wrote:

> I'm necessarily working with a pdfLaTeX document.

You mean a LaTeX document compiled with pdflatex?

> My TexLive runs in emacs and automatically changes " to ``. This makes
> it annoying to enter Unicode, for I have to paste it from the
> terminal. Is there an easy way around this?

In Emacs LaTeX mode, type " twice to get a literal " character.

Max Nikulin

Aug 25, 2023, 11:00:06 AM
On 25/08/2023 03:24, Haines Brown wrote:
> Now it returns the error: "not set up for use with LaTeX." What does
> this error imply?
>
> This code works to produce an asterisk:
>
> \char"002A

A complete minimal example of a LaTeX document would better show what
you are trying to achieve. I believe the usual issue is allowing
Unicode characters typed with the compose key: α → ∞. LuaLaTeX should
handle Unicode better than pdfLaTeX; however, fonts must be explicitly
configured. Fortunately, it is easier to use e.g. TrueType fonts there.

Charles Kroeger

Aug 26, 2023, 3:10:07 AM
I just keep a really large list of UTF-8 characters, and if I need one I
copy it and zap it in. I suppose this is not cool, but chacun à son goût.

A fun site if you want to write to someone in UTF-8 runes:

https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt

ⅩⅩⅥ - Ⅷ - ⅯⅯⅩⅩⅢ

--
☢ ➛ ☠ ➛ ♺

debia...@howorth.org.uk

Aug 26, 2023, 7:20:06 AM
Charles Kroeger <mb...@gmx.co.uk> wrote:
> I just have a really large list of UTF-8 characters and if I need one
> I copy it and zap it in. I suppose this is not cool but, chacun a son
> gout.
>
> a fun site if you want to write someone in UTF-8 runes.
>
> https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt

Gah! A mindworm. My browser shows boxes instead of characters for the
runes, but that's not what's got me.

The second column of the Thai poems is not aligned in my browser :(

But it is in my terminal. But not in gedit or tea or LO. Even when I
set LO to use Hack, which is the font my terminal says it is using.

So now I'm deep in the rabbit holes. Why do fonts behave like that?!

> ⅩⅩⅥ - Ⅷ - ⅯⅯⅩⅩⅢ

Haines Brown

Aug 26, 2023, 8:20:06 AM
On Fri, Aug 25, 2023 at 09:53:23PM +0700, Max Nikulin wrote:
> On 25/08/2023 03:24, Haines Brown wrote:
> > Now it returns the error: "not set up for use with LaTeX." What does
> > this error imply?
> >
> > This code works to produce an asterisk:
> >
> > \char"002A
>
> A complete minimal example of LaTeX document may describe better what are
> you trying to achieve.

\documentclass[12pt]{article} %
\usepackage[utf8]{inputenc} %
\usepackage[T1]{fontenc} %
\usepackage[greek,english]{babel} % to make Greek characters available
\DeclareUnicodeCharacter{2014}{\dash} % to get m-dash

\begin{document}

To get an em dash rather than "-" I do % \dash.
Why does this produce the error: Undefined control sequence?

If I paste an upper case Omega % Ω
into a LaTeX file and run pdflatex on it I am told the character is not
set up for LaTeX. \$ albatross tells me it is available in the DejaVu
Sans font which I have installed.

If I add \verb|\usepackage[greek,english]{babel}|
to the preamble and paste an upper case Omega in the body, I get the error:
Command \verb|\textOmega| unavailable in encoding T1.

A program I use requires the dagger symbol, †, which is code point 2020.
But % \symbol{"2020}
produces a fatal error.

The code for E is U+0045. I try % ^^^^0045
and get the error: Unicode character % ^^^ (U+001E)
not set up for use with LaTeX. It turns out that the character
\verb| ^ | can't be used. I suspect the command is deprecated because
it clashes with its use in math mode.

The command to produce an Omega % \char"005B
produces garbage, not a µ. This character is in DejaVu Sans, but LaTeX
is not able to display it.

The question is: if my system has access to a character, in that it can
be pasted, why can LaTeX not do so as well?

\end{document}


--

Haines Brown

Eduardo M KALINOWSKI

Aug 26, 2023, 8:30:06 AM
On 26/08/2023 09:08, Haines Brown wrote:
> The question is: if my system has access to a character in that it can
> be pasted, why cannot LaTeX do so as well.

Because TeX dates from before Unicode was even being discussed, and does
not use the libraries for handling Unicode that the other software on
your system does.

That's not to say it has not been modified to add at least partial
support for Unicode (utf8 has even been the default encoding for a couple
of years now), but it's not the same as something built from the start to
handle Unicode.

On the other hand, LuaTeX and XeTeX are evolutions of TeX that have
much more modern, native Unicode support. Use them instead of pdf(la)tex.
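
On the original \char question: the number in the error message is just
the hexadecimal code point written in decimal, "20AE = 2*4096 + 10*16 +
14 = 8366. pdfTeX works with 8-bit fonts, so \char accepts only 0 to 255
there, which is why "002A = 42 is fine. pdfTeX also knows only the
two-caret notation, so ^^^^0045 is read as ^^ followed by ^, that is,
character 94 - 64 = 30 (U+001E), the one named in the error. A sketch of
the same notations under LuaLaTeX, assuming DejaVu Sans is installed and
contains the glyph:

```latex
% Compile with lualatex; DejaVu Sans is an assumed installed font.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{DejaVu Sans}
\begin{document}
\char"20AE      % TeX primitive, hexadecimal code point
\symbol{"20AE}  % LaTeX wrapper around \char
^^^^20ae        % four-caret notation; hex digits must be lowercase
₮               % the character itself, typed as UTF-8
\end{document}
```

All four lines are meant to select the same character; the last form is
the one that stays readable in the source.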

--
Now I lay me down to sleep
I pray the double lock will keep;
May no brick through the window break,
And, no one rob me till I awake.

Eduardo M KALINOWSKI
edu...@kalinowski.com.br

Max Nikulin

Aug 26, 2023, 11:20:07 PM
On 26/08/2023 19:08, Haines Brown wrote:
> \documentclass[12pt]{article} %
> \usepackage[utf8]{inputenc} %
> \usepackage[T1]{fontenc} %
> \usepackage[greek,english]{babel} % to make Greek characters available

It seems you are overestimating its effect. You still need to load a
font encoding that contains Greek characters. Unless you have real
reasons to avoid LuaTeX, I recommend using lualatex instead of pdflatex.

> \DeclareUnicodeCharacter{2014}{\dash} % to get m-dash

Do you really need it? Even without an explicit declaration there is no
issue with pdflatex and the U+2014 character:

\begin{document}
Test—character.
\end{document}

Not to mention

Test---ligature.

On the other hand in response to

\begin{document}
Test\dash{}test.
\end{document}

I get

! Undefined control sequence.
l.7 Test\dash
             {}test.
?

There is a \textemdash command, however. So your \DeclareUnicodeCharacter
is incorrect and almost certainly unnecessary.
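
If a declaration is wanted anyway, a corrected version maps the code
point to a command that exists. A sketch for pdflatex; redeclaring
U+2014 is redundant, since the utf8 input encoding already handles it,
and is shown only to illustrate the syntax:

```latex
\documentclass[12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
% \textemdash is a defined LaTeX command; \dash is not.
\DeclareUnicodeCharacter{2014}{\textemdash}
\begin{document}
Test—character, and Test---ligature.
\end{document}
```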

> If I paste an upper case Omega % Ω
> into a LaTeX file and run pdflatex on it I am told the character is not
> set up for LaTeX. \$ albatross tells me it is available in the DejaVu
> Sans font which I have installed.

It is unfortunate, but in general, even with the LuaTeX engine, you have
to specify a particular font explicitly. In the specific case of "Ω", the
character is available in lmodern (the default font used by LuaTeX).
Unlike browsers or office software, TeX engines do not try hard to find
some font that can serve as a fallback for a particular character.

David

Aug 26, 2023, 11:30:06 PM
No need for any of it, just insert --- for an em dash and -- for an en
dash with numbers.

> > If I paste an upper case Omega % Ω
> > into a LaTeX file and run pdflatex on it I am told the character is
> > not
> > set up for LaTeX. \$ albatross tells me it is available in the
> > DejaVu
> > Sans font which I have installed.
>
> It is sour, but in general even with LuaTeX engine you have to
> explicitly specify particular font. In the specific case of "Ω", the
> character is available in lmodern (default font used by LuaTeX).
> Unlike
> browsers or office software, TeX engines do not try hard to find at
> least some font that can be used as a fallback for particular
> character.
>
>

Max Nikulin

Aug 27, 2023, 10:20:06 PM
On 27/08/2023 10:23, David wrote:
> On Sun, 2023-08-27 at 10:16 +0700, Max Nikulin wrote:
>> On 26/08/2023 19:08, Haines Brown wrote:
>>> \documentclass[12pt]{article} %
>>> \usepackage[utf8]{inputenc} %
>>> \usepackage[T1]{fontenc} %
>>> \usepackage[greek,english]{babel} % to make Greek characters
>>> available
>>
>> It seems, you are overestimating effect. You still need to provide
>> fontenc containing Greek characters. Unless you have real reasons to
>> avoid LuaTeX, I recommend you to use lualatex instead of pdflatex.

To be clear, babel does high-level work: it loads hyphenation patterns
and configures the \selectlanguage and \foreignlanguage commands. If I
read the docs in /usr/share/doc/texlive-doc/generic/babel-greek/ correctly,
to allow "Ω" and other Greek letters in pdfLaTeX, it is necessary to add

\usepackage{textalpha}

The fontenc package may map ASCII characters to Greek letters.
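
A minimal sketch of that setup for pdflatex, assuming the greek-fontenc
bundle (which provides textalpha) and LGR-encoded Greek fonts such as
cbfonts are installed. Loading LGR explicitly through fontenc is an
assumption here, not something taken from the babel-greek docs:

```latex
\documentclass[12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[LGR,T1]{fontenc}   % LGR holds the Greek glyphs; T1 stays the default
\usepackage{textalpha}         % makes \textOmega etc. usable outside LGR
\begin{document}
A capital Omega: Ω. The same via its text command: \textOmega.
\end{document}
```

This targets the "Command \textOmega unavailable in encoding T1" error:
the letter is then taken from an LGR font even while the surrounding
text is set in T1.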

> No need for any of it, just insert --- for an em dash and -- for an en
> dash with numbers.

I do not see a real reason to prohibit the Unicode "—" EM DASH character
nowadays. Moreover, dashes and the spaces around them depend on the
particular language, although that is likely not the case for Greek.
There is a variety of dashes besides --- and --:

\cdash---  "---  Cyrillic emdash in plain text.
\cdash--~  "--~  Cyrillic emdash in compound names (as in Mendeleev"--~Klapeiron).
\cdash--*  "--*  Cyrillic emdash for denoting direct speech.
As to LuaLaTeX, the CMU (Computer Modern Unicode) fonts have better
coverage of various characters than the default Latin Modern fonts:

\usepackage{fontspec}
\setmainfont{CMU Serif}
\setsansfont{CMU Sans Serif}
\setmonofont{CMU Typewriter Text}

I can say nothing about their quality in printed documents compared
with DejaVu or Noto.