
Converting \u00a0 (nbsp) to TeX

Oleg Paraschenko

Nov 10, 2011, 8:34:14 AM
Hello,

So far, I have assumed that the Unicode character \u00a0 should be
converted to the tilde in TeX. But now I've found a counterexample:

\documentclass{article}
\begin{document}
test:\newline
aaa~aaa\newline
~~~~~aaa~~~aaa\newline
\end{document}

The use of "~" as non-collapsing white space works inside lines,
but not at the beginning of lines: "~~~~~" is ignored. I
think I know why (kern after break?), but that does not help me
figure out:

How to represent the symbol \u00a0 in TeX?

Thanks for the suggestions.


--
Oleg Parashchenko olpa@ http://uucode.com/
http://uucode.com/blog/ XML, TeX, Python, Mac, Chess

Herbert Schulz

Nov 10, 2011, 8:46:43 AM
In article
<5f54bce3-9dc1-4ea8...@m19g2000yqh.googlegroups.com>,
Howdy,

White space at the start of a line is thrown out by TeX. Place an empty
box at the start of the line (\mbox{}), so

\documentclass{article}
\begin{document}
test:\newline
aaa~aaa\newline
\mbox{}~~~~~aaa~~~aaa\newline
\end{document}

works.

By the way, if you are trying to align items this is the WRONG way to do it.
Use the tabbing environment or make a tabular.
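
As a rough illustration of the tabbing approach, here is a minimal sketch. The tab-stop widths and the second row are assumptions made up for the example, not taken from the original question:

```latex
\documentclass{article}
\begin{document}
\begin{tabbing}
% Template line: sets two tab stops and is not printed (\kill).
\hspace{2em}\=\hspace{4em}\=\kill
test:\\
\>aaa\>aaa\\
\>bbb\>bbb
\end{tabbing}
\end{document}
```

Here the indentation comes from the tab stops rather than from a run of spaces, so it does not depend on what TeX discards at a line break.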

Good Luck,
Herb Schulz

Enrico Gregorio

Nov 10, 2011, 8:47:07 AM
Oleg Paraschenko <ole...@gmail.com> wrote:

> Hello,
>
> So far, I have assumed that the Unicode character \u00a0 should be
> converted to the tilde in TeX. But now I've found a counterexample:
>
> \documentclass{article}
> \begin{document}
> test:\newline
> aaa~aaa\newline
> ~~~~~aaa~~~aaa\newline
> \end{document}
>
> The use of "~" as non-collapsing white space works inside lines,
> but not at the beginning of lines: "~~~~~" is ignored. I
> think I know why (kern after break?), but that does not help me
> figure out:
>
> How to represent the symbol \u00a0 in TeX?

~ is not a "non-collapsing space": it does what spaces are supposed
to do in TeX, namely disappear after a line break.

I would never use a sequence of spaces in that way, but rather
\hspace or \hspace* that allows for precise control.
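
A minimal sketch of what that could look like for the example in the question; the lengths 1em and 2em are arbitrary assumptions chosen only for illustration:

```latex
\documentclass{article}
\begin{document}
test:\newline
aaa\hspace{1em}aaa\newline
% The starred form is meant to survive at the start of a line,
% where ordinary space would be discarded.
\hspace*{2em}aaa\hspace{1em}aaa
\end{document}
```

Note that a plain \hspace between the words still permits a line break at that point, unlike ~.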

If you really want to use U+00A0 in that way,

\usepackage{newunicodechar}
\newunicodechar{^^^^00a0}{\leavevmode~}

Ciao
Enrico

Donald Arseneau

Nov 10, 2011, 9:08:18 PM
Enrico Gregorio <Facile.d...@in.rete.it> writes:

> I would never use a sequence of spaces in that way, but rather
> \hspace or \hspace* that allows for precise control.

But if you have text that *uses* the unicode non-breaking space,
perhaps written by somebody else, then TeX should typeset it
as intended.

> If you really want to use U+00A0 in that way,
>
> \usepackage{newunicodechar}
> \newunicodechar{^^^^00a0}{\leavevmode~}

\leavevmode doesn't interrupt discarding (and ~ already
does \leavevmode). You probably meant

\newunicodechar{^^^^00a0}{\mbox{}~}

or

\newunicodechar{^^^^00a0}{\leavevmode\vadjust{}~}


Donald Arseneau as...@triumf.ca

Enrico Gregorio

Nov 11, 2011, 3:09:16 AM
Thanks.

Ciao
Enrico

Manuel Collado

Nov 11, 2011, 5:07:54 AM
El 11/11/2011 3:08, Donald Arseneau escribió:
> Enrico Gregorio<Facile.d...@in.rete.it> writes:
>
>> I would never use a sequence of spaces in that way, but rather
>> \hspace or \hspace* that allows for precise control.
>
> But if you have text that *uses* the unicode non-breaking space,
> perhaps written by somebody else, then TeX should typeset it
> as intended.
>...
>
> \newunicodechar{^^^^00a0}{\mbox{}~}
>
> or
>
> \newunicodechar{^^^^00a0}{\leavevmode\vadjust{}~}

Do these commands also work in math mode?

--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado

Donald Arseneau

Nov 11, 2011, 5:14:42 AM
Manuel Collado <m.co...@domain.invalid> writes:

> > \newunicodechar{^^^^00a0}{\mbox{}~}
> >
> > or
> >
> > \newunicodechar{^^^^00a0}{\leavevmode\vadjust{}~}
>
> Do these commands also work in math mode?

Yes.


Donald Arseneau as...@triumf.ca

Oleg Paraschenko

Nov 11, 2011, 8:10:24 AM
Hello Herbert,

On 10 Nov., 14:46, Herbert Schulz <he...@wideopenwest.com> wrote:
...
> White space at the start of a line is thrown out by TeX. Place an empty
> box at the start of the line (\mbox{}),

Thanks, it helps.

...
> By the way, if you are trying to align items this is the WRONG way to do it.
> Use the tabbing environment or make a tabular.

Yes, I agree. But in a converter from something to LaTeX I have to
retain as much of the user's formatting as possible.

>
> Good Luck,
> Herb Schulz

Donald Arseneau

Nov 13, 2011, 5:36:43 AM
Donald Arseneau <as...@triumf.ca> writes:

> Manuel Collado <m.co...@domain.invalid> writes:
>
> > > \newunicodechar{^^^^00a0}{\mbox{}~}
> > >
> > > or
> > >
> > > \newunicodechar{^^^^00a0}{\leavevmode\vadjust{}~}
> >
> > Do these commands also work in math mode?
>
> Yes.

Actually, I don't know what \newunicodechar does with the definition,
so it might be math-incompatible.

And they don't work well in math mode -- they allow line breaks.
I suggest instead that the non-breaking space be defined as

\newunicodechar{^^^^00a0}{\leavevmode\nobreak\vadjust{}~}
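
For context, a minimal sketch of a complete test document using this definition. It assumes XeLaTeX or LuaLaTeX, where the ^^^^00a0 notation is read as the character U+00A0; the body simply repeats the example from the original question with U+00A0 in place of ~:

```latex
% Assumption: compiled with XeLaTeX or LuaLaTeX.
\documentclass{article}
\usepackage{newunicodechar}
% \vadjust{} is not discardable, so the space that follows it
% should be kept even right after a line break.
\newunicodechar{^^^^00a0}{\leavevmode\nobreak\vadjust{}~}
\begin{document}
test:\newline
aaa^^^^00a0aaa\newline
^^^^00a0^^^^00a0^^^^00a0^^^^00a0^^^^00a0aaa^^^^00a0^^^^00a0^^^^00a0aaa
\end{document}
```

In a real input file the literal U+00A0 character would normally appear instead of the ^^^^00a0 notation, which is used here only to make the character visible.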

Donald Arseneau as...@triumf.ca

Enrico Gregorio

Nov 13, 2011, 6:08:18 AM
\newunicodechar{<char>}{...} just activates <char> and performs
the equivalent of

\protected\def<char>{...}

Notice that <char> can be used in the body of the definition,
because it's already been tokenized. So one can "safely" say

\newunicodechar{<char>}{\ifmmode...\else...\fi}

if <char> is intended to be used in math mode with a different
meaning than in text mode.
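
A minimal sketch of such a mode-dependent definition, assuming XeLaTeX or LuaLaTeX. The thin space chosen for math mode is purely an illustrative assumption; the text-mode branch reuses the definition suggested earlier in this thread:

```latex
% Assumption: compiled with XeLaTeX or LuaLaTeX.
\documentclass{article}
\usepackage{newunicodechar}
\newunicodechar{^^^^00a0}{%
  \ifmmode
    \,% hypothetical choice: a thin space in math mode
  \else
    \leavevmode\nobreak\vadjust{}~% text-mode definition from this thread
  \fi}
\begin{document}
aaa^^^^00a0aaa and $a^^^^00a0b$
\end{document}
```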

Ciao
Enrico

William F Hammond

Nov 21, 2011, 1:18:42 PM


>> White space at the start of a line is thrown out by TeX. Place an empty
>> box at the start of the line (\mbox{}),
> ...
>> By the way, if you are trying to align items this is the WRONG way to do it.
>> Use the tabbing environment or make a tabular.
>
> Yes, I agree. But in a converter from something to LaTeX I have to
> retain as much of the user's formatting as possible.

Many kinds of 'something' are really not suitable for conversion to formats
residing at a level higher than dvi or pdf.

-- Bill

Manuel Collado

Nov 21, 2011, 6:35:36 PM
I took part in the origin of this discussion. The "something" in
question is HTML and similar markups, which make extensive use of
&nbsp; (U+00A0) not only to avoid line breaks at specific points but also
to emulate indentation.

And I assume that HTML is certainly suitable for conversion to formats
at a level higher than dvi or pdf.

--

William F Hammond

Nov 28, 2011, 6:07:00 PM
Manuel Collado <m.co...@domain.invalid> writes:

> . . .
>> Many kinds of 'something' are really not suitable for conversion to formats
>> residing at a level higher than dvi or pdf.
>
> I took part in the origin of this discussion. The "something" in
> question is HTML and similar markups, which make extensive use of
> &nbsp; (U+00A0) not only to avoid line breaks at specific points but
> also to emulate indentation.
>
> And I assume that HTML is certainly suitable for conversion to formats
> at a level higher than dvi or pdf.

The use of U+00A0 to 'emulate indentation' is an abuse of HTML -- to the
point that one has a kind of 'something'. :-)

-- Bill
