[Q] Hyphenation of words which contain hyphens

Duncan Hothersall

Sep 6, 1996

I can't seem to find an answer to this problem: how can one
specify appropriate hyphenation for words which actually contain
a hyphen? For example, "government-funded" won't hyphenate, so I
tried using

\hyphenation{gov-ern-ment-funded}

but, unsurprisingly, this didn't work.

I am using mikTeX under Win95, and the plain TeX macros.

Thanks for any light that can be shed.

-------------------------------------------------------------
Duncan Hothersall Phone: +44 (0)131 451 3526
Edinburgh Business School Fax: +44 (0)131 451 3002
Heriot-Watt University Email: d...@ebs.hw.ac.uk
Edinburgh, EH14 4AS, UK URL: http://www.ebs.hw.ac.uk
-------------------------------------------------------------

David Carlisle

Sep 6, 1996
to Duncan Hothersall

In article <50osjj$i...@lomond.icbl.hw.ac.uk>
d...@ebs.hw.ac.uk (Duncan Hothersall) writes:


> I can't seem to find an answer to this problem: how can one
> specify appropriate hyphenation for words which actually contain
> a hyphen? For example, "government-funded" won't hyphenate, so I
> tried using
>
> \hyphenation{gov-ern-ment-funded}
>
> but, unsurprisingly, this didn't work.
>
> I am using mikTeX under Win95, and the plain TeX macros.
                                         ^^^^^^^^^
                                         Strange choice if you are
                                         writing a document

> Thanks for any light that can be shed.

======================== cut here

\hsize1mm
a gov\-ern\-ment-funded

a gov\-ern\-ment\nobreak\hskip0pt-\nobreak\hskip0pt funded

a government\nobreak\hskip0pt-\nobreak\hskip0pt funded

\bye

========================

any of the above produce breakpoints.

An alternative is to use the dc fonts or any other fonts in the Cork
(or T1) encoding. These have two hyphen characters, so you can tell
TeX that the normal - is a letter, so it does not inhibit hyphenation,
and then have TeX insert the alternative character automatically.

David

Dorai Sitaram

Sep 8, 1996

In article <ud685rleob.fsf@vummath>,
David Carlisle <carl...@ma.man.ac.uk> wrote:
$In article <50osjj$i...@lomond.icbl.hw.ac.uk>
$d...@ebs.hw.ac.uk (Duncan Hothersall) writes:
$
$ I can't seem to find an answer to this problem: how can one
$ specify appropriate hyphenation for words which actually contain
$ a hyphen? For example, "government-funded" won't hyphenate, so I
$ tried using
$
$ \hyphenation{gov-ern-ment-funded}
$
$ but, unsurprisingly, this didn't work.
$
$ I am using mikTeX under Win95, and the plain TeX macros.
$ ^^^^^^^^^
$ Strange choice if you are
$ writing a document

But a good one.

$a gov\-ern\-ment-funded
$
$a gov\-ern\-ment\nobreak\hskip0pt-\nobreak\hskip0pt funded
$
$a government\nobreak\hskip0pt-\nobreak\hskip0pt funded
$
$any of the above produce breakpoints.

The second \nobreak in the 2nd and 3rd suggestions above
shouldn't be there, as it prevents the following, valid,
hyphenation:

government-
funded


Alain Kessi

Sep 11, 1996
to d...@ebs.hw.ac.uk

d...@ebs.hw.ac.uk (Duncan Hothersall) wrote:
>I can't seem to find an answer to this problem: how can one
>specify appropriate hyphenation for words which actually contain
>a hyphen? For example, "government-funded" won't hyphenate, so I
>tried using
>
>\hyphenation{gov-ern-ment-funded}

>
>but, unsurprisingly, this didn't work.

Define

\def\hyph{\hskip0pt-\penalty0\hskip0pt\relax}

then use government\hyph funded.

-- Alain Kessi (alain...@psi.ch), at Paul Scherrer Institut, Zuerich, CH
++++ stop the execution of Mumia Abu-Jamal ++++
++++ if you agree copy these 3 sentences in your own sig ++++
++++ see: http://www.xs4all.nl/~tank/spg-l/sigaction.htm ++++


Bernd Raichle

Sep 11, 1996

David Carlisle <carl...@ma.man.ac.uk> writes:
> In article <50osjj$i...@lomond.icbl.hw.ac.uk>
> d...@ebs.hw.ac.uk (Duncan Hothersall) writes:
>
>
> I can't seem to find an answer to this problem: how can one
> specify appropriate hyphenation for words which actually contain
> a hyphen? For example, "government-funded" won't hyphenate, so I
> tried using
>
> \hyphenation{gov-ern-ment-funded}
>
> but, unsurprisingly, this didn't work.

The character `-' is hard coded in the TeX source code for the
\hyphenation primitive to denote a valid hyphenation point. Thus, the
above input tells TeX that the word

governmentfunded

(not `-'!!) can be hyphenated at the places "gov|ern|ment|funded".


This restriction cannot be removed except by changing TeX.web, in
which case the result can no longer be called "TeX".
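
A minimal plain TeX sketch of this point, assuming a standard plain
format with English patterns loaded (the test file is illustrative and
not from the thread). Both words should show up with hyphens in the log,
but only in the first do they mark break points taken from the exception:

```tex
% Plain TeX sketch: what \hyphenation{gov-ern-ment-funded} actually registers.
\hyphenation{gov-ern-ment-funded}  % stored as the 16-letter word "governmentfunded"
\showhyphens{governmentfunded}     % matches the exception
\showhyphens{government-funded}    % contains a real "-", so the exception does not apply
\bye
```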


[...]


>
> \hsize1mm
> a gov\-ern\-ment-funded
>

> a gov\-ern\-ment\nobreak\hskip0pt-\nobreak\hskip0pt funded
>

> a government\nobreak\hskip0pt-\nobreak\hskip0pt funded

^^^^^^^^^^^^^^^^^

cf. TeXbook, Appendix D, search for "\allowhyphens".

Btw. it's only necessary to input

a government\nobreak-\hskip0pt\relax funded

to achieve the expected result. Unless you change the \hyphenchar of
the current \font to something other than `-' (`-' is the default),
TeX will automatically insert a \discretionary{}{}{} after a `-'
giving a valid hyphenation point at this place. The \nobreak in
David's examples can not prevent TeX from inserting this
\discretionary.

(If you are using `german.sty', you can use "= which is defined as
`\nobreak-\hskip0pt\relax'---German words often consist of many parts
glued together with a `-', e.g. Arbeiter-Unfallversicherungsgesetz :-)
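
A hedged sketch of how this might be wrapped up for plain TeX. The macro
name \hy is made up for illustration (german.sty calls it "=), and the
narrow \hsize is only there to force every available break:

```tex
% Plain TeX sketch; \hy is a hypothetical name, not part of plain TeX.
% \nobreak lets TeX treat "government" as a hyphenatable word,
% TeX itself adds an empty discretionary after the "-",
% and \hskip0pt lets the following word be hyphenated as well.
\def\hy{\nobreak-\hskip0pt\relax}
\hsize=1mm        % absurdly narrow, so that every legal break is taken
\parindent=0pt
a government\hy funded project
\bye
```

The trailing \relax stops TeX from scanning the following text for a
`plus' or `minus' that would belong to the glue specification.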


[...]


> An alternative is to use the dc fonts or any other fonts in the Cork
> (or T1) encoding. These have two hyphen characters, so you can tell
> TeX that the normal - is a letter, so it does not inhibit hyphenation,
> and then have TeX insert the alternative character automatically.

Nonetheless you can not use \hyphenation to declare the hyphenation
exceptions for words with an explicit hyphen `-' :-(((((


-bernd
____________________________________________________________________
Bernd Raichle "Le langage est source
DANTE e.V., Koordinator `german.sty' de malentendus"
email: ger...@dante.de (A. de Saint-Exupery)
Infos ueber DANTE e.V.:
ftp: ftp.dante.de oder
email: ftp...@dante.de in /tex-archive/usergrps/dante/
www: http://www.dante.de/

Bernd Raichle

Sep 11, 1996

do...@cs.rice.edu (Dorai Sitaram) writes:
: In article <ud685rleob.fsf@vummath>,

: David Carlisle <carl...@ma.man.ac.uk> wrote:
: $In article <50osjj$i...@lomond.icbl.hw.ac.uk>
: $d...@ebs.hw.ac.uk (Duncan Hothersall) writes:
[...]
: $a gov\-ern\-ment-funded
: $
: $a gov\-ern\-ment\nobreak\hskip0pt-\nobreak\hskip0pt funded
^^^^^^^^

: $a government\nobreak\hskip0pt-\nobreak\hskip0pt funded
^^^^^^^^

: $any of the above produce breakpoints.


:
: The second \nobreak in the 2nd and 3rd suggestions above
: shouldn't be there, as it prevents the following, valid,
: hyphenation:
:
: government-
: funded


No, it doesn't prevent the hyphenation at this point, because TeX
automatically inserts a \discretionary{}{}{} after each
\hyphenchar\font (i.e. normally each `-'). Cf. TeXbook, TeX.web.

-bernd

David Kastrup

Sep 12, 1996
to Bernd Raichle

Bernd Raichle <rai...@informatik.uni-stuttgart.de> writes:

>
> David Carlisle <carl...@ma.man.ac.uk> writes:

> > An alternative is to use the dc fonts or any other fonts in the Cork
> > (or T1) encoding. These have two hyphen characters, so you can tell
> > TeX that the normal - is a letter, so it does not inhibit hyphenation,
> > and then have TeX insert the alternative character automatically.
>
> Nonetheless you can not use \hyphenation to declare the hyphenation
> exceptions for words with an explicit hyphen `-' :-(((((

This, of course, is shamefully untrue. See the following transcript
for an example of how to specify such hyphenations:
tex
This is TeX, Version 3.14159 (C version 6.1)
**\lccode`==`-
Hyphenation patterns for english, german, loaded.

*\hyphenation{gover-nment=fo-unded} % bad hyphenation for illustration
*\hyphenchar\tenrm=127
*\lccode`-=`-
*\showhyphens{government-founded}
Underfull \hbox (badness 10000) detected at line 0
[] \tenrm gover^^?nment-fo^^?unded

*\end
(see the transcript file for additional information)
No pages of output.
Transcript written on texput.log.

--
David Kastrup Institut fuer Neuroinformatik, Ruhr-Universitaet Bochum
Email: d...@neuroinformatik.ruhr-uni-bochum.de Telephon: +49-234-700-5570


Dr Yoshimasa Tsuji

Sep 13, 1996

In article <wnx91ah...@informatik.uni-stuttgart.de> Bernd Raichle <rai...@informatik.uni-stuttgart.de> writes:
> > a hyphen? For example, "government-funded" won't hyphenate, so I
>
> The character `-' is hard coded in the TeX source code for the
> \hyphenation primitive to denote a valid hyphenation point. Thus, the

The author of the lines above obviously hasn't read Knuth's TeXBook
thoroughly.

--------another answer----
In order to hyphenate a hyphenated word like government-funded, you need
to use two different encodings for your hyphen character.
For example, if you insist on assigning a dot for your hyphen character,
you can try the following session (start your tex and get a
prompt):
**\relax
*\hyphenchar\font=46 % dot is the hyphen now. You can use
% \defaulthyphenchar if you wish
*\lccode'055='055 % Cheat TeX. 055 is the ordinary hyphen sign
*\showhyphens{government-funded}
Underfull \hbox (badness 10000) detected at line 0
[] \tenrm gov.ern.ment-.fun.ded

One way to decently cope with this matter is to invent an end-of-line
hyphen character of your own (e.g. char123 -- en dash is not usually
used in languages that permit division of hyphenated words) and also
make a ligature such that a preceding hyphen character (true char'055)
will not be printed when followed by this private hyphen character.
{let the glyph of your private hyphen be exactly the same as char'055
and don't forget to tell \patterns to split after - }

That is what I do in Russian for which division of hyphenated words
is quite common.

----------end of an answer----
Another consideration concerning hyphenated words is that it is
usually better to do division there (perhaps in the first phase
at which no hyphenation is attempted). In that case, insert
a glue after the hyphen character like
\def\-{-\hskip0pt}
government\-funded

But this is not frightfully elegant.

Cheers,
Tsuji

David Carlisle

Sep 13, 1996
to Dr Yoshimasa Tsuji

>> The character `-' is hard coded in the TeX source code for the
>> \hyphenation primitive to denote a valid hyphenation point. Thus, the


> The author of the lines above obviously hasn't read Knuth's TeXBook
> thoroughly.

Actually he is very well acquainted with the TeXBook and the TeX Source.

You can have a font with two hyphens and set up the appropriate ligatures.
This is standard in the T1/Cork/dc/ec encoded fonts.

So your ``--------another answer----'' is correct but does not refer
to the point that Bernd was making which was that although you can
change \hyphenchar and \defaulthyphenchar to anything you like
you *have* to use - (position 45) to specify legal breakpoints
in the \hyphenation command, even if this is not being used as the
\hyphenchar.

This is a real pain, as while you can tell TeX that - is a `letter' by
giving it a lccode, and setting up appropriate \patterns, you can not
specify hyphenation exceptions with \hyphenation for words with -.


David

Dr Yoshimasa Tsuji

Sep 14, 1996

I'd like to cancel my previous posting mentioned two lines above.

First, Bernd Raichle is right in saying that the hyphen character is
hard coded in tex.web: in section 937 of "TeX: The Program", Knuth
says
if cur_chr = "-" then
instead of
if cur_chr = hyf_char then
as he did everywhere else. (This means that the \hyphenchar and \defaulthyphenchar
primitives are meaningless as long as this inconsistency persists.)

Thus you cannot say,
\hyphenchar\font='056
\hyphenation{gov.ern.ment}
\hyphenchar\font='055 % restore

Allowing "-" to behave like just another ordinary letter
is not a good solution. I would like to suggest
\def\-{\kern0pt-\hskip0pt}
and say "government\-funded" instead of "government-funded". TeX will
try to break after "-" if it is a good place, but will keep "ment" and
"-" together. If you are not satisfied with the way TeX splits "government"
or "funded", try e.g.
\hyphenation{gov-ern-ment fund-ed}\righthyphenmin=2

Cheers,
Tsuji

David Kastrup

Sep 16, 1996
to David Carlisle

David Carlisle <carl...@ma.man.ac.uk> writes:

> >> The character `-' is hard coded in the TeX source code for the
> >> \hyphenation primitive to denote a valid hyphenation point. Thus, the
>

> So your ``--------another answer----'' is correct but does not refer
> to the point that Bernd was making which was that although you can
> change \hyphenchar and \defaulthyphenchar to anything you like
> you *have* to use - (position 45,) to specify legal breakpoints
> in the \hyphenation command, even if this is not being used as the
> \hyphenchar.
>
> This is a real pain, as while you can tell TeX that - is a `letter' by
> giving it a lccode, and setting up appropriate \patterns, you can not
> specify hyphenation exceptions with \hyphenation for words with -.

But of course you can, and I have posted so before (the trick used is
the same that you'd use if you wanted to specify in the hyphenation
patterns how words containing digits would have to be hyphenated):

\bgroup
\lccode`==`-
\hyphenation{govern-ment=-foun-ded}
\egroup
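
For plain TeX users, the pieces from this and the earlier transcript
could be put together roughly as follows. This is a sketch under
assumptions: cmr10 has no second hyphen glyph, so slot 127 is used purely
to make the break points visible in the log, and a real document would
want a font with a proper alternative hyphen (for example a T1-encoded
one). The group only keeps the \lccode change local; the exception itself
is stored globally.

```tex
% Plain TeX sketch, assuming the standard plain format (\tenrm = cmr10).
\begingroup
  \lccode`\==`\-          % inside \hyphenation, "=" now stands for the letter "-"
  \hyphenation{govern-ment=foun-ded}
\endgroup                 % \lccode of "=" is restored here
\hyphenchar\tenrm=127     % "-" is no longer this font's hyphen character
\lccode`\-=`\-            % "-" counts as a letter when words are looked up
\showhyphens{government-founded}
\bye
```

The exception is only consulted if `-' has a nonzero \lccode and is not
the font's \hyphenchar when the paragraph is broken, which is what the
two assignments after the group arrange.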

David Carlisle

Sep 16, 1996
to David Kastrup

Me> ... you can not ...

I should have known better: Never say Never.

David Kastrup (in a message that hasn't shown up in my news spool yet)
says:
dak> But of course you can....


Well I'll be blowed. Go to the top of the class. I *promise* I won't
argue with you about \over ever again.


David

This looked so much fun I had to try it myself:

\documentclass{minimal}

% set up - as a normal letter, and use hanging char from 127 for hyphens
\defaulthyphenchar=127
\lccode`\-=`\-

\usepackage[T1]{fontenc}

\begin{document}
\showthe\hyphenchar\font
\showhyphens{aaaaaaaaaaaaaa-bbbbbbbbbbbbbbbb ffffffff}

\bgroup
\lccode`==`-
\hyphenation{aaaaaaa-aaaaaaa=-bbbbbbb-bbbbbbbbb ffff=ffff}
\egroup

\showhyphens{aaaaaaaaaaaaaa-bbbbbbbbbbbbbbbb ffffffff}

\end{document}
