Add/change hyphenation rules?

Martin Koistinen

Sep 13, 2013, 10:36:26 AM
to pyhy...@googlegroups.com
Greetings,

Firstly, many thanks for putting PyHyphen together.  Very useful!

I'm writing because I'm seeing some head-scratching behaviour.  In particular, the word 'calculator' never seems to get hyphenated, even though other words around it are, including shorter words.

Initially I figured LibreOffice's hyphenation dictionary must just be lacking that word, but once I downloaded and installed LibreOffice, I found that it hyphenates that word just fine.  So... I'm struggling to understand why I cannot get it to hyphenate in my project.

Is there a way to augment the dictionary with my own words such that I can continue to use the existing en-US hyphenation dictionary (and any future updates of the same) but at the same time use project-specific words too?  It isn't clear to me how I might do that.

Any tips greatly appreciated!

Dr. Leo

Sep 16, 2013, 2:38:20 PM
to pyhy...@googlegroups.com
Hi,

On my machine, 'calculator' is hyphenated correctly as follows (win32, Python 2.7.5, PyHyphen 2.0.4):


In [2]: from hyphen import Hyphenator

In [3]: h=Hyphenator()

In [4]: h.pairs(u'calculator')
Out[4]: [[u'cal', u'culator'], [u'calcu', u'lator'], [u'calcula', u'tor']]

If you want to apply multiple hyphenators, e.g. to use your own dictionaries on top of an existing one, you could simply create a list of hyphenators and write a function that applies these successively to a given word until one of the hyphenators in the list returns a positive result.
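A minimal sketch of that idea, assuming every object in the list offers a pairs(word) method that returns a list of [head, tail] splits, as Hyphenator.pairs() does in the session above. WordListHyphenator and first_pairs are hypothetical names made up for this illustration and are not part of PyHyphen:

```python
class WordListHyphenator(object):
    """Hypothetical helper: hyphenates only words from a project word list."""

    def __init__(self, patterns):
        # patterns maps a word to its hyphenated spelling,
        # e.g. {'pyhyphen': 'py-hy-phen'}
        self.patterns = dict(patterns)

    def pairs(self, word):
        pattern = self.patterns.get(word)
        if pattern is None:
            return []
        parts = pattern.split('-')
        # One [head, tail] pair per hyphenation point.
        return [[''.join(parts[:i]), ''.join(parts[i:])]
                for i in range(1, len(parts))]


def first_pairs(word, hyphenators):
    """Return the splits from the first hyphenator that finds any, else []."""
    for hyphenator in hyphenators:
        result = hyphenator.pairs(word)
        if result:
            return result
    return []
```

With PyHyphen installed, the list could be something like [WordListHyphenator({...}), Hyphenator()], so that project-specific words are tried first and everything else falls through to the stock dictionary.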

Leo





Martin Koistinen

Sep 17, 2013, 9:42:19 AM
to pyhy...@googlegroups.com, fhax...@googlemail.com
OK, it turns out that every instance of the word 'calculator' in my document is followed by a full stop. I suspect that the hyphenator doesn't hyphenate words at the end of a sentence, perhaps to prevent widows.
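If the trailing full stop is indeed what gets in the way (an assumption, not something confirmed in this thread), one possible workaround is to peel punctuation off each token before handing it to the hyphenator and to put it back afterwards. A rough sketch; split_punctuation and pairs_ignoring_punctuation are hypothetical helper names, and the hyphenator argument is only assumed to offer a pairs(word) method as shown earlier:

```python
import string


def split_punctuation(token):
    """Split a token into (leading punctuation, core word, trailing punctuation)."""
    lead_len = len(token) - len(token.lstrip(string.punctuation))
    core = token.strip(string.punctuation)
    return token[:lead_len], core, token[lead_len + len(core):]


def pairs_ignoring_punctuation(token, hyphenator):
    """Hyphenate only the core word, then reattach surrounding punctuation."""
    lead, core, trail = split_punctuation(token)
    if not core:
        # Nothing but punctuation, so there is nothing to hyphenate.
        return []
    return [[lead + head, tail + trail]
            for head, tail in hyphenator.pairs(core)]
```

Called with a token such as 'calculator.', this passes only 'calculator' to the hyphenator, and the full stop ends up on the tail of each returned pair.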

Thanks for the reply!