Issue with the word "outer"

Robert Rybczynski

Jan 6, 2018, 3:31:24 PM
to pyhyphen
Dr. Leo,

Thank you for making PyHyphen and releasing it to the public. I needed a package to break words into syllables and I was starting to think that I would have to write it myself. I know it's a tough nut to crack so I wasn't looking forward to writing that code. Then I found PyHyphen. It will save me considerable work and let me stay on track for my own project.

I encountered a problem. I found that hyphen.Hyphenator().syllables("outer") returns ['outer'], while syllables("outerwear") returns ['out', 'er', 'wear']. I then tried misspelling the word: syllables("outter"). It returned ['out', 'ter'], which is actually closer to what I expected (but still incorrect).

What can I do about this? I see that the project is open source but I have never worked on an open source project before. Should I grab the source and see if I can discover what's going on? Is this likely to be a data problem, and the solution will be to update a dictionary?

Any advice, guidance, or recommendations will be appreciated.

Best regards,
Rob Rybczynski

Régis Behmo

Jan 7, 2018, 10:48:13 AM
to pyhyphen
@Robert: I managed to reproduce your issue:

    >>> import hyphen
    >>> hyphen.Hyphenator().syllables('outer')
    ['outer']

Unless I'm mistaken, the last syllable of a word will always be 3 characters or more. This is due to the "RIGHTHYPHENMIN 3" line at the top of the pyhyphen/hyph_en_US.dic file. In "outer", the final syllable "er" is only 2 characters long, so the break before it is suppressed. If you change the line to "RIGHTHYPHENMIN 1", you get the desired result:

    >>> import hyphen
    >>> hyphen.Hyphenator().syllables('outer')
    ['out', 'er']
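
To illustrate why a right-hand minimum produces the results seen in this thread, here is a minimal sketch in plain Python. It does not use PyHyphen or read the .dic file; the function name `split_with_minimums`, its parameters, and the hard-coded candidate break positions are all hypothetical and only stand in for whatever the hyphenation patterns would propose.

```python
def split_with_minimums(word, break_points, left_min=2, right_min=3):
    """Split `word` at candidate break positions, dropping any break that
    leaves fewer than `left_min` characters before it or fewer than
    `right_min` characters after it (counted to the end of the word).

    `break_points` are illustrative inputs, not values taken from PyHyphen.
    """
    allowed = [
        pos for pos in sorted(break_points)
        if pos >= left_min and len(word) - pos >= right_min
    ]
    pieces, start = [], 0
    for pos in allowed:
        pieces.append(word[start:pos])
        start = pos
    pieces.append(word[start:])
    return pieces


# With a right minimum of 3, the break before "er" is rejected.
print(split_with_minimums("outer", [3], right_min=3))         # ['outer']
# Relaxing the minimum lets the same candidate break through.
print(split_with_minimums("outer", [3], right_min=1))         # ['out', 'er']
# Longer words are unaffected: every break leaves 3+ characters to the right.
print(split_with_minimums("outerwear", [3, 5], right_min=3))  # ['out', 'er', 'wear']
print(split_with_minimums("outter", [3], right_min=3))        # ['out', 'ter']
```

Under these assumptions, the sketch also suggests that a minimum of 2 would already be enough for "outer", since "er" has two characters.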