At work, I needed to hyphenate words (so that Tabblo Print Toolkit could do a nicer job of wrapping text into narrow columns). Knuth's TeX system has long had a competent hyphenation algorithm, created by Frank Liang. It's described in Appendix H of the TeXbook, and it's really quite simple. It's driven by a dictionary of word fragments, so all of the hairy special cases are kept out of the code.
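To give a feel for how little code the idea needs, here is a minimal sketch of the pattern-matching step. It is not hyphenate.py itself: the names `parse_pattern`, `hyphenate`, and `PATTERNS` are made up for this illustration, the pattern list is a tiny hand-picked sample rather than a real pattern file, and the rule of keeping at least two letters on each side of a break is an assumption. Each pattern is a word fragment with digits between the letters; every matching fragment votes on each gap in the word, the highest digit wins, and an odd winner means a hyphen is allowed there.

```python
import re

# A tiny illustrative sample of Liang-style patterns (a real pattern
# file has thousands). A "." in a pattern marks a word boundary.
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split a pattern like 'hy3ph' into its letters and its gap scores."""
    letters = re.sub(r"\d", "", pattern)
    # One score per gap: before, between, and after the letters.
    points = [int(d or 0) for d in re.split(r"[.a-z]", pattern)]
    return letters, points


def hyphenate(word, patterns=PATTERNS):
    """Return the word split into pieces at the allowed hyphenation points."""
    work = "." + word.lower() + "."
    # scores[i] is the score of the gap just before work[i].
    scores = [0] * (len(work) + 1)
    for letters, points in map(parse_pattern, patterns):
        start = work.find(letters)
        while start != -1:
            for offset, value in enumerate(points):
                scores[start + offset] = max(scores[start + offset], value)
            start = work.find(letters, start + 1)

    pieces = [""]
    for i, ch in enumerate(word):
        # The gap before word[i] is the gap before work[i + 1].
        # Assumption: keep at least two letters on each side of a break.
        if scores[i + 1] % 2 == 1 and 2 <= i <= len(word) - 2:
            pieces.append("")
        pieces[-1] += ch
    return pieces


print("-".join(hyphenate("hyphenation")))  # hy-phen-ation
```

All of the language knowledge lives in the pattern list, so supporting another language or fixing a bad break means changing data, not code.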

There were already implementations in a number of languages, but none in Python that I could find. So I wrote one: hyphenate.py.

Enjoy.
