Monday 9 July 2007

At work, I needed to hyphenate words (so that Tabblo Print Toolkit could do a nicer job of wrapping text into narrow columns). Knuth’s TeX system has long had a competent hyphenation algorithm, created by Frank Liang. It’s described in Appendix H of the TeXbook, and it’s actually really simple. It’s driven by a dictionary of word fragments (patterns), so all of the hairy special cases are kept in the data and out of the code.

There were already implementations in a number of languages, but none in Python that I could find. So I wrote one: hyphenate.py.
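
To give a feel for how a pattern dictionary drives the algorithm, here is a minimal sketch of the idea. It is not the contents of hyphenate.py. The names `parse_pattern`, `hyphenate` and `PATTERNS` are made up for this illustration, the pattern list is a tiny hand-picked sample rather than the real dictionary (which has thousands of entries), and the rule of keeping at least two letters on each side of a break is an assumption of the sketch. Digits in a pattern score the gap between letters; for each gap the highest score from any matching pattern wins, and an odd score allows a break.

```python
# Toy sample of patterns; a real pattern file is far larger.
SAMPLE = "hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n"


def parse_pattern(pattern):
    """Split 'hy3ph' into ('hyph', [0, 0, 3, 0, 0]).

    points[k] is the score of the gap just before letter k;
    the final entry is the gap after the last letter.
    """
    letters = []
    points = [0]
    for ch in pattern:
        if ch.isdigit():
            points[-1] = int(ch)
        else:
            letters.append(ch)
            points.append(0)
    return "".join(letters), points


PATTERNS = dict(parse_pattern(p) for p in SAMPLE.split())


def hyphenate(word, patterns=PATTERNS):
    """Return the word split into pieces at the allowed break points."""
    # Dots mark the word boundaries so patterns can anchor to them.
    work = "." + word.lower() + "."
    levels = [0] * (len(work) + 1)

    # Try every substring; keep the maximum score seen for each gap.
    for start in range(len(work)):
        for end in range(start + 1, len(work) + 1):
            points = patterns.get(work[start:end])
            if points:
                for offset, value in enumerate(points):
                    pos = start + offset
                    levels[pos] = max(levels[pos], value)

    # The gap before word[j] is levels[j + 1] because of the leading dot.
    # Assumption: never leave fewer than two letters on either side.
    pieces = []
    last = 0
    for j in range(2, len(word) - 1):
        if levels[j + 1] % 2 == 1:  # odd score: a break is allowed here
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return pieces


print("-".join(hyphenate("hyphenation")))  # hy-phen-ation
```

Even scores matter as much as odd ones: a pattern like `he2n` exists only to veto a break that some other pattern might otherwise permit, which is how the exceptions end up in the data.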


