hyphenate

hyphenate.py implements Frank Liang’s hyphenation algorithm (the one used in TeX) in Python.

This module provides a single function, hyphenate_word. It takes a string (the word) and returns a list of the parts that can be separated by hyphens:

>>> hyphenate_word("hyphenation")
['hy', 'phen', 'ation']
>>> hyphenate_word("supercalifragilisticexpialidocious")
['su', 'per', 'cal', 'ifrag', 'ilis', 'tic', 'ex', 'pi', 'ali', 'do', 'cious']
>>> hyphenate_word("project")
['project']
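
Since the result is a plain list of strings, turning it back into displayable text is a one-line join. The sketch below is only an illustration: join_parts and SOFT_HYPHEN are names made up here, not part of hyphenate.py, and the list is copied from the output above so the snippet runs without the module.

```python
# Hypothetical helper, not part of hyphenate.py.
SOFT_HYPHEN = "\u00ad"  # U+00AD, an invisible "may break here" hint


def join_parts(parts, separator="-"):
    """Join the pieces of a hyphenated word with the given separator."""
    return separator.join(parts)


# Copied from the hyphenate_word("hyphenation") output shown above.
parts = ['hy', 'phen', 'ation']

visible = join_parts(parts)              # hyphens you can see
hinted = join_parts(parts, SOFT_HYPHEN)  # break hints for a renderer
print(visible)
```

Joining with a soft hyphen rather than "-" leaves the decision about whether to actually break the word to whatever renders the text.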

This Python code is in the public domain.

The module as provided only hyphenates English words, but if you can find TeX hyphenation patterns for another language (and can deal with the character encoding issues you’ll encounter in them), the same algorithm will work for other languages.
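
As a rough idea of what collecting patterns for another language might involve, here is a minimal sketch. It assumes the patterns sit inside a single \patterns{...} block and that the text has already been decoded to a Python string (for example by opening the file with encoding="utf-8"). It does not handle TeX escapes such as ^^e9, which are one form the encoding issues can take. The function name extract_patterns and the sample text are invented for this example and are not taken from any real pattern file.

```python
import re


def extract_patterns(tex_source):
    """Return the whitespace-separated patterns from a \\patterns{...} block.

    Assumption: one block, no nested braces, no TeX escapes to decode.
    """
    # Drop TeX comments: everything from % to the end of the line.
    without_comments = re.sub(r"%.*", "", tex_source)
    match = re.search(r"\\patterns\s*\{([^}]*)\}", without_comments)
    if match is None:
        return []
    return match.group(1).split()


# Hypothetical sample, made up for illustration only.
sample = r"""
% not a real pattern file
\patterns{
.ab4i   % a trailing comment
a1b
é2c
}
"""

print(extract_patterns(sample))
```

Patterns containing non-ASCII letters, like the last one in the sample, would also need any code that parses them to accept those letters rather than only a to z.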

The Liang algorithm does not find every possible hyphenation point. It aims to find some of them without producing any wrong ones, so the breaks returned by hyphenate.py will be a subset of the full set of valid break points.
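
To see why the algorithm is conservative, it helps to look at how patterns are applied. In TeX pattern notation, a digit between letters is a weight for that gap; every pattern that matches somewhere in the word contributes its weights, the highest weight at each gap wins, and a break is allowed only where the winning weight is odd. The following is a minimal sketch of that idea, not the code in hyphenate.py: the names parse_pattern and liang_hyphenate, the min_edge parameter, and the nine-entry pattern table (hand-picked so that this one word works) are all assumptions made for the example.

```python
import re


def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into ('hyph', [0, 0, 3, 0, 0])."""
    letters = re.sub(r"[0-9]", "", pattern)
    # One weight per gap: before each letter and after the last one.
    weights = [int(d or 0) for d in re.split(r"[.a-z]", pattern)]
    return letters, weights


def liang_hyphenate(word, patterns, min_edge=2):
    """Hyphenate word using patterns; min_edge is an assumed safety margin."""
    table = dict(parse_pattern(p) for p in patterns)
    work = "." + word.lower() + "."    # dots mark the word boundaries
    points = [0] * (len(work) + 1)     # points[i] is the gap before work[i]

    # Try every substring; keep the highest weight seen at each gap.
    for start in range(len(work)):
        for end in range(start + 1, len(work) + 1):
            weights = table.get(work[start:end])
            if weights is None:
                continue
            for offset, weight in enumerate(weights):
                points[start + offset] = max(points[start + offset], weight)

    # Odd weight means "break allowed"; never break within min_edge of an end.
    pieces, last = [], 0
    for i in range(min_edge, len(word) - min_edge + 1):
        if points[i + 1] % 2:          # gap before word[i]
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return pieces


# Tiny illustrative table; a real pattern set has thousands of entries.
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at",
                "1tio", "2io", "o2n"]

print(liang_hyphenate("hyphenation", TOY_PATTERNS))
```

Even weights act as vetoes: in this toy table, "1na" alone would allow a break between the e and the n, but "he2n" overrides it with a higher even weight. A break point that no pattern supports is simply never offered, which is why the output is a subset of the valid breaks.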

Download: hyphenate.py
