Verbose Python regular expressions

Thursday 10 April 2003

Latest Python tidbit: the re module has an option, re.VERBOSE, for writing regular expressions in a more readable format. In this mode, whitespace in the pattern is ignored, so it can be used to lay out the regular expression in a more readable style, and comments can be included with hash marks.

For example, this regular expression:

import re

# Raw string, so the backslashes reach the regex engine unchanged.
logFmt = r'\[[0-9]{8}T[0-9]{6}\.[0-9]{3}Z:[0-9](/[0-9]*)?\][ ]*.*'
logFmtRe = re.compile(logFmt)

becomes:

logFmt = r'''
    \[
    [0-9]{8}T[0-9]{6}\.[0-9]{3}Z          # the date
    :[0-9]                                # the severity
    (/[0-9]*)?                            # a possible facility
    \]
    [ ]*.*                                # the message
'''

logFmtRe = re.compile(logFmt, re.VERBOSE)
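
As a quick sanity check that the two forms really are the same pattern, here is a minimal sketch that runs both against a few log lines. The sample lines and the names compact, verbose, and describe are made up for illustration; they are not from any real log.

```python
import re

# Compact form, written as a raw string.
compact = re.compile(r'\[[0-9]{8}T[0-9]{6}\.[0-9]{3}Z:[0-9](/[0-9]*)?\][ ]*.*')

# Verbose form: same pattern, laid out with whitespace and comments.
verbose = re.compile(r'''
    \[
    [0-9]{8}T[0-9]{6}\.[0-9]{3}Z          # the date
    :[0-9]                                # the severity
    (/[0-9]*)?                            # a possible facility
    \]
    [ ]*.*                                # the message
''', re.VERBOSE)

# Hypothetical log lines, invented for this check.
samples = [
    '[20030410T153000.123Z:3/12] disk almost full',
    '[20030410T153000.123Z:3] no facility here',
    'not a log line',
]

def describe(pattern, line):
    """Return (whole match, facility group), or None if the line doesn't match."""
    m = pattern.match(line)
    return None if m is None else (m.group(0), m.group(1))

# Pair up what each pattern reports for every sample line.
results = [(describe(compact, s), describe(verbose, s)) for s in samples]

for line, (a, b) in zip(samples, results):
    print(line, '->', 'same' if a == b else 'DIFFERENT')
```

Both patterns should report the same match and the same facility group for every line, including the line that doesn't match at all.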

Admittedly, regular expressions are pretty dense no matter what you do, but at least this way you can try to pull them apart a little for future readers of the code (which includes yourself starting tomorrow).
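
One thing to watch for when pulling a pattern apart this way: because verbose mode ignores whitespace and treats a hash mark as the start of a comment, a literal space or hash has to be protected. That is why the pattern above writes the space as [ ]. A small sketch of the rule, using made-up patterns and names:

```python
import re

# Outside a character class, verbose mode drops the space,
# so this pattern is really just "ab".
dropped = re.compile(r'a b', re.VERBOSE)

# Put the space in a character class, or escape it, to keep it literal.
kept = re.compile(r'a[ ]b    # the class keeps the space', re.VERBOSE)
escaped = re.compile(r'a\ b', re.VERBOSE)

# An unescaped hash starts a comment, so escape it to match a literal one.
hash_literal = re.compile(r'issue\ \#[0-9]+    # e.g. issue #42', re.VERBOSE)

checks = {
    'dropped matches "ab"': dropped.fullmatch('ab') is not None,
    'dropped matches "a b"': dropped.fullmatch('a b') is not None,
    'kept matches "a b"': kept.fullmatch('a b') is not None,
    'escaped matches "a b"': escaped.fullmatch('a b') is not None,
    'hash_literal matches "issue #42"': hash_literal.fullmatch('issue #42') is not None,
}

for name, ok in checks.items():
    print(name, ok)
```

Only the second check should come out False: the bare space is gone from the pattern, so "a b" no longer matches.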
