Last weekend I released a version of Coverage.py for Python 3.x. Getting to that point took a while because 3.x was new to me, and, it seems, everyone is still figuring out how to support it.

I experimented with using 2to3 to create my 3.x code from my 2.x code base, and that worked really well; see Coverage.py on Python 3.x for some details. For a while, I developed like this, with 3.x code translated by 2to3 so that I could run the tests under Python 3.1. But then I had to figure out how to package it.

I didn't want to have to create a separate package in PyPI for the 3.x support. I tried for a while to make one source package with two distinct trees of code in it, but I never got setup.py to be comfortable with that. Setup.py is run during kitting, and building, and installation, and the logic to get it to pick the right tree at all times became twisted and confusing.

(As an aside, setuptools has forked to become Distribute, and they've just released their Python 3 support which includes being able to run 2to3 as part of build and install. That may have been a way to go, but I didn't know it at the time.)

Something, I forget what, made me consider having one source tree that ran on both Python 2 and Python 3. When I looked at the changes 2to3 was making, it seemed doable. I adapted my code to a 2-and-3 idiomatic style, and now the source runs on both.

Changes I had to make:

¶   I already had a file called backward.py that defined 2.5 stuff for 2.3. Now I also use it to deal with import differences between 2 and 3. For example:

try:
    from cStringIO import StringIO
except ImportError:
    from io import StringIO

and then in another file:

from backward import StringIO
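
As a rough illustration of why the shim pays off, here is a minimal sketch of calling code that never has to know which StringIO it got. The fallback import is inlined so the snippet stands alone (in the real layout it would live in backward.py), and render_report is a hypothetical helper, not something from Coverage.py:

```python
# Inlined copy of the shim so this sketch is self-contained.
try:
    from cStringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO          # Python 3


def render_report(lines):
    """Join lines into one newline-terminated string via an in-memory file."""
    out = StringIO()
    for line in lines:
        out.write(line + "\n")
    return out.getvalue()


print(render_report(["first", "second"]))
```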

¶   exec changed from a statement to a function. Syntax changes like this are the hardest to deal with because code won't even compile if the syntax is wrong. For the exec issue, I used this (perhaps too) clever conditional code:

# Exec is a statement in Py2, a function in Py3
import sys

if sys.hexversion > 0x03000000:
    def exec_function(source, filename, global_map):
        """A wrapper around exec()."""
        exec(compile(source, filename, "exec"), global_map)
else:
    # OK, this is pretty gross.  In Py2, exec was a statement, but that will
    # be a syntax error if we try to put it in a Py3 file, even if it isn't
    # executed.  So hide it inside an evaluated string literal instead.
    eval(compile("""\
def exec_function(source, filename, global_map):
    exec compile(source, filename, "exec") in global_map
""",
    "<exec_function>", "exec"
    ))
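
To show the wrapper in use, here is a self-contained sketch that repeats the definition (checking sys.version_info instead of sys.hexversion, which is just a stylistic assumption) and then runs a snippet in a fresh namespace. The names namespace and answer are made up for the demonstration:

```python
import sys

if sys.version_info >= (3, 0):
    def exec_function(source, filename, global_map):
        """Run source in global_map using the Py3 exec() function."""
        exec(compile(source, filename, "exec"), global_map)
else:
    # The Py2 exec statement lives in a string so Py3 never has to parse it.
    eval(compile(
        "def exec_function(source, filename, global_map):\n"
        "    exec compile(source, filename, 'exec') in global_map\n",
        "<exec_function>", "exec"
    ))

# Hypothetical usage: execute a snippet, then look at what it defined.
namespace = {}
exec_function("answer = 6 * 7", "<demo>", namespace)
print(namespace["answer"])
```

The filename argument only matters for tracebacks, so any descriptive placeholder will do.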

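¶   All print statements have to adopt an ambiguous print(s) syntax that both versions accept: Python 2 reads it as the print statement applied to a parenthesized expression, Python 3 as a function call. The thing to be printed has to be a single string, so some comma-separated lists turned into formatted strings.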
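
A minimal sketch of that conversion, assuming a hypothetical summary_line helper: the whole line is formatted first, so print receives exactly one string.

```python
def summary_line(name, statements, missing):
    """Build the complete output line as a single string."""
    return "%s: %d statements, %d missing" % (name, statements, missing)


# In Py2 this was: print name, statements, missing
# One parenthesized string prints identically in Py2 and Py3.
print(summary_line("backward.py", 20, 3))
```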

¶   2to3 is obsessive about converting any d.keys() use into list(d.keys()), since in Python 3 keys() returns a dictionary view object rather than a list. If the dict isn't being modified, you can just loop over it without the list(), but in a few places, I really was returning a list, so I included the list() call.
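
The two cases might look like this. It is only a sketch: line_counts and measured_files are hypothetical names, not taken from Coverage.py.

```python
line_counts = {"a.py": 10, "b.py": 4}

# Read-only loop: iterating the dict yields its keys in both versions,
# so no list() is needed.
total = 0
for filename in line_counts:
    total += line_counts[filename]


def measured_files(counts):
    """Return a real list of keys that callers may sort or index."""
    return list(counts.keys())


files = measured_files(line_counts)
files.sort()
print("%d lines in %s" % (total, files))
```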

¶   A few 2to3 changes are fine to run on both, so these:

d.has_key(k)
d.itervalues()
callable(o)
xrange(limit)

became:

k in d
d.values()
hasattr(o, '__call__')
range(limit)
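
Here are the portable forms exercised together, as a small sketch with made-up data. Two caveats worth keeping in mind: under Python 2, d.values() and range() build full lists, and Python 3.2 later restored callable(), so the hasattr form is only needed for 3.0 and 3.1.

```python
d = {"a": 1, "b": 2}

has_a = "a" in d                          # instead of d.has_key("a")
values = sorted(d.values())               # instead of d.itervalues()
is_callable = hasattr(len, "__call__")    # instead of callable(len)
squares = [n * n for n in range(4)]       # instead of xrange(4)

print("%s %s %s %s" % (has_a, values, is_callable, squares))
```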

¶   Exception handling has changed when you want to get a reference to the exception. This is one of those syntax differences, and it's structural, so a tricky function definition isn't going to bridge the gap.

Where Python 2 had this:

try:
    # .. blah blah ..
except SomeErrorClass, err:
    # use err

now Python 3 wants:

try:
    # .. blah blah ..
except SomeErrorClass as err:
    # use err

The only way to make both versions of Python happy is to use the more cumbersome:

try:
    # .. blah blah ..
except SomeErrorClass:
    _, err, _ = sys.exc_info()
    # use err

This is uglier, but there were only a few places I needed it, so it's not too bad.
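
A runnable sketch of the pattern, using a hypothetical parse_count function and ValueError in place of SomeErrorClass:

```python
import sys


def parse_count(text):
    """Return int(text), or an error message if text is not a number."""
    try:
        return int(text)
    except ValueError:
        # Fetch the active exception without "except ... as err" or
        # "except ..., err", neither of which parses in both versions.
        _, err, _ = sys.exc_info()
        return "bad count: %s" % err


print(parse_count("12"))
print(parse_count("twelve"))
```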

¶   Simple imports are relative or absolute in Python 2, but only absolute in Python 3. The new explicit relative import syntax won't compile in the older Python 2 versions I support (it only arrived in 2.5), so I can't use it. I was only using relative imports in my test modules, so I used this hack to make them work:

import os
import sys

sys.path.insert(0, os.path.split(__file__)[0]) # Put this file's directory on the import path
from myotherfile import MyClass

By explicitly adding this file's directory to the path, I let Python 3's absolute-only importer find the file alongside this one.
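
To see the mechanism in isolation, the following sketch builds a throwaway directory containing a sibling module and imports it the same way. Everything here is hypothetical scaffolding (the temporary directory, the myotherfile_demo module, the import_from_directory helper); real test modules would simply use their own directory as shown above.

```python
import importlib
import os
import sys
import tempfile


def import_from_directory(directory, module_name):
    """Put directory at the front of sys.path, then import module_name."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return importlib.import_module(module_name)


# Create a stand-in for "the file alongside this one".
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "myotherfile_demo.py"), "w") as f:
    f.write("class MyClass(object):\n    greeting = 'hello'\n")

module = import_from_directory(demo_dir, "myotherfile_demo")
print(module.MyClass.greeting)
```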

¶   One area that still tangles me up is str/unicode and bytes/str. Python 3 is making a good change here, but it feels like we're still in transition. The docs aren't always clear on what will be returned, and trying to get the same code to do the right thing under both versions still seems to require experiments with decode and encode.
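
One way to keep those experiments in a single place is a pair of small conversion helpers. This is only a sketch under stated assumptions: to_bytes, to_text, and text_type are hypothetical names, UTF-8 is an arbitrary default, and the bytes name and b"" literals need Python 2.6 or later.

```python
import sys

if sys.version_info >= (3, 0):
    text_type = str
else:
    text_type = unicode  # Py2-only name; this branch never runs on Py3


def to_bytes(value, encoding="utf-8"):
    """Encode text to bytes; pass bytes through unchanged."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


def to_text(value, encoding="utf-8"):
    """Decode bytes to text; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


print(to_text(to_bytes("coverage")))
```

Note that on Python 2, bytes is just another name for str, so to_text will also decode plain str values there.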

After making all of these changes, I had a single code base that ran on both Python versions, without too much strangeness. It's way better than having to maintain two packages at PyPI, or trying to trick setup.py into installing different code on different versions.

Others have written about the same challenge:


Comments

andrew 9:35 AM on 3 Oct 2009

I don't even know py (even though I am dying to find a reason to learn it), but I love reading this type of code dorkage.

ulrik 9:50 AM on 3 Oct 2009

The exec example makes it clear that there's always a solution, no matter how dirty. I for sure hope that this dirtiness won't be prevalent in all future python code. I have experimented only a little with such cross-platformness myself, for a very simple module (see link in my name in signature).

Ned Batchelder 10:07 AM on 3 Oct 2009

The good news about the dirtiness is that it's mostly encapsulated in backward.py.

Robert Brewer 10:36 AM on 3 Oct 2009

Instead of the code-in-a-string approach, you might consider deploying separate modules for Py2 and Py3, and only importing the one appropriate for the current platform inside your 'backward' module. That approach worked fairly well for CherryPy back when we wanted to test decorator syntax, but not require it.

Ned Batchelder 11:08 AM on 3 Oct 2009

@Robert: you are right, that's a technique I hadn't considered. In the case of backward.py, it started with a few 2.3-to-2.4 and 2.3-to-2.5 compatibility imports, so I was used to using it for everything. If the exec hack grows to encompass more ugliness, I'll think about separate modules.

Andre 3:50 AM on 5 Oct 2009

@Ned: I would also do the following:

if 'xrange' not in dir(__builtins__):
    xrange = range

This way you can use xrange on 2 and 3 and stop creating a bunch of lists. A similar solution might exist for itervalues.

Ned Batchelder 5:30 AM on 5 Oct 2009

@Andre, that's a good idea. The place where I replaced xrange with range I knew was going to be a very small list, so I just used range, but avoiding a list creation is good even for small lists.

Andre 9:28 AM on 7 Oct 2009

Even better, Fabio shows a way to find out if we have a builtin without creating a key list with dir(), a solution that, in my opinion, is more pythonic:

try:
    xrange = xrange
except NameError:
    xrange = range

Eli 1:09 AM on 19 May 2010

Ned, in the exec_function code sample you're missing two closing parens at the end.

Thanks for the great resource

Ned Batchelder 7:30 AM on 19 May 2010

Thanks, Eli, I've fixed it.

Paul McGuire 10:30 AM on 21 Jun 2010

I *never* liked xrange, it was just such an obvious hacked up form of range to provide what range should have done in the first place. Since Py3 finally cleans this up, why perpetuate xrange's ugly legacy? In my Py2-3 cross code, I define range = xrange in the Py2 code, and just use range throughout. Good riddance, xrange!
