Trace function debugging challenge

Tuesday 2 March 2010 — This is more than 15 years old. Be careful.

One of the more challenging aspects of maintaining a tool like coverage.py is that people use it on complex code bases, and something goes wrong, they report it to me, and I have to dig in and figure out why.

The latest head-scratcher was reported by Christophe Zwerschke as issue 51 on the coverage.py bitbucket tracker. After a few turns to shake out what the issue was, it came down to this: Christophe’s code gets the right answer when run without coverage.py, or when run one way with coverage.py, but computes a different answer when run a second way with coverage.py. This is a real mystery, because it is the first report of code actually behaving differently because coverage.py is measuring it.

This has all the markings of a bug in Python, but it could be my fault. In any case, we have to find out. I won’t be able to get to it for a bit, and why should I have all the fun anyway? So I’m crowd-sourcing it here: maybe a reader will have an insight into what the heck is going on here.

Here is the file, bug51.py (slightly simplified from the ticket):

class Foo:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Foo %r>" % self.name


class MetaFooList(type):
    def __new__(mcs, cls_name, bases, cls_dict):
        declared = []
        for base in bases:
            declared.extend(getattr(base, 'declared', []))
        for name, value in cls_dict.items():
            if isinstance(value, Foo):
                declared.append(value)
        declared.sort(key=lambda w: w.name)
        cls = type.__new__(mcs, cls_name, bases, cls_dict)
        cls.declared = declared
        return cls


class FooList(list):
    __metaclass__ = MetaFooList

    def __init__(self):
        super(FooList, self).__init__(self.declared)


def test_foolist():
    w = Foo(name="foo")

    class W(FooList):
        foo = w

    w2 = Foo(name="bar")

    class W2(W):
        bar = w2

    foolist = W2()
    print "foolist has %d entries" % len(foolist)

if __name__ == '__main__':
    test_foolist()

Here are the three runs:

$ python bug51.py
foolist has 2 entries

$ coverage run bug51.py
foolist has 2 entries

$ coverage run --timid bug51.py
foolist has 4 entries

An explanation about that last run: the --timid switch forces coverage.py to use a trace function written in Python rather than its fancier one written in C. Ironically, I added the switch as a way to use a gentler tracing mechanism that wouldn’t interfere so much with other packages using a trace function, to prevent bizarre problems. But now, it seems to be the source of a real problem itself.

Trace functions can do strange things that can affect the running program, but the ones in coverage.py don’t, or at least they aren’t supposed to. This test code has plenty of twisty turns, but still, how can it get different answers from two implementations of the same trace function?

Anyone up for the challenge of figuring out what’s going on here? If not, I’ll get to it eventually, and report back.

Comments

kumasama 9:41 PM on 2 Mar 2010

Sorry, but:

U:\>python --version
Python 2.6.2

U:\>coverage --version
Coverage.py, version 3.3. http://nedbatchelder.com/code/coverage

U:\>coverage run bug51.py
foolist has 2 entries

U:\>coverage run --timid bug51.py
foolist has 2 entries

U:\>

jens persson 12:07 AM on 3 Mar 2010

As kumasama said: Sorry :-)

$ python --version
Python 2.6.4
$ coverage --version
Coverage.py, version 3.2. http://nedbatchelder.com/code/coverage
$ coverage run bug51.py
foolist has 2 entries
$ coverage run --timid bug51.py
foolist has 2 entries

This is running om Ubuntu 9.10 in a virtualenv.

Ned Batchelder 7:36 AM on 3 Mar 2010

Hey, no reason to be sorry! That's a good clue. In fact, on Python 2.4, I get:

$ python bug51.py
foolist has 2 entries
$ coverage run bug51.py
foolist has 4 entries
$ coverage run --timid bug51.py
foolist has 4 entries

So on 2.4, coverage always gets the wrong answer, on 2.5, only timid gets the wrong answer, and on 2.6, it always gets the right answer.

This points more and more to a Python issue...

Paul Ollis 8:32 AM on 3 Mar 2010

It is a Python issue. Add this code:

    def tracer(*args):
        return None
    import sys
    sys.settrace(tracer)

Then you get this:

    $ python2.5 ./bug51.py 
    foolist has 4 entries

With this trivial trace function in place then if you look at cls_dict being passed to __new__ then you will see that it contains the variable 'w', from test_foolist.

John Eikenberry 9:25 AM on 3 Mar 2010

I ran into pretty much this same bug a while back. Here is my simple test case;

def test():                   
    class Foo(object):
        pass
    class Bar(object):
        def foo(self):
            return Foo()

    print 'This should be False:',
    print Bar.__dict__.has_key('Foo')


if __name__ == '__main__':
    import sys
    sys.settrace(lambda f,w,j:None)
    test()
    sys.settrace(None)
    test()

Run using python2.4 or python2.5 this produces the wrong result where python2.6 has the correct result. I had written it up as a bug that is already fixed in 2.6.

[~]python2.5 settrace-sideeffect.py
This should be False: True
This should be False: False
[~]python2.6 settrace-sideeffect.py
This should be False: False
This should be False: False

Christoph Zwerschke 3:44 PM on 3 Mar 2010

Thanks for the quick replies. So Ned's hunch of a Python bug turned out to be true. John is right, I found the bug report (issue1569356). It's already fixed in 2.6, but will not (cannot) be fixed in 2.4 and 2.5.

Since this kind of code is often used in unit tests, I was thinking about a workaround. The following seems to work, any better ideas?

def test():
    
    class Test:
        class Foo(object):
            pass
        class Bar(object):
            def foo(self):
                return Foo()

    print 'This should be False:',
    print Test.Bar.__dict__.has_key('Foo')

This quirk should also be mentioned somewhere in the coverage documentation or FAQ as a known issue.

Trace function debugging challenge

Comments

Add a comment: