Slim comparisons

Thursday 26 January 2012 — This is over 13 years old. Be careful.

Hanging out in the #python IRC channel today, I learned something new about Python comparisons. It isn’t so much a new detail of the language, as a way to make use of a detail, a clever technique that I hadn’t seen before.

When defining a class, it’s often useful to define an equality comparison so that instances of your class can be considered equal. For example, in an object with three attributes, the typical way to define __eq__ is like this:

class Thing(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
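        self.c = c

    def __repr__(self):
        # Short repr so the trace output below is readable.
        return "<Thing %d>" % id(self)

    def __eq__(self, other):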
        print "Comparing %r and %r" % (self, other)
        return (
            self.a == other.a and
            self.b == other.b and
            self.c == other.c
            )

When run, it shows what happens:

>>> x = Thing(1, 2, 3)
>>> y = Thing(1, 2, 3)
>>> print x == y
Comparing <Thing 37088896> and <Thing 37088952>
True

Here the __eq__ method compares the three attributes directly on the self and other objects and returns a boolean: a simple, direct comparison.

But on IRC, a different technique was proposed:

class Thing(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
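        self.c = c

    def __repr__(self):
        # Short repr so the trace output below is readable.
        return "<Thing %d>" % id(self)

    def __eq__(self, other):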
        print "Comparing %r and %r" % (self, other)
        return (self.a, self.b, self.c) == other

Now when we run it, something unusual happens:

>>> x = Thing(1, 2, 3)
>>> y = Thing(1, 2, 3)
>>> print x == y
Comparing <Thing 37219968> and <Thing 37220024>
Comparing <Thing 37220024> and (1, 2, 3)
True

Our __eq__ is being called twice! The first time, it’s called with two Thing objects, and it tries to compare the tuple (1, 2, 3) to other, which is y, a Thing. Tuples don’t know how to compare themselves to Things, so the tuple’s __eq__ returns NotImplemented. The == operator handles that case and, relying on the commutative nature of ==, tries swapping the two arguments. That means comparing y to (1, 2, 3), which calls our __eq__ again. This time it compares (1, 2, 3) to (1, 2, 3), which succeeds, producing the final True result.
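The double call can be traced without print statements. This is a minimal sketch in Python 3 syntax, not the code from the IRC discussion; the calls list is a hypothetical addition used only to record which operands each __eq__ invocation saw.

```python
class Thing:
    """Python 3 port of the shorthand version, with a call log added."""
    calls = []  # hypothetical log of (type of self, type of other) pairs

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __eq__(self, other):
        Thing.calls.append((type(self).__name__, type(other).__name__))
        # tuple.__eq__ returns NotImplemented for a Thing, so Python
        # retries with the reflected call other.__eq__(the_tuple).
        return (self.a, self.b, self.c) == other


x = Thing(1, 2, 3)
y = Thing(1, 2, 3)
result = x == y

print(Thing.calls)           # [('Thing', 'Thing'), ('Thing', 'tuple')]
print(result)                # True
print((1, 2, 3).__eq__(y))   # NotImplemented
```

The second log entry is the swapped call: y is now self, and the tuple built from x is other.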

This is an interesting technique, but I’m not sure I like it. For one thing, the code doesn’t read clearly. It’s comparing a tuple to an object, which isn’t supported. It only makes sense when you keep in mind the argument-swapping dance.

For another, it makes operations work that maybe shouldn’t:

x == (1, 2, 3)
(1, 2, 3) == x

I don’t know that I want these comparisons to succeed. It exposes internals that should be hidden. Of course, why would a caller who didn’t know the internals try a comparison like this? But things like this have a way of creeping out to bite you.
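To make the concern concrete, here is a small sketch in Python 3 syntax. ShortThing is a hypothetical name for the shorthand version, chosen only to keep it apart from the explicit one.

```python
class ShortThing:
    """Hypothetical name for the shorthand version, in Python 3 syntax."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def __eq__(self, other):
        return (self.a, self.b, self.c) == other


x = ShortThing(1, 2, 3)

print(x == (1, 2, 3))   # True: our __eq__ compares tuple to tuple
print((1, 2, 3) == x)   # True: the tuple declines, then our __eq__ runs
print(x == (1, 2, 4))   # False
```

With the explicit attribute-by-attribute version, both of the first two comparisons would instead raise AttributeError, because a tuple has no attribute named a.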

I’m glad to have a better understanding of the workings of comparisons, but I’m not sure I’ll write them like this.

Comments

Gruszczy 1:51 AM on 27 Jan 2012

It's interesting to know that this is how comparisons work in Python. But I prefer the former version. The latter is not very clear, and I doubt many people would know what happens behind the scenes well enough to understand it quickly.

Nick Coghlan 2:11 AM on 27 Jan 2012

One small clarification: the "NotImplemented" singleton is not an exception. Instead, it gets *returned* from the __eq__ call. (NotImplementedError *is* an exception, but it plays no part in comparisons, or any other binary operations)

That doesn't change your overall point, though: the shorthand version is a bad idea because it doesn't express the intent clearly and allows comparisons that should trigger an exception.

To compare a group of attributes, it *can* be convenient to write it like this, though:

attrs = "a b c".split()
return all(getattr(self, x) == getattr(other, x) for x in attrs)
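Fleshed out into a complete class, that fragment might look like the sketch below. It uses Python 3 syntax, and the class-level _attrs name is an assumption for illustration, not part of the original comment.

```python
class Thing:
    # Hypothetical class-level list of the attributes that take part in ==.
    _attrs = "a b c".split()

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __eq__(self, other):
        # True only if every listed attribute matches on both objects.
        return all(getattr(self, x) == getattr(other, x) for x in self._attrs)


print(Thing(1, 2, 3) == Thing(1, 2, 3))   # True
print(Thing(1, 2, 3) == Thing(1, 2, 4))   # False
```

Comparing a Thing to a plain tuple with this version raises AttributeError, since getattr finds no attribute a on the tuple.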

Michał Kwiatkowski 3:59 AM on 27 Jan 2012

I agree, it goes against Python's "prefer explicit over implicit" principle. And if you want your objects to be only equal to themselves, why make them also equal to tuples? It'd be a bug waiting to happen.

Ned Batchelder 7:02 AM on 27 Jan 2012

@Nick, thanks for the exception/singleton clarification, I've fixed the text. And the shorthand for comparing a number of attributes is very nice.

Ed Davies 7:03 AM on 27 Jan 2012

If you're not happy for a Thing to be equal to a three element tuple why are you happy for it to be equal to some arbitrary object which happens to have 'a', 'b' and 'c' attributes (and maybe 'd' and 'e' attributes which are not looked at)? It's difficult to know where to draw the line with this duck-typing stuff.

I think that if other is not a Thing then __eq__ should return False immediately. The interesting (i.e., difficult) case is when other is an instance of a class derived from Thing, but that would be a bit of a digression.
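A minimal sketch of that suggestion, in Python 3 syntax. SubThing is a hypothetical subclass added only to show the derived-class case, and treating it as equal to a plain Thing is one possible choice, not a settled answer.

```python
class Thing:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def __eq__(self, other):
        # Anything that is not a Thing (or a subclass) is never equal.
        if not isinstance(other, Thing):
            return False
        return (self.a, self.b, self.c) == (other.a, other.b, other.c)


class SubThing(Thing):
    """Hypothetical subclass, to show the derived-class case."""


print(Thing(1, 2, 3) == Thing(1, 2, 3))      # True
print(Thing(1, 2, 3) == (1, 2, 3))           # False
print((1, 2, 3) == Thing(1, 2, 3))           # False
print(Thing(1, 2, 3) == SubThing(1, 2, 3))   # True with isinstance
```

Returning NotImplemented instead of False in the isinstance branch would be the variant that lets the other operand's __eq__ have a say before Python falls back to an identity check.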

Ned Batchelder 7:51 AM on 27 Jan 2012

@Ed: you are right, this is one area where even Python developers are forced to confront is-instance questions. What should Thing.__eq__ insist about "other"? I left out that whole topic to focus on the comparison check itself, but it is also thorny.

Nick Coghlan 5:39 PM on 29 Jan 2012

The "What counts as quacking like a duck?" question is actually the problem Abstract Base Classes are designed to help with - because they support explicit registration, you can use them in isinstance() checks without excessively constraining the types you allow, or inadvertently accepting things you don't want.

Of course, ABCs themselves can be duck-typed according to appropriate protocols (e.g. "isinstance(obj, collections.Hashable)" will accept anything with a __hash__ method, whether it is explicitly registered or not).
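Both behaviours can be seen in a short sketch. It is written for Python 3, where Hashable is imported from collections.abc rather than collections; Quacker and Duck are hypothetical names used only for this illustration.

```python
from abc import ABC
from collections.abc import Hashable


class Quacker(ABC):
    """Hypothetical ABC used only for this illustration."""


class Duck:
    """Unrelated class that opts in through explicit registration."""


Quacker.register(Duck)

print(isinstance(Duck(), Quacker))     # True: Duck was registered
print(isinstance(object(), Quacker))   # False: never registered
print(isinstance(42, Hashable))        # True: int defines __hash__
print(isinstance([], Hashable))        # False: list sets __hash__ to None
```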

Max Moroz 3:38 PM on 11 Mar 2012

@Ed, @Nick: When the two arguments are from different classes (even class/subclass, or registered with the same ABC), I don't see how it would be safe to allow equality to return True. In fact, I would even raise an exception, rather than return False. I don't know why Python's built-in classes just return False - isn't it dangerous?

After all, even (1, 2, 3) != [1,2,3]. So equality must imply really similar behavior, not just similar contents. And clearly, two different classes do not, normally, behave the same. Furthermore, if you allow comparison between different classes, how do you ensure the commutative property (even with subclasses of the same class)?

What are the use cases of equality between different classes?

As to duck typing, I never fully bought into the idea, since name collisions (like the ones in your example) are so unpredictable and so hard to debug. And even with duck typing, shouldn't it be for the methods only, not attributes?
