Sunday 19 October 2008 — This is 16 years old. Be careful.
I was refactoring some code today, and couldn’t figure out why my simple textbook change was making code break, until I relearned a subtlety of Python.
The code looked like this:
def __eq__(self, other):
# Compare the parts, but be clever when comparing the specs.
if self.__class__ != other.__class__:
return False
return self.spec.make_div_only() == other.spec.make_div_only() and self.chunks == other.chunks
I didn’t like that long last line, so I made the simple change to this:
def __eq__(self, other):
# Compare the parts, but be clever when comparing the specs.
if self.__class__ != other.__class__:
return False
if self.spec.make_div_only() != other.spec.make_div_only():
return False
return self.chunks == other.chunks
and ran my unit tests, and they failed. What!? How could doing a simple
boolean refactoring cause breakage? I scratched my head, re-read the lines,
questioned my understanding of the short-circuit nature of the and
operator, and so on. The usual “am I going crazy?” debugging techniques.
Undoing the refactor made the tests work again, I changed it again to the
shorter lines, and the tests failed again. Adding some print statements to see
the actual values being compared, I realized that the result of make_div_only
is an object (of class Spec), and that object defines its own Spec.__eq__
method
to define the meaning of the ==
operator for its instances.
Then it hit me: the class doesn’t define a __ne__
method.
My refactoring changed the operator from == to !=, the first was overridden to
provide custom semantics, but the second was not, so simple object inequality was
being checked, so make_div_only
inequality test was always
true, and the method incorrectly always returned False.
The Python docs are clear on this point:
There are no implied relationships among the comparison operators. The truth of x==y does not imply that x!=y is false. Accordingly, when defining __eq__(), one should also define __ne__() so that the operators will behave as expected.
Adding a __ne__
method to my Spec class made everything
work properly:
def __ne__(self, other):
return not self.__eq__(other)
Comments
Of course, someone who really liked to override __ne__ to mean something different would have no problem doing it, apart from the users of his class coming after him with torches and pitchforks.
So where is the upside of not defaulting __ne__ to the right thing?
You might want your class's comparison definition to work nicely with subclasses or unrelated classes which can compare themselves to it. If so, take a look at http://jcalderone.livejournal.com/32837.html.
I am pretty sure that one can in fact default things so that the functions are chained correctly (the default for __eq__ could look inside the class if the __ne__ function exists, and if so, return the inverse and __ne__ vice versa).
But even if the implementor did not want to go this extra mile (for whatever reason), he could assume that someone writing a class would implement equality first, and default __ne__ accordingly, thus removing a potential source for errors with little effort.
Same thing goes for __lt__ and the rest. Normally, people may implement __lt__ (and maybe __gt__), but I'd guess that __le__ is very seldom implemented, so it could default to not __gt__ (if it exists) or __lt__ or __eq__.
At least, that's how I read http://www.python.org/doc/2.5.2/ref/customization.html
But like I said, I don't "do" python so there's probably something I've overlooked.
"These are the so-called “rich comparison” methods, and are called for comparison operators in preference to __cmp__() below"
I've been using a class decorator based around this total_ordering decorator to synthesize rich comparison methods given just one.
Terseness is obvious, readability might vary between individuals, but the main fact is that conditional branches ("if statements") are classic points where mistakes happen, because you have to do one higher level of reasoning about where the program counter is - and classically people make mistakes there. (The problem you exhibit here wouldn't have been fixed this way but other less subtle ones would. The typical result is a return statement that you can't in practice reach...)
But this is a tiny detail :-D not worth fussing about in the large scale of things.
http://docs.python.org/dev/3.0/whatsnew/3.0.html#other-language-changes
Add a comment: