Why your mock doesn’t work

Friday 2 August 2019

Mocking is a powerful technique for isolating tests from undesired interactions among components. But often people find their mock isn’t taking effect, and it’s not clear why. Hopefully this explanation will clear things up.

BTW: it’s really easy to over-use mocking. These are good explanations of alternative approaches:

A quick aside about assignment

Before we get to fancy stuff like mocks, I want to review a little bit about Python assignment. You may already know this, but bear with me. Everything that follows is going to be directly related to this simple example.

Variables in Python are names that refer to values. If we assign a second name, the names don’t refer to each other, they both refer to the same value. If one of the names is then assigned again, the other name isn’t affected:

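[Figure: three diagrams of names and values. After x = 23, the name x refers to 23. After y = x, both x and y refer to 23. After x = 12, x refers to 12 while y still refers to 23.]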

If this is unfamiliar to you, or you just want to look at more pictures like this, Python Names and Values goes into much more depth about the semantics of Python assignment.
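If you'd rather poke at it than look at pictures, here is a minimal sketch of the assignments described above. Nothing is assumed beyond the names x and y from the example; same_before is just an illustrative helper name.

```python
x = 23
y = x                  # y now refers to the same value as x
same_before = x is y   # both names point at one object

x = 12                 # rebinding x leaves y alone
print(x, y)            # 12 23
```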

Importing

Let’s say we have a simple module like this:

# mod.py

val = "original"

def update_val():
    global val
    val = "updated"

We want to use val from this module, and also call update_val to change val. There are two ways we could try to do it. At first glance, it seems like they would do the same thing.

The first version imports the names we want, and uses them:

# code1.py

from mod import val, update_val

print(val)
update_val()
print(val)

The second version imports the module, and uses the names as attributes on the module object:

# code2.py

import mod

print(mod.val)
mod.update_val()
print(mod.val)

This seems like a subtle distinction, almost a stylistic choice. But code1.py prints “original original”: the value hasn’t changed! Meanwhile, code2.py does what we expected: it prints “original updated.” Why the difference?

Let’s look at code1.py more closely:

# code1.py

from mod import val, update_val

print(val)
update_val()
print(val)

After “from mod import val”, when we first print val, we have this:

[Figure: mod.py’s name val and code1.py’s name val both refer to ‘original’.]

“from mod import val” means, import mod, and then do the assignment “val = mod.val”. This makes our name val refer to the same object as mod’s name val.

After “update_val()”, when we print val again, our world looks like this:

[Figure: mod.py’s val now refers to ‘updated’; code1.py’s val still refers to ‘original’.]

update_val has reassigned mod’s val, but that has no effect on our val. This is the same behavior as our x and y example, but with imports instead of more obvious assignments. In code1.py, “from mod import val” is an assignment from mod.val to val, and works exactly like “y = x” does. Later assignments to mod.val don’t affect our val, just as later assignments to x don’t affect y.

Now let’s look at code2.py again:

# code2.py

import mod

print(mod.val)
mod.update_val()
print(mod.val)

The “import mod” statement means, make my name mod refer to the entire mod module. Accessing mod.val will reach into the mod module, find its val name, and use its value.

[Figure: code2.py’s name mod refers to the mod.py module, whose name val refers to ‘original’.]

Then after “update_val()”, mod’s name val has been changed:

[Figure: mod.py’s val now refers to ‘updated’; code2.py’s name mod still refers to the module.]

Now we print mod.val again, and see its updated value, just as we expected.
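The two behaviors can be compared side by side in one file. This is only a sketch: instead of a real mod.py on disk, it builds a stand-in module in memory with types.ModuleType, and the module name demo_mod is made up for the illustration.

```python
import sys
import types

# Stand-in for mod.py, built in memory so this sketch is a single file.
# The module name "demo_mod" is hypothetical.
MOD_SOURCE = '''
val = "original"

def update_val():
    global val
    val = "updated"
'''
demo_mod = types.ModuleType("demo_mod")
exec(MOD_SOURCE, demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

# code1.py style: copy the references into our own names.
from demo_mod import val, update_val

# code2.py style would be "import demo_mod"; we already hold the module object.
update_val()

print(val)            # original: our name still refers to the old string
print(demo_mod.val)   # updated: attribute lookup sees the rebinding
```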

OK, but what about mocks?

Mocking is a fancy kind of assignment: replace an object (or function) with a different one. We’ll use the mock.patch function in a with statement. It makes a mock object, assigns it to the name given, and then restores the original value at the end of the with statement.
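To see the assign-then-restore behavior directly, here is a small sketch. It assumes the standard library's unittest.mock as the source of the mock name (the separate mock package behaves the same way for this purpose); the variable names are only illustrative.

```python
import os
from unittest import mock

real_listdir = os.listdir

with mock.patch("os.listdir") as fake_listdir:
    # Inside the block, the name os.listdir has been reassigned to the mock.
    patched_inside = os.listdir is fake_listdir

# After the block, the original value is back.
restored_after = os.listdir is real_listdir

print(patched_inside)   # True
print(restored_after)   # True
```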

Let’s consider this (very roughly sketched) product code and test:

# product.py

from os import listdir

def my_function():
    files = listdir(some_directory)
    # ... use the file names ...

# test.py

def test_it():
    with mock.patch("os.listdir") as listdir:
        listdir.return_value = ['a.txt', 'b.txt', 'c.txt']
        my_function()

After we’ve imported product.py, both the os module and product.py have a name “listdir” which refers to the built-in listdir() function. The references look like this:

[Figure: the os module’s name listdir and product.py’s name listdir both refer to the listdir() function.]

The mock.patch in our test is really just a fancy assignment to the name “os.listdir”. During the test, the references look like this:

[Figure: the os module’s listdir now refers to the mock; product.py’s listdir still refers to the real listdir() function.]

You can see why the mock doesn’t work: we’re mocking something, but it’s not the thing our product code is going to call. This situation is exactly analogous to our code1.py example from earlier.
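Here is a rough way to watch the mock miss its target. As a sketch, it builds a stand-in for product.py in memory (the name product_demo is hypothetical), lists the current directory in place of some_directory, and uses unittest.mock for the mock name.

```python
import os
import sys
import types
from unittest import mock

# In-memory stand-in for product.py; "product_demo" is a hypothetical name.
PRODUCT_SOURCE = '''
from os import listdir

def my_function():
    return listdir(".")
'''
product_demo = types.ModuleType("product_demo")
exec(PRODUCT_SOURCE, product_demo.__dict__)
sys.modules["product_demo"] = product_demo

with mock.patch("os.listdir") as fake_listdir:
    fake_listdir.return_value = ["a.txt", "b.txt", "c.txt"]
    product_demo.my_function()
    os_name_is_mock = os.listdir is fake_listdir
    product_name_is_mock = product_demo.listdir is fake_listdir
    mock_was_called = fake_listdir.called

print(os_name_is_mock)       # True: the os module's name was reassigned
print(product_name_is_mock)  # False: the product module's name was not
print(mock_was_called)       # False: my_function used the real listdir
```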

You might be thinking, “ok, so let’s do that code2.py thing to make it work!” If we do, it will work. Your product code and test will now look like this (the test code is unchanged):

# product.py

import os

def my_function():
    files = os.listdir(some_directory)
    # ... use the file names ...

# test.py

def test_it():
    with mock.patch("os.listdir") as listdir:
        listdir.return_value = ['a.txt', 'b.txt', 'c.txt']
        my_function()

When the test is run, the references look like this:

[Figure: product.py’s name os refers to the os module, whose listdir now refers to the mock instead of listdir().]

Because the product code refers to the os module, changing the name in the module is enough to affect the product code.

But there’s still a problem: this will mock that function for any module using it. This might be a more widespread effect than you intended. Perhaps your product code also calls some helpers, which also need to list files. The helpers might end up using your mock (depending how they imported os.listdir!), which isn’t what you wanted.
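A sketch of that wider effect, assuming two made-up in-memory modules (product_demo and helper_demo, standing in for product.py and one of its helpers) that both use "import os". The make_module function is a convenience invented for this example.

```python
import sys
import types
from unittest import mock

def make_module(name, source):
    """Build a module in memory and register it so imports can find it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

# Hypothetical stand-ins for product.py and a helper it relies on.
product_demo = make_module("product_demo", '''
import os

def my_function():
    return os.listdir(".")
''')
helper_demo = make_module("helper_demo", '''
import os

def count_files():
    return len(os.listdir("."))
''')

with mock.patch("os.listdir") as fake_listdir:
    fake_listdir.return_value = ["a.txt", "b.txt", "c.txt"]
    product_result = product_demo.my_function()
    helper_result = helper_demo.count_files()

print(product_result)  # ['a.txt', 'b.txt', 'c.txt']
print(helper_result)   # 3: the helper saw the mock too
```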

Mock it where it’s used

The best approach to mocking is to mock the object where it is used, not where it is defined. Your product and test code will look like this:

# product.py

from os import listdir

def my_function():
    files = listdir(some_directory)
    # ... use the file names ...

# test.py

def test_it():
    with mock.patch("product.listdir") as listdir:
        listdir.return_value = ['a.txt', 'b.txt', 'c.txt']
        my_function()

The only difference here from our first try is that we mock “product.listdir”, not “os.listdir”. That seems odd, because listdir isn’t defined in product.py. That’s fine, the name “listdir” is in both the os module and in product.py, and they are both references to the thing you want to mock. Neither is a more real name than the other.

By mocking where the object is used, we have tighter control over what callers are affected. Since we only want product.py’s behavior to change, we mock the name in product.py. This also makes the test more clearly tied to product.py.

As before, our references look like this once product.py has been fully imported:

[Figure: the os module’s name listdir and product.py’s name listdir both refer to the listdir() function.]

The difference now is how the mock changes things. During the test, our references look like this:

[Figure: the os module’s listdir still refers to listdir(); product.py’s listdir now refers to the mock.]

The code in product.py will use the mock, and no other code will. Just what we wanted!
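A sketch of the narrower effect, again with made-up in-memory modules (product_demo and helper_demo are hypothetical names, and the current directory stands in for some_directory). Only the name inside the product module is patched.

```python
import os
import sys
import types
from unittest import mock

def make_module(name, source):
    """Build a module in memory and register it so mock.patch can import it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

# Hypothetical stand-ins for product.py and an unrelated helper.
product_demo = make_module("product_demo", '''
from os import listdir

def my_function():
    return listdir(".")
''')
helper_demo = make_module("helper_demo", '''
from os import listdir

def count_files():
    return len(listdir("."))
''')

real_listdir = os.listdir

with mock.patch("product_demo.listdir") as fake_listdir:
    fake_listdir.return_value = ["a.txt", "b.txt", "c.txt"]
    product_result = product_demo.my_function()
    helper_demo.count_files()                      # uses the real listdir
    os_untouched = os.listdir is real_listdir
    helper_untouched = helper_demo.listdir is real_listdir
    mock_calls = fake_listdir.call_count

print(product_result)    # ['a.txt', 'b.txt', 'c.txt']
print(os_untouched)      # True
print(helper_untouched)  # True
print(mock_calls)        # 1: only my_function reached the mock
```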

Is this OK?

At this point, you might be concerned: it seems like mocking is kind of delicate. Notice that even with our last example, how we create the mock depends on something as arbitrary as how we imported the function. If our code had “import os” at the top, product.py would have no name “listdir” for us to patch, so mocking “product.listdir” wouldn’t work. An import style is exactly the kind of thing that could change in a refactoring, but at least mock.patch will fail loudly in that case (with an AttributeError) instead of silently mocking nothing.
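Here is a sketch of that failure. It assumes a made-up in-memory module named product_demo standing in for a product.py that was refactored to use "import os", and uses unittest.mock for the mock name.

```python
import sys
import types
from unittest import mock

# Hypothetical product module after a refactoring to "import os".
PRODUCT_SOURCE = '''
import os

def my_function():
    return os.listdir(".")
'''
product_demo = types.ModuleType("product_demo")
exec(PRODUCT_SOURCE, product_demo.__dict__)
sys.modules["product_demo"] = product_demo

try:
    # The module no longer has a name "listdir", so there is nothing to patch.
    with mock.patch("product_demo.listdir"):
        product_demo.my_function()
except AttributeError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)  # AttributeError
```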

You are right to be concerned: mocking is delicate. It depends on implementation details of the product code to construct the test. There are many reasons to be wary of mocks, and there are other approaches to solving the problems of isolating your product code from problematic dependencies.

If you do use mocks, at least now you know how to make them work, but again, there are other approaches. See the links at the top of this page.

Comments

Shawn Tolidano 3:32 AM on 5 Aug 2019

if you have a module like this:

thing.py:
import os

You can:
@mock.patch("thing.os.getcwd")

And I believe it will work (Python 3.7+ anyway)

Alan Franzoni 8:44 AM on 5 Aug 2019

Hello Ned,
I think your problem is not with mocking. It's monkey patching. Time to start using explicit dependency injection, so you don't depend on irrelevant implementation details.

What if I refactor my code to use os.walk instead of using os.listdir? If my test fails, it's a bad test.

Also: I think your example is a bit off. If I'm using os.listdir, presumably your code is doing something with the filesystem, and it's something you SHOULD definitely test. You should, instead, be able to inject someway a root directory where filesystem-related things happen, and inject a dedicated, possibly temporary, directory in your tests, so that you can a) control the state and b) verify what happens.

Ned Batchelder 10:32 AM on 5 Aug 2019

You are right, there are definitely more disciplined approaches, like the ones I linked to at the top of the piece.

MrBean Bremen 11:53 AM on 9 Aug 2019

As a side note: for mocking filesystem functions you can use pyfakefs (from pypi or GitHub), which will mock listdir for usages like:

from os import listdir
import os
from os import listdir as mylistdir

Disclaimer: I'm a collaborator on that project.

Philippe Bourgau 7:48 AM on 13 Aug 2019

Thanks for your very clear explanation of variable assignments in Python. Illustrating it made it great.

Another problem can happen if you use the same dependency in different ways in the same module, in different functions for example. From experience, I find it easier to explicitly inject dependencies as function arguments. Even better, we can write most of the code in a functional style (~ immutable and without side effects) and move the side effects to a small section of the codebase. This makes mocking unnecessary for most of the tests. Which in turn makes the tests more maintainable. I am definitely in the camp that tries to avoid mocks as much as possible. I wrote quite a few articles about mocking (https://philippe.bourgau.net/how-immutable-value-objects-fight-mocks/)

Thanks again for your post.

Wojtek 11:22 AM on 14 Aug 2019

I love the way you have visualised the variable assignments and how you can use the mechanism to create test doubles.

I wanted to also point out that your implicit definition of a mock is narrower than the generally accepted one. I believe you have actually described a stub created by monkey-patching. A mock allows for call verifications as well.

There are 3 main categories of techniques for managing dependent components used these days:

1. In-process class/method/function mocks or stubs http://xunitpatterns.com/Mocks,%20Fakes,%20Stubs%20and%20Dummies.html
1a. By monkey patching
1b. By dependency injection
2. Over-the-wire API mocks or stubs
3. Virtual services/simulators
https://en.wikipedia.org/wiki/Comparison_of_API_simulation_tools

It's worth keeping in mind that all of them are part of a wider group of test doubles: https://www.infoq.com/articles/stubbing-mocking-service-virtualization-differences/

Other options available for decoupling from test dependencies:
1. In-memory database
2. Test container
3. Legacy in a box

Eric Delvalet 8:33 AM on 22 Aug 2019

I am still new to python so I am still slowly digesting the whole post but it looks really useful so far. Thanks to you for taking the time. Your other post on assignment was a life saver to me.
So, half way through, did I guess right that you forgot "import product" at the start of your test.py files ? You kind of mention it in the text.

Ned Batchelder 10:11 AM on 22 Aug 2019

Well, not forgot... I tend to leave out obvious imports for the sake of brevity. The import in product.py is the subject of the piece, so it's included but "import mock", "import os", and "import product" are all omitted in test.py to keep it short. I guess it's short enough even with them, so I didn't need to skip them.
