Why your mock doesn’t work

Friday 2 August 2019

Mocking is a powerful technique for isolating tests from undesired interactions among components. But often people find their mock isn’t taking effect, and it’s not clear why. Hopefully this explanation will clear things up.

BTW: it’s really easy to over-use mocking, and there are good explanations of alternative approaches worth reading.

A quick aside about assignment

Before we get to fancy stuff like mocks, I want to review a little bit about Python assignment. You may already know this, but bear with me. Everything that follows is going to be directly related to this simple example.

Variables in Python are names that refer to values. If we assign a second name, the names don’t refer to each other, they both refer to the same value. If one of the names is then assigned again, the other name isn’t affected:

x = 23
y = x
x = 12
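
To see the behavior concretely, here are the same assignments with prints added; the output is shown in comments:

x = 23
y = x
print(x, y)     # 23 23: both names refer to the same value
x = 12
print(x, y)     # 12 23: reassigning x doesn't affect y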

If this is unfamiliar to you, or you just want to look at more pictures like this, Python Names and Values goes into much more depth about the semantics of Python assignment.

Importing

Let’s say we have a simple module like this:

# mod.py

val = "original"

def update_val():
    global val
    val = "updated"

We want to use val from this module, and also call update_val to change val. There are two ways we could try to do it. At first glance, it seems like they would do the same thing.

The first version imports the names we want, and uses them:

# code1.py

from mod import val, update_val

print(val)
update_val()
print(val)

The second version imports the module, and uses the names as attributes on the module object:

# code2.py

import mod

print(mod.val)
mod.update_val()
print(mod.val)

This seems like a subtle distinction, almost a stylistic choice. But code1.py prints “original original”: the value hasn’t changed! Code2.py does what we expected: it prints “original updated.” Why the difference?

Let’s look at code1.py more closely:

# code1.py

from mod import val, update_val

print(val)
update_val()
print(val)

After “from mod import val”, when we first print val, we have this:

[diagram: mod.py’s name val and code1.py’s name val both refer to the string ‘original’]

“from mod import val” means, import mod, and then do the assignment “val = mod.val”. This makes our name val refer to the same object as mod’s name val.
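
In other words, the import behaves roughly like this sketch (“from … import” doesn’t leave the name mod behind):

import mod
val = mod.val       # bind our own name "val" to the same object
del mod             # the name "mod" itself isn't kept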

After “update_val()”, when we print val again, our world looks like this:

[diagram: mod.py’s val now refers to ‘updated’, while code1.py’s val still refers to ‘original’]

update_val has reassigned mod’s val, but that has no effect on our val. This is the same behavior as our x and y example, but with imports instead of more obvious assignments. In code1.py, “from mod import val” is an assignment from mod.val to val, and works exactly like “y = x” does. Later assignments to mod.val don’t affect our val, just as later assignments to x don’t affect y.

Now let’s look at code2.py again:

# code2.py

import mod

print(mod.val)
mod.update_val()
print(mod.val)

The “import mod” statement means, make my name mod refer to the entire mod module. Accessing mod.val will reach into the mod module, find its val name, and use its value.

[diagram: code2.py’s name mod refers to the mod module, whose name val refers to ‘original’]

Then after “update_val()”, mod’s name val has been changed:

[diagram: mod.py’s val now refers to ‘updated’; code2.py’s mod still refers to the mod module]

Now we print mod.val again, and see its updated value, just as we expected.

OK, but what about mocks?

Mocking is a fancy kind of assignment: replace an object (or function) with a different one. We’ll use the mock.patch function in a with statement. It makes a mock object, assigns it to the name given, and then restores the original value at the end of the with statement.
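
You can think of mock.patch as automating an assignment dance. Roughly sketched, with the restore handled by try/finally the way a with statement would:

from unittest import mock
import os

original = os.listdir         # remember the current value of the name
os.listdir = mock.Mock()      # assign a mock object to the name
try:
    pass                      # ... the body of the with statement runs here ...
finally:
    os.listdir = original     # restore the original value on the way out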

Let’s consider this (very roughly sketched) product code and test:

# product.py

from os import listdir

def my_function():
    files = listdir(some_directory)
    # ... use the file names ...

# test.py

from unittest import mock

from product import my_function

def test_it():
    with mock.patch("os.listdir") as listdir:
        listdir.return_value = ['a.txt', 'b.txt', 'c.txt']
        my_function()

After we’ve imported product.py, both the os module and product.py have a name “listdir” that refers to the real listdir() function. The references look like this:

[diagram: the os module’s name listdir and product.py’s name listdir both refer to the real listdir() function]

The mock.patch in our test is really just a fancy assignment to the name “os.listdir”. During the test, the references look like this:

[diagram: the os module’s listdir now refers to the mock, but product.py’s listdir still refers to the real listdir() function]

You can see why the mock doesn’t work: we’re mocking something, but it’s not the thing our product code is going to call. This situation is exactly analogous to our code1.py example from earlier.

You might be thinking, “ok, so let’s do that code2.py thing to make it work!” If we do, it will work. Your product code and test will now look like this (the test code is unchanged):

# product.py

import os

def my_function():
    files = os.listdir(some_directory)
    # ... use the file names ...

# test.py

from unittest import mock

from product import my_function

def test_it():
    with mock.patch("os.listdir") as listdir:
        listdir.return_value = ['a.txt', 'b.txt', 'c.txt']
        my_function()

When the test is run, the references look like this:

[diagram: the os module’s listdir now refers to the mock; product.py’s name os refers to the os module, so product code will find the mock]

Because the product code refers to the os module, changing the name in the module is enough to affect the product code.

But there’s still a problem: this mocks the function for every module that uses it. That might be a more widespread effect than you intended. Perhaps your product code also calls some helpers, which also need to list files. The helpers might end up using your mock (depending on how they imported os.listdir!), which isn’t what you wanted.
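
For example, imagine a hypothetical helper module like this (helpers.py and its function are invented for illustration):

# helpers.py (hypothetical)

import os

def count_files(dirpath):
    # os.listdir is looked up on the os module at call time, so while
    # "os.listdir" is patched, this helper will get the mock too.
    return len(os.listdir(dirpath))

If helpers.py had instead done “from os import listdir” at import time, it would have kept a reference to the real function, and the mock wouldn’t touch it. Either way, the effect on helpers is an accident of their import style, not a choice you made in the test.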

Mock it where it’s used

The best approach to mocking is to mock the object where it is used, not where it is defined. Your product and test code will look like this:

# product.py

from os import listdir

def my_function():
    files = listdir(some_directory)
    # ... use the file names ...

# test.py

from unittest import mock

from product import my_function

def test_it():
    with mock.patch("product.listdir") as listdir:
        listdir.return_value = ['a.txt', 'b.txt', 'c.txt']
        my_function()

The only difference here from our first try is that we mock “product.listdir”, not “os.listdir”. That seems odd, because listdir isn’t defined in product.py. That’s fine: the name “listdir” is in both the os module and in product.py, and they are both references to the thing you want to mock. Neither is a more real name than the other.

By mocking where the object is used, we have tighter control over what callers are affected. Since we only want product.py’s behavior to change, we mock the name in product.py. This also makes the test more clearly tied to product.py.

As before, our references look like this once product.py has been fully imported:

[diagram: the os module’s listdir and product.py’s listdir both refer to the real listdir() function]

The difference now is how the mock changes things. During the test, our references look like this:

[diagram: product.py’s listdir now refers to the mock, while the os module’s listdir still refers to the real listdir() function]

The code in product.py will use the mock, and no other code will. Just what we wanted!

Is this OK?

At this point, you might be concerned: it seems like mocking is kind of delicate. Notice that even with our last example, how we create the mock depends on something as arbitrary as how we imported the function. If our code had “import os” at the top, we wouldn’t have been able to create our mock properly. This is something that could be changed in a refactoring, but at least mock.patch will fail in that case.

You are right to be concerned: mocking is delicate. It depends on implementation details of the product code to construct the test. There are many reasons to be wary of mocks, and there are other approaches to solving the problems of isolating your product code from problematic dependencies.

If you do use mocks, at least now you know how to make them work. But again, there are other approaches; see the note at the top of this page.

Set_env.py

Sunday 21 July 2019

A good practice when writing complicated software is to put in lots of debugging code. This might be extra logging, or special modes that tweak the behavior to be more understandable, or switches to turn off some aspect of your test suite so you can focus on the part you care about at the moment.

But how do you control that debugging code? Where are the on/off switches? You don’t want to clutter your real UI with controls. A convenient option is environment variables: you can access them simply in the code, your shell has ways to turn them on and off at a variety of scopes, and they are invisible to your users.
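
Reading a switch like this in Python is a one-liner. A minimal sketch, with a made-up variable name:

import os

# A made-up debugging switch: any non-empty value turns it on.
if os.environ.get("MYPROGRAM_DEBUG_CALLS"):
    print("debug tracing enabled")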

Though if they are invisible to your users, they are also invisible to you! How do you remember what exotic options you’ve coded into your program, and how do you easily see what is set, and change what is set?

I’ve been using environment variables like this in coverage.py for years, but only recently made it easier to work with them.

To do that, I wrote set_env.py. It scans a tree of files for special comments describing environment variables, then shows you the values of those variables. You can type quick commands to change the values, and when the program is done, it updates your environment. It’s not a masterpiece of engineering, but it works for me.

As an example, this line appears in coverage.py:

# $set_env.py: COVERAGE_NO_PYTRACER - Don't run the tests under the Python tracer.

This line is found by set_env.py, so it knows that COVERAGE_NO_PYTRACER is one of the environment variables it should fiddle with.
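
Finding the definitions can be as simple as a regex scan over the files. This is a sketch of the idea, not set_env.py’s actual code:

import re

# Matches marker comments like "$set_env.py: NAME - description".
DEFINITION_RX = re.compile(r"\$set_env\.py: (\w+) - (.*)")

def find_definitions(filenames):
    """Yield (name, description) for every marker comment found."""
    for filename in filenames:
        with open(filename, errors="ignore") as f:
            for line in f:
                match = DEFINITION_RX.search(line)
                if match:
                    yield match.group(1), match.group(2)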

When I run set_env.py in the coverage.py tree, I get something like this:

$ set_env
Read 298 files
 1:              COVERAGE_AST_DUMP                  Dump the AST nodes when parsing code.
 2:               COVERAGE_CONTEXT                  Set to 'test_function' for who-tests-what
 3:                 COVERAGE_DEBUG                  Options for --debug.
 4:           COVERAGE_DEBUG_CALLS                  Lots and lots of output about calls to Coverage.
 5:                COVERAGE_ENV_ID                  Use environment-specific test directories.
 6:              COVERAGE_KEEP_TMP                  Keep the temp directories made by tests.
 7:          COVERAGE_NO_CONTRACTS                  Disable PyContracts to simplify stack traces.
 8:            COVERAGE_NO_CTRACER                  Don't run the tests under the C tracer.
 9:           COVERAGE_NO_PYTRACER = '1'            Don't run the tests under the Python tracer.
10:               COVERAGE_PROFILE                  Set to use ox_profile.
11:            COVERAGE_TRACK_ARCS                  Trace every arc added while parsing code.
12:                 PYTEST_ADDOPTS                  Extra arguments to pytest.

(# [value] | x # ... | ? | q)>

All of the files were scanned, and 12 environment variables found. We can see that COVERAGE_NO_PYTRACER has the value “1”, and none of the others are in the environment. At the prompt, if I type “4”, then COVERAGE_DEBUG_CALLS (line 4) will be toggled to “1”. Type “4” again, and it is cleared. Typing “4 yes please” will set it to “yes please”, but often I just need something or nothing, so toggling “1” as the value works.

One bit of complexity here is that a program you run in your shell can’t change environment variables for subsequent programs, and changing them is exactly what we need to do. So “set_env” is actually a shell alias:

alias set_env='$(set_env.py $(git ls-files))'

This runs set_env.py against all of the files checked-in to git, and then executes whatever set_env.py outputs. Naturally, set_env.py outputs shell commands to set environment variables. If ls-files produces too much output, you can use globs there also, so “**/*.py” might be useful.
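
The output side is simple too. A sketch of how those commands could be printed (my function name, not the real one):

def emit_shell_commands(settings):
    """Print shell commands to apply `settings`, a dict mapping
    environment variable names to values, with None meaning unset.
    """
    for name, value in settings.items():
        if value is None:
            print(f"unset {name}")
        else:
            print(f"export {name}='{value}'")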

Like I said, it’s not a masterpiece, but it works for me. If there are other tools out there that do similar things, I’d like to hear about them.

Coverage.py 5.0a6: context reporting

Wednesday 17 July 2019

I’ve released another alpha of coverage.py 5.0: coverage.py 5.0a6. There are some design decisions ahead that I could use feedback on.

Important backstory:

  • The big feature in 5.0 is “contexts”: recording data for varying execution contexts, also known as Who Tests What. The idea is to record not just that a line was executed, but also which tests ran each line.
  • Some of the changes in alpha 6 were driven by a hackathon project at work: using who-tests-what on the large Open edX codebase. We wanted to collect context information, and then for each new pull request, run only the subset of tests that touched the lines you changed. Initial experiments indicate this could be a huge time-savings.

Big changes in this alpha:

  • Support for contexts when reporting. The --show-contexts option annotates lines with the names of contexts recorded for the line. The --contexts option lets you filter the report to only certain contexts. Big thanks to Stephan Richter and Albertas Agejevas for the contribution.
  • Our largest test suite at work has 29k tests. The .coverage SQLite data file was 659 MB, which was too large to work with. I changed the database format to use a compact bitmap representation for line numbers (sketched just after this list), which reduced the data file to 69 MB, a huge win.
  • The API to the CoverageData object has changed.
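
To give a flavor of the bitmap idea mentioned above, here is a minimal sketch of packing line numbers into bits. It illustrates the general technique, not coverage.py’s actual storage code:

def lines_to_bitmap(lines):
    """Encode a non-empty set of line numbers as bytes, bit N set for line N."""
    bitmap = bytearray(max(lines) // 8 + 1)
    for line in lines:
        bitmap[line // 8] |= 1 << (line % 8)
    return bytes(bitmap)

def bitmap_to_lines(bitmap):
    """Decode a bitmap back to a sorted list of line numbers."""
    return [
        byte_index * 8 + bit
        for byte_index, byte in enumerate(bitmap)
        for bit in range(8)
        if byte & (1 << bit)
    ]

Storing one bit per executed line instead of a database row per line is the kind of compaction that shrinks the data file.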

Some implications of these changes:

  • The HTML reporting on contexts is good for small test suites, but very quickly becomes unwieldy if you have more than 100 tests. Please try using it and let me know what kind of reporting would be helpful.
  • The new more-compact data file is harder to query. The older, larger data file had a schema designed to be useful for ad-hoc querying: a classic third-normal-form representation of the data. Now I consider the database schema to be a private implementation detail. Should we have a new “coverage sql” report command that exports the data to a convenient SQLite file?
  • Because CoverageData has changed, you will need an updated version of pytest-cov if you use that plugin. The future of the plugin is somewhat up in the air. If you would like to help maintain it, get in touch. You can install the up-to-date code with:
    pip install git+https://github.com/nedbat/pytest-cov.git@nedbat/cov5-combine#egg=pytest-cov==0.0
  • To support our hackathon project, we wrote a new pytest plugin: it uses pytest hooks to indicate the test boundaries, and can read the database and the code diff to choose the subset of tests to run. This plugin is in very rough shape (as in, it hasn’t yet fully worked), but if you are interested in participating in this experiment, get in touch. The code is here: nedbat/coverage_pytest_plugin. I don’t think this will remain as an independent plugin, so again, if you want to help with future maintenance or direction, let me know.
  • All of our experimentation (and improvements) for contexts involve line coverage. Branch coverage only complicates the problems of storage and reporting. I’ve mused about how to store branch data more compactly in the past, but nothing has been done.

I know this is a lot, and the 5.0 alpha series has been going on for a while. The features are shaping up to be powerful and useful. All of your feedback has been very helpful, keep it coming.

Changelog podcast: me, double-dipping

Saturday 29 June 2019

I had a great conversation with Jerod Santo on the Changelog podcast: The Changelog 351: Maintainer spotlight! Ned Batchelder. We talked about Open edX, and coverage.py, and maintaining open source software.

One of Jerod’s questions was unexpected: what other open source maintainers do I appreciate? Two people that came to mind were Daniel Hahler and Julian Berman. Some people are well-known in the Python community because they are the face of large widely used projects. Daniel and Julian are known to me for a different reason: they seem to make small contributions to many projects. I see their names in the commits or issues of many repos I wander through, including my own.

This is a different kind of maintainership: not guiding large efforts, but providing little pushes in lots of places. If I had had the presence of mind, I would have also mentioned Anthony Sottile for the same reason.

And I would have mentioned Mariatta, for a different reason: her efforts are focused on CPython, but on the contribution process and tooling around it, rather than the core code itself. A point I made in the podcast was that people and process challenges are often the limiting factor to contribution, not technical challenges. Mariatta has been at the forefront of the efforts to open up CPython contribution, and I wish I had mentioned her in the podcast.

And I am sure there are people I am overlooking that should be mentioned in these appreciations. My apologies to you if you are in that category...

A year of light and dark

Sunday 23 June 2019

Friday was the summer solstice, when day and night are at their most extreme imbalance. It reminded me of the last summer solstice — and the year of light and dark, ups and downs, since then — all revolving around Nat, my 29-year-old autistic son.

Last year on the solstice we were at a block party in a neighborhood in Boston. We had become close friends with a young couple, and planned for Nat to move in with them, and for them to be his caregivers.

The couple was eager to extend their family from two to three. We had had long serious discussions with them about the challenges involved. They knew Nat pretty well, and had deep experience with similar disabled populations. They had even moved to a new apartment in order to have space for Nat.

The new apartment was on this quirky cul-de-sac on a hill, a small tight-knit community, complete with a summer solstice party. It seemed magical, like an entirely new experience opening up to us. Nat would be moving in with a couple his own age, with young enthusiasms, and an anything-is-possible approach to the world. The neighborhood only added to the sense of expanding possibilities. It seemed like a good plan, almost too good to be true.

We planned for Nat to move at the end of August. We spent lots of time with his new caregivers over the summer, doing new things from their world. This helped them understand Nat’s full-day routines better, and was exciting for us.

The move went great. But over the course of the fall, things started not going well. Nat has always had periods of anxiety, but it’s hard to pinpoint the causes. He was going through a bad time, with alarming head-hitting. The caregivers were having health issues of their own, which made it difficult for them to give Nat the routine and stability he needs.

We tried to support the new arrangement by having Nat on weekends, and generally being there for everyone. For reasons that are still not clear to me, it wasn’t enough, and things just kept going downhill, including our interactions with the couple. By March, the arrangement that seemed too good to be true proved to be exactly that. Nat moved back with us.

This was a hard time. Everyone involved reacted in their own ways to the stress, which caused conflict between us and the caregivers. I think Nat overall was happy to be back in our house, but his anxieties had not lessened. Was the change of home part of the cause? We’ll never know.

Parenting Nat has involved a long series of choices for him: where he’ll be schooled, where he will live, what he will do during his days and nights. These choices often fall into two broad categories: the exciting but risky, and the safe but underwhelming — another kind of light and dark. And underlying those decisions is always the impossible question: are we doing enough?

Now that he was back with us, we had to decide where he would go next. He could stay with us permanently, but we know that we are perhaps the least stimulating place for him. We get caught up in our own activities and interests, and he is passive enough that lots of time passes doing nothing. He might be fine with it, but it makes us wonder: are we doing enough?

And looming over all of our planning for him is the question: what will happen at the end of our lives? Now he is 29 and we are 57, but when he is 49 and we are 77 (or 87 or 97!), living together will be a very different story. We want him to have a life separate from us. We think it will be better for him.

The arrangement we had with the couple is known as “shared living,” and we thought about whether we wanted to try that again. Nat had been in two shared living situations by now, and our feeling was that it was too reliant on too few people. We know shared living has worked for other people, but that’s another constant in parenting Nat: just because something works for one autism family doesn’t mean it will work for us. Shared living didn’t seem right for Nat.

We talked with other families we know about what they were planning to do. But most of them had younger guys, or far more resources, or were making decisions on longer timescales than us for other reasons. And honestly, housing together with families we’re already friends with could be like going into business with friends: a good way to strain or ruin the friendship. We didn’t want to do that again.

We asked for a new placement in a group home, figuring we’d take a look at what opened up and see how we felt about it. Two months later we were offered a placement, in a house run by the same organization as Nat’s previous group home that he had moved out of the year before.

The residents are a much better fit with Nat this time, and the staff seems eager and energetic. It’s hard to know whether we are getting accurate answers from Nat when asked his opinion, but he has been nothing but positive about moving to this new house. Being in the same organization means we are familiar with some of the logistics, and Nat will know some of the residents from other houses when they do things together.

Although a group home generally falls into the safe category rather than the risky, it feels like this one might be safe without being underwhelming. We moved him in yesterday, and all seems good. We’ve been through this enough to know that it won’t be perfect. There will be miscommunications with the rotating staff, and he’ll come home wearing another resident’s shirt, but nothing is perfect.

We are still connected to the couple, through other circles. But it is awkward now, because we have never directly talked about the strains from the move-out. I hope that we can do that some day.

As I have said before, I know this is not the last time we will have to make big decisions for Nat. This one feels good, but others have felt good in the past too. I’m optimistic but alert.

Marketing factoid of the day: 57 varieties

Sunday 16 June 2019

The Heinz company has been using the slogan “57 varieties” for more than 120 years. But it was never factual. When they introduced the slogan in 1896, the company already had more than 60 products. The number was chosen for its sound and “psychological effect.”

It’s hard to know the exact number, but today Heinz has thousands of products, including at least 20 ketchups.

BTW, you might be interested in other posts I’ve written on this day in the past.
