
Multi-parameter Jupyter notebook interaction

Saturday 29 October 2016

I'm working on figuring out retirement scenarios. I wasn't satisfied with the usual online calculators. I made a spreadsheet, but it was hard to see how the different variables affected the outcome. Aha! This sounds like a good use for a Jupyter Notebook!

Using widgets, I could make a cool graph with sliders for controlling the variables, and affecting the result. Nice.

But there was a way to make the relationship between the variables and the outcome more apparent: choose one of the variables, and plot its multiple values on a single graph. And of course, I took it one step further, so that I could declare my parameters, and have the widgets, including the selection of the variable to auto-slide, generated automatically.

I'm pleased with the result, even if it's a little rough. You can download retirement.ipynb to try it yourself.

The general notion of a declarative multi-parameter model with an auto-slider is contained in a class:

%pylab --no-import-all inline

from collections import namedtuple

from ipywidgets import interact, IntSlider, FloatSlider

class Param(namedtuple('Param', "default, range")):
    """A parameter for `Model`."""
    def make_widget(self):
        """Create a widget for a parameter."""
        is_float = isinstance(self.default, float)
        is_float = is_float or any(isinstance(v, float) for v in self.range)
        wtype = FloatSlider if is_float else IntSlider
        return wtype(
            value=self.default,
            min=self.range[0], max=self.range[1], step=self.range[2],
        )

class Model:
    """A multi-parameter model."""

    output_limit = None
    num_auto = 7
    def _show_it(self, auto_param, **kw):
        if auto_param == 'None':
            plt.plot(self.inputs, self.run(self.inputs, **kw))
        else:
            autop = self.params[auto_param]
            auto_values = np.arange(*autop.range)
            if len(auto_values) > self.num_auto:
                lo, hi = autop.range[:2]
                auto_values = np.arange(lo, hi, (hi-lo)/self.num_auto)
            for auto_val in auto_values:
                kw[auto_param] = auto_val
                output = self.run(self.inputs, **kw)
                plt.plot(self.inputs, output, label=str(auto_val))
            plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
        if self.output_limit is not None:
            plt.ylim(*self.output_limit)

    def interact(self):
        widgets = {
            name:p.make_widget() for name, p in self.params.items()
        }
        param_names = ['None'] + sorted(self.params)
        interact(self._show_it, auto_param=param_names, **widgets)

To make a model, derive a class from Model. Define a dict called params as a class attribute. Each parameter has a default value, and a range of values it can take, expressed (min, max, step):

class Retirement(Model):
    params = dict(
        invest_return=Param(3, (1.0, 8.0, 0.5)),
        p401k=Param(10, (0, 25, 1)),
        retire_age=Param(65, (60, 75, 1)),
        live_on=Param(100000, (50000, 150000, 10000)),
        inflation=Param(2.0, (1.0, 4.0, 0.25)),
        inherit=Param(1000000, (0, 2000000, 200000)),
        inherit_age=Param(70, (60, 90, 5)),
    )

Your class can also have some constants:

start_savings = 100000
salary = 100000
socsec = 10000

Define the inputs to the graph (the x values), and the range of the output (the y values):

inputs = np.arange(30, 101)
output_limit = (0, 10000000)

Finally, define a run method that calculates the output from the inputs. It takes the inputs as an argument, and also has a keyword argument for each parameter you defined:

def run(self, inputs,
    invest_return, p401k, retire_age, live_on,
    inflation, inherit, inherit_age
):
    for year, age in enumerate(inputs):
        if year == 0:
            # The first year is just the starting balance.
            yearly_money = [self.start_savings]
            continue

        inflation_factor = (1 + inflation/100)**year
        money = yearly_money[-1]
        money = money*(1+(invest_return/100))
        if age == inherit_age:
            money += inherit
        if age <= retire_age:
            # Still working: contribute a percentage of salary.
            money += self.salary * inflation_factor *(p401k/100)
        else:
            # Retired: collect social security and pay living expenses.
            money += self.socsec
            money -= live_on * inflation_factor
        yearly_money.append(money)

    return np.array(yearly_money)
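
As a quick sanity check on the arithmetic, here is one working year computed by hand in plain Python. It uses the default parameter values and the constants shown above, and only the "still working" branch of run. It is a sketch for checking the numbers, not part of the notebook:

```python
# Constants and defaults copied from the Retirement class above.
start_savings = 100000
salary = 100000
invest_return = 3      # percent
p401k = 10             # percent of salary saved
inflation = 2.0        # percent

year = 1               # second entry in inputs (age 31)

# Salary grows with inflation, compounded yearly.
inflation_factor = (1 + inflation / 100) ** year

# Last year's balance earns the investment return...
money = start_savings * (1 + invest_return / 100)
# ...and a slice of the inflated salary is added on top.
money += salary * inflation_factor * (p401k / 100)

print(round(money, 2))
```

So after one year the balance is the starting savings plus 3% growth plus 10% of the inflation-adjusted salary.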

To run the model, just instantiate it and call interact():

Retirement().interact()

You'll get widgets and a graph like this:

[Image: Jupyter notebook, in action]

There are things I would like to be nicer about this:

  • The sliders are a mess: if you make too many parameters, the sliders and the graph don't fit on the screen.
  • The values chosen for the auto parameter are not "nice", like tick marks on a graph are nice.
  • It'd be cool to be able to auto-slide two parameters at once.
  • The code isn't packaged in a way people can easily re-use.
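
For the "nice" values item, one possible direction is the trick graph libraries use for tick marks: round the step up to 1, 2, 2.5, or 5 times a power of ten, then take the multiples of that step inside the range. This is only a sketch of the idea, not something the notebook does; the nice_values name and its max_count argument are made up for illustration:

```python
import math

def nice_values(lo, hi, max_count=7):
    """Return round-looking values inside [lo, hi].

    Hypothetical helper: picks a step of 1, 2, 2.5, 5 or 10 times a
    power of ten, so at most max_count + 1 values come back.
    """
    if hi <= lo:
        return [lo]
    raw_step = (hi - lo) / max_count
    magnitude = 10 ** math.floor(math.log10(raw_step))
    # Take the smallest "nice" step that is at least the raw step.
    for mult in (1, 2, 2.5, 5, 10):
        step = mult * magnitude
        if step >= raw_step:
            break
    first = math.ceil(lo / step)
    last = math.floor(hi / step)
    # round() trims floating-point noise from the multiplication.
    return [round(i * step, 10) for i in range(first, last + 1)]

print(nice_values(0, 25))
print(nice_values(50000, 150000))
```

A function like this could stand in for the second np.arange call in _show_it, though the edge cases (very small ranges, values that are not exact in floating point) would need more care.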

I thought about fixing a few of these things, but I likely won't get to them. The code is here in this blog post or in the notebook file if you want it. Ideas welcome about how to make improvements.

BTW: my retirement plans are not based on inheriting a million dollars when I am 70, but it's easy to add parameters to this model, and it's fun to play with...

A failed plugin

Saturday 22 October 2016

A different kind of story today: a clever test runner plugin that, in the end, did not do what I had hoped.

At edX, our test suite is large, and split among a number of CI workers. One of the workers was intermittently running out of memory. Something (not sure what) led us to the idea that TestCase objects were holding onto mocks, which themselves held onto their calls' arguments and return values, which could be a considerable amount of memory.

We use nose (but plan to move to pytest Real Soon Now™), and nose holds onto all of the TestCase objects until the very end of the test run. We thought, there's no reason to keep all that data on all those test case objects. If we could scrub the data from those objects, then we would free up that memory.

We batted around a few possibilities, and then I hit on something that seemed like a great idea: a nose plugin that at the end of a test, would remove data from the test object that hadn't been there before the test started.

Before I get into the details, the key point: when I had this idea, it was a very familiar feeling. I have been here many times before. A problem in some complicated code, and a clever idea of how to attack it. These ideas often don't work out, because the real situation is complicated in ways I don't understand yet.

When I had the idea, and mentioned it to my co-worker, I said to him, "This idea is too good to be true. I don't know why it won't work yet, but we're going to find out." (foreshadowing!)

I started hacking on the plugin, which I called blowyournose. (Nose's one last advantage over other test runners is playful plugin names...)

The implementation idea was simple: before a test runs, save the list of the attributes on the test object. When the test ends, delete any attribute that isn't in that list:

from nose.plugins import Plugin

class BlowYourNose(Plugin):

    # `test` is a Nose test object. `test.test` is the
    # actual TestCase object being run.

    def beforeTest(self, test):
        test.byn_attrs = set(dir(test.test))

    def afterTest(self, test):
        obj = test.test
        for attr in dir(obj):
            if attr not in test.byn_attrs:
                delattr(obj, attr)

By the way: a whole separate challenge is how to test something like this. I did it with a class that could report on its continued existence at the end of tests. Naturally, I named that class Booger! If you are interested, the code is in the repo.
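
This is not the Booger code from the repo, just a guess at the general technique: a weak reference lets you ask whether an object is still alive without keeping it alive yourself. The Tracked and FakeTest classes and the scrub_new_attrs function are hypothetical stand-ins, with no nose involved:

```python
import gc
import weakref

class Tracked:
    """Hypothetical object whose lifetime we want to observe."""

class FakeTest:
    """Hypothetical stand-in for a TestCase instance."""
    def setUp(self):
        self.big_thing = Tracked()

def scrub_new_attrs(obj, before):
    """Delete every attribute that was not in the `before` snapshot."""
    for attr in dir(obj):
        if attr not in before:
            delattr(obj, attr)

test = FakeTest()
before = set(dir(test))          # snapshot, as in beforeTest
test.setUp()

ref = weakref.ref(test.big_thing)
alive_before_scrub = ref() is not None

scrub_new_attrs(test, before)    # cleanup, as in afterTest
gc.collect()                     # not needed on CPython, harmless elsewhere
alive_after_scrub = ref() is not None

print(alive_before_scrub, alive_after_scrub)
```

The same check works in a real test run as long as nothing else (a mock, a traceback, a module global) is still holding a reference to the object.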

At this point, the plugin solved this problem:

class MyLeakyTest(unittest.TestCase):
    def setUp(self):
        self.big_thing = big_thing()

    def test_big_thing(self):
        self.assertEqual(self.big_thing.whatever, 47)

The big_thing attribute will be deleted from the test object once the test is over, freeing the memory it consumed.

The next challenge was tests like this:

@mock.patch('os.listdir')
def test_directory_handling(self, mock_listdir):
    blah blah ...

The patch decorator stores the patches on an attribute of the function, so I updated blowyournose to look for that attribute, and set it to None. This nicely reclaimed the space at the end of the test.
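
To see that attribute for yourself, here is a small sketch using the standard library's unittest.mock. The attribute name patchings is what CPython's implementation uses as far as I can tell; it is an internal detail rather than documented API, so treat it as an assumption that could change between versions:

```python
import os
from unittest import mock

@mock.patch("os.listdir")
def test_directory_handling(mock_listdir):
    # The mock is passed in by the decorator for the duration of the call.
    mock_listdir.return_value = ["a.txt"]
    return os.listdir(".")

# Look the attribute up defensively, since it is not a public API.
patchings = getattr(test_directory_handling, "patchings", None)

result = test_directory_handling()
print(result)
print(patchings)
```

Because the list lives on the function object, anything that keeps the function alive keeps the patch objects alive too.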

But you can see where this is going: as I experiment with using the plugin on more and more of our test suite, I encounter yet-more-exotic ways to write tests that exceed the capabilities of the plugin. Each time, I add more logic to the plugin to deal with the new quirk, hoping that I can finally deal with "everything."

We use ddt for data-driven tests like this:

@ddt
class FooTestCase(unittest.TestCase):

    @data(3, 4, 12, 23)
    def test_larger_than_two(self, value):
        self.assertGreater(value, 2)

This turns one test method into four test methods, one for each data value. When combined with @patch, it means that we can't clean up the patch when one method is done; we need to wait until all the methods are done. But we don't know which is the last. To deal with this, the plugin sniffs around for indications that ddt is being used, and defers the cleanup until the entire class is done.

But then comes test case inheritance:

class BaseTest(unittest.TestCase):
    __test__ = False

    def test_something(self, something):
        ...

class Setting1Test(BaseTest):
    __test__ = True

    def setUp(self):
        self.setting = 1

class Setting2Test(BaseTest):
    __test__ = True

    def setUp(self):
        self.setting = 2

Now we have patches on generated methods, and even the end of the class is too early to clean up, because there are other classes using them later. We have no way to know when it is safe to clean up, except at the very end of all the tests. But the whole point was to reclaim memory sooner than that.
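
The underlying reason is that an inherited method is one function object shared by every subclass, so anything hung on that function is shared as well. A minimal sketch with plain unittest and unittest.mock (no nose or ddt; the patch target is picked arbitrarily for illustration):

```python
import unittest
from unittest import mock

class BaseTest(unittest.TestCase):
    __test__ = False

    @mock.patch("os.listdir")
    def test_something(self, mock_listdir):
        pass

class Setting1Test(BaseTest):
    __test__ = True

class Setting2Test(BaseTest):
    __test__ = True

# On Python 3, both lookups find the very same function stored on BaseTest.
same_function = Setting1Test.test_something is Setting2Test.test_something
print(same_function)
```

Clearing the patches after Setting1Test finishes would therefore also clear them for Setting2Test, which hasn't run yet.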

So the good news is, I was right: there were reasons my simple brilliant idea wasn't going to work. The bad news is, I was right. This is so typical of this kind of work: it's a simple idea, that seems so clearly right when you are in the shower, or on your bike, or swimming laps. Then you get into the actual implementation and all the real-world complexity and twistiness reveals itself. You end up in a fun-house of special cases. You chase them down, thinking, "no problem, I can account for that," and maybe you can, but there are more creepy clowns around the next corner, and chances are really good that eventually one will be too much for your genius idea.

In this case, just to top it off, it turns out the memory problem in our test suite wasn't about long-lived mocks at all. It was due to Django 1.8 migrations consuming tons of memory, and the solution is to upgrade to 1.9 (someday...). Sigh.

One of the challenges of software engineering is remaining optimistic in the face of boss battles like this. Occasionally a simple genius idea will work out. Sometimes, solving 90% of the problem is a good thing, and the other 10% can remain unsolved. Even total losses like blowyournose are good experience, good learning exercises.

And the next idea will be better!
