|Ned Batchelder : Blog | Code | Text | Site|
Python's context managers are the general mechanism underlying the "with" statement. They're a nice abstraction of doing a thing, and then later undoing it. The classic example is opening a file and then later closing it:
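As a minimal sketch of that classic case (the scratch file and its contents are made up for illustration):

```python
import os
import tempfile

# A throwaway file so the example is self-contained; the name is arbitrary.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w") as f:      # "do": open the file
    f.write("hello\n")
# "undo": the file has been closed here, even if the body had raised.

with open(path) as f:
    contents = f.read()

file_is_closed = f.closed
```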
But context managers can be used for other do-then-later-undo types of behavior. Here's a context manager that changes the current directory, then later changes it back:
Context managers are objects that have __enter__ and __exit__ methods, but here we've used a very handy decorator from contextlib to make a context manager using the yield statement.
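A sketch of such a directory-changing context manager, written with contextlib.contextmanager. The name change_dir is my own choice for illustration:

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def change_dir(new_dir):
    """Change to `new_dir`, then change back when the with-block ends."""
    old_dir = os.getcwd()
    os.chdir(new_dir)           # the "do" part
    try:
        yield new_dir
    finally:
        os.chdir(old_dir)       # the "later undo" part

start = os.getcwd()
with change_dir(tempfile.gettempdir()):
    inside = os.getcwd()
after = os.getcwd()
```

Everything before the yield runs on entry, and the finally clause runs on exit, even if the body of the with statement raises.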
Now, suppose you have a context manager that neatly encapsulates your needed behavior, and further suppose that you are writing unit tests, and wish to get this behavior in your setUp and tearDown methods. How do you do it?
You can't use a with statement, because you need the "do" part of the context manager to happen in setUp, and then you need the "later undo" part of it to happen in tearDown. The syntax-indicated scope of the with statement won't work.
You can do it using the context manager protocol directly to perform the actions you need. And unittest has a mechanism better than tearDown: addCleanup takes a callable, and guarantees to call it when the test is done. So both the before-test logic and the after-test logic can be expressed in one place.
Here's how: write a helper function to use a context manager in a setUp function:
Now where you would have used a context manager like this:
you can do this in your setUp function:
Simple and clean.
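Roughly, the pieces could fit together like this. This is a sketch: the helper name setup_with_context_manager and the change_dir context manager are illustrative choices, not anything provided by unittest.

```python
import contextlib
import os
import tempfile
import unittest

@contextlib.contextmanager
def change_dir(new_dir):
    """Illustrative context manager: chdir, then chdir back."""
    old_dir = os.getcwd()
    os.chdir(new_dir)
    try:
        yield new_dir
    finally:
        os.chdir(old_dir)

def setup_with_context_manager(testcase, cm):
    """Enter `cm` now, and arrange for it to exit when the test is done."""
    val = cm.__enter__()
    # The cleanup always passes None for the exception details.
    testcase.addCleanup(cm.__exit__, None, None, None)
    return val

class ExampleTest(unittest.TestCase):
    def setUp(self):
        self.tmp = os.path.realpath(tempfile.mkdtemp())
        # Instead of: with change_dir(self.tmp): ...
        setup_with_context_manager(self, change_dir(self.tmp))

    def test_runs_in_temp_dir(self):
        self.assertEqual(os.path.realpath(os.getcwd()), self.tmp)
```

Because addCleanup is registered right after __enter__ succeeds, the "undo" runs whether the test passes, fails, or errors.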
Notice that @contextlib.contextmanager lets us write a generator, then use a decorator to turn it into a context manager. There are a lot of Python features at work here in a very small space, which is kind of cool. Then we use addCleanup to take a callable as a first-class object to get the clean up we need, which is even more cool.
One caveat about this technique: a context manager's __exit__ method can be called with information about an exception in progress. The mechanism shown here will never do that. I'm not sure it even should, considering how it's being used in a test suite. But just beware.
A seemingly simple change to fix a small bug led me to some interesting software design choices. I'll try to explain.
In the new beta of coverage.py, I had a regression where the "run --append" option didn't work when there wasn't an existing data file. The problem was code in class CoverageScript in cmdline.py that looked like this:
If there was no .coverage data file, then this code would fail. The fix was really simple: just check if the file exists before trying to combine it:
(Of course, all of these code examples have been simplified from the actual code...)
The problem with this has to do with how the CoverageScript class is tested. It's responsible for dealing with the command-line syntax, and invoking methods on a coverage.Coverage object. To make the testing faster and more focused, test_cmdline.py uses mocking. It doesn't use an actual Coverage object, it uses a mock, and checks that the right methods are being invoked on it.
The test for this bit of code looked like this, using a mocking helper that works from a sketch of methods being invoked:
This test means that "run --append foo.py" will make a Coverage object with no arguments, then call cov.start(), then cov.run_python_file with two arguments, etc.
The problem is that the product code (cmdline.py) will actually call os.path.exists, and maybe call .combine, depending on what it finds. This mocking test can't easily take that into account. The design of cmdline.py was that it was a thin-ish wrapper over the methods on a Coverage object. This made the mocking strategy straightforward. Adding logic in cmdline.py makes the testing more complicated.
OK, second approach: change Coverage.combine() to take a missing_ok=True parameter. Now cmdline.py could tell combine() to not freak out if the file didn't exist, and we could remove the os.path.exists conditional from cmdline.py. The code would look like this:
and the test would now look like this:
Coverage.combine() is part of the public API to coverage.py. Was I really going to extend that supported API for this use case? It would mean documenting, testing, and promising to support that option "forever". There's no nice way to add an unsupported argument to a supported method.
Extending the supported API to simplify my testing seemed like the tail wagging the dog. I'm all for letting testing concerns inform a design. Often the tests are simply proxies for the users of your API, and what makes the testing easier will also make for a better, more modular design.
But this just felt like me being lazy. I didn't want combine() to have a weird option just to save the caller from having to check if the file exists. I imagined explaining this option to someone else, and I didn't want my future self to have to sheepishly admit, "yeah, it made my tests easier..."
What finally turned me back from this choice was the principle of saying "no." Sometimes the best way to keep a product simple and good is to say "no" to extraneous features. Setting aside all the testing concerns, this option on Coverage.combine() just felt extraneous.
Having said "no" to changing the public API, it's back to a conditional in cmdline.py. To make testing CoverageScript easier, I use dependency injection to give the object a function to check for files. CoverageScript already had parameters on the constructor for this purpose, for example to get the stand-in for the Coverage class itself. Now the constructor will look like:
and the test code can provide a mock for _path_exists and check its arguments:
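To make the shape of that concrete, here is a much-simplified sketch of the dependency injection idea. The Script class, its do_run method, and the argument lists are hypothetical stand-ins, not the real CoverageScript code or coverage.py's exact signatures:

```python
import os.path
from unittest import mock

class Script:
    """Hypothetical, much-simplified stand-in for a command-line class."""

    def __init__(self, _covpkg=None, _path_exists=None):
        # Test hooks: tests pass fakes; product code would use the real things.
        self.covpkg = _covpkg
        self.path_exists = os.path.exists if _path_exists is None else _path_exists

    def do_run(self, script, append=False, data_file=".coverage"):
        cov = self.covpkg.Coverage()
        cov.start()
        cov.run_python_file(script)
        cov.stop()
        # The conditional lives here, but the file check is injectable.
        if append and self.path_exists(data_file):
            cov.combine(data_file)
        cov.save()
        return cov

def run_with_fakes(file_exists):
    """Run the script with a mock package and a mock file check."""
    covpkg = mock.Mock()
    path_exists = mock.Mock(return_value=file_exists)
    script = Script(_covpkg=covpkg, _path_exists=path_exists)
    cov = script.do_run("foo.py", append=True)
    return cov, path_exists

cov_missing, exists_missing = run_with_fakes(file_exists=False)
cov_present, exists_present = run_with_fakes(file_exists=True)
```

The test never touches the file system: it decides what the file check returns, then asserts on which methods were called.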
Yes, this makes the testing more involved. But that's my business, and this doesn't change the public interface in ways I didn't like.
When I started writing this blog post, I was absolutely certain I had made the right choice. As I wrote it, I wavered a bit. Would missing_ok=True be so bad to add to the public interface? Maybe not. It's not such a stretch, and a user of the API might plausibly convince me that it's genuinely helpful to them. If that happens, I can reverse all this. That would be ok too. Decisions, decisions...
I think Coverage.py v4.0 is ready. But to be on the safe side, I'm releasing it as a beta because there have been a ton of changes since v3.7.1. Try it: coverage.py 4.0b1.
Changes since 4.0a6:
If you are interested, there is a complete list of changes: CHANGES.txt.
Also available is the latest version of the Django coverage plugin: django_coverage_plugin 0.6. This uses the new plugin support in Coverage.py 4.0 to implement coverage measurement of Django templates.
My exercise is swimming, and it's an important part of my day. I track my distance. Usually I swim a mile or so. To swim a mile in a 25-yard pool, you have to make 36 round trips.
I say this as, "36 laps." The sign at my pool says a mile is 36 laps.
I was listening to the How to Do Everything podcast, and they had a question about whether a lap is once across a pool, or there and back. I smugly thought to myself, "there and back, of course."
To answer the question, they asked Natalie Coughlin, an Olympic swimmer, who said,
What!? How does this make sense? We already have a word for one end to the other, "a length." Are we really going to use both words to mean the same thing, and then have no word for there and back?
In any other sport, a lap takes you from a starting point, out some distance, and then back to where you started. Why should swimming be different? I thought this was supposed to be an erudite sport?
Looking for a higher authority, I consulted the glossary at USA Swimming:
Thanks a lot... This definition both exposes the absurdity, by defining lap to mean precisely "a length," and then throws out there that some people use the word differently (in the useful way), so we really don't know what we're talking about.
Can we do something about this? Can't the universe make just a little more sense?
If you participate in mailing lists or IRC long enough, you will encounter a type of person I call The Lone Confused Expert. These are people who know a lot, but have gotten something wrong along the way. They have a fundamental misconception somewhere that is weaving through their conclusions.
Others will try to correct their wrong worldview, but because the Lone Confused Expert is convinced of their own intelligence, they view these conversations as further evidence that they know a great deal and that everyone around them is wrong, and doesn't understand.
I'm fascinated by the Lone Confused Expert. I want to understand the one wrong turn they took. One of the things I like about teaching is seeing people's different views (some right, some wrong) on the topics we're discussing. Understanding how others grasp a concept teaches me something about the concept, and about the people.
But the LCE is just a tantalizing mystery, because we never get to uncover their fundamental understandings. The discussions just turn into giant foodfights over their incorrect conclusions.
As an example, recently in the #python IRC channel, someone learning Python said (paraphrased),
I'd like to know what this person thought a dict was, and how they missed its essential nature, which is nothing like what other people call lists. Perhaps they were thinking of Lisp's association lists? That seems unlikely because they were also very dismissive of languages other than C/C++.
Typical of The Lone Confused Expert, the discussion balloons as more people see the odd misconceptions being defended as a higher truth. The more people flow in to try to correct The Expert, the more they stick to their guns and mock the sheeple that simply believe what they've been told rather than attaining their rarer understanding.
Two more examples, from the Python-List mailing list:
At a certain level, these statements are simply wrong. But I think somewhere deep in The Lone Confused Expert's mind, there's a kernel of truth that's been misapplied, some principle that's been extended beyond its utility, to produce these ideas. I want to understand that process. I want to see where they stepped off the path.
There's just no way to get at it, because the LCE won't examine and discuss their own beliefs. Challenges are viewed as attacks on their intelligence, which they hold in higher esteem than their knowledge.
In idle moments, these statements come back to me, and I try to puzzle through what the thought process could be. How can someone know what a punched card is, but also think that characters on it cannot be tokenized?
I wonder if a face-to-face discussion would work better. People can be surprisingly different in person than they are online. It's easy to feel attacked if you have a dozen people talking to you at once. I've never had the opportunity to meet one of these Lone Confused Experts in real life. Maybe I don't want to?
A new alpha of Coverage.py 4.0 is available: coverage.py 4.0a6. (Yes, there are many alphas: I'm changing a lot, and want to let it bake well before locking things in.)
Also available is the latest version of the Django coverage plugin: django_coverage_plugin 0.5. This uses the new plugin support in Coverage.py 4.0 to implement coverage measurement of Django templates.
Other changes since 4.0a5:
53 is a hex-palindrome: 53₁₀ = 35₁₆, or in programming terminology: 53 == 0x35.
In fact, the only numbers with this property (other than the trivial single-digit numbers) are multiples of 53:
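A quick way to hunt for these numbers, as a sketch. I'm assuming the strict reading of the property: the decimal digits, reversed, are exactly the hex digits.

```python
def is_hex_palindrome(n):
    """True if n's decimal digits are its hex digits in reverse order."""
    return str(n) == format(n, "x")[::-1]

# Start at 10 to skip the trivial single-digit numbers.
found = [n for n in range(10, 1000) if is_hex_palindrome(n)]
print(found)    # [53, 371]

all_multiples_of_53 = all(n % 53 == 0 for n in found)
```

Widening the range finds more, at the cost of a longer search.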
Update: I asked for a proof of this on math.stackexchange.com, and I got one!
Of the popular Python static checkers, pylint seems to be the most forceful: it raises alarms more aggressively than the others. This can be annoying, but thankfully it also has detailed controls over what it complains about.
It is also extensible: you can write plugins that add checkers for your code. At edX, we've started doing this for problems we see that pylint doesn't already check for.
edx-lint is our repo of pylint extras, including plugins and a simple tool for keeping a central pylintrc file and using it in a number of repos.
The documentation for pylint internals is not great. It exists, but too quickly recommends reading the source to understand what's going on. The good news is that all of the built-in pylint checkers use the same mechanisms you will, so there are plenty of examples to follow.
A pylint checker is basically an abstract syntax tree (AST) walker, but over a richer AST than Python provides natively. Writing a checker involves some boilerplate that I don't entirely understand, but the meat of it is a simple function that examines the AST.
One problem we've had in our code is getting engineers to understand the idiosyncratic way that translation functions are used. When you use the gettext functions in your code, you have to use a literal string as the first argument. This is because the function will not only be called at runtime, but is also analyzed statically by the string extraction tools.
So this is good:
but this won't work properly:
The difference is subtle, but crucial. And both will work with the English string, so the bug can be hard to catch. So we wrote a pylint checker to flag the bad case.
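A small illustration of the two cases, using the standard library's gettext rather than our actual code. The function names are made up:

```python
from gettext import gettext as _

def greeting_good():
    # Good: the extraction tools can see the literal string.
    return _("Welcome!")

def greeting_bad():
    # Bad: this behaves the same in English, but the extractor only
    # sees a variable, so the string never reaches the translators.
    message = "Welcome!"
    return _(message)
```

With no translation catalog installed, both return the English text, which is exactly why the mistake is easy to miss.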
The checker is i18n_check.py, and here is the important part:
Because the method is named "visit_callfunc", it will be invoked for every function call found in the code. The "node" variable is the AST node for the function call. In the first line, we look at the expression for the function being called. It could be a name, or it could be some other expression. Most function calls will be a simple name, but if it isn't a name, then we don't know enough to tell if this is one of the translation functions, so we return without flagging a problem.
Next we look at the name of the function. If it isn't one of the dozen or so functions that will translate the string, then we aren't interested in this function call, so again, return without taking any action.
The next check is to see if this checker is even enabled. I think there's a better way to do this, but I'm not sure.
Finally we can do the interesting check: we look at the first argument to the function, which, remember, is not a calculated value, but a node in the abstract syntax tree representing the code that will calculate the value.
The only acceptable value is a string constant. So we can check if the first argument is a Const node. Then we can examine the actual literal value, to see that it's a string. If it is, then everything is good, and we can return without an alarm.
But if the first argument is not a string constant, then we can use self.add_message to add a warning message to the pylint output. Elsewhere in the file, we defined MESSAGE_ID to refer to the message:
Our add_message call uses that string, providing an argument for the string formatter, so the message will have the actual function name in it, and also provides the AST node, so that the message can indicate the file and line where the problem happened.
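To show the logic without pylint's boilerplate, here is the same sequence of checks sketched with the standard library's ast module. This is not the pylint/astroid API: the class name, the short list of function names, and the problems list are all stand-ins for illustration.

```python
import ast

# Hypothetical subset of translation function names; a real list is longer.
TRANSLATION_FUNCTIONS = {"_", "gettext", "ugettext", "ngettext"}

class TranslationArgChecker(ast.NodeVisitor):
    """Flag translation calls whose first argument isn't a string literal."""

    def __init__(self):
        self.problems = []      # (line number, function name) pairs

    def visit_Call(self, node):
        self.generic_visit(node)    # also look at calls nested inside this one
        # Only simple names can be recognized as translation functions.
        if not isinstance(node.func, ast.Name):
            return
        if node.func.id not in TRANSLATION_FUNCTIONS:
            return
        first = node.args[0] if node.args else None
        # The only acceptable first argument is a string constant.
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            return
        self.problems.append((node.lineno, node.func.id))

def check_source(source):
    checker = TranslationArgChecker()
    checker.visit(ast.parse(source))
    return checker.problems

sample = (
    'welcome = _("Welcome!")\n'
    'welcome = _("Welcome, %s!" % name)\n'
    'size = len(name)\n'
)
problems = check_source(sample)
```

In this sketch a translation call with no arguments at all is also flagged, which is my own choice rather than something the real checker is known to do.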
That's the whole checker. If you're interested, the edx-lint repo also shows how to test checkers, which is done with sample .py files, and .txt files with the pylint messages they should generate.
We have a few other checkers also: checks that setUp and tearDown call their super() counterparts properly, and a check that range isn't called with a needless first argument.
The checker I'd like to write is one that can tell you that this:
should be re-written as:
and other similar improvements to test assertions.
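One plausible instance of that kind of rewrite, purely as an illustration (the test class and values are made up):

```python
import unittest

class AssertStyles(unittest.TestCase):
    def test_with_assert_true(self):
        # Works, but a failure would only say "False is not true".
        self.assertTrue(1 + 1 == 2)

    def test_with_assert_equal(self):
        # Same check, but a failure would report both values.
        self.assertEqual(1 + 1, 2)
```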
Once you write a pylint checker, you start to get ideas for others that might work well. I can see it becoming a kind of mania...
Working in a Python project, it's common to have a clean-up step that deletes all the .pyc files, like this:
This works great, but there's a slight chance of a problem: Git records information about branches in files within the .git directory. These files have the same name as the branch.
This makes a branch called "cleanup-all-.pyc". After making a commit, I will have files named .git/refs/heads/cleanup-all-.pyc and .git/logs/refs/heads/cleanup-all-.pyc. Now if I run my find command, it will delete those files inside the .git directory, and my branch will be lost.
One way to fix it is to tell find not to delete the file if it's found in the .git directory:
A better way is:
The first command examines every file in .git, but won't delete the .pyc it finds there. The second command will skip the entire .git directory, and not waste time examining it.
UPDATE: I originally had -delete in that latter command, but find doesn't like -prune and -delete together. It seems simplistic and unfortunate, but there it is.
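If you do this clean-up from Python instead of the shell, the same pruning idea can be sketched with os.walk. The function name and the scratch directory layout are invented for the demo:

```python
import os
import tempfile

def delete_pyc(top):
    """Delete *.pyc files under `top`, never descending into .git."""
    deleted = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Editing dirnames in place tells os.walk not to descend there,
        # which plays the role of find's -prune.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            if name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                os.remove(path)
                deleted.append(path)
    return deleted

# Demo in a scratch directory that mimics the situation described above.
top = tempfile.mkdtemp()
branch_file = os.path.join(top, ".git", "refs", "heads", "cleanup-all-.pyc")
stale_pyc = os.path.join(top, "pkg", "mod.pyc")
for path in (branch_file, stale_pyc):
    os.makedirs(os.path.dirname(path))
    with open(path, "w"):
        pass

deleted = delete_pyc(top)
```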
A recent pull request for coverage.py by Conrad Ho added a timestamp to the HTML report pages. Of course, it included tests. They needed a little cleaning up, because they dealt with the current time, and that always gets involved.
The original test looked like this:
Here, run_coverage creates the HTML report, then the test reads the HTML file directly, computes the expected timestamp, and then checks that the expected timestamp is in the file.
Seems straightforward enough, but there's a problem. Deep inside run_coverage is a call to datetime.now() to get the current time to create the timestamp. Then in our test, we call datetime.now() again to create the expected timestamp. The problem is that because we call now() twice, the two calls return different times. Even when formatted to hours and minutes, as we do, the two timestamps can differ if the minute rolls over between the calls.
This test will very occasionally fail: it is a flaky test, which is a very bad thing. Some of the existing tests in the test suite weren't changed in this pull request, but they also become flaky. They looked kind of like this:
Here, we're creating two different HTML reports, and asserting that they are the same. But run_coverage() in each calls now() at different times, so the timestamps can differ in them. Some might say that the chances are really small, and a very occasional test failure is not worth the extra complexity. True story: the first time these tests were run on Travis, they failed because of different timestamps!
One way to solve time problems like this is to mock out datetime.now(), but that can be complicated. So I took different approaches.
The second tests were straightforward to make impervious to the time changes. In that case, I amended get_html_index_content to strip out the timestamp:
Now the text of index.html doesn't have the timestamp, so the value of now() doesn't matter, and the tests aren't flaky. These are tests of other aspects than the timestamp, so it's fine to just remove the timestamp.
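A sketch of that kind of scrubbing. The "created at YYYY-MM-DD HH:MM" wording and the function name are assumptions for illustration, not necessarily the report's real format:

```python
import re

def scrub_timestamp(html):
    """Replace a 'created at YYYY-MM-DD HH:MM' stamp with a fixed marker."""
    return re.sub(
        r"created at \d{4}-\d{2}-\d{2} \d{2}:\d{2}",
        "created at TIMESTAMP",
        html,
    )

# Two reports that differ only in the minute they were created.
first = "<p>created at 2015-03-01 11:59</p><p>90%</p>"
second = "<p>created at 2015-03-01 12:00</p><p>90%</p>"
same_after_scrub = scrub_timestamp(first) == scrub_timestamp(second)
```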
But the first tests were about the timestamp itself, so we can't just scrub it from the output. For those tests, I chose a different approach: extract the timestamp from the HTML, and check that it is a very recent timestamp:
Here I have a new method, assert_correct_timestamp. It takes the content of the HTML, extracts the timestamp with a regex, converts it into a datetime, and then checks that the datetime is recent. This fixes the flaky test: it will not fail due to shifting time windows.
But now the test method has a bunch of code for figuring out if the datetime is recent. And it has a bug: I used abs(age.seconds) < 120, which will pass if the datetime is in the near future as well as in the near past.
This test has two ideas in it: get the timestamp from the HTML code, and check if it is recent. Better would be to factor out that second part into its own datetime assert method:
This assert method is purely about datetimes and their recency. We've fixed the bug with the near future. Now we can test this assert method directly to be sure we have the logic right:
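A sketch of what such an assert method and its direct tests could look like. The mixin and test class names, and the default of 10 seconds, are my own choices here:

```python
import datetime
import unittest

class DatetimeAssertsMixin:
    def assert_recent_datetime(self, dt, seconds=10, msg=None):
        """Assert that `dt` marks a time at most `seconds` seconds ago."""
        age = datetime.datetime.now() - dt
        # total_seconds() is signed: a datetime in the future gives a
        # negative age, which the first assertion rejects.
        self.assertGreaterEqual(age.total_seconds(), 0, msg)
        self.assertLessEqual(age.total_seconds(), seconds, msg)

class RecentDatetimeTest(DatetimeAssertsMixin, unittest.TestCase):
    def test_now_is_recent(self):
        self.assert_recent_datetime(datetime.datetime.now())

    def test_a_little_while_ago_is_recent(self):
        dt = datetime.datetime.now() - datetime.timedelta(seconds=5)
        self.assert_recent_datetime(dt)

    def test_long_ago_is_not_recent(self):
        dt = datetime.datetime.now() - datetime.timedelta(days=1, seconds=5)
        with self.assertRaises(AssertionError):
            self.assert_recent_datetime(dt)

    def test_near_future_is_not_recent(self):
        dt = datetime.datetime.now() + datetime.timedelta(seconds=5)
        with self.assertRaises(AssertionError):
            self.assert_recent_datetime(dt)
```

Testing the helper with datetimes we construct ourselves means the boundary cases get exercised on every run, rather than only when the clock happens to cooperate.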
And with all that in place, we can simplify our HTML report test:
I like giving talks. I spend a lot of time on my presentation slides, and have a typically idiosyncratic toolchain for them. This is how I make them. Note: I am not recommending that anyone else make slides this way. If you like it, fine, but most people will prefer more common tools.
I generally favor text-based tools over WYSIWYG, and slides are no exception. For simple presentations, I will use Google Docs. But PyCon talks are not simple. They usually involve technical details, or involved explanations, and I want to have code helping me make them. I choose text tools for the control they give me, not for convenience.
HTML-based presentations are popular, and they suit my need for text-based tooling. Other options include Markdown- or ReST-based tools, but they remove control rather than provide it, so I prefer straight-up HTML.
There are a number of HTML-based presentation tools, like impress.js and reveal.js. For reasons lost in the mists of time, I long ago chose one that no one else seems to use: Slippy. Maybe someday I will switch, but Slippy does what I need.
To make a Slippy presentation, I create a .html file, open it in vim, and start typing. Each slide is a <div class="slide">. To see and present the slides, I just open that HTML file in a browser. If you want to see an actual artifact, click the "actual presentation" link on any of my recent talks, or take a look at the repo for one of them:
When I need more power than just my own typing, I want to use Python to produce content. In Pragmatic Unicode, I used it to produce tables of character translations, and to run the Python code samples. In Names and Values, I used it to write Cupid figures.
To run Python code that can create content in my HTML file, I use Cog, a tool I wrote that can execute Python code inline in a text file, and put the output back into the file. I originally wrote it to solve a different problem, but it works great here. It lets me stick with a workflow where I have one file that contains both the program and result.
Sometimes, I don't need Cog. Loop Like a Native is just static text, with no need for generated content, so it doesn't use Cog.
For explaining code, it's very helpful to be able to highlight individual lines in a snippet on the screen. I couldn't find a way to do this, so I wrote lineselect.js, a jQuery plugin to let me select individual lines. While presenting, I use a presentation remote with volume control buttons, and remap those keys to j and k so that I can manually move the line selection as I talk.
As I write the presentation, I like working out what I am going to say by writing it out in English. This helps me find the right way to explain things, but has another huge advantage: it means I have a written presentation as well as a visual one. It frustrates me to hear about someone's great presentation, and then to have two options of how to learn from it: either watch a video, or look at slides with no words behind them.
When I write the English, I put it into the .html file also, interleaved with the slides, as <div class="text">. CSS lets me hide those divs during the presentation, but I can work in my HTML file and see the slides near the text.
For publication on my site, I have a Python program that parses the HTML and extracts the text divs into a .px file for insertion into my typically idiosyncratic site publication toolchain.
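My actual program isn't shown here, but the extraction step can be sketched with the standard library's html.parser. The class name is hypothetical, and this version keeps only the text, discarding any markup inside the text divs:

```python
from html.parser import HTMLParser

class TextDivExtractor(HTMLParser):
    """Collect the text inside each <div class="text">."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # div nesting depth inside a text div; 0 = outside
        self.chunks = []    # one string per text div
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1     # a nested div inside a text div
        elif "text" in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self._current = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.chunks.append("".join(self._current).strip())

    def handle_data(self, data):
        if self.depth:
            self._current.append(data)

page = (
    '<div class="slide"><h1>Title</h1></div>'
    '<div class="text"><p>Hello <b>world</b></p></div>'
)
extractor = TextDivExtractor()
extractor.feed(page)
texts = extractor.chunks
```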
Producing that .px file also involves producing PNGs from the slides. Slippy comes with a phantomjs program to do this which works well. The px-producing program inserts those PNGs into the page.
As I say, I'm not explaining this to convince you to make slides this way. Most people will vastly prefer a more convenient set of tools. I like the control this gives me, and I like writing the kind of tooling I need to make them this way. To each her own.
My dad and stepmother were here for lunch yesterday. It happened to be their 45th wedding anniversary, so we made them a cake. Not anniversary-themed, but theater-themed, because that is a huge passion of theirs. They both worked in the theater, and continue to help run the Barnstormers Theater in New Hampshire.
So we made a theater cake, with stage, house (where the audience sits), proscenium, lights, curtains, and some kind of confused production going on:
The view from backstage:
You can't see the seats, they are Rolos with Hershey bar backs. Sorry for the poor photo quality...