|Ned Batchelder : Blog | Code | Text | Site|
A new alpha of Coverage.py 4.0 is available: coverage.py 4.0a6. (Yes, there are many alphas: I'm changing a lot, and want to let it bake well before locking things in.)
Also available is the latest version of the Django coverage plugin: django_coverage_plugin 0.5. This uses the new plugin support in Coverage.py 4.0 to implement coverage measurement of Django templates.
Other changes since 4.0a5:
53 is a hex-palindrome: 53₁₀ = 35₁₆, or in programming terminology: 53 == 0x35.
In fact, the only numbers with this property (other than the trivial single-digit numbers) are multiples of 53:
Update: I asked for a proof of this on math.stackexchange.com, and I got one!
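The property is easy to check by brute force. Here is a minimal sketch; the function names are mine, and the search limit is arbitrary:

```python
def is_hex_palindrome(n):
    """True if n's decimal digits, reversed, spell n in hexadecimal."""
    # Single-digit numbers are the trivial cases, so skip them.
    return n >= 10 and str(n)[::-1] == format(n, "x")


def hex_palindromes(limit):
    """All non-trivial hex-palindromes below limit."""
    return [n for n in range(10, limit) if is_hex_palindrome(n)]


if __name__ == "__main__":
    for n in hex_palindromes(1000):
        print("%d == 0x%x, and %d == 53 * %d" % (n, n, n, n // 53))
```

Raising the limit lets you look for larger examples and check that each one is divisible by 53.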
Of the popular Python static checkers, pylint seems to be the most forceful: it raises alarms more aggressively than the others. This can be annoying, but thankfully it also has detailed controls over what it complains about.
It is also extensible: you can write plugins that add checkers for your code. At edX, we've started doing this for problems we see that pylint doesn't already check for.
edx-lint is our repo of pylint extras, including plugins and a simple tool for keeping a central pylintrc file and using it in a number of repos.
The documentation for pylint internals is not great. It exists, but too quickly recommends reading the source to understand what's going on. The good news is that all of the built-in pylint checkers use the same mechanisms you will, so there are plenty of examples to follow.
A pylint checker is basically an abstract syntax tree (AST) walker, but over a richer AST than Python provides natively. Writing a checker involves some boilerplate that I don't entirely understand, but the meat of it is a simple function that examines the AST.
One problem we've had in our code is getting engineers to understand the idiosyncratic way that translation functions are used. When you use the gettext functions in your code, you have to use a literal string as the first argument. This is because the function will not only be called at runtime, but is also analyzed statically by the string extraction tools.
So this is good:
but this won't work properly:
The difference is subtle, but crucial. And both will work with the English string, so the bug can be hard to catch. So we wrote a pylint checker to flag the bad case.
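To make the two cases concrete, here is a small illustration using the standard library's gettext. The function names and the message are made up for the example, and they are not taken from our code:

```python
from gettext import gettext as _


def welcome_good(name):
    # The first argument to _() is a literal string, so extraction tools
    # can find it, and the placeholder is filled in after translation.
    return _("Welcome, {name}!").format(name=name)


def welcome_bad(name):
    # The string is formatted first, so _() receives a computed value.
    # No catalog will ever contain "Welcome, Ada!", so it never gets translated.
    return _("Welcome, {name}!".format(name=name))


if __name__ == "__main__":
    # With no translation catalog installed, both return the English text,
    # which is why the bug is easy to miss.
    print(welcome_good("Ada"))
    print(welcome_bad("Ada"))
```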
The checker is i18n_check.py, and here is the important part:
Because the method is named "visit_callfunc", it will be invoked for every function call found in the code. The "node" variable is the AST node for the function call. In the first line, we look at the expression for the function being called. It could be a name, or it could be some other expression. Most function calls will be a simple name, but if it isn't a name, then we don't know enough to tell if this is one of the translation functions, so we return without flagging a problem.
Next we look at the name of the function. If it isn't one of the dozen or so functions that will translate the string, then we aren't interested in this function call, so again, return without taking any action.
The next check is to see if this checker is even enabled. I think there's a better way to do this, but I'm not sure.
Finally we can do the interesting check: we look at the first argument to the function, which, remember, is not a calculated value, but a node in the abstract syntax tree representing the code that will calculate the value.
The only acceptable value is a string constant. So we can check if the first argument is a Const node. Then we can examine the actual literal value, to see that it's a string. If it is, then everything is good, and we can return without an alarm.
But if the first argument is not a string constant, then we can use self.add_message to add a warning message to the pylint output. Elsewhere in the file, we defined MESSAGE_ID to refer to the message:
Our add_message call uses that string, providing an argument for the string formatter, so the message will have the actual function name in it, and also provides the AST node, so that the message can indicate the file and line where the problem happened.
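If you want to experiment with the same logic without installing pylint, here is a rough sketch built on the standard library's ast module instead of pylint's richer tree. This is not the edx-lint checker: the class name, the function list, and the way problems are collected are all assumptions for illustration.

```python
import ast

# A hypothetical subset of translation function names, for illustration only.
TRANSLATION_FUNCTIONS = {"_", "gettext", "ngettext", "ugettext", "ungettext"}


class TranslationArgChecker(ast.NodeVisitor):
    """Collect translation calls whose first argument isn't a string literal."""

    def __init__(self):
        self.problems = []

    def visit_Call(self, node):
        # Keep walking so calls nested inside this one are checked too.
        self.generic_visit(node)
        if not isinstance(node.func, ast.Name):
            return  # Not a simple name: we can't tell what's being called.
        if node.func.id not in TRANSLATION_FUNCTIONS:
            return  # Not a translation function.
        if not node.args:
            return  # Nothing to examine.
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            return  # A string literal: the good case.
        self.problems.append((node.lineno, node.func.id))


def check_source(source):
    """Return (line number, function name) pairs for the bad calls in source."""
    checker = TranslationArgChecker()
    checker.visit(ast.parse(source))
    return checker.problems
```

A real pylint checker reports through self.add_message rather than collecting tuples, but the shape of the visit method is the same.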
That's the whole checker. If you're interested, the edx-lint repo also shows how to test checkers, which is done with sample .py files, and .txt files with the pylint messages they should generate.
We have a few other checkers also: checks that setUp and tearDown call their super() counterparts properly, and a check that range isn't called with a needless first argument.
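The range check is simple enough to sketch with the standard library's ast module. Again, this is an illustration under my own assumptions and not the edx-lint code; in particular, I'm assuming "needless first argument" means a literal 0 in a two-argument call:

```python
import ast


def needless_range_starts(source):
    """Return line numbers of two-argument range() calls that start at a literal 0."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        if not (isinstance(node.func, ast.Name) and node.func.id == "range"):
            continue
        # With three arguments the start is required, so only flag two.
        if len(node.args) != 2:
            continue
        start = node.args[0]
        if (
            isinstance(start, ast.Constant)
            and type(start.value) is int
            and start.value == 0
        ):
            lines.append(node.lineno)
    return sorted(lines)
```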
The checker I'd like to write is one that can tell you that this:
should be re-written as:
and other similar improvements to test assertions.
Once you write a pylint checker, you start to get ideas for others that might work well. I can see it becoming a kind of mania...
Working in a Python project, it's common to have a clean-up step that deletes all the .pyc files, like this:
This works great, but there's a slight chance of a problem: Git records information about branches in files within the .git directory. These files have the same name as the branch.
This makes a branch called "cleanup-all-.pyc". After making a commit, I will have files named .git/refs/heads/cleanup-all-.pyc and .git/logs/refs/heads/cleanup-all-.pyc. Now if I run my find command, it will delete those files inside the .git directory, and my branch will be lost.
One way to fix it is to tell find not to delete the file if it's found in the .git directory:
A better way is:
The first command examines every file in .git, but won't delete the .pyc it finds there. The second command will skip the entire .git directory, and not waste time examining it.
UPDATE: I originally had -delete in that latter command, but find doesn't like -prune and -delete together. It seems simplistic and unfortunate, but there it is.
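The same prune-don't-descend idea can be written in Python, if you'd rather have the clean-up step in a script. This is a sketch, and the function name is my own. It relies on the documented behavior of os.walk that removing an entry from dirnames in place stops the walk from entering that directory:

```python
import os


def remove_pyc_files(root):
    """Delete .pyc files under root, never looking inside any .git directory."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place: os.walk won't descend into directories removed here.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for filename in filenames:
            if filename.endswith(".pyc"):
                path = os.path.join(dirpath, filename)
                os.remove(path)
                removed.append(path)
    return removed


if __name__ == "__main__":
    for path in remove_pyc_files("."):
        print("removed", path)
```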
A recent pull request for coverage.py by Conrad Ho added a timestamp to the HTML report pages. Of course, it included tests. They needed a little cleaning up, because they dealt with the current time, and that always gets involved.
The original test looked like this:
Here, run_coverage creates the HTML report, then the test reads the HTML file directly, computes the expected timestamp, and then checks that the expected timestamp is in the file.
Seems straightforward enough, but there's a problem. Deep inside run_coverage is a call to datetime.now() to get the current time to create the timestamp. Then in our test, we call datetime.now() again to create the expected timestamp. The problem is that because we call now() twice, they will return different times. Even formatting to hours and minutes as we do, the timestamps could be different.
This test will very occasionally fail: it is a flaky test, which is a very bad thing. Some of the existing tests in the test suite weren't changed in this pull request, but they also became flaky. They looked kind of like this:
Here, we're creating two different HTML reports, and asserting that they are the same. But run_coverage() in each calls now() at different times, so the timestamps can differ in them. Some might say that the chances are really small, and a very occasional test failure is not worth the extra complexity. True story: the first time these tests were run on Travis, they failed because of different timestamps!
One way to solve time problems like this is to mock out datetime.now(), but that can be complicated. So I took different approaches.
The second tests were straightforward to make impervious to the time changes. In that case, I amended get_html_index_content to strip out the timestamp:
Now the text of index.html doesn't have the timestamp, so the value of now() doesn't matter, and the tests aren't flaky. These are tests of other aspects than the timestamp, so it's fine to just remove the timestamp.
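A scrubbing helper along these lines is only a few lines of code. This is a sketch, not the coverage.py test code: the "created at YYYY-MM-DD HH:MM" footer format and the function name are assumptions for illustration.

```python
import re

# Assumed footer format: "created at 2015-04-18 10:32".
TIMESTAMP_RE = r"created at \d{4}-\d{2}-\d{2} \d{2}:\d{2}"


def scrub_timestamp(html):
    """Replace the report timestamp with a fixed placeholder."""
    return re.sub(TIMESTAMP_RE, "created at YYYY-MM-DD HH:MM", html)
```

Two reports that differ only in their timestamps compare equal once both have been scrubbed.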
But the first tests were about the timestamp itself, so we can't just scrub it from the output. For those tests, I chose a different approach: extract the timestamp from the HTML, and check that it is a very recent timestamp:
Here I have a new method, assert_correct_timestamp. It takes the content of the HTML, extracts the timestamp with a regex, converts it into a datetime, and then checks that the datetime is recent. This fixes the flaky test: it will not fail due to shifting time windows.
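The extraction half of that method might look something like this sketch. As before, the footer format and the helper name are assumptions, not the real coverage.py code:

```python
import datetime
import re


def extract_timestamp(html):
    """Find the assumed "created at YYYY-MM-DD HH:MM" footer and parse it."""
    match = re.search(r"created at (\d{4}-\d{2}-\d{2} \d{2}:\d{2})", html)
    if match is None:
        raise ValueError("no timestamp found in the HTML")
    return datetime.datetime.strptime(match.group(1), "%Y-%m-%d %H:%M")
```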
But now the test method has a bunch of code for figuring out if the datetime is recent. And it has a bug: I used abs(age.seconds) < 120, which will pass if the datetime is in the near future as well as in the near past.
This test has two ideas in it: get the timestamp from the HTML code, and check if it is recent. Better would be to factor out that second part into its own datetime assert method:
This assert method is purely about datetimes and their recency. We've fixed the bug with the near future. Now we can test this assert method directly to be sure we have the logic right:
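Here is a sketch of the recency logic as a plain function, so it can be tried outside a test class. The name, the default window, and the injectable now parameter are my own choices for illustration. Using total_seconds() means whole days count toward the age, and requiring the age to be non-negative rejects datetimes in the future:

```python
import datetime


def is_recent(dt, seconds=10, now=None):
    """True if dt is in the past, and no more than `seconds` seconds ago."""
    if now is None:
        now = datetime.datetime.now()
    age = (now - dt).total_seconds()
    return 0 <= age <= seconds


if __name__ == "__main__":
    now = datetime.datetime.now()
    print(is_recent(now - datetime.timedelta(seconds=5)))   # just happened
    print(is_recent(now + datetime.timedelta(seconds=5)))   # in the future
    print(is_recent(now - datetime.timedelta(days=1)))      # yesterday
```

Passing now explicitly keeps tests of the helper itself from depending on the clock.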
And with all that in place, we can simplify our HTML report test:
I like giving talks. I spend a lot of time on my presentation slides, and have a typically idiosyncratic toolchain for them. This is how I make them. Note: I am not recommending that anyone else make slides this way. If you like it, fine, but most people will prefer more common tools.
I generally favor text-based tools over WYSIWYG, and slides are no exception. For simple presentations, I will use Google Docs. But PyCon talks are not simple. They usually involve technical details, or involved explanations, and I want to have code helping me make them. I choose text tools for the control they give me, not for convenience.
HTML-based presentations are popular, and they suit my need for text-based tooling. Other options include Markdown- or ReST-based tools, but they remove control rather than provide it, so I prefer straight-up HTML.
There are a number of HTML-based presentation tools, like impress.js and reveal.js. For reasons lost in the mists of time, I long ago chose one that no one else seems to use: Slippy. Maybe someday I will switch, but Slippy does what I need.
To make a Slippy presentation, I create a .html file, open it in vim, and start typing. Each slide is a <div class="slide">. To see and present the slides, I just open that HTML file in a browser. If you want to see an actual artifact, click the "actual presentation" link on any of my recent talks, or take a look at the repo for one of them:
When I need more power than just my own typing, I want to use Python to produce content. In Pragmatic Unicode, I used it to produce tables of character translations, and to run the Python code samples. In Names and Values, I used it to write Cupid figures.
To run Python code that can create content in my HTML file, I use Cog, a tool I wrote that can execute Python code inline in a text file, and put the output back into the file. I originally wrote it to solve a different problem, but it works great here. It lets me stick with a workflow where I have one file that contains both the program and result.
Sometimes, I don't need Cog. Loop Like a Native is just static text, with no need for generated content, so Cog isn't used there.
For explaining code, it's very helpful to be able to highlight individual lines in a snippet on the screen. I couldn't find a way to do this, so I wrote lineselect.js, a jQuery plugin to let me select individual lines. While presenting, I use a presentation remote with volume control buttons, and remap those keys to j and k so that I can manually move the line selection as I talk.
As I write the presentation, I like working out what I am going to say by writing it out in English. This helps me find the right way to explain things, but has another huge advantage: it means I have a written presentation as well as a visual one. It frustrates me to hear about someone's great presentation, and then to have two options of how to learn from it: either watch a video, or look at slides with no words behind them.
When I write the English, I put it into the .html file also, interleaved with the slides, as <div class="text">. CSS lets me hide those divs during the presentation, but I can work in my HTML file and see the slides near the text.
For publication on my site, I have a Python program that parses the HTML and extracts the text divs into a .px file for insertion into my typically idiosyncratic site publication toolchain.
Producing that .px file also involves producing PNGs from the slides. Slippy comes with a phantomjs program to do this which works well. The px-producing program inserts those PNGs into the page.
As I say, I'm not explaining this to convince you to make slides this way. Most people will vastly prefer a more convenient set of tools. I like the control this gives me, and I like writing the kind of tooling I need to make them this way. To each her own.
My dad and stepmother were here for lunch yesterday. It happened to be their 45th wedding anniversary, so we made them a cake. Not anniversary-themed, but theater-themed, because that is a huge passion of theirs. They both worked in the theater, and continue to help run the Barnstormers Theater in New Hampshire.
So we made a theater cake, with stage, house (where the audience sits), proscenium, lights, curtains, and some kind of confused production going on:
The view from backstage:
You can't see the seats, but they are Rolos with Hershey bar backs. Sorry for the poor photo quality...
I am on the plane back to Boston from PyCon 2015 in Montreal. You've probably read over and over again that PyCon is the best conference ever, yadda-yadda. I haven't been to another conference in a long time, so I don't have points of comparison. I can tell you that PyCon feels like a huge family reunion.
I started on Thursday, and was not feeling part of things. I don't know why. I thought perhaps 9 PyCons in a row is too many. I thought maybe I should be spending my energies elsewhere.
But Friday, I started the day by helping with the keynotes, keeping time, tracking down speakers, and so on. I felt involved. I was helping friends with things they needed to do.
PyCon is almost entirely organized and run by volunteers. There is one employee, all the rest is done by people just helping as a side project. I think this gives the event a tone of something you do, rather than something you attend or consume. Anyone can volunteer to make things happen, and it can be a really good way to meet people.
There are 2500 people at PyCon, but we are all in the same group. There isn't an entire cadre of paid staff on one side, and attendees on the other. We're all making the conference happen in our own ways. It's an open-source conference in the truest sense of the word.
My co-worker Adam Palay gave his talk early on Friday. I'd first seen Adam speak in a lightning talk at Boston Python. His girlfriend Anne was there to record him. They seemed supportive and close. I really liked the talk he gave, and told him so. When the call for talks opened for PyCon, he let me know he was submitting a proposal, and I helped him where I could.
His talk was accepted, along with mine and two other speakers from edX. For each talk, we had a rehearsal at work, and at a Boston Python rehearsal night. Each time Adam rehearsed his talk, his girlfriend Anne and his brother Josh were there. I was impressed by their support. It turned out Anne was going to not only come to Montreal, but attend the conference with him.
Friday morning at PyCon, I went to Adam's talk. Sitting in the second row was Anne. Next to her was Josh. Next to him was Adam's sister, and on either side were his mother and father, all with conference badges! I joked about "Team Palay", and that the five of them should have held up cards spelling P-A-L-A-Y.
Clearly, this level of support from a family is unusual, to take the time, buy airfare and hotel, and pay the conference fees, just to see Adam present his 30-minute talk at a technical conference.
I'm explaining all this about Adam's supportive family because when I am at PyCon, I feel a bit like Adam must all the time. I am surrounded by friends who feel like family. We are brought together by an odd esoteric shared interest, but we come together each year, and interact online throughout the year. We are together to talk about technical topics, but it goes beyond that.
I know this must sound like a greeting card or something. Don't get me wrong: like any family, there is friction. I don't like everyone in the Python world. But so many people at PyCon know each other and have built relationships over years, there are plenty of friendly faces all around.
All those friendly faces give rise to an effect my devops guy Feanil coined "Ned latency": the extra time I have to figure in when planning to be at a certain place at a certain time. When traveling over any significant distance at PyCon, there will be people I want to stop and talk to.
This is called the "hallway track": the social or technical activity that happens in the hallways all during the day, regardless of the track talks. I've spoken to people at PyCon who've said, "I haven't seen any talks!"
Last year during lunch, I happened to sit next to a woman I didn't know. We introduced ourselves. Her name was Jenny. We chatted a bit, and then headed off to our own activities. Over the next few days, I'd wave to Jenny as we passed each other on the escalators, and so on.
I saw Jenny again this year and miraculously remembered her name, so I waved and said, "Hi Jenny." This happened a few times. Later in the weekend, Jenny came up to me and said, "I want to thank you, you really made me feel welcome."
This made me really happy. I was saying hi to Jenny originally so that I would know more people, but we'd made a tiny connection that helped her in some way, and she felt strongly enough about it to tell me. Ian describes a similar dynamic from the bag-stuffing evening: just learning another person's name gives you a connection to that person that can last a surprisingly long time.
There are people I greet at PyCon purely because I've been chatting with them for five minutes once a year at every PyCon I've been to.
One of the highlights of PyCon for me is giving talks. I've spoken at the last 7 PyCons (the talks are on my text page). I put a lot of work into the talks, and am proud that they have some lasting power as things people recommend to other learners. After a talk, people always ask, "how did it go?" My answer is usually, "people seemed to like it," but the other half is, "on the inside, horrible. I know all the things I wish I had done differently!"
On Sunday evening, Shauna Gordon-McKeon and Open Hatch organized an intro to sprinting session for new contributors. I agreed to be a mentor there, thinking it would be a classroom style lecture, with mentors milling around helping people one-on-one. Turned out it was a series of 15-minute lectures at a number of stations around the room, with people shuttling between topics they wanted to hear about. I was the speaker on unit testing.
I was able to start by saying, if you really want to know about this, see my PyCon talk from last year, Getting Started Testing. Then I launched into an impromptu 15-minute overview of unit testing.
During one of the breaks, on my way to the water fountain, I passed a woman in the hallway watching the talk on her headphones. She said it was great, then later on Twitter, we had a typical PyCon love-fest.
To be able to see someone learning from something you've created is very gratifying and rewarding.
I attended one day of sprints. My main project there was Open edX, but I also said I would be sprinting on coverage.py, which I had never done before. I'd always had the feeling that coverage.py was esoteric and thorny, and it would be difficult to get contributors going. I was pleasantly surprised that five people joined me to make some headway against issues in the tracker.
But some of the interesting bugs are about branch coverage, which I had become somewhat frustrated by. I warned people that the problems might require a complete rewrite, but they were game to look into it.
Mickie Betz in particular was digging into an issue involving loops in generators. I was interested to watch her progress, and helped her with debugging techniques, but was not hopeful that there was a practical fix. To my surprise, a day later, she had submitted a pull request with a very simple solution. Mickie has restored my faith in humanity. She persevered in the face of a discouraging maintainer (me), and proved him wrong!
Another sprinter, Jon Chappell, picked up an issue that was easy but annoying to fix. Annoying because it was asking coverage.py to accommodate a stupid limitation in a different tool. It was not glamorous work, but I really appreciated him taking the task so that I didn't have to do it.
Two other sprinters, Conrad Ho and Leonardo Pistone, have each submitted a pull request, and Leonardo is also chasing down other issues. Lastly, Frederick Wagner has expressed interest in adding a warning-suppressing feature.
A very productive time, considering I was only at the sprints for about four hours. PyCon is amazing.
One thing I've never seen at PyCon is organized juggling. I considered bringing beanbags with me this time, but thought they would be heavy to carry around. Then Yelp was handing out bouncy balls at their booth, so I got four of those, and used them all weekend. It was a good way to play with people, especially once we did some pair juggling. Next year, I'll bring some serious equipment, and have a real open space (or two!) Who's in?
I don't know why I felt off the first day. PyCon is an amazing time, and now I again can't imagine missing it. It connects you to people. One afternoon, an attendee pulled me aside to show me a bug in coverage.py. I looked in the issue tracker, and saw that it had been written up four years ago by Christian Heimes, who was attending PyCon this year for the first time, and who I met at the bar on my first night!
PyCon energizes me, and cements my relationship to the entire Python world. Sometimes I wonder about a programming language as the basis for a group of people, but why not? They share my sensibilities and interests. They like what I do, and I like what they do. We move in similar circles. Do you need better reasons for a group of 2500 people to be close friends?
I gave my talk yesterday at PyCon 2015: Python Names and Values. PyCon has always been good at getting videos online, but they just keep getting better: the video was online the same day.
People ask me afterwards how the talk went. I got good reactions, but I also know what I would like to have done differently. I think I spoke too fast, and I think I should have had more practical advice about not mutating values if you can avoid it.
At least I didn't swear on stage this time...
My youngest son Ben turned 17 today. He is fascinated with mushrooms, so we made him a mushroom cake. Actually a trio of cakes:
It looks a bit like cupcakes, but no cupcakes were harmed in the making of this cake.
The main mushroom has a stem ("It's called a stipe, Dad") made of two 4.5-inch cake rounds. The cap ("pileus, Dad") was baked in the bottom of a stainless steel mixing bowl. The two stem pieces bulged more than we expected, so we sliced them off and made caps for the medium mushrooms. They are supported by stacked Ring-Dings for the stem.
The dots are mega M&M's. The tiny mushrooms are mini-marshmallows supporting white chocolate Reese's peanut butter cups. Gummi worms add character.
A cut-away view of the medium mushroom:
One of the things that is very useful about Python is its extreme introspectability and malleability. Taken too far, it can make your code an unmaintainable mess, but it can be very handy when trying to debug large and complex projects.
Open edX is one such project. Its main repository has about 200,000 lines of Python spread across 1500 files. The test suite has 8000 tests.
I noticed that running the test suite left a number of temporary directories behind in /tmp. They all had names like tmp_dwqP1Y, made by the tempfile module in the standard library. Our tests have many calls to mkdtemp, which requires the caller to delete the directory when done. Clearly, some of these cleanups were not happening.
To find the misbehaved code, I could grep through the code for calls to mkdtemp, and then reason through which of those calls eventually deleted the file, and which did not. That sounded tedious, so instead I took the fun route: an aggressive monkeypatch to find the litterbugs for me.
My first thought was to monkeypatch mkdtemp itself. But most uses of the function in our code look like this:
Because the function was imported directly, if my monkeypatching code ran after this import, the call wouldn't be patched. (BTW, this is one more small reason to prefer importing modules, and using module.function in the code.)
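If the import problem isn't obvious, this small sketch shows it. I'm using a SimpleNamespace as a hypothetical stand-in for a module, and a plain assignment to mimic what "from lib import helper" does, which is to bind a local name to the function object that exists at import time:

```python
from types import SimpleNamespace


def original():
    return "original"


def patched():
    return "patched"


# Hypothetical stand-in for a module that defines helper().
lib = SimpleNamespace(helper=original)

# Equivalent of "from lib import helper": copy the reference into a local name.
helper = lib.helper

# The monkeypatch happens after the import.
lib.helper = patched

direct_call_result = helper()      # still calls the original function
via_module_result = lib.helper()   # looks up the attribute, finds the patch
```

The local name keeps pointing at the old function, while a lookup through the module sees the replacement.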
Looking at the implementation of mkdtemp, it makes use of a helper function in the tempfile module, _get_candidate_names. This helper is a generator that produces those typical random tempfile names. If I monkeypatched that internal function, then all callers would use my code regardless of how they had imported the public function. Monkeypatching the internal helper had the extra advantage that using any of the public functions in tempfile would call that helper, and get my changes.
To find the problem code, I would put information about the caller into the name of the temporary file. Then each temp file left behind would be a pointer of sorts to the code that created it. So I wrote my own _get_candidate_names like this:
This code uses inspect.stack to get the call stack. We slice it oddly, to get the closest three calling frames in the right order. Then we extract the filenames from the frames, strip off the ".py", and concatenate them together along with the line number. This gives us a string that indicates the caller.
The real _get_candidate_names function is used to get a generator of good random names, and we add our stack inspection onto the name, and yield it.
Then we can monkeypatch our function into tempfile. Now as long as this module gets imported before any temporary files are created, the files will have names like this:
The first shows that the file was created in test_import_export.py at line 289, called from case.py line 78, from case.py line 53. The second shows that test_video.py has a few functions calling eventually into tempfile.py.
I would be very reluctant to monkeypatch private functions inside other modules for production code. But as a quick debugging trick, it works great.
Coverage.py has a trace function written in C, for speed. It uses the Python C API, which is notoriously tricky to get right because you have to manage reference counts yourself.
I've made some significant changes to the trace function recently, to add plugin support to the C tracer. Adding tests for badly behaved plugins, I managed to crash Python. Not a traceback, a for-real crash in CPython.
Naturally, this means something is wrong in my C extension. Poring over the code, I couldn't see anything amiss. I'd long been intrigued by the idea of David Malcolm's CPyChecker, a plugin to gcc that performs static path analysis to find mistakes in Python C extensions, so I decided to give it a try.
The best instructions are on A. Jesse Jiryu Davis' blog: Analyzing Python C Extensions With CPyChecker. I installed Fedora as suggested, and got the compiler running without much trouble (I just typed "yum" every time I wanted to type "apt-get").
The simple way to run the checker worked fine:
This generates very nice HTML reports (like this) in two different styles that walk you through a path through your code that leads to a bad outcome. Well, supposedly a bad outcome. I found as Jesse did that there are false positives.
With the default settings, the checker only considers 256 paths through a function then stops, to avoid combinatorial explosions. But my functions had many more paths than that.
I increased the memory size of my Fedora Vagrantfile, then told CPyChecker to push on to examine a quarter million paths:
This found a few issues, but did not resolve the crash I'm experiencing. Next step: rebuild CPython --with-pydebug.
BTW, Stefan Behnel has rewritten my extension in Cython, and I really should seriously consider switching over, so that this kind of thing doesn't happen any more.