In 1964, Richard Feynman gave a series of seven lectures at Cornell called The Character of Physical Law. They were recorded by the BBC, and are now on YouTube. These are great.

These are not advanced lectures; they were intended for a general audience, and Feynman does a great job inhabiting the world of fundamental physics. He's clearly one of the top experts, but he explains things in such a personal, approachable style that you are right alongside him as he explores this world looking for answers, following in the footsteps of Newton and Einstein.

If you've never heard Feynman, at least dip into the first one, if only to hear his deep, thick New York accent. He's also witty: he places the French Revolution in 1783, and says, "Accurate to three decimal places, not bad for a physicist!" It's disarmingly out of character for an intellectual, but Feynman is the real thing, discussing not just the basics of forces and particles, but the philosophical implications for scientists and all thinkers.

I converted the videos to pure audio and listened to them in my car, which meant I couldn't see what he was drawing on the blackboard, but it was enlightening nonetheless. Highly recommended: The Character of Physical Law.


I do a lot of side projects in the Python world. I write coverage.py. I give talks at PyCon, like the one about iterators, or the one with the Unicode sandwich. I hang out in the #python IRC channel and help people with their programs. I organize Boston Python.

I enjoy these things, and I don't get paid for them. But if you want to help me out, here's how you can: my son Max is in his last semester at NYU film school, which means he and his friends are making significant short films. These films need funding. If you've liked something I've done for you in the Python world, how about tossing some money over to a film?

Max will be doing a film of his own this semester, but his Kickstarter isn't live yet. In the meantime, he's the cinematographer on his friend Jacob's film Go To Hell. So give a little money to Jacob, and in a month or so I'll hit you up again to give a lot of money to Max. :)

The first alpha of the next major version of coverage.py is available: coverage.py v4.0a1.

The big new feature is support for the gevent, greenlet, and eventlet concurrency libraries. Previously, these libraries' behind-the-scenes stack swapping would confuse coverage.py. Now coverage adapts to give accurate coverage measurement. To enable it, use the "concurrency" setting to specify which library you are using.
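
For example, if you're using gevent, a .coveragerc along these lines should turn it on (the setting goes in the [run] section):

[run]
concurrency = gevent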

Huge thanks to Peter Portante for getting the concurrency support started, and Joe Jevnik for the final push.

Also new is that coverage.py will read its configuration from setup.cfg if there is no .coveragerc file. This lets you keep more of your project configuration in one place.
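
For example (a minimal sketch; check the docs for whether the section names in setup.cfg need a "coverage:" prefix to keep them out of other tools' way):

[coverage:run]
branch = True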

Lastly, the textual summary report now shows missing branches if you are using branch coverage.

One warning: I'm moving around lots of internals. People have a tendency to use whatever they need to get their plugin or tool to work, so some of those third-party packages may now be broken. Let me know what you find.

Full details of other changes are in the CHANGES.txt file.


I thought today was going to be a good day. I was going to release the first alpha version of coverage.py 4.0. I finally finished the support for gevent and other concurrency libraries like it, and I wanted to get the code out for people to try it.

So I made the kits and pushed them to PyPI. I used to not do that, because people would get the betas by accident. But pip now understands about pre-releases and real releases, and won't install an alpha version by default. Only if you explicitly use --pre will you get an alpha.
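
In other words:

$ pip install coverage         # installs the latest real release
$ pip install --pre coverage   # installs the newest version, alphas included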

About 10 minutes after I pushed the kits, someone I was chatting with on IRC said, "Did you just release a new version of coverage?" Turns out his Travis build was failing.

He was using coveralls to report his coverage statistics, and it was failing. Turns out coveralls uses internals from coverage.py to do its work, and I've made big refactorings to the internals, so their code was broken. But how did the alpha get installed in the first place?

He was using tox, and it turns out that when tox installs dependencies, it defaults to using the --pre switch! Why? I don't know.
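
If this bites you, one workaround is to override tox's install command in tox.ini so --pre isn't passed. A minimal sketch:

[testenv]
install_command = pip install {opts} {packages}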

OK, I figured I would just hide the new version on PyPI. That way, if people wanted to try it, they could use "pip install coverage==4.0a1", and no one else would be bothered with it. Nope: pip will find the newer version even if it is hidden on PyPI. Why? I don't know.

In my opinion:

  • Coveralls shouldn't have used coverage.py internals.
  • Tox shouldn't use the --pre switch by default.
  • Pip shouldn't install hidden versions when there is no version information specified.

So now the kit is removed entirely from PyPI while I figure out a new approach. Some possibilities, none of them great:

  1. Distribute the kit the way I used to, with a download on my site. This sucks because I don't know if there's a way to do this so that pip will find it, and I don't know if it can handle pre-built binary kits like that.
  2. Do whatever I need to do to coverage.py so that coveralls will continue to work. This sucks because I don't know how much I will have to add back, and I don't want to establish a precedent, and it doesn't solve the problem that people really don't expect to be using alphas of their testing tools on Travis.
  3. Make a new package on PyPI: coverage-prerelease, and instruct people to install from there. This sucks because tools like coveralls won't refer to it, so either you can't ever use it with coveralls, or if you install it alongside, then you have two versions of coverage fighting with each other? I think?
  4. Make a pull request against coveralls to fix their use of the now-missing coverage.py internals. This sucks (but not much) because I don't want to have to understand their code, and I don't have a simple way to run it, and I wish they had tried to stick to supported methods in the first place.
  5. Leave it broken, and let people fix it by overriding their tox.ini settings to not use --pre, or wait until people complain to coveralls and they fix their code. This sucks because there will be lots of people with broken builds.

Software is hard, yo.


A friend recommended a technical talk today: How to Design a Good API and Why it Matters by Joshua Bloch. Looks good! It's also an hour long...

For a variety of reasons, it's hard to watch an hour-long video. I'd prefer to read the same content. But it isn't available textually. For my own talks, I produce full text as part of the preparation (for example, the Unicode sandwich talk).

I've even transcribed other people's PyCon talks: Stop Mocking, Start Testing, and Speedily Practical Large-Scale Tests. It was a good way to ensure I actually watched them!

People put slide decks up on SlideShare, but decks vary wildly in how well they contain the content. Some simply provide a backdrop, which is entertaining during a talk, but useless afterward.

Is there some way we can pool efforts to get more talks transcribed or summarized? Surely others would like to see it done? And there must be people eager to contribute in some way who could spend the time? Does something like this already exist?

I know the full talk, with the real speaker really speaking to me, is the best way to get their message. For example, Richard Feynman's series The Character of Physical Law just wouldn't be the same without his accent and delivery. But if the choice is reading a lengthy summary or not getting the message at all, I'll definitely take the summary.

Or maybe I'm an old codger stuck in text-world while all the younguns just want video?

Ben spent the summer at a RISD program for high-schoolers (obligatory celebration cake was here). He majored in comics, and this is his final project. It's five pages long; click to see the entire comic as one long image, then possibly click again to enlarge it so you can read it:

Ben's Avis comic

I can't tell you how proud this comic makes me. Ben has always been a naturally talented artist, but this is a quantum leap up in technique and execution for him. Also, it's a really sweet story.

I've been writing about Ben's progress as an artist here for a long time.

Now at 16, Ben continues to amaze me with what he can do. In the robot movie post, I said, "I've always tried to encourage their creative sides, and they haven't let me down." Still true.

When reviewing GitHub pull requests, I sometimes want to get the proposed code onto my own machine, for running it, or just reading it in my own editor. Here are a few ways to do it, with a digression into git/alias/shell weirdness tacked on the end.

If the pull request is from a branch in the same repo, you can just check out the branch by name:

$ git checkout joe/proposed-feature

But you might not remember the name of the branch, or it might be in a different fork. Better is to be able to request the code by the pull request number.

The first technique I found was to modify the repo's .git/config file so that when you fetch code from the remote, it automatically pulls the pull request branches also. On GitHub, pull requests are available at refs like "refs/pull/1234/head" (no, I don't really know what refspecs are, but I look forward to the day when I do...) Bert Belder wrote up a description of how to tweak your repo to automatically pull down all the pull request branches. You add this line to the [remote "origin"] section of your .git/config:

fetch = +refs/pull/*/head:refs/remotes/origin/pr/*

Now when you "git fetch origin", you'll get all the pull request branches, and you can simply check out the one you want with "git checkout pr/1234".

But this means having to edit your repo's .git/config file before you can get the pull request code. If you have many repos, you're always going to be finding the ones that haven't been tweaked yet.

A technique I liked better is on Corey Frang's gist, provided by Rico Sta. Cruz: Global .gitconfig aliases for pull request management. Here, you update your single ~/.gitconfig file to define a new command that will pull down a pull request branch when you need it:

[alias]
copr = "!f() { git fetch -fu ${2:-origin} refs/pull/$1/head:pr/$1 &&
                    git checkout pr/$1; }; f"

(That should all be on one line, but I wanted it to be readable here.) This gives us a new command, "git copr" (for CheckOut Pull Request) that gets branches from pull requests:

$ git copr 1234            # gets and switches to pr/1234 from origin
$ git copr 789 upstream    # gets and switches to pr/789 from upstream

This technique has the advantage that once you define the alias, it's available in any repo, and also, it both fetches the branch and switches you to it.

BTW: finding and collecting these kinds of shortcuts can be daunting, because if you don't understand every bit of them, then you're in cargo-cult territory. "This thing worked for someone else, and if I copy it here, then it will work for me!"

In a few of the aliases on these pages, I see that the commands end with "&& :". I asked about this in the #git IRC channel, and was told that it was pointless: "&&" joins two shell commands, and runs the second one if the first one succeeded, and ":" is a shell built-in that simply succeeds (it's the same as "true"). So what does "&& :" add to the command? Seemed like it was pointless; we were stumped.

Then I also asked why other aliases took the form they did. Our copr alias has this form:

"!f() { command1; command2; }; f"

The bang character escapes from git syntax to the shell. Then we define a shell function called f with two commands in it, then we call the function. Why define the function? Why not just define the alias to be the two commands?

More discussion and experimentation turned up the answer. The way git invokes the shell, the arguments to the alias are available as $1, $2, etc, but they are also appended to the command line. As an example, let's define three different git aliases, each of which uses two arguments:

[alias]
    ee1 = "!echo 1 is $1 stop; echo 2 is $2 stop"
    ee2 = "!echo 1 is $1 stop; echo 2 is $2 stop && :"
    ee3 = "!f() { echo 1 is $1 stop; echo 2 is $2 stop; }; f"

When we try these, the first does a bad thing, but the second and third are good:

$ git ee1 one two
1 is one stop
2 is two stop one two
$ git ee2 one two
1 is one stop
2 is two stop
$ git ee3 one two
1 is one stop
2 is two stop

The second one works because the ":" command eats up the extra arguments. The third one works because the eventual command run is "f one two", so the values are passed to the function. So the "&& :" wasn't pointless after all; it was needed to make the arguments work properly.

From previous cargo-cult expeditions, my ~/.gitconfig has other aliases using a different form:

[alias]
    ee4 = !sh -c 'echo 1 is $1 stop && echo 2 is $2 stop'
    ee5 = !sh -c 'echo 1 is $1 stop && echo 2 is $2 stop' -

These do this:

$ git ee4 one two
1 is two stop
2 is stop
$ git ee5 one two
1 is one stop
2 is two stop

(No, I have no idea why ee4 does what it does.) So we have three odd forms that all are designed to let you access arguments positionally, but not get confused by them:

[alias]
    cmd1 = "!command1 && command2 && :"
    cmd2 = "!f() { command1; command2; }; f"
    cmd3 = !sh -c 'command1 && command2' -

All of them work. I like the function-defining one best; it seems the most programmery, and the least shell-tricky. I'm sure there's something here I'm misunderstanding, or a subtlety I'm overlooking, but I've learned stuff today.


One of the interesting things about helping beginning programmers is to see the way they think. After programming for so long, and using Python for so long, it's hard to remember how confusing it can all be. Beginners can reacquaint us with the difficulties.

Python has a handy way to iterate over all the elements of a sequence, such as a list:

for x in seq:
    doit(x)

But if you've only learned a few basic things, or are coming from a language like C or Javascript, you might do it like this:

i = 0
while i < len(seq):
    x = seq[i]
    doit(x)
    i += 1

(BTW, I did a talk at the PyCon before last all about iteration in Python, including these sorts of comparisons of techniques: Loop Like a Native.)

Once you learn about the range() builtin function, you know you can loop over the indexes of the sequence like this:

for i in range(len(seq)):
    x = seq[i]
    doit(x)

These two styles of loop are commonly seen. But when I saw this on Stackoverflow, I did a double-take:

i = 0
while i in range(len(seq)):
    x = seq[i]
    doit(x)
    i += 1

This is truly creative! It's an amalgam of the two beginner loops we've already seen, and at first glance, looks like a syntax error.

In fact, this works in both Python 2 and Python 3. In Python 2, range() produces a list, and lists support the "in" operator for checking element membership. In Python 3, range() produces a range object which also supports "in".
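
A quick check in the interpreter shows why the loop condition holds:

>>> 3 in range(10)
True
>>> 10 in range(10)
False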

So each time around the loop, a new range is constructed, and it's examined for the value of i. It works, although it's baroque and performs poorly in Python 2, being O(n²) instead of O(n).

People are creative! Just when I thought there were no other ways to loop over a list, a new technique arrives!


At edX, I help with the Open edX community, which includes being a traffic cop with the flow of pull requests. We have 15 or so different repos that make up the entire platform, so it's tricky to get a picture of what's happening where.

So I made a chart:

Pull requests, charted by age.

The various teams internal to edX are responsible for reviewing pull requests in their areas of expertise, so this chart is organized by teams, with most-loaded at the top. The colors indicate the time since the pull request was opened. The bars are clickable, showing details of the pull requests in each bunch.

This was a fun project because of the new stuff I got to play with along the way. The pull request data is gathered by a Python program running on Heroku, using the GitHub API of course. Summaries of the appropriate pull requests are stored in a JSON file. A GitHub webhook pings Heroku when a pull request changes, and the Python updates the JSON.
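
The real code is in our repo-tools repo (linked at the end), but roughly, the webhook side has this shape. This is just a sketch, assuming Flask; the file name and field choices here are made up for illustration:

import json
from flask import Flask, request

app = Flask(__name__)
PULLS_FILE = "pulls.json"   # hypothetical location of the summary JSON

@app.route("/webhook", methods=["POST"])
def webhook():
    # GitHub sends the full pull request in the event payload.
    payload = request.get_json()
    pr = payload.get("pull_request")
    if pr:
        try:
            with open(PULLS_FILE) as f:
                pulls = json.load(f)
        except IOError:
            pulls = {}
        # Keep just the fields the chart needs.
        pulls[str(pr["number"])] = {
            "title": pr["title"],
            "state": pr["state"],
            "created_at": pr["created_at"],
        }
        with open(PULLS_FILE, "w") as f:
            json.dump(pulls, f)
    return "ok"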

Then I used d3.js in the HTML page to retrieve the JSON, slice and dice it, and build an SVG chart. The clickable bars open to show HTML tables embedded with a foreignObject. This was complicated to get right, but drawing the tables with SVG would be painful, and drawing the bars with HTML would be painful. This let me use the best tool for each job.

D3.js is an impressive piece of work, but it took some getting used to. Mike Bostock's writings helped explain what was going on. The key insight: d3 is not a charting library. It's a way to use data to create pages, a way of turning data into DOM nodes.

So far, the chart seems to have helped edX stay aware of how pull requests are moving. It hasn't made everything speedy, but at least we know where things are stalled, and it has encouraged teams to try to avoid being at the top. I'd like to add more to it, for example, other ways of sorting and grouping, and more information about the pull requests themselves.

The code is part of our repo-tools if you are interested.


That is all.


As the maintainer of coverage.py, it's always been intriguing that web applications have so much code in template files. Coverage.py measures Python execution, so the logic in the template files goes un-measured.

(Of course, in a web application, there's even more Javascript code, which coverage.py also can't help with, but there are other tools that measure Javascript coverage.)

Recently I started experimenting with measuring templates as well as pure Python code. Mako templates compile to Python files, which are then executed. Coverage.py can see the execution in the compiled Python files, so once we have a way to back-map the lines from the Mako output back to the Mako template, we have the start of a usable Mako coverage measurement.

This Mako experiment is on the tip of the coverage.py repo, and requires some code on the tip of Mako. The code isn't right yet, but it shows the idea. Eventually, this should be a plugin to coverage.py provided by Mako, but for now, we're just trying to prove out the concept.

If you want to try the Mako coverage (please do!), configure Mako to put your compiled .py files someplace convenient (like a mako/ directory in your project), then set this environment variable:

$ export COVERAGE_MAKO_PATH=/mako/

Jinja also compiles templates to Python files, but Django does not. Django is very popular, so I would like to support those templates also. Dmitry Trofimov wrote dtcov to measure Django template coverage. He does a tricky thing: in the trace function, determine if you are inside the Django template engine, and if so, walk the stack and look at the locals to grab line numbers.
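
Roughly, the idea looks something like this. It's just a sketch of the technique, not dtcov's actual code, and the attribute names it pokes at are assumptions:

import sys

template_lines = set()

def tracer(frame, event, arg):
    # Only pay attention to lines executed inside Django's template engine.
    if event == "line" and "django/template" in frame.f_code.co_filename:
        # Look in the locals for something that knows its template line number.
        node = frame.f_locals.get("self")
        token = getattr(node, "token", None)
        lineno = getattr(token, "lineno", None)
        if lineno is not None:
            template_lines.add(lineno)
    return tracer

sys.settrace(tracer)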

As written dtcov looks too compute-intensive to run on full-sized projects, but I think the idea could work. I'm planning to experiment with it this week.


I had coffee the other day with Nathan Kohn. He goes by the nickname en_zyme, and it's easy to see why. He relishes the role of bringing pairs of people together to see what kind of new reaction can result.

This time, it was to meet Jonathan Henner, a doctoral student of his at Boston University. The topic was how to include deaf people in the Python community.

The discussion was wide-ranging, and I'm sure I've forgotten interesting tangents, but I got this jumble of notes:

Accommodating the deaf at Python community gatherings is a challenge because it means getting either an ASL interpreter or a CART provider to close-caption presentations live. This presents a few hurdles:

  • Neither solution is best for all deaf people. Some prefer ASL, but ASL doesn't have a large technical vocabulary. CART has the advantage that it also helps those that are a little hard of hearing, or too far back in the room, or even with speakers who have an accent. But some deaf people find CART to be like reading a second language.
  • ASL interpreters and CART providers cost real money, and need space and special equipment.
  • For international gatherings, such as PyCon 2015 in Montreal, there's the language question. Montreal may have more LSF interpreters.
  • For project nights, which involve small-group interactions, we talked about having the communication over IRC, even among people sitting together.

Programming is a good career for the deaf, since it is heavily textual, but they may have a hard time accessing the curriculum for it. Jonathan is exploring the possibility of creating classes in ASL, since that is many deaf people's first language. A common misconception is that ASL is simply English spoken with the hands, but it is not.

We talked a bit about the overlap between the deaf and autistic worlds. The Walden school near Boston specializes in deaf students with other mental or emotional impairments, including autism. Jonathan made a claim that made me think: that deafness and autism are the two disabilities that have their own sub-culture. I don't know if that is true, I'm sure people with other disabilities will disagree, but it's interesting to discuss.

There were a lot of avenues to explore; I'm not sure what will come of it all. It would be great to broaden Python's reach into another community of people who haven't had full access to tech.

Has anyone had any experience doing this? Thoughts?
