|Ned Batchelder : Blog | Code | Text | Site|
Often, when I am headed to bed, I stop in at my son Ben's room, to see what he's up to. He'll be working on some piece of art, and we'll chat for a moment about it.
The other night, he was working on a self-portrait. It was a realistic depiction, but in a style reminiscent of a Renaissance prince. We talked about the style, what parts looked just like him, and what parts might need tweaking.
I went to bed, and then in the morning this was on his Facebook:
I didn't get a story about why the realistic face was gone, and why the self-homunculus is in its place instead. This picture looks much less like him, but says much more about him. He changed it from a picture of him to a picture about him. Ben has always impressed me with his art, not just as a technical skill, but as an expression of deeper ideas.
As always, I am proud of him, and thrilled to see what he creates.
Python has a compact syntax for constructing a list with a loop and a condition, called a list comprehension:
You can also build dictionaries with dictionary comprehensions, and sets with set comprehensions:
(The syntax allows more complexity than these examples, but let's not get distracted!)
Finally, you can make a generator with similar syntax:
Unfortunately, this is called a generator expression, not a generator comprehension. Why not? If the first three are all comprehensions, why isn't this a comprehension?
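For concreteness, here is a small sketch of the four forms side by side. The data and variable names are made up for illustration:

```python
numbers = [1, 2, 3, 4, 5, 6]

# List comprehension: a loop plus a condition, building a list.
even_squares = [n * n for n in numbers if n % 2 == 0]

# Dictionary comprehension: same clauses, with key: value pairs in braces.
square_of = {n: n * n for n in numbers if n % 2 == 0}

# Set comprehension: braces without the colon; duplicates collapse.
last_digits = {(n * n) % 10 for n in numbers}

# Generator expression: parentheses, and values are produced lazily.
lazy_squares = (n * n for n in numbers if n % 2 == 0)

print(even_squares)         # [4, 16, 36]
print(square_of)            # {2: 4, 4: 16, 6: 36}
print(sorted(last_digits))  # [1, 4, 5, 6, 9]
print(sum(lazy_squares))    # 56
```

All four share the same `for ... if ...` clauses; only the enclosing punctuation and the resulting type differ.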
PEP 289, Generator Expressions, has detailed notes at the end which point out that Raymond Hettinger originally proposed "generator comprehensions," that they were then resurrected by Peter Norvig as "accumulation displays," and that Tim Peters suggested the name "generator expressions." It does not explain why the names changed along the way.
I made a query on Twitter:
Guido's reply gets at the heart of the matter:
Matt Boehm found the email where Tim Peters proposed "generator expression" that also has some details.
After reading that, I understand more. First, what's with the word "comprehension"? As Tim pointed out, the word comes from set theory's Axiom of Comprehension, which talks about sets formed by applying a predicate (condition) to elements of another set. This is very similar to lists formed by applying a condition to elements of another sequence.
As Guido's tweet points out, and the subject line of the email thread makes clear ("accumulator display syntax"), the designers at the time were thinking much more about displays than they were about conditions. The word "display" here means that the syntax for the code looks like the data structure it will create. A list display (list comprehension) looks like a list. Same for set and dictionary displays. But there is no generator literal syntax, so there's nothing for a generator display to look like, so there are no generator displays.
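A quick sketch of the "display" idea, with illustrative values of my own. The bracketed forms look like the list they build, while parentheses already mean something else:

```python
# A list display: the code looks like the list it creates.
primes = [2, 3, 5, 7]

# A list comprehension keeps the brackets, so it still looks like a list.
doubled = [p * 2 for p in primes]

# Parentheses with commas are a tuple, not a generator literal.
pair = (2, 3)

# The same parentheses around a for-clause give a generator instead.
lazy_doubled = (p * 2 for p in primes)

print(type(doubled).__name__)       # list
print(type(pair).__name__)          # tuple
print(type(lazy_doubled).__name__)  # generator
print(list(lazy_doubled))           # [4, 6, 10, 14]
```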
In that original email thread designing the feature, the word "comprehension" became synonymous with "display", and since generators couldn't have displays, they also couldn't have comprehensions.
But as Tim points out in his email, the interesting part of a comprehension is the condition. The heart of the Axiom of Comprehension is the predicate. Perhaps because the condition is optional in a Python comprehension, the focus shifted to the display aspect.
I think we should call them "generator comprehensions" again. We don't use the term "display" for these things. There's no reason to link "comprehension" to "display" and literal syntax.
The four different expressions (list comprehension, dict comprehension, set comprehension, and generator expression) have an awful lot in common with each other. It would be a great shorthand to be able to discuss their similarities by talking about "comprehensions" and having it cover all four. Their similarities outweigh their differences, so let's use the same word for all four.
Proposal: call them "generator comprehensions."
Work on Coverage.py 4.1 is continuing: beta 3 is available.
If you haven't used any of the 4.1 betas, the big change is that branch coverage has been completely rewritten. The new code produces much more reliable results, and has allowed me to implement things like better support for lambdas. Eleven bugs with branch coverage have been fixed.
The HTML report has a cool new feature, contributed by Dmitry Shishov: a map in the scrollbar of where the highlighted lines are, so you can quickly drag to where you need to look. (By the way, there are also keyboard shortcuts to do that, and there have been for a long time!)
One small backward-incompatibility: if you've been using the API, and calling the Coverage.report function, the default for the show_missing parameter has changed.
Try Coverage.py 4.1b3 and let me know what you think.
Now, the truth about Coverage.py: I think it could be much better. There are lots of things about the internals that I don't like. I think the classes could be refactored better. Too many of the tests are integration tests rather than unit tests. Too many real-world scenarios aren't covered by tests. I'm not good at staying on top of the pull requests and issues. If you think you could help with any of this, get in touch.
I have two great juggling videos to share: two jugglers, each great in his own way.
First: Alexander Koblikov. He is a great professional juggler, currently with the Big Apple Circus. He has a smooth evocative style with a small number of balls, starting with simple contact moves, but growing to flawless five-ball work. Then he can show off raw power with nine-ball multiplexes. A very impressive combination of both ends of the professional spectrum:
Kota Hayashi is very different. He's an amateur juggler, performing at the International Jugglers' Association convention. He isn't wearing any particular costume, and his act has no story. He's not a poker-faced artiste, and he only juggles three balls. He's a good juggler, but more importantly, he just obviously loves juggling. His enthusiasm is infectious. As you watch his act, it gets a bit ridiculous. You start to think, this is silly. But really, isn't juggling silly to begin with? Why do we throw objects around in fancy patterns? There's no point to it, other than our own amusement. Kota's act is a visible embodiment of the pure pleasure of mastering an absurd skill for its own sake.
Skip ahead to 1:05 where Kota starts:
And because I can't stop watching juggling videos, here are two bonus jugglers showing two more completely different styles:
A little over a week ago, a nephew-of-sorts of mine died in a fall. He was almost 19, a freshman at Tufts. It was tragic, and senseless, and horrifying. The funeral was Sunday, and he's been on my mind a lot.
To be honest, I didn't know Alex that much. We saw each other at most once a year, and usually less frequently. I learned more about Alex at his funeral than I had over the years of rote greetings at family gatherings. He was a smart, generous, energetic guy (boy? man?).
When I think about Alex's death, of course I think about him, and his too-short life, and his final hours. I think about his parents, and what they must be going through, and I wonder if I could handle such a loss.
I think about my own children and I think about parenthood: the enormous commitment, energy, love, and work that goes into shaping and guiding these new people. The pain and fear of sending them off into the world, away from your protective watch. It's difficult in the best of times.
Alex's death was painful not only because we lost Alex, but because it was a brutal reminder that we can lose anyone, at any time, with no notice. It's easy to imagine a nearby parallel universe where it was one of my sons instead of him.
At the funeral, Alex's dad recounted an uncle's similar loss. The uncle's wisdom was that we will not find a reason for Alex's death, but that we will find meaning in life.
I come back again to Kurt Vonnegut's son Mark, answering, and neatly side-stepping, the question of meaning: "We're here to get each other through this thing, whatever it is."
Take care of each other.
If you use Slack, or read docs on Read The Docs, you've seen Lato. It's a free high-quality font. I like it a lot, but it has a feature that bugs me a lot: the f-i ligature:
If you've never looked into this before, a ligature is a combination of letters that are designed as a new distinct glyph. In this case, there's an "fi" shape that is used when you have an "f" and an "i" next to each other.
Ligatures have a long tradition in Latin fonts, for a reason: some pairings of letters can have a jarring look. The bulb of the f and the dot of the i can clash, and it looks better to design a combined shape that shares the space better.
But Lato doesn't suffer from this problem. Ligatures are a solution to a problem, and here they are being used when there is no problem to solve. The Lato fi ligature is more jarring than the f and the i, because it looks like there's no dot for the i.
Here's a comparison of the fi ligature in some fonts. The first column is a plain f and i presented naturally, but forced to be individual, naively. Then the fi combination as the browser text renderer draws them, and then the Unicode fi ligature, U+FB01 (LATIN SMALL LIGATURE FI):
The naive Lato f and i look fine together without any intervention. The ligature looks silly without the dot. The f is trying to reach over to join the dot, but it's too far to reach, so it doesn't get there, and the f has no bulb in the first place. It doesn't make any sense.
Constantia and Georgia demonstrate a good use of ligatures: the naive pairing shows how the bulb and the dot crowd into each other, and the ligatures shift things a little to resolve the clash.
(Somehow, Lato doesn't map its fi ligature to the U+FB01 code point, so we get the default font there instead.) If you want to experiment, here's the HTML file I used to make the image.
By the way, it was an interesting challenge to get the browsers to display the unligatured f-i pairs. In Firefox, I used a zero-width space (U+200B) between the characters. But Chrome substituted the ligature anyway, so I tried putting the f and the i in adjacent spans. This worked in Chrome, but Firefox used the ligature. So I combined both methods:
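A sketch of what that combined markup could look like, generated from Python. The exact placement of the zero-width space relative to the spans is an assumption on my part, and `unligatured` is a hypothetical helper name:

```python
import unicodedata

ZERO_WIDTH_SPACE = "\u200b"
FI_LIGATURE = "\ufb01"

def unligatured(first, second):
    """Put each letter in its own span, with a zero-width space between.

    Assumption: the zero-width space sits between the two spans.
    """
    return "<span>{}</span>&#x200B;<span>{}</span>".format(first, second)

print(unicodedata.name(FI_LIGATURE))       # LATIN SMALL LIGATURE FI
print(unicodedata.name(ZERO_WIDTH_SPACE))  # ZERO WIDTH SPACE
print(unligatured("f", "i"))               # <span>f</span>&#x200B;<span>i</span>
```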
I got an email from a mom last week:
I told her I couldn't meet one-on-one, but suggested they attend the upcoming Boston Python project night. I didn't know what would come of it. Project night is completely unstructured, an opportunity to hang out with other Python people. It's a complete jumble of all kinds of people. There's no guarantee you'll find what you need there, but there's a good chance you will.
Last night was the project night, and there they were! They sat down at one of our beginning learner tables, and others joined them. I didn't have a chance to sit and talk with them at length, but I could see they had the attention of helpers, including John, one of the regulars. Each time I looked over, John was in deep discussion with the kid.
While talking to someone else last night who was interested in game programming, I looked up the game that originally got the mom's attention: I posted Nat's World to this site 13 years ago today!
The mom and the kid said goodbye to me when they had to go. She seemed pleased, and he did too, in his quiet but eager way. I told him, "That makes me happy."
As the night was winding down, I caught up with John, who was talking to a few others. "That kid was amazing!" he said. "I know, that was so cool," said someone else.
This is what makes local user groups so great. I don't know what John was expecting to do with his evening. I don't know what the mom and the kid were expecting when they decided to come. But they made a connection, got some help, and made an impression on each other. People across the room who didn't even talk to the mom or the kid came away with an unexpected picture of what the Python community can be like: broader and more diverse, more welcoming than the stereotype of a tech user group filled with brogrammers.
And this is also what is cool about making things and putting them online. Nat's World was a fun project when I made it. I haven't run it in years, but my family still remembers it fondly. When I first posted it, I had a few nibbles of interest from people, but it was only a little side project; I could have just as easily not put it on my site.
In the way of the internet, Nat's World had receded into the past, an old post unlikely to get any further attention. The code doesn't even run any more. But someone found it, and because of it they got in touch, and they got to a project night, and connected with other people, and who knows where it will all lead?
I was in the library yesterday, and wandered into the Brookline room, where books particular to Brookline are kept. They have annual town directories going back more than a century. I pulled one down and looked up my street address.
Of course, when we bought the house, the realtor had little stories about previous occupants. We were told it was built by a Cabot, and that he was a bachelor.
The town records gave me two names: John H. Cabot, and F. Ernest Cabot, living in the house in 1905. Googling around a little bit, I found a notice about John H's death:
Is it safe to say that today he would be out of the closet? It goes on:
I love that line: a distaste for retrospect!
"His chamber!" That's a room in our house! Of course living in an old house, you know that people have lived there before, had entire lives there, and felt as proprietary and private about the house as you do. But this sentence somehow made it much more real. This weekend as I have moved through my house, I've been more aware that others have preceded me.
A little more digging shows that John H. is buried in Mt. Auburn cemetery, along with some close family, so in the spring I may look them up...
My new favorite t-shirt:
It's a stellated icosahedron from Henry Segerman, who makes many interesting nerdy things, inspired by both math and juggling:
BTW: stellation is the process of creating new shapes by extending the faces of a polyhedron. The shirt is a stellation of a regular icosahedron (known in gaming circles as a D20). The logo for this site is a stellation of a regular dodecahedron.
Seems like testing and podcasts are in the air... First, I was interviewed on Brian Okken's Python Test podcast. I wasn't sure what to expect. The conversation went in a few different directions, and it was really nice to just chat with Brian for 45 minutes. We talked about coverage.py, testing, doing presentations, edX, and a few other things.
Then I saw that Brian was himself a guest on Talk Python to Me, Michael Kennedy's podcast about all things Python.
On that episode, Brian does a good job arguing against some of the prevailing beliefs about testing. For example, he explains why unit tests are bad, and integration tests are good. His argument boils down to this: you should test the promises you've made. Unit tests mostly deal with internal details that are not promises you've made to the outside world, so why focus on testing them? The important thing is whether your product behaves right from the outside.
I liked this argument, it made sense. But I don't think I agree with it. Or, I completely agree with it, and come to a different conclusion.
When I build a complex system, I can't deal with the whole thing at once. I need to think of it as a collection of smaller pieces. And the boundaries between those pieces need to remain somewhat stable. So they are promises, not to the outside world, but to myself. And since I have made those promises to myself, I want unit tests to be sure I'm keeping those promises.
Another value of unit tests is that they are a way to chop up combinatorial explosions. If my system has three main components, and each of them can be in ten different states, I'll need 1000 integration tests to cover all the possibilities. If I can test each component in isolation, then I only need 30 unit tests to cover the possibilities, plus a small number of integration tests to consider everything mashed together. Not to mention, the unit tests will be faster than the integration tests. Which would you rather have? 1000 slow tests, or 30 fast tests plus 20 slow tests?
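The arithmetic is easy to check with a few lines of Python. This is only a counting sketch; the component names are hypothetical:

```python
from itertools import product

components = ["parser", "planner", "executor"]  # hypothetical names
states = range(10)                              # ten states per component

# Integration testing: every combination of states across all components.
integration_cases = list(product(states, repeat=len(components)))

# Unit testing: each component's states checked in isolation.
unit_cases = [(name, state) for name in components for state in states]

print(len(integration_cases))  # 1000
print(len(unit_cases))         # 30
```

The integration count multiplies (10 × 10 × 10), while the unit count only adds (10 + 10 + 10), which is why the gap widens quickly as components or states are added.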
Sure, it's possible to overdo unit testing. And it's really easy to have all your unit tests pass and still have a broken system. You need integration tests to be sure everything fits together properly. Finding the right balance is an art. I really like hearing Brian's take on it. Give it a listen.
Two good things in the Python testing world intersected this week.
Harry Percival wrote a great book called Test-Driven Development with Python. I should have written about it long ago. It's a step-by-step example of building real software (Django web applications) using Test-Driven Development.
Harry describes the philosophy, the methods, and the steps, of doing real TDD with Django. Even if you aren't using Django, this book shows the way to use TDD for serious projects. I'm not yet a TDD convert, but it was very helpful to see it in action and understand more about it.
The entire book is available to read online if you like. Taking the meta to a whole new level, Harry also has the source for the book, including tests, on GitHub.
Brian Okken has been running a podcast devoted to Python testing, called Python Test Podcast.
His latest episode is an interview with Harry. People must have thought I was nuts driving to work the other day, I was nodding so much. It was a good conversation. Highly recommended.
Let's say I have a piece of software. In this case, it's some automation for installing and upgrading Open edX. I want to know how it is being used, for example, how many people in the last month used certain versions or switches.
To collect information like that, I can put together a URL in the program, and ping that URL. What's a good simple way to collect that information? What server or service is easy to use and can help me look at the data? Is this something I should use classic marketing web analytics for? Is there a more developer-centric service out there?
This is one of those things that seems easy enough to just do with bit.ly, or a dead-stupid web server with access logs, but I'm guessing there are better ways I don't yet know about.
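For reference, the client side of the dead-simple version might look something like this sketch. The endpoint URL, the parameter names, and both function names are hypothetical; any server that keeps access logs could sit behind it:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; any server that writes access logs would do.
PING_URL = "https://stats.example.com/ping"

def build_ping_url(version, switches):
    """Encode the version and the switches used as query parameters."""
    params = {"v": version, "switches": ",".join(sorted(switches))}
    return PING_URL + "?" + urlencode(params)

def send_ping(version, switches, timeout=2):
    """Best-effort ping: a network failure must never break the real work."""
    try:
        urlopen(build_ping_url(version, switches), timeout=timeout).close()
    except OSError:
        pass  # URLError and socket timeouts are both OSError subclasses

print(build_ping_url("1.2.3", ["--upgrade", "--dry-run"]))
# https://stats.example.com/ping?v=1.2.3&switches=--dry-run%2C--upgrade
```

Counting versions and switches then becomes a matter of parsing the query strings out of the server's logs, which is exactly the part I suspect a real service would do better.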