Don't follow me on Instagram

Monday 5 September 2016

This summer I started taking pictures and posting them on Instagram. It started with a conversation with my son Max, and his assertion that posting more than one picture a day on Instagram was Instaspam. That constraint appealed to me. I like the idea of photography as a way of attending to what I am seeing. So I started trying to look around me to find interesting shots for Instagram posts.

My summer has been different than I expected, so I've had chances to look around places I didn't expect to be. Ironically, thinking about what can go on Instagram can be a way to focus on the here-and-now. You have to see what is immediately around you in order to get a shot.

Normally, thinking about stuff to post on a social network can be the furthest thing from being in the moment. You're thinking about how people will react to your tweet, or who will look at your Facebook status. It's easy to fall into second-guessing what will get the biggest response. You spend time going back to look at what happened to your recent activity.

I have mixed stances toward different social media. I like Twitter, and like having followers. I want my tweets to get widely retweeted. I ignore Facebook, except to find out what my sons are up to. Pinterest and Snapchat might as well not exist. Now I'm putting pictures on Instagram, but not to get followers or tons of likes. The photos have no message, I rarely put any words on them. If I can post a picture I like, and have one other person like it, that's enough.

If you want to follow someone good on Instagram, my brother is an actual photographer that knows what he is doing. Follow him!

Walks in the morning

Thursday 25 August 2016

The summer is wrapping up, and it's been a strange one. On July 4th weekend, we discovered a serious bruise on Nat's chest. We took him to the emergency room to have it properly documented so we could make a formal investigation. The doctor there told us that Nat had a broken rib, and what's more, he had another that had healed perhaps a year ago.

Nat is 26, and has autism. We tried asking him what had happened, but his reports are sketchy, and it's hard to know how accurate they are. We moved him out of his apartment, and back home with us. We ended his day program. He'd had a good experience at a camp in Colorado a few years ago, so we sent him back there, which was expensive, and meant two Colorado trips for us.

The investigation has not come up with any answers. A year ago, he had been acting oddly, very still and reluctant to move. Then, we couldn't figure out why, but now we know: he had a broken rib.

We've found a new day program for Nat which seems really good. It starts full-time on Monday. During the last month, we've been cobbling together things for Nat to do during the day. He has a lot of energy and likes walking, so I've switched my exercise from swimming to doing early-morning walks with Nat before work.

Parenting is not easy. No matter what kind of child(ren) you have, there are challenges. You have to understand their needs, decide what you want for them, and try to make a match. You have to include them in the many forces that shape your days and your life.

This summer has been a challenge that way, figuring out how to fit this complicated man into our day. The walks have been something Nat and I do together, one of the few things we both enjoy. I'll be glad to be back to my swimming routine, but I'm also glad to have had this expansion of our walking together, something that used to only happen on weekends.

Nat, walking

We still have to find a place for Nat to live, and we have to make sure the day program takes hold in a good way. I know this is not that last time Nat will need our energy, worry, and attention, and I know we won't always know when those times are coming. This is what it means to be his parent. He needs us to plan and guide his life.

And he needs to walk in the morning.

Lists vs. Tuples

Thursday 18 August 2016

A common beginner Python question: what's the difference between a list and a tuple?

The answer is that there are two different differences, with complex interplay between the two. There is the Technical Difference, and the Cultural Difference.

First, the things that are the same: both lists and tuples are containers, a sequence of objects:

>>> my_list = [1, 2, 3]
>>> type(my_list)
<class 'list'>
>>> my_tuple = (1, 2, 3)
>>> type(my_tuple)
<class 'tuple'>

Either can have elements of any type, even within a single sequence. Both maintain the order of the elements (unlike sets and dicts).

Now for the differences. The Technical Difference between lists and tuples is that lists are mutable (can be changed) and tuples are immutable (cannot be changed). This is the only distinction that the Python language makes between them:

>>> my_list[1] = "two"
>>> my_list
[1, 'two', 3]
>>> my_tuple[1] = "two"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

That's the only technical difference between lists and tuples, though it manifests in a few ways. For example, lists have a .append() method to add more elements to the list, while tuples do not:

>>> my_list.append("four")
>>> my_list
[1, 'two', 3, 'four']
>>> my_tuple.append("four")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'

Tuples have no need for an .append() method, because you can't modify tuples.

The Cultural Difference is about how lists and tuples are actually used: lists are used where you have a homogenous sequence of unknown length; tuples are used where you know the number of elements in advance because the position of the element is semantically significant.

For example, suppose you have a function that looks in a directory for files ending with *.py. It should return a list, because you don't know how many you will find, and all of them are the same semantically: just another file that you found.

>>> find_files("*.py")
["control.py", "config.py", "cmdline.py", "backward.py"]

On the other hand, let's say you need to store five values to represent the location of weather observation stations: id, city, state, latitude, and longitude. A tuple is right for this, rather than a list:

>>> denver = (44, "Denver", "CO", 40, 105)
>>> denver[1]
'Denver'

(For the moment, let's not talk about using a class for this.) Here the first element is the id, the second element is the city, and so on. The position determines the meaning.

To put the Cultural Difference in terms of the C language, lists are like arrays, tuples are like structs.

Python has a namedtuple facility that can make the meaning more explicit:

>>> from collections import namedtuple
>>> Station = namedtuple("Station", "id, city, state, lat, long")
>>> denver = Station(44, "Denver", "CO", 40, 105)
>>> denver
Station(id=44, city='Denver', state='CO', lat=40, long=105)
>>> denver.city
'Denver'
>>> denver[1]
'Denver'

One clever summary of the Cultural Difference between tuples and lists is: tuples are namedtuples without the names.

The Technical Difference and the Cultural Difference are an uneasy alliance, because they are sometimes at odds. Why should homogenous sequences be mutable, but hetergenous sequences not be? For example, I can't modify my weather station because a namedtuple is a tuple, which is immutable:

>>> denver.lat = 39.7392
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute

And sometimes the Technical considerations override the Cultural considerations. You cannot use a list as a dictionary key, because only immutable values can be hashed, so only immutable values can be keys. To use a list as a key, you can turn it into a tuple:

>>> d = {}
>>> nums = [1, 2, 3]
>>> d[nums] = "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> d[tuple(nums)] = "hello"
>>> d
{(1, 2, 3): 'hello'}

Another conflict between the Technical and the Cultural: there are places in Python itself where a tuple is used when a list makes more sense. When you define a function with *args, args is passed to you as a tuple, even though the position of the values isn't significant, at least as far as Python knows. You might say it's a tuple because you cannot change what you were passed, but that's just valuing the Technical Difference over the Cultural.

I know, I know: in *args, the position could be significant because they are positional parameters. But in a function that's accepting *args and passing it along to another function, it's just a sequence of arguments, none different from another, and the number of them can vary between invocations.

Python uses tuples here because they are a little more space-efficient than lists. Lists are over-allocated to make appending faster. This shows Python's pragmatic side: rather than quibble over the list/tuple semantics of *args, just use the data structure that works best in this case.

For the most part, you should choose whether to use a list or a tuple based on the Cultural Difference. Think about what your data means. If it can have different lengths based on what your program encounters in the real world, then it is probably a list. If you know when you write the code what the third element means, then it is probably a tuple.

On the other hand, functional programming emphasizes immutable data structures as a way to avoid side-effects that can make it difficult to reason about code. If you are a functional programming fan, you will probably prefer tuples for their immutability.

So: should you use a tuple or a list? The answer is: it's not always a simple answer.

Breaking out of two loops

Thursday 4 August 2016

A common question is, how do I break out of two nested loops at once? For example, how can I examine pairs of characters in a string, stopping when I find an equal pair? The classic way to do this is to write two nested loops that iterate over the indexes of the string:

s = "a string to examine"
for i in range(len(s)):
    for j in range(i+1, len(s)):
        if s[i] == s[j]:
            answer = (i, j)
            break   # How to break twice???

Here we are using two loops to generate the two indexes that we want to examine. When we find the condition we're looking for, we want to end both loops.

There are a few common answers to this. But I don't like them much:

  • Put the loops into a function, and return from the function to break the loops. This is unsatisfying because the loops might not be a natural place to refactor into a new function, and maybe you need access to other locals during the loops.
  • Raise an exception and catch it outside the double loop. This is using exceptions as a form of goto. There's no exceptional condition here, you're just taking advantage of exceptions' action at a distance.
  • Use boolean variables to note that the loop is done, and check the variable in the outer loop to execute a second break. This is a low-tech solution, and may be right for some cases, but is mostly just extra noise and bookkeeping.

My preferred answer, and one that I covered in my PyCon 2013 talk, Loop Like A Native, is to make the double loop into a single loop, and then just use a simple break.

This requires putting a little more work into the loops, but is a good exercise in abstracting your iteration. This is something Python is very good at, but it is easy to use Python as if it were a less capable language, and not take advantage of the loop abstractions available.

Let's consider the problem again. Is this really two loops? Before you write any code, listen to the English description again:

How can I examine pairs of characters in a string, stopping when I find an equal pair?

I don't hear two loops in that description. There's a single loop, over pairs. So let's write it that way:

def unique_pairs(n):
    """Produce pairs of indexes in range(n)"""
    for i in range(n):
        for j in range(i+1, n):
            yield i, j

s = "a string to examine"
for i, j in unique_pairs(len(s)):
    if s[i] == s[j]:
        answer = (i, j)
        break

Here we've written a generator to produce the pairs of indexes we need. Now our loop is a single loop over pairs, rather than a double loop over indexes. The double loop is still there, but abstraced away inside the unique_pairs generator.

This makes our code nicely match our English. And notice we no longer have to write len(s) twice, another sign that the original code wanted refactoring. The unique_pairs generator can be reused if we find other places we want to iterate like this, though remember that multiple uses is not a requirement for writing a function.

I know this technique seems exotic. But it really is the best solution. If you still feel tied to the double loops, think more about how you imagine the structure of your program. The very fact that you are trying to break out of both loops at once means that in some sense they are one thing, not two. Hide the two-ness inside one generator, and you can structure your code the way you really think about it.

Python has powerful tools for abstraction, including generators and other techniques for abstracting iteration. My Loop Like A Native talk has more detail (and one egregious joke) if you want to hear more about it.

Coverage.py 4.2

Wednesday 27 July 2016

Coverage.py 4.2 is done.

As I mentioned in the beta 1 announcement, this contains work from the sprint at PyCon 2016 in Portland.

The biggest change since 4.1 is the only incompatible change. The "coverage combine" command now will ignore an existing .coverage data file, rather than appending to it as it used to do. This new behavior makes more sense to people, and matches how "coverage run" works. If you've ever seen (or written!) a tox.ini file with an explicit coverage-clean step, you won't have to any more. There's also a new "--append" option on "coverage combine", so you can get the old behavior if you want it.

The multiprocessing support continues to get the polish it deserves:

  • Now the concurrency option can be multi-valued, so you can measure programs that use multiprocessing and another library like gevent.
  • Options on the command line weren't being passed to multiprocessing subprocesses. Now they still aren't, but instead of failing silently, you'll get an error explaining the situation.
  • If you're using a custom-named configuration file, multiprocessing processes now will use that same file, so that all the processes will be measured the same.
  • Enabling multiprocessing support now also enables parallel measurement, since there will be subprocesses. This reduces the possibility for error when configuring coverage.py.

Finally, the text report can be sorted by columns as you wish, making it more convenient.

The complete change history is in the source.

Coverage.py 4.2 beta 1

Tuesday 5 July 2016

Coverage.py 4.2 beta 1 is available.

This contains a few things we worked on during a day of sprinting at PyCon 2016 in Portland. Thanks to my fellow sprinters: Dan Riti, Dan Wandschneider, Josh Williams, Matthew Boehm, Nathan Land, and Scott Belden. Each time I've sprinted on coverage.py, I've been surprised at the number of people willing to dive into the deep end to make something happen. It's really encouraging to see people step up like that.

What's changed? The biggest change is the only incompatible change. The "coverage combine" command now will ignore an existing .coverage data file, rather than appending to it as it used to do. This new behavior makes more sense to people, and matches how "coverage run" works. If you've ever seen (or written!) a tox.ini file with an explicit coverage-clean step, you won't have to any more. There's also a new "--append" option on "coverage combine", so you can get the old behavior if you want it.

A new option lets you control how the text report is sorted.

The concurrency option can now be multi-valued, if you are using multiprocessing and some other concurrency library, like gevent.

The complete change history is in the source.

This isn't going to be a long beta, so try it now!

Math factoid of the day: 54

Thursday 16 June 2016

54 can be written as the sum of three squares in three different ways:

7² + 2² + 1² = 6² + 3² + 3² = 5² + 5² + 2² = 54

It is the smallest number with this property.

Also, a Rubik's cube has 54 colored squares.

Loudest guy in the room

Sunday 5 June 2016

I just got back from PyCon 2016, and it was a giant summer camp love-fest as usual. But I've been thinking about a subtle and unfortunate dynamic that I saw a few times there.

In three different cases, I was with a group of people, and one person in particular had a disproportionate amount of air-time. They were different guys each time, but they just had a way of being the one doing more talking than listening, and more talking than others. In some cases, they were physically loud, but I don't always mean literally the loudest.

These weren't bad people. Sometimes, they were explicitly discussing the need to include others, or improve diversity, or other good impulses. They weren't trying to dominate the space, and they might even be surprised to hear that they were.

But I found myself cringing watching their interactions with others. Even when they thought they were being encouraging, I felt like they were subtly pushing others aside to do it. Keep in mind, this was at PyCon, one of the most explicitly inclusive places I frequent.

I'm a successful white guy, so I know it can be very easy to slip into the alpha male stance. Sometimes people expect it of me. It can be hard to tamp down the impulse to hold forth, letting others have the spotlight. But it's important, and a good exercise for yourself. It's fine to be able to be at the front of the room, but you should be able to turn it off when needed, which is more often than you would think.

Sometimes, this was in a men-only setting. It's great to be aware of the gender gap, but there are other kinds of gaps to consider also: non-native speakers, introverts, beginners, outsiders of various sorts. There are lots of reasons people might be quiet, and need a little room.

Ask questions instead of making statements. Stay quiet, and see what happens. Listen rather than speak. Even when it seems no one is going to say anything, wait longer than you are comfortable. See what happens. Leave space.

Next time you are in a group of people, look around and try to figure out who is the loudest guy in the room. If you aren't sure, then maybe it's you.

Coverage.py 4.1

Friday 27 May 2016

Coverage.py 4.1 is out!

I'm not sure what else to say about it that I haven't said a few times in the last six months: branch coverage is completely rewritten, so it should be more accurate and more understandable.

The beta wasn't getting much response, so I shielded my eyes and released the final version a few days ago. No explosions, so it's seems to be OK!

Ben portrait

Sunday 15 May 2016

Often, when I am headed to bed, I stop in at my son Ben's room, to see what he's up to. He'll be working on some piece of art, and we'll chat for a moment about it.

The other night, he was working on a self-portrait. It was a realistic depiction, but in a style reminiscent of a Renaissance prince. We talked about the style, what parts looked just like him, and what parts might need tweaking.

I went to bed, and then in the morning this was on his Facebook:

Ben' portrait

I didn't get a story about why the realistic face was gone, and why the self-homunculus is in its place instead. This picture looks much less like him, but says much more about him. He changed it from a picture of him to a picture about him. Ben has always impressed me with his art, not just as a technical skill, but as an expression of deeper ideas.

As always, I am proud of him, and thrilled to see what he creates.

Generator comprehensions

Wednesday 11 May 2016

Python has a compact syntax for constructing a list with a loop and a condition, called a list comprehension:

my_list = [ f(x) for x in sequence if cond(x) ]

You can also build dictionaries with dictionary comprehensions, and sets with set comprehensions:

my_dict = { k(x): v(x) for x in sequence if cond(x) }
my_set = { f(x) for x in sequence if cond(x) }

(The syntax allows more complexity than these examples, let's not get distracted!)

Finally, you can make a generator with similar syntax:

my_generator = ( f(x) for x in sequence if cond(x) )

Unfortunately, this is called a generator expression, not a generator comprehension. Why not? If the first three are all comprehensions, why isn't this a comprehension?

PEP 289, Generator Expressions has detailed notes at the end which point out that Raymond Hettinger originally proposed "generator comprehensions," that they were then resurrected by Peter Norvig as "accumulation displays," and that Tim Peters suggested the name "generator expressions." It does not explain why the names changed along the way.

I made a query on Twitter:

OK, #python question I don’t know the answer to: why are they called “generator expressions” and not “generator comprehensions”?

Guido's reply gets at the heart of the matter:

Originally comprehension was part of the "literal display" notion. GenExprs are not displays.

Matt Boehm found the email where Tim Peters proposed "generator expression" that also has some details.

After reading that, I understand more. First, what's with the word "comprehension"? As Tim pointed out, the word comes from set theory's Axiom of Comprehension, which talks about sets formed by applying a predicate (condition) to elements of another set. This is very similar to lists formed by applying a condition to elements of another sequence.

As Guido's tweet points out, and the subject line of the email thread makes clear ("accumulator display syntax"), the designers at the time were thinking much more about displays than they were about conditions. The word "display" here means that the syntax for the code looks like the data structure it will create. A list display (list comprehension) looks like a list. Same for set and dictionary displays. But there is no generator literal syntax, so there's nothing for a generator display to look like, so there are no generator displays.

In that original email thread designing the feature, the word "comprehension" became synonymous with "display", and since generators couldn't have displays, they also couldn't have comprehensions.

But as Tim points out in his email, the interesting part of a comprehension is the condition. The heart of the Axiom of Comprehension is the predicate. Perhaps because the condition is optional in a Python comprehension, the focus shifted to the display aspect.

I think we should call them "generator comprehensions" again. We don't use the term "display" for these things. There's no reason to link "comprehension" to "display," and literal syntax.

The four different expressions (list comprehension, dict comprehension, set comprehension, and generator expressions) have an awful lot in common with each other. It would be a great shorthand to be able to discuss their similarities by talking about "comprehensions" and having it cover all four. Their similarities are more than their differences, so let's use the same word for all four.

Proposal: call them "generator comprehensions."

Older:

Even older...