I am at liberty

Tuesday 30 January 2024

As of a few weeks ago, I am between gigs. Riffing on some corporate-speak from a recent press release: “2U and I have mutually determined that 2U is laying me off.”

I feel OK about it: work was becoming increasingly frustrating, and I have some severance pay. 2U is in a tough spot as a company, so at least these layoffs seemed like an actual tactic rather than another pointless please-the-investors move by a company flush with profits and cash. And with 2U struggling, being laid off feels like a better option than remaining behind after a difficult cut.

edX was a good run for me. We had a noble mission: educate the world. The software was mostly open source (Open edX), which meant our efforts could power education that we as a corporation didn’t want to pursue.

Broadly speaking, my job was to oversee how we did open source. I loved the mission of education combined with the mission of open source. I loved seeing the community do things together that edX alone could not. I have many good friends at 2U and in the community. I hope they can make everything work out well, and I hope I can do a good job staying in touch with them.

I don’t know what my next gig will be. I like writing software. I like having developers as my customers. I am good at building community both inside and outside of companies. I am good at helping people. I’m interested to hear ideas.

You (probably) don’t need to learn C

Wednesday 24 January 2024

On Mastodon I wrote that I was tired of people saying, “you should learn C so you can understand how a computer really works.” I got a lot of replies, which didn’t change my mind, but did help me understand better how abstractions are inescapable in computers.

People made a number of claims. C is important because syscalls are defined in terms of C semantics (they are not). C is good for exploring limited-resource computers like Arduinos (true, but most people don’t program for those). C is important because it’s more performant (but Python programs often offload their compute-intensive work to libraries other people have written, and these days that work often happens on a GPU). Someone said you need C to debug with strace; then someone else said they use strace all the time and don’t know C. Someone even said C was good because it explains why NUL isn’t allowed in filenames, but who tries to do that, and why learn an entire language for that bit of trivia?

I’m all for learning C if it will be useful for the job at hand, but you can write lots of great software without knowing C.

A few people repeated the idea that C teaches you how code “really” executes. But C is an abstract model of a computer, and modern CPUs do all kinds of things that C doesn’t show you or explain. Pipelining, cache misses, branch prediction, speculative execution, multiple cores, even virtual memory are all completely invisible to C programs.

C is an abstraction of how a computer works, and chip makers work hard to implement that abstraction, but they do it on top of much more complicated machinery.

C is far removed from modern computer architectures: there have been 50 years of innovation since it was created in the 1970s. The gap between C’s model and modern hardware is at the root of famous vulnerabilities like Meltdown and Spectre, as explained in C Is Not a Low-level Language.

C can teach you useful things, like how memory is a huge array of bytes, but you can also learn that without writing C programs. People say C teaches you about memory allocation. Yes, it does, but you can learn what that means as a concept without learning a programming language. And besides, what will Python or Ruby developers do with that knowledge other than appreciate that their languages do that work for them, so they no longer have to think about it?

Pointers came up a lot in the Mastodon replies. Pointers underpin concepts in higher-level languages, but you can explain those concepts as references instead, and skip pointer arithmetic, aliasing, and null pointers completely.
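
For example, the reference behavior that pointers underpin can be demonstrated entirely in a high-level language. A quick Python illustration:

>>> a = [1, 2, 3]
>>> b = a           # b is another reference to the same list object
>>> b.append(4)
>>> a               # the change is visible through both names
[1, 2, 3, 4]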

A question I asked a number of people: what mistakes do JavaScript/Ruby/Python developers make if they don’t know these things (C, syscalls, pointers)? I didn’t get strong answers.

We work in an enormous tower of abstractions. I write programs in Python, which provides me abstractions that C (its underlying implementation language) does not. C provides an abstract model of memory and CPU execution which the computer implements on top of other mechanisms (microcode and virtual memory). When I made a wire-wrapped computer, I could pretend the signal travelled through wires instantaneously. For other hardware designers, that abstraction breaks down and they need to consider the speed electricity travels. Sometimes you need to go one level deeper in the abstraction stack to understand what’s going on. Everyone has to find the right layer to work at.

Andy Gocke said it well:

When you no longer have problems at that layer, that’s when you can stop caring about that layer. I don’t think there’s a universal level of knowledge that people need or is sufficient.

“like jam or bootlaces” made another excellent point:

There’s a big difference between “everyone should know this” and “someone should know this” that seems to get glossed over in these kinds of discussions.

C can teach you many useful and interesting things. It will make you a better programmer, just as learning any new-to-you language will, because it broadens your perspective. Some kinds of programming need C, though other languages like Rust are ably filling that role now too. C doesn’t teach you how a computer really works. It teaches you a common abstraction of how computers work.

Find a level of abstraction that works for what you need to do. When you have trouble there, look beneath that abstraction. You won’t be seeing how things really work, you’ll be seeing a lower-level abstraction that could be helpful. Sometimes what you need will be an abstraction one level up. Is your Python loop too slow? Perhaps you need a C loop. Or perhaps you need numpy array operations.
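
To make that last point concrete, here’s a sketch of the same computation at two levels of abstraction (assuming numpy is installed):

import numpy as np

values = list(range(1_000_000))

# One level: a plain Python loop.
total = 0
for x in values:
    total += x * x

# One level up: the same computation as a numpy array operation,
# which runs the loop in compiled code instead of the interpreter.
arr = np.array(values, dtype=np.int64)
total = int((arr * arr).sum())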

You (probably) don’t need to learn C.

Randomly sub-setting test suites

Sunday 14 January 2024

I needed to run random subsets of my test suite to narrow down the cause of some mysterious behavior. I didn’t find an existing tool that worked the way I wanted, so I cobbled something together.

I wanted to run 10 random tests (out of 1368), and keep choosing randomly until I saw the bad behavior. Once I had a selection of 10, I wanted to be able to whittle it down to try to reduce it further.

I tried a few different approaches, and here’s what I came up with: two tools in the coverage.py repo that combine to do what I want:

  • A pytest plugin (select_plugin.py) that lets me run a command to output the names of the exact tests I want to run (sketched just after this list),
  • A command-line tool (pick.py) to select random lines of text from a file. For convenience, blank or commented-out lines are ignored.
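
To show the shape of the plugin idea, here’s a rough sketch of how a plugin like select_plugin.py could work (a simplified sketch of mine, not the actual code): add a --select-cmd option, run that command, and keep only the collected tests whose node IDs appear in its output:

# conftest.py: a simplified sketch of a select_plugin.py-style plugin.
import subprocess

def pytest_addoption(parser):
    parser.addoption(
        "--select-cmd", default=None,
        help="Shell command that prints the node IDs of the tests to run.",
    )

def pytest_collection_modifyitems(config, items):
    cmd = config.getoption("--select-cmd")
    if not cmd:
        return
    # Run the command, treating each non-blank output line as a node ID.
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True, check=True)
    wanted = {line.strip() for line in out.stdout.splitlines() if line.strip()}
    items[:] = [item for item in items if item.nodeid in wanted]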

More details are in the comment at the top of pick.py, but here’s a quick example:

  1. Get all the test names in tests.txt. These are pytest “node” specifications:
    pytest --collect-only | grep :: > tests.txt
  2. Now tests.txt has a line per test node. Some are straightforward:
    tests/test_cmdline.py::CmdLineStdoutTest::test_version
    tests/test_html.py::HtmlDeltaTest::test_file_becomes_100
    tests/test_report_common.py::ReportMapsPathsTest::test_map_paths_during_html_report
    but with parameterization they can be complicated:
    tests/test_files.py::test_invalid_globs[bar/***/foo.py-***]
    tests/test_files.py::FilesTest::test_source_exists[a/b/c/foo.py-a/b/c/bar.py-False]
    tests/test_config.py::ConfigTest::test_toml_parse_errors[[tool.coverage.run]\nconcurrency="foo"-not a list]
  3. Run a random bunch of 10 tests:
    pytest --select-cmd="python pick.py sample 10 < tests.txt"
    We’re using --select-cmd to specify the shell command that will output the names of tests. Our command uses pick.py to select 10 random lines from tests.txt.
  4. Run many random bunches of 10, announcing the seed each time:
    for seed in $(seq 1 100); do
        echo seed=$seed
        pytest --select-cmd="python pick.py sample 10 $seed < tests.txt"
    done
  5. Once you find a seed that produces the small batch you want, save that batch:
    python pick.py sample 10 17 < tests.txt > bad.txt
  6. Now you can run that bad batch repeatedly:
    pytest --select-cmd="cat bad.txt"
  7. To reduce the bad batch, comment out lines in bad.txt with a hash character, and the tests will be excluded. Keep editing until you find the small set of tests you want.
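
The heart of pick.py is small. Here’s a minimal sketch of the “sample” behavior used above (a hypothetical reimplementation of mine, not the actual pick.py): skip blank and commented-out lines, seed the random number generator if a seed was given, and print a random sample:

# A minimal sketch of pick.py's "sample" mode, not the actual code.
import random
import sys

def main(args):
    assert args[0] == "sample"
    num = int(args[1])
    if len(args) > 2:
        random.seed(args[2])
    # Blank lines and commented-out lines are ignored.
    lines = [
        line for line in sys.stdin
        if line.strip() and not line.lstrip().startswith("#")
    ]
    for line in random.sample(lines, num):
        print(line, end="")

if __name__ == "__main__":
    main(sys.argv[1:])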

I like that this works and I understand it. I like that it’s based on the bedrock of text files and shell commands. I like that there’s room for different behavior in the future by adding to how pick.py works. For example, it doesn’t do any bisecting now, but it could be adapted to do it.

As usual, there might be a better way to do this, but this works for me.

Coverage.py with sys.monitoring

Wednesday 27 December 2023

New in Python 3.12 is sys.monitoring, a lighter-weight way to monitor the execution of Python programs. Coverage.py 7.4.0 can now optionally use sys.monitoring instead of sys.settrace, the facility that has underpinned coverage.py for nearly two decades. This is a big change, both in Python and in coverage.py. It would be great if you could try it out and provide some feedback.
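
The sys.monitoring API itself is pleasantly small. Here’s a toy demonstration of the mechanism (my sketch, much simplified from what coverage.py actually does), recording which lines execute:

# Toy sys.monitoring demo (Python 3.12+), not coverage.py's real code.
import sys

TOOL = sys.monitoring.COVERAGE_ID
sys.monitoring.use_tool_id(TOOL, "demo")

lines_seen = set()

def line_seen(code, line_number):
    lines_seen.add((code.co_filename, line_number))
    # Returning DISABLE stops events for this exact line, so each line
    # costs at most one callback.  This is the big difference from
    # sys.settrace, which fires for every line, every time.
    return sys.monitoring.DISABLE

sys.monitoring.register_callback(TOOL, sys.monitoring.events.LINE, line_seen)
sys.monitoring.set_events(TOOL, sys.monitoring.events.LINE)

def demo():
    for _ in range(3):
        pass

demo()

sys.monitoring.set_events(TOOL, 0)   # 0 == no events: turn monitoring off
print(sorted(lines_seen))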

Using sys.monitoring should reduce the overhead of coverage measurement, often to less than 5%, though of course your timings might be different. One of the things I would like to know is what your real-world speed improvements are like.

Because the support is still a bit experimental, you need to define an environment variable to use it: COVERAGE_CORE=sysmon. Eventually, sys.monitoring will be automatically used where possible, but for now you need to explicitly request it.
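
For example, to measure a pytest-based test suite with the new core (any coverage run invocation works the same way):

COVERAGE_CORE=sysmon coverage run -m pytest
coverage report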

Some things won’t work with sys.monitoring: plugins and dynamic contexts aren’t yet supported, though eventually they will be. Execution will be faster for line coverage, but not yet for branch coverage. Let me know how it works for you.

This has been in the works since at least March. I hope I haven’t forgotten something silly in getting it out the door.

Real-world match/case

Sunday 10 December 2023

Python 3.10 brought us structural pattern matching, better known as match/case. At first glance, it looks like a switch statement from C or JavaScript, but it’s very different.

You can use match/case to match specific literals, similar to how switch statements work, but the point is to match patterns in the structure of data, not just values. PEP 636: Structural Pattern Matching: Tutorial does a good job explaining the mechanics, but its examples feel like toys.

Here’s a real-world use: at work we have a GitHub bot installed as a webhook. When something happens in one of our repos, GitHub sends a payload of JSON data to our bot. The bot has to examine the decoded payload to decide what to do.

These payloads are complex: they are dictionaries with only 6 or 8 keys, but they are deeply nested, eventually containing a few hundred pieces of data. Originally we were picking them apart to see what keys and values they had, but match/case made the job much simpler.

Here’s some of the code for determining what to do when we get a “comment created” event:

# Check the structure of the payload:
match event:
    case {
        "issue": {"closed_at": closed},
        "comment": {"created_at": commented},
        } if closed == commented:
        # This is a "Close with comment" comment. Don't do anything for the
        # comment, because we'll also get a "pull request closed" event at
        # the same time, and it will do whatever we need.
        pass

    case {"sender": {"login": who}} if who == get_bot_username():
        # When the bot comments on a pull request, it causes an event, which
        # gets sent to webhooks, including us.  We don't have to do anything
        # for our own comment events.
        pass

    case {"issue": {"pull_request": _}}:
        # The comment is on a pull request. Process it.
        return process_pull_request_comment(event)

The first case matches if the dict has an “issue” key containing a dict with a “closed_at” key and also a “comment” key containing a dict with a “created_at” key, and if those two leaves in the dict are equal. Writing out that condition without match/case would be more verbose and confusing.
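
For comparison, here’s roughly what that first case checks when written by hand (a sketch, operating on the same event value as above):

# Approximately the first case, without match/case:
from collections.abc import Mapping

issue = event.get("issue") if isinstance(event, Mapping) else None
comment = event.get("comment") if isinstance(event, Mapping) else None
if (
    isinstance(issue, Mapping) and "closed_at" in issue
    and isinstance(comment, Mapping) and "created_at" in comment
    and issue["closed_at"] == comment["created_at"]
):
    pass    # same "close with comment" handling as above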

The second case examines the event to see if the bot was the originator of the event. This one wouldn’t have been so hard to write in a different way, but match/case makes it nicer.

This is just what match/case is good at: checking patterns in the structure of data.

It’s also interesting to see the bytecode generated. For that first case, it looks like this:

  2           0 LOAD_GLOBAL              0 (event)

  3           2 MATCH_MAPPING
              4 POP_JUMP_IF_FALSE       67 (to 134)
              6 GET_LEN
              8 LOAD_CONST               1 (2)
             10 COMPARE_OP               5 (>=)
             12 POP_JUMP_IF_FALSE       67 (to 134)

  4          14 NOP

  5          16 NOP

  3          18 LOAD_CONST               8 (('issue', 'comment'))
             20 MATCH_KEYS
             22 POP_JUMP_IF_FALSE       65 (to 130)
             24 DUP_TOP
             26 LOAD_CONST               4 (0)
             28 BINARY_SUBSCR

  4          30 MATCH_MAPPING
             32 POP_JUMP_IF_FALSE       64 (to 128)
             34 GET_LEN
             36 LOAD_CONST               5 (1)
             38 COMPARE_OP               5 (>=)
             40 POP_JUMP_IF_FALSE       64 (to 128)
             42 LOAD_CONST               9 (('closed_at',))
             44 MATCH_KEYS
             46 POP_JUMP_IF_FALSE       62 (to 124)
             48 DUP_TOP
             50 LOAD_CONST               4 (0)
             52 BINARY_SUBSCR
             54 ROT_N                    7
             56 POP_TOP
             58 POP_TOP
             60 POP_TOP
             62 DUP_TOP
             64 LOAD_CONST               5 (1)
             66 BINARY_SUBSCR

  5          68 MATCH_MAPPING
             70 POP_JUMP_IF_FALSE       63 (to 126)
             72 GET_LEN
             74 LOAD_CONST               5 (1)
             76 COMPARE_OP               5 (>=)
             78 POP_JUMP_IF_FALSE       63 (to 126)
             80 LOAD_CONST              10 (('created_at',))
             82 MATCH_KEYS
             84 POP_JUMP_IF_FALSE       61 (to 122)
             86 DUP_TOP
             88 LOAD_CONST               4 (0)
             90 BINARY_SUBSCR
             92 ROT_N                    8
             94 POP_TOP
             96 POP_TOP
             98 POP_TOP
            100 POP_TOP
            102 POP_TOP
            104 POP_TOP
            106 STORE_FAST               0 (closed)
            108 STORE_FAST               1 (commented)

  6         110 LOAD_FAST                0 (closed)
            112 LOAD_FAST                1 (commented)
            114 COMPARE_OP               2 (==)
            116 POP_JUMP_IF_FALSE       70 (to 140)

 10         118 LOAD_CONST               0 (None)
            120 RETURN_VALUE

  3     >>  122 POP_TOP
        >>  124 POP_TOP
        >>  126 POP_TOP
        >>  128 POP_TOP
        >>  130 POP_TOP
            132 POP_TOP
        >>  134 POP_TOP
            136 LOAD_CONST               0 (None)
            138 RETURN_VALUE

  6     >>  140 LOAD_CONST               0 (None)
            142 RETURN_VALUE

That’s a lot, but you can see roughly what it’s doing: check if the value is a mapping (dict) with at least two keys (bytecodes 2–12), then check if it has the two specific keys we’ll be examining (18–22). Look at the value of the first key, check if it’s a dict with at least one key (24–40), and so on.

Hand-writing these sorts of checks might result in shorter bytecode. For example, I already know the event value is a dict, since that is what the GitHub API promises me, so there’s no need to check it explicitly each time. But the Python code would be twistier and harder to get right. I was initially a skeptic about match/case, but this example shows a clear benefit.

Say it again: values not expressions

Wednesday 29 November 2023

Sometimes you can explain a simple thing for the thousandth time, and come away with a deeper understanding yourself. It happened to me the other day with Python mutable argument default values.

This is a classic Python “gotcha”: you can provide a default value for a function argument, but it will only be evaluated once:

>>> def doubled(item, the_list=[]):
...     the_list.append(item)
...     the_list.append(item)
...     return the_list
...
>>> print(doubled(10))
[10, 10]
>>> print(doubled(99))
[10, 10, 99, 99]    # WHAT!?

I’ve seen people be surprised by this and ask about it countless times. And countless times I’ve said, “Yup, the value is only calculated once, and stored on the function.”
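
The usual advice that goes with that answer, for completeness (the idiom is standard, the sketch is mine): use None as the default, and create the fresh list inside the function:

>>> def doubled(item, the_list=None):
...     if the_list is None:
...         the_list = []    # a new list on every call
...     the_list.append(item)
...     the_list.append(item)
...     return the_list
...
>>> print(doubled(10))
[10, 10]
>>> print(doubled(99))
[99, 99]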

But recently I heard someone answer with, “it’s a value, not an expression,” which is a good succinct way to say it. And when a co-worker brought it up again the other day, I realized it’s right in the name: people ask about “default values,” not “default expressions.” Of course it’s calculated only once: it’s a default value, not a default expression. Somehow answering the question for the thousandth time made those words click into place and make a connection I hadn’t noticed before.

Maybe this seems obvious to others who have been fielding this question, but to me it was a satisfying alignment of the terminology and the semantics. I’d been using the words for years, but hadn’t seen them as so right before.

This is one of the reasons I’m always interested to help new learners: even well-trodden paths can reveal new insights.
