Computing a GitHub Action matrix with cog

Sunday 7 November 2021

I had a complex three-axis GitHub Action matrix, but needed to skip some combinations. I couldn’t get what I needed with the direct YAML syntax, so I used Cog to generate the matrix with Python.

The matrix made Python wheels with cibuildwheel, and it worked. It had 15 jobs, but they built different numbers of architectures (ubuntu made three, windows made two, macos made only one). This made the overall run take longer, and made it harder to dig through logs to see if everything went OK. Conceptually, the matrix was three-axis, but expressed as two-axis, with a list of architectures for each job:

strategy:
  matrix:
    os:
      - ubuntu-latest
      - macos-latest
      - windows-latest
    cibw_build:
      - cp36
      - cp37
      - cp38
      - cp39
      - cp310
    include:
      - os: ubuntu-latest
        cibw_arch: x86_64 i686 aarch64
      - os: windows-latest
        cibw_arch: x86 AMD64
      - os: macos-latest
        cibw_arch: x86_64

I wanted to make the architectures a third axis, but couldn’t figure out how to use the YAML syntax to limit the choices for each OS. It seemed like the only way to get a ragged three-axis matrix was to list the combinations explicitly. If you know how to do it, I’d still be interested to hear.

What I wanted was a way to compute the matrix with a bit more power. There are examples out there of using fromJSON to build a matrix, but I didn’t need it to be recomputed every run. I just wanted a way to not have to type out 30 combinations by hand.
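For reference, the fromJSON pattern looks roughly like this: one job computes a JSON string, and a second job uses it as its matrix. This is only a sketch of the documented pattern, with made-up job names and a one-entry matrix for brevity:

jobs:
  gen:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set.outputs.matrix }}
    steps:
      - id: set
        run: echo '::set-output name=matrix::{"include":[{"os":"ubuntu","py":"cp36","arch":"x86_64"}]}'
  build:
    needs: gen
    strategy:
      matrix: ${{ fromJSON(needs.gen.outputs.matrix) }}

That’s an extra job, and a matrix recomputed on every run: more machinery than this problem needed.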

I’ve often needed this sort of thing: a static file with just a bit of computed content. This is what Cog was meant for, and it worked great here too. This is what my computed matrix looks like now:

strategy:
  matrix:
    include:
      # To change the matrix, edit the choices, then process this file with cog:
      #
      # $ python -m pip install cogapp
      # $ python -m cogapp -rP .github/workflows/kit.yml
      #
      # [[[cog
      #   #----- vvv Choices for the matrix vvv -----
      #   oss = ["ubuntu", "macos", "windows"]
      #   pys = ["cp36", "cp37", "cp38", "cp39", "cp310"]
      #   archs = {
      #       "ubuntu": ["x86_64", "i686", "aarch64"],
      #       "macos": ["x86_64"],
      #       "windows": ["x86", "AMD64"],
      #   }
      #   #----- ^^^ ---------------------- ^^^ -----
      #
      #   import json
      #   for the_os in oss:
      #       for the_py in pys:
      #           for the_arch in archs[the_os]:
      #               them = {
      #                   "os": the_os,
      #                   "py": the_py,
      #                   "arch": the_arch,
      #               }
      #               print(f"- {json.dumps(them)}")
      # ]]]
      - {"os": "ubuntu", "py": "cp36", "arch": "x86_64"}
      - {"os": "ubuntu", "py": "cp36", "arch": "i686"}
      - {"os": "ubuntu", "py": "cp36", "arch": "aarch64"}
      - {"os": "ubuntu", "py": "cp37", "arch": "x86_64"}
      - {"os": "ubuntu", "py": "cp37", "arch": "i686"}
      - {"os": "ubuntu", "py": "cp37", "arch": "aarch64"}
      - {"os": "ubuntu", "py": "cp38", "arch": "x86_64"}
      - {"os": "ubuntu", "py": "cp38", "arch": "i686"}
      - {"os": "ubuntu", "py": "cp38", "arch": "aarch64"}
      - {"os": "ubuntu", "py": "cp39", "arch": "x86_64"}
      - {"os": "ubuntu", "py": "cp39", "arch": "i686"}
      - {"os": "ubuntu", "py": "cp39", "arch": "aarch64"}
      - {"os": "ubuntu", "py": "cp310", "arch": "x86_64"}
      - {"os": "ubuntu", "py": "cp310", "arch": "i686"}
      - {"os": "ubuntu", "py": "cp310", "arch": "aarch64"}
      - {"os": "macos", "py": "cp36", "arch": "x86_64"}
      - {"os": "macos", "py": "cp37", "arch": "x86_64"}
      - {"os": "macos", "py": "cp38", "arch": "x86_64"}
      - {"os": "macos", "py": "cp39", "arch": "x86_64"}
      - {"os": "macos", "py": "cp310", "arch": "x86_64"}
      - {"os": "windows", "py": "cp36", "arch": "x86"}
      - {"os": "windows", "py": "cp36", "arch": "AMD64"}
      - {"os": "windows", "py": "cp37", "arch": "x86"}
      - {"os": "windows", "py": "cp37", "arch": "AMD64"}
      - {"os": "windows", "py": "cp38", "arch": "x86"}
      - {"os": "windows", "py": "cp38", "arch": "AMD64"}
      - {"os": "windows", "py": "cp39", "arch": "x86"}
      - {"os": "windows", "py": "cp39", "arch": "AMD64"}
      - {"os": "windows", "py": "cp310", "arch": "x86"}
      - {"os": "windows", "py": "cp310", "arch": "AMD64"}
    # [[[end]]]

If you haven’t seen cog before, this is how it works: it finds chunks of Python code between [[[cog and ]]] markers, executes them, and inserts the output into the file up to the [[[end]]] marker. Existing output is replaced.
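As a tiny standalone illustration (a made-up file, not part of the workflow above), suppose sample.txt contains:

# [[[cog
#   for n in [1, 2, 3]:
#       print(f"item {n}")
# ]]]
# [[[end]]]

Running python -m cogapp -rP sample.txt rewrites the file in place. Cog strips the comment prefix from the code lines, runs them, and the -P flag captures print() output into the file, so three new lines appear between the ]]] and [[[end]]] markers:

# [[[cog
#   for n in [1, 2, 3]:
#       print(f"item {n}")
# ]]]
item 1
item 2
item 3
# [[[end]]]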

Here, the 30 lines of combinations are the output. They weren’t in the file originally; they were created when I ran cog and it rewrote the whole file. If I change the lists of choices, or the Python code, and re-run cog, it will remove those 30 lines and replace them with the new output.

This is perfect for this use: the matrix choices will change only infrequently, and by hand. When they need to change, I can edit the lists in the Python code and run cog again to update the generated matrix.

Coverage goals

Monday 1 November 2021

There’s a feature request to add a per-file coverage threshold to coverage.py. I didn’t add the feature; instead, I wrote a proof-of-concept: goals.py.

Coverage.py has a --fail-under option that will check the total coverage percentage, and exit with a failing status if it is too low. This lets people set a goal, and then check that they are meeting it in their CI systems.
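For example, this reports coverage and fails if the total is below 85%:

$ coverage report --fail-under=85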

The feature request is to check each file individually, rather than the project as a whole, to exert tighter control over the goal. That sounds fine, but I could see it would actually be more complicated, because people sometimes have more nuanced goals: 100% coverage in tests and 85% in product code, or whatever.

I suggested implementing it as a separate tool that used data from a JSON report. Then, I did just that.
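The JSON report comes from coverage.py itself:

$ coverage json

This writes coverage.json (the default output file), with per-file measurements that a tool like goals.py can read.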

The goals.py tool is flexible: you give it a percentage number, and then a list of glob patterns. It collects up the files that match the patterns, and checks the coverage of that set of files. You can choose to measure the group as a whole, or each file individually. Patterns can be negated to remove files from consideration.

For example:

# Check all Python files collectively, except in the tests/ directory.
$ python goals.py --group 85 '**/*.py' '!tests/*.py'

# We definitely want complete coverage of anything related to html.
$ python goals.py --group 100 '**/*html*.py'

# No Python file should be below 90% covered.
$ python goals.py --file 90 '**/*.py'

Each run of goals.py checks one set of files against one goal, but you can run it multiple times if you want to check multiple goals.
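For example, a CI step could chain a few goals together, assuming goals.py follows the --fail-under convention of exiting non-zero when a goal isn’t met:

$ python goals.py --group 100 '**/*html*.py' && \
  python goals.py --file 90 '**/*.py'

The && stops at the first unmet goal; run them as separate steps if you want to see all the failures at once.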

If you want to have more control over your coverage goals, give goals.py a try. It might turn into a full-fledged coverage.py feature, or maybe it’s enough as it is.

Feedback is welcome, either here or on the original feature request.

Django Chat podcast

Wednesday 13 October 2021

I had a fun conversation with Will Vincent and Carlton Gibson on the Django Chat podcast.

Things we talked about:

  • Walking
  • Right and wrong ways to do things
  • Geographic meetups during virtual times
  • Open source attention
  • Coverage.py
  • The evolution of the Python standard library
  • Python 3.10’s trace behavior
  • Coverage as a measure of test quality
  • UX of test information
  • Developer gamification
  • Upgrading Django with third-party packages
  • Convincing people to test
  • Using non-public interfaces
  • Cog
  • Side projects as outlets
  • Rewriting my wacky personal site (this site)
  • edX being acquired by 2U
  • Open source from first principles

300 walks

Monday 27 September 2021

I’ve been continuing the walking I described in Pandemic walks, and have now completed 300 such walks, totaling 1648 miles. Walking new streets every day, but always starting from the same point, actually means walking a lot of the same streets every day.

Here are three map thumbnails, showing the new streets visited in the first hundred walks, the next hundred, and the third hundred:

New streets in the first hundred walks

New streets in the second hundred walks

New streets in the third hundred walks

Although the total distance is longer in the last third than in the first, the number of new streets covered is much smaller, because of how far I had to go just to reach new streets.

But the walks will continue, still targeting new streets. It’s a great way to get out, see new things, and get some exercise.

Here’s an animation of the 300 walks:

Animation of a map showing every walk I've taken

On to 400!

Update (Sept 29): Today was my 301st walk, and my mapping toolchain started the thumbnail for the fourth hundred walks, but it’s underwhelming since it only shows the new streets covered by one walk...

New streets in the fourth hundred walks (so far: just one walk)

BTW: I wrote about how I do the mapping back in February: Mapping walks. The new thing here is how I do the 100-walk new-street maps. Low tech: to make the 201–300 map, I draw walks 1–300 with a thin black line, then I draw walks 1–200 with a thicker white line. Only the new streets remain.
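The drawing code isn’t shown in that post, but the layering trick is easy to sketch. Here’s a hypothetical version using matplotlib, with placeholder data; the real toolchain may differ:

import matplotlib.pyplot as plt

# Placeholder walks: each is a pair of coordinate lists (xs, ys), in the
# order the walks were taken. Real data would come from GPS traces.
walks = [([0, i % 17], [0, i % 11]) for i in range(300)]

def new_streets_map(walks, start, end, filename):
    """Show only the streets first visited in walks[start:end]."""
    fig, ax = plt.subplots()
    # Draw every walk up through the end of the range: thin black lines.
    for xs, ys in walks[:end]:
        ax.plot(xs, ys, color="black", linewidth=0.5)
    # Over-draw the earlier walks with thicker white lines to erase them,
    # leaving only the streets that were new in this range.
    for xs, ys in walks[:start]:
        ax.plot(xs, ys, color="white", linewidth=2.0)
    ax.set_axis_off()
    fig.savefig(filename, dpi=200)
    plt.close(fig)

# The 201-300 map: walks 1-300 in thin black, then walks 1-200 in thick white.
new_streets_map(walks, start=200, end=300, filename="walks_201_300.png")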

Real Django site

Monday 13 September 2021

Big changes behind the scenes here at nedbatchelder.com, but only a small change for you.

My hosting provider was being acquired, and they said they would migrate my site to the new host. Then they wrote last month to say they couldn’t migrate it (no word why), and that I had six weeks to find a new home.

I briefly tried to just move the site as it was, but PHP 5 was in the mix. Rather than learn how to move it to PHP 7, I bit the bullet and converted it to a real Django-served site.

For 13 years this site has been built with Django, but served as static HTML pages. The comments were handled by PHP code. As part of this move, the site is now served directly by Django on the host, with Django-implemented comments.

This should all be invisible to readers of the site, except for one thing: comments are now written as Markdown instead of as neutered HTML. Having a Django foundation means I will be able to make changes more easily in the future.

Behind the scenes, there is still plenty of strange tech: content is in XML, loaded into a SQLite database locally, then rsync’ed to the server.

Some dormant areas of the site aren’t serving properly yet, but the important stuff works. If you see a problem, please let me know.
