Ned Batchelder's blog https://nedbatchelder.com/blog Ned Batchelder's personal blog. en-US

Secure maintainer workflow https://nedbatchelder.com/blog/202211/secure_maintainer_workflow.html 2022-11-21T06:06:06-05:00 Ned Batchelder I’m trying to establish a more secure workflow for maintaining public packages.

Like most developers, I have terminal sessions with implicit access to credentials. For example, I can make git commits and push to GitHub without having to type a password.

There are two ways this might be a problem. The first is unlikely: a bad guy gets onto my computer and uses the credentials to cause havoc. This is unlikely mostly because a bad guy won’t get my computer, but also, if it does fall into the wrong hands, it will probably be someone looking to resell the laptop, not use my coverage.py credentials maliciously.

The second way is a more serious concern: I could unknowingly run evil or buggy code that uses my credentials in bad ways. People write bug reports for coverage.py, and if I am lucky, they include steps to reproduce the problem. Sometimes the instructions involve small self-contained examples, and I can just run them without fear. But sometimes the steps are “clone this repo, and run this large test suite.” It’s impossible to review all of that code. I don’t know what the code will do, but if I want to see and diagnose the problem, I have to run it.

I’m trying to reduce the possibilities for bad outcomes, in a few ways:

1Password: where possible, I store credentials in 1Password, and use tooling to get them into environment variables. I have two shell functions (opvars / unopvars) that find values in a vault based on the current directory, and can set and unset them in the environment.

With this, I can have the credentials in the environment for just long enough to use them. This works well for things like PyPI credentials, which are used rarely and could cause significant damage.
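The “just long enough” idea can be sketched in Python as a context manager. This is only an illustration, not the opvars / unopvars shell functions themselves: the names temporary_env and DEMO_PYPI_TOKEN are made up, and the secret is passed in directly rather than read from a 1Password vault.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(**secrets):
    """Set environment variables for the duration of a with-block, then restore."""
    # Remember the previous values (None means "was not set").
    saved = {name: os.environ.get(name) for name in secrets}
    os.environ.update(secrets)
    try:
        yield
    finally:
        for name, old in saved.items():
            if old is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old


# Hypothetical usage: the token is only visible inside the block.
with temporary_env(DEMO_PYPI_TOKEN="not-a-real-token"):
    inside = os.environ.get("DEMO_PYPI_TOKEN")
outside = os.environ.get("DEMO_PYPI_TOKEN")
```

Any program started inside the block inherits the variable; anything started afterwards does not.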

But I still have implicit credentials in my ~/.ssh directory and ~/.netrc file. I’m not sure of the best approach to keep them from being available to programs that shouldn’t have them.

Docker: To really isolate unknown code, I use a Docker container. I start with a base image with many versions of Python: base.dockerfile, and then build on it to create a main image that doesn’t even have sudo. In the container, there are no credentials, so I don’t have to worry about malice or accidents. For involved debugging, I might write another Dockerfile FROM these to reduce the re-work that has to happen when starting over.

What else can I be doing to keep safe?

Ideal open source https://nedbatchelder.com/blog/202210/ideal_open_source.html 2022-10-29T07:37:17-04:00 Ned Batchelder A friend asked my opinion about DHH’s essay from last year: I won’t let you pay me for my open source.

He makes some interesting points about how Bill Gates and Richard Stallman, the poster children for capitalism and open source, normally considered polar opposites, are actually driven by the same economic fears of scarcity. Gates’ fear is that people will use software without paying, Stallman’s fear is that they won’t contribute.

But then he moves into his ideal view of open source, which is that it can be completely removed from these economic fears and constraints, because there is no scarcity. If I write software, and you use it without paying or contributing, I am not harmed. And your actions also don’t harm other users who might pay or contribute. So there’s no scarcity!

He goes on to celebrate writing software as a form of self-actualization. Creating something for the pure joy of creation, with no expectation of anything in return. He touches on autonomy, mastery, and purpose as three pillars of motivation to act.

I totally understand this view. I write side projects. I like that I can choose what to write and how to write it (autonomy). I learn while writing, and enjoy learning (mastery). And I can decide for each project why I’m writing it (purpose).

DHH eventually gets to the core of his philosophical outlook:

There is no universal meaning to life. You’ve been thrown into the world without a preordained purpose, which is both a terrible burden to bear and the ultimate freedom to embrace. You get to decide.

Do I get to decide? In some sense, yes, but actually, I can only choose among the options available to me.

  • I want to work on a side-project full-time!
    Oops, no, that’s not a choice, because I have to pay for food and shelter.
  • I have a day job, but want to have time and energy to maintain a project!
    Reality: I’m tired after a full day of work, and my family needs me, and there are other chores to do.

To get back to purpose: my ideal for open source is that I write something useful, and other people like it and use it. The more people the better.

I’m lucky, because I’ve managed that: coverage.py is widely used, and I am known because of it. But there’s a downside: 221 open issues, many of which are complex, or vague, or require difficult decisions.

This idea that I can decide my purpose feels a bit like a genie granting me a wish, and then I have to deal with the unanticipated plot twists that result.

A project like coverage.py can eventually become a lot of difficult time-consuming not-fun work. It’s true, I am not obligated to continue working on it, or to answer those issues. But that wouldn’t fulfill the purpose I was aiming for. I didn’t start a project so that I could abandon it when lots of people are using it. I didn’t start a project so that I could ignore the problems people are having. I don’t get to decide my purpose. I can only choose among the options that reality offers to me.

Nadia Asparouhova (formerly Eghbal) studied open source and understood that there is scarcity: a scarcity of maintainer attention. (If you want to read more, she wrote a book: Working in Public: The Making and Maintenance of Open Source Software, and also a Ford Foundation research report: Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure.)

When open source maintainers complain that big companies use their software without providing support, DHH is right: there was no transaction promised. He’s right that there’s a kind of freedom to avoiding those transactions.

I think what drives that negative tension between maintainers and “freeloading” users is that missing possibility: there’s a conceivable reality where the maintainers’ open source ideal situations are achievable. Support commensurate with use would let maintainers focus on their projects. Not only would this make maintainers’ realities better match their desires, it would make for better software, which would help everyone.

I’ve written before about corporations and open source: Corporations and open source: why and how. I understand how companies work: they won’t pay for something if they don’t have to. Why are we as a society OK with this? Suppose someone said, “I never tip waiters, why should I? I don’t get anything extra by doing it, and my kids should have that money.” Where I am from, people would be horrified by this attitude.

But companies’ standard operating style is exactly that: “I can get that software free, nothing is requiring me to pay for it, so why should I?” We all shrug and expect nothing different. But why not expect something different? Companies are just large collections of people. Why is it acceptable for a company to act so differently than each of its component people individually?

As an open source maintainer, it feels like my ideal situation is just on the other side of a locked fence, and I can’t get to it. Corporate support could make that a reality, but no one expects it, so it doesn’t happen.

BTW: if you want to help with coverage.py, get in touch!

Decorator shortcuts https://nedbatchelder.com/blog/202210/decorator_shortcuts.html 2022-10-08T13:42:38-04:00 Ned Batchelder When using many decorators in code, there’s a shortcut you can use if you find yourself repeating them. They can be assigned to a variable just like any other Python expression.

Don’t worry if you don’t understand how decorators work under the hood. A decorator is a line like this in your code, usually modifying how a function behaves:

@something(option1, option2)
def my_function(arg1, arg2):
    ... # etc

For this example, it doesn’t really matter what the “something” decorator does. The important thing to know is that everything after the @ sign is a Python expression that is evaluated to get an object that will be applied to the function.

As with other Python expressions, you can give that object a name, and use it later. This produces the same effect:

modifier = something(option1, option2)

@modifier
def my_function(arg1, arg2):
    ... # etc

In this case we haven’t gained much. But let me show you a real example. In the coverage.py test suite, there are unusual conditions that cause tests to fail, and I want to tell pytest that I expect them to fail in those situations. Pytest has a decorator called “pytest.mark.xfail” that can be used to do this.

Here’s one such test:

@pytest.mark.xfail(
    env.PYVERSION[:2] == (3, 8) and env.PYPY and env.PYPYVERSION >= (7, 3, 10),
    reason="Avoid a PyPy bug: https://foss.heptapod.net/pypy/pypy/-/issues/3749",
)
def test_something():
    ...

(Yes, it’s a bit crazy, but a bug in PyPy 3.8 version 7.3.10 or greater causes some of my tests to fail. Coverage.py tries to closely follow small differences between implementations, so it’s not unusual to have to excuse a test that doesn’t work in very specific circumstances.)

The real problem though was that eleven tests failed in this situation. I didn’t want to copy those four lines into three different test files and explicitly decorate eleven tests. So I defined a shortcut in a helper file:

xfail_pypy_3749 = pytest.mark.xfail(
    env.PYVERSION[:2] == (3, 8) and env.PYPY and env.PYPYVERSION >= (7, 3, 10),
    reason="Avoid a PyPy bug: https://foss.heptapod.net/pypy/pypy/-/issues/3749",
)

Then in the test files, I can do this:

from tests.helpers import xfail_pypy_3749

@xfail_pypy_3749
def test_something():
    ...

@xfail_pypy_3749
def test_something_else():
    ...

Now I have a compact notation to apply to affected tests, and I can add as much detail to the definition as I like, because it’s only in one place instead of being copied everywhere.
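If you want to try the idea without pytest installed, here is a small self-contained sketch. The expect_failure decorator is a made-up stand-in for pytest.mark.xfail (it only records its arguments on the function), and the test names are placeholders.

```python
import functools


def expect_failure(condition, reason):
    """Hypothetical stand-in for pytest.mark.xfail: record the expectation."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        wrapper.expected_failure = (condition, reason)
        return wrapper
    return decorate


# The expression after "@" is evaluated once here, and the result gets a name.
xfail_demo = expect_failure(True, reason="demo condition")


@xfail_demo
def test_one():
    return "one"


@xfail_demo
def test_two():
    return "two"
```

Both functions end up with the same expected_failure attribute, exactly as if the full expect_failure(...) call had been written above each of them.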

There could be advanced cases where the decorator function needs to be explicitly called for each function, and a shortcut wouldn’t work right, but to be honest I’m not sure what those would be!

Truchet backgrounds https://nedbatchelder.com/blog/202209/truchet_backgrounds.html 2022-09-23T22:11:46-04:00 Ned Batchelder These are some Zoom-sized backgrounds I made with my Truchet tiles that I described in my Truchet images blog post. They make a surprising variety of organic shapes. I’m seeing penguins, frogs, and elephants all over them.

Feel free to use one if you want an abstract but engaging background...

(Fifteen Truchet background images.)
Making a coverage badge https://nedbatchelder.com/blog/202209/making_a_coverage_badge.html 2022-09-19T07:23:16-04:00 Ned Batchelder This is a sketch of how to use GitHub Actions to get a total combined coverage number, and create a badge for your README. There are other approaches too, but this uses some commonly used tools to get the job done.

We’ll use tox to run tests, and GitHub Actions to run tox. A GitHub gist will be used as a scratch file to store parameters for the badge, which will be rendered by shields.io.

Start with the tox.ini that runs your test suite, and also includes a “coverage” environment that combines, reports, and produces a JSON data file:

tox.ini

[tox]
envlist = py37,py38,py39,py310,coverage

[testenv]
commands =
    python -m coverage run -p -m pytest

[testenv:coverage]
basepython = python3.10
commands =
    python -m coverage combine
    python -m coverage report -m --skip-covered
    python -m coverage json

[gh-actions]
python =
    3.7: py37
    3.8: py38
    3.9: py39
    3.10: py310

We’ll use a GitHub action to run tox, but before we get to that, we need two bits of infrastructure. Go to https://gist.github.com and make an empty secret gist. Copy the id of the gist. Here we’ll call it 123abc456def789.

Next we’ll create a personal access token for updating the gist. Go to your GitHub personal access tokens page and click “Generate new token.” Select the “gist” scope, and click “Generate token.” Copy the value displayed; it will look like “ghp_FSfkCeFblahblahblah”. You can’t get the value again, so be careful with it.

In your repo on GitHub, go to Settings - Secrets - Actions, click “New repository secret.” Use “GIST_TOKEN” as the Name, and paste the ghp_etc token as the Secret, then “Add secret.”

Now we’re ready to create the GitHub action. It will run the test suite on many versions of Python, then run the coverage step to combine all the data files. It uses the JSON report to extract a displayable percentage, then uses a third-party GitHub action to create the JSON data in the Gist so that shields.io can display the badge.
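The “extract a displayable percentage” part is a small JSON lookup. Here is the same lookup the workflow does, written out as a script against a stand-in report. The sample value "87" is invented, and the badge dictionary is only my assumption of roughly what ends up in the Gist, based on the shields.io endpoint format (schemaVersion, label, message).

```python
import json
import os
import tempfile

# Stand-in for coverage.json; only the field the workflow reads is included.
sample_report = {"totals": {"percent_covered_display": "87"}}

with tempfile.TemporaryDirectory() as tmpdir:
    report_path = os.path.join(tmpdir, "coverage.json")
    with open(report_path, "w") as f:
        json.dump(sample_report, f)

    # The same lookup as the python -c one-liner in the workflow.
    with open(report_path) as f:
        total = json.load(f)["totals"]["percent_covered_display"]

# Assumed shape of the data shields.io reads from the Gist.
badge = {"schemaVersion": 1, "label": "Coverage", "message": f"{total}%"}
print(json.dumps(badge))
```

Note that percent_covered_display is a string already formatted for display, which is why it can go straight into the badge message.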

The badge is automatically colored: 50% or lower is red, 90% or higher is green, with a gradient between the two, like this:

The spectrum of badge colors.
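To get a feel for that gradient, here is one way such a mapping could be computed. This is a guess at the general idea, not the badge action’s actual code: it assumes a linear slide in hue from red (0 degrees) to green (120 degrees), and the function name badge_color is made up.

```python
import colorsys


def badge_color(percent, low=50, high=90):
    """Map a coverage percentage to a hex color between red and green."""
    # Values outside the range are pinned to the ends.
    clamped = max(low, min(high, percent))
    fraction = (clamped - low) / (high - low)
    hue = fraction * 120 / 360          # 0 is red, 1/3 is green
    red, green, blue = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(red * 255), round(green * 255), round(blue * 255)
    )


for value in (40, 50, 70, 90, 100):
    print(value, badge_color(value))
```

With these assumptions, 70% lands halfway along the range and comes out yellow.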

As a bonus, there’s an action job summary with the coverage total. Here’s the workflow file:

.github/workflows/tests.yaml

# Run tests

name: "Test Suite"

on:
  push:
  pull_request:

defaults:
  run:
    shell: bash

jobs:
  tests:
    name: "Python ${{ matrix.python-version }} on ${{ matrix.os }}"
    runs-on: "${{ matrix.os }}"

    strategy:
      fail-fast: false
      matrix:
        os:
          - ubuntu-latest
          - macos-latest
          - windows-latest
        python-version:
          - "3.7"
          - "3.8"
          - "3.9"
          - "3.10"

    steps:
      - name: "Check out the repo"
        uses: "actions/checkout@v2"

      - name: "Set up Python"
        uses: "actions/setup-python@v2"
        with:
          python-version: "${{ matrix.python-version }}"

      - name: "Install dependencies"
        run: |
          python -m pip install tox tox-gh-actions

      - name: "Run tox for ${{ matrix.python-version }}"
        run: |
          python -m tox

      - name: "Upload coverage data"
        uses: actions/upload-artifact@v3
        with:
          name: covdata
          path: .coverage.*

  coverage:
    name: Coverage
    needs: tests
    runs-on: ubuntu-latest
    steps:
      - name: "Check out the repo"
        uses: "actions/checkout@v2"

      - name: "Set up Python"
        uses: "actions/setup-python@v2"
        with:
          python-version: "3.10"

      - name: "Install dependencies"
        run: |
          python -m pip install tox tox-gh-actions

      - name: "Download coverage data"
        uses: actions/download-artifact@v3
        with:
          name: covdata

      - name: "Combine"
        run: |
          python -m tox -e coverage
          export TOTAL=$(python -c "import json;print(json.load(open('coverage.json'))['totals']['percent_covered_display'])")
          echo "total=$TOTAL" >> $GITHUB_ENV
          echo "### Total coverage: ${TOTAL}%" >> $GITHUB_STEP_SUMMARY

      - name: "Make badge"
        uses: schneegans/dynamic-badges-action@v1.4.0
        with:
          # GIST_TOKEN is a GitHub personal access token with scope "gist".
          auth: ${{ secrets.GIST_TOKEN }}
          gistID: 123abc456def789   # replace with your real Gist id.
          filename: covbadge.json
          label: Coverage
          message: ${{ env.total }}%
          minColorRange: 50
          maxColorRange: 90
          valColorRange: ${{ env.total }}

Now the badge can be displayed with a URL like this, but replace YOUR_GITHUB_NAME with your GitHub name, and 123abc456def789 with your real Gist id:

https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/YOUR_GITHUB_NAME/123abc456def789/raw/covbadge.json

Consult the docs for your markup language of choice for how to use the image URL to display the badge.

BTW: the files here are simplified versions of the action and tox.ini from scriv, if you are interested.

Stilted https://nedbatchelder.com/blog/202208/stilted.html 2022-08-27T07:53:01-04:00 Ned Batchelder For fun this summer, I implemented part of the PostScript language, using PyCairo for rendering. I call it Stilted. Implementing a language is an interesting exercise. You always learn some things along the way.

Executable bit: All objects in PostScript have a literal/executable bit that can be changed with the cvx (convert to executable) and cvlit (convert to literal) operators. Literal arrays are delimited by square brackets, executable arrays (procedures) are in curly braces. Like in Python and JavaScript, multiple references share storage. But oddly, in PostScript, you can duplicate an object on the stack, and change its executable bit, and now you have two references to the same storage, but with different attributes.

Here’s an example using GhostScript (a third-party conforming implementation):

GS>      [1 2 3] dup         % make an array and duplicate it
GS<2>    cvx                 % make the top one executable
GS<2>    pstack              % print the stack
{1 2 3}
[1 2 3]
GS<2>    dup 1 99 put        % change the second element
GS<2>    pstack              % both objects share the storage
{1 99 3}
[1 99 3]
GS<2>

The executable attribute is part of the reference, not part of the object!? This doesn’t seem like a planned and desired outcome: it seems like a side-effect of a common C technique: using low bits of a pointer to store flags.
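A toy model in Python makes the GhostScript behavior easier to see. This is not how GhostScript or Stilted is implemented; the Ref class and its methods are invented for illustration. The flag lives on the reference, and the list is shared.

```python
from dataclasses import dataclass


@dataclass
class Ref:
    """A reference: shared storage plus a per-reference executable flag."""
    storage: list
    executable: bool = False

    def dup(self):
        # The duplicate points at the same list, but carries its own flag.
        return Ref(self.storage, self.executable)

    def show(self):
        body = " ".join(str(item) for item in self.storage)
        return "{" + body + "}" if self.executable else "[" + body + "]"


bottom = Ref([1, 2, 3])      # [1 2 3]
top = bottom.dup()           # dup
top.executable = True        # cvx
top.storage[1] = 99          # 1 99 put

print(top.show())
print(bottom.show())
```

Changing the flag on one reference leaves the other alone, but storing into the array is visible through both.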

While writing Stilted, I didn’t realize this behavior until I already had made executability part of the object itself, so Stilted produces a different (wrong) result:

|-0>     [1 2 3] dup        % make an array and duplicate it
|-2>     cvx                % make the top one executable
|-2>     pstack             % oops: both are changed!
{1 2 3}
{1 2 3}
|-2>     dup 1 99 put
|-2>     pstack
{1 99 3}
{1 99 3}
|-2>

Since I don’t think anyone actually depends on having two objects that share storage, but with different executability, I didn’t bother changing it. An advantage of pure-fun side projects: you can do whatever you want!

BTW: the numbers in the prompts are the current depth of the operand stack.

Cutesy string syntax: PostScript strings are made with parentheses, and they nest, so this is one string:

GS> (Hello (there) 1 2 3) pstack
(Hello \(there\) 1 2 3)
GS<1>

Stilted doesn’t nest the parens in strings, because it uses regexes for lexing tokens, and nesting is hard with regexes. This is a syntax error in Stilted:

|-0> (Hello (there) 1 2 3) pstack
Error: syntaxerror in 3
Operand stack (4):
3
2
1
(Hello \(there)
|-4>

Also, who depends on nested parens in strings? Just escape the closing parens in your strings.
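For the curious, nesting is easy enough once you give up on regexes and count parentheses by hand. This is a sketch of that approach, not code from Stilted. The function name read_ps_string is made up, and it treats a backslash as “take the next character literally,” which is simpler than PostScript’s real escape rules.

```python
def read_ps_string(text, start=0):
    """Read a parenthesized string beginning at text[start].

    Returns (contents, index just past the closing paren). Nested parens
    are balanced by counting depth; a backslash keeps the next character
    as-is. Other PostScript escapes are not handled.
    """
    if text[start] != "(":
        raise ValueError("expected '(' at position %d" % start)
    depth = 0
    chars = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            chars.append(text[i + 1])
            i += 2
            continue
        if ch == "(":
            depth += 1
            if depth > 1:            # the outermost paren is a delimiter
                chars.append(ch)
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return "".join(chars), i + 1
            chars.append(ch)
        else:
            chars.append(ch)
        i += 1
    raise ValueError("unterminated string")


contents, end = read_ps_string("(Hello (there) 1 2 3) pstack")
print(contents)
```

The trade-off is that string tokens need their own little loop instead of fitting into one big token regex.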

Flexible scope: PostScript is a stack-oriented language. There’s an operand stack that operators pop and push to, and also a dictionary stack where names are defined and looked up. The dictionary stack is explicitly manipulated with the begin and end operators. Instead of procedures starting new scopes implicitly, the programmer decides when to begin and end scopes. This means they don’t have to correspond to procedure invocations at all.

We’re so used to scoping being tied to function calls in our programming languages, it was strange to realize that the two concepts can be completely unrelated.
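A dictionary stack is simple to model. The sketch below is a simplification I made up for illustration (real PostScript starts with several built-in dictionaries on the stack, and begin takes a dictionary operand); the class and method names are not from Stilted.

```python
class DictStack:
    """A stack of dictionaries; names are looked up from the top down."""

    def __init__(self):
        self.dicts = [{}]            # the bottom dictionary is always present

    def begin(self, d=None):
        # Push a dictionary: this opens a new scope.
        self.dicts.append({} if d is None else d)

    def end(self):
        # Pop the top dictionary: this closes the current scope.
        if len(self.dicts) == 1:
            raise IndexError("cannot pop the bottom dictionary")
        return self.dicts.pop()

    def define(self, name, value):
        # Definitions always go into the topmost dictionary.
        self.dicts[-1][name] = value

    def lookup(self, name):
        for d in reversed(self.dicts):
            if name in d:
                return d[name]
        raise KeyError(name)


scopes = DictStack()
scopes.define("x", 1)
scopes.begin()               # open a scope; no procedure call involved
scopes.define("x", 2)
inner = scopes.lookup("x")
scopes.end()
outer = scopes.lookup("x")
```

Nothing here is tied to calling a function: begin and end can happen anywhere, which is the whole point.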

Surprising gaps: Re-acquainting myself with PostScript, I was surprised at what it didn’t have: no way to sort arrays, no string formatting, and so on. PostScript pre-dated languages like Python, JavaScript, and even Perl. Its model is much more like C than the higher-level languages that we’re used to now. Though C has string formatting, and you’d think that would be a useful thing in a printing programming language.

More: If you aren’t familiar with PostScript, I’ve got more description of its unusual control structure approach, and also other blog posts tagged #postscript.

Stilted has been a lot of fun. Extra fun: I used the Obfuscated PostScript winners as test cases!

Truchet images https://nedbatchelder.com/blog/202208/truchet_images.html 2022-08-17T18:08:27-04:00 Ned Batchelder I got interested in Truchet tiles, and did some hacking around to understand them better, and then display some images using them. The code is not clean or documented, and it’s inefficient in dumb ways, but it made some nice pictures. The code is at nedbat/truchet if you want to experiment.

A simple example of Truchet is Smith tiles. The tiles are designed to fit together seamlessly even when placed randomly:

Random orientations of black/white tiles

Christopher Carlson came up with a way to generalize the tiles so they could be placed on top of each other at different sizes. A square can be covered by four half-sized tiles with inverted colors and extra wings, and the pattern will remain seamless.

Here are his tiles:

The 15 Carlson Truchet tiles

It can be hard to see how they overlap, but this is a start. This is three different sizes of tile overlaid randomly, with the grid displayed to help see the edges:

A Carlson tiling at three different sizes

I love the randomness of these images, how shapes emerge that were not in the tiles themselves. I’ve been using them as Zoom backgrounds and desktop wallpapers. But I wondered if they could be used to create images.

The set of gray values in the Carlson set is somewhat limited, so I created a new set of tiles with more opportunities for variation:

A larger set of new multi-scale Truchet tiles

These produced even more chaos and serendipity when used randomly:

Randomly placed N6 Truchet tiles

To make images, I used a photo as source and fit tiles onto it to match the gray levels. Larger squares would be subdivided when their sub-squares’ intensities differed more than some threshold:

Young Marilyn Monroe, photo
Young Marilyn Monroe, with Truchet tiles
Me, photo
Me, with Truchet tiles

The algorithm to pick a tile will try to choose a good orientation, to match the colors within the square. Notice the tiles used for my shoulders. Though, on the flip side, both these images clearly exhibit “the forehead problem” because there’s little color variation there.

Looking around for other high-contrast images, I tried a well-known blogger’s avatar:

Coding Horror, in Truchet

The subdivision algorithm uses a threshold to decide when a square has enough variation within it to deserve subdivision. What happens if we start that threshold very large, and slide it down to very small, animating the result?

Marilyn, emerging from coarse-grained to fine-grained detail
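The subdivision step can be sketched as a small recursive function. This is not the code from nedbat/truchet, just one reading of the description: it compares the mean gray level of the four quadrants and splits when they differ by more than the threshold. It assumes a square image whose side is a power of two, and the function names are made up.

```python
def mean(pixels, x, y, size):
    """Average gray value of the size-by-size square at (x, y)."""
    total = sum(
        pixels[row][col]
        for row in range(y, y + size)
        for col in range(x, x + size)
    )
    return total / (size * size)


def subdivide(pixels, x, y, size, threshold, min_size=1):
    """Return a list of (x, y, size) squares covering the region."""
    half = size // 2
    if half < min_size:
        return [(x, y, size)]
    quads = [(x, y), (x + half, y), (x, y + half), (x + half, y + half)]
    means = [mean(pixels, qx, qy, half) for qx, qy in quads]
    if max(means) - min(means) <= threshold:
        return [(x, y, size)]        # uniform enough: one big tile
    squares = []
    for qx, qy in quads:
        squares.extend(subdivide(pixels, qx, qy, half, threshold, min_size))
    return squares


image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 255, 0],
    [0, 0, 0, 0],
]
coarse = subdivide(image, 0, 0, 4, threshold=1000)
fine = subdivide(image, 0, 0, 4, threshold=10)
```

Sliding the threshold from large to small moves the result from one big square toward many small ones, which is what the animation shows.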
Fall fallout https://nedbatchelder.com/blog/202207/fall_fallout.html 2022-07-30T10:57:03-04:00 Ned Batchelder More about my bike fall since I wrote about it two weeks ago.

  • I saw a neurosurgeon. He explained that I have a cavernoma, which is an anomalous collection of blood vessels with thin walls, which can lead to bleeding. The bleeding leaves behind iron deposits, which can cause a seizure. My cavernoma is located in a spot especially prone to seizures.
  • The neurosurgeon thought it would be a simple operation to remove the cavernoma, despite literally being brain surgery. He actually used the phrase “easy-peasy.” Also, his perspective was that this wasn’t a serious incident, since it wasn’t a stroke or death. I guess in his line of work, a seizure is on the small side.
  • Another perspective on severity: I took my bike to the shop to get it checked out. I was telling the bike dude about the crash. He said, “You didn’t need dental work? Then no big deal!”
  • I had an electroencephalogram (EEG). This involved having 27 wires pasted to my head and chest, then lying in a dark room with my eyes closed while they measured my brain activity. Toward the end, they placed a strobe light over my closed eyes, and flashed it at various frequencies. I realized, this is a black box unit test, and my brain is the system under test: provide some inputs, check the outputs, without being able to see the implementation. The initial report seemed to be, “nothing unusual,” but I have to check in with the neurologist.
  • After a few failed attempts, I managed to get the name of the person who called the police for me after my crash. I wrote to him, and he was very friendly, but didn’t have any more details about what happened. When he first saw me I was already on the ground, so he can’t explain the cause of the crash. Still, it felt good to connect with him and find out what he knew.

My energy level is not what it used to be, probably because of the Keppra (anti-seizure medication). Psychologically, I am not used to the idea that my brain can just shut off with no notice. I guess over time, I’ll just ignore that possibility?

The Fall https://nedbatchelder.com/blog/202207/the_fall.html 2022-07-13T15:44:07-04:00 Ned Batchelder One moment I was riding my bike; the next thing I remember, I was sitting on the ground talking to an EMT from the ambulance parked nearby.

This happened three weeks ago. I went to the emergency room, had a few CT scans and an MRI. The best theory is that I had a seizure. So now I am on anti-seizure medication, and am legally forbidden to drive a car for six months.

I was wearing a helmet, and was on a bike path, not a street. The physical effects were minimal: a sore shoulder and some road rash.

There was no eyewitness, so doctors guess I fell because I blacked out rather than the other way around. During the time I don’t remember, I called my wife and told her where I was, so maybe I was never truly unconscious? No one knows.

I usually have a low heart rate (a resting pulse of 50 bpm is not unusual), so maybe it was cardiac-related? I’m wearing a heart monitor to collect data.

The first week was hard because I felt completely limited in what I could do. All spring I had been feeling strong and capable in physical activities, and now I was finding a short walk difficult.

At first the anti-seizure meds made me tired and a bit fuzzy-headed. But we’ve adjusted them, and/or I’m adjusting to them, and/or my concussion is wearing off, so I feel more like myself. I’ve gotten back on the bike (though not alone), and have been swimming in the ocean, same as every summer.

I have more visits coming up with more doctors to try to understand what happened, and what might happen. I doubt they will be able to completely rule out a seizure, so I may be on meds for quite some time. Their recommendations are quite cautious (“Don’t take a bath without supervision”), so now we are making absurd trade-offs and considerations of risks and possibilities.

It’s unsettling to have lost time without a clear explanation, and especially unsettling to think that it could happen again at any time. I’m not sure what to do with that.

Math factoid of the day: 60 https://nedbatchelder.com/blog/202206/math_factoid_of_the_day_60.html 2022-06-16T06:53:00-04:00 Ned Batchelder 60 shows up in lots of places. It’s the smallest number divisible by 1 through 6, and perhaps because of that, it’s the basis of our timekeeping and angular measurements.
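The “smallest number divisible by 1 through 6” claim is a least common multiple, and it’s quick to check:

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


# Fold lcm over 1, 2, 3, 4, 5, 6.
smallest = reduce(lcm, range(1, 7))
print(smallest)
```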

Of course the angles in an equilateral triangle are 60 degrees. But 60 also appears in solid geometry. There are four Archimedean solids (regular polyhedra made with any mixture of regular polygons) with 60 vertices. You can use Nat Alison’s beautiful polyhedra viewer to explore them:
