30% of people can juggle

Tuesday 3 November 2020

I’ve long wondered what portion of the general public can juggle. I couldn’t find an answer searching the web, so I used the best polling method I have, Twitter:

I realize that my Twitter followers skew toward people like me, so I ran a second poll to try to get data outside of my bubble:

These polls are by no means scientific, and are still very skewed toward the savvy and educated. If you ask a tech-bubble person to ask a friend, the friend is still from a small slice of the population as a whole.

But this is the best data we’ve got. I’ll say that in general, 20–30% of people can juggle.

Since I was making polls, and since 30% was higher than I would have guessed, I made a third poll to see what other people would guess:

There’s a nice symmetry to the idea that about 70% of people are surprised that about 30% of people can juggle!

If you have a better source of data about the general public, let me know.

Ordered dict surprises

Monday 12 October 2020

Since Python 3.6, regular dictionaries retain their insertion order: when you iterate over a dict, you get the items in the same order they were added to the dict. (In 3.6 this was a CPython implementation detail; it became a language guarantee in 3.7.) Before 3.6, dicts were unordered: the iteration order was seemingly random.

Here are two surprising things about these ordered dicts.

You can’t get the first item

Since the items in a dict have a specific order, it should be easy to get the first (or Nth) item, right? Wrong. It’s not possible to do this directly. You might think that d[0] would be the first item, but it’s not: it’s the value for the key 0, which could be the last item added to the dict.

The only way to get the Nth item is to iterate over the dict, and wait until you get to the Nth item. There’s no random access by ordered index. This is one place where lists are better than dicts. Getting the Nth element of a list is an O(1) operation. Getting the Nth element of a dict (even if it is ordered) is an O(N) operation.
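Here is a minimal sketch of that iteration, using itertools.islice from the standard library. The helper name nth_item is made up for illustration; it is not a dict method.

```python
from itertools import islice

def nth_item(d, n):
    """Return the (key, value) pair at insertion position n (hypothetical helper).

    This walks the dict from the start, so it is O(n), not O(1).
    """
    try:
        return next(islice(d.items(), n, None))
    except StopIteration:
        raise IndexError(f"dict has fewer than {n + 1} items") from None

d = {"x": 10, 0: "zero", "y": 20}
print(nth_item(d, 0))   # ('x', 10): the first item added
print(d[0])             # 'zero': the value for the key 0, not the first item
print(next(iter(d)))    # 'x': the first key, no helper needed
```

If you need many lookups by position, it is probably cheaper to build list(d.items()) once and index into that.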

OrderedDict is a little different

If dicts are ordered now, collections.OrderedDict is useless, right? Well, maybe. It won’t be removed because that would break code using that class, and it has some methods that regular dicts don’t. But there’s also one subtle difference in behavior. Regular dicts don’t take order into account when comparing dicts for equality, but OrderedDicts do:

>>> d1 = {"a": 1, "b": 2}
>>> d2 = {"b": 2, "a": 1}
>>> d1 == d2
True
>>> list(d1)
['a', 'b']
>>> list(d2)
['b', 'a']

>>> from collections import OrderedDict
>>> od1 = OrderedDict([("a", 1), ("b", 2)])
>>> od2 = OrderedDict([("b", 2), ("a", 1)])
>>> od1 == od2
False
>>> list(od1)
['a', 'b']
>>> list(od2)
['b', 'a']
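As a small illustration of the extra methods, here is a sketch using move_to_end and popitem(last=False), two OrderedDict features that regular dicts lack. It also shows that comparing an OrderedDict with a regular dict ignores order. The variable names are only for the example.

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# move_to_end reorders an existing key; regular dicts have no such method.
od.move_to_end("a")
print(list(od))                 # ['b', 'c', 'a']

# popitem can remove from either end; dict.popitem only removes the last item.
first = od.popitem(last=False)
print(first)                    # ('b', 2)

# Comparing an OrderedDict with a regular dict ignores order, like dict == dict.
od1 = OrderedDict([("a", 1), ("b", 2)])
d2 = {"b": 2, "a": 1}
print(od1 == d2)                # True
```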

BTW, this post is the result of a surprisingly long and contentious discussion in the Python Discord.

Working with many git repos

Monday 12 October 2020

Some of my work on the Open edX team at edX requires working with the three dozen or so repos that form the backbone of the Open edX software. That often means doing the same thing to all of them (tagging, logs, etc).

To make it easier to work with a collection of repos, I have a shell function to run the same command on the git directories found under the current directory. It gets a little more complicated than that: I might have 100 repos in the current directory, but only the ones that have certain master branches should be included in an operation.

My function is called “gittreeif”: it takes a branch name and a command, and walks the current directory tree looking for git repos that have that branch. For each one, it executes the command:

$ gittreeif origin/juniper.master git status

I also define “gittree”, which runs on every repo regardless of its branches.

Here is the definition of gittreeif. Put it in your shell startup file (.bashrc, .zshrc, whatever):

# Run a command for every repo found somewhere beneath the current directory.
#   $ gittree git fetch --all --prune
# To only run commands in repos with a particular branch, use gittreeif:
#   $ gittreeif branch_name git fetch --all --prune
# If the command has subcommands that need to run in each directory, quote the
# entire command:
#   $ gittreeif origin/foo 'git log --format="%s" origin/foo ^$(git merge-base origin/master origin/foo)'
# The directory name is printed before each command.  Use -q to suppress this,
# or -r to show the origin remote url instead of the directory name.
#   $ gittreeif origin/foo -q git status
gittreeif() {
    local test_branch="$1"
    shift
    local show_dir=true show_repo=false
    if [[ $1 == -r ]]; then
        # -r means, show the remote url instead of the directory.
        shift
        local show_dir=false show_repo=true
    fi
    if [[ $1 == -q ]]; then
        # -q means, don't echo the separator line with the directory.
        shift
        local show_dir=false show_repo=false
    fi
    find . -name .git -type d -prune | while read d; do
        local d=$(dirname "$d")
        # Skip repos that don't have the branch we're looking for.
        git -C "$d" rev-parse --verify -q "$test_branch" >& /dev/null || continue
        if [[ $show_dir == true ]]; then
            echo "---- $d ----"
        fi
        if [[ $show_repo == true ]]; then
            echo "----" $(git -C "$d" config --get remote.origin.url) "----"
        fi
        if [[ $# == 1 && $1 == *' '* ]]; then
            # A single quoted argument with spaces is a whole command line.
            (cd "$d" && eval "$1")
        else
            (cd "$d" && "$@")
        fi
    done
}

gittree() {
    # @ is in every repo, so this runs on all repos
    gittreeif @ "$@"
}
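
For comparison, here is a rough Python sketch of the same idea: walk the tree, keep the repos where a ref resolves, and run a command in each. This is not the shell function above, just an approximation of it. The names find_repos, has_ref, and git_tree_if are invented for this example, and it leaves out the -q and -r options.

```python
import os
import subprocess

def find_repos(root="."):
    """Yield directories under root that contain a .git directory."""
    for dirpath, dirnames, _ in os.walk(root):
        if ".git" in dirnames:
            yield dirpath
            # Like find's -prune: don't descend into .git itself.
            dirnames.remove(".git")

def has_ref(repo, ref):
    """True if `git rev-parse --verify` can resolve ref in repo."""
    result = subprocess.run(
        ["git", "-C", repo, "rev-parse", "--verify", "-q", ref],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def git_tree_if(ref, command, root="."):
    """Run command (a list of arguments) in each repo that has ref."""
    for repo in find_repos(root):
        if not has_ref(repo, ref):
            continue
        print(f"---- {repo} ----")
        subprocess.run(command, cwd=repo)
```

A call like git_tree_if("origin/juniper.master", ["git", "status"]) would then correspond to the gittreeif example earlier.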

Let’s say I want to summarize the changes between two tags. Here’s a convenient alias to put in your ~/.gitconfig:

[alias]
    relnotes = log --pretty='%h %ad %an: %s' --date=short --no-merges

The git command to show the changes between “old-commit” and “new-commit” is:

git log new-commit ^old-commit

Putting it all together: to see the changes between juniper.2 and juniper.3 in all the repos that have Juniper branches, using “relnotes” to get the summary style I like:

$ gittreeif \
    open-release/juniper.master \
    git relnotes open-release/juniper.3 ^open-release/juniper.2
---- ./ecommerce ----
 ca9cddb4 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./devstack ----
 10f02ca 2020-08-17 Zachary Trabookis: Remove `xqueue` as `DEFAULT_SERVICES` for
 8ff8dd0 2020-08-17 Zachary Trabookis: Make additional adjustments to the docume
 57455fe 2020-08-10 Zachary Trabookis: Add `xqueue` to default services to provi
 3ca4c9d 2020-07-29 Zachary Trabookis: Make sure to pass in `DOCKER_COMPOSE_FILE
 cef4aa2 2020-07-28 Zachary Trabookis: Updated `README` to include necessary inf
 9415683 2020-07-27 Zachary Trabookis: Update `docker` commands to be `docker-co
 67c7c9b 2020-08-16 morenol: Do not use openedx release for registrar and edx-mk
 56312bc 2020-08-04 Guruprasad Lakshmi Narayanan: Remove duplicate section
 34a46a3 2020-07-24 Guruprasad Lakshmi Narayanan: Remove the non-release service
---- ./xqueue ----
 f004caa 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./edx-e2e-tests ----
---- ./edx-platform ----
 d9e0ca5e70 2020-08-12 Ali-D-Akbar: This commit contains security fixes for the
 c8421f66fc 2020-08-07 uzairr: Fix xss vulnerabilities in templates
 47ab6af637 2020-08-06 Attiya Ishaque: [YONK-1759] Version bump of studio-fronte
 8dd78619c9 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
 b295389e96 2020-07-23 Zachary Trabookis: Set `SESSION_COOKIE_SAMESITE=Lax` for
 91af099933 2020-07-23 uzairr: Fix xss in templates
 0e45ecb743 2020-07-22 Ali-D-Akbar: Sustaining xss fixes This commit contains xs
 3757f0d11e 2020-07-06 Florian Haas: Fix profile image URLs for image storage on
---- ./edx-analytics-pipeline ----
---- ./repo-tools/repo-tools ----
---- ./edx-notes-api ----
 ad53edd 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./cs_comments_service ----
 3079804 2020-08-19 Samuel Walladge: Bump codecov to latest version
---- ./course-discovery ----
 e984f273 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./credentials ----
 7a7aab55 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./src/edx-analytics-configuration ----
---- ./src/edx-documentation ----
---- ./src/configuration ----
 05bb4edcf 2020-08-24 Feanil Patel: Improve sandboxing. (#5953) (#5960)
 860994c0d 2020-08-21 Feanil Patel: Timmc/codejail improvements (#5956)
---- ./src/enterprise-catalog ----
 f886da6 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./src/blockstore ----
---- ./src/edx-analytics-data-api ----
 64b4c7f 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./src/frontend-app-publisher ----
---- ./src/edx-app-android ----
---- ./src/notifier ----
---- ./src/edx-analytics-dashboard ----
 b8dfa559 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./src/frontend-app-support-tools ----
---- ./src/edx-app-ios ----
---- ./src/edx-demo-course ----
---- ./src/ecommerce-worker ----
---- ./src/frontend-app-learning ----
---- ./src/edx-certificates ----
---- ./src/frontend-app-profile ----
---- ./src/license-manager ----
 85003a6 2020-08-05 Ned Batchelder: Upgrade to Django 2.2.15
---- ./src/testeng-ci ----
---- ./src/frontend-app-gradebook ----
---- ./src/edx-developer-docs ----
---- ./src/frontend-app-account ----

This is how I do it. There are probably other tools to do the same job. Maybe someone will point them out... :)

Değişken Deyince Ne Anlamalı?

Saturday 10 October 2020

Enes Başpınar has translated one of my popular pages into Turkish: Değişken Deyince Ne Anlamalı? is his translation of my Facts and myths about Python names and values.

Google Translate tells me the Turkish title means, “What Should It Understand When You Say Variables?,” which I guess is better translated as, “What Do We Mean When We Say Variables?”

It’s flattering that a piece is liked enough for someone to translate it. The only previous page that was translated was Cog, into Russian (twice!).

If you want to translate something on this site, let me know.


Sunday 20 September 2020

I’ve written a tool for managing changelog files, called scriv. It focuses on a simple workflow, but with lots of flexibility.

I’ve long felt that it’s enormously beneficial for engineers to write about what they do, not only so that other people can understand it, but to help the engineers themselves understand it. Writing about a thing gives you another perspective on it, your own code included.

The philosophy behind scriv, and a quick list of other similar tools, is on the Philosophy page in the docs.

Scriv only does a few things now, but I’m interested to hear about other changelog workflows that could use better tooling.


Sunday 13 September 2020

I threw together a Spotify API program called song-basket. I have a few large themed playlists (for example, Instrumental Funk). This app is to help me add songs to them. I can choose a playlist (the basket), and then as I surf around Spotify, it lets me add the current song to the basket with one click. It also shows me whether the current song is already in the basket, which it often is. If the song is already in the basket, I don’t have to think about whether to add it, and I don’t have to deal with the annoying “Add duplicate?” question.

This started as an example in the Tekore docs, and I hacked at it until it did what I wanted. A lot of it is wrong: no templating, incorrect HTML, a stateful web application, horrid styling, and so on. It doesn’t matter, it’s a quick app to do what I need. If I want, I can polish it later.