Taylor Glenn is a great juggler. I don’t mean that she has set world
records, or does more-difficult tricks than anyone else. She is certainly very
accomplished, but what I love about her is that she brings an easy grace to her
juggling, and a friendly encouraging air. I have used some of her instructional
videos to improve my technique.
Her latest video,
Perspective Juggling, is
eye-opening. It’s filmed from above, presenting a fresh view of the juggling.
Seeing the hands move more while the balls seem to move less makes some of the
tricks easier to understand. Other tricks somehow seem more mysterious.
All of them are fluid and mesmerizing.
Take a look:

As a juggler, the only improvement I would suggest is to flip it around so
that her hands reach up toward the top, making it more like a first-person view.
Flourish is a visual toy app that
draws harmonographs, sinuous curves simulating a multi-pendulum trace:

Each harmonograph is determined by a few dozen parameter values, which are
chosen randomly. The number of parameters depends on the number of pendulums,
which defaults to 3.
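For reference, a harmonograph trace is usually modeled as a sum of damped
sinusoids, one per pendulum, for each of x and y. Here is a rough numpy sketch
of the idea; the exact formula and parameter names Flourish uses are my
assumption, not taken from its code:
import numpy as np

def trace(amps, freqs, phases, decays, n=20000, duration=60.0):
    # One coordinate of a harmonograph: a sum of damped sinusoids, one per pendulum.
    t = np.linspace(0, duration, n)
    return sum(a * np.sin(2 * np.pi * f * t + p) * np.exp(-d * t)
               for a, f, p, d in zip(amps, freqs, phases, decays))

# With 3 pendulums, x and y each need an amplitude, frequency, phase, and decay
# per pendulum -- a couple dozen numbers in all.  Plot x against y to see the curve.
x = trace([1.0, 0.5, 0.3], [2.00, 3.01, 5.00], [0.0, 1.3, 2.1], [0.010, 0.020, 0.005])
y = trace([1.0, 0.5, 0.3], [3.00, 2.00, 4.99], [0.5, 0.2, 1.7], [0.010, 0.015, 0.008])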
Click a thumbnail to see a larger version. The large-image pages have
thumbnails of “adjacent” images: for each parameter, four nearby values are
substituted, giving four thumbnails per parameter. Clicking an adjacent
thumbnail continues your exploration of the parameter space:

The settings dialog lets you adjust the number of pendulums (which determines
the number of parameters) and the kinds of symmetry you are interested in.
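The “adjacent images” mechanism can be sketched roughly like this; the
dict-of-parameters structure and the step sizes are my own invention for
illustration, not Flourish’s actual code:
def neighbors(params, name, steps=(-0.10, -0.05, 0.05, 0.10)):
    # Yield four parameter sets near `params`, nudging only `name` a little.
    for step in steps:
        nearby = dict(params)
        nearby[name] = params[name] * (1 + step)
        yield nearby

# For example, four thumbnails that differ from the current image only in "freq1":
#   list(neighbors({"freq1": 3.01, "decay1": 0.02}, "freq1"))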
I started this because I wanted to understand how the parameters affected the
outcome, but I was also interested in giving it a purely visual design. As an
engineer, I was tempted to present the parameter values quantitatively, but I
like the simplicity of just clicking curves you like.
I repeated a trick I’ve used in other
mathy visual toys: when
you download a PNG file of an image, the parameter values are stored in a data
island in the file. You can re-upload the image, and Flourish will extract the
parameters and put you back into the parameter-space exploration at that
point.
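The general mechanism can be sketched with Pillow: write the parameters into a
PNG text chunk when saving, and read them back when the file is uploaded. The
chunk key and function names here are made up for illustration; this isn’t
Flourish’s actual code:
import json
from PIL import Image
from PIL.PngImagePlugin import PngInfo

def save_with_params(img, params, path):
    # Stash the parameter values in a text chunk so they travel with the image.
    info = PngInfo()
    info.add_text("flourish-params", json.dumps(params))
    img.save(path, pnginfo=info)

def load_params(path):
    # Re-uploading the PNG gives the parameter values back.
    with Image.open(path) as img:
        return json.loads(img.text["flourish-params"])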
This is one of those side projects that let me use different sorts of tools
than I usually do: numpy, SVG, sass, Docker, and so on. I had more ideas for
things to add (there is color in the code but not the UI). Maybe someday I will
build them.
BTW, I am happy that my first post of 2021 is called “Flourish.” I hope it
is a harbinger of things to come.
The recent blog post Commits are snapshots, not diffs
did a good job explaining away a common git misconception, and helped me finally
understand it. To really wrap my head around it, I checked it empirically.
The misconception starts because git presents commits as diffs, and lets you
manipulate them (rebase, cherry-pick, etc) as if they were diffs. But
internally, git commits are not diffs, they are complete copies of the file at
each revision that changes the file.
At first glance, this seems dumb: why store the whole file over again just
because one line changed? The reason is speed and immutability. If git stored
each commit as a diff against the previous version (as RCS did),
then getting the latest version of a file would require replaying all the diffs
against the very first version of the file, getting slower and slower as the
repo accumulated more commits. This means the most common checkout would get
worse and worse over time.
If git stored the latest version of a file, and diffs going backward in time
(as Subversion does),
then getting older versions would get slower and slower, which isn’t so bad. But
it would require re-writing the previous commit each time a new commit was
made. This would ruin git’s hash-based immutability.
So, surprisingly, git stores the full contents of the file each time the file changes.
I wanted to see this for myself, so I did an experiment.
First, make a new git repo:
$ mkdir gitsize
$ cd gitsize
$ git init
Initialized empty Git repository in /tmp/gitsize/.git/
I used a Python program (makebig.py) to create large files with repeatably
random contents and one changeable word in the middle:
# Make a 1Mb randomish file, repeatably
import random, sys
random.seed(int(sys.argv[1]))
for lineno in range(2048):
    if lineno == 1000:
        print(sys.argv[2])
    print("".join(chr(random.randint(32, 126)) for _ in range(512)))
Let’s make a big file with “one” in the middle, and commit it:
$ python /tmp/makebig.py 1 one > one.txt
$ ls -lh
total 2136
-rw-r--r-- 1 ned wheel 1.0M Dec 19 11:13 one.txt
$ git add one.txt
$ git commit -m "One"
[master (root-commit) 8fceff3] One
1 file changed, 2049 insertions(+)
create mode 100644 one.txt
Git stores everything in the .git directory, with the file contents in the
.git/objects directory:
$ ls -Rlh .git/objects/*
.git/objects/13:
total 1720
-r--r--r-- 1 ned wheel 859K Dec 19 11:14 b581d8695866f880eac2fef47c2594bbebbb3b
.git/objects/7d:
total 8
-r--r--r-- 1 ned wheel 52B Dec 19 11:14 32a67a911e8a04ad1703712481afe93b00c7af
.git/objects/8f:
total 8
-r--r--r-- 1 ned wheel 127B Dec 19 11:14 ceff3e3926764197742b01639a42765e34cd72
.git/objects/info:
.git/objects/pack:
Git stores three kinds of things: blobs, trees, and commits. We now have one
of each. Blobs store the file contents. You can see the
b581d8 blob is 859Kb, which is our 1Mb file with a little
compression applied.
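If you want to convince yourself that the blob really is the whole file, you
can decompress a loose object directly (git cat-file -p will show you the same
contents more politely). A quick Python sketch, using the blob hash from this
experiment; substitute whatever hash your own run produces:
import zlib

with open(".git/objects/13/b581d8695866f880eac2fef47c2594bbebbb3b", "rb") as f:
    data = zlib.decompress(f.read())

header, _, body = data.partition(b"\0")
print(header)      # b'blob <size in bytes>'
print(len(body))   # the complete, uncompressed contents of one.txt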
Now we change the file just a little bit by writing it over again with a
different word in the middle:
$ python /tmp/makebig.py 1 one-changed > one.txt
$ git diff
diff --git a/one.txt b/one.txt
index 13b581d..b13026a 100644
--- a/one.txt
+++ b/one.txt
@@ -998,7 +998,7 @@ wLh&#DvF%em\Bb}^Y<gk?5vR8npq{ ~".][T|@.At@~fGYf<0/=cth`e}/}='qBFb&FP?+ENmAA:_g+0
u$d|\v=y$oi@\, (o`=a49|!r\LL^B:y.f)*@5^bR\,Ck=i (.. snipped)
lbY#m++>32X>^gh\/q34})uxZ"e/p;Ybb9\k,UTLPb*?3l7 (.. snipped)
B11\\!x]jM9m't"KD%|,&r(lfh%vfT}~{jOQYBb?|TZ(<<R (.. snipped)
-one
+one-changed
>Mu2P-/=8Z+A&"#@'"8*~fb]kkn;>}Ie.)wGjjHsbO5Nw]" (.. snipped)
Vl {Q)k|{E!vF*@S')U5bK3u1fInN*ZrIe{-qXW}Fr`6*#N (.. snipped)
3lF#jR!"JxXjAvih 4I6E\W:Y.*}b@eZ8xl-"*c/!pe"$Mx (.. snipped)
Commit the change, and we can look again at the .git storage:
$ git commit -am "One, changed"
[master a2410c8] One, changed
1 file changed, 1 insertion(+), 1 deletion(-)
$ ls -Rlh .git/objects/*
.git/objects/0e:
total 8
-r--r--r-- 1 ned wheel 52B Dec 19 11:22 2de9f34b9140c3e99c5d5106a1078d22aa9063
.git/objects/13:
total 1720
-r--r--r-- 1 ned wheel 859K Dec 19 11:14 b581d8695866f880eac2fef47c2594bbebbb3b
.git/objects/7d:
total 8
-r--r--r-- 1 ned wheel 52B Dec 19 11:14 32a67a911e8a04ad1703712481afe93b00c7af
.git/objects/8f:
total 8
-r--r--r-- 1 ned wheel 127B Dec 19 11:14 ceff3e3926764197742b01639a42765e34cd72
.git/objects/a2:
total 8
-r--r--r-- 1 ned wheel 163B Dec 19 11:22 410c8b799b7829e1360649011754739e0a5c50
.git/objects/b1:
total 1720
-r--r--r-- 1 ned wheel 859K Dec 19 11:22 3026a4c10928821aa2b89b3e67d766dfbd533a
.git/objects/info:
.git/objects/pack:
Now, as promised, there are two blobs, each 859Kb. The original file
contents are still in blob b581d8, and there’s a new blob
(3026a4) to hold the updated contents.
Even though we changed just one line in a 2000-line file, git stores the full
contents of both revisions of the file.
Isn’t this terrible!? Won’t my repos balloon to unmanageable sizes? Nope,
because git has another trick up its sleeve. It can store those blobs in “pack
files”, which store repeated sequences of bytes once.
Git will automatically pack blobs when it makes sense to, but we can explicitly
ask it to, so we can see pack files in action:
$ git gc --aggressive
Enumerating objects: 6, done.
Counting objects: 100% (6/6), done.
Delta compression using up to 8 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (6/6), done.
Total 6 (delta 1), reused 0 (delta 0), pack-reused 0
$ ls -Rlh .git/objects/*
.git/objects/info:
total 16
-rw-r--r-- 1 ned wheel 1.2K Dec 19 11:41 commit-graph
-rw-r--r-- 1 ned wheel 54B Dec 19 11:41 packs
.git/objects/pack:
total 1720
-r--r--r-- 1 ned wheel 1.2K Dec 19 11:41 pack-a0d87c64abc0f03070fd14449891fe20ca98926b.idx
-r--r--r-- 1 ned wheel 855K Dec 19 11:41 pack-a0d87c64abc0f03070fd14449891fe20ca98926b.pack
Now instead of individual blob files, we have one pack file. And it’s a
little smaller than either of the blobs!
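If you’re curious which blob was stored whole and which became a delta inside
the pack, you can list the pack’s contents. A small sketch that just shells out
to git verify-pack, which does the real work:
import glob, subprocess

for idx in glob.glob(".git/objects/pack/*.idx"):
    # -v lists every object in the pack; delta entries name their base object.
    subprocess.run(["git", "verify-pack", "-v", idx], check=True)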
This may seem like a semantic game: doesn’t this show that commits are deltas?
It’s not the same, for a few reasons:
- Reconstructing a file doesn’t require revisiting its history. Every
revision is available with the same amount of effort.
- The sharing between blobs is at a conceptually different layer than the blob
storage. Git stores a commit as a full snapshot of all of the files’ contents.
The file contents might be stored in a shared-bytes way within the pack
files.
- The object model is full-file contents in blobs, and commits referencing
those blobs. If you removed pack files from the implementation, the conceptual
model and all operations would work the same, just take more disk space.
- The storage savings in a pack file are not limited to a single file. If two
files (or two revisions of two different files) are very similar, their bytes
will be shared.
To demonstrate this last point, we’ll make another file with almost the same
contents as one.txt:
$ python /tmp/makebig.py 1 two > two.txt
$ ls -lh
total 4280
-rw-r--r-- 1 ned wheel 1.0M Dec 19 11:18 one.txt
-rw-r--r-- 1 ned wheel 1.0M Dec 19 11:49 two.txt
$ git add two.txt
$ git commit -m "Two"
[master 079baa5] Two
1 file changed, 2049 insertions(+)
create mode 100644 two.txt
$ git gc --aggressive
Enumerating objects: 9, done.
Counting objects: 100% (9/9), done.
Delta compression using up to 8 threads
Compressing objects: 100% (7/7), done.
Writing objects: 100% (9/9), done.
Total 9 (delta 2), reused 4 (delta 0), pack-reused 0
$ ls -Rlh .git/objects/*
.git/objects/info:
total 16
-rw-r--r-- 1 ned wheel 1.2K Dec 19 11:50 commit-graph
-rw-r--r-- 1 ned wheel 54B Dec 19 11:50 packs
.git/objects/pack:
total 1720
-r--r--r-- 1 ned wheel 1.3K Dec 19 11:50 pack-36b681bfc8ebef963bb8a7dcfe65addab822f5d4.idx
-r--r--r-- 1 ned wheel 855K Dec 19 11:50 pack-36b681bfc8ebef963bb8a7dcfe65addab822f5d4.pack
Now we have two separate source files in our working tree, each 1Mb. But in
the .git storage there is still just one 855Kb pack file. The parts of one.txt
and two.txt that are the same are only stored once.
As another example, let’s change two.txt completely by using a different
random seed, commit it, then change it back again:
$ python /tmp/makebig.py 2 two > two.txt
$ git commit -am "Two, completely changed"
[master 6dac887] Two, completely changed
1 file changed, 2049 insertions(+), 2049 deletions(-)
rewrite two.txt (86%)
$ python /tmp/makebig.py 1 two > two.txt
$ git commit -am "Never mind, I liked it the old way"
[master c06ad2f] Never mind, I liked it the old way
1 file changed, 2049 insertions(+), 2049 deletions(-)
rewrite two.txt (86%)
Looking at the storage, our pack file is twice the size, because we’ve had
two completely different 1Mb chunks of data. But thinking about two.txt, its
first and third revisions were nearly identical, so they can share bytes in the
pack file:
$ git gc --aggressive
Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 8 threads
Compressing objects: 100% (11/11), done.
Writing objects: 100% (13/13), done.
Total 13 (delta 3), reused 6 (delta 0), pack-reused 0
$ ls -Rlh .git/objects/*
.git/objects/info:
total 16
-rw-r--r-- 1 ned wheel 1.3K Dec 19 11:58 commit-graph
-rw-r--r-- 1 ned wheel 54B Dec 19 11:58 packs
.git/objects/pack:
total 3432
-r--r--r-- 1 ned wheel 1.4K Dec 19 11:58 pack-49f264f911dc97e529dc56a4f6ad450f8013f720.idx
-r--r--r-- 1 ned wheel 1.7M Dec 19 11:58 pack-49f264f911dc97e529dc56a4f6ad450f8013f720.pack
If git stored diffs, we’d need two different megabyte-sized diffs for the two
complete changes we’ve made to two.txt.
Note that in this experiment I have used “git gc” to force the storage into
its most compact form. You typically wouldn’t do this. Git will automatically
repack files when it makes sense to.
Git doesn’t store diffs; it stores the complete contents of the file at each
revision. But beneath those full-file snapshots is clever redundancy-removing
byte storage that makes the total size even smaller than a diff-based system
could achieve.
If you want to know more,
How Git Works
is a good overview, and
Git Internals - Git Objects
is the authoritative reference.
Today I needed to revisit how favicons work. First I wanted to do an
empirical experiment to see what size and format browsers would actually use.
This has always been a confusing landscape. Some pages offer dozens of different
files to use as the icon. I wasn’t going to go crazy with all of that, so I
just wanted to see what would do a simple job.
To run my experiment, I used ImageMagick to create a test favicon.ico, and
also some different-sized png files. So I would know what I was looking at,
each size is actually a different visual image: the 32-pixel icon shows “32”,
and so on.
This is how I made them:
# BMP versions at the classic icon sizes; these get bundled into favicon.ico below.
for size in 16 32 48 ; do
    magick convert \
        -background lightgray \
        -fill black \
        -size ${size}x${size} \
        -gravity center \
        -bordercolor black \
        -border 1 \
        label:${size} \
        icon_${size}.bmp
done

# PNG versions at a wider range of sizes.
for size in 16 32 48 64 96 128 256; do
    magick convert \
        -background lime \
        -fill black \
        -size ${size}x${size} \
        -gravity center \
        -bordercolor black \
        -border 1 \
        label:${size} \
        icon_${size}.png
done

# Combine the BMPs into one multi-resolution favicon.ico.
magick convert *.bmp favicon.ico
Playing with these a bit showed me that favicon.ico is not that reliable, and
the simplest thing to do that works well is just to use a 32-pixel PNG file.
I wanted to make an icon of a circled Sleepy Snake image.
I started with GIMP, but got lost in selections, paths, and alpha channels, so I
went back to ImageMagick:
# Flatten the snake image's transparency onto a white background.
magick convert SleePYsnake.png \
    -background white -alpha remove -alpha off \
    SleePYwhite.png
# Make a purple canvas with a black-outlined circular hole.
magick convert \
    -size 3600x3600 xc:Purple -fill LightBlue \
    -stroke black -strokewidth 30 \
    -draw "circle 1100,1000 1100,1700" -transparent LightBlue \
    mask.png
# Composite the mask over the snake: only the circle shows through.
magick convert SleePYwhite.png mask.png -composite temp.png
# Turn the purple surround transparent.
magick convert temp.png -transparent Purple temp2.png
# Crop down to just the circle.
magick convert temp2.png -crop 1430x1430+385+285 +repage round.png
# Resize to icon size.
magick convert round.png -resize 32x32 round_32.png
Probably some of these steps could be combined. The ImageMagick execution
model is still a bit baffling to me. It made these intermediate steps:

I made that montage with:
magick montage \
SleePYsnake.png SleePYwhite.png mask.png temp.png temp2.png round.png \
-geometry 300x300 -background '#ccc' -mode concatenate -tile 2x \
favicon_stages.png
In the end, I got the result I wanted:

When people ask what they should implement to practice programming, I often
say, Mad Libs. It’s a game, so it might appeal to youthful minds, but it’s
purely text-based, so it won’t be overwhelming to implement. It can start
simple, for beginners, but get complicated if you are more advanced.
Mad Libs is a language
game. One person, the reader, has a story with blanks in it. The other
player(s) provide words to go in the blanks, but they don’t know the story. The
result is usually funny.
For example, the story might be:
There was a tiny (adjective) (noun) who was feeling very (adjective).
After (verb)’ing, she felt (adjective).
(An actual story would be longer, usually a full paragraph.) The reader will
ask for the words, either in order, or randomized:
Give me an adjective.
“Purple”.
Give me a noun.
“Barn”.
A verb.
“Jump”.
Another adjective.
“Lumpy”.
Another adjective.
“Perturbed”.
Then the reader presents the finished story:
There was a tiny perturbed barn who was feeling very purple. After
jumping, she felt lumpy.
To do this in software, the program will be the reader, prompting the user
for words, and then output the finished story. There are a few different ways
you could structure it, of different complexities:
- Hard-code the story in the code.
- Represent the story in a data structure (maybe a list?) in the code.
- Read the story from a data file, to make it easier to use different stories.
- Provide a way to edit stories.
- Make a Mad Libs web application.
Each of these has design choices to be made. How will you separate the text
from the blanks? How will you indicate what kind of word goes in each blank?
How complex a story structure will you allow?
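As a minimal sketch of the “story in a data structure” option: a story could be
a list where plain strings are literal text and one-item tuples name the kind
of word to ask for. The structure and names here are just one possible design:
# Plain strings are literal text; tuples are blanks to fill in.
STORY = [
    "There was a tiny ", ("adjective",), " ", ("noun",),
    " who was feeling very ", ("adjective",), ". After ",
    ("verb",), "'ing, she felt ", ("adjective",), ".",
]

def play(story):
    filled = []
    for piece in story:
        if isinstance(piece, tuple):
            # A blank: ask the player for the kind of word it needs.
            # (Choosing "a" vs "an" is left as one of the enhancements below.)
            filled.append(input(f"Give me a {piece[0]}: "))
        else:
            filled.append(piece)
    print("".join(filled))

if __name__ == "__main__":
    play(STORY)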
There are other bells and whistles you can add along the way, for any of the
stages of complexity:
- Randomize the order the words are requested.
- Have a way for a provided word to be re-used in the story for some
coherence.
- Stories that have random paths through them, so that it is not always the
same story, giving more variety.
- Smarter text manipulation, so that “a” or “an” would be used appropriately
with a provided word.
If you are interested, you can read the details of how I approached it years
ago with my son: Programming madlibs.
I’ve long wondered what portion of the general public can juggle. I couldn’t
find an answer searching the web, so I used the best polling method I have,
Twitter:
I realize that my Twitter followers skew toward people like me, so I ran a
second poll to try to get data outside of my bubble:
These polls are by no means scientific, and are still very skewed toward the
savvy and educated. If you ask a tech-bubble person to ask a friend, the friend
is still from a small slice of the population as a whole.
But this is the best data we’ve got. I’ll say that in general, 20–30% of
people can juggle.
Since I was making polls, and since 30% was higher than I would have guessed,
I made a third poll to see what other people would guess:
There’s a nice symmetry to the idea that about 70% of people are surprised
that about 30% of people can juggle!
If you have a better source of data about the general public, let me know.