« | » Main « | »

Names and values: making a game board

Tuesday 27 August 2013

In Facts and myths about Python names and values, I described how Python's names and values work. There I posed some other questions, one of which was:

Why do beginners find it hard to make a tic-tac-toe board in Python?

Writing a game often involves a grid-like board. Python doesn't have a native two-dimensional array, so you can use a list of lists. Making a checkerboard full of zeros can be done with a literal list:

board = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

But that's tedious, especially if you need a 100×100 board. Even if you don't, it seems like you could do it more compactly. Python lets you multiply lists by integers to replicate elements, so you can use this code to produce a list of eight zeros:

row = [0] * 8

Let's try using that technique twice, once to make a row, and then again to fill out the board:

from pprint import pprint
# Construct the empty board
board = [ [0]*8 ] * 8
pprint(board)
[[0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0]]

It works! Now let's change one of the cells:

# Put a 1 in the upper-left corner
board[0][0] = 1
pprint(board)
[[1, 0, 0, 0, 0, 0, 0, 0],
 [1, 0, 0, 0, 0, 0, 0, 0],
 [1, 0, 0, 0, 0, 0, 0, 0],
 [1, 0, 0, 0, 0, 0, 0, 0],
 [1, 0, 0, 0, 0, 0, 0, 0],
 [1, 0, 0, 0, 0, 0, 0, 0],
 [1, 0, 0, 0, 0, 0, 0, 0],
 [1, 0, 0, 0, 0, 0, 0, 0]]

Huh? Somehow setting one cell changed the first cell in every row!? What's going on?

Just as with assignment, list multiplication doesn't copy data (Fact: assignment never copies data.) When we replicate a list with multiplication, we get the same value referenced many times. When we made the single row, we made a list with eight references to the same zero:

row is a list with eight references to the same zerorow0row = [0] * 8

When we make the board, the same behavior of replicating a list applies, so the board is actually eight references to the same list:

board is a list of eight references to the rowboard0board = [ [0]*8 ] * 8

When we assign to board[0][0], we are changing one element, but that element is visible through all eight rows, because all eight rows are actually references to the same list:

changing only one elementboard01board[0][0] = 1

In Facts and Myths, I called this the Mutable Presto-Chango.

Perhaps this is clearer if you consider that these two ways to make the board are exactly the same:

# Make it all at once
board = [ [0] * 8 ] * 8

# Make a row, then make a board
row = [0] * 8
board = [ row ] * 8

Assigning the row to its own variable doesn't change the behavior, but when we see it assigned to a name, it's more obvious that you'll get eight references to the same row.

OK, but why didn't setting one element change all 64 zeros? The diagram shows that there's only one zero, referenced by all the elements of the board.

Two names can refer to the same value. In fact, in our case, eight names refer to the same value. The names are board[0][0], board[0][1], ..., board[0][7]. Changing one of those names to refer to 1 won't make all of them refer to it, any more than if the eight names had been "a", "b", ..., "h". (Fact: Names are reassigned independently of other names.)

The compact way to make a board that works correctly is to use a list comprehension to make each row:

board = [ [0] * 8 for i in range(8) ]

With this code, we use [0]*8 to make the row, which is fine because assignment will correctly update a single element in that row. The list comprehension makes a list with eight of those, but instead of each one being a reference to the same row, the row is recomputed each time, so a new list is made for each row.

It's important to keep straight two different syntaxes that look very similar:

# List multiplication, doesn't work right:
board1 = [ [0] * 8 ] * 8

# List comprehension, does work right:
board2 = [ [0] * 8 for i in range(8) ]

In the first case, [0]*8 is calculated once, placed into a one-element list, and then replicated 8 times, giving 8 references to the same row. In the second case, [0]*8 is the expression in the list comprehension, and so is evaluated anew eight times, producing eight distinct lists to serve as our rows.

If this is still confusing, Facts and myths about Python names and values has more details, or try the code on pythontutor.com.

I need help with coverage.py

Monday 12 August 2013

Coverage.py is kind of stuck. I haven't done much to it in the last six months or so. Well, I've fixed six bugs, but I haven't released them because I want to fix more.

In particular, I want to fix the branch coverage bugs. When I wrote the branch coverage feature, I based it on bytecode analysis of the code. Coverage.py finds the possible branches by looking at the bytecode to decide which of them can jump to which . Some of those bugs are because I don't fully understand CPython bytecode.

To help me learn all the ins and outs of the CPython bytecode, I started byterun, a pure-Python implementation of a Python bytecode runner, or VM. Actually, I didn't start it, I found a thing written by Paul Swartz, and renovated and extended it. It works pretty well, but isn't complete. In particular, some of the more complex bytecodes, dealing with the block stack, don't work properly on Python 3. I would love some help finishing byterun.

Then I need to figure out if analyzing bytecode is a reasonable way to find branches in a program. I think when I started down this path, I thought that everything would be compiled to simple bytecodes like "if cond jump to x," but that's not what happens. Bytecodes like WITH_CLEANUP are complex beasts with sub-optimal documentation, and they behave slightly differently in Python 2 and Python 3.

Of course I would rather not rip out the bytecode approach and implement an AST approach, but I would like to know whether that's the way to get branch coverage that truly works well.

I'm looking for collaborators to help get through this issue. I need to get past it so I can get back to implementing features people want, like gevent support, or faster reporting, or getting rid of pickles!

« | » Main « | »