Since posting v3.5.2 beta 1 last Sunday, no one has said anything about it, so it must be perfect. The exact same code is now Coverage.py v3.5.2. This release of the foremost code coverage tool for Python includes a number of small fixes. The full details, including links to the tickets that were closed, are in the coverage.py change history.

May all your lines and branches be covered!


I just posted Coverage.py v3.5.2 beta 1. This release of the foremost code coverage tool for Python includes a number of small fixes:

  • The HTML report has been slightly tweaked.
  • You can now provide custom CSS for the HTML report if you'd like to tweak it further.
  • Source files with encodings declared at the top are properly handled in the HTML report in Python 2. They had always been handled properly in Python 3.
  • Better error handling when a supposed Python file can't be parsed.
  • Better handling of exit status for the coverage command.
  • Better installation in PyPy.

The full details, including links to the tickets that were closed, are in the coverage.py beta change history.

Please give this a try, and let me know of any problems. Given the nature of the changes, I should be upgrading it to "released" within the week.


Once upon a time, Jamie Zawinski said,

Some people, when confronted with a problem, think, "I know, I'll use regular expressions." Now they have two problems.

BTW: Jeffrey Friedl dug into the history and found that someone said it about awk before jwz said it about regular expressions!

I seem to have developed a fascination for new variants of this joke, especially where the concept being referenced is important to the structure of the joke. For example, last June I said,

Some people, when faced with a problem, think, "I know, I'll use binary." Now they have 10 problems.

The other day I contributed,

Some people, when confronted with a problem, think, "I know, I'll use threads," and then two they hav erpoblesms.

It seems that Eiríkr Åsheim earlier had a similar one,

Some people, when confronted with a problem, think "I know, I'll use multithreading". Nothhw tpe yawrve o oblems.

Making fun of Java is easy. Chris Lonnen said,

Some people see a problem and think "I know, I'll use Java!" Now they have a ProblemFactory.

Floating point can be surprising. Tom Scott quipped,

Some programmers, when confronted with a problem, think "I know, I'll use floating point arithmetic." Now they have 1.999999999997 problems.

Finally, this is not a technical joke, but is too true to leave out. Tom Dale said (and then deleted?),

Some people, wanting an escape from their full-time job, think "I know, I'll contribute to open source." Now they have two full-time jobs.

Brendan Berg has a list of others if you want more...


Last night I did a half-hour presentation about Python Iteration, starting with the basics, and touching on generators and why they are wonderful. This was part of a night of Foundational Topics at Boston Python.

I'm not sure I got the level of the material right. I think there were people there who wanted to learn more, but for whom this went too fast or over their heads. It's hard, because there's no way to make it right for everyone.


Since writing Pragmatic Unicode, or, How do I stop the pain?, I've collected a handful of Unicode-related stuff:

  • Unicode 6.1 came out last year. Andrew West's summary of the latest additions is a view from the trenches: the commenters on his blog are asking about the status of their favorite exotic script, for example.
  • A PDF showing what is proposed to be added in Unicode 6.2. It's very clear that semantic distinctions are not important for new characters! Also, we finally get U+1F5D1, TRASH CAN!
  • Michael Kaplan has a series of blog posts, Every character has a story, exploring some of the back-stories of characters in Unicode. His musings on the three monkeys are especially erudite.
  • Matt Mayer has some interesting stories about Love Hotels and Unicode. I especially like the reasoning behind the "regional indicator symbols" A-Z, to avoid having to put flags in Unicode.
  • On the lighter side, the Fake Unicode Consortium presents more creative names for Unicode characters. Currently, you can't see the characters because of a Google+ redesign, but maybe soon... ☹
  • Finally, in 1889, when telegraph messages were paid for by the word, "Unicode" was the name for a dictionary of commonly-sent phrases mapped to obscure words so that instead of sending, "Jones dines with us this evening and remains the night - Smith," you could send, "Jones Coctivus Smith." It's of course no use to us now, but interesting to see how communications technology was accommodated. Also a bit shocking to see how maternity has changed: flip to page 11 to the section labelled "Births" to see the kinds of messages people needed to commonly send.

Maybe this is crazy, but I'm looking for advice.

Conceptually, coverage.py is pretty simple. First, using the sys.settrace facility in Python, record every line that is executed. Then, after the program is done, report on those lines, and especially on lines that could have been executed but were not.
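As a rough illustration of the recording half, here is a minimal sketch built on sys.settrace. The helper name trace_lines is made up for this example, and this is not how coverage.py itself is implemented; it only shows the idea of collecting (file name, line number) pairs as lines run.

```python
import sys

def trace_lines(func, *args, **kwargs):
    """Run func and return the set of (filename, lineno) pairs it executed.

    Hypothetical helper for illustration only, not coverage.py's real tracer.
    """
    executed = set()

    def tracer(frame, event, arg):
        # "line" events fire just before each new line is executed.
        if event == "line":
            executed.add((frame.f_code.co_filename, frame.f_lineno))
        # Returning the tracer keeps tracing inside this frame.
        return tracer

    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        # Always switch tracing off again, even if func raises.
        sys.settrace(None)
    return executed
```

The file name here comes from the stack frame's code object, and it is that same name that has to be found again when it is time to report.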

Of course, the reality is more difficult. During execution, to record the line, we have to find the file name, which we get from the stack frame. Later, we look for that file by name to create the report. Sometimes, the file isn't a Python file!

One reason this can happen is if the file was actually created by a tool, and the tool provides the original source file as the reported name. For example, Jinja compiles .html files to Python code, and when the code is running, it claims to be "mytemplate.html". When coverage.py tries to report on the file, it can't parse it as Python, and things go wrong.

Originally, this error would be reported to the user. There's a -i switch that shuts off all errors like this, but it seemed dumb for coverage.py to get confused by something like this. So I changed it to not trace files named "*.html".

Of course, the world is more varied than that, so I got a report of someone with Jinja2 files named "*.jinja2" which now trip the error. So I need a more general solution.

I figure there are a couple of possibilities:

  1. Don't measure files at all if they have an extension that isn't ".py". This will let us measure extension-less files, and .py files, and will ignore all the rest, on the theory that any other extension implies that we won't be able to parse it later anyway.
  2. Measure all files, but during reporting, if a file can't be parsed, ignore the error if it has an extension that isn't ".py".
  3. (Shudder) Make a configuration option about what extensions to measure, or which to ignore.
  4. Some people want "ignore errors" to be the default, but if a file is missing for some reason, it's important to know, because it will throw off the reporting, and that shouldn't happen silently.
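To make the first possibility concrete, here is a small sketch of the extension test it describes. The function name should_measure is hypothetical and is not part of coverage.py; it only shows the rule "measure .py files and extension-less files, skip everything else."

```python
import os.path

def should_measure(filename):
    """Decide whether a file should be measured, based only on its extension.

    Hypothetical helper sketching option 1, not coverage.py's actual logic.
    """
    ext = os.path.splitext(filename)[1]
    # An empty extension covers scripts with no extension at all.
    return ext in ("", ".py")

for name in ["prog.py", "myscript", "mytemplate.html", "page.jinja2"]:
    print(name, should_measure(name))
```

Under this rule both "mytemplate.html" and "page.jinja2" are skipped without needing a list of template extensions.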

Do people ever name their Python source files something other than "*.py"? Are there weird ecosystems like this that I'll only hear about if I make one of these changes?

This is a question that crops up often:

I have two nested loops, and inside, how can I break out of both loops at once?

Python doesn't offer a way to break out of two (or more) loops at once, so the naive approach looks like this:

done = False
for x in range(10):
    for y in range(20):
        if some_condition(x, y):
            done = True
            break
        do_something(x, y)
    if done:
        break

This works, but seems unfortunate. A lot of noise here concerns the breaking out of the loop, rather than the work itself.

The sophisticated approach is to get rid of, or at least hide away, the double loop. Looked at another way, this code is really iterating over one sequence of things, a sequence of pairs. Using Python generators, we can neatly encapsulate the pair-ness, and get back to one loop:

def pairs_range(limit1, limit2):
    """Produce all pairs in (0..`limit1`-1, 0..`limit2`-1)"""
    for i1 in range(limit1):
        for i2 in range(limit2):
            yield i1, i2

for x, y in pairs_range(10, 20):
    if some_condition(x, y):
        break
    do_something(x, y)

Now our code is nicely focused on the work at hand, and the mechanics of the double loop needed to produce a sequence of pairs is encapsulated in pairs_range.

Naturally, pairs_range could become more complex: more interesting ranges, or triples and beyond rather than just pairs. Adapt it to your own needs.
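For plain ranges like these, the standard library's itertools.product produces the same tuples in the same order, and it extends to triples and beyond. The sketch below uses a made-up name, ranges_product, and a stand-in condition in place of some_condition:

```python
import itertools

def ranges_product(*limits):
    """Produce all tuples in (0..limits[0]-1, 0..limits[1]-1, ...)."""
    return itertools.product(*(range(limit) for limit in limits))

found = None
for x, y, z in ranges_product(3, 4, 5):
    if x + y + z == 6:      # stand-in for some_condition(x, y, z)
        found = (x, y, z)
        break               # one break leaves the whole iteration
```

A hand-written generator is still the better fit when the inner range depends on the outer value, since itertools.product only combines independent iterables.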

As with any language, you can approach Python as if it were C/Java/Javascript with different syntax, and many people do at first, relying on concepts they already know. Once you scratch the surface, Python provides rich features that take you off that track. Iteration is one of the first places you can find your Python wings.


Pi day (two days ago) passed without notice here, but then Eric Johnson posted a comment on last year's pi day post:

Ancient Egyptians may have thought Pi was 256/81: Approximations of π.

256/81 is about 3.16049382716049382716, which is approx 0.6% above the value of Pi. 22/7 is approx 0.04% more than Pi, so the ancient Egyptians weren't particularly accurate, but the numerator and denominator they chose are interesting for another reason.

256/81 can be expressed as 2^8 / 3^4, which can be expressed as 2^2^3/3^2^2, which of course is a palindrome.

Posted on A.E. Pi day, 2012 (A.E. = Ancient Egyptian)

I had never heard any of this before, and was delighted.
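The arithmetic is easy to check in Python, where ** happens to be right-associative, so the palindromic form can be typed in directly. This is just a quick sketch; percent_error is a name invented for it, and a positive result means the approximation is larger than pi.

```python
import math
from fractions import Fraction

def percent_error(approx):
    """Signed error of approx relative to pi, in percent (positive = too big)."""
    return (float(approx) - math.pi) / math.pi * 100

# ** is right-associative, so 2**2**3 means 2**(2**3), which is 2**8.
egyptian = Fraction(2**2**3, 3**2**2)

for name, value in [("256/81", egyptian), ("22/7", Fraction(22, 7))]:
    print("%s: %+.3f%%" % (name, percent_error(value)))
```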

Poking around on the Wikipedia page about approximating pi, I found this interesting tidbit: there are points in the Mandelbrot set whose iteration escape counts provide arbitrarily accurate estimates to pi! Will the wonders never cease?

Happy belated Pi Day!


Last week was PyCon 2012, and I had a blast as always. I gave a talk entitled Pragmatic Unicode, or, How Do I Stop the Pain?

I chose the topic because I thought it would appeal to many Python developers, and because I knew all about it. Turns out I didn't! But it was great learning more details as I went. And then I filled in a few more tidbits by chatting with Martin v. Löwis at PyCon.

Part of the fun of this talk was finding the Unicode characters to decorate it with, and then building the credits slide at the end on the plane. It's all built with Cog to avoid cut-and-paste nightmares. Look at the HTML source of the actual presentation if you're interested in the Cog twistiness.

Of course, Unicode is a much bigger topic than this, but 25 minutes is what it is. Enjoy: the video, slides, and full text are there.

This blog started ten years ago today, with a post about My first job ever. It's strange to think about those ten years. At the time, it seemed late to be starting a blog, but now having a blog going back ten years makes it seem like one of the ancients.

I wrote far more frequently then than I do now, partly because of the novelty of it, partly because of time pressures, and partly because Twitter gets the shorter tossed-off ideas now. But I still value having a place to express myself when the universe moves me to.

If you haven't been a long-time reader, the most unusual post here was about dinner at the White House, though by far the most popular post was the animated CSS Homer. Of course I find much else in the archives that I would like to point out to you, but won't.

When I started this ten years ago, I didn't know what would come of it. As a side project, there were no requirements on it, and I could take it wherever I felt like taking it. It's still that way: I don't know what topics will find their way here in the next year or ten, and I'm interested to find out.


I'm really proud of how Boston Python has grown in the last year. We have over 1750 members, making us (I think) the largest local Python user group in the world.

I wanted a way for all of the Boston people to identify themselves at PyCon. My first thought was naturally t-shirts, but they're complex to make, and then people have to buy them, and you'd only wear it on one PyCon day (I hope!), and some people don't like wearing t-shirts in the first place. So that didn't happen.

Then I thought about simple stickers for the badge holder, but I didn't know this year's badge design, so I didn't know how big I could make a sticker so that it wouldn't obscure the name.

So how about a sticker that hangs off the side of the badge? Kind of like the "Speaker" flags on the bottom. So I designed these:

PyCon badge stickers

They hang off the left side of the badge, and can be printed at home for procrastinators like me. Hopefully, the folding instructions make sense to people. Maybe it will catch on, and other tribes will make flags for their members to wear.

I'll have a bunch of them with me at PyCon. If you need one, just ask!


PyCon is nearly upon us, which means I am preparing slides for my talk, which means I am using Cog again. Cog runs bits of Python code in a text file, and interpolates the output of the code into the file. It's useful for writing small programs to augment text, in this case, to produce code output in my HTML file.

This year I needed some Python 3 examples, so the biggest change in Cog v2.3 is that it is runnable on Python 3. I also took the opportunity to drop support for Pythons older than 2.6.

Having both Python 2 and Python 3 examples in one file means I need to run it through Cog twice, once under each version of Python. But when running in Python 2, I need the Python 3 examples to stay as they are. So I added a new attribute to the cog module, cog.previous. This is simply a string containing the output from the last run of Cog. A code chunk can now decide to "do nothing" simply by outputting cog.previous.
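A chunk's "do nothing" decision might look something like this sketch. The helper name emit_for_python and its parameters are invented for illustration; only cog.out and cog.previous come from Cog itself. The cog module is passed in as an argument so the function can also be exercised outside of a Cog run.

```python
import sys

def emit_for_python(cog, wanted_major, make_output,
                    running_major=sys.version_info[0]):
    """Write fresh output only under the wanted major Python version.

    Hypothetical helper: under any other version, the output left by the
    last run of Cog is written back unchanged.
    """
    if running_major == wanted_major:
        cog.out(make_output())
    else:
        # "Do nothing" by re-emitting what was already in the file.
        cog.out(cog.previous)
```

Inside a Cog chunk, this would be called after "import cog", for example as emit_for_python(cog, 3, some_function_returning_text), where some_function_returning_text is whatever produces the Python 3 example output.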

A few other mini-features are in: a dash as a file name means process standard input, and Cog can be run as "python -m cogapp", which helped my 2 vs. 3 switcheroo.

Enjoy Cog 2.3.
