Re-ruling .rst

Friday 12 May 2017

Sometimes, you need a small job done, and you can write a small Python program, and it does just what you need, and it pleases you.

I have some Markdown files to convert to reStructuredText. Pandoc does a really good job. But it chooses a different order for heading punctuation than our house style, and I didn't see a way to control it.

But it was easy to write a small thing to do the small thing:

import re
import sys

# The order we want our heading rules.
GOOD_RULES = '#*=-.~'

# A rule is any line of all the same non-word character, 3 or more.
RULE_RX = r"^([^\w\d])\1\1+$"

def rerule_file(f):
    rules = {}
    for line in f:
        line = line.rstrip()
        rule_m = re.search(RULE_RX, line)
        if rule_m:
            if line[0] not in rules:
                rules[line[0]] = GOOD_RULES[len(rules)]
            line = rules[line[0]] * len(line)
        print(line)

rerule_file(sys.stdin)

If you aren't conversant in .rst: there's no fixed rule for which punctuation means which heading level. The first rule style encountered is heading 1, the next style found is heading 2, and so on.
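To see the re-ruling in action, here's a quick check that feeds a two-level fragment through the same logic as the script above (same definitions, with the output captured instead of going to the terminal):

```python
import io
import re
from contextlib import redirect_stdout

# Same definitions as in the script above.
GOOD_RULES = '#*=-.~'
RULE_RX = r"^([^\w\d])\1\1+$"

def rerule_file(f):
    rules = {}
    for line in f:
        line = line.rstrip()
        if re.search(RULE_RX, line):
            if line[0] not in rules:
                rules[line[0]] = GOOD_RULES[len(rules)]
            line = rules[line[0]] * len(line)
        print(line)

src = "Title\n=====\n\nSection\n-------\n"
buf = io.StringIO()
with redirect_stdout(buf):
    rerule_file(io.StringIO(src))

# '=' was the first rule seen, so it becomes '#';
# '-' was the second, so it becomes '*'.
print(buf.getvalue())
```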

There might be other ways to do this, but this makes me happy.

Shell = Maybe

Monday 24 April 2017

A common Python help question: how do I get Python to run this complicated command line program? Often, the answer involves details of how shells work. I tried my hand at explaining what a shell does, why you want to avoid one, how to avoid it from Python, and why you might want to use one anyway: Shell = Maybe.
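The gist, sketched with the subprocess module: pass a list of arguments and no shell is involved at all; pass shell=True and the whole string goes through /bin/sh first (this sketch assumes a Unix-like system with echo available):

```python
import subprocess

# A list of arguments runs the program directly: no shell, so no
# quoting pitfalls, word splitting, or globbing surprises.
direct = subprocess.run(
    ["echo", "hello world"], capture_output=True, text=True
)
print(direct.stdout)

# shell=True hands the string to /bin/sh, which expands variables,
# splits words, and globs before anything actually runs.
shelled = subprocess.run(
    "echo hello world", shell=True, capture_output=True, text=True
)
print(shelled.stdout)
```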

Text-mode menu bar indicators

Monday 17 April 2017

I recently upgraded my Mac operating system, and decided to try out a new feature: automatically hiding the menu bar. This gives me back another sliver of vertical space. But it has a drawback: I no longer have the time, battery life, and speaker volume indicators available at a glance.

I went looking for a thing that I figured must exist: a Mac app that would display that information in a dock icon. I already have a dock clock. I found a dock battery indicator, though it tried so hard to be cute and pictorial, I couldn't tell what it was telling me.

Asking around, I got a recommendation for GeekTool. It lets you draw a panel on your desktop, and then draw in the panel with the output of a script. Now the ball was back in my court: I could build my own thing.

I'd long ago moved the dock to the left side of the screen (again, to use all the vertical space for my own stuff). This left a small rectangle of desktop visible at the upper left and lower left, even with maximized windows. I drew a panel in the upper left of the desktop, and set it to run this script every five seconds:

#!/usr/bin/env python3.6

import datetime
import re
import subprocess

def block_eighths(eighths):
    """Return the Unicode string for a block of so many eighths."""
    assert 0 <= eighths <= 8
    if eighths == 0:
        return "\u2003"
    else:
        return chr(0x2590 - eighths)

def gauge(percent):
    """Return a two-char string drawing a 16-part gauge."""
    slices = round(percent / (100 / 16))
    b1 = block_eighths(min(slices, 8))
    b2 = block_eighths(max(slices - 8, 0))
    return b1 + b2

now = datetime.datetime.now()
print(f"{now:%-I:%M\n%-m/%-d}")

batt = subprocess.check_output(["pmset", "-g", "batt"]).decode('utf8').splitlines()
m = re.search(r"\d+%", batt[1])
if m:
    level = m.group(0)
    batt_percent = int(level[:-1])
else:
    level = "??%"
    batt_percent = 0
if "discharging" in batt[1]:
    arrow = "\u25bc"        # BLACK DOWN-POINTING TRIANGLE
elif "charging" in batt[1]:
    arrow = "\u25b3"        # WHITE UP-POINTING TRIANGLE
else:
    arrow = ""

print(level + arrow)
print(gauge(batt_percent) + "\u2578")   # BOX DRAWINGS HEAVY LEFT

vol = subprocess.check_output(["osascript", "-e", "get volume settings"]).decode('utf8')
m = re.search(r"^output volume:(\d+), .* muted:(\w+)", vol)
if m:
    level, muted = m.groups()
    if muted == 'true':
        level = "\u20e5"        # COMBINING REVERSE SOLIDUS OVERLAY
    print(f"\u24cb{level}")     # CIRCLED LATIN CAPITAL LETTER V

# For debugging: output the raw data, but pushed out of view.
print(f"{'':30}{batt}")
print(f"{'':30}{vol}")

This displays the time, date, battery level (both numerically and as a crudely drawn battery gauge), whether the battery is charging or discharging, and the volume level:

All that information, crammed into a tiny corner
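If you want to play with the gauge without GeekTool, the two drawing functions from the script stand on their own:

```python
def block_eighths(eighths):
    """Return the Unicode string for a block of so many eighths."""
    assert 0 <= eighths <= 8
    if eighths == 0:
        return "\u2003"     # EM SPACE as an empty cell
    else:
        return chr(0x2590 - eighths)    # U+2588..U+258F block elements

def gauge(percent):
    """Return a two-char string drawing a 16-part gauge."""
    slices = round(percent / (100 / 16))
    b1 = block_eighths(min(slices, 8))
    b2 = block_eighths(max(slices - 8, 0))
    return b1 + b2

for pct in (0, 25, 50, 75, 100):
    print(f"{pct:3d}% {gauge(pct)}\u2578")
```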

BTW, if you are looking for Python esoterica, there are a few little-known things going on in this line:

print(f"{now:%-I:%M\n%-m/%-d}")
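Spoilers, for the curious: everything after the colon is a format spec, which datetime hands off to strftime. So the \n in the middle of the spec becomes a real newline in the output, and %-I and %-m use the glibc/BSD strftime flag that suppresses leading zeros (it isn't available on Windows):

```python
import datetime

now = datetime.datetime(2017, 4, 17, 9, 5)

# The format spec "%-I:%M\n%-m/%-d" is passed to now.__format__,
# which delegates to strftime, newline and all.
assert f"{now:%-I:%M\n%-m/%-d}" == now.strftime("%-I:%M\n%-m/%-d")
print(f"{now:%-I:%M\n%-m/%-d}")
```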

Finding Unicode characters to represent things was a bit of a challenge. I couldn't find exactly what I need for the right tip of the battery gauge, but it works well enough.

GeekTool also does web pages, though in a quick experiment I couldn't make it do anything useful, so I stuck with text mode. There also seem to be old forks of GeekTool that offer text colors, which could be cool, but it started to feel a bit off-the-path.

This works great for what it does.

Clean-text bookmarklet

Saturday 8 April 2017

I love the text-based web. I love that people can speak their minds, express opinions, encourage each other, and create a lively world of words. This also means they are free to design their text in, shall we say, expressive ways. Those ways are not always ideal for actually reading the words.

Today I really liked Tiberius Hefflin's Part of That World, about the need to recognize non-code contributions in open source projects. You should read it, it is good and true.

But when I first got to the page, I saw this:

Screenshot of gray text on black background, slightly letterspaced

To start with the positive, this text has an elegance to it. It gives a peaceful quiet impression. It pairs perfectly with the mermaid illustration on the page. But I find it hard to read. This typeface is too weak to be light-on-dark, and letterspacing is almost always a bad idea for body text. It isn't even white-on-black, it's 70% white on black, so the letters seem to be hiding in the dark.

I don't mean to pick on this page. It's a well-designed page. There's clearly a mood being created here, and it's been established well. There are many pages online that veer much farther from the usual than this.

My solution for pages like this is a bookmarklet that strips away idiosyncrasies in text layout. It changes text to almost-black on white, removes letterspacing and shadows, and changes full-justified text to left-justified. When I use the bookmarklet on Part of That World, it looks like this:

Screenshot of cleaned-up text

You might prefer the original. That's fine, to each their own. You might feel like the personality has been bleached from this text. To some extent, that's true. But I saw the original, and can choose between them. This helped me to read the words, and not get snagged on the design of the page.

This is the bookmarklet: Clean text.

This is the JavaScript code in the bookmarklet, formatted and tweaked so you can read it:

javascript:(function () {
    var newSS = document.createElement('link'),
        styles = (
            '* { ' +
                'background: #fff; color: #111; ' +
                'letter-spacing: 0; text-shadow: none; hyphens: none; ' +
            '}' +
            ':link, :link * { color: #0000EE; } ' +
            ':visited, :visited * { color: #551A8B; }'
        ).replace(/;/g,' !important;');
    newSS.rel = 'stylesheet';
    newSS.href = 'data:text/css,' + escape(styles);
    document.getElementsByTagName('head')[0].appendChild(newSS);
    var els = document.getElementsByTagName('*');
    for (var i = 0, el; el = els[i]; i++) {
        if (getComputedStyle(el).textAlign === 'justify') {
            el.style.textAlign = 'left';
        }
    }
})();

There are other solutions to eccentrically designed pages. You could read blogs in a single aggregating RSS reader. But then everything is completely homogenized, and you don't even get a chance to experience the design as the author intended. Writers can flock (and are flocking) to sites like Medium, which again homogenizes the design.

By the way, full disclosure: I don't like the design of my own site, the page you are (probably) currently reading. I have been working on a re-design on and off for months. Maybe eventually it will be finished. The text will be serif, and larger, with a responsive layout and fewer distractions. Some day.

IronPython is weird

Wednesday 15 March 2017

Have you fully understood how Python 2 and Python 3 deal with bytes and Unicode? Have you watched Pragmatic Unicode (also known as the Unicode Sandwich, or unipain) forwards and backwards? You're a Unicode expert! Nothing surprises you any more.

Until you try IronPython...

Turns out IronPython 2.7.7 has str as unicode!

C:\Users\Ned>"\Program Files\IronPython 2.7\ipy.exe"
IronPython 2.7.7 (2.7.7.0) on .NET 4.0.30319.42000 (32-bit)
Type "help", "copyright", "credits" or "license" for more information.
>>> "abc"
'abc'
>>> type("abc")
<type 'str'>
>>> u"abc"
'abc'
>>> type(u"abc")
<type 'str'>
>>> str is unicode
True
>>> str is bytes
False

String literals work kind of like they do in Python 2: \u escapes are recognized in u"" strings, but not in "" strings, yet both produce the same type:

>>> "abc\u1234"
'abc\\u1234'
>>> u"abc\u1234"
u'abc\u1234'

Notice that the repr of this str/unicode type will use a u-prefix if any character is non-ASCII, but if the string is all ASCII, then the prefix is omitted.

OK, so how do we get a true byte string? I guess we could encode a unicode string? WRONG. Encoding a unicode string produces another unicode string with the encoded byte values as code points!:

>>> u"abc\u1234".encode("utf8")
u'abc\xe1\x88\xb4'
>>> type(_)
<type 'str'>

Surely we could at least read the bytes from a file with mode "rb"? WRONG.

>>> type(open("foo.py", "rb").read())
<type 'str'>
>>> type(open("foo.py", "rb").read()) is unicode
True

On top of all this, I couldn't find docs that explain that this happens. The IronPython docs just say, "Since IronPython is a implementation of Python 2.7, any Python documentation is useful when using IronPython," and then links to the python.org documentation.

A decade-old article on InfoQ, The IronPython, Unicode, and Fragmentation Debate, discusses this decision, and points out correctly that it's due to needing to mesh well with the underlying .NET semantics. It seems very odd not to have documented it some place. Getting coverage.py working even minimally on IronPython was an afternoon's work of discovering each of these oddnesses empirically.

Also, that article quotes Guido van Rossum (from a comment on Calvin Spealman's blog):

You realize that Jython has exactly the same str==unicode issue, right? I've endorsed this approach for both versions from the start. So I don't know what you are so bent out of shape about.

I guess things have changed with Jython in the intervening ten years, because it doesn't behave that way now:

$ jython
Jython 2.7.1b3 (default:df42d5d6be04, Feb 3 2016, 03:22:46)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_31
Type "help", "copyright", "credits" or "license" for more information.
>>> 'abc'
'abc'
>>> type(_)
<type 'str'>
>>> str is unicode
False
>>> type("abc")
<type 'str'>
>>> type(u"abc")
<type 'unicode'>
>>> u"abc".encode("ascii")
'abc'
>>> u"abc"
u'abc'

If you want to support IronPython, be prepared to rethink how you deal with bytes and Unicode. I haven't run the whole coverage.py test suite on IronPython, so I don't know if other oddities are lurking there.

Rubik's algorithms

Sunday 26 February 2017

Recently, a nephew asked about how to solve a Rubik's Cube. I couldn't sit down with him to show him what I knew, so I looked around the web for explanations. I was surprised by two things: first, that all the pages offering solutions seemed to offer the same one, even down to the colors discussed: "Start by making a white cross, ..., finally, finish the yellow side."

Second, that the techniques (or "algorithms") were often given without explanation. They're presented as something to memorize.

My own solving technique uses a few algorithms constructed in a certain way that I describe in Two-Part Rubik's Algorithms. I wrote them up as a resource I hope my nephew will be able to use.

A Rubik's Cube with two edges flipped

BTW, that page makes use of Conrad Rider's impressive TwistySim library.

Https

Saturday 25 February 2017

Someone posted a link to my latest blog post on /r/Python, but somehow got an https link for it. That's odd: my site doesn't even properly serve content over https. People were confused by the broken link.

I should say, my site didn't even serve content over https, because now it does. I'd been meaning to enable https, and force its use, for a long time. This broken link pushed it to the top of the list.

Let's Encrypt is the certificate authority of choice these days, because they are free and automatable. And people say they make it easy, but I have to say, I would not have classified this as easy. I'm sure it's easier than it used to be, but it's still a confusing maze of choices, with decision points you are expected to navigate.

Actually getting everything installed either requires sudo, or means leaning on third-party tools, with instructions from obscure blog posts. There's clearly still room for improvement.

Once you have the certificate in place, you need to redirect your http site to https. Then you have to fix the http references in your site. Protocol-relative (or scheme-less) URLs are handy here.

It's all done now, the entire site should always be https. I'm glad I finally got the kick in the pants to do it. If you find something wrong, let me know.

A tale of two exceptions, continued

Thursday 23 February 2017

In my last blog post, A tale of two exceptions, I laid out the long drawn-out process of trying to get a certain exception to make tests skip in my test runner. I ended on a solution I liked at the time.

But it still meant having test-specific code in the product code, even if it was only a single line to set a base class for an exception. It didn't feel right to say "SkipTest" in the product code, even once.

In that blog post, I said,

One of the reasons I write this stuff down is because I'm hoping to get feedback that will improve my solution, or advance my understanding. ... a reader might object and say, "you should blah blah blah."

Sure enough, Ionel said,

A better way is to handle this in coverage's test suite. Possible solution: wrap all your tests in a decorator that reraises with a SkipException.

I liked this idea. The need was definitely a testing need, so it should be handled in the tests. First I tried doing something with pytest to get it to do the conversion of exceptions for me. But I couldn't find a way to make it work.

So: how to decorate all my tests? The decorator itself is fairly simple. Just call the method with all the arguments, and return its value, but if it raises StopEverything, then raise SkipTest instead:

def convert_skip_exceptions(method):
    """A decorator for test methods to convert StopEverything to SkipTest."""
    def wrapper(*args, **kwargs):
        """Run the test method, and convert exceptions."""
        try:
            result = method(*args, **kwargs)
        except StopEverything:
            raise unittest.SkipTest("StopEverything!")
        return result
    return wrapper

But decorating all the test methods would mean adding a @convert_skip_exceptions line to hundreds of test methods, which I clearly was not going to do. I could use a class decorator, which meant I would only have to add a decorator line to dozens of classes. That also felt like too much to do and remember to do in the future when I write new test classes.

It's not often I say this, but: it was time for a metaclass. Metaclasses are one of the darkest magics Python has, and they can be mysterious. At heart, they are simple, but in a place you don't normally think to look. Just as a class is used to make objects, a metaclass is used to make classes. Since there's something I want to do every time I make a new class (decorate its methods), a metaclass gives me the tools to do it.

class SkipConvertingMetaclass(type):
    """Decorate all test methods to convert StopEverything to SkipTest."""
    def __new__(mcs, name, bases, attrs):
        for attr_name, attr_value in attrs.items():
            right_name = attr_name.startswith('test_')
            right_type = isinstance(attr_value, types.FunctionType)
            if right_name and right_type:
                attrs[attr_name] = convert_skip_exceptions(attr_value)

        return super(SkipConvertingMetaclass, mcs).__new__(mcs, name, bases, attrs)

There are details here that you can skip as incantations if you like. Classes are all instances of "type", so if we want to make a new thing that makes classes, it derives from type to get those same behaviors. The method that gets called when a new class is made is __new__. It gets passed the metaclass itself (just as classmethods get cls and instance methods get self), the name of the class, the tuple of base classes, and a dict of all the names and values defining the class (the methods, attributes, and so on).

The important part of this metaclass is what happens in the __new__ method. We look at all the attributes being defined on the class. If the name starts with "test_", and it's a function, then it's a test method, and we decorate the value with our decorator. Remember that @-syntax is just a shorthand for passing the function through the decorator, which we do here the old-fashioned way.

Then we use super to let the usual class-defining mechanisms in "type" do their thing. Now all of our test methods are decorated, with no explicit @-lines in the code. There's only one thing left to do: make sure all of our test classes use the metaclass:

CoverageTestMethodsMixin = SkipConvertingMetaclass('CoverageTestMethodsMixin', (), {})

class CoverageTest(
    ... some other mixins ...
    CoverageTestMethodsMixin,
    unittest.TestCase,
):
    """The base class for all coverage.py test classes."""

Metaclasses make classes, just the way classes make instances: you call them. Here we call our metaclass with the arguments it needs (class name, base classes, and attributes) to make a class called CoverageTestMethodsMixin.

Then we use CoverageTestMethodsMixin as one of the base classes of CoverageTest, which is the class used to derive all of the actual test classes.
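Putting the pieces together in a runnable sketch (with a stand-in StopEverything, since coverage.py's internals aren't here) shows the whole chain working: the metaclass decorates the test method, and the StopEverything it raises comes out the other end as a skip:

```python
import types
import unittest

class StopEverything(Exception):
    """Stand-in for the product exception."""

def convert_skip_exceptions(method):
    """A decorator for test methods to convert StopEverything to SkipTest."""
    def wrapper(*args, **kwargs):
        try:
            return method(*args, **kwargs)
        except StopEverything:
            raise unittest.SkipTest("StopEverything!")
    return wrapper

class SkipConvertingMetaclass(type):
    """Decorate all test methods to convert StopEverything to SkipTest."""
    def __new__(mcs, name, bases, attrs):
        for attr_name, attr_value in attrs.items():
            if attr_name.startswith('test_') and isinstance(attr_value, types.FunctionType):
                attrs[attr_name] = convert_skip_exceptions(attr_value)
        return super(SkipConvertingMetaclass, mcs).__new__(mcs, name, bases, attrs)

Mixin = SkipConvertingMetaclass('Mixin', (), {})

class DemoTest(Mixin, unittest.TestCase):
    def test_stops(self):
        raise StopEverything()

result = unittest.TestResult()
DemoTest('test_stops').run(result)
print(result.skipped)       # one (test, reason) pair, reason "StopEverything!"
```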

Pro tip: if you are using unittest-style test classes, make a single class to be the base of all of your test classes, you will be glad.

After all of these class machinations, what have we got? Our test classes all derive from a base class which uses a metaclass to decorate all the test methods. As a result, any test which raises StopEverything will instead raise SkipTest to the test runner, and the test will be skipped. There's now no mention of SkipTest in the product code at all. Better.

A tale of two exceptions

Sunday 22 January 2017

It was the best of times, it was the worst of times...

This week saw the release of three different versions of Coverage.py. This is not what I intended. Clearly something was getting tangled up. It had to do with some tricky exception handling. The story is kind of long and intricate, but has a number of chewy nuggets that fascinate me. Your mileage may vary.

Writing it all out, many of these missteps seem obvious and stupid. If you take nothing else from this, know that everyone makes mistakes, and we are all still trying to figure out the best way to solve some problems.

It started because I wanted to get the test suite running well on Jython. Jython is hard to support in Coverage.py: it can do "coverage run", but because it doesn't have the same internals as CPython, it can't do "coverage report" or any of the other reporting code. Internally, there's one place in the common reporting code where we detect this, and raise an exception. Before all the changes I'm about to describe, that code looked like this:

for attr in ['co_lnotab', 'co_firstlineno']:
    if not hasattr(self.code, attr):
        raise CoverageException(
            "This implementation of Python doesn't support code analysis.\n"
            "Run coverage.py under CPython for this command."
        )

The CoverageException class is derived from Exception. Inside of Coverage.py, all exceptions raised are derived from CoverageException. This is a good practice for any library. For the coverage command-line tool, it means we can catch CoverageException at the top of main() so that we can print the message without an ugly traceback from the internals of Coverage.py.

The problem with running the test suite under Jython is that this "can't support code analysis" exception was being raised from hundreds of tests. I wanted to get to zero failures or errors, either by making the tests pass (where the operations were supported on Jython) or skipping the tests (where the operations were unsupported).

There are lots of tests in the Coverage.py test suite that are skipped for all sorts of reasons. But I didn't want to add decorators or conditionals to hundreds of tests for the Jython case. First, it would be a lot of noise in the tests. Second, it's not always immediately clear from a test that it is going to touch the analysis code. Lastly and most importantly, if someday in the future I figured out how to do analysis on Jython, or if it grew the features to make the current code work, I didn't want to have to then remove all that test-skipping noise.

So I wanted to somehow automatically skip tests when this particular exception was raised. The unittest module already has a way to do this: tests are skipped by raising a unittest.SkipTest exception. If the exception raised for "can't support code analysis" derived from SkipTest, then the tests would be skipped automatically. Genius idea!
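The SkipTest mechanism is easy to see on its own: any test that raises it (or an exception derived from it) is recorded as skipped, no decorators required:

```python
import unittest

class DemoTest(unittest.TestCase):
    def test_not_here(self):
        # Raising SkipTest anywhere in a test marks it as skipped.
        raise unittest.SkipTest("not on this platform")

result = unittest.TestResult()
DemoTest('test_not_here').run(result)
print(result.skipped)       # one (test, reason) pair
```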

So in 4.3.2, the code changed to this (spread across a few files):

from coverage.backunittest import unittest

class StopEverything(unittest.SkipTest):
    """An exception that means everything should stop.

    This derives from SkipTest so that tests that spring this trap will be
    skipped automatically, without a lot of boilerplate all over the place.

    """
    pass

class IncapablePython(CoverageException, StopEverything):
    """An operation is attempted that this version of Python cannot do."""
    pass

...

# Alternative Python implementations don't always provide all the
# attributes on code objects that we need to do the analysis.
for attr in ['co_lnotab', 'co_firstlineno']:
    if not hasattr(self.code, attr):
        raise IncapablePython(
            "This implementation of Python doesn't support code analysis.\n"
            "Run coverage.py under another Python for this command."
        )

It felt a little off to derive a product exception (StopEverything) from a testing exception (SkipTest), but that seemed acceptable. One place in the code, I had to deal specifically with StopEverything. In an inner loop of reporting, we catch exceptions that might happen on individual files being reported. But if this exception happens once, it will happen for all the files, so we wanted to end the report, not show this failure for every file. In pseudo-code, the loop looked like this:

for f in files_to_report:
    try:
        generate_report_for_file(f)
    except StopEverything:
        # Don't report this on single files, it's a systemic problem.
        raise
    except Exception as ex:
        record_exception_for_file(f, ex)

This all seemed to work well: the tests skipped properly, without a ton of noise all over the place. There were no test failures in any supported environment. Ship it!

Uh-oh: very quickly, reports came in that coverage didn't work on Python 2.6 any more. In retrospect, it was obvious: the whole point of the "from coverage.backunittest" line in the code above was that Python 2.6 doesn't have unittest.SkipTest. For the Coverage.py tests on 2.6, I install unittest2 to get a backport of the things 2.6 is missing, and that gave me SkipTest. But without my test requirements installed, it doesn't exist.

So my tests passed on 2.6 because I installed a package that provided what was missing, but in the real world, unittest.SkipTest is truly missing.

This is a conundrum that I don't have a good answer to:

How can you test your code to be sure it works properly when the testing requirements aren't installed?

To fix the problem, I changed the definition of StopEverything. Coverage.py 4.3.3 went out the door with this:

class StopEverything(unittest.SkipTest if env.TESTING else object):
    """An exception that means everything should stop."""
    pass

The env.TESTING setting was a pre-existing variable: it's true if we are running the coverage.py test suite. This also made me uncomfortable: as soon as you start conditionalizing on whether you are running tests or not, you have a very slippery slope. In this case it seemed OK, but it wasn't: it hid the fact that deriving an exception from object is a dumb thing to do.

So 4.3.3 failed also, and not just on Python 2.6. As soon as an exception was raised inside that reporting loop that I showed above, Python noticed that I was trying to catch a class that doesn't derive from Exception. Of course, my test suite didn't catch this, because when I was running my tests, my exception derived from SkipTest.

Changing "object" to "Exception" would fix the problem, but I didn't like the test of env.TESTING anyway. So for 4.3.4, the code is:

class StopEverything(getattr(unittest, 'SkipTest', Exception)):
    """An exception that means everything should stop."""
    pass

This is better, first because it uses Exception rather than object. But also, it's duck-typing the base class rather than depending on env.TESTING.

But as I kept working on getting rid of test failures on Jython, I got to this test failure (pseudo-code):

def test_sort_report_by_invalid_option(self):
    msg = "Invalid sorting option: 'Xyzzy'"
    with self.assertRaisesRegex(CoverageException, msg):
        coverage.report(sort='Xyzzy')

This is a reporting operation, so Jython will fail with a StopEverything exception saying, "This implementation of Python doesn't support code analysis." StopEverything is a CoverageException, so the assertRaisesRegex will catch it, but it will fail because the messages don't match.

StopEverything is both a CoverageException and a SkipTest, but the SkipTest is the more important aspect. To fix the problem, I did this, but felt silly:

def test_sort_report_by_invalid_option(self):
    msg = "Invalid sorting option: 'Xyzzy'"
    with self.assertRaisesRegex(CoverageException, msg):
        try:
            coverage.report(sort='Xyzzy')
        except SkipTest:
            raise SkipTest()

I knew this couldn't be the right solution. Talking it over with some co-workers (OK, I was griping and whining), we came up with the better solution. I realized that CoverageException is used in the code base to mean, "an ordinary problem from inside Coverage.py." StopEverything is not an ordinary problem. It reminded me of typical mature exception hierarchies, where the main base class, like Exception, isn't actually the root of the hierarchy. There are always a few special-case classes that derive from a real root higher up.

For example, in Python, the classes Exception, SystemExit, and KeyboardInterrupt all derive from BaseException. This is so "except Exception" won't interfere with SystemExit and KeyboardInterrupt, two exceptions meant to forcefully end the program.

I needed the same thing here, for the same reason. I want to have a way to catch "all" exceptions without interfering with the exceptions that mean "end now!" I adjusted my exception hierarchy, and now the code looks like this:

class BaseCoverageException(Exception):
    """The base of all Coverage exceptions."""
    pass

class CoverageException(BaseCoverageException):
    """A run-of-the-mill exception specific to coverage.py."""
    pass

class StopEverything(
        BaseCoverageException,
        getattr(unittest, 'SkipTest', Exception)
    ):
    """An exception that means everything should stop."""
    pass

Now I could remove the weird SkipTest dance in that test. The catch clause in my main() function changes from CoverageException to BaseCoverageException, and things work great. The end...?
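The payoff is easy to check with a stand-in version of the hierarchy: an "except CoverageException" no longer swallows StopEverything, but a top-level "except BaseCoverageException" still sees it, and it still counts as a skip under a test runner:

```python
import unittest

class BaseCoverageException(Exception):
    """The base of all Coverage exceptions."""

class CoverageException(BaseCoverageException):
    """A run-of-the-mill exception specific to coverage.py."""

class StopEverything(
        BaseCoverageException,
        getattr(unittest, 'SkipTest', Exception)
    ):
    """An exception that means everything should stop."""

caught = None
try:
    raise StopEverything("no code analysis here")
except CoverageException:
    caught = "ordinary"          # not taken: StopEverything isn't one
except BaseCoverageException:
    caught = "stop everything"   # taken: the root still catches it
print(caught)
```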

One of the reasons I write this stuff down is because I'm hoping to get feedback that will improve my solution, or advance my understanding. As I lay out this story, I can imagine points of divergence: places in this narrative where a reader might object and say, "you should blah blah blah." For example:

  • "You shouldn't bother supporting 2.6." Perhaps not, but that doesn't change the issues explored here, just makes them less likely.
  • "You shouldn't bother supporting Jython." Ditto.
  • "You should just have dependencies for the things you need, like unittest2." Coverage.py has a long-standing tradition of having no dependencies. This is driven by a desire to be available to people porting to new platforms, without having to wait for the dependencies to be ported.
  • "You should have more realistic integration testing." I agree. I'm looking for ideas about how to test the scenario of having no test dependencies installed.

That's my whole tale. Ideas are welcome.

Update: the story continues, but fair warning: metaclasses ahead!

Evil ninja module initialization

Tuesday 10 January 2017

A question about import styles on the Python-Dev mailing list asked about imports like this:

import os as _os

Understanding why people do this is an interesting lesson in how modules work. A module is nothing more than a collection of names. When you define a name in a .py file, it becomes an attribute of the module, and is then importable from the module.

An underlying simplicity in Python is that many statements are really just assignment statements in disguise. All of these define the name X:

X = 17
def X(): print("look!")
import X

When you create a module, you can make the name "X" importable from that module by assigning to it, or defining it as a function. You can also make it importable by importing it yourself.

Suppose your module looks like this:

# yourmodule.py
import os

def doit():
    os.something_or_other()

This module has two names defined in it: "doit", and "os". Someone else can now do this:

# someone.py
from yourmodule import os

# or worse, this imports os and doit:
from yourmodule import *

This bothers some people. "os" is not part of the actual interface of yourmodule. That first import I showed prevents this leaking of your imports into your interface. Importing star doesn't pull in names starting with underscores. (Another solution is to define __all__ in your module.)

Most people though, don't worry about this kind of name leaking. Import-star is discouraged anyway, and people know not to import os from other modules. The solution of renaming os to _os just makes your code ugly for little benefit.

The part of the discussion thread that really caught my eye was Daniel Holth's winking suggestion of the "evil ninja mode pattern" of module initialization:

def ninja():
    global exported
    import os
    def exported():
        os.do_something()

ninja()
del ninja

What's going on here!? Remember that def is an assignment statement like any other. When used inside a function, it defines a local name, as assignment always does. But an assignment in a function can define a global name if the name is declared as global. It's a little unusual to see a global statement without an explicit assignment at the top-level, but it works just fine. The def statement defines a global "exported" function, because the global statement told it to. "os" is now a local in our function, because again, the import statement is just another form of assignment.

So we define ninja(), and then execute it immediately. This defines the global "exported", and doesn't define a global "os". The only problem is the name "ninja" has been defined, which we can clean up with a del statement.
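Running the pattern makes the effect concrete: after the dust settles, "exported" is the only name the module gained, while "os" survives only inside the closure:

```python
def ninja():
    global exported
    import os
    def exported():
        # "os" here is the local from ninja(), kept alive by the closure.
        return os.sep

ninja()
del ninja

print('exported' in globals())   # True
print('os' in globals())         # False
print('ninja' in globals())      # False
print(exported())                # still works: the closure holds "os"
```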

Please don't ever write code this way. It's a kind of over-defensiveness that isn't needed in typical Python code. But understanding what it does, and why it does it, is a good way to flex your understanding of Python workings.

For more about how names (and values) work in Python, people seem to like my PyCon talk, Python Names and Values.

No PyCon for me this year

Thursday 5 January 2017

2017 will be different for me in one specific way: I won't be attending PyCon. I've been to ten in a row:

Ten consecutive PyCon badges

This year, Open edX Con is in Madrid two days after PyCon, actually overlapping with the sprints. I'm not a good enough traveler to do both. Crossing nine timezones is not something to be taken lightly.

I'll miss the usual love-fest at PyCon, but after ten in a row, it should be OK to miss one. I can say that now, but probably in May I will feel like I am missing the party. Maybe I really will watch talks on video for a change.

I usually would be working on a presentation to give. I like making presentations, but it is a lot of work. This spring I'll have that time back.

In any case, this will be a new way to experience the Python community. See you all in 2018 in Cleveland!
