How many errors?

Monday 7 September 2015 — This is nearly ten years old. Be careful.

A co-worker mentioned to me the other day that our test runner would run methods with the word “test” in the name, even if they didn’t start with “test_”. This surprised him and me, so I went looking into exactly how the function names were matched.

What I found was surprising: a single line of code with between three and five errors, depending on how you wanted to count them.

This is the line of code:

self.testMatch = re.compile(r'(?:^|[\\b_\\.%s-])[Tt]est' % os.sep)

Regexes are complicated, and it is easy to make mistakes. I’m not writing this to point fingers, or to label people as stupid. The point is that code is inherently complicated, and scrutiny and review are very very important for keeping things working.

The regex here is using an r”” string, as all regexes should. But notice there are two instances of double backslashes. The r”” string means that the regex will actually have two double backslashes. Each of them therefore means, “match a backslash.” So we have a character class (in the square brackets, also called character ranges) with backslash in it twice. That’s needless, one is enough.

But looking closer, what’s that “b” doing in there? It will actually match a “b” character. Which means that “abtest” will match this pattern, but “bctest” will not. Surely that’s a mistake.

Going back in the history of the repo, we see that the line used to be:

self.testMatch = re.compile(r'(?:^|[\b_\.%s-])[Tt]est' % os.sep)

That is, the backslashes used to be single rather than double. The doubling happened during a documentation pass: the docs needed the backslashes doubled, and I guess a misguided attempt at symmetry also doubled the backslashes in the code.

But with this older line, we can see that the intent of the backslashes was to get “\b” and “\.” into the character set. The “\.” wasn’t necessary, a dot isn’t special in a character set, so just “.” would have been fine.

What’s the “\b” for? The Python re docs say,

Matches the empty string, but only at the beginning or end of a word.

So the intent here was to force the word “test” to be at the boundary of a word. In which case, why include dot or dash in the regex? They would already define the boundary of a word.

But reading further in the re docs:

Inside a character range, \b represents the backspace character, for compatibility with Python’s string literals.

So the “\b” didn’t match word boundaries at all, it matched a backspace character. I’m guessing it never encountered a backspace character in the directory, class, and function names it was being used on.

OK, so that probably explains how the dot and dash got there: the \b wasn’t doing its job, and rather than get to the bottom of it, a developer threw in the explicit characters needed.

Let’s look at os.sep. That’s the platform-specific pathname separator: slash on Unix-like systems, backslash on Windows. String formatting is being used to insert it into the character class, so that the pathname separator will also be OK before the word “test”. (Remember: if the \b had worked, we wouldn’t need this at all.)

Look more closely at the result of that string formatting though: on Windows, the %s will be replaced with a backslash. The character class will then end with “.\-]”. In a character class, backslashes still serve their escaping function, so this is the same as “.-]”. That is, the backslash won’t be used as a literal backslash at all. On Windows, this os.sep interpolation was pointless. (Keep in mind: this problem was solved when the incorrect backslash doubling happened.)

What’s our final tally? I don’t know, depends how you count. The line still has bugs (“abtest” vs “bctest”), and a long and checkered past.

Think carefully about what you write. Review it well. Get others to help. If something doesn’t seem to be working, figure out why. Be careful out there.

Comments

Homer 12:54 AM on 8 Sep 2015

Alas, like every open-source project in the world, this bug was reported many years ago, and simply never fixed: https://github.com/nose-devs/nose/issues/11

Bug number eleven!

Thomas Guest 7:05 AM on 8 Sep 2015

Ironic that the test harness isn't itself unit tested, or if it is the tests are poor.

Thomas Krauss 8:30 AM on 8 Sep 2015

@Thomas Guest
I'm not sure I have read right what you said but I would point out that unit testing does not detect bugs. At best (i.e. when done right) it ensures that the program is doing what we want it to do.

It does not ensure that the program is only doing what we want it to do. We may counter that "the tests ought to be better than that".

But I think it is unfair and in fact unreasonable.

The realm of things we do not want our programs to do is infinite. And I've yet to see any take or even hindsight to guide ourselves in this realm. Even the question of which areas of this realm our program can get into seems difficult.

@All
I have grown super suspicious of the use of string matching of any sort (simple search, regular expressions, even parser of a given grammar) when we want to run code.

The example in this post is great but I have an even simpler one. In my previous job, a program was selecting too what pieces of code to run. On the machine, these pieces were in a directory "object/". They get chosen because the case matching against the directory name was "o*".

One day, a big problem occur. Big as in 'solve this right away'. So people went into task force mode. In less than half an hour, a temporary solution was in place.

It didn't work.

Because they had kept a unchanged copy of the initial directory and they had named it "old_object". Since that matches "o*", the buggy pieces of code still got chosen!

This is how a feature turns into a bug. My opinion is that features and bugs are very alike: both are happening. It is not that the fact we want them to happen or not is irrelevant. It is just more important that they happen because from ground zero, they should not.

When a program is blank, it does nothing. Upon growing, it acquires abilities. I think the question we should spend more time on is "Which abilities my program acquire with this code?".

A feature that is not documented is a bug in disguise. And a feature that is never used is also a bug. The more time passes, the more probable it is that nobody will remember it nor that anybody will even care!

We should test if every feature is documented and used. It is a big question. Big as in 'solve this right away'. But it won't happen anytime soon I think.

Thomas Guest 9:12 AM on 8 Sep 2015

Hi Thomas Krauss.
I was suggesting the test matcher should have unit tests and that these should have captured the desired behaviour.

Thomas Krauss 9:34 PM on 8 Sep 2015

So I read you wrong (ah! I kind of knew it.)
Thank you very much for taking the time to answer me.

How many errors?

Comments

Add a comment: