Pylint

Saturday 14 June 2008This is over 16 years old. Be careful.

Being a developer, I’m a sucker for rules to follow to improve my code, and for tools that can help me to follow them. Being a Python developer, I don’t have a static type checking compiler to help me. Pylint aims to fill some of those gaps.

It examines your Python source code and reports on all sorts of things it doesn’t like. Like most tools of this sort (its name is a reference to the classic lint tool for C), it can be annoyingly picky. Since its job is to flag things that might be a problem, it errs on the alarmist noisy side.

Pylint tries to apply light type checking to methods and variables, so it will complain about constructs simply because they interfere with that goal:

W0142: 26:MyClass.my_method: Used * or ** magic

Excuse me, but ** is not magic, it’s a powerful language feature. Reading pylint’s warnings, you get the feeling that it won’t be completely happy until you are coding within the intersection of Python and Java.

But pylint’s best feature is its configurability. In the settings file, you can disable individual messages:

# Disable the message(s) with the given id(s).
disable-msg=C0103,R0903,W0142,C0324,R0201,C0111,W0232,W0702,W0703,W0201

and also configure all sorts of other settings. This is important because pylint also natters on about style issues: valid names, line length, number of statements per method, and so on. Pylint also lets you disable a particular message in specific files, classes, or methods, which is extremely useful for overriding warnings about tricky cases, or simple misdiagnoses.

As with every other lint-like tool I’ve ever used, the first order of business is deciding what you really want pylint to tell you about. Initially, its reporting will be about things that just don’t matter, and you’ll disable a ton of messages. Then you’ll get to the things that you’ll agree are minor issues and you’ll want to clean up, like unused imports.

The next rung of messages are helpful because they get you to think about the way you’ve written your code. For example, this code:

def my_method(self, arg1, extras=[]):
    // blah blah...

will get you this warning:

W0102:247:MyClass.my_method: Dangerous default value [] as argument

Pylint warns about this because you could append to extras in the body of the method, and that would modify the single list object that is used as the default value for every invocation of the method, something you almost certainly didn’t intend. Changing the code to this avoids the possibility and the warning:

def my_method(self, arg1, extras=None):
    extras = extras or []
    // blah blah...

Whether you want to adopt this idiom uniformly, or stick with the more common extras=[] is something you’ll have to decide. Pylint did you the favor of bringing it to your attention so that you could think through the issue and decide. In this case, you may be able to simply leave extras as None and use it as is in the body of the function, but you get the point.

Occasionally, you’ll get unambiguous value from the pylint output. I ran pylint on a large actively developed code base, and it reported on an instance of an undefined variable. I looked, and sure enough, that code shouldn’t work. Digging further, I looked at who called that code, and once I was done pulling on all of the threads I discovered, I had a couple hundred lines of code that were not used any more, and I could delete them.

I don’t know whether I’ll stick with pylint. It’s a tricky balance to get it set properly so that it warns about things I genuinely believe to be issues.

The other minor downside to pylint is that you have to install three separate packages to get it to work. Logilab would do well to provide a single installer for pylint and its dependencies.

BTW: There are other tools for static checking Python code, but I haven’t used them recently: PyFlakes and PyChecker.

Comments

[gravatar]
At Resolver Systems we have PyLint in our pre-build check. (So Resolver One will refuse to build if the PyLint step fails.)

We only use it for a specific set of warnings that we find very useful:

Unused imports
Shadowing builtins
Unused variables or arguments
Redefined functions or methods
Undefined names

We have builtin a few specific exceptions (occasionally redefining functions is useful for example), but the number of unused imports and variables we can pull out of our code alone makes PyLint valuable.

PyFlakes is faster though, so some of our developers have built PyFlakes integration into their IDEs. It doesn't do everything that PyLint does, but it does warn about some of the more common problems like unused and undefined names.
[gravatar]
There's a likely problem in your "fix" for a my_method which appends to an argument list if given (else to an empty list): since you've fixed it to start with "extras = extras or []", if the caller explicitly passes an empty list, that argument will NOT be used for the appends -- rather, a different empty list will.
So, if the caller passes a non-empty list, when the method is finished the caller will see stuff appended to the list it passed; but when it passes an empty list, it will never see anything appended to it. It's *extremely* unlikely that such a weirdly discriminating behavior is actually desired as the method's specification!
Most likely, the desired fix is a very different first statement for the method: "if extras is None: extras = []". This way, if the caller explicitly passes in any list as the extras argument, the method will use it and the caller will see the method's effects in terms of appends to that list (whether the passed-in list is empty or not at the time it's passed); the method will use a new "temporary" list only when the caller doesn't pass any.
[gravatar]
Even code for static languages can be helped with code analysis tools. In Java we used to use PMD and Checkstyle. For C# code we use FxCop and StyleCop. There's some overlap in what they report but they work at different levels. FxCop looks at compiled code, StyleCop looks at source.

In order for these tools to be effective, it's worth spending the time upfront to decide which rules you're going to follow -- and then make sure that they're actually followed. On a large Java project I worked on a few years ago the decision to use PMD came very late in the cycle and the chosen ruleset reported thousands of violations in the code. It was too late and too disruptive to fix any but the most egregious problems.
[gravatar]
@Alex: an excellent point! None of the cases I saw like this in my code were modifying the list for the caller to see the result. I was looking for cases where internal bookkeeping would modify the list, which would also be bad, and where my fix would have been OK.

Your change is definitely the safest.
[gravatar]
> But pylint's best feature is its configurability. In the settings file, you can [...] and also configure all sorts of other settings.

One trouble I had with pylint was figuring out how to drive it.

Where does pylint look for the settings file? What is its name?

Where can one find good documentation on the "all sorts of settings" that one can configure?
[gravatar]
Ahem. It seems all I needed was motivation to look in /usr/share/doc/pylint/ to find the existing, very comprehensive documentation, which answers all my questions above.

In particular, the man page explains where pylint looks for its configuration files, and the 'features.txt' document is a complete coverage of features, settings, and messages.

Thanks for this article, Ned! :-)
[gravatar]
Jürgen Hermann 4:06 AM on 26 Jun 2008
In case anyone finds this useful, I use this Paver task to drive pylint:

{use "view source" to get proper indents}
@task
@needs(["build_init"])
@cmdopts([
('output=', 'o', 'Create report file (*.html, *.log, or *.txt) [stdout]'),
('rcfile=', 'r', 'Configuration file [None]'),
('msg-only', 'm', 'Only generate messages (no reports)'),
])
def lint():
"""Report pylint results."""
import sys
from pylint import lint as linter

# report according to file extension
reporters = {
".html": linter.HTMLReporter,
".log": linter.ParseableTextReporter,
".txt": linter.TextReporter,
}

rcfile = options.get("rcfile", None)
if rcfile:
argv = ["--rcfile", rcfile]
else:
rcfile = options.build_dir / "emptyfile"
argv = [
"--rcfile", rcfile,
"-iy",
"--comment=y",
"--max-line-length", "132",
"--disable-msg=W0142,I0011",
"--deprecated-modules=regsub,TERMIOS,Bastion,rexec",
] + ["--%s=[a-z_][A-Za-z0-9_]{2,50}" % o for o in (
"function-rgx",
"method-rgx",
"attr-rgx",
"argument-rgx",
"variable-rgx",
)]

rcfile_name = options.get("output", options.build_dir / "pylint") + ".rc"
sys.stderr.write("Generating rcfile %r... " % str(rcfile_name))
rcfile_handle = open(rcfile_name, "w")
try:
sys.stdout = rcfile_handle
try:
linter.Run(argv + ["--generate-rcfile"])
except SystemExit:
sys.stderr.write("done.\n")
finally:
sys.stdout = sys.__stdout__
rcfile_handle.close()

if options.get("lint", {}).get("msg_only", False):
argv += ["-rn"]

argv += [
"--import-graph", options.lint_dir / "imports.dot",
]
argv += options.toplevel_packages

sys.stderr.write("Running %s::pylint %s\n" % (sys.argv[0], " ".join(argv)))
outfile = options.get("output", None)
if outfile:
reporterClass = reporters.get(path(outfile).ext, linter.TextReporter)
sys.stderr.write("Writing output to %r\n" % (str(outfile),))
linter.Run(argv, reporter=reporterClass(open(outfile, "w")))
else:
linter.Run(argv)
[gravatar]
Hey.
So I can enable/disable checks locally, by including source code comments like:

# pylint: disable-msg=C0103

But I can't seem to redefine the regex for valid method names locally. A comment like:

# pylint: method-rgx=[a-z_][a-z0-9_]{2,30}$

Causes pylint to barf with:

'E0011: : Unrecognized file option 'method-rgx'

So I can only define valid method regex once, globally, in the pylintrc file?

Most of my code uses one convention. But when I subclass unittest.TestCase, I want to adhere to TestCase's usual standards, which differ slightly from the rest of my project (in allowing upper case chars in method names.)

Anyone have any clues? Thanks!
[gravatar]
Well Jonathan - a nasty solution to this is to do two runs on your project. Once on your production code and once on your tests. I wonder how easy it is to share configurations...
[gravatar]
Good idea Michael, thanks. I'll try that out tomorrow.
[gravatar]
Hey. For the record, I tried that out. As it turns out, my config differs only slightly from the pylint defaults, so my pylintrc file(s) are small, and therefore sharing configurations between more than one run is not massively critical right now.

Having said that, it isn't quite as straightforward as alluded above : My custom 'TestCase' class isn't, of course, actually a test. So the actual setup I ended up with is one config for product code, a second for tests, and a third for testutil infrastructure.
[gravatar]
You said Pylint also lets you disable a particular message in specific files...

may i know how? as i am new to python itself and pylint also
[gravatar]
You can disable Pylint in a file by adding a comment line in the file itself:

# pylint: disable-msg=E1101
[gravatar]
For:
W0102:247:MyClass.my_method: Dangerous default value [] as argument
Couldn't you switch to a tuple instead?
def my_method(self, arg1, extras: tuple = ()):
    // blah blah...

Add a comment:

Ignore this:
Leave this empty:
Name is required. Either email or web are required. Email won't be displayed and I won't spam you. Your web site won't be indexed by search engines.
Don't put anything here:
Leave this empty:
Comment text is Markdown.