Pytest’s parametrize can be made even more powerful with your own helper functions to build test cases.
Pytest’s parametrize is a great feature for writing tests without repeating yourself needlessly. (If you haven’t seen it before, read Starting with pytest’s parametrize first). When the data gets complex, it can help to use functions to build the data parameters.
I’ve been working on a project involving multi-line data, and the parameterized test data was getting awkward to create and maintain. I created helper functions to make it nicer. The actual project is a bit gnarly, so I’ll use a simpler example to demonstrate.
Here’s a function that takes a multi-line string and returns two numbers, the lengths of the shortest and longest non-blank lines:
def non_blanks(text: str) -> tuple[int, int]:
"""Stats of non-blank lines: shortest and longest lengths."""
lengths = [len(ln) for ln in text.splitlines() if ln]
return min(lengths), max(lengths)
We can test it with a simple parameterized test with two test cases:
import pytest
from non_blanks import non_blanks
@pytest.mark.parametrize(
"text, short, long",
[
("abcde\na\nabc\n", 1, 5),
("""\
A long line
The next line is blank:
Short.
Much much longer line, more than anyone thought.
""", 6, 48),
]
)
def test_non_blanks(text, short, long):
assert non_blanks(text) == (short, long)
I really dislike how the multi-line string breaks the indentation flow, so I wrap strings like that in textwrap.dedent:
@pytest.mark.parametrize(
"text, short, long",
[
("abcde\na\nabc\n", 1, 5),
(textwrap.dedent("""\
A long line
The next line is blank:
Short.
Much much longer line, more than anyone thought.
"""),
6, 48),
]
)
(For brevity, this and following examples only show the parametrize decorator, the test function itself stays the same.)
This looks nicer, but I have to remember to use dedent, which adds a little bit of visual clutter. I also need to remember that first backslash so that the string won’t start with a newline.
As the test data gets more elaborate, I might not want to have it all inline in the decorator. I’d like to have some of the large data in its own file:
@pytest.mark.parametrize(
"text, short, long",
[
("abcde\na\nabc\n", 1, 5),
(textwrap.dedent("""\
A long line
The next line is blank:
Short.
Much much longer line, more than anyone thought.
"""),
6, 48),
(Path("gettysburg.txt").read_text(), 18, 80),
]
)
Now things are getting complicated. Here’s where a function can help us. Each test case needs a string and three numbers. The string is sometimes provided explicitly, sometimes read from a file.
We can use a function to create the correct data for each case from its most convenient form. We’ll take a string and use it as either a file name or literal data. We’ll deal with the initial newline, and dedent the multi-line strings:
def nb_case(text, short, long):
"""Create data for test_non_blanks."""
if "\n" in text:
# Multi-line string: it's actual data.
if text[0] == "\n": # Remove a first newline
text = text[1:]
text = textwrap.dedent(text)
else:
# One-line string: it's a file name.
text = Path(text).read_text()
return (text, short, long)
Now the test data is more direct:
@pytest.mark.parametrize(
"text, short, long",
[
nb_case("abcde\na\nabc\n", 1, 5),
nb_case("""
A long line
The next line is blank:
Short.
Much much longer line, more than anyone thought.
""",
6, 48),
nb_case("gettysburg.txt", 18, 80),
]
)
One nice thing about parameterized tests is that pytest creates a distinct id for each one. The helps with reporting failures and with selecting tests to run. But the id is made from the test data. Here, our last test case has an id using the entire Gettysburg Address, over 1500 characters. It was very short for a speech, but it’s very long for an id!
This is what the pytest output looks like with our current ids:
test_non_blank.py::test_non_blanks[abcde\na\nabc\n-1-5] PASSED
test_non_blank.py::test_non_blanks[A long line\nThe next line is blank:\n\nShort.\nMuch much longer line, more than anyone thought.\n-6-48] PASSED
test_non_blank.py::test_non_blanks[Four score and seven years ago our fathers brought forth on this continent, a\nnew nation, conceived in Liberty, and dedicated to the proposition that all men\nare created equal.\n\nNow we are engaged in a great civil war, testing whether that nation, or any\nnation so conceived and so dedicated, can long endure. We are met on a great\nbattle-field of that war. We have come to dedicate a portion of that field, as a\nfinal resting place for those who here gave their lives that that nation might\nlive. It is altogether fitting and proper that we should do this.\n\nBut, in a larger sense, we can not dedicate \u2013 we can not consecrate we can not\nhallow \u2013 this ground. The brave men, living and dead, who struggled here, have\nconsecrated it far above our poor power to add or detract. The world will little\nnote, nor long remember what we say here, but it can never forget what they did\nhere. It is for us the living, rather, to be dedicated here to the unfinished\nwork which they who fought here have thus far so nobly advanced. It is rather\nfor us to be here dedicated to the great task remaining before us that from\nthese honored dead we take increased devotion to that cause for which they gave\nthe last full measure of devotion \u2013 that we here highly resolve that these dead\nshall not have died in vain that this nation, under God, shall have a new birth\nof freedom \u2013 and that government of the people, by the people, for the people,\nshall not perish from the earth.\n-18-80] PASSED
Even that first shortest test has an awkward and hard to use test name.
For more control over the test data, instead of creating tuples to use as test cases, you can use pytest.param to create the internal parameters object that pytest needs. Each of these can have an explicit id assigned. Pytest will still assign an id if you don’t provide one.
Here’s an updated nb_case() function using pytest.param:
def nb_case(text, short, long, id=None):
if "\n" in text:
# Multi-line string: it's actual data.
if text[0] == "\n": # Remove a first newline
text = text[1:]
text = textwrap.dedent(text)
else:
# One-line string: it's a file name.
id = id or text
text = Path(text).read_text()
return pytest.param(text, short, long, id=id)
Now we can provide ids for test cases. The ones reading from a file will use the file name as the id:
@pytest.mark.parametrize(
"text, short, long",
[
nb_case("abcde\na\nabc\n", 1, 5, id="little"),
nb_case("""
A long line
The next line is blank:
Short.
Much much longer line, more than anyone thought.
""",
6, 48, id="four"),
nb_case("gettysburg.txt", 18, 80),
]
)
Now our tests have useful ids:
test_non_blank.py::test_non_blanks[little] PASSED
test_non_blank.py::test_non_blanks[four] PASSED
test_non_blank.py::test_non_blanks[gettysburg.txt] PASSED
The exact details of my case() function aren’t important here. Your
tests will need different helpers, and you might make different decisions about
what to do for these tests. But a function like this lets you write your complex
test cases in the way you like best to make your tests as concise, expressive
and readable as you want.
Comments
Add a comment: