TDD with pytest

What is TDD

Test Driven Development (TDD) is a design methodology. I have found it quite easy to think that it's all about testing, but it isn't - it's about design.

When using TDD, we write a test before implementing code, but that test is a test for some behaviour of the system, which is an expression of the design. The difference between behaviour (design) and implementation is that many implementations can satisfy the same behaviour. The test ensures that the system behaves the way it is intended, regardless of implementation.

The TDD process is often given in terms of 'Red', 'Green', and 'Refactor'. In the 'Red' phase, we write a test for some behaviour. This test will fail, because the behaviour is not yet implemented - hence 'Red'. We then implement the behaviour with the minimal code required to make the test pass, which takes us to the 'Green' phase. At this point, we refactor the implementation: the 'Refactor' phase. During refactoring, we might find ourselves making our code more generic, introducing parent classes, abstract classes, and interfaces.

Why is TDD worth practicing?

In my opinion, the following are the main benefits of TDD:

  • Requirements Clarity: you need to have clear requirements in order to understand what to test.
  • Interface Focussed: writing the test for the behaviour before the implementation of the behaviour pushes you to think about the public interface of the code which implements the behaviour.
  • Safe Refactoring: provided you have tested for the behaviour (the public interface) you want, you can freely change the implementation, safe in the knowledge that if you accidentally change the behaviour, tests will fail and notify you.

A quick example

Here is a quick example of a simple TDD kata, the FizzBuzz kata:

#!/usr/bin/env python

import pytest

def multipleOf(multiple, factor):
    return (multiple % factor) == 0

def fizzbuzz(multiple):
    if multipleOf(multiple, 3) and multipleOf(multiple, 5):
        return 'FizzBuzz'
    if multipleOf(multiple, 3):
        return 'Fizz'
    if multipleOf(multiple, 5):
        return 'Buzz'
    return str(multiple)

def test__assertTrue():
    assert True  # lol

def test__returnFizzOnMultipleOfThree():
    assert fizzbuzz(6) == 'Fizz'

def test__returnBuzzOnMultipleOfFive():
    assert fizzbuzz(10) == 'Buzz'

def test__returnFizzBuzzOnMultipleOfThreeAndFive():
    assert fizzbuzz(15) == 'FizzBuzz'

Using pytest

pytest is a simple but effective Python unit testing framework.

assert

pytest uses the assert statement to validate conditions in unit tests.

According to the Python docs:

Assert statements are a convenient way to insert debugging assertions into a program … The simple form, assert expression, is equivalent to: if not expression: raise AssertionError … The extended form, assert expression1, expression2, is equivalent to: if not expression1: raise AssertionError(expression2)

Essentially, this statement allows us to raise an error if a condition is not fulfilled. Sounds perfect for unit tests, no?

Test structure

Test function and method names must be prefixed with test, though I quite like to follow the prefix with a double underscore, as I think it improves readability:

import pytest

def test__my_function():
    assert True #  lol

Class names of classes containing test methods should be prefixed with Test, and have no __init__ method:

import pytest

class TestClass:

    #  No __init__!

    def test__a_method(self):
        assert True  # rofl

While an IDE test runner can usually be configured to run tests with any naming convention, pytest, at least on the command-line, will only pick up your test files automatically if they are named test_*.py (or *_test.py). If your files don't follow that convention, you can still manually point pytest at them with pytest my_test_file.py.

pytest allows you to put your tests anywhere, but it's always a good idea to group them logically into classes and modules, in a structure with some relation to your source code.

Setup and Teardown

Setup and teardown methods allow you to create certain conditions for your tests, and remove them afterwards. pytest has setup and teardown functionality for modules, classes, methods, and functions, named by prefixing the construct type with setup_ or teardown_:

  • Module -> setup_module(module):
  • Class -> setup_class(cls):
  • etc…

Each setup_x method is called before, and each teardown_x after, the matching type of construct, in roughly LIFO order: setup_module(module): is called first, before any other setup_ call, and teardown_module(module): is called last, after all other teardown_ calls. Whether the setup and teardown for a function or class is called next depends on the order in which the tests appear in the file. If a Test class is present, setup_class(cls): is called, then setup_method(self, method): before each test method.

When using Test classes, the setup_class(cls): and teardown_class(cls): methods are class methods (funny, that), so they must be decorated with @classmethod, and receive cls, not self.

Let's see what this looks like in practice:

#!/usr/bin/env python
import pytest

def setup_module(module):
    print('\nSetting up module')

def teardown_module(module):
    print('\nTearing down module')

class TestClass:

    @classmethod
    def setup_class(cls):
        print('\nSetting up class')

    @classmethod
    def teardown_class(cls):
        print('\nTearing down class')

    def setup_method(self, method):
        print('\nSetting up method')

    def teardown_method(self, method):
        print('\nTearing down method')

    def test__my_method(self):
        print('\n**Executing my_method')
        assert True

def setup_function(function):
    print('\nSetting up function')

def teardown_function(function):
    print('\nTearing down function')

def test__my_function():
    print('\n**Executing my_function')
    assert True

Test Fixtures

Test fixtures allow you to further customise the environment in which your tests run. They are usually functions which provide some functionality or data to your test - to emulate the behaviour of a dependency, for example.

pytest provides a decorator pytest.fixture which is used to decorate fixture functions. pytest is very flexible, and allows specification of which fixtures are to be run in conjunction with which tests. pytest provides an autouse parameter, which tells pytest to execute that fixture for every test.

Let's start building an example:

import pytest
from pytest import fixture

@fixture()
def fixture_function():
    print('\nI help test methods do their jobs!')


def test__something():
    print('\nTesting something...')
    assert True

def test__something_else():
    print('\nTesting something ELSE')
    assert True

Running that with pytest -v -s (-s tells pytest to pass print() output through to stdout) will only print the strings in the test__ functions, because even though we have declared the fixture with @fixture(), we have not yet integrated fixture_function() with the test functions.

There are two ways to integrate the fixture with our test__ functions:

  1. Supply the fixture as a parameter to the test__ function
  2. Use the @mark.usefixtures('fixturename') decorator

import pytest
from pytest import fixture, mark

@fixture()
def fixture_function():
    print('\nI help test methods do their jobs!')


def test__something(fixture_function):
    print('\nTesting something...')
    assert True

@mark.usefixtures('fixture_function')
def test__something_else():
    print('\nTesting something ELSE')
    assert True

Now running that with the same pytest arguments as before should print the fixture_function message before each of the test__ function messages.

Now let's add another fixture, which is called for every test__ function, using the autouse parameter to @fixture():

import pytest
from pytest import fixture, mark

@fixture()
def fixture_function():
    print('\nI help SOME methods do their jobs!')

@fixture(autouse=True)
def used_automatically():
    print('\nI help ALL test methods do their jobs!')


def test__something():
    print('\nTesting something...')
    assert True

@mark.usefixtures('fixture_function')
def test__something_else():
    print('\nTesting something ELSE')
    assert True

Running this, you should see that the message from fixture_function() is only displayed for test__something_else(), while the message from used_automatically() displays for both tests.

pytest fixtures can also clean up after themselves when they leave scope. This can be done in one of two ways:

  1. The yield keyword
  2. The request.addfinalizer method

To use the yield keyword, place it in your test fixture in place of a return statement. Whatever comes after the yield will be executed after the test has completed.

The request.addfinalizer method is more complex, but more flexible. It allows the test method to specify any number of functions to be called after the test.

Let's build those into the example:

import pytest
from pytest import fixture, mark

@fixture()
def fixture_function():
    print('\nI help SOME methods do their jobs!')
    yield
    print('\nI clean up using YIELD')


def finalize():
    print('\nI clean up using REQUEST.ADDFINALIZER')


@fixture(autouse=True)
def used_automatically(request):
    print('\nI help ALL test methods do their jobs!')
    request.addfinalizer(finalize)


def test__something():
    print('\nTesting something...')
    assert True

@mark.usefixtures('fixture_function')
def test__something_else():
    print('\nTesting something ELSE')
    assert True

Here, our fixture_function uses the yield keyword to emit its second print() output after the test__something_else test function.

We have also defined another function, finalize, which request.addfinalizer registers for both test__something and test__something_else automatically, because we are still using the @fixture(autouse=True) decorator on used_automatically. So you should see the output of finalize's print() after both tests have completed.

  1. Fixture Scope

    Fixture scope is (surprise surprise), the scope of the fixture - the allowed domain of its operation. There are four fixture scopes:

    1. Function - run once per test function
    2. Class - run once per test class
    3. Module - run once per module
    4. Session - run once per pytest session

    Note that the scopes are not the same as the four types of setup/teardown method.

    Now let's work these into our example:

    import pytest
    from pytest import fixture, mark
    
    def finalize():
        print('\nI clean up using REQUEST.ADDFINALIZER')
    
    
    @fixture(scope='function', autouse=True)
    def fixture_function():
        print('\nI help FUNCTIONS do their jobs!')
        yield
        print('\nI clean up using YIELD')
    
    
    @fixture(scope='class', autouse=True)
    def fixture_class():
        print('\nI help CLASSES do their jobs!')
        yield
        print('\nI clean up using YIELD')
    
    
    @fixture(scope='module', autouse=True)
    def fixture_module(request):
        print('\nI help MODULES do their jobs!')
        request.addfinalizer(finalize)
    
    
    @fixture(scope='session', autouse=True)
    def fixture_session(request):
        print('\nI help SESSIONS do their jobs!')
        request.addfinalizer(finalize)
    
    
    class TestClass:

        def test__something(self):
            print('\nTesting something...')
            assert True

        def test__something_else(self):
            print('\nTesting something ELSE')
            assert True
    

    Several things have changed here:

    1. All test methods are inside a Test class - function scope applies equally to plain functions and methods, and the class gives the class-scoped fixture something to attach to.
    2. All the fixtures use the autouse=True parameter - this is to make the example a bit simpler, so the code can be more easily correlated with pytest output.
    3. The session and module scoped fixtures use request.addfinalizer, and function and class scoped fixtures use yield. This is to show the difference.

    If you study the pytest output, you should see that the session fixture is called first, then the module fixture, then the class fixture. The function fixture is then called once per test function, and its cleanup is run after each test function. Then, the class cleanup is called, then the module cleanup, then the session cleanup. Somewhat like an onion.

  2. Data Providers

    Test fixtures can operate as data providers for test methods. This means that the fixture can supply arguments to the method, to vary the execution path in the test method.

    pytest.fixture can be supplied with a params parameter, each of whose items can be used in the test method the fixture is applied to:

    import pytest
    from pytest import fixture
    
    @fixture(params=[1, 2, 3])
    def my_fixture(request):
        return request.param
    
    
    def test__a_test(my_fixture):
        print('\nTesting something with param {}'.format(my_fixture))
    
    

    You should see that for each parameter defined in the @fixture decorator, the test method output is modified by that parameter. You can do more complex things too:

    import pytest
    from pytest import fixture

    @fixture(params=[str.lower, str.upper, str.capitalize])
    def my_fixture(request):
        return request.param
    
    
    def test__a_test(my_fixture):
        print(my_fixture('\nTesting something'))
    
    

    In this case, each function in the params list is applied to the test output.

    There is another, more complex and flexible way to supply arguments to test methods: the mark.parametrize() decorator. Here is a quick example from the docs to show how it works (note that the third case intentionally fails, since eval("6*9") is 54):

    import pytest
    from pytest import mark
    
    
    @mark.parametrize("test_input,expected", [("3+5", 8), ("2+4", 6), ("6*9", 42)])
    def test_eval(test_input, expected):
        assert eval(test_input) == expected
    

    I prefer to use the @fixture(params=[]) decorator to create specific conditions for a test method by manipulating the params before making them available to the test method, and the mark.parametrize decorator to specify the possible cases a generic function might need to handle.
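    As a sketch of that split - Basket here is a hypothetical class invented for this illustration, not part of pytest - the fixture massages its raw params into richer objects before the test sees them, while mark.parametrize enumerates the plain input/expected pairs a generic function must handle:

```python
import pytest
from pytest import fixture, mark

class Basket:
    # Hypothetical container, used only for this illustration.
    def __init__(self, prices):
        self.prices = prices

    def total(self):
        return sum(self.prices)

@fixture(params=[[1.0], [1.0, 2.0], [1.0, 2.0, 3.0]])
def basket(request):
    # The raw param (a list of prices) is shaped into a Basket
    # before being handed to the test.
    return Basket(request.param)

def test__total_is_positive(basket):
    assert basket.total() > 0

# parametrize enumerates the plain cases directly.
@mark.parametrize('prices, expected', [([1.0], 1.0), ([1.0, 2.0], 3.0)])
def test__total(prices, expected):
    assert Basket(prices).total() == expected
```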

  3. Useful tidbits

    Comparing float values can be difficult, as they are stored internally as binary fractions. This means that 0.1 + 0.2 doesn't have the answer you might expect:

    print((0.1+0.2) == 0.3) #  False!
    print(0.1+0.2) #  0.30000000000000004
    

    You can test floating point arithmetic using pytest's approx function:

    import pytest
    from pytest import approx
    
    def test__float_naive():
        assert (0.1 + 0.2) == 0.3  # Fail

    def test__float_approx():
        assert (0.1 + 0.2) == approx(0.3)  # pass
    
    

    Sometimes you want to make sure a particular exception is thrown. pytest provides a context manager for that case: raises(). Let's work it into the previous example:

    import pytest
    from pytest import approx, raises
    
    def test__float_naive():
        with raises(AssertionError):
            assert (0.1 + 0.2) == 0.3  # pass!

    def test__float_approx():
        assert (0.1 + 0.2) == approx(0.3)  # pass
    

    If you're running pytest from the command-line, there are some useful arguments to know:

    pytest
           -v # Verbose mode
           -q # Quiet mode (less verbose output)
           -s # Pass test output to stdout, instead of capturing it
           --ignore=PATH # Ignore specified PATH when discovering tests
           --maxfail=NUMBER # Stop after NUMBER failed tests
    

A proper worked example

In this example, I'll work through modelling a supermarket checkout, with a Checkout, Items, prices, price calculation, and discounts.

First, the source code, though of course, it was written according to Red->Green->Refactor:

#!/usr/bin/env python

from typing import List, Dict

class Discount:

    def __init__(self, number: int, factor: float) -> None:
        self.number = number
        self.factor = factor


DiscountTable = Dict[str, Discount]


class Item:

    def __init__(self, price: float, name: str) -> None:
        self.price = price
        self.name = name


Basket = List[Item]


class Checkout:

    def __init__(self, discount_table: DiscountTable, basket: Basket) -> None:
        self.discount_table = discount_table
        self.basket = basket

    def _apply_discounts(self) -> None:
        # Items with no discount entry pass through unchanged
        discounted_basket = [
            item for item in self.basket
            if item.name not in self.discount_table
        ]
        for product_name in self.discount_table:
            product_discount = self.discount_table[product_name]
            items = [item for item in self.basket if item.name == product_name]
            if len(items) >= product_discount.number:
                for item in items:
                    discounted_basket.append(
                        Item(product_discount.factor * item.price, product_name)
                    )
            else:
                discounted_basket.extend(items)
        self.basket = discounted_basket

    def get_total(self, eligible_for_discount: bool) -> float:
        if eligible_for_discount:
            self._apply_discounts()

        return sum(item.price for item in self.basket)

And the tests:

#!/usr/bin/env python
from typing import Dict

from pytest import mark

from .Checkout import Checkout, Item, Discount, DiscountTable

# Imagine I am a properly implemented product-discount subsystem..
discount_table = {
    'bread': Discount(3, 0.5),
    'milk': Discount(2, 0.25),
    'cheese': Discount(5, 0.6)
}


def test__create_discount():
    discount = Discount(2, 0.5)

    assert isinstance(discount, Discount)
    assert discount.number == 2
    assert discount.factor == 0.5


def test__create_item():
    item = Item(5.0, 'Ham')

    assert isinstance(item, Item)
    assert item.price == 5.0
    assert item.name == 'Ham'


class TestCheckout:

    @mark.parametrize(
        'basket, num_items_in_basket',
        [
            [[Item(2.0, 'bread'), Item(2.0, 'bread')], 2],
            [[Item(2.0, 'bread'), Item(2.0, 'bread'), Item(2.0, 'bread')], 3]
        ]
    )
    def test__createCheckout(self, basket, num_items_in_basket):
        checkout = Checkout(discount_table, basket)

        assert isinstance(checkout, Checkout)
        # DiscountTable is a typing alias, so the isinstance check uses plain dict
        assert isinstance(checkout.discount_table, dict)
        assert len(checkout.basket) == num_items_in_basket

    @mark.parametrize(
        'basket, total, eligible_for_discount',
        [
            [[Item(2.0, 'bread'), Item(2.0, 'bread')], 4, True],
            [[Item(2.0, 'bread'), Item(2.0, 'bread'), Item(2.0, 'bread')], 3, True],
            [[Item(2.0, 'bread'), Item(2.0, 'bread'), Item(2.0, 'bread')], 6, False],
            [[Item(2.0, 'milk'), Item(5.0, 'bread'), Item(3.0, 'cheese')], 10, False],
            [[Item(2.0, 'milk'), Item(2.0, 'milk'), Item(2.0, 'milk')], 6, False],
            [[Item(2.0, 'milk'), Item(2.0, 'milk'), Item(2.0, 'milk')], 1.5, True]
        ]
    )
    def test__get_total(self, basket, total, eligible_for_discount):
        checkout = Checkout(discount_table, basket)
        checkout_total = checkout.get_total(eligible_for_discount)

        assert total == checkout_total

I could probably have made more use of the @fixture decorator, to dry up some of those checkout = Checkout(...) lines.
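For example, a factory-style fixture could supply the Checkout. The classes below are trimmed stand-ins so the sketch is self-contained; in the real test module, Checkout, Item, and discount_table are already in scope:

```python
from pytest import fixture

# Trimmed stand-ins, defined here only so this sketch runs on its own;
# the real test module imports Checkout and Item and defines discount_table.
class Item:
    def __init__(self, price, name):
        self.price = price
        self.name = name

class Checkout:
    def __init__(self, discount_table, basket):
        self.discount_table = discount_table
        self.basket = basket

    def get_total(self, eligible_for_discount):
        return sum(item.price for item in self.basket)

discount_table = {}

@fixture()
def make_checkout():
    # Factory fixture: each test asks it for a Checkout, so the
    # 'checkout = Checkout(discount_table, basket)' line is written once.
    def _make(basket):
        return Checkout(discount_table, basket)
    return _make

def test__get_total(make_checkout):
    checkout = make_checkout([Item(2.0, 'bread'), Item(2.0, 'bread')])
    assert checkout.get_total(eligible_for_discount=False) == 4.0
```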

Mock Objects

Mock objects go by various names: dummy, stub, mock, etc. These are sometimes collectively called test doubles, as their purpose is to mimic the dependencies of the code under test - avoiding having to initialise the entire stack, and letting us control the environment of the test so as to systematically verify its behaviour.

Python's standard unittest library provides the extremely useful MagicMock class. MagicMock (and its parent, Mock) accepts various keyword arguments to its constructor, which control aspects of its behaviour:

  • spec=SomeClass: specifies that the mock implements SomeClass
  • side_effect=some_func: allows some_func to be called when the mock is called
  • return_value='some value': allows some value to be returned when the mock is called
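Here is a quick sketch of those three keyword arguments in action (Logger is a made-up class, purely to give spec= something to mimic):

```python
from unittest.mock import MagicMock

class Logger:
    # Made-up class for illustration only.
    def log(self, message):
        return message

# spec: the mock mimics Logger, so isinstance checks pass.
mock = MagicMock(spec=Logger)
assert isinstance(mock, Logger)

# return_value: calling the mock returns the given value.
mock.log = MagicMock(return_value='MOCKED')
assert mock.log('anything') == 'MOCKED'

# side_effect: the given function runs when the mock is called,
# and its result is returned.
shout = MagicMock(side_effect=lambda s: s.upper())
assert shout('quiet') == 'QUIET'
```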

pytest also provides the monkeypatch fixture, which allows us to dynamically change the functionality of code.

The following shows how to use MagicMock and monkeypatch:

#!/usr/bin/env python
from typing import Any
from unittest.mock import MagicMock


class ApiLogger:

    def __init__(self, logger: Any) -> None:
        self.logger = logger

    def log(self, message: str) -> str:
        return self.logger.log(message)


def test__create_logger():
    logger = ApiLogger(MagicMock())

    assert isinstance(logger, ApiLogger)

def test__log(monkeypatch):

    mock_logger = MagicMock()
    mock_logger.log = MagicMock(return_value='MOCKED')

    api_logger = ApiLogger(mock_logger)
    assert api_logger.log('log message') == 'MOCKED'

    def get_foo(self, some_arg) -> str:
        return 'foo'

    api_logger = ApiLogger(mock_logger)
    monkeypatch.setattr(ApiLogger, 'log', get_foo)
    assert api_logger.log('anything') == 'foo'

Best Practices

Always implement the simplest test cases possible. This forces you to implement simple code to satisfy each case, and ensures that complexity increases incrementally, with each step having been thought through in advance. If your test cases are very complex, that is a sign that the design is not sufficiently decoupled: complex test cases are indicative of complex interfaces, and complex interfaces often mean insufficiently decoupled code. Subjecting the code to some interface segregation usually helps with this.

Last Updated: 7/19/2020, 8:26:48 PM