Unit Testing for Data Science in Python
Unit Testing Basics
Write a simple unit test using pytest
The test_ prefix in a file name indicates that the file contains unit tests; such files are also referred to as test modules. Unit tests themselves are Python functions whose names start with test_.
### test_row_to_list.py ###
import pytest
from data.preprocessing_helpers import row_to_list

def test_for_clean_row():
    assert row_to_list("2,081\t314,942\n") == \
        ["2,081", "314,942"]

def test_for_missing_area():
    assert row_to_list("\t293,410\n") is None

def test_for_missing_tab():
    assert row_to_list("1,463238,765\n") is None
Each test requires an assert statement to evaluate the success or failure of the test.
Tests can be run from the command line:
pytest test_row_to_list.py
Understanding the test result report
Section 1 - General information about the OS, Python version, pytest and plugin versions, and the working directory.
================== test session starts ==================
platform linux -- Python 3.6.7, pytest-4.0.1, py-1.8.0, pluggy-0.9.0
rootdir: /tmp/tmpvdblq9g7, inifile:
plugins: mock-1.10.0
Section 2 - Test results
collecting...
collected 3 items
test_row_to_list.py .F. [100%]
- 3 tests found
- Test results are indicated by single characters: . indicates a pass and F indicates a failure. Here .F. shows that the second test failed.
Section 3 - Information on failed tests
================== FAILURES ==================
--------------test_for_missing_area-----------
    def test_for_missing_area():
>       assert row_to_list("\t293,410\n") is None
E       AssertionError: assert ['', '293,410'] is None
E        +  where ['', '293,410'] = row_to_list('\t293,410\n')
test_row_to_list.py:7: AssertionError
Section 4 - Test Summary
======== 1 failed, 2 passed in 0.03 seconds ========
- Result summary from all unit tests that ran: 1 failed, 2 passed.
- Total run time: 0.03 seconds.
More benefits and test types
- Unit tests serve as documentation of how code should run.
- Greater trust in a package, since users can run tests and verify that the package works.
- Incorporating tests in a CI/CD pipeline helps ensure bad code is not pushed to a production system.
Unit test
- A unit is a small, independent piece of code such as a Python function or class.
Integration test
- Checks whether multiple units work correctly together (see the sketch below).
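For example, a hedged sketch of an integration test that exercises row_to_list() and convert_to_int() together; the combined expectation is inferred from the unit-test examples in these notes:
def test_row_to_list_and_convert_to_int_together():
    # Integration: the output of row_to_list() is fed into convert_to_int()
    row = row_to_list("2,081\t314,942\n")
    assert [convert_to_int(value) for value in row] == [2081, 314942]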
Intermediate Unit Testing
Assert statements
assert boolean_expression, message
The message is only printed if the boolean expression evaluates to False; it should explain why the AssertionError was raised.
### test_row_to_list.py ###
import pytest
...
def test_for_missing_area():
    actual = row_to_list("\t293,410\n")
    expected = None
    message = ("row_to_list('\t293,410\n') "
               "returned {0} instead "
               "of {1}".format(actual, expected))
    assert actual is expected, message
Assertions with Floats:
Because of the way Python represents floats, comparing float values directly can give unexpected results, e.g. 0.1 + 0.1 + 0.1 == 0.3 evaluates to False.
Instead, when comparing float values we should wrap the expected return value in pytest.approx().
assert 0.1 + 0.1 + 0.1 == pytest.approx(0.3)
This method also works for NumPy arrays:
assert np.array([0.1 + 0.1, 0.1 + 0.1 + 0.1]) == pytest.approx(np.array([0.2, 0.3]))
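pytest.approx() also accepts relative and absolute tolerance arguments; a short sketch (the tolerance values here are chosen arbitrarily for illustration):
import numpy as np
import pytest

def test_on_float_sums():
    # Default behaviour: a small relative tolerance is applied
    assert 0.1 + 0.1 + 0.1 == pytest.approx(0.3)
    # An explicit absolute tolerance can be passed instead
    assert 0.30000001 == pytest.approx(0.3, abs=1e-6)
    # The comparison also works element-wise for NumPy arrays
    assert np.array([0.1 + 0.1, 0.1 + 0.1 + 0.1]) == pytest.approx(np.array([0.2, 0.3]))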
Multiple assertions in one unit test:
A unit test may contain multiple assert statements to test the output of a function.
### test_convert_to_int.py ###
import pytest
...
def test_on_string_with_one_comma():
    return_value = convert_to_int("2,081")
    assert isinstance(return_value, int)
    assert return_value == 2081
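Another common pattern is asserting on several properties of one return value. A hedged sketch for split_into_training_and_testing_sets(), assuming it returns a (training, testing) pair of arrays (the exact split ratio does not matter for these assertions):
import numpy as np

def test_on_three_rows():
    example_argument = np.array([[2081.0, 314942.0],
                                 [1059.0, 186606.0],
                                 [1148.0, 206186.0]])
    training, testing = split_into_training_and_testing_sets(example_argument)
    # One assert statement per property of the return value
    assert training.shape[0] + testing.shape[0] == example_argument.shape[0]
    assert training.shape[1] == example_argument.shape[1]
    assert testing.shape[1] == example_argument.shape[1]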
Testing for exceptions instead of return values
Sometimes, rather than testing the return value of a function, we want to check that a function correctly raises an Exception.
with pytest.raises(ValueError):    # <-- Does nothing on entering the context
    print("This is part of the context")
# <-- If the context raises a ValueError, silence it
# <-- If the context did not raise a ValueError, raise an exception on exit
import numpy as np

def test_valueerror_on_one_dimensional_arg():
    example_argument = np.array([2081, 314942, 1059, 186606, 1148, 206186])
    with pytest.raises(ValueError):
        split_into_training_and_testing_sets(example_argument)
We can also check that the exception's error message is correct by storing the exception_info object returned by the context manager.
def test_valueerror_on_one_dimensional_arg():
    example_argument = np.array([2081, 314942, 1059, 186606, 1148, 206186])
    with pytest.raises(ValueError) as exception_info:
        split_into_training_and_testing_sets(example_argument)
    assert exception_info.match(
        "Argument data array must be 2 dimensional. "
        "Got 1 dimensional array instead"
    )
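For reference, a hedged sketch of how the function under test might raise this error; only the dimensionality check matters for the tests above, and the 75/25 split shown here is an assumption:
### train.py (illustrative sketch, not the project's actual code) ###
def split_into_training_and_testing_sets(data_array):
    # Reject anything that is not a 2-dimensional array
    if data_array.ndim != 2:
        raise ValueError(
            "Argument data array must be 2 dimensional. "
            "Got {0} dimensional array instead".format(data_array.ndim)
        )
    # Assumed split: first 75% of rows for training, the rest for testing
    num_training = int(0.75 * data_array.shape[0])
    return data_array[:num_training, :], data_array[num_training:, :]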
How many tests should one write for a function?
Test argument types:
- Bad arguments
  - Arguments for which the function raises an exception instead of returning a value.
- Special arguments
  - Boundary values
  - Arguments that trigger special logic
- Normal arguments
  - Recommended to test at least two or three normal arguments (see the sketch below).
Note: not all functions will have special or bad arguments.
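Applied to row_to_list(), a test class covering these categories might look like the sketch below; the expectation that bad rows return None comes from the earlier tests, while the specific boundary rows chosen here are assumptions for illustration:
class TestRowToList(object):
    # Bad/boundary arguments: rows with a missing value return None
    def test_on_no_tab_missing_value(self):
        assert row_to_list("\n") is None
    def test_on_two_tabs_missing_value(self):
        assert row_to_list("123\t\t89,321\n") is None
    # Normal arguments (at least two or three)
    def test_on_normal_argument_1(self):
        assert row_to_list("123\t4,567\n") == ["123", "4,567"]
    def test_on_normal_argument_2(self):
        assert row_to_list("2,081\t314,942\n") == ["2,081", "314,942"]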
Test Driven Development (TDD)
It is good practice to write unit tests before implementing the code they test; a minimal example of the cycle is sketched after this list.
- Unit tests cannot be deprioritised.
- Time for writing unit tests must be factored into implementation time.
- Requirements are clearer and implementation is easier.
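A minimal sketch of the TDD cycle using convert_to_int() from above; the implementation shown is only an assumed bare minimum needed to make this single test pass, not the project's actual code:
### Step 1: write the test first -- running pytest fails because convert_to_int() does not exist yet
### test_convert_to_int.py ###
def test_on_string_with_one_comma():
    assert convert_to_int("2,081") == 2081

### Step 2: implement just enough code to make the test pass, then re-run pytest
### preprocessing_helpers.py ###
def convert_to_int(string_with_comma):
    # Strip the thousands separator and cast to int
    return int(string_with_comma.replace(",", ""))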
Test Organisation and Execution
Organising tests
Project Structure:
src/                                     # All application code lives here
|-- data/                                # Package for data preprocessing
|   |-- __init__.py
|   |-- preprocessing_helpers.py         # Contains row_to_list(), convert_to_int()
|-- features/                            # Package for feature generation from preprocessed data
|   |-- __init__.py
|   |-- as_numpy.py                      # Contains get_data_as_numpy_array()
|-- models/                              # Package for training/testing the linear regression model
|   |-- __init__.py
|   |-- train.py                         # Contains split_into_training_and_testing_sets()
tests/                                   # Test suite: all tests live here
|-- data/
|   |-- __init__.py
|   |-- test_preprocessing_helpers.py    # Corresponds to src/data/preprocessing_helpers.py
|-- features/
|   |-- __init__.py
|   |-- test_as_numpy.py                 # Contains TestGetDataAsNumpyArray
|-- models/
|   |-- __init__.py
|   |-- test_train.py                    # Contains TestSplitIntoTrainingAndTestingSets
For each module my_module.py there should be a corresponding test module test_my_module.py. Mirroring the internal structure of the project in this way means that if we know the location of a module, we can infer the location of its tests.
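For imports such as from data.preprocessing_helpers import row_to_list to resolve from the tests directory, the application code is typically installed as a package (e.g. with pip install -e .), which requires a packaging file at the project root. A minimal sketch, with an assumed package name and version:
### setup.py (minimal sketch; name and version are assumptions) ###
from setuptools import setup, find_packages

setup(
    name="univariate_linear_regression",
    version="0.1.0",
    packages=find_packages(where="src"),  # the packages (data, features, models) live under src/
    package_dir={"": "src"},
)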
Test module structure:
import pytest
from data.preprocessing_helpers import row_to_list, convert_to_int

class TestRowToList(object):    # Use CamelCase
    def test_on_no_tab_no_missing_value(self):
        ...

    def test_on_two_tabs_no_missing_value(self):
        ...

class TestConvertToInt(object):
    def test_with_no_comma(self):
        ...

    def test_with_one_comma(self):
        ...
Mastering Test Execution
- Running all tests:
  cd tests
  pytest
- Stop after the first failure:
  pytest -x
- Running a subset of tests:
  pytest data/test_preprocessing_helpers.py
pytest assigns a unique Node ID to each test class and unit test it encounters:
- Node ID of a test class:
  <path to test module>::<test class name>
  pytest data/test_preprocessing_helpers.py::TestRowToList
- Node ID of a unit test:
  <path to test module>::<test class name>::<unit test name>
  pytest data/test_preprocessing_helpers.py::TestRowToList::test_on_one_tab_with_missing_value
- Running tests using keyword expressions:
  pytest -k "pattern"
  e.g.
  pytest -k "TestSplitIntoTrainingAndTestingSets"
  pytest -k "TestSplit"
  pytest -k "TestSplit and not test_on_one_row"
Expected Failures and Conditional Skipping
If we want to mark a test that is expected to fail, e.g. because the function is not yet implemented, we may do so with the xfail decorator. The test will be reported as xfail in the test results, but it will not cause the test suite to fail.
import pytest

class TestTrainModel(object):
    @pytest.mark.xfail(reason="Using TDD, train_model() is not implemented")
    def test_on_linear_data(self):
        ...
If we wish to skip a test based on a given condition, e.g. a test which only works on a particular Python version, we may use the skipif decorator. If the boolean expression passed to the decorator is True, the test will be skipped. We must also pass the reason argument to describe why the test is being skipped.
import pytest
import sys

class TestConvertToInt(object):
    @pytest.mark.skipif(sys.version_info > (2, 7), reason="requires Python 2.7")
    def test_with_no_comma(self):
        """Only runs on Python 2.7 or lower"""
        test_argument = "756"
        expected = 756
        actual = convert_to_int(test_argument)
        message = unicode("Expected: 756, Actual: {0}".format(actual))
        assert actual == expected, message
Showing reason for skipping/xfail:
pytest -r[set of characters]
pytest -rs # Show reason for skipping
pytest -rx # Show reason for xfail
pytest -rsx # Show both skipping and xfail reasons
Skipping/xfailing may also be applied to entire test classes.
@pytest.mark.xfail(reason="Using TDD, train_model() is not implemented")
class TestTrainModel(object):
    ...
Continuous integration and code coverage
Developers $\rightarrow$ GitHub $\rightarrow$ CI Server
Build status badge:
- Build passing $\implies$ Stable project
- Build failing $\implies$ Unstable project
Setting up a CI Server with Travis:
1) Create a configuration file (.travis.yml)
language: python
python:
- "3.6"
install:
- pip install -e .
script:
- pytest tests
2) Push the file to GitHub
git add .travis.yml
git commit -m "Add .travis.yml"
git push origin master
3) Install the Travis CI app
- The Travis CI app can be installed through the GitHub Marketplace.
- Travis CI is free for public repositories.
- Every commit pushed to our GitHub repo will now trigger a build in Travis.
4) Show the build status badge
- Click on the badge shown in Travis and select “Markdown”.
- Paste the markdown code into the GitHub README.md.
Adding Code Coverage Badge to GitHub:
$\text{code coverage} = \frac{\text{num lines of application code that ran during testing}}{\text{total num lines of application code}}$
High percentages (75% and above) indicate a well tested code base.
1) Modify .travis.yml
language: python
python:
- "3.6"
install:
- pip install -e .
- pip install pytest-cov codecov # Install packages for code coverage report
script:
- pytest --cov=src tests # Point to the source directory
after_success:
- codecov # Uploads report to codecov.io
2) Install the Codecov app from the GitHub Marketplace
3) Paste the markdown link for the badge into README.md
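The same coverage measurement can be run locally before pushing; a short sketch, assuming pytest-cov is installed:
pytest --cov=src tests                             # Coverage measured over the src/ directory
pytest --cov=src --cov-report=term-missing tests   # Also list the line numbers that were never executed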