We start with pre-commit. This is our first line of defence against errors in our code that are simple to correct. Git hooks let you customize Git's internal behavior and trigger customizable actions at key points in the development life cycle; a pre-commit hook fires just before a commit is created. We can use this to run checks on our code before we commit it. These can include things like:
- Code formatting
- Linting
- Typing checks
- Import organization
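To make the idea concrete, here is a minimal sketch of what a pre-commit hook does in principle: it runs a series of checks and exits with a non-zero status if any of them fail, which causes Git to abort the commit. The function name `run_checks` and the two demo commands are hypothetical stand-ins, not part of the pre-commit tool itself.

```python
import subprocess
import sys


def run_checks(checks: dict[str, list[str]]) -> int:
    """Run each check command; return 0 only if all of them succeed."""
    failed = []
    for name, command in checks.items():
        result = subprocess.run(command, capture_output=True, text=True)
        status = "Passed" if result.returncode == 0 else "Failed"
        print(f"{name}: {status}")
        if result.returncode != 0:
            failed.append(name)
    # A non-zero exit status from a pre-commit hook makes Git abort the commit.
    return 1 if failed else 0


# Stand-in commands (assumptions) so the sketch runs without any linters.
demo_checks = {
    "always-passes": [sys.executable, "-c", "raise SystemExit(0)"],
    "always-fails": [sys.executable, "-c", "raise SystemExit(1)"],
}
exit_code = run_checks(demo_checks)
```

In a real project the commands would be the formatters and linters described in the rest of this section, and the pre-commit tool takes care of running them for us.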
Black enforces code formatting compliant with PEP 8, such as line lengths, indentation, blank lines, etc. It is customizable, so if you don't like certain line length restrictions, you can always change them. You can also tell Black not to look in certain files, or to ignore certain features.
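As a small illustration of telling Black to leave something alone, the sketch below uses Black's `# fmt: off` and `# fmt: on` markers to protect a hand-aligned matrix. The names `IDENTITY` and `trace` are made up for this example.

```python
# fmt: off
# Black would normally collapse the extra spaces after each comma.
IDENTITY = [
    [1,   0,   0],
    [0,   1,   0],
    [0,   0,   1],
]
# fmt: on


def trace(matrix):
    """Sum the diagonal entries of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))
```

Everything outside the two markers is still formatted as usual.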
Flake8 checks for style and syntax errors. It is usually used in conjunction with Black, and you can also tell it to ignore specific rules or files.
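One way to silence Flake8 for a single line is a `# noqa` comment carrying the error code. The sketch below assumes we want to keep an import that is not used in this module; the function `greet` is only there to make the snippet a complete, runnable file.

```python
# Without the marker, Flake8 would report F401 (imported but unused)
# on the next line.
import json  # noqa: F401


def greet(name: str) -> str:
    """Return a short greeting."""
    return f"Hello, {name}!"
```

Use this sparingly: a suppressed warning is easy to forget about.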
Mypy checks for typing errors. Once type hints have been added in accordance with PEP 484, it helps find potential problems with passing incorrect types.
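For example, given a type-hinted function like the hypothetical `mean` below, Mypy can tell that the commented-out call is wrong without running the code.

```python
def mean(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


average = mean([1.0, 2.0, 3.0])

# Mypy would reject the call below, because a str is not a list[float]:
# mean("123")
```

Without the type hints, the mistake would only show up at runtime, when `sum` tries to add characters to a number.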
Isort sorts your imports alphabetically and groups them into sections.
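As a rough picture of the result, the sketch below shows a standard-library import block in the order isort produces by default. Only standard-library modules are used so the snippet runs anywhere, and the function `describe` is a made-up example.

```python
# Plain `import x` lines come first, then `from x import y` lines,
# each group in alphabetical order.
import json
import os
from pathlib import Path

# Third-party imports would form a second block here, and the
# project's own modules a third.


def describe(path_text: str) -> str:
    """Return the file name and extension of a path as JSON."""
    path = Path(path_text)
    extension = os.path.splitext(path_text)[1]
    return json.dumps({"name": path.name, "ext": extension})
```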
In order to ensure code consistency, we will run these checks every time we make a commit. This can be annoying, but it is for our own good!
Setting up pre-commit
First, we need to add the required packages to our project as development dependencies:
poetry add --group dev pre-commit black isort flake8 mypy
We must now make some changes to our pyproject.toml. For Black, add:
[tool.black]
line-length = 88
exclude = '''
/(
    \.git
  | \.hg
  | \.mypy_cache
  | \.tox
  | venv
  | _build
  | buck-out
  | build
  | dist
)/
'''
Take it from me, we DO NOT want our tools to try to alter the files in our virtual environment or distribution folders!
Similarly, for isort:
[tool.isort]
profile = "black"
line_length = 88
multi_line_output = 3
include_trailing_comma = true
virtual_env = "venv"
Unfortunately, flake8 can't be configured inside the pyproject.toml file, so we have to create a separate file in our root directory called .flake8. In it, we add:
[flake8]
max-line-length = 88
The pre-commit hook
Now we create a file in the root directory called .pre-commit-config.yaml, and add the following:
repos:
  - repo:
    rev: 24.3.0
    hooks:
      - id: black
  - repo:
    rev: 7.0.0
    hooks:
      - id: flake8
  - repo:
    rev: v1.9.0
    hooks:
      - id: mypy
  - repo:
    rev: 5.13.2
    hooks:
      - id: isort
Trying it out
In the command line, we can run
poetry run pre-commit run --all-files
and, once it finishes, each hook should be reported as Passed (the first run may take a minute or two).
If any hook reports a failure, work through the issues until every check passes. Similarly, when you use the VSCode UI to make a commit, any issues will be picked up, provided the Git hook has been installed with poetry run pre-commit install.
Now, in one of the source files under cancer_prediction/, add the following import statement:
import matplotlib.pyplot as plt
You should get the following output when you run pre-commit:
- hook id: flake8
- exit code: 1
cancer_prediction/ F401 'matplotlib.pyplot as plt' imported but unused
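To give a feel for what the F401 check is looking for, here is a deliberately simplified sketch using Python's `ast` module. It is not how Flake8 implements the rule, and the helper name `unused_imports` is made up; it just collects the names bound by import statements and reports those that never appear again.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in the source."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    # Every bare name used anywhere in the module, e.g. the "os" in "os.getcwd".
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(imported - used)


sample = "import matplotlib.pyplot as plt\nimport os\nprint(os.getcwd())\n"
result = unused_imports(sample)
```

Here `result` contains only `plt`, mirroring the complaint above. The sample is parsed but never executed, so matplotlib does not need to be installed for the sketch to run.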
Now head into the file that defines the train_and_save_model() function and, for the filename argument, change the type to int, so it reads:
def train_and_save_model(train_data, filename: int = "cancer_model.pkl"):
This is nonsensical, since the default value is a string while the argument is declared to be an int! And sure enough, when you run pre-commit, you get this complaint from mypy:
- hook id: mypy
- exit code: 1
cancer_prediction/ error: Incompatible default for argument "filename" (default has type "str", argument has type "int") [assignment]
cancer_prediction/ error: No overload variant of "join" matches argument types "str", "int" [call-overload]
cancer_prediction/ note: Possible overload variants:
cancer_prediction/ note: def join(str, /, *paths: str) -> str
cancer_prediction/ note: def join(str | PathLike[str], /, *paths: str | PathLike[str]) -> str
cancer_prediction/ note: def join(bytes | PathLike[bytes], /, *paths: bytes | PathLike[bytes]) -> bytes
cancer_prediction/ error: No overload variant of "dirname" matches argument type "int" [call-overload]
cancer_prediction/ note: Possible overload variants:
cancer_prediction/ note: def [AnyStr in (str, bytes)] dirname(p: PathLike[AnyStr]) -> AnyStr
cancer_prediction/ note: def [AnyOrLiteralStr in (str, bytes, str)] dirname(p: AnyOrLiteralStr) -> AnyOrLiteralStr
cancer_prediction/ error: Argument 1 to "save" of "CancerModel" has incompatible type "int"; expected "str" [arg-type]
Found 4 errors in 1 file (checked 7 source files)
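The value of catching this early is easier to see next to the runtime behaviour. The sketch below uses a hypothetical stand-in, `build_model_path`, because the real body of train_and_save_model() is not shown here; only the argument name and its default come from the example above, and the "models" directory is an assumption.

```python
import os


# Hypothetical stand-in with the correct annotation: filename is a str.
def build_model_path(
    filename: str = "cancer_model.pkl", directory: str = "models"
) -> str:
    """Join the directory and file name into a single path."""
    return os.path.join(directory, filename)


default_path = build_model_path()

# Passing an int where a path component is expected is what mypy flagged.
# Without mypy, the mistake only surfaces when the line actually runs:
try:
    os.path.join("models", 3)
    runtime_error = None
except TypeError as exc:
    runtime_error = type(exc).__name__
```

With the annotation restored to `str`, the errors reported above should go away, and the mistake never gets as far as a failing run.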
Further reading
Information on GitHub Actions, Black, Flake8, Mypy, Isort, and Git Hooks