Python Style Guide

This guide documents Python coding conventions that go beyond what ruff and clint can enforce. The practices below require human judgment to implement correctly and improve code readability, maintainability, and testability across the MLflow codebase.

Avoid Redundant Docstrings

Omit docstrings that merely repeat the function name or provide no additional value. Function names should be self-documenting.

# Bad
def calculate_sum(a: int, b: int) -> int:
    """Calculate sum"""
    return a + b


# Good
def calculate_sum(a: int, b: int) -> int:
    return a + b
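
A docstring that captures non-obvious behavior is still encouraged. As a sketch (normalize_name is a hypothetical function, not from the MLflow codebase), a docstring earns its place when it documents behavior the signature cannot convey:

# Also good: the docstring adds information the signature alone cannot convey
def normalize_name(name: str) -> str:
    """Lowercase the name and replace spaces with underscores.

    Leading and trailing whitespace is stripped before normalization.
    """
    return name.strip().lower().replace(" ", "_")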

Prefer typing.Literal for Fixed-String Parameters

When a parameter only accepts a fixed set of string values, use typing.Literal instead of a plain str type hint. This improves type-checking, enables IDE autocompletion, and documents allowed values at the type level.

# Bad
def f(app: str) -> None:
    """
    Args:
        app: Application type. Either "fastapi" or "flask".
    """
    ...


# Good
from typing import Literal


def f(app: Literal["fastapi", "flask"]) -> None:
    """
    Args:
        app: Application type. Either "fastapi" or "flask".
    """
    ...
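
With the Literal annotation in place, type checkers such as mypy or pyright reject unsupported values at check time. The call sites below are illustrative:

f("fastapi")  # OK
f("flask")  # OK
f("django")  # flagged by the type checker: not a valid Literal value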

Minimize Try-Except Block Scope

Wrap only the specific operations that can raise exceptions. Keep safe operations outside the try block to improve debugging and avoid masking unexpected errors.

# Bad
try:
    never_fails()
    can_fail()
except ...:
    handle_error()

# Good
never_fails()
try:
    can_fail()
except ...:
    handle_error()
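
A concrete sketch of the same principle, using an illustrative config file and JSON parsing: only the call that is expected to fail sits inside the try block, so unrelated errors are not swallowed by the handler.

import json
from pathlib import Path

raw = Path("config.json").read_text()  # unexpected I/O errors surface as-is
try:
    config = json.loads(raw)  # only the parse step is expected to fail
except json.JSONDecodeError as e:
    raise ValueError(f"Invalid config file: {e}") from e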

Use Dataclasses Instead of Complex Tuples

Replace tuples that have three or more elements with a dataclass. This improves code clarity, prevents positional-argument errors, and enables type checking on individual fields.

# Bad
def get_user() -> tuple[str, int, str]:
    return "Alice", 30, "Engineer"


# Good
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int
    occupation: str


def get_user() -> User:
    return User(name="Alice", age=30, occupation="Engineer")
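
Call sites then read by field name instead of by position. A brief usage sketch building on the User dataclass above:

user = get_user()
print(user.occupation)  # self-explanatory at the call site
# The tuple version forces the reader to remember positions:
# name, age, occupation = get_user()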

Use pathlib Methods Instead of os Module Functions

When you have a pathlib.Path object, use its built-in methods instead of os module functions. This is more readable, type-safe, and follows object-oriented principles.

from pathlib import Path

path = Path("some/file.txt")

# Bad
import os

os.path.exists(path)
os.remove(path)

# Good
path.exists()
path.unlink()
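
The same substitution applies to other common os and built-in calls. The mappings below are standard pathlib equivalents, listed as a quick reference:

path.mkdir(parents=True, exist_ok=True)  # instead of os.makedirs(path, exist_ok=True)
content = path.read_text()  # instead of open(path).read()
parent = path.parent  # instead of os.path.dirname(path)
name = path.name  # instead of os.path.basename(path)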

Pass pathlib.Path Objects Directly to subprocess

Avoid converting pathlib.Path objects to strings when passing them to subprocess functions. Modern Python (3.8+) accepts Path objects directly, making the code cleaner and more type-safe.

import subprocess
from pathlib import Path

path = Path("some/script.py")

# Bad
subprocess.check_call(["foo", "bar", str(path)])

# Good
subprocess.check_call(["foo", "bar", path])
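
Path-valued keyword arguments such as cwd also accept Path objects directly. A minimal sketch (the command and working directory are illustrative):

subprocess.check_call(["python", path], cwd=path.parent)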

Use next() to Find First Match Instead of Loop-and-Break

Use the next() builtin function with a generator expression to find the first item that matches a condition. This is more concise and functional than manually looping with break statements.

# Bad
result = None
for item in items:
    if item.name == "target":
        result = item
        break

# Good
result = next((item for item in items if item.name == "target"), None)
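
If a missing match should be an error rather than None, raise a descriptive exception after the lookup (a small sketch reusing the same items collection):

result = next((item for item in items if item.name == "target"), None)
if result is None:
    raise ValueError("No item named 'target' found")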

Use Pattern Matching for String Splitting

When splitting strings into a fixed number of parts, use pattern matching instead of direct unpacking or verbose length checks. Pattern matching provides concise, safe extraction that clearly handles both expected and unexpected cases.

# Bad: unsafe
a, b = some_str.split(".")

# Bad: safe but verbose
if some_str.count(".") == 1:
    a, b = some_str.split(".")
else:
    raise ValueError(f"Invalid format: {some_str!r}")

# Bad: safe but verbose
splits = some_str.split(".")
if len(splits) == 2:
    a, b = splits
else:
    raise ValueError(f"Invalid format: {some_str!r}")

# Good
match some_str.split("."):
    case [a, b]:
        ...
    case _:
        raise ValueError(f"Invalid format: {some_str!r}")
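
The same approach extends to other shapes. A hedged sketch (the version-string format is an assumption) that requires at least two parts and captures the rest:

match version_str.split("."):
    case [major, minor, *rest]:
        ...  # rest may be empty or hold patch/build segments
    case _:
        raise ValueError(f"Invalid version: {version_str!r}")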

Always Verify Mock Calls with Assertions

Every mocked function must have an assertion (assert_called, assert_called_once, etc.) to verify it was invoked correctly. Without assertions, tests may pass even when the mocked code isn't executed.

from unittest import mock


# Bad
def test_foo():
    with mock.patch("foo.bar"):
        calls_bar()


# Good
def test_foo():
    with mock.patch("foo.bar") as mock_bar:
        calls_bar()
        mock_bar.assert_called_once()
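
When the arguments matter, prefer assert_called_once_with to also verify what the mock received. This sketch assumes calls_bar passes "baz" through to foo.bar:

# Also good: verify the arguments, not just the call count
def test_foo_arguments():
    with mock.patch("foo.bar") as mock_bar:
        calls_bar()
        mock_bar.assert_called_once_with("baz")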

Set Mock Behaviors in Patch Declaration

Define return_value and side_effect directly in the patch() call rather than assigning them afterward. This keeps mock configuration explicit and reduces setup code.

from unittest import mock


# Bad
def test_foo():
    with mock.patch("foo.bar") as mock_bar:
        mock_bar.return_value = 42
        calls_bar()

    with mock.patch("foo.bar") as mock_bar:
        mock_bar.side_effect = Exception("Error")
        calls_bar()


# Good
def test_foo():
    with mock.patch("foo.bar", return_value=42) as mock_bar:
        calls_bar()

    with mock.patch("foo.bar", side_effect=Exception("Error")) as mock_bar:
        calls_bar()
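
side_effect also accepts an iterable for mocks that must return different values on successive calls. A sketch under the same foo.bar assumption:

def test_foo_sequential():
    with mock.patch("foo.bar", side_effect=[1, 2]) as mock_bar:
        calls_bar()  # foo.bar returns 1 on the first call
        calls_bar()  # foo.bar returns 2 on the second call
        assert mock_bar.call_count == 2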

Parametrize Tests with Multiple Input Cases

Use @pytest.mark.parametrize to test multiple inputs instead of repeating assertions. This creates separate test cases for each input, making failures easier to diagnose and tests more maintainable.

# Bad
def test_foo():
    assert foo("a") == 0
    assert foo("b") == 1
    assert foo("c") == 2


# Good
import pytest


@pytest.mark.parametrize(
    ("value", "expected"),
    [
        ("a", 0),
        ("b", 1),
        ("c", 2),
    ],
)
def test_foo(value: str, expected: int):
    assert foo(value) == expected
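
When individual cases need readable test IDs (or marks such as xfail), pytest.param can wrap them. A brief sketch extending the example above:

@pytest.mark.parametrize(
    ("value", "expected"),
    [
        pytest.param("a", 0, id="first-letter"),
        pytest.param("b", 1, id="second-letter"),
    ],
)
def test_foo_ids(value: str, expected: int):
    assert foo(value) == expected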

Avoid Custom Messages in Test Asserts

Pytest's assertion introspection provides detailed failure information automatically. Avoid adding custom messages to assert statements in tests unless absolutely necessary.

# Bad
def test_list_items():
    items = list_items()
    assert len(items) == 3, f"Expected 3 items, got {len(items)}"


# Good
def test_list_items():
    items = list_items()
    assert len(items) == 3
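
A custom message remains reasonable when the asserted value alone cannot explain the failure. In this sketch, make_request is a hypothetical helper returning a requests-style response, and the message surfaces the error body that introspection cannot show:

# Acceptable: the message carries context assertion introspection cannot provide
def test_request_succeeds():
    resp = make_request()
    assert resp.ok, f"Request failed with status {resp.status_code}: {resp.text}"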

Preserve Function Metadata and Type Information in Decorators

When writing decorators, always use @functools.wraps to preserve function metadata (like __name__ and __doc__), and use typing.ParamSpec and typing.TypeVar to preserve the function's type information for accurate type checking and autocompletion in IDEs.

# Bad
from typing import Any, Callable


def decorator(f: Callable[..., Any]) -> Callable[..., Any]:
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        ...  # Pre-execution logic (e.g., logging, validation, setup)
        res = f(*args, **kwargs)
        ...  # Post-execution logic (e.g., cleanup, result transformation)
        return res

    return wrapper


# Good
import functools
from typing import Callable, ParamSpec, TypeVar

_P = ParamSpec("_P")
_R = TypeVar("_R")


def decorator(f: Callable[_P, _R]) -> Callable[_P, _R]:
    @functools.wraps(f)
    def wrapper(*args: _P.args, **kwargs: _P.kwargs) -> _R:
        ...  # Pre-execution logic (e.g., logging, validation, setup)
        res = f(*args, **kwargs)
        ...  # Post-execution logic (e.g., cleanup, result transformation)
        return res

    return wrapper
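
A usage sketch (greet is a hypothetical function) shows what the combination preserves: functools.wraps keeps the metadata, and ParamSpec/TypeVar keep the real signature visible to type checkers:

@decorator
def greet(name: str) -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}"


assert greet.__name__ == "greet"  # preserved by functools.wraps
assert greet.__doc__ == "Return a greeting for the given name."
greeting: str = greet("Alice")  # type checkers see (name: str) -> str, not (*args, **kwargs) -> Any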