TDD & Mocking

Test-Driven Development (TDD)

Test-Driven Development is a software development practice where you write a failing test before writing the production code that makes it pass. It inverts the traditional “write code, then test” workflow and fundamentally changes how you approach design.

The Red-Green-Refactor Cycle

TDD follows a tight, disciplined cycle of three steps:

1. Red — Write a Failing Test

Write a test that describes the behavior you want. Run it and confirm that it fails. This step is critical: if the test passes before you write any production code, either the test is wrong or the behavior already exists.

The failing test serves as a specification. It defines what “done” looks like before you start writing implementation code.

2. Green — Make It Pass

Write the simplest, most direct code that makes the failing test pass. Do not worry about elegance, performance, or covering every edge case. The only goal is to turn the red test green.

This step is intentionally minimal. You are not writing the “final” solution — you are writing just enough code to satisfy the current test.

3. Refactor — Clean Up

Now that the test passes, improve the code. Remove duplication, extract methods, rename variables, and apply design patterns. The key constraint is that all tests must continue to pass after refactoring.

Refactoring with confidence is the payoff. You can restructure code aggressively because your test suite will catch any mistakes immediately.

+-------+
|  RED  |  <-- Write a failing test
+---+---+
    |
    v
+-------+
| GREEN |  <-- Write minimal code to pass
+---+---+
    |
    v
+----------+
| REFACTOR |  <-- Clean up, all tests still pass
+----+-----+
     |
     +-------> Back to RED (next behavior)

Each cycle should be short — typically a few minutes. If you find yourself spending 30 minutes writing code before running a test, you are taking too large a step.


TDD Benefits

  • Design emerges from usage. Writing the test first forces you to think about how the code will be called, leading to cleaner APIs and better interfaces.
  • High test coverage comes as a side effect. Nearly every line of production code exists because a test demanded it.
  • Fewer bugs. The tight feedback loop catches errors immediately, before they compound.
  • Fearless refactoring. A comprehensive test suite lets you restructure code without worrying about breaking things.
  • Living documentation. The test suite describes exactly what the system does, and it never goes stale because it is executed constantly.

TDD Criticisms and Responses

| Criticism | Response |
| --- | --- |
| “TDD slows me down” | Initial velocity may decrease, but long-term velocity increases because you spend less time debugging and fixing regressions. |
| “Not everything can be TDD’d” | True for some areas (UI layout, exploratory prototyping), but TDD works well for business logic, algorithms, and APIs. |
| “Tests become coupled to implementation” | This happens when you test private methods or mock too aggressively. Test behavior, not implementation. |
| “100% coverage is wasteful” | TDD does not require 100% coverage. It produces high coverage as a side effect, but you can skip trivial code. |

TDD Walkthrough: Building a String Calculator

Let us build a simple string calculator using TDD, step by step. The function add(numbers) takes a string of comma-separated numbers and returns their sum.

Step 1: Empty String Returns Zero

# RED: Write the test first
def test_empty_string_returns_zero():
    assert add("") == 0

# GREEN: Simplest implementation
def add(numbers):
    return 0

Step 2: Single Number Returns Itself

# RED: New failing test
def test_single_number_returns_its_value():
    assert add("5") == 5

# GREEN: Handle single number
def add(numbers):
    if numbers == "":
        return 0
    return int(numbers)

Step 3: Two Numbers Return Their Sum

# RED: Another failing test
def test_two_numbers_returns_sum():
    assert add("1,2") == 3

# GREEN: Handle multiple numbers
def add(numbers):
    if numbers == "":
        return 0
    return sum(int(n) for n in numbers.split(","))

# REFACTOR: The implementation now handles all previous cases too.
# Run all tests to confirm they still pass.

Step 4: Negative Numbers Raise an Exception

import pytest

# RED: Test for error behavior
def test_negative_numbers_raise_exception():
    with pytest.raises(ValueError, match="negatives not allowed: -3"):
        add("1,-3,5")

# GREEN: Add validation
def add(numbers):
    if numbers == "":
        return 0
    nums = [int(n) for n in numbers.split(",")]
    negatives = [n for n in nums if n < 0]
    if negatives:
        raise ValueError(f"negatives not allowed: {', '.join(str(n) for n in negatives)}")
    return sum(nums)

Notice how each step is small and focused. The design of the function emerges naturally from the tests.
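The walkthrough stops at green on Step 4. As a sketch of what the refactor step might look like from there, the version below splits parsing and validation into helpers and drops the empty-string special case. The helper names `_parse` and `_reject_negatives` are invented for this illustration, and plain `assert` statements stand in for the pytest tests so the snippet runs on its own.

```python
def _parse(numbers):
    """Turn "1,2,3" into [1, 2, 3]; an empty string yields an empty list."""
    return [int(n) for n in numbers.split(",")] if numbers else []


def _reject_negatives(nums):
    """Raise ValueError listing every negative number found."""
    negatives = [n for n in nums if n < 0]
    if negatives:
        listed = ", ".join(str(n) for n in negatives)
        raise ValueError(f"negatives not allowed: {listed}")


def add(numbers):
    nums = _parse(numbers)
    _reject_negatives(nums)
    # sum([]) is 0, so the empty string no longer needs its own branch.
    return sum(nums)


# Re-run the four behaviors from the walkthrough after the refactor.
assert add("") == 0
assert add("5") == 5
assert add("1,2") == 3
try:
    add("1,-3,5")
except ValueError as exc:
    assert str(exc) == "negatives not allowed: -3"
else:
    raise AssertionError("expected ValueError for negative input")
```

The observable behavior is unchanged. Only the structure moved, which is the constraint the refactor step imposes.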


Test Doubles Taxonomy

When unit testing, you often need to replace real dependencies with controlled substitutes. These substitutes are collectively called test doubles. Understanding the different types helps you choose the right tool for each situation.

Dummy

A dummy is an object passed to satisfy a parameter requirement but never actually used. It fills a slot in a method signature.

# The logger is required by the constructor but not relevant to this test
def test_user_creation():
    dummy_logger = None  # or a no-op logger
    user = User("Alice", logger=dummy_logger)
    assert user.name == "Alice"

Stub

A stub provides canned answers to method calls. It returns predetermined responses without any logic. Use stubs when you need a dependency to return specific values for your test scenario.

# Stub the weather service to return a known forecast
class StubWeatherService:
    def get_forecast(self, city):
        return {"temperature": 72, "condition": "sunny"}

def test_trip_planner_recommends_outdoor_activity():
    planner = TripPlanner(weather_service=StubWeatherService())
    recommendation = planner.suggest_activity("Portland")
    assert recommendation == "hiking"

Spy

A spy records information about how it was called — which methods, with what arguments, how many times. Use spies when you need to verify that your code interacts with a dependency correctly.

# Spy tracks calls to send
class SpyEmailService:
    def __init__(self):
        self.sent_emails = []

    def send(self, to, subject, body):
        self.sent_emails.append({"to": to, "subject": subject, "body": body})

def test_registration_sends_welcome_email():
    spy = SpyEmailService()
    registration = UserRegistration(email_service=spy)
    registration.register("alice@example.com", "password123")
    assert len(spy.sent_emails) == 1
    assert spy.sent_emails[0]["to"] == "alice@example.com"
    assert "welcome" in spy.sent_emails[0]["subject"].lower()

Mock

A mock is a pre-programmed object with expectations about how it will be called. Unlike a spy (which records and lets you assert after the fact), a mock typically verifies expectations during the test. In practice, most modern frameworks blur the line between spies and mocks.
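To make the contrast with a spy concrete, here is a minimal hand-rolled sketch of a mock that is told what to expect up front and checks each call as it happens. `MockEmailService`, its `expect_send` and `verify` methods, and the tiny `UserRegistration` class are all hypothetical names written for this illustration, not part of any library.

```python
class MockEmailService:
    """Hand-rolled mock: expectations are registered first, then checked per call."""

    def __init__(self):
        self._expected = []  # queue of (to, subject) pairs, in expected order

    def expect_send(self, to, subject):
        self._expected.append((to, subject))

    def send(self, to, subject, body):
        # Verification happens during the call, not afterwards as with a spy.
        if not self._expected:
            raise AssertionError(f"unexpected send() to {to!r}")
        expected = self._expected.pop(0)
        if (to, subject) != expected:
            raise AssertionError(f"expected {expected!r}, got {(to, subject)!r}")

    def verify(self):
        # Fails if some expected call never happened.
        if self._expected:
            raise AssertionError(f"missing expected calls: {self._expected!r}")


class UserRegistration:
    """Hypothetical system under test."""

    def __init__(self, email_service):
        self._email_service = email_service

    def register(self, email, password):
        self._email_service.send(email, "Welcome!", "Thanks for signing up.")


def test_registration_sends_welcome_email():
    mock = MockEmailService()
    mock.expect_send("alice@example.com", "Welcome!")

    UserRegistration(email_service=mock).register("alice@example.com", "password123")

    mock.verify()
```

Note that Python's `unittest.mock` works the other way around: you call the code first and assert on the recorded calls afterwards, which is one reason the spy/mock distinction blurs in practice.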

Fake

A fake is a working implementation that takes a shortcut unsuitable for production. Common examples include in-memory databases, local file system caches instead of cloud storage, and fake SMTP servers.

# Fake in-memory database for testing
class FakeUserDatabase:
    def __init__(self):
        self._users = {}
        self._next_id = 1

    def save(self, user):
        user.id = self._next_id
        self._users[self._next_id] = user
        self._next_id += 1
        return user

    def find_by_id(self, user_id):
        return self._users.get(user_id)

When to Mock

You Should Mock

  • External services — HTTP APIs, third-party services, payment gateways. You cannot control their behavior or availability.
  • I/O operations — File system, network calls, database queries (in unit tests). These tend to be slow and can be non-deterministic.
  • Non-deterministic inputs — Current time, random numbers, UUIDs. Mock these to make tests repeatable.
  • Expensive operations — Heavy computations or operations with rate limits that would slow down your test suite.
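As an illustration of the non-deterministic case in the list above, the sketch below pins both the current time and a random UUID so the test produces the same result on every run. The `create_token` function and its `clock` parameter are hypothetical. The time is controlled by injecting a callable and the UUID by patching `uuid.uuid4`, which are two common options among several.

```python
import uuid
from datetime import datetime, timezone
from unittest.mock import patch


def create_token(clock=lambda: datetime.now(timezone.utc)):
    """Hypothetical function that depends on the current time and a random UUID."""
    return {"id": str(uuid.uuid4()), "issued_at": clock()}


def test_create_token_is_repeatable():
    fixed_time = datetime(2025, 6, 15, 10, 30, tzinfo=timezone.utc)
    fixed_uuid = uuid.UUID("12345678-1234-5678-1234-567812345678")

    # Replace uuid.uuid4 for the duration of the block; inject a fixed clock.
    with patch("uuid.uuid4", return_value=fixed_uuid):
        token = create_token(clock=lambda: fixed_time)

    assert token == {
        "id": "12345678-1234-5678-1234-567812345678",
        "issued_at": fixed_time,
    }
```

Injecting the clock keeps the production code free of patching concerns. Patching is the fallback when the dependency cannot easily be passed in.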

You Should NOT Mock

  • The system under test — Never mock the thing you are testing. If you mock it, you are not testing it.
  • Simple value objects — Data classes, DTOs, and value types are cheap to construct. Use real instances.
  • Implementation details — Do not mock private methods or internal collaborators. This couples tests to implementation, making refactoring painful.
  • Everything — Over-mocking leads to tests that verify your mocks work, not that your code works. If a test has more mock setup than assertions, reconsider.

Mocking in Practice

from datetime import datetime
from unittest.mock import MagicMock, patch

import pytest

# OrderError and PaymentDeclinedError are assumed to be defined in order_service.
from order_service import OrderError, OrderService, PaymentDeclinedError
from payment_gateway import PaymentGateway


class TestOrderService:
    """Mocking examples with unittest.mock."""

    # Using patch as a decorator to replace a dependency
    @patch("order_service.PaymentGateway")
    def test_successful_order_charges_payment(self, MockGateway):
        # Configure the mock
        mock_gateway = MockGateway.return_value
        mock_gateway.charge.return_value = {"status": "success", "transaction_id": "txn_123"}

        service = OrderService()
        result = service.place_order(user_id=1, amount=99.99)

        # Verify the mock was called correctly
        mock_gateway.charge.assert_called_once_with(amount=99.99, currency="USD")
        assert result.transaction_id == "txn_123"

    # Using patch as a context manager
    def test_order_fails_when_payment_declines(self):
        with patch("order_service.PaymentGateway") as MockGateway:
            mock_gateway = MockGateway.return_value
            mock_gateway.charge.return_value = {"status": "declined"}

            service = OrderService()
            with pytest.raises(PaymentDeclinedError):
                service.place_order(user_id=1, amount=99.99)

    # Using MagicMock with side_effect for exceptions
    def test_order_handles_gateway_timeout(self):
        mock_gateway = MagicMock(spec=PaymentGateway)
        mock_gateway.charge.side_effect = TimeoutError("Gateway timeout")

        service = OrderService(payment_gateway=mock_gateway)
        with pytest.raises(OrderError, match="payment service unavailable"):
            service.place_order(user_id=1, amount=99.99)

    # Using side_effect for sequential return values
    def test_retry_logic_on_transient_failure(self):
        mock_gateway = MagicMock(spec=PaymentGateway)
        mock_gateway.charge.side_effect = [
            TimeoutError("First attempt failed"),
            {"status": "success", "transaction_id": "txn_456"},
        ]

        service = OrderService(payment_gateway=mock_gateway)
        result = service.place_order(user_id=1, amount=50.00)

        assert mock_gateway.charge.call_count == 2
        assert result.transaction_id == "txn_456"

    # Patching the datetime name as imported by order_service
    # (patch where the name is looked up, not where it is defined)
    @patch("order_service.datetime")
    def test_order_records_timestamp(self, mock_datetime):
        mock_datetime.now.return_value = datetime(2025, 6, 15, 10, 30, 0)

        service = OrderService()
        order = service.create_order(user_id=1, items=["widget"])

        assert order.created_at == datetime(2025, 6, 15, 10, 30, 0)
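The tests above never show `OrderService` itself. One shape it might take, purely as an assumption for illustration, is sketched below: the gateway can be injected or defaults to a new `PaymentGateway`, timeouts are retried, and a non-success status raises. `OrderResult`, the `max_attempts` parameter, and the exception hierarchy are all invented here, and `PaymentGateway` is a stand-in that makes no real network call.

```python
from dataclasses import dataclass
from unittest.mock import MagicMock


class OrderError(Exception):
    """Base error for order failures (hypothetical)."""


class PaymentDeclinedError(OrderError):
    """Raised when the gateway reports a non-success status (hypothetical)."""


class PaymentGateway:
    """Stand-in for the real gateway; a real one would make a network call."""

    def charge(self, amount, currency):
        raise NotImplementedError("network call omitted in this sketch")


@dataclass
class OrderResult:
    transaction_id: str


class OrderService:
    def __init__(self, payment_gateway=None, max_attempts=2):
        # Accepting the gateway as a parameter lets tests inject a mock directly.
        self._gateway = PaymentGateway() if payment_gateway is None else payment_gateway
        self._max_attempts = max_attempts

    def place_order(self, user_id, amount):
        for attempt in range(1, self._max_attempts + 1):
            try:
                response = self._gateway.charge(amount=amount, currency="USD")
            except TimeoutError as exc:
                if attempt == self._max_attempts:
                    raise OrderError("payment service unavailable") from exc
                continue  # transient failure: try again
            if response["status"] != "success":
                raise PaymentDeclinedError(f"payment {response['status']}")
            return OrderResult(transaction_id=response["transaction_id"])


def demo_retry():
    """Mirror of the retry test above, runnable without pytest."""
    gateway = MagicMock(spec=PaymentGateway)
    gateway.charge.side_effect = [
        TimeoutError("First attempt failed"),
        {"status": "success", "transaction_id": "txn_456"},
    ]
    result = OrderService(payment_gateway=gateway).place_order(user_id=1, amount=50.00)
    return result.transaction_id, gateway.charge.call_count
```

Constructor injection is what makes the `MagicMock(spec=PaymentGateway)` tests possible without `patch`. The default-construction path is what the `@patch("order_service.PaymentGateway")` tests rely on.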

Common Mocking Mistakes

Over-Mocking

When a test has more mock setup than actual assertions, it becomes fragile and meaningless.

from unittest.mock import MagicMock

# Bad: Over-mocked - testing nothing real
def test_process_order_over_mocked():
    mock_db = MagicMock()
    mock_email = MagicMock()
    mock_logger = MagicMock()
    mock_inventory = MagicMock()
    mock_analytics = MagicMock()

    # Note: MagicMock(name="Alice") would name the mock itself rather than
    # set a .name attribute, so the attribute is assigned separately.
    mock_user = MagicMock()
    mock_user.name = "Alice"
    mock_db.get_user.return_value = mock_user
    mock_db.get_product.return_value = MagicMock(price=10)
    mock_inventory.check_stock.return_value = True

    service = OrderService(mock_db, mock_email, mock_logger, mock_inventory, mock_analytics)
    service.process_order(user_id=1, product_id=2)

    # We are just verifying that mocks were called - not testing real behavior
    mock_db.get_user.assert_called_once()
    mock_email.send.assert_called_once()

# Better: Use real objects where practical, mock only external I/O
# (fake_db, sample_user, and sample_product are assumed to be pytest fixtures)
def test_process_order(fake_db, sample_user, sample_product):
    mock_email = MagicMock()
    fake_db.insert_user(sample_user)
    fake_db.insert_product(sample_product)

    service = OrderService(fake_db, mock_email)
    order = service.process_order(user_id=sample_user.id, product_id=sample_product.id)

    # Assert real behavior
    assert order.total == sample_product.price
    assert order.status == "confirmed"

    # Only mock the truly external dependency
    mock_email.send.assert_called_once_with(
        to=sample_user.email,
        subject="Order Confirmation"
    )

Mocking What You Do Not Own

Do not mock third-party library interfaces directly. Instead, create your own wrapper (adapter) and mock that.

from unittest.mock import MagicMock, patch

import requests

# Bad: Mocking requests directly couples your test to the library's API
@patch("requests.get")
def test_get_weather(mock_get):
    mock_get.return_value.json.return_value = {"temp": 72}
    ...

# Better: Create an adapter and mock your own interface
class WeatherClient:
    def get_temperature(self, city: str) -> float:
        response = requests.get(f"https://api.weather.com/{city}")
        return response.json()["temp"]

# Now mock your own WeatherClient interface
def test_trip_planner():
    mock_client = MagicMock(spec=WeatherClient)
    mock_client.get_temperature.return_value = 72.0
    planner = TripPlanner(weather_client=mock_client)
    ...
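A self-contained sketch of the adapter approach follows, with the HTTP call left out so it runs without any third-party package. The `TripPlanner` class and its 60-degree threshold are assumptions made up for the example. Only `MagicMock(spec=...)` is real library behavior: it restricts the mock to attributes that exist on `WeatherClient`.

```python
from unittest.mock import MagicMock


class WeatherClient:
    """Adapter you own; the real version would wrap the HTTP library call."""

    def get_temperature(self, city: str) -> float:
        raise NotImplementedError("network call omitted in this sketch")


class TripPlanner:
    """Hypothetical consumer; the threshold is invented for the example."""

    def __init__(self, weather_client: WeatherClient):
        self._weather_client = weather_client

    def suggest_activity(self, city: str) -> str:
        temperature = self._weather_client.get_temperature(city)
        return "hiking" if temperature >= 60 else "museum"


def test_trip_planner_suggests_hiking_when_warm():
    mock_client = MagicMock(spec=WeatherClient)
    mock_client.get_temperature.return_value = 72.0

    planner = TripPlanner(weather_client=mock_client)

    assert planner.suggest_activity("Portland") == "hiking"
    mock_client.get_temperature.assert_called_once_with("Portland")


def test_spec_rejects_methods_the_adapter_lacks():
    mock_client = MagicMock(spec=WeatherClient)
    try:
        mock_client.get_forecast  # not defined on WeatherClient
    except AttributeError:
        pass
    else:
        raise AssertionError("spec should reject unknown attributes")
```

If the HTTP library is later swapped out, only `WeatherClient` changes. Tests written against the adapter's interface stay as they are.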
