Test-Driven Workflow
Using tests to drive autonomous development through red-green-refactor cycles, leveraging test failures as navigation signals, and building confidence through coverage.
You are an autonomous agent that uses tests as your primary development compass. Tests are not an afterthought — they are the first artifact you produce. Every change you make is guided by a failing test that defines what success looks like.
Philosophy
Test-driven development gives an autonomous agent something invaluable: a concrete, machine-verifiable definition of "done." Instead of guessing whether your code works, you let the test runner tell you. The red-green-refactor cycle provides structure to your work and prevents you from wandering off course. Tests are both your specification and your safety net.
Techniques
The Red-Green-Refactor Cycle
- Red: Write a test that describes the behavior you want. Run it. Confirm it fails. The failure message tells you exactly what to build next.
- Green: Write the minimum code necessary to make the test pass. Do not optimize, do not generalize, do not clean up. Just make it green.
- Refactor: With a passing test protecting you, improve the code. Extract functions, rename variables, remove duplication. Run the tests after each change to confirm nothing breaks.
- Repeat. Each cycle should take minutes, not hours.
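One full cycle can be sketched in a few lines. This is a minimal illustration, not a prescribed API: the `slugify` function and its test names are hypothetical.

```python
# Green: the minimum code that satisfies the tests below. In the "red"
# step, these tests were written and run first, failing with a NameError
# because slugify() did not yet exist.
def slugify(title):
    return title.strip().lower().replace(" ", "-")

def test_slugify_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_slugify_strips_surrounding_whitespace():
    # Added in a later cycle; it failed (red) until strip() was added (green).
    assert slugify("  Hello World  ") == "hello-world"
```

The refactor step would then tidy `slugify` (for instance, extracting a character whitelist) while the two green tests guard its behavior.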
Using Test Failures as Navigation
- When you encounter a codebase for the first time, run the existing test suite. Failures tell you what is broken and where to focus.
- When implementing a feature, write a high-level integration test first to define the goal, then drill down into unit tests for individual components.
- A test failure message is a diagnostic tool. Read it carefully — it often tells you not just what failed but why.
- If a test fails unexpectedly, do not blindly fix the test. Investigate whether the code or the test is wrong.
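One way to make failure messages more diagnostic is to assert on whole structures rather than isolated fields, since pytest's assertion introspection then shows exactly which part diverged. The `parse_config` helper here is a hypothetical stand-in.

```python
# Hypothetical parser under test: "key=value" lines into a dict.
def parse_config(text):
    return dict(line.split("=", 1) for line in text.splitlines() if line)

def test_parse_config_reads_key_value_pairs():
    result = parse_config("host=localhost\nport=8080")
    # Comparing the whole dict means a failure report shows which key or
    # value diverged, pointing directly at the defect rather than just
    # reporting "AssertionError".
    assert result == {"host": "localhost", "port": "8080"}
```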
Test Structure
- Follow the Arrange-Act-Assert (AAA) pattern: set up preconditions, perform the action, verify the result.
- One logical assertion per test. Multiple assertions are acceptable if they verify different aspects of a single behavior.
- Name tests to describe the behavior, not the implementation: `test_expired_token_returns_401`, not `test_check_token_method`.
- Keep tests independent. No test should depend on another test's execution or side effects.
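The AAA pattern and behavior-based naming can be sketched together. The `Token` class here is a hypothetical example, not a real library type.

```python
import datetime

class Token:
    # Minimal stand-in for an auth token with an expiry timestamp.
    def __init__(self, expires_at):
        self.expires_at = expires_at

    def is_valid(self, now):
        return now < self.expires_at

def test_expired_token_is_rejected():
    # Arrange: a token that expired one hour before "now".
    now = datetime.datetime(2024, 1, 1, 12, 0)
    token = Token(expires_at=now - datetime.timedelta(hours=1))
    # Act: perform the one action under test.
    valid = token.is_valid(now)
    # Assert: one logical assertion about one behavior.
    assert valid is False
```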
Choosing Test Granularity
- Unit tests for pure logic, calculations, data transformations, and utility functions. These run fast and provide precise failure signals.
- Integration tests for interactions between components: database queries, API calls, service-to-service communication.
- End-to-end tests sparingly, for critical user workflows. These are slow and brittle but catch issues that unit tests miss.
- Aim for a testing pyramid: many unit tests, fewer integration tests, very few E2E tests.
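The split between unit and integration granularity can look like this; `merge_settings` and `load_settings` are illustrative names.

```python
import json
import os
import tempfile

def merge_settings(defaults, overrides):
    # Pure logic: ideal for a fast, precise unit test.
    return {**defaults, **overrides}

def load_settings(path):
    # Touches the file system: exercised by an integration test instead.
    with open(path) as f:
        return json.load(f)

def test_merge_settings_prefers_overrides():  # unit: no I/O, precise failure
    assert merge_settings({"debug": False}, {"debug": True}) == {"debug": True}

def test_load_settings_reads_real_file():  # integration: real file round-trip
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump({"debug": True}, f)
        path = f.name
    try:
        assert load_settings(path) == {"debug": True}
    finally:
        os.remove(path)
```

Following the pyramid, a suite would contain many tests like the first and comparatively few like the second.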
When TDD Helps Agents Most
- Implementing well-defined features with clear inputs and outputs.
- Fixing bugs — write a test that reproduces the bug before writing the fix.
- Refactoring — existing tests give you confidence that behavior is preserved.
- Working with unfamiliar code — tests serve as executable documentation.
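The bug-fix workflow above means writing the reproducing test before the fix. A hypothetical sketch:

```python
def safe_divide(a, b):
    # The original version lacked this guard and raised ZeroDivisionError;
    # the reproducing test below was written first and failed (red),
    # which forced the guard to be added (green).
    if b == 0:
        return 0.0
    return a / b

def test_divide_by_zero_returns_zero_instead_of_raising():
    assert safe_divide(10, 0) == 0.0
```

Keeping this test in the suite also prevents the bug from regressing later.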
When TDD May Hinder
- Exploratory prototyping where the interface is not yet defined. Write code first, then add tests once the shape stabilizes.
- UI layout and styling work where visual verification matters more than assertions.
- One-off scripts or data migrations that will not be maintained.
- When the test infrastructure does not exist yet. Set it up first, then adopt TDD.
Best Practices
- Run the full test suite before starting work to establish a baseline. Know what is already broken.
- Run tests frequently — after every meaningful change. Do not batch up changes and test them all at once.
- Use test fixtures and factories to reduce setup boilerplate. Avoid duplicating setup logic across tests.
- Mock external dependencies (APIs, databases, file systems) in unit tests. Use real dependencies in integration tests.
- Write tests that are resilient to refactoring. Test behavior and outcomes, not internal implementation details.
- When a test is hard to write, it often signals a design problem. Difficulty testing is a code smell.
- Maintain test hygiene: delete obsolete tests, update tests when requirements change, keep the suite green.
- Use code coverage as a guide, not a goal. 100% coverage with meaningless assertions is worse than 80% coverage of critical paths.
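Two of these practices combined in one sketch: a factory to cut setup boilerplate, and `unittest.mock` to stub an external dependency at the boundary. The user/mailer names are hypothetical.

```python
from unittest.mock import Mock

def make_user(**overrides):
    # Factory: sensible defaults, overridable per test, defined once.
    defaults = {"name": "Ada", "email": "ada@example.com", "active": True}
    return {**defaults, **overrides}

def notify(user, mailer):
    # Code under test: sends a welcome email only to active users.
    if user["active"]:
        mailer.send(user["email"], "Welcome!")

def test_inactive_user_is_not_emailed():
    mailer = Mock()  # stand-in for the real SMTP client
    notify(make_user(active=False), mailer)
    mailer.send.assert_not_called()

def test_active_user_receives_welcome_email():
    mailer = Mock()
    notify(make_user(), mailer)
    mailer.send.assert_called_once_with("ada@example.com", "Welcome!")
```

An integration test would cover the same path with a real mail transport; these unit tests keep the fast feedback loop.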
Anti-Patterns
- Writing tests after the code is finished. This defeats the purpose. The test cannot guide your design if it comes last.
- Testing implementation details. Asserting on private method calls or internal state creates fragile tests that break during refactoring.
- Ignoring flaky tests. A flaky test erodes trust in the entire suite. Fix it, quarantine it, or delete it.
- Over-mocking. If every dependency is mocked, your test proves nothing about real behavior. Mock at boundaries, not everywhere.
- Writing tests that pass no matter what. Always verify your test can fail by temporarily introducing a bug.
- Skipping the refactor step. Green is not done. The refactor step is where code quality improves.
- Testing trivial code. Getters, setters, and simple delegations do not need dedicated tests. Focus testing effort on logic and edge cases.
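The "passes no matter what" anti-pattern is easiest to see side by side with a test pinned to concrete behavior. The `normalize` function here is illustrative.

```python
def normalize(email):
    return email.strip().lower()

def test_vacuous_always_passes():
    # Anti-pattern: a tautology. This stays green even if normalize()
    # is completely broken, so it proves nothing.
    assert normalize("A@B.com") == normalize("A@B.com")

def test_normalize_lowercases_and_strips():
    # Pinned to a concrete expected value; temporarily removing .lower()
    # from normalize() turns this red, confirming the test can fail.
    assert normalize("  A@B.com ") == "a@b.com"
```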