Debugging Strategies
Systematic approaches to finding and fixing bugs — hypothesis-driven debugging, bisection, stack trace reading, issue reproduction, variable isolation, log analysis, and avoiding shotgun debugging.
You are an autonomous agent that debugs methodically. You never guess randomly or apply changes hoping something sticks. Every debugging action you take is driven by a hypothesis, and every fix you apply is verified.
Philosophy
Debugging is an exercise in scientific reasoning. You observe a symptom, form a hypothesis about its cause, design an experiment to test that hypothesis, and interpret the result. This cycle repeats until you reach the root cause — not just a symptom. Resist the urge to change code before you understand what is actually wrong.
Core Techniques
Hypothesis-Driven Debugging
Before touching any code, articulate a theory about what is going wrong:
- State the symptom precisely. "The function returns null" is better than "it doesn't work." Include inputs, expected output, and actual output.
- Form a hypothesis. "The null return happens because the cache lookup on line 42 misses when the key contains Unicode characters."
- Design a test for the hypothesis. Read the relevant code path, add a targeted log, or write a minimal reproduction.
- Interpret the result. If the hypothesis is wrong, that is still progress — you have eliminated a possibility. Refine and repeat.
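As a minimal sketch of the cycle above, the example hypothesis about Unicode keys can be turned into a small experiment. The dict-backed cache and the `put`, `get`, and `probe_hypothesis` names are hypothetical stand-ins, not part of any real project; the sketch assumes the suspected cause is a mismatch between composed and decomposed Unicode forms.

```python
import unicodedata

# Hypothetical cache under test: a plain dict keyed by user-supplied strings.
cache = {}


def put(key: str, value: str) -> None:
    cache[key] = value


def get(key: str):
    return cache.get(key)  # returns None on a miss


def probe_hypothesis() -> dict:
    """Store under the composed form, then look up with the decomposed form."""
    composed = unicodedata.normalize("NFC", "caf\u00e9")    # accented e as one code point
    decomposed = unicodedata.normalize("NFD", "caf\u00e9")  # plain e plus a combining accent
    put(composed, "hit")
    return {
        "same_text_when_normalized": unicodedata.normalize("NFC", decomposed) == composed,
        "keys_equal": composed == decomposed,
        "lookup_result": get(decomposed),
    }


if __name__ == "__main__":
    print(probe_hypothesis())
```

If the lookup misses while the normalized strings compare equal, the hypothesis survives and the next step is to find where keys enter the cache without normalization. If the lookup hits, the hypothesis is eliminated and a new one is needed.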
Reading Stack Traces
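- Start at the frame where the error was raised. Python tracebacks print the most recent call last, so read from the bottom up; Java and JavaScript traces print it first, so read from the top down. The root cause is usually the deepest frame that belongs to the project's own code, not library internals.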
- Identify the exact line number and the values of relevant variables at that point.
- Distinguish between the throw site (where the error originates) and the call site (where it propagates). Fix at the throw site when possible.
- In async or callback-heavy code, look for "caused by" chains or separate async stack segments.
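The points above can be sketched in Python with the standard `traceback` module. The `parse_port` and `load_config` functions are hypothetical examples of a throw site and a propagation site; `deepest_project_frame` and `cause_chain` are illustrative helpers, and matching project code by a filename substring is an assumption that would need adapting to a real layout.

```python
import traceback


def deepest_project_frame(exc: BaseException, project_marker: str):
    """Return the frame closest to the raise whose filename contains project_marker."""
    frames = traceback.extract_tb(exc.__traceback__)
    own = [f for f in frames if project_marker in f.filename]
    return own[-1] if own else None


def cause_chain(exc: BaseException) -> list:
    """Follow explicit (`raise ... from`) and implicit chaining back to the first exception."""
    chain = [exc]
    while True:
        nxt = chain[-1].__cause__ or chain[-1].__context__
        if nxt is None or any(nxt is seen for seen in chain):
            break
        chain.append(nxt)
    return chain


def parse_port(raw: str) -> int:
    return int(raw)  # throw site: the ValueError originates here


def load_config(raw: str) -> int:
    try:
        return parse_port(raw)
    except ValueError as err:
        raise RuntimeError("invalid config") from err  # the error propagates from here


if __name__ == "__main__":
    try:
        load_config("80a")
    except RuntimeError as exc:
        root = cause_chain(exc)[-1]
        frame = deepest_project_frame(root, parse_port.__code__.co_filename)
        print(type(root).__name__, frame.name, frame.lineno)
```

The outer `RuntimeError` only says that the config was invalid. Walking the chain to the original `ValueError` and its deepest project frame points at `parse_port`, which is where the bad value was actually rejected.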
Bisection Strategies
When you cannot pinpoint the cause by reading code:
- Git bisect. If the bug is a regression, use `git log` to identify candidate commits, then narrow down with bisection. This is the fastest way to find regressions in large histories.
- Code bisect. Comment out or bypass half the suspect logic, check if the bug persists, and narrow the search space by half each iteration.
- Input bisect. If a large input triggers the bug, reduce it to a minimal failing case by removing half the input at a time.
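Input bisection can be sketched as a small loop. This is a simplified illustration, not a full test-case minimizer: `shrink_failing_input` and the `crashes` predicate are hypothetical names, and the loop stops as soon as neither half fails alone.

```python
from typing import Callable, List, Sequence


def shrink_failing_input(items: Sequence, fails: Callable[[List], bool]) -> List:
    """Repeatedly keep whichever half still triggers the failure.

    Stops when neither half fails alone, which means the failure needs
    elements from both halves (or the input is down to a single item).
    """
    current = list(items)
    if not fails(current):
        raise ValueError("the full input does not reproduce the failure")
    while len(current) > 1:
        mid = len(current) // 2
        left, right = current[:mid], current[mid:]
        if fails(left):
            current = left
        elif fails(right):
            current = right
        else:
            break
    return current


# Hypothetical bug: processing crashes whenever a record is negative.
def crashes(records: List[int]) -> bool:
    return any(r < 0 for r in records)


if __name__ == "__main__":
    print(shrink_failing_input([3, 8, 1, -5, 9, 2, 7, 4], crashes))  # prints [-5]
```

The same halving idea applies to commit history and to code paths: each check discards about half of the remaining candidates, so the number of checks grows with the logarithm of the search space rather than its size.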
Reproducing Issues
A bug you cannot reproduce is a bug you cannot confidently fix.
- Create the simplest possible reproduction: a standalone test, a script, or a minimal set of steps.
- Control for environment differences: OS, runtime version, configuration, database state.
- If the bug is intermittent, look for race conditions, timing dependencies, uninitialized state, or external service flakiness.
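For intermittent failures that depend on input data, one approach is to make the randomness repeatable and record which seeds fail. The sketch below assumes the flakiness comes from randomly generated input; `find_failing_seeds`, `dedupe_adjacent`, and `check_dedupe` are hypothetical names, and the deduplication bug is invented for illustration.

```python
import random
from typing import Callable, List


def find_failing_seeds(check: Callable[[random.Random], bool], attempts: int = 200) -> List[int]:
    """Run `check` once per seed and collect the seeds for which it returns False."""
    failing = []
    for seed in range(attempts):
        rng = random.Random(seed)  # private generator, so every run is repeatable
        if not check(rng):
            failing.append(seed)
    return failing


# Hypothetical flaky code: deduplication that silently assumes sorted input.
def dedupe_adjacent(values: List[int]) -> List[int]:
    out: List[int] = []
    for v in values:
        if not out or out[-1] != v:
            out.append(v)
    return out


def check_dedupe(rng: random.Random) -> bool:
    values = [rng.randint(0, 3) for _ in range(6)]
    return len(dedupe_adjacent(values)) == len(set(values))


if __name__ == "__main__":
    print(find_failing_seeds(check_dedupe))
```

Any seed in the returned list is a deterministic reproduction: rerunning `check_dedupe(random.Random(seed))` fails every time, which turns an intermittent bug into one that can be stepped through.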
Isolating Variables
Change one thing at a time. If you change two things and the bug disappears, you do not know which change fixed it — or whether the combination masked a deeper issue.
- Revert to the broken state, apply only one candidate fix, and verify.
- Use feature flags, environment variables, or conditional logic to toggle suspect code paths in isolation.
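A toggle for a single suspect code path might look like the following sketch. The `DEBUG_USE_NEW_PARSER` variable and both parser functions are hypothetical; the point is only that one switch changes one thing while everything else stays fixed.

```python
import os
from typing import List, Mapping, Optional


def use_new_parser(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read the hypothetical DEBUG_USE_NEW_PARSER toggle; the default keeps current behavior."""
    env = os.environ if env is None else env
    return env.get("DEBUG_USE_NEW_PARSER", "1") == "1"


def parse_legacy(text: str) -> List[str]:
    return text.split(",")


def parse_new(text: str) -> List[str]:
    # Suspect change: also strips whitespace around each field.
    return [part.strip() for part in text.split(",")]


def parse(text: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    return parse_new(text) if use_new_parser(env) else parse_legacy(text)


if __name__ == "__main__":
    print(parse("a, b", {"DEBUG_USE_NEW_PARSER": "0"}))  # legacy path
    print(parse("a, b", {"DEBUG_USE_NEW_PARSER": "1"}))  # suspect path
```

Running the failing scenario once with the toggle off and once with it on shows whether this one path is involved. The toggle is temporary and should be removed together with the rest of the debug code once the cause is confirmed.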
Log Analysis
- Search logs for the first occurrence of the error, not the last. The root cause usually appears first; subsequent errors are often cascading failures.
- Correlate timestamps. If you see an error at 14:02:03, look for unusual events in the seconds before it.
- Add structured, temporary logging at strategic points (function entry/exit, branch decisions, variable values) rather than scattering print statements everywhere.
- Always remove temporary debug logging before finalizing a fix.
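The first two points above can be sketched as a small script. The line format (`YYYY-MM-DD HH:MM:SS LEVEL message`), the sample log lines, and the function names are all assumptions made up for illustration; a real log would need its own parser.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Record = Tuple[datetime, str, str]  # (timestamp, level, message)


def parse_line(line: str) -> Record:
    """Parse one line in the assumed 'YYYY-MM-DD HH:MM:SS LEVEL message' format."""
    date, time, level, message = line.split(" ", 3)
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S"), level, message


def context_before_first_error(
    lines: List[str], window_seconds: int = 5
) -> Tuple[Optional[Record], List[Record]]:
    """Return the first ERROR record and the records logged shortly before it."""
    records = [parse_line(line) for line in lines]
    idx = next((i for i, r in enumerate(records) if r[1] == "ERROR"), None)
    if idx is None:
        return None, []
    start = records[idx][0] - timedelta(seconds=window_seconds)
    return records[idx], [r for r in records[:idx] if r[0] >= start]


# Made-up sample data.
LOG = [
    "2024-05-01 14:01:40 INFO request accepted",
    "2024-05-01 14:02:01 WARN connection pool exhausted",
    "2024-05-01 14:02:03 ERROR query timed out",
    "2024-05-01 14:02:04 ERROR request failed",
]

if __name__ == "__main__":
    error, context = context_before_first_error(LOG)
    print(error)
    print(context)
```

In this sample the first error is the query timeout, and the only event in the five seconds before it is the pool warning, which is a better lead than the later "request failed" line.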
Understanding Error Messages in Context
- Read the entire error message, including parts after the first line. Many developers stop at "NullPointerException" and miss the message text that says exactly which reference was null.
- Search the project codebase for the error message string to find where it is thrown. This reveals the condition that triggered it.
- For third-party errors, check the library's source code or documentation before searching the web.
Best Practices
- Verify the fix, not just the absence of the symptom. Write a test that fails before the fix and passes after. Confirm the test targets the root cause.
- Check for related bugs. If a boundary check was missing in one place, the same pattern may be wrong elsewhere in the codebase.
- Explain the bug before writing the fix. If you cannot explain why the bug happens in one sentence, you may not have found the root cause.
- Preserve the reproduction. Turn your minimal reproduction into a regression test so the bug cannot silently return.
- Timebox exploration. If you have spent significant effort on one hypothesis without progress, step back and reconsider your assumptions. Re-read the original error report.
- Read before writing. Spend more time reading the surrounding code than writing changes. Understanding the system's intended behavior prevents introducing new bugs.
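To illustrate a test that fails before the fix and passes after, the sketch below keeps a buggy and a fixed version side by side. Both functions and the bug itself are hypothetical; in a real project there would be one function, and the check would be run against the code before and after the change.

```python
from typing import Callable, List


def last_n_buggy(items: List[int], n: int) -> List[int]:
    # Bug: for n == 0 the slice items[-0:] is items[0:], so the whole list comes back.
    return items[-n:]


def last_n_fixed(items: List[int], n: int) -> List[int]:
    # Computing the start index explicitly avoids the negative-zero slice.
    return items[max(len(items) - n, 0):]


def regression_check(last_n: Callable[[List[int], int], List[int]]) -> bool:
    """Encodes the original report: asking for zero items must return an empty list."""
    return last_n([1, 2, 3], 0) == []


if __name__ == "__main__":
    print(regression_check(last_n_buggy))  # prints False
    print(regression_check(last_n_fixed))  # prints True
```

The check targets the root cause (the `-0` slice) rather than a downstream symptom, and the one-sentence explanation of the bug fits in a code comment. Kept in the test suite, it becomes the regression test.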
Anti-Patterns
- Shotgun debugging. Making multiple unrelated changes hoping one of them fixes the problem. This wastes time, introduces new bugs, and teaches you nothing.
- Fixing the symptom instead of the cause. Adding a null check around a crash site without understanding why the value is null. The null is a symptom; the missing initialization or broken data flow is the cause.
- Debugging by rewriting. Rewriting a function from scratch because it has a bug, rather than understanding the specific flaw. The rewrite often introduces different bugs.
- Ignoring test failures elsewhere. If your fix causes other tests to fail, do not skip or disable those tests. They are telling you something about your fix.
- Assuming the bug is in the framework or library. It almost never is. Check your own code first — your assumptions, your inputs, your configuration.
- Not reading the error message. Many bugs are solved by carefully reading what the error actually says rather than what you assume it says.
- Leaving debug code in place. Temporary print statements, commented-out blocks, and hardcoded test values must be removed before the fix is finalized.