State Machine Audit

## Quick Summary
Verify that workflow states are explicitly defined, transitions are validated, impossible states are prevented, and the system never gets stuck in an unrecoverable state. State management bugs are insidious: they cause jobs to hang forever, UIs to show contradictory information, and operators to resort to manual database edits.

## Key Points

Audit that states are explicitly defined:

1. Search the codebase for all status/state assignments and comparisons.
2. Extract the unique values.
3. Compare them against the documented state list.

- [ ] Every status value found in code exists in the enum definition.
- [ ] No "magic strings" used for status (all reference the enum).
- [ ] No unused states in the enum (dead code).
- [ ] State enum is the single source of truth.
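
The inventory steps above can be roughed out as a script. This is a minimal sketch, not a complete scanner: the `JobState` enum, the regular expressions, and the `audit_status_usage` helper are all hypothetical names chosen for illustration, and a regex pass will miss status values built dynamically or stored in configuration.

```python
import re
from enum import Enum


class JobState(str, Enum):
    """Hypothetical enum acting as the single source of truth for status values."""
    QUEUED = "queued"
    PROCESSING = "processing"
    RETRYING = "retrying"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# String literals assigned to or compared with `status` / `state`,
# e.g. `job.status = "done"` or `if state == 'queued'`.
STATUS_LITERAL = re.compile(r"""\b(?:status|state)\s*(?:==|!=|=)\s*["']([a-z_]+)["']""")
# References that go through the enum, e.g. `JobState.QUEUED`.
ENUM_REF = re.compile(r"\bJobState\.([A-Z_]+)")


def audit_status_usage(source: str) -> dict[str, set[str]]:
    """Compare status values used in `source` against the JobState enum."""
    literals = set(STATUS_LITERAL.findall(source))
    referenced = {
        JobState[name].value
        for name in ENUM_REF.findall(source)
        if name in JobState.__members__
    }
    declared = {state.value for state in JobState}
    return {
        "magic_strings": literals,                     # any raw literal bypasses the enum
        "undeclared": literals - declared,             # values missing from the enum
        "unused": declared - literals - referenced,    # enum members never seen in the scan
    }


sample = '''
job.status = JobState.QUEUED
if job.status == "processing": ...
job.status = "done"
'''
report = audit_status_usage(sample)
```

On the sample text, `report["undeclared"]` flags `"done"` as a value that exists in code but not in the enum, and `report["magic_strings"]` lists every raw literal that should be replaced with an enum reference.
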

Audit that no state is a dead end:

1. For each non-terminal state, trace all possible paths forward.
2. Verify each path reaches a terminal state within finite steps.
3. Verify timeout/fallback mechanisms for states that depend on external input.

- [ ] From "queued": can reach completed, failed, or cancelled.
- [ ] From "processing": can reach completed, failed, cancelled, or retrying (which loops back).
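
Path tracing like this can be automated once the transitions are written down as a graph. The sketch below assumes a transition table: only the state names come from the checklist, while the individual edges (for example `queued -> processing` and `retrying -> processing`) are illustrative guesses, and `reachable_terminals` and `stuck_states` are hypothetical helper names.

```python
from collections import deque

# Assumed transition table, for illustration only.
TRANSITIONS = {
    "queued": {"processing", "cancelled"},
    "processing": {"completed", "failed", "cancelled", "retrying"},
    "retrying": {"processing"},  # loops back
    "completed": set(),
    "failed": set(),
    "cancelled": set(),
}
TERMINAL = {"completed", "failed", "cancelled"}


def reachable_terminals(start, transitions=TRANSITIONS, terminal=TERMINAL):
    """Breadth-first search: which terminal states can `start` eventually reach?"""
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen & terminal


def stuck_states(transitions=TRANSITIONS, terminal=TERMINAL):
    """Non-terminal states from which no terminal state is reachable."""
    return {
        state
        for state in transitions
        if state not in terminal
        and not reachable_terminals(state, transitions, terminal)
    }
```

An empty result from `stuck_states()` only shows that an exit path exists in the graph. It does not show that a job takes it within finite steps: a loop such as `processing -> retrying -> processing` still needs some bound (for instance a retry limit), and states waiting on external input still need the timeout or fallback from step 3.
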

## Quick Example

```
[ ] Transition function validates current state before applying new state
[ ] Invalid transitions throw errors (not silently succeed)
[ ] State changes are atomic (DB transaction)
[ ] State change includes timestamp (state_changed_at or updated_at)
[ ] State change is logged (transition log table or audit log)
```
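
One way to satisfy the checklist above is a single transition function that every state change must go through. The following is a sketch under assumptions, using SQLite from Python's standard library: the `jobs` and `job_transitions` tables, the `ALLOWED` table, and the `InvalidTransition` exception are hypothetical names, not part of any particular system.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed transition table, for illustration only.
ALLOWED = {
    "queued": {"processing", "cancelled"},
    "processing": {"completed", "failed", "cancelled", "retrying"},
    "retrying": {"processing"},
}


class InvalidTransition(Exception):
    """Raised instead of silently ignoring a disallowed state change."""


def transition(conn: sqlite3.Connection, job_id: int, new_state: str) -> None:
    """Move a job to `new_state`, or raise InvalidTransition."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits the writes on success, rolls them back on exception
        row = conn.execute("SELECT state FROM jobs WHERE id = ?", (job_id,)).fetchone()
        if row is None:
            raise InvalidTransition(f"job {job_id} does not exist")
        current = row[0]
        if new_state not in ALLOWED.get(current, set()):
            raise InvalidTransition(f"{current} -> {new_state} is not allowed")
        # Compare-and-set: the WHERE clause re-checks the state we validated against.
        updated = conn.execute(
            "UPDATE jobs SET state = ?, state_changed_at = ? WHERE id = ? AND state = ?",
            (new_state, now, job_id, current),
        ).rowcount
        if updated != 1:
            raise InvalidTransition(f"job {job_id} changed state concurrently")
        # Transition log row, written in the same transaction as the update.
        conn.execute(
            "INSERT INTO job_transitions (job_id, from_state, to_state, at) VALUES (?, ?, ?, ?)",
            (job_id, current, new_state, now),
        )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, state TEXT NOT NULL, state_changed_at TEXT);
    CREATE TABLE job_transitions (job_id INTEGER, from_state TEXT, to_state TEXT, at TEXT);
    INSERT INTO jobs (id, state) VALUES (1, 'queued');
""")
transition(conn, 1, "processing")
```

Calling `transition(conn, 1, "queued")` afterwards raises `InvalidTransition`, because `processing -> queued` is not in the assumed table. Putting the state check in the `UPDATE ... WHERE` clause is a design choice: it keeps the validation and the write from drifting apart when two workers touch the same job.
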

```
[ ] Database constraints enforce impossible-state rules
    - CHECK constraints on state + required fields
    - Unique partial indexes (e.g., one active job per entity)
[ ] Application-level validation before state change
[ ] Post-transition assertions (verify invariants after every state change)
```
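
The database-level items above might look like the following in SQLite, driven from Python so the constraints can be exercised directly. This is a sketch under assumptions: the `jobs` schema, the choice of which fields each state requires, the set of "active" states, and the `violates_constraints` helper are all illustrative. Other databases support similar CHECK constraints and partial unique indexes with slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        entity_id INTEGER NOT NULL,
        state TEXT NOT NULL
            CHECK (state IN ('queued', 'processing', 'retrying',
                             'completed', 'failed', 'cancelled')),
        completed_at TEXT,
        error TEXT,
        -- State + required fields: completed jobs need a timestamp,
        -- failed jobs need an error message.
        CHECK (state != 'completed' OR completed_at IS NOT NULL),
        CHECK (state != 'failed' OR error IS NOT NULL)
    );
    -- Unique partial index: at most one active job per entity.
    CREATE UNIQUE INDEX one_active_job_per_entity
        ON jobs (entity_id)
        WHERE state IN ('queued', 'processing', 'retrying');
""")


def violates_constraints(sql: str, params: tuple = ()) -> bool:
    """Return True if the database rejects the statement with a constraint error."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True
```

With this schema, inserting a second `queued` or `processing` job for the same `entity_id`, a `completed` job without `completed_at`, or a job with a state outside the enum list is rejected by the database itself, regardless of what the application code validated.
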