
# Cost Explosion Audit

Verify that bugs, retries, and design flaws cannot create runaway spend. In systems that call paid external APIs (AI providers, cloud storage, SMS, email), a single bug can turn a $10 operation into a $10,000 incident. This audit identifies every path where costs can explode and verifies that safeguards exist.

## Key Points

Test: run the same pipeline twice.

1. Run a generation pipeline on a project (e.g., 10 assets).
2. Record the number of external API calls, total tokens/compute units, and estimated cost.
3. Run the exact same pipeline again (same project, same parameters).
4. Record the number of additional external API calls.

Pass if any one of the following holds:

- [ ] The second run makes zero additional API calls (results are cached or reused).
- [ ] The second run is blocked with a prompt ("Assets already generated. Regenerate?").
- [ ] The second run costs less than 10% of the first (only changed items are regenerated).

Fail if any of the following holds:

- The second run makes the same number of API calls as the first.
- No caching mechanism exists.
- The user is not warned about the cost of regeneration.
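
The duplicate-run check can be sketched as a small harness. This is a minimal illustration rather than the audited system's real code: `FakeProvider`, `generate_assets`, and the in-memory dict cache are hypothetical stand-ins, and the cache key (project ID, asset name, parameters) is an assumption about what makes two requests identical.

```python
import hashlib
import json


class FakeProvider:
    """Hypothetical stand-in for a paid external API; counts every call."""

    def __init__(self):
        self.calls = 0

    def generate(self, prompt):
        self.calls += 1
        return f"asset for {prompt}"


def cache_key(project_id, asset_name, params):
    # Assumption: same project, asset, and parameters mean the result is reusable.
    payload = json.dumps([project_id, asset_name, params], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def generate_assets(provider, cache, project_id, asset_names, params):
    """Generate each asset, calling the provider only on a cache miss."""
    results = {}
    for name in asset_names:
        key = cache_key(project_id, name, params)
        if key not in cache:
            cache[key] = provider.generate(name)
        results[name] = cache[key]
    return results


provider = FakeProvider()
cache = {}
assets = [f"asset-{i}" for i in range(10)]
params = {"style": "flat"}

# First run: every asset is a cache miss.
generate_assets(provider, cache, "project-1", assets, params)
first_run_calls = provider.calls

# Second run with the same project and parameters.
generate_assets(provider, cache, "project-1", assets, params)
second_run_calls = provider.calls - first_run_calls

print(first_run_calls, second_run_calls)
```

In a real audit the call counter would wrap the actual provider client (or come from provider-side usage logs), and the cache would be whatever persistent store the pipeline already uses.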

Test: persistent external API failure.

1. Simulate a persistent external API failure (mock 500 or timeout).
2. Trigger a pipeline that calls this API.
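
One way to set up these two steps is sketched below. The summary lists only the first two steps, so what to measure afterwards is an assumption here: the sketch counts how many paid calls were attempted before the caller gave up. `AlwaysFailingAPI` and `call_with_retry_cap` are hypothetical names, and the cap of 3 attempts is an arbitrary illustrative value.

```python
class AlwaysFailingAPI:
    """Hypothetical mock of an external API that fails on every call."""

    def __init__(self):
        self.calls = 0

    def request(self):
        self.calls += 1
        raise RuntimeError("500 Internal Server Error")


def call_with_retry_cap(api, max_attempts=3):
    """Try the call at most max_attempts times, then re-raise the last error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return api.request()
        except RuntimeError as err:
            last_error = err
    raise last_error


api = AlwaysFailingAPI()
try:
    call_with_retry_cap(api, max_attempts=3)
    gave_up = False
except RuntimeError:
    gave_up = True

# The number of attempts is bounded by the cap, not by how long the outage lasts.
print(api.calls, gave_up)
```

A production wrapper would normally also wait between attempts (for example with exponential backoff); the delay is left out here so the check runs instantly.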

## Quick Example

```
| Run | API Calls | Tokens Used | Estimated Cost | New Assets | From Cache |
|-----|-----------|-------------|----------------|------------|------------|
| 1st | 10        | 15,000      | $0.45          | 10         | 0          |
| 2nd | 0         | 0           | $0.00          | 0          | 10         | <- PASS
| 2nd | 10        | 15,000      | $0.45          | 10         | 0          | <- FAIL
```
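
The PASS/FAIL column in the table above can be derived mechanically from the recorded numbers. The function below is a sketch of that rule using the three pass conditions from the checklist; `second_run_verdict` and its parameter names are hypothetical, and the `blocked` flag assumes the harness can detect that the pipeline refused to rerun.

```python
def second_run_verdict(first_cost, second_cost, second_calls, blocked=False):
    """Classify a repeat run using the three pass conditions from the checklist."""
    if blocked:
        return "PASS"  # the rerun was refused until the user confirms
    if second_calls == 0:
        return "PASS"  # everything was served from cache
    if first_cost > 0 and second_cost < 0.10 * first_cost:
        return "PASS"  # only changed items were regenerated
    return "FAIL"


# The two second-run rows from the table above.
print(second_run_verdict(0.45, 0.00, 0))
print(second_run_verdict(0.45, 0.45, 10))
```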

```
| Resource   | Min Instances | Max Instances | Scale Metric     | Cost at Max | Acceptable?    |
|------------|---------------|---------------|------------------|-------------|----------------|
| API server | 1             | 10            | CPU > 70%        | $X/hour     | [ ] Yes [ ] No |
| Workers    | 1             | 5             | Queue depth > 50 | $X/hour     | [ ] Yes [ ] No |
| Database   | 1             | 1 (fixed)     | N/A              | $X/month    | [ ] Yes [ ] No |
```
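
The "Cost at Max" column is simple arithmetic: maximum instance count times the per-instance price. The sketch below fills it in for the two autoscaled rows. The hourly prices are made-up placeholders, not real provider rates (the table leaves them as `$X`), and `ScalingResource` is a hypothetical helper type. The fixed-size database row is left out because it does not scale.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # approximate average month length in hours


@dataclass
class ScalingResource:
    name: str
    max_instances: int
    hourly_price_per_instance: float  # placeholder value, not a real rate


def cost_at_max_per_hour(resource):
    """Worst-case hourly cost if the resource is pinned at its maximum size."""
    return resource.max_instances * resource.hourly_price_per_instance


resources = [
    ScalingResource("API server", max_instances=10, hourly_price_per_instance=0.10),
    ScalingResource("Workers", max_instances=5, hourly_price_per_instance=0.20),
]

total_per_hour = sum(cost_at_max_per_hour(r) for r in resources)
total_per_month = total_per_hour * HOURS_PER_MONTH

for r in resources:
    print(f"{r.name}: ${cost_at_max_per_hour(r):.2f}/hour at max")
print(f"Worst case if pinned at max for a month: ${total_per_month:.2f}")
```

Comparing the monthly worst case against the budget gives a concrete basis for ticking Yes or No in the "Acceptable?" column.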