Database Query Optimization
Optimize database queries through indexing, query planning, N+1 elimination, connection pooling, and schema design.
You are an expert in database query optimization for application performance.
Overview
Database queries are frequently the slowest component in a request lifecycle. A single unoptimized query can take seconds and consume disproportionate server resources. Query optimization involves analyzing execution plans, designing proper indexes, restructuring queries, and managing connections to achieve consistent sub-100ms response times.
Core Concepts
- Execution plan: The database engine's strategy for resolving a query. Use `EXPLAIN ANALYZE` (PostgreSQL) or `EXPLAIN` (MySQL) to see whether the engine performs sequential scans, index scans, or hash joins.
- Indexing: B-tree indexes reduce lookup time from O(n) to O(log n). A missing index on a filtered or joined column is the most common cause of slow queries.
- N+1 problem: Executing one query to fetch a list, then one query per item to fetch related data. 100 items = 101 queries.
- Connection pooling: Reusing database connections instead of opening a new one per request. Connection creation overhead is typically 20-50ms.
- Query selectivity: The ratio of rows returned to total rows. High selectivity (few rows matched) benefits most from indexing.
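The selectivity idea above can be sketched as a small helper. The 10% threshold below is a rule of thumb (an assumption, not a planner guarantee): the less of the table a predicate matches, the more an index scan is likely to beat a sequential scan.

```python
def selectivity(matched_rows: int, total_rows: int) -> float:
    """Fraction of rows a predicate matches; lower is more selective."""
    if total_rows == 0:
        return 0.0
    return matched_rows / total_rows

def index_likely_helps(matched_rows: int, total_rows: int,
                       threshold: float = 0.1) -> bool:
    # Assumption: a predicate matching under ~10% of rows usually
    # benefits from an index; real planners use cost estimates.
    return selectivity(matched_rows, total_rows) < threshold

# user_id = 42 matches 500 of 10,000,000 rows: highly selective
print(index_likely_helps(500, 10_000_000))        # True
# a status flag matching 60% of rows: an index buys little
print(index_likely_helps(6_000_000, 10_000_000))  # False
```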
Implementation Patterns
Identify Slow Queries with EXPLAIN ANALYZE
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
GROUP BY u.id;
Look for: Seq Scan on large tables, high actual time, and estimated row counts that differ significantly from actual counts.
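The first red flag can be checked mechanically. A minimal sketch, assuming PostgreSQL's default text plan format (the plan below is illustrative; real plans are better consumed as JSON via `EXPLAIN (ANALYZE, FORMAT JSON)`):

```python
def flag_seq_scans(plan_text: str) -> list:
    """Return the plan lines that perform a sequential scan."""
    return [line.strip() for line in plan_text.splitlines()
            if "Seq Scan" in line]

# Hypothetical plan text for the query above
plan = """\
HashAggregate  (cost=.. rows=1200) (actual time=310.2.. rows=1187)
  ->  Hash Left Join  (...)
        ->  Seq Scan on users u  (actual time=0.02..95.1 rows=1187)
        ->  Seq Scan on orders o  (actual time=0.01..180.4 rows=500000)
"""
for line in flag_seq_scans(plan):
    print(line)
```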
Add Targeted Indexes
Before (sequential scan on 10M rows):
SELECT * FROM events WHERE user_id = 42 AND created_at > '2025-06-01';
-- Seq Scan: 3200ms
After (composite index):
CREATE INDEX idx_events_user_created ON events (user_id, created_at);
-- Index Scan: 2ms
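Composite indexes follow a leftmost-prefix rule: the index above serves queries filtering on `user_id` (alone or with `created_at`), but not `created_at` alone. A sketch using SQLite's `EXPLAIN QUERY PLAN` for portability (the same rule applies to PostgreSQL B-tree indexes, though plan output differs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, user_id INTEGER, created_at TEXT)")
conn.execute("CREATE INDEX idx_events_user_created ON events (user_id, created_at)")

def plan(sql: str) -> str:
    # The last column of each EXPLAIN QUERY PLAN row is the detail text
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

# Leading column present: the composite index is usable
print(plan("SELECT * FROM events WHERE user_id = 42 AND created_at > '2025-06-01'"))
# Leading column absent: falls back to a full table scan
print(plan("SELECT * FROM events WHERE created_at > '2025-06-01'"))
```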
Eliminate N+1 Queries
Before:
users = db.query("SELECT * FROM users LIMIT 100")
for user in users:
    orders = db.query("SELECT * FROM orders WHERE user_id = %s", [user.id])
# 101 queries total
After:
users = db.query("SELECT * FROM users LIMIT 100")
user_ids = [u.id for u in users]
orders = db.query("SELECT * FROM orders WHERE user_id = ANY(%s)", [user_ids])
# 2 queries total
from itertools import groupby
orders.sort(key=lambda o: o.user_id)  # groupby only groups consecutive items
orders_by_user = {uid: list(g) for uid, g in groupby(orders, key=lambda o: o.user_id)}
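The grouping step deserves care: `itertools.groupby` only groups consecutive items, so the rows must be sorted by the key first. A single-pass `defaultdict` avoids that requirement entirely. A runnable sketch with stand-in rows (the dict shapes are illustrative, not a specific ORM's):

```python
from collections import defaultdict

# Stand-in data for the rows the two batched queries would return
users = [{"id": 1}, {"id": 2}, {"id": 3}]
orders = [{"id": 10, "user_id": 1}, {"id": 11, "user_id": 3},
          {"id": 12, "user_id": 1}]

orders_by_user = defaultdict(list)
for order in orders:                 # one pass, no sorting required
    orders_by_user[order["user_id"]].append(order)

for user in users:
    # users with no orders get an empty list, not a KeyError
    print(user["id"], [o["id"] for o in orders_by_user.get(user["id"], [])])
```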
Connection Pooling
// Node.js with pg-pool
const { Pool } = require('pg');
const pool = new Pool({
max: 20, // max concurrent connections
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
async function getUser(id) {
const client = await pool.connect();
try {
const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
return result.rows[0];
} finally {
client.release(); // return connection to pool
}
}
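The mechanics the Node.js example relies on (check out, use, return) can be illustrated in a few lines of Python. This is a teaching sketch using SQLite for portability; real applications should use a maintained pool (e.g. psycopg_pool or SQLAlchemy's built-in pooling) rather than hand-rolling one:

```python
import queue
import sqlite3

class SimplePool:
    """Minimal fixed-size connection pool: illustration only."""
    def __init__(self, size: int = 5):
        self._conns = queue.Queue(maxsize=size)
        for _ in range(size):
            self._conns.put(sqlite3.connect(":memory:", check_same_thread=False))

    def acquire(self, timeout: float = 2.0):
        # Blocks until a connection is free, mirroring connectionTimeoutMillis
        return self._conns.get(timeout=timeout)

    def release(self, conn) -> None:
        self._conns.put(conn)  # return to the pool instead of closing

pool = SimplePool(size=2)
conn = pool.acquire()
try:
    print(conn.execute("SELECT 1").fetchone()[0])  # 1
finally:
    pool.release(conn)  # always release, as in the JS finally block
```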
Pagination with Cursor-Based Approach
Before (offset pagination degrades with large offsets):
SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 100000;
-- Scans and discards 100,000 rows
After (cursor-based):
SELECT * FROM products WHERE id > 100000 ORDER BY id LIMIT 20;
-- Seeks directly to the cursor position via index
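The cursor-based query above can be wrapped in a paging function: each call seeks past the last `id` the client saw, so cost stays constant no matter how deep the client pages. A sketch over SQLite with illustrative table and column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO products (id, name) VALUES (?, ?)",
                 [(i, f"product-{i}") for i in range(1, 101)])

def page(after_id: int, limit: int = 20):
    rows = conn.execute(
        "SELECT id, name FROM products WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit)).fetchall()
    # The cursor is simply the last id on the page (None when exhausted)
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor

rows, cursor = page(after_id=0)
print(rows[0], cursor)          # (1, 'product-1') 20
rows, cursor = page(after_id=cursor)
print(rows[0], cursor)          # (21, 'product-21') 40
```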
Measurement & Monitoring
- Slow query log: Enable PostgreSQL's `log_min_duration_statement = 100` to log any query taking over 100ms.
- pg_stat_statements: Tracks cumulative query statistics — total execution time, call count, average duration. Identifies the queries consuming the most total server time.
- Connection pool metrics: Monitor active connections, queue depth, and wait time. If the queue grows, increase pool size or optimize long-running queries.
- Application-level tracing: Use APM tools (Datadog, New Relic) to trace database time as a percentage of total request time.
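The triage pg_stat_statements enables can be sketched in a few lines: rank queries by total time consumed, not just per-call latency, because a fast query called very often can dominate server time. The field names mirror the pg_stat_statements view (`total_exec_time` and `calls` in PostgreSQL 13+); the rows are made up for illustration:

```python
stats = [
    {"query": "SELECT * FROM orders WHERE user_id = $1",
     "calls": 500_000, "total_exec_time": 250_000.0},   # ms
    {"query": "SELECT * FROM reports WHERE id = $1",
     "calls": 10, "total_exec_time": 30_000.0},
]

# Sort by cumulative time: the 0.5ms/call query tops the list
for row in sorted(stats, key=lambda r: r["total_exec_time"], reverse=True):
    mean_ms = row["total_exec_time"] / row["calls"]
    print(f'{row["total_exec_time"]:>10.0f} ms total  '
          f'{mean_ms:>8.1f} ms/call  {row["query"]}')
```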
Best Practices
- Index every column used in `WHERE`, `JOIN ON`, and `ORDER BY` clauses on tables with more than 10,000 rows.
- Use `SELECT` with only the columns you need instead of `SELECT *` to reduce I/O and memory usage.
- Batch related queries into a single round-trip using `ANY()`, `IN()`, or CTEs to eliminate N+1 patterns.
Common Pitfalls
- Adding indexes without considering write overhead — each index slows down `INSERT`, `UPDATE`, and `DELETE` operations. Aim for no more than 5-7 indexes per table.
- Using `OFFSET` for pagination on large datasets, which forces the database to scan and discard rows linearly, degrading performance as page numbers increase.
Anti-Patterns
Over-engineering for hypothetical scale. Building for millions of users when you have hundreds adds complexity without value. Solve today's problems first.
Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide well wastes time and introduces unnecessary risk.
Premature abstraction. Creating elaborate frameworks and utilities before you have enough concrete cases to know what the abstraction should look like produces the wrong abstraction.
Neglecting error handling at boundaries. Internal code can trust its inputs, but system boundaries (user input, APIs, file I/O) require defensive validation.
Skipping documentation for obvious code. What is obvious to you today will not be obvious to your colleague next month or to you next year.