Skip to main content
Technology & EngineeringSql199 lines

Window Functions

SQL window functions for analytics, ranking, and running calculations over partitioned result sets

Quick Summary34 lines
You are an expert in SQL window functions for performing advanced analytics and calculations across related rows in relational databases.

## Key Points

- **PARTITION BY** — divides rows into groups (like GROUP BY but without collapsing)
- **ORDER BY** — defines the logical order within each partition
- **Frame clause** — narrows the window to a sliding subset of the partition
1. **Use named windows** when the same OVER clause repeats — improves readability and reduces errors.
2. **Be explicit about frames** — the default frame changes depending on whether ORDER BY is present. Always specify `ROWS BETWEEN ...` when the distinction matters.
3. **Prefer `DENSE_RANK` over `ROW_NUMBER`** when ties should share the same rank.
4. **Push filters outside** — window functions cannot appear in WHERE; wrap in a subquery or CTE and filter the outer query.
6. **Limit partition size** — very large partitions force the engine to buffer many rows; add appropriate WHERE filters to shrink the working set.
- **Non-deterministic `ROW_NUMBER`** — if the ORDER BY is not unique, the assignment is arbitrary. Add a tiebreaker column.
- **Filtering on window results in WHERE** — this is a syntax error. Use a CTE or subquery.
- **Performance on large tables** — each distinct OVER clause may require a separate sort. Consolidate windows or ensure supporting indexes exist.
- **NULL handling in `LAG`/`LEAD`** — the third argument supplies a default when the offset falls outside the partition; without it you get NULL, which can silently break arithmetic.

## Quick Example

```sql
function_name(...) OVER (
  [PARTITION BY expr, ...]
  [ORDER BY expr [ASC|DESC], ...]
  [frame_clause]
)
```

```sql
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW   -- running total (default with ORDER BY)
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING  -- entire partition
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING            -- 3-row sliding window
```
skilldb get sql-skills/Window FunctionsFull skill: 199 lines
Paste into your CLAUDE.md or agent config

Window Functions — SQL

You are an expert in SQL window functions for performing advanced analytics and calculations across related rows in relational databases.

Overview

Window functions compute values across a set of rows related to the current row without collapsing them into a single output row (unlike GROUP BY). They enable ranking, running totals, moving averages, lead/lag comparisons, and percentile calculations — all within a single query pass.

Core Concepts

Window Function Anatomy

Every window function call has the form:

function_name(...) OVER (
  [PARTITION BY expr, ...]
  [ORDER BY expr [ASC|DESC], ...]
  [frame_clause]
)
  • PARTITION BY — divides rows into groups (like GROUP BY but without collapsing)
  • ORDER BY — defines the logical order within each partition
  • Frame clause — narrows the window to a sliding subset of the partition

Function Categories

CategoryFunctionsPurpose
RankingROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()Assign ordinal positions
OffsetLAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), NTH_VALUE()Access other rows
AggregateSUM(), AVG(), COUNT(), MIN(), MAX()Running/sliding aggregates
DistributionPERCENT_RANK(), CUME_DIST(), PERCENTILE_CONT(), PERCENTILE_DISC()Statistical distribution

Frame Specifications

-- Rows-based frame (physical offset)
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW

-- Range-based frame (logical offset)
RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW

-- Groups-based frame (peer groups, PostgreSQL 11+)
GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING

Common shorthands:

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW   -- running total (default with ORDER BY)
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING  -- entire partition
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING            -- 3-row sliding window

Implementation Patterns

Ranking Within Groups

-- Top 3 highest-paid employees per department
SELECT *
FROM (
  SELECT
    department_id,
    employee_name,
    salary,
    DENSE_RANK() OVER (
      PARTITION BY department_id
      ORDER BY salary DESC
    ) AS salary_rank
  FROM employees
) ranked
WHERE salary_rank <= 3;

Running Totals and Moving Averages

-- Running revenue total and 7-day moving average
SELECT
  order_date,
  daily_revenue,
  SUM(daily_revenue) OVER (
    ORDER BY order_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS running_total,
  AVG(daily_revenue) OVER (
    ORDER BY order_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS moving_avg_7d
FROM daily_sales;

Period-over-Period Comparison

-- Month-over-month revenue change
SELECT
  month,
  revenue,
  LAG(revenue, 1) OVER (ORDER BY month) AS prev_month_revenue,
  ROUND(
    100.0 * (revenue - LAG(revenue, 1) OVER (ORDER BY month))
    / NULLIF(LAG(revenue, 1) OVER (ORDER BY month), 0),
    2
  ) AS pct_change
FROM monthly_revenue;

Gap and Island Detection

-- Find consecutive date ranges (islands)
WITH numbered AS (
  SELECT
    event_date,
    event_date - (ROW_NUMBER() OVER (ORDER BY event_date))::int AS grp
  FROM events
)
SELECT
  MIN(event_date) AS island_start,
  MAX(event_date) AS island_end,
  COUNT(*)        AS consecutive_days
FROM numbered
GROUP BY grp
ORDER BY island_start;

Percentile and Distribution

-- Salary percentiles per department
SELECT
  department_id,
  employee_name,
  salary,
  PERCENT_RANK() OVER (PARTITION BY department_id ORDER BY salary) AS pct_rank,
  NTILE(4) OVER (PARTITION BY department_id ORDER BY salary) AS quartile
FROM employees;

Named Windows (WINDOW Clause)

-- Reuse a window definition to keep the query DRY
SELECT
  order_date,
  amount,
  SUM(amount)   OVER w AS running_total,
  AVG(amount)   OVER w AS running_avg,
  COUNT(amount) OVER w AS running_count
FROM orders
WINDOW w AS (ORDER BY order_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);

Core Philosophy

Window functions represent a fundamental shift in how SQL expresses analytical calculations. Before window functions, any computation that needed to compare a row to its neighbors — rankings, running totals, period-over-period changes — required self-joins, correlated subqueries, or application-level post-processing. Window functions collapse these patterns into a single, declarative expression that the database engine can optimize as a single pass over the data. They do not replace GROUP BY; they complement it by computing aggregates without collapsing rows.

The mental model for window functions is "a frame sliding over an ordered partition." PARTITION BY divides the result set into independent groups (like GROUP BY, but without collapsing). ORDER BY defines the logical sequence within each partition. The frame clause narrows the window to a subset of the partition relative to the current row — the three rows before and after, everything from the start to the current row, or all rows within a 7-day range. Understanding the frame is the key to getting correct results; the default frame changes depending on whether ORDER BY is present, and this subtlety causes the majority of window function bugs.

Window functions are most powerful when combined with CTEs. A common pattern is to compute a window expression in a CTE (since window functions cannot appear in WHERE clauses), then filter the results in the outer query. This two-step approach — compute in a CTE, filter outside — is the idiomatic way to express "top N per group," "first occurrence per partition," and "rows where the running total exceeds a threshold."

Anti-Patterns

  • Using ROW_NUMBER with a non-unique ORDER BY — when the ORDER BY columns contain duplicates, ROW_NUMBER assigns positions arbitrarily among tied rows. Different executions of the same query can produce different numbering. Always include a unique tiebreaker column.

  • Relying on the default frame without understanding it — with ORDER BY present, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which groups peers together. Without ORDER BY, the default frame is the entire partition. Misunderstanding this produces subtly wrong running totals and aggregates.

  • Using LAST_VALUE without adjusting the frame — with the default frame, LAST_VALUE returns the current row (because the frame ends at the current row), not the last row in the partition. Use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to get the true partition last value.

  • Multiple distinct OVER clauses without index support — each unique OVER clause may require a separate sort operation. Three window functions with different PARTITION BY and ORDER BY clauses force three full sorts of the data. Consolidate windows where possible and ensure supporting indexes exist.

  • Filtering on window function results in the WHERE clause — window functions are evaluated after WHERE, so WHERE ROW_NUMBER() OVER (...) <= 3 is a syntax error. Wrap the window computation in a CTE or subquery and filter in the outer query.

Best Practices

  1. Use named windows when the same OVER clause repeats — improves readability and reduces errors.
  2. Be explicit about frames — the default frame changes depending on whether ORDER BY is present. Always specify ROWS BETWEEN ... when the distinction matters.
  3. Prefer DENSE_RANK over ROW_NUMBER when ties should share the same rank.
  4. Push filters outside — window functions cannot appear in WHERE; wrap in a subquery or CTE and filter the outer query.
  5. Watch for LAST_VALUE surprises — with the default frame (ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), LAST_VALUE returns the current row, not the partition's last row. Use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
  6. Limit partition size — very large partitions force the engine to buffer many rows; add appropriate WHERE filters to shrink the working set.

Common Pitfalls

  • Confusing ROWS vs RANGERANGE groups peers (rows with the same ORDER BY value) together; ROWS counts physical rows. A running sum with RANGE may jump by multiple rows at once when duplicates exist.
  • Non-deterministic ROW_NUMBER — if the ORDER BY is not unique, the assignment is arbitrary. Add a tiebreaker column.
  • Filtering on window results in WHERE — this is a syntax error. Use a CTE or subquery.
  • Performance on large tables — each distinct OVER clause may require a separate sort. Consolidate windows or ensure supporting indexes exist.
  • NULL handling in LAG/LEAD — the third argument supplies a default when the offset falls outside the partition; without it you get NULL, which can silently break arithmetic.

Install this skill directly: skilldb add sql-skills

Get CLI access →