Programmatic SEO
Programmatic SEO strategies: generating thousands of search-optimized pages from structured data, template design, internal linking, and indexing at scale.
Core Philosophy
Programmatic SEO is the practice of generating large volumes of search-targeted pages from structured data, and the line between success and penalty is whether each page provides genuine value to the human who lands on it. Google's spam policies define "scaled content abuse" as pages generated at scale primarily to manipulate rankings rather than to help users, so every programmatic page must pass the test: "Would a human find this page genuinely useful, or is it just keyword-stuffed filler?"
The pSEO triangle -- data source, template, and keyword pattern -- must be validated before building anything. A structured dataset with unique attributes per entity, a template that transforms data into readable and useful content, and a keyword pattern with proven search volume are all prerequisites. Missing any one of these three elements dooms the project.
Indexing velocity and crawl budget management are the operational challenges that distinguish pSEO from traditional SEO. Submitting 50,000 new URLs at once overwhelms Google's crawl budget. Rolling out pages incrementally, maintaining well-segmented sitemaps, and using canonical URLs to prevent duplicate indexing are the operational disciplines that determine whether programmatic pages actually appear in search results.
You are an expert in programmatic SEO, specializing in generating large volumes of search-optimized pages from structured data sources, designing scalable templates, and managing indexing at scale.
Overview
Programmatic SEO (pSEO) is the practice of generating hundreds or thousands of unique, search-targeted pages from structured data rather than writing each page manually. Examples include Zapier's app integration pages, Nomad List's city comparison pages, and Wise's currency conversion pages. Each page targets a specific long-tail keyword pattern (e.g., "best [tool] for [use-case]" or "[city] cost of living") and is rendered from a template populated with real data. Done well, pSEO captures massive long-tail search traffic. Done poorly, it produces thin content that gets penalized.
Core Concepts
The pSEO Triangle
Every successful programmatic SEO project requires three elements:
- Data Source: A structured dataset with enough unique attributes per entity to produce genuinely useful pages (databases, APIs, public datasets, scraped-and-cleaned data).
- Template: A page layout that transforms raw data into readable, valuable content — not just a data dump.
- Keyword Pattern: A repeatable search query pattern with proven volume (validated via keyword research tools like Ahrefs, Semrush, or Google Keyword Planner).
Long-Tail Keyword Patterns
pSEO targets keyword modifiers systematically:
| Pattern | Example | Volume Per Page |
|---|---|---|
| [product] vs [product] | "Notion vs Coda" | 100–1K |
| [product] alternatives | "Airtable alternatives" | 1K–10K |
| [tool] for [use-case] | "CRM for real estate" | 100–500 |
| [city] [topic] | "Berlin cost of living" | 500–5K |
| [language] to [language] translation | "English to Japanese translation" | 1K–50K |
| [format] to [format] converter | "PDF to DOCX converter" | 5K–50K |
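One way to turn patterns like these into a candidate keyword list for validation is to expand each template over your entity names. The sketch below is illustrative only: `expandPattern` and the sample slot values are hypothetical, and the search volume for each generated phrase still has to come from a keyword research tool.

```typescript
// Hypothetical helper: expand a bracketed keyword pattern over slot values.
type SlotValues = Record<string, string[]>;

function expandPattern(pattern: string, slots: SlotValues): string[] {
  // Collect each [slot] placeholder in order of appearance.
  const slotRegex = /\[([^\]]+)\]/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = slotRegex.exec(pattern)) !== null) {
    names.push(match[1]);
  }

  let results: string[] = [pattern];
  for (const name of names) {
    const values = slots[name] ?? [];
    const next: string[] = [];
    for (const partial of results) {
      for (const value of values) {
        // Replace only the first remaining occurrence of this slot.
        next.push(partial.replace(`[${name}]`, () => value));
      }
    }
    results = next;
  }
  return results;
}

const candidates = expandPattern("[tool] for [use-case]", {
  tool: ["CRM", "ERP"],
  "use-case": ["real estate"],
});
console.log(candidates);
```

Patterns that repeat a slot, such as `[product] vs [product]`, would also produce self-pairs and both orderings with this approach, so those need an extra filtering step before validation.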
Thin Content vs. Valuable Content
Google's spam policies on scaled content abuse target pages that exist primarily for search engines rather than for people. Every programmatic page must pass this test: "Would a human find this page genuinely useful, or is it just keyword-stuffed filler?" Pages need unique data, analysis, or utility, not just reformatted information available elsewhere.
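A rough pre-publish check for the "differs only in the entity name" problem is to mask the entity name in two rendered pages and compare what is left. The sketch below uses Jaccard similarity over word sets. The helper names and any threshold you choose are assumptions for illustration, and this is not a description of how Google evaluates pages.

```typescript
// Replace the entity name with a placeholder so it cannot make two pages look different.
function maskEntity(text: string, entityName: string): string {
  return text.toLowerCase().split(entityName.toLowerCase()).join("{entity}");
}

function wordSet(text: string): Set<string> {
  return new Set(text.split(/\s+/).filter((w) => w.length > 0));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, from 0 (disjoint) to 1 (identical word sets).
function jaccardSimilarity(a: string, b: string): number {
  const setA = wordSet(a);
  const setB = wordSet(b);
  let shared = 0;
  setA.forEach((w) => {
    if (setB.has(w)) shared++;
  });
  const union = setA.size + setB.size - shared;
  return union === 0 ? 1 : shared / union;
}

const pageA = maskEntity("Notion is a great tool for teams", "Notion");
const pageB = maskEntity("Coda is a great tool for teams", "Coda");
// A score near 1 means the two pages are the same template with a swapped name.
console.log(jaccardSimilarity(pageA, pageB));
```

Pairs that score close to 1 after masking are candidates for more entity-specific data or for being left out of the rollout.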
Implementation Patterns
Static Generation from a Database (Next.js)
```tsx
// app/tools/[slug]/page.tsx
import { db } from "@/lib/database";
import { notFound } from "next/navigation";
import type { Metadata } from "next";
// Assumed component locations; adjust to wherever these live in your project.
import { PricingTable } from "@/components/pricing-table";
import { AlternativesGrid } from "@/components/alternatives-grid";

interface Props {
  params: Promise<{ slug: string }>;
}

// Pre-render one page per tool at build time.
export async function generateStaticParams() {
  const tools = await db.tools.findMany({ select: { slug: true } });
  return tools.map((tool) => ({ slug: tool.slug }));
}

// Build the title, description, and canonical URL from the tool's own data.
export async function generateMetadata({ params }: Props): Promise<Metadata> {
  const { slug } = await params;
  const tool = await db.tools.findUnique({ where: { slug } });
  if (!tool) return {};

  return {
    title: `${tool.name}: Features, Pricing & Alternatives`,
    description: `Compare ${tool.name}: pricing from $${tool.startingPrice}/mo, ${tool.featureCount} features, ${tool.integrationCount} integrations. See alternatives and user reviews.`,
    alternates: { canonical: `/tools/${slug}` },
  };
}

export default async function ToolPage({ params }: Props) {
  const { slug } = await params;
  const tool = await db.tools.findUnique({
    where: { slug },
    // Load every relation the template renders, including pricing plans.
    include: { features: true, pricingPlans: true, alternatives: true, reviews: true },
  });
  if (!tool) notFound();

  return (
    <article>
      <h1>{tool.name}</h1>
      <p>{tool.description}</p>
      <section>
        <h2>Key Features</h2>
        <ul>
          {tool.features.map((f) => (
            <li key={f.id}>
              <strong>{f.name}</strong>: {f.description}
            </li>
          ))}
        </ul>
      </section>
      <section>
        <h2>Pricing</h2>
        <PricingTable plans={tool.pricingPlans} />
      </section>
      <section>
        <h2>Alternatives to {tool.name}</h2>
        <AlternativesGrid alternatives={tool.alternatives} />
      </section>
    </article>
  );
}
```
Comparison Pages (A vs B Pattern)
```tsx
// app/compare/[pair]/page.tsx
// A Next.js dynamic segment must span a whole folder name, so the
// "a-vs-b" pair is carried in a single [pair] segment.
import { db } from "@/lib/database";

export async function generateStaticParams() {
  // Build only the pairs with search volume (from pre-researched keyword data)
  // rather than every possible combination of tools.
  const validPairs = await db.comparisonKeywords.findMany({
    where: { volume: { gte: 50 } },
    select: { slugA: true, slugB: true },
  });

  return validPairs.map(({ slugA, slugB }) => ({
    pair: `${slugA}-vs-${slugB}`,
  }));
}
```
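Comparison pages have a built-in duplicate risk: "A vs B" and "B vs A" answer the same query. A minimal sketch of one way to handle this, assuming you are free to pick the ordering, is to derive a single canonical slug per unordered pair and point the reverse ordering at it with a redirect or canonical URL. `canonicalPairSlug` is a hypothetical helper, not part of Next.js.

```typescript
// Hypothetical helper: one canonical slug per unordered pair of tools.
function canonicalPairSlug(slugA: string, slugB: string): string {
  // Sort so that both orderings of the same two tools yield the same slug.
  const [first, second] = [slugA, slugB].sort();
  return `${first}-vs-${second}`;
}

// Collapse a list of pairs to unique canonical slugs.
function uniquePairSlugs(pairs: { slugA: string; slugB: string }[]): string[] {
  const seen = new Set<string>();
  for (const { slugA, slugB } of pairs) {
    if (slugA !== slugB) seen.add(canonicalPairSlug(slugA, slugB));
  }
  return Array.from(seen);
}

console.log(canonicalPairSlug("notion", "coda"));
```

Alphabetical ordering is only one choice. Ordering by which name users type first more often would also work, as long as each pair resolves to exactly one URL.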
Internal Linking at Scale
Internal links are critical for pSEO — they distribute authority and help crawlers discover pages. Build linking programmatically from entity relationships:
```tsx
// components/related-links.tsx
interface RelatedEntity {
  slug: string;
  name: string;
  category: string;
}

export function RelatedLinks({
  current,
  related,
}: {
  current: string;
  related: RelatedEntity[];
}) {
  // Group by category for organized, contextual linking
  const grouped = related.reduce(
    (acc, entity) => {
      const key = entity.category;
      if (!acc[key]) acc[key] = [];
      acc[key].push(entity);
      return acc;
    },
    {} as Record<string, RelatedEntity[]>
  );

  return (
    <nav aria-label="Related pages">
      {Object.entries(grouped).map(([category, entities]) => (
        <div key={category}>
          <h3>Related {category}</h3>
          <ul>
            {entities
              .filter((e) => e.slug !== current)
              .slice(0, 8) // Cap links per section
              .map((entity) => (
                <li key={entity.slug}>
                  <a href={`/tools/${entity.slug}`}>{entity.name}</a>
                </li>
              ))}
          </ul>
        </div>
      ))}
    </nav>
  );
}
```
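The component above expects a `related` list to be supplied. As a minimal sketch of where that list could come from, assuming each entity carries a single `category` field, the hypothetical `pickRelated` selector below returns same-category entities, excludes the current page, and caps the result.

```typescript
interface Entity {
  slug: string;
  name: string;
  category: string;
}

// Hypothetical selector: same-category entities, excluding the current page,
// sorted by name for stable output and capped at `limit`.
function pickRelated(current: Entity, all: Entity[], limit = 8): Entity[] {
  return all
    .filter((e) => e.slug !== current.slug && e.category === current.category)
    .sort((a, b) => a.name.localeCompare(b.name))
    .slice(0, limit);
}

const catalog: Entity[] = [
  { slug: "notion", name: "Notion", category: "docs" },
  { slug: "coda", name: "Coda", category: "docs" },
  { slug: "hubspot", name: "HubSpot", category: "crm" },
];
console.log(pickRelated(catalog[0], catalog).map((e) => e.slug));
```

Richer relationship data, such as shared integrations or overlapping use cases, would usually give more useful links than category alone.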
Sitemap Generation for Large Page Sets
```ts
// app/sitemap/[id]/route.ts
// Split sitemaps for sites with >50,000 URLs (Google's limit per sitemap)
import { NextRequest, NextResponse } from "next/server";
import { db } from "@/lib/database";

// Stay below the 50,000-URL limit with some headroom.
const URLS_PER_SITEMAP = 45000;

export async function GET(
  _req: NextRequest,
  { params }: { params: Promise<{ id: string }> }
) {
  const { id } = await params;
  const page = parseInt(id, 10);
  // Reject non-numeric or negative sitemap ids instead of querying with NaN.
  if (Number.isNaN(page) || page < 0) {
    return new NextResponse("Not found", { status: 404 });
  }
  const offset = page * URLS_PER_SITEMAP;

  // Stable ordering keeps each URL in the same sitemap file between builds.
  const tools = await db.tools.findMany({
    skip: offset,
    take: URLS_PER_SITEMAP,
    select: { slug: true, updatedAt: true },
    orderBy: { slug: "asc" },
  });

  // Google ignores <changefreq> and <priority>; <lastmod> is the field it may use.
  const xml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${tools
  .map(
    (t) => `  <url>
    <loc>https://example.com/tools/${t.slug}</loc>
    <lastmod>${t.updatedAt.toISOString()}</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.6</priority>
  </url>`
  )
  .join("\n")}
</urlset>`;

  return new NextResponse(xml, {
    headers: { "Content-Type": "application/xml" },
  });
}
```
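Segmented sitemaps are usually tied together with a sitemap index file that lists each child sitemap. The sketch below shows one way to build that index. `buildSitemapIndex` is a hypothetical helper, and the `/sitemap/{id}` child path is an assumption that should match however your child sitemaps are actually routed.

```typescript
// Hypothetical helper: build a sitemap index that points at each child sitemap.
function buildSitemapIndex(
  totalUrls: number,
  urlsPerSitemap: number,
  baseUrl: string
): string {
  // Number of child sitemaps needed to cover every URL.
  const count = Math.ceil(totalUrls / urlsPerSitemap);
  const entries: string[] = [];
  for (let id = 0; id < count; id++) {
    entries.push(`  <sitemap>\n    <loc>${baseUrl}/sitemap/${id}</loc>\n  </sitemap>`);
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</sitemapindex>",
  ].join("\n");
}

console.log(buildSitemapIndex(100000, 45000, "https://example.com"));
```

Submitting the index in Google Search Console also gives per-sitemap indexing counts, which helps when monitoring an incremental rollout.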
Indexing API for Faster Discovery
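```ts
// scripts/submit-to-google.ts
// Use Google's Indexing API to request faster crawling of new pages.
// Note: Google documents this API as supported only for pages with
// JobPosting or BroadcastEvent (livestream) structured data. For other
// page types, rely on sitemaps and internal links for discovery.
import { google } from "googleapis";

// Default quota: 200 publish requests per day
const DAILY_QUOTA = 200;

async function submitUrls(urls: string[]) {
  const auth = new google.auth.GoogleAuth({
    keyFile: "./service-account.json",
    scopes: ["https://www.googleapis.com/auth/indexing"],
  });
  const indexing = google.indexing({ version: "v3", auth });

  // Send at most one day's quota; leave the remainder for the next run.
  for (const url of urls.slice(0, DAILY_QUOTA)) {
    await indexing.urlNotifications.publish({
      requestBody: {
        url,
        type: "URL_UPDATED",
      },
    });
    // Pause between requests to avoid bursts.
    await new Promise((r) => setTimeout(r, 500));
  }
}
```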
Best Practices
- Validate keyword patterns before building. Use Ahrefs, Semrush, or Google Keyword Planner to confirm that the long-tail keyword pattern has consistent search volume across entities. Building 10,000 pages for keywords nobody searches is wasted effort.
- Add unique value to every page beyond raw data. Include computed insights (percentile rankings, trend analysis), user-generated content (reviews, ratings), or cross-entity comparisons that make each page genuinely useful.
- Implement pagination and crawl budget management. Use `rel="canonical"`, proper sitemap segmentation, and `robots.txt` rules to ensure search engines spend their crawl budget on high-value pages rather than duplicates or low-value parameter variations.
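As an example of a computed insight, a percentile rank turns a raw number into a statement a reader can use, such as "more integrations than 70% of tools in this category". The sketch below is one simple definition (share of values strictly below the given value). `percentileRank` is a hypothetical helper and the sample numbers are made up for illustration.

```typescript
// Percentile rank: share of values strictly below `value`, as a 0-100 integer.
function percentileRank(values: number[], value: number): number {
  if (values.length === 0) return 0;
  const below = values.filter((v) => v < value).length;
  return Math.round((below / values.length) * 100);
}

// Illustrative data only: integration counts for tools in one category.
const integrationCounts = [10, 20, 30, 40];
console.log(percentileRank(integrationCounts, 30));
```

Computing the rank within a category rather than across the whole dataset tends to give a more meaningful comparison on each page.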
Anti-Patterns
- Near-identical pages at scale: Generating thousands of pages that differ only in the entity name while the surrounding content is identical. Google detects this as "scaled content abuse" and demotes all affected pages.
- Building without keyword validation: Creating 10,000 pages for keyword patterns that nobody actually searches. Validate search volume per pattern using Ahrefs, Semrush, or Google Keyword Planner before generating any pages.
- Overwhelming crawl budget: Submitting all URLs at once instead of rolling out incrementally. Publish 1,000–5,000 pages per week, monitor Google Search Console's indexing reports, and prioritize pages with the highest keyword volume.
- No unique value per page: Pages that contain only reformatted data available elsewhere without added analysis, computed insights, user reviews, or cross-entity comparisons. Raw data dumps get ignored by both users and search engines.
- Missing internal linking: Generating programmatic pages without building the internal link graph from entity relationships. Internal links distribute authority, help crawlers discover pages, and are critical for pSEO at scale.
Common Pitfalls
- Generating pages with near-identical content that differ only in the entity name. Google's spam policies treat this as "scaled content abuse" and can demote or remove the affected pages. Every page must have substantively different, useful information.
- Ignoring indexing velocity. Submitting 50,000 new URLs at once overwhelms Google's crawl budget. Roll out pages incrementally (1,000–5,000 per week), monitor Google Search Console's "Pages" report for indexing status, and prioritize pages with the highest keyword volume.
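The incremental rollout described above can be planned ahead of time. The sketch below sorts pages by keyword volume and cuts them into weekly batches. `planRollout` and the `PlannedPage` shape are hypothetical, and the batch size should be tuned against what Search Console shows actually getting indexed.

```typescript
interface PlannedPage {
  url: string;
  keywordVolume: number;
}

// Hypothetical planner: highest-volume pages first, then fixed-size weekly batches.
function planRollout(pages: PlannedPage[], pagesPerWeek: number): PlannedPage[][] {
  if (pagesPerWeek <= 0) throw new Error("pagesPerWeek must be positive");
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...pages].sort((a, b) => b.keywordVolume - a.keywordVolume);
  const weeks: PlannedPage[][] = [];
  for (let i = 0; i < sorted.length; i += pagesPerWeek) {
    weeks.push(sorted.slice(i, i + pagesPerWeek));
  }
  return weeks;
}

const plan = planRollout(
  [
    { url: "/tools/a", keywordVolume: 10 },
    { url: "/tools/b", keywordVolume: 50 },
    { url: "/tools/c", keywordVolume: 30 },
  ],
  2
);
console.log(plan.map((week) => week.map((p) => p.url)));
```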