Cold Start Optimization
Expert guidance for mitigating and optimizing cold start latency in serverless functions
You are an expert in cold start mitigation strategies for building serverless applications. You help teams measure, understand, and systematically reduce initialization latency across serverless platforms through bundling, architecture choices, and provisioning strategies. ## Key Points - Keep deployment packages under 5 MB (zipped) by bundling with tree-shaking and marking AWS SDK as external — every MB adds roughly 30 ms to cold start on Lambda. - Use Provisioned Concurrency for user-facing latency-critical paths and accept on-demand cold starts for background/async processing where latency does not matter. - Over-provisioning concurrency wastes money because you pay for idle provisioned instances — analyze actual traffic patterns with CloudWatch metrics before setting provisioned concurrency levels.
skilldb get serverless-skills/Cold Start OptimizationFull skill: 174 linesCold Start Optimization — Serverless
You are an expert in cold start mitigation strategies for building serverless applications. You help teams measure, understand, and systematically reduce initialization latency across serverless platforms through bundling, architecture choices, and provisioning strategies.
Core Philosophy
Cold start optimization is a measurement-driven discipline, not guesswork. Before optimizing anything, instrument your functions to distinguish cold starts from warm invocations, measure P50/P95/P99 initialization latency, and identify which phase (runtime init, dependency loading, or application init) dominates. Optimizing the wrong phase wastes effort; optimizing without baselines makes it impossible to know if changes helped.
The most impactful cold start improvements come from reducing what gets loaded, not from tricks to keep instances warm. A function with a 200 KB bundled deployment package and two SDK clients initializes faster than any warming strategy can compensate for in a function with a 50 MB node_modules directory. Tree-shaking, dead code elimination, marking the AWS SDK as external, and lazy-loading rarely-used dependencies are the highest-ROI optimizations and should be applied universally before considering Provisioned Concurrency or warming pings.
Match the optimization strategy to the workload's latency sensitivity. User-facing API endpoints with sub-second SLA requirements justify Provisioned Concurrency costs. Async background processors triggered by SQS or S3 events can tolerate cold starts without any user impact. Applying expensive warming strategies uniformly across all functions is a common budget waste — target them surgically at the functions where cold start latency actually reaches users.
Anti-Patterns
- Warming pings as a primary strategy — A scheduled ping only keeps one execution environment warm. Under concurrent load, additional invocations still cold-start. This gives a false sense of security while failing under real traffic. Use Provisioned Concurrency for guaranteed warm instances.
- Over-provisioning concurrency across all functions — Provisioned Concurrency charges for idle instances. Applying it to every function regardless of traffic pattern or latency sensitivity wastes budget. Analyze traffic with CloudWatch and provision only for latency-critical, user-facing paths.
- Bundling the entire AWS SDK — AWS SDK v3 is modular; importing
@aws-sdk/client-dynamodbinstead of the entire SDK reduces bundle size by megabytes. Marking@aws-sdk/*as external in esbuild avoids bundling it at all since it is available in the Lambda runtime. - Heavy initialization inside the handler function — SDK clients, database connections, and configuration parsing should happen outside the handler, in module scope. Code in module scope runs once per execution environment and is reused across invocations; code inside the handler runs on every single request.
- Choosing Java or .NET runtimes without SnapStart or AOT compilation — JVM and CLR runtimes have inherently longer cold starts (1-5 seconds) due to class loading and JIT compilation. Without SnapStart (Java) or Native AOT (.NET), these runtimes are unsuitable for latency-sensitive synchronous endpoints.
Overview
A cold start occurs when a serverless platform must initialize a new execution environment before handling a request — downloading code, starting the runtime, and running initialization logic. Cold starts add latency ranging from under 1 ms (Cloudflare Workers) to several seconds (Java on Lambda in a VPC). Understanding and mitigating cold starts is essential for latency-sensitive serverless workloads.
Setup & Configuration
Measuring cold starts with AWS Lambda Powertools
import { Tracer } from '@aws-lambda-powertools/tracer';
import { Metrics, MetricUnit } from '@aws-lambda-powertools/metrics';
const tracer = new Tracer();
const metrics = new Metrics();
let isColdStart = true;
export const handler = async (event: any) => {
if (isColdStart) {
metrics.addMetric('ColdStart', MetricUnit.Count, 1);
isColdStart = false;
}
const segment = tracer.getSegment();
// Business logic — traces and metrics publish automatically
metrics.publishStoredMetrics();
};
Provisioned Concurrency (SAM template)
Resources:
CriticalFunction:
Type: AWS::Serverless::Function
Properties:
Handler: src/handler.main
Runtime: nodejs20.x
MemorySize: 512
AutoPublishAlias: live
ProvisionedConcurrencyConfig:
ProvisionedConcurrentExecutions: 5
Lambda SnapStart for Java
Resources:
JavaFunction:
Type: AWS::Serverless::Function
Properties:
Handler: com.example.Handler::handleRequest
Runtime: java21
SnapStart:
ApplyOn: PublishedVersions
AutoPublishAlias: live
Core Patterns
Bundle optimization with esbuild
// esbuild.config.mjs
import * as esbuild from 'esbuild';
await esbuild.build({
entryPoints: ['src/handler.ts'],
bundle: true,
minify: true,
sourcemap: true,
platform: 'node',
target: 'node20',
outfile: 'dist/handler.js',
external: ['@aws-sdk/*'], // AWS SDK v3 is available in the Lambda runtime
treeShaking: true,
});
Lazy-loading heavy dependencies
// Load expensive modules only when the code path actually needs them
let pdfLib: typeof import('pdf-lib') | null = null;
async function generatePdf(data: any) {
if (!pdfLib) {
pdfLib = await import('pdf-lib');
}
const doc = await pdfLib.PDFDocument.create();
// ...
}
export const handler = async (event: any) => {
if (event.path === '/pdf') {
return generatePdf(event.body);
}
// Other paths never pay the pdf-lib import cost
return { statusCode: 200, body: 'ok' };
};
Keep-warm with scheduled pings
Resources:
WarmUpRule:
Type: AWS::Events::Rule
Properties:
ScheduleExpression: rate(5 minutes)
Targets:
- Arn: !GetAtt CriticalFunction.Arn
Id: warm-up
Input: '{"source": "warmup"}'
export const handler = async (event: any) => {
if (event.source === 'warmup') {
return { statusCode: 200, body: 'warm' };
}
// Normal handler logic
};
Runtime and architecture selection
# ARM64 + smaller runtimes have faster cold starts
Globals:
Function:
Runtime: nodejs20.x # ~120ms init vs ~400ms for Java
Architectures:
- arm64 # ~80ms faster cold start than x86_64
MemorySize: 512 # More memory = more CPU = faster init
Best Practices
- Keep deployment packages under 5 MB (zipped) by bundling with tree-shaking and marking AWS SDK as external — every MB adds roughly 30 ms to cold start on Lambda.
- Increase memory allocation to speed up initialization: Lambda allocates CPU proportionally to memory, so a 512 MB function initializes noticeably faster than a 128 MB one with minimal cost increase.
- Use Provisioned Concurrency for user-facing latency-critical paths and accept on-demand cold starts for background/async processing where latency does not matter.
Common Pitfalls
- Warming a single function instance with a scheduled ping only keeps one execution environment warm — under concurrent load, additional invocations still experience cold starts. Provisioned Concurrency is the correct solution for guaranteed warm instances.
- Over-provisioning concurrency wastes money because you pay for idle provisioned instances — analyze actual traffic patterns with CloudWatch metrics before setting provisioned concurrency levels.
Install this skill directly: skilldb add serverless-skills
Related Skills
AWS Lambda
Expert guidance for building, deploying, and optimizing AWS Lambda functions
AWS Step Functions
Expert guidance for orchestrating serverless workflows with AWS Step Functions
Cloudflare Workers
Expert guidance for building and deploying applications on Cloudflare Workers at the edge
Event Triggers
Expert guidance for building event-driven serverless architectures with S3, SQS, and EventBridge triggers
Serverless Databases
Expert guidance for using serverless databases like PlanetScale, Neon, and Turso in serverless applications
Serverless Testing
Expert guidance for testing serverless applications locally and in CI/CD pipelines