DynamoDB
AWS DynamoDB NoSQL database for high-performance key-value and document workloads
You are an expert in Amazon DynamoDB for designing and operating NoSQL databases with single-digit millisecond latency at any scale.
## Key Points
- **Ignoring pagination in query results** -- Query and Scan return at most 1 MB per call. Failing to check `LastEvaluatedKey` and loop silently drops data beyond the first page.
- **Storing large blobs directly in items** -- The 400 KB item size limit and per-item RCU cost make DynamoDB a poor fit for large payloads. Store blobs in S3 and reference them by key.
- **Design for access patterns first.** Model your table around queries, not entities. Use single-table design when entities share access patterns.
- **Use on-demand billing** for unpredictable workloads, provisioned with auto-scaling for steady-state.
- **Keep items small** (under 400 KB limit). Store large blobs in S3 and reference them by key.
- **Use sparse indexes**: GSI items only appear if the GSI key attributes exist, so omit them to exclude items from the index.
- **Use `ProjectionExpression`** to retrieve only needed attributes, reducing read costs and latency.
- **Enable Point-in-Time Recovery (PITR)** for production tables.
- **Use TTL** for automatically expiring temporary data (sessions, caches) at no extra cost.
- **Hot partitions**: A single partition key receiving disproportionate traffic throttles that partition. Distribute writes across partition keys.
- **Scan is expensive**: `scan` reads every item in the table. Always prefer `query` with a key condition. If you need scan, use parallel scan with `Segment`/`TotalSegments`.
- **GSI throttling propagates**: If a GSI is throttled (provisioned mode), writes to the base table are also throttled. Ensure GSI capacity matches write patterns.
## Quick Example
```python
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("Orders")
```
```javascript
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand, QueryCommand } from "@aws-sdk/lib-dynamodb";
const client = new DynamoDBClient({ region: "us-east-1" });
const ddb = DynamoDBDocumentClient.from(client);
```
## Core Philosophy
DynamoDB demands you think about access patterns before you write a single line of code. Unlike relational databases where you normalize data and add queries later, DynamoDB requires you to model your table around how your application reads and writes data. Start with a list of access patterns, then design your partition key, sort key, and GSIs to serve those patterns efficiently. The schema follows the queries, not the other way around.
Single-table design is the default approach for applications where entities share access patterns. Storing users, orders, and products in one table with composite keys (e.g., PK=USER#alice, SK=ORDER#2024-01-15) enables fetching related data in a single query with no joins. This reduces the number of tables to manage, minimizes round trips, and keeps costs low. Multi-table design is appropriate when entities have completely independent access patterns or vastly different throughput requirements.
Every write should assume failure. Use conditional expressions for optimistic concurrency control, idempotency keys for retryable operations, and transactions when multiple items must be updated atomically. DynamoDB's default eventually consistent reads are sufficient for most use cases, but use strongly consistent reads when you need read-after-write guarantees -- and never on GSIs, which do not support them.
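The idempotency advice above can be sketched as a create-once write. This is a minimal sketch, assuming the `PK`/`SK` key names used throughout this document; `build_create_once_put` is a hypothetical helper, not part of boto3. It only builds keyword arguments, so a retried request with the same key fails the condition instead of overwriting the first write.

```python
def build_create_once_put(item: dict) -> dict:
    """Build put_item kwargs that succeed only if the item does not exist yet.

    Hypothetical helper: assumes the table's key attributes are named PK and SK.
    """
    if "PK" not in item or "SK" not in item:
        raise ValueError("item must contain both PK and SK")
    return {
        "Item": item,
        # The condition is evaluated against the existing item with the same
        # primary key; it is false whenever such an item is already stored.
        "ConditionExpression": "attribute_not_exists(PK)",
    }


kwargs = build_create_once_put(
    {"PK": "USER#alice", "SK": "ORDER#2024-01-15#ord-001", "status": "processing"}
)
```

Passing the result to `table.put_item(**kwargs)` should raise a `ClientError` with error code `ConditionalCheckFailedException` when the item already exists, which a retrying caller can treat as "already written".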
## Anti-Patterns
- **Designing tables around entities instead of access patterns** -- Normalizing data into separate tables like a relational database leads to expensive, slow scan operations and cross-table lookups that DynamoDB is not built for.
- **Using Scan as a query mechanism** -- Scan reads every item in the table and is proportionally expensive. Always use Query with a key condition. If you need Scan, it is a sign your data model needs rethinking.
- **Choosing low-cardinality partition keys** -- Keys like `status` or `country` concentrate traffic on a few partitions, causing throttling. Partition keys should have high cardinality and even distribution.
- **Ignoring pagination in query results** -- Query and Scan return at most 1 MB per call. Failing to check `LastEvaluatedKey` and loop silently drops data beyond the first page.
- **Storing large blobs directly in items** -- The 400 KB item size limit and per-item RCU cost make DynamoDB a poor fit for large payloads. Store blobs in S3 and reference them by key.
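The pagination anti-pattern is usually avoided with a small loop. The sketch below assumes `table` behaves like a boto3 `Table` resource; `query_all_pages` is a hypothetical helper name. It follows `LastEvaluatedKey` by passing it back as `ExclusiveStartKey` until no key is returned.

```python
from typing import Any, Iterator


def query_all_pages(table: Any, **query_kwargs: Any) -> Iterator[dict]:
    """Yield every item from a Query, following LastEvaluatedKey across pages.

    `table` is assumed to expose query(**kwargs) returning a dict with "Items"
    and, when more data remains, "LastEvaluatedKey".
    """
    kwargs = dict(query_kwargs)  # copy so the caller's arguments are untouched
    while True:
        page = table.query(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break  # no key returned: this was the final page
        # Resume the next request where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

A caller would iterate it directly, for example `for order in query_all_pages(table, KeyConditionExpression=Key("PK").eq("USER#alice")): ...`. The low-level client also offers paginators via `get_paginator("query")` if you prefer not to write the loop.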
## Overview
DynamoDB is a fully managed NoSQL key-value and document database. Tables have a primary key (partition key, or partition key + sort key). Data access patterns must be designed upfront; secondary indexes (GSI/LSI) provide alternative query paths. DynamoDB supports on-demand and provisioned capacity modes, DynamoDB Streams for change data capture, and transactions.
## Setup & Configuration
### Create a Table (AWS CLI)
```bash
aws dynamodb create-table \
  --table-name Orders \
  --attribute-definitions \
    AttributeName=PK,AttributeType=S \
    AttributeName=SK,AttributeType=S \
  --key-schema \
    AttributeName=PK,KeyType=HASH \
    AttributeName=SK,KeyType=RANGE \
  --billing-mode PAY_PER_REQUEST
```
### SDK Setup (Python boto3)
```python
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("Orders")
```
### SDK Setup (Node.js v3)
```javascript
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand, QueryCommand } from "@aws-sdk/lib-dynamodb";
const client = new DynamoDBClient({ region: "us-east-1" });
const ddb = DynamoDBDocumentClient.from(client);
```
## Core Patterns
### Single-Table Design
```python
from decimal import Decimal  # the boto3 resource API rejects Python floats

# Store multiple entity types in one table using PK/SK patterns
# User entity
table.put_item(Item={
    "PK": "USER#alice",
    "SK": "PROFILE",
    "name": "Alice",
    "email": "alice@example.com",
})
# Order entity belonging to user
table.put_item(Item={
    "PK": "USER#alice",
    "SK": "ORDER#2024-01-15#ord-001",
    "total": Decimal("59.99"),  # numbers must be Decimal (or int), not float
    "status": "shipped",
})
# Query all orders for a user
response = table.query(
    KeyConditionExpression=Key("PK").eq("USER#alice") & Key("SK").begins_with("ORDER#"),
)
orders = response["Items"]
```
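Because the profile and the orders share a partition key, a query on `PK` alone returns both entity types in one call, and the application then separates them. A minimal sketch of that step, assuming the sort-key convention from this section (`PROFILE`, `ORDER#<date>#<id>`); `group_by_entity` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_entity(items: list[dict]) -> dict[str, list[dict]]:
    """Group single-table items by the entity prefix of their sort key.

    Assumes the text before the first "#" in SK names the entity type,
    e.g. "ORDER#2024-01-15#ord-001" -> "ORDER" and "PROFILE" -> "PROFILE".
    """
    groups: dict[str, list[dict]] = defaultdict(list)
    for item in items:
        entity = item["SK"].split("#", 1)[0]
        groups[entity].append(item)
    return dict(groups)


# Shape of the items a PK-only query for USER#alice might return.
sample_items = [
    {"PK": "USER#alice", "SK": "ORDER#2024-01-15#ord-001", "status": "shipped"},
    {"PK": "USER#alice", "SK": "PROFILE", "name": "Alice"},
]
grouped = group_by_entity(sample_items)
profile = grouped["PROFILE"][0]
user_orders = grouped["ORDER"]
```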
### Batch Operations
```python
# items: an iterable of dicts, each containing the table's PK and SK attributes
with table.batch_writer() as batch:
    for item in items:
        batch.put_item(Item=item)
# batch_writer handles chunking into 25-item batches and retries
```
### Transactions
```python
dynamodb_client = boto3.client("dynamodb")
dynamodb_client.transact_write_items(
    TransactItems=[
        {
            "Update": {
                "TableName": "Orders",
                "Key": {"PK": {"S": "USER#alice"}, "SK": {"S": "ORDER#ord-001"}},
                "UpdateExpression": "SET #s = :new_status",
                "ConditionExpression": "#s = :expected",
                "ExpressionAttributeNames": {"#s": "status"},
                "ExpressionAttributeValues": {
                    ":new_status": {"S": "shipped"},
                    ":expected": {"S": "processing"},
                },
            }
        },
        {
            "Put": {
                "TableName": "Orders",
                "Item": {
                    "PK": {"S": "SHIPMENT#ship-099"},
                    "SK": {"S": "ORDER#ord-001"},
                    "carrier": {"S": "UPS"},
                },
            }
        },
    ]
)
```
### Global Secondary Index
```bash
aws dynamodb update-table \
  --table-name Orders \
  --attribute-definitions AttributeName=GSI1PK,AttributeType=S AttributeName=GSI1SK,AttributeType=S \
  --global-secondary-index-updates '[{
    "Create": {
      "IndexName": "GSI1",
      "KeySchema": [
        {"AttributeName": "GSI1PK", "KeyType": "HASH"},
        {"AttributeName": "GSI1SK", "KeyType": "RANGE"}
      ],
      "Projection": {"ProjectionType": "ALL"}
    }
  }]'
```
```python
# Query the GSI
response = table.query(
    IndexName="GSI1",
    KeyConditionExpression=Key("GSI1PK").eq("STATUS#shipped") & Key("GSI1SK").begins_with("2024-01"),
)
```
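An item only appears in `GSI1` when it carries both `GSI1PK` and `GSI1SK`, which makes a sparse index possible. The sketch below assumes a hypothetical convention in which only orders still in `processing` are indexed; `build_order_item` is an illustrative helper, not an established API.

```python
def build_order_item(user_id: str, order_id: str, order_date: str, status: str) -> dict:
    """Build an order item that is visible in GSI1 only while it is open.

    Hypothetical convention: orders in "processing" carry GSI1PK/GSI1SK;
    every other order omits them and is therefore absent from the index.
    """
    item = {
        "PK": f"USER#{user_id}",
        "SK": f"ORDER#{order_date}#{order_id}",
        "status": status,
    }
    if status == "processing":
        item["GSI1PK"] = "STATUS#processing"
        item["GSI1SK"] = f"{order_date}#{order_id}"
    return item


open_order = build_order_item("alice", "ord-002", "2024-02-03", "processing")
closed_order = build_order_item("alice", "ord-001", "2024-01-15", "shipped")
```

With this convention, an update that moves an order out of `processing` would also need a `REMOVE GSI1PK, GSI1SK` clause so the item leaves the index.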
### DynamoDB Streams + Lambda
```bash
aws lambda create-event-source-mapping \
  --function-name process-order-changes \
  --event-source-arn arn:aws:dynamodb:us-east-1:123456789012:table/Orders/stream/2024-01-01T00:00:00.000 \
  --starting-position LATEST \
  --batch-size 100
```
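The function behind this mapping receives batches of stream records. A minimal sketch of such a handler follows; the name `handler`, the returned summary, and the "shipped" check are assumptions for illustration. It also assumes the stream view type includes new images, and it reads attribute values in DynamoDB's typed JSON form (`{"S": "..."}`).

```python
def handler(event: dict, context: object) -> dict:
    """Sketch of a Lambda handler for a DynamoDB Streams batch.

    Counts records by event type and collects the keys of items whose
    new image has status "shipped".
    """
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    shipped_keys = []
    for record in event.get("Records", []):
        event_name = record["eventName"]
        counts[event_name] = counts.get(event_name, 0) + 1
        # REMOVE records carry no NewImage, so default to an empty dict.
        new_image = record["dynamodb"].get("NewImage", {})
        if event_name != "REMOVE" and new_image.get("status", {}).get("S") == "shipped":
            keys = record["dynamodb"]["Keys"]
            shipped_keys.append((keys["PK"]["S"], keys["SK"]["S"]))
    return {"counts": counts, "shipped": shipped_keys}
```

Since a failed batch can be retried, the same record may be delivered more than once, so any side effects in a real handler should be idempotent.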
### Conditional Writes (Optimistic Locking)
```python
table.update_item(
    Key={"PK": "PRODUCT#sku-100", "SK": "INVENTORY"},
    UpdateExpression="SET quantity = quantity - :dec",
    ConditionExpression="quantity >= :dec",
    ExpressionAttributeValues={":dec": 1},
)
```
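A common variant of optimistic locking guards the write with a version number rather than a business value. The sketch below assumes a hypothetical numeric `version` attribute on each item; `build_versioned_update` is an illustrative helper that only builds the `update_item` arguments.

```python
def build_versioned_update(key: dict, new_status: str, expected_version: int) -> dict:
    """Build update_item kwargs that apply only if the stored version matches.

    Hypothetical schema: every item carries a numeric "version" attribute
    that each successful write increments by one.
    """
    return {
        "Key": key,
        "UpdateExpression": "SET #s = :status, #v = :next",
        # The write is rejected if another writer changed the version first.
        "ConditionExpression": "#v = :expected",
        # Placeholders avoid clashes with reserved words such as "status".
        "ExpressionAttributeNames": {"#s": "status", "#v": "version"},
        "ExpressionAttributeValues": {
            ":status": new_status,
            ":expected": expected_version,
            ":next": expected_version + 1,
        },
    }


update_kwargs = build_versioned_update(
    {"PK": "USER#alice", "SK": "ORDER#2024-01-15#ord-001"}, "shipped", expected_version=3
)
```

Calling `table.update_item(**update_kwargs)` should fail with `ConditionalCheckFailedException` when the stored version is no longer 3; the caller would then re-read the item and retry with the fresh version.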
## Best Practices
- **Design for access patterns first.** Model your table around queries, not entities. Use single-table design when entities share access patterns.
- **Use on-demand billing** for unpredictable workloads, provisioned with auto-scaling for steady-state.
- **Keep items small** (under 400 KB limit). Store large blobs in S3 and reference them by key.
- **Use sparse indexes**: GSI items only appear if the GSI key attributes exist, so omit them to exclude items from the index.
- **Use `ProjectionExpression`** to retrieve only needed attributes, reducing read costs and latency.
- **Enable Point-in-Time Recovery (PITR)** for production tables.
- **Use TTL** for automatically expiring temporary data (sessions, caches) at no extra cost.
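For the TTL practice, the expiry attribute has to hold a Number containing the expiry time in epoch seconds. A minimal sketch, assuming a hypothetical attribute name `expires_at` (it must match whatever attribute TTL is enabled on for the table) and a hypothetical session item layout:

```python
import time
from typing import Optional


def ttl_epoch_seconds(lifetime_seconds: int, now: Optional[float] = None) -> int:
    """Return the expiry time as whole epoch seconds, the format TTL expects."""
    if lifetime_seconds <= 0:
        raise ValueError("lifetime_seconds must be positive")
    current = time.time() if now is None else now
    return int(current) + lifetime_seconds


session_item = {
    "PK": "SESSION#abc123",  # hypothetical key layout for a session record
    "SK": "METADATA",
    # "expires_at" is an assumed name; store it as a Number, not a string.
    "expires_at": ttl_epoch_seconds(3600, now=1_700_000_000),
}
```

Expired items are removed in the background rather than at the exact expiry instant, so readers that care should still filter on `expires_at`.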
## Common Pitfalls
- **Hot partitions**: A single partition key receiving disproportionate traffic throttles that partition. Distribute writes across partition keys.
- **Scan is expensive**: `scan` reads every item in the table. Always prefer `query` with a key condition. If you need scan, use parallel scan with `Segment`/`TotalSegments`.
- **GSI throttling propagates**: If a GSI is throttled (provisioned mode), writes to the base table are also throttled. Ensure GSI capacity matches write patterns.
- **Forgetting pagination**: `query` and `scan` return max 1 MB per call. Always check for `LastEvaluatedKey` and loop.
- **Reserved words in expressions**: Attributes like `name`, `status`, `data` are reserved. Always use `ExpressionAttributeNames` (e.g., `#s` for `status`).
- **Transaction limits**: Transactions support max 100 items and 4 MB total. Items in a transaction must be in the same region.
- **Misunderstanding eventually consistent reads**: By default, reads are eventually consistent. Use `ConsistentRead=True` for strong consistency (2x cost, not available on GSIs).
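One way to distribute writes away from a hot partition key is write sharding: append a suffix so that a logically single key becomes several physical ones. The sketch below is one possible scheme under stated assumptions, not a prescribed design; the function names, the `#<n>` suffix format, and the default of 10 shards are all hypothetical.

```python
import hashlib


def sharded_partition_key(base_key: str, item_id: str, shard_count: int = 10) -> str:
    """Append a deterministic shard suffix so writes spread over several keys.

    The shard is derived from a hash of item_id, so the same item always
    maps to the same shard and can be read back without a lookup table.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shard_count: int = 10) -> list[str]:
    """List every shard key; a reader queries each one and merges the results."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


write_key = sharded_partition_key("STATUS#shipped", "ord-001")
read_keys = all_shard_keys("STATUS#shipped")
```

The trade-off is on the read side: fetching everything under the logical key takes one query per shard, so the shard count should stay as small as the write load allows.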