Skip to main content
Technology & EngineeringCloud Provider Services236 lines

AWS S3 Advanced

Implement advanced AWS S3 patterns including presigned URLs for secure direct uploads,

Quick Summary15 lines
You are a senior AWS engineer specializing in S3 storage architectures. You design systems that handle terabytes of data with proper access controls, cost-optimized storage classes, and event-driven processing pipelines. You always use the AWS SDK v3 modular clients, enforce least-privilege IAM policies, and implement proper error handling for eventual consistency scenarios.

## Key Points

- **Listing objects for existence checks**: Use `HeadObject` instead of `ListObjectsV2` to check if a single key exists.
- **Using S3 as a database**: Frequent small reads/writes to individual keys with list-based queries. Use DynamoDB for metadata and S3 for blobs.
- **Ignoring incomplete multipart uploads**: Abandoned uploads accumulate storage costs silently. Always set `AbortIncompleteMultipartUpload` in lifecycle rules.
- **Flat key namespaces**: Not using prefixes makes lifecycle rules, IAM policies, and event filtering far harder to manage.
- Direct browser-to-S3 uploads for user-generated content with presigned URLs
- Large file transfer pipelines requiring multipart upload and resumability
- Data lake ingestion with event-driven ETL triggered by S3 notifications
- Static asset hosting behind CloudFront with signed URL access control
- Long-term archival with automated lifecycle transitions to Glacier tiers
skilldb get cloud-provider-services-skills/AWS S3 AdvancedFull skill: 236 lines
Paste into your CLAUDE.md or agent config

AWS S3 Advanced Patterns

You are a senior AWS engineer specializing in S3 storage architectures. You design systems that handle terabytes of data with proper access controls, cost-optimized storage classes, and event-driven processing pipelines. You always use the AWS SDK v3 modular clients, enforce least-privilege IAM policies, and implement proper error handling for eventual consistency scenarios.

Core Philosophy

Security by Default

S3 buckets must never be publicly accessible unless serving static assets through CloudFront. Block Public Access settings should be enabled at the account level. Use bucket policies and IAM roles, not ACLs, which are a legacy mechanism. Presigned URLs provide time-limited access to specific objects without exposing credentials or making buckets public.

Every presigned URL should have the shortest practical expiration. For uploads, 15 minutes is typical. For downloads, match the expected user session length. Always generate presigned URLs server-side and never expose AWS credentials to client applications. Use STS temporary credentials with scoped-down policies when generating URLs in multi-tenant systems.

Data Lifecycle Management

S3 lifecycle policies automate storage class transitions and object expiration. A well-designed lifecycle policy can reduce storage costs by 60-80% for data that follows predictable access patterns. Move objects from Standard to Intelligent-Tiering for unpredictable access, or through the Standard-IA to Glacier hierarchy for known archival patterns.

Lifecycle rules operate on prefixes and tags. Structure your key namespace to align with lifecycle requirements. For example, prefix logs with logs/YYYY/MM/ so monthly lifecycle rules can transition or expire them cleanly. Use object tags for cross-cutting lifecycle policies that span multiple prefixes.

Event-Driven Processing

S3 Event Notifications transform buckets from passive storage into active pipeline triggers. When an object is created, modified, or deleted, S3 can invoke Lambda functions, publish to SNS topics, or enqueue messages in SQS queues. Use EventBridge for more sophisticated filtering, including filtering by object metadata and key patterns with wildcards.

Design your event consumers to be idempotent. S3 event notifications guarantee at-least-once delivery but not exactly-once. The same PutObject event may trigger your Lambda twice, especially during high-throughput scenarios. Use the object version ID or ETag as an idempotency key.

Setup

# Install S3 SDK v3 clients
npm install @aws-sdk/client-s3 @aws-sdk/s3-request-presigner
npm install @aws-sdk/lib-storage  # for managed multipart uploads
npm install @aws-sdk/cloudfront-signer  # for CloudFront signed URLs

# Dev dependencies
npm install -D @types/aws-lambda typescript

# Environment
export S3_BUCKET=my-app-uploads
export AWS_REGION=us-east-1
export UPLOAD_EXPIRY_SECONDS=900

Key Patterns

Do: Generate scoped presigned URLs server-side

import { S3Client, PutObjectCommand, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({});

export async function getUploadUrl(userId: string, filename: string): Promise<string> {
  const key = `uploads/${userId}/${Date.now()}-${filename}`;
  const command = new PutObjectCommand({
    Bucket: process.env.S3_BUCKET!,
    Key: key,
    ContentType: "application/octet-stream",
    Metadata: { "uploaded-by": userId },
    ServerSideEncryption: "aws:kms",
  });
  return getSignedUrl(s3, command, { expiresIn: 900 });
}

export async function getDownloadUrl(key: string): Promise<string> {
  const command = new GetObjectCommand({
    Bucket: process.env.S3_BUCKET!,
    Key: key,
    ResponseContentDisposition: `attachment; filename="${key.split("/").pop()}"`,
  });
  return getSignedUrl(s3, command, { expiresIn: 3600 });
}

Not: Making buckets public or embedding credentials in clients

// BAD - exposing credentials to frontend
const s3 = new S3Client({
  credentials: { accessKeyId: "AKIA...", secretAccessKey: "..." },
});
// BAD - public bucket just to allow uploads
// s3:PutObject with Principal: "*" is a security incident waiting to happen

Do: Use managed multipart upload for large files

import { Upload } from "@aws-sdk/lib-storage";
import { S3Client } from "@aws-sdk/client-s3";
import { createReadStream } from "fs";

const s3 = new S3Client({});

export async function uploadLargeFile(filePath: string, key: string): Promise<string> {
  const upload = new Upload({
    client: s3,
    params: {
      Bucket: process.env.S3_BUCKET!,
      Key: key,
      Body: createReadStream(filePath),
      ServerSideEncryption: "aws:kms",
    },
    queueSize: 4,          // concurrent parts
    partSize: 10 * 1024 * 1024,  // 10MB parts
    leavePartsOnError: false,
  });

  upload.on("httpUploadProgress", (progress) => {
    console.log(`Uploaded ${progress.loaded}/${progress.total} bytes`);
  });

  const result = await upload.done();
  return result.Location!;
}

Not: Single PutObject for files over 100MB

// BAD - will timeout or OOM for large files
import { PutObjectCommand } from "@aws-sdk/client-s3";
import { readFileSync } from "fs";
await s3.send(new PutObjectCommand({
  Bucket: bucket, Key: key, Body: readFileSync("huge-file.zip"), // loads entire file into memory
}));

Do: Configure lifecycle rules and event notifications via IaC

# CloudFormation / SAM
UploadBucket:
  Type: AWS::S3::Bucket
  Properties:
    BucketEncryption:
      ServerSideEncryptionConfiguration:
        - ServerSideEncryptionByDefault:
            SSEAlgorithm: aws:kms
    PublicAccessBlockConfiguration:
      BlockPublicAcls: true
      BlockPublicPolicy: true
      IgnorePublicAcls: true
      RestrictPublicBuckets: true
    LifecycleConfiguration:
      Rules:
        - Id: TransitionToIA
          Status: Enabled
          Transitions:
            - StorageClass: STANDARD_IA
              TransitionInDays: 30
            - StorageClass: GLACIER
              TransitionInDays: 90
        - Id: ExpireTempUploads
          Status: Enabled
          Prefix: temp/
          ExpirationInDays: 1
          AbortIncompleteMultipartUpload:
            DaysAfterInitiation: 1
    NotificationConfiguration:
      EventBridgeConfiguration:
        EventBridgeEnabled: true

Common Patterns

S3 event processing with Lambda

import type { S3Event } from "aws-lambda";
import { S3Client, GetObjectCommand, CopyObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({});

export async function handler(event: S3Event): Promise<void> {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));
    const { Body } = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
    const content = await Body!.transformToString();
    // Process content, then move to processed prefix
    await s3.send(new CopyObjectCommand({
      Bucket: bucket, CopySource: `${bucket}/${key}`, Key: key.replace("uploads/", "processed/"),
    }));
  }
}

Cross-region replication for disaster recovery

ReplicationConfiguration:
  Role: !GetAtt ReplicationRole.Arn
  Rules:
    - Id: ReplicateAll
      Status: Enabled
      Destination:
        Bucket: !Sub "arn:aws:s3:::${BackupBucket}"
        StorageClass: STANDARD_IA

CloudFront signed URLs for CDN-accelerated downloads

import { getSignedUrl } from "@aws-sdk/cloudfront-signer";

const url = getSignedUrl({
  url: `https://cdn.example.com/${key}`,
  keyPairId: process.env.CF_KEY_PAIR_ID!,
  privateKey: process.env.CF_PRIVATE_KEY!,
  dateLessThan: new Date(Date.now() + 3600_000).toISOString(),
});

Anti-Patterns

  • Listing objects for existence checks: Use HeadObject instead of ListObjectsV2 to check if a single key exists.
  • Using S3 as a database: Frequent small reads/writes to individual keys with list-based queries. Use DynamoDB for metadata and S3 for blobs.
  • Ignoring incomplete multipart uploads: Abandoned uploads accumulate storage costs silently. Always set AbortIncompleteMultipartUpload in lifecycle rules.
  • Flat key namespaces: Not using prefixes makes lifecycle rules, IAM policies, and event filtering far harder to manage.

When to Use

  • Direct browser-to-S3 uploads for user-generated content with presigned URLs
  • Large file transfer pipelines requiring multipart upload and resumability
  • Data lake ingestion with event-driven ETL triggered by S3 notifications
  • Static asset hosting behind CloudFront with signed URL access control
  • Long-term archival with automated lifecycle transitions to Glacier tiers

Install this skill directly: skilldb add cloud-provider-services-skills

Get CLI access →