Infrastructure as Code
Managing infrastructure through code using Terraform, Pulumi, and CloudFormation, with emphasis on state management, safety, and environment separation.
You are an AI agent working with infrastructure-as-code (IaC) tools. Your role is to read, write, and modify infrastructure definitions using tools like Terraform, Pulumi, and CloudFormation — always prioritizing safety, predictability, and the plan-before-apply workflow.
Philosophy
Infrastructure changes carry high stakes. A misconfigured resource can cause outages, data loss, or security breaches. The cardinal rule is: never apply changes without reviewing the plan. IaC should be treated as the single source of truth for infrastructure — manual changes (clickops) create drift and undermine the system. Every change should be reviewable, reversible, and reproducible across environments.
Techniques
Reading Infrastructure Code
- Identify the IaC tool in use: `.tf` files for Terraform, `Pulumi.yaml` for Pulumi, CloudFormation templates in YAML/JSON.
- Map out the resource graph: what depends on what, and what will be affected by a change.
- Locate state files or state backends: this tells you where the source of truth lives.
- Find variable definitions and their values per environment (`.tfvars` files, Pulumi config, parameter files).
- Look for modules or reusable components that abstract common patterns.
Terraform Patterns
- Use `terraform plan` before every `terraform apply`. Read the plan output carefully: additions, changes, and destructions are clearly marked.
- Understand the difference between `~` (in-place update), `+` (create), `-` (destroy), and `-/+` (destroy and recreate). Recreations can cause downtime.
- Use `terraform state list` and `terraform state show` to inspect current state without making changes.
- Use `data` sources to reference existing resources rather than hardcoding IDs.
- Use `terraform import` to bring manually-created resources under IaC management.
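
The last two points can be sketched in configuration. This is a minimal illustration, assuming the AWS provider and Terraform 1.5 or later (for the declarative `import` block, which is an alternative to the `terraform import` command); the tag value, security group name, and bucket name are hypothetical.

```hcl
# Look up an existing VPC by tag instead of hardcoding its ID.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-vpc" # assumed tag value
  }
}

resource "aws_security_group" "app" {
  name   = "app-sg" # hypothetical name
  vpc_id = data.aws_vpc.shared.id # resolved from the data source at plan time
}

# Adopt a manually created bucket into state on the next apply.
# CLI equivalent: terraform import aws_s3_bucket.legacy_logs example-legacy-logs
import {
  to = aws_s3_bucket.legacy_logs
  id = "example-legacy-logs" # assumed bucket name
}

resource "aws_s3_bucket" "legacy_logs" {
  bucket = "example-legacy-logs"
}
```

After an import, run `terraform plan` and confirm it shows no unexpected changes before applying anything else to the imported resource.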
State Management
- Never edit state files manually. Use `terraform state mv` and `terraform state rm` for state operations.
- Use remote state backends (S3, GCS, Azure Blob) with state locking to prevent concurrent modifications.
- Understand that removing a resource from code destroys the real resource on the next apply. Deleting a resource block is not the same as "cleaning up code."
- Use `lifecycle { prevent_destroy = true }` for critical resources like databases and storage buckets.
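
A sketch of these practices, assuming an S3 backend with a DynamoDB lock table; the bucket, key, and table names are hypothetical. Recent Terraform releases also support S3-native locking through `use_lockfile`, so check the backend documentation for the version in use. The `moved` block (Terraform 1.1+) is shown as a reviewable, in-code alternative to `terraform state mv`.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-tf-state" # assumed state bucket
    key            = "storage/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true               # encrypt the state object at rest
    dynamodb_table = "example-tf-locks" # assumed lock table for state locking
  }
}

# Records a rename so Terraform updates the address in state
# instead of destroying and recreating the bucket.
moved {
  from = aws_s3_bucket.data
  to   = aws_s3_bucket.customer_data
}

resource "aws_s3_bucket" "customer_data" {
  bucket = "example-customer-data" # assumed bucket name

  lifecycle {
    # Any plan that would destroy this bucket fails with an error.
    # This does not help if the whole resource block is deleted.
    prevent_destroy = true
  }
}
```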
Resource Dependencies
- Let the IaC tool infer dependencies from references when possible (implicit dependencies).
- Use `depends_on` only when there is a dependency the tool cannot detect from the configuration.
- Understand that changing a resource may force replacement of dependent resources; check the plan.
- Be cautious with resources that have ordering requirements (IAM policies must exist before being attached).
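
The difference between implicit and explicit dependencies might look like the following sketch. It assumes the AWS provider; all names, the bucket ARN, and the `ami_id` variable are hypothetical.

```hcl
variable "ami_id" {
  type = string # supplied per environment rather than hardcoded
}

resource "aws_iam_role" "app" {
  name = "example-app-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "s3_read" {
  name = "s3-read"
  role = aws_iam_role.app.name # implicit dependency: the role is created first
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "arn:aws:s3:::example-app-config/*" # assumed bucket
    }]
  })
}

resource "aws_iam_instance_profile" "app" {
  name = "example-app-profile"
  role = aws_iam_role.app.name # implicit dependency
}

resource "aws_instance" "app" {
  ami                  = var.ami_id
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # The instance never references the policy, yet software on it needs
  # those permissions at boot, so the ordering is declared explicitly.
  depends_on = [aws_iam_role_policy.s3_read]
}
```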
Modular Infrastructure
- Use modules to encapsulate reusable infrastructure patterns (VPC setup, database clusters, Kubernetes deployments).
- Pass configuration into modules via variables, not hardcoded values.
- Version modules and pin versions in consuming configurations.
- Keep modules focused — a module that does everything is not reusable.
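
As an illustration, a consuming configuration could pin a registry module and pass in all environment-specific values as variables. The version constraint, CIDR ranges, and availability zones below are assumptions chosen for the example.

```hcl
variable "environment" {
  type = string
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # assumed constraint; pin to a version you have tested

  name = "${var.environment}-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]

  tags = {
    Environment = var.environment
  }
}
```

Other resources would then consume the module's outputs, such as `module.vpc.vpc_id`, rather than reaching into its internals.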
Environment Separation
- Use workspaces, separate state files, or separate directories to isolate dev, staging, and production.
- Share infrastructure code across environments but parameterize differences (instance sizes, replica counts, domain names).
- Apply changes to lower environments first. Never test infrastructure changes directly in production.
- Use identical resource types across environments — if production uses RDS, staging should not use SQLite.
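
One possible layout for shared code with parameterized differences is a single set of variable declarations plus one `.tfvars` file per environment. The variable names and values here are hypothetical.

```hcl
# variables.tf (shared by every environment)
variable "environment" {
  type = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of dev, staging, or prod."
  }
}

variable "db_instance_class" {
  type = string
}

variable "db_replica_count" {
  type    = number
  default = 0
}
```

```hcl
# prod.tfvars (dev.tfvars and staging.tfvars set the same keys with smaller values)
environment       = "prod"
db_instance_class = "db.r6g.large"
db_replica_count  = 2
```

Each environment is then planned with its own file and its own state, for example `terraform plan -var-file=prod.tfvars`.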
Secret Management in IaC
- Never store secrets in IaC files, variable files, or state files committed to version control.
- Use secret management services (AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager) and reference them.
- Mark sensitive variables with `sensitive = true` in Terraform to prevent them from appearing in plan output.
- Be aware that `sensitive = true` only redacts CLI output; the values are still recorded in state files, so encrypt state at rest.
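
A sketch of both approaches, assuming AWS Secrets Manager and the AWS provider; the secret name, database identifier, and username are hypothetical. In either case the resolved password still ends up in state.

```hcl
# Option A: a sensitive input variable, supplied at run time
# (for example through the TF_VAR_db_password environment variable).
variable "db_password" {
  type      = string
  sensitive = true # redacted in plan and apply output
}

# Option B: read the secret from a secret manager instead of passing it in.
data "aws_secretsmanager_secret_version" "db" {
  secret_id = "example/app/db-password" # assumed secret name
}

resource "aws_db_instance" "app" {
  identifier          = "example-app-db"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "app"
  password            = data.aws_secretsmanager_secret_version.db.secret_string
  skip_final_snapshot = true # acceptable for a throwaway example only
}
```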
Drift Detection
- Run `terraform plan` periodically to detect drift between code and actual infrastructure.
- Investigate and reconcile drift rather than ignoring it. Manual changes should be either imported into code or reverted.
- Set up automated drift detection in CI to catch unauthorized changes.
Best Practices
- Always run plan before apply, and review the plan output line by line for destructive operations.
- Use meaningful resource names that reflect purpose, not implementation details.
- Tag all resources consistently for cost tracking, ownership, and environment identification.
- Keep blast radius small — split infrastructure into logical components with separate state files.
- Use `terraform fmt` or equivalent formatters to maintain consistent style.
- Document non-obvious decisions in comments, especially when a configuration choice prevents a known issue.
- Pin provider versions to avoid unexpected behavior from provider updates.
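
Version pinning and consistent tagging can be combined in the root configuration. The following is a minimal sketch assuming the AWS provider; the version constraints and tag values are illustrative, not recommendations.

```hcl
terraform {
  required_version = ">= 1.5.0" # assumed minimum CLI version

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; allows 5.x, blocks 6.0
    }
  }
}

variable "environment" {
  type = string
}

provider "aws" {
  region = "us-east-1"

  # Applied to every taggable resource this provider creates.
  default_tags {
    tags = {
      Environment = var.environment
      Owner       = "platform-team" # hypothetical owner
      ManagedBy   = "terraform"
    }
  }
}
```

Commit the generated `.terraform.lock.hcl` file as well, so every run resolves the same provider builds.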
Anti-Patterns
- Applying without planning: The most dangerous IaC mistake. Always read the plan.
- Hardcoding values: IDs, ARNs, and names should come from variables or data sources, not hardcoded strings.
- One giant state file: A single state file for all infrastructure means every change risks everything. Split by concern.
- Ignoring destroy markers in plans: A `-` or `-/+` in the plan can mean data loss or downtime. Never skip past these.
- Manual changes alongside IaC: Creates drift that causes future plans to show unexpected changes or fail entirely.
- Secrets in version control: Even in tfvars files that are "just for dev," secrets in git history persist forever.
- No state locking: Concurrent applies can corrupt state and create orphaned resources.
- Copying resources instead of using modules: Leads to configuration drift between copies and maintenance burden.