Infrastructure as Code
Provision and manage cloud infrastructure through code rather than manual
Core Philosophy
Infrastructure as Code treats infrastructure provisioning and configuration as a software engineering discipline. Every server, network, database, and permission is defined in version-controlled code that can be reviewed, tested, and reproduced deterministically. The principle is simple: if it is not in code, it does not exist. Manual changes are the enemy of reliability, auditability, and disaster recovery.
Key Techniques
- Declarative Definitions: Describe the desired end state of infrastructure and let the tool determine the steps to achieve it. Terraform HCL and CloudFormation YAML are declarative; step-by-step provisioning scripts are imperative.
- State Management: Maintain a state file that maps declared resources to real infrastructure. Remote state backends (S3, GCS) with locking prevent concurrent modifications and enable team collaboration.
- Modular Composition: Package reusable infrastructure components as modules with well-defined inputs and outputs, enabling consistent patterns across teams.
- Plan Before Apply: Always generate and review an execution plan before making changes. The plan shows exactly what will be created, modified, or destroyed.
- Drift Detection: Periodically compare actual infrastructure state against the declared configuration to detect and remediate manual changes.
- Environment Parity: Use the same modules with different variable files so that dev, staging, and production share the same structure and differ only in their input values.
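The declarative, plan-before-apply, and drift-detection ideas above all rest on one operation: comparing the declared configuration with what actually exists. The following is a minimal Python sketch of that comparison, not how Terraform or CloudFormation implement it. The names Plan and compute_plan and the dictionary layout of resources are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """Illustrative execution plan: what would change if we applied."""
    create: dict   # declared but not present in real infrastructure
    update: dict   # present, but configuration differs from the declaration
    destroy: list  # present in real infrastructure but no longer declared

    @property
    def is_empty(self) -> bool:
        # An empty plan means actual state matches the declaration (no drift).
        return not (self.create or self.update or self.destroy)


def compute_plan(desired: dict, actual: dict) -> Plan:
    """Diff the declared (desired) resources against the actual ones.

    Both arguments map a resource name to its configuration dict.
    """
    create = {n: cfg for n, cfg in desired.items() if n not in actual}
    update = {n: cfg for n, cfg in desired.items()
              if n in actual and actual[n] != cfg}
    destroy = sorted(n for n in actual if n not in desired)
    return Plan(create=create, update=update, destroy=destroy)


if __name__ == "__main__":
    desired = {"vpc": {"cidr": "10.0.0.0/16"}, "db": {"size": "large"}}
    actual = {"db": {"size": "small"}, "legacy_vm": {"size": "tiny"}}
    print(compute_plan(desired, actual))
```

Running the same function on a schedule, with the live inventory as actual, is one way to think about drift detection: any non-empty plan means someone or something changed infrastructure outside the code.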
Best Practices
- Store all IaC in version control, either alongside application code or in a dedicated infrastructure repository, and hold it to the same review process as application code.
- Use remote state with locking. Local state files cause conflicts and data loss in team environments.
- Never hardcode secrets in IaC files. Reference secret managers or inject at apply time via environment variables.
- Tag every resource with owner, environment, project, and cost center for visibility and cost attribution.
- Write automated tests for infrastructure modules using tools like Terratest or kitchen-terraform.
- Pin provider and module versions to prevent unexpected behavior from upstream updates.
- Use workspaces or directory structures to isolate environments. Never share state between production and non-production.
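Several of these practices can be enforced mechanically in a pipeline. As one example, the tagging rule can be checked before apply. The sketch below assumes resources are available as a dictionary of name to configuration with a "tags" entry; that layout, the tag key spellings, and the function name missing_tags are assumptions made for illustration, not any tool's real format.

```python
# Required tag keys, taken from the tagging practice above.
# The exact spelling (e.g. "cost_center") is an assumption.
REQUIRED_TAGS = ("owner", "environment", "project", "cost_center")


def missing_tags(resources: dict) -> dict:
    """Return {resource_name: [missing tag keys]} for non-compliant resources.

    A tag counts as missing if the key is absent or its value is empty.
    """
    problems = {}
    for name, cfg in resources.items():
        tags = cfg.get("tags") or {}
        missing = [key for key in REQUIRED_TAGS if not tags.get(key)]
        if missing:
            problems[name] = missing
    return problems


if __name__ == "__main__":
    resources = {
        "web_server": {"tags": {"owner": "team-a", "environment": "prod",
                                "project": "shop", "cost_center": "cc-42"}},
        "scratch_bucket": {"tags": {"owner": "team-b"}},
    }
    # A CI job could fail the build whenever this dict is non-empty.
    print(missing_tags(resources))
```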
Common Patterns
- Hub-and-Spoke Networking: A central networking module provisions VPCs, subnets, and peering; application modules consume network outputs as inputs.
- GitOps for Infrastructure: Merge to main triggers automated plan and apply, with pull request plans serving as the review mechanism.
- Layered Stacks: Separate long-lived foundational infrastructure (networking, IAM) from frequently changing application infrastructure to reduce blast radius.
- Self-Service Modules: Provide pre-approved, parameterized modules that application teams can use without deep infrastructure knowledge.
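Hub-and-spoke networking and layered stacks both imply an ordering: a stack that consumes another stack's outputs must be applied after it. A small sketch of deriving that order with Python's standard graphlib module follows. The stack names and the apply_order helper are hypothetical; real tools track these dependencies in their own ways.

```python
from graphlib import TopologicalSorter


def apply_order(stacks: dict[str, set[str]]) -> list[str]:
    """Return an order in which stacks can be applied.

    `stacks` maps each stack name to the set of stacks whose outputs it
    consumes. Dependencies always come before their consumers. A circular
    dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(stacks).static_order())


if __name__ == "__main__":
    # Foundational layers have no dependencies; application stacks
    # consume their outputs (for example subnet IDs or role ARNs).
    stacks = {
        "networking": set(),
        "iam": set(),
        "app_a": {"networking", "iam"},
        "app_b": {"networking"},
    }
    print(apply_order(stacks))
```

Keeping the foundational layers at the root of this graph is what limits blast radius: a change to app_a never requires re-applying networking, only the reverse.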
Anti-Patterns
- ClickOps: Making infrastructure changes through cloud console UIs. These changes are untracked, unrepeatable, and invisible to the team.
- Monolithic state files containing hundreds of resources. A single failed apply can block all infrastructure changes.
- Copying and pasting infrastructure code instead of creating reusable modules.
- Ignoring state file security. State contains sensitive data including resource IDs, IP addresses, and sometimes passwords.
- Applying changes without reviewing the plan first. A single misconfiguration can destroy production databases.
- Not implementing proper IAM for the IaC pipeline itself. The CI/CD service account that applies infrastructure changes is a high-value target.
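The risk of applying an unreviewed plan can be reduced with an automated guard that flags deletions of critical resources before apply. The sketch below assumes the plan is available as JSON with a "resource_changes" list whose entries carry "address", "type", and "change.actions", a shape modeled on Terraform's JSON plan output; verify it against your tool's documentation. The function name and the protected type list are hypothetical.

```python
import json


def destructive_changes(plan_json: str, protected_types: set) -> list:
    """Return addresses of protected resources the plan would delete.

    A replacement (delete followed by create) also counts, because the
    original resource and its data are destroyed.
    """
    plan = json.loads(plan_json)
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        if "delete" in actions and change.get("type") in protected_types:
            flagged.append(change["address"])
    return flagged


if __name__ == "__main__":
    sample = json.dumps({"resource_changes": [
        {"address": "aws_db_instance.main", "type": "aws_db_instance",
         "change": {"actions": ["delete", "create"]}},
        {"address": "aws_instance.web", "type": "aws_instance",
         "change": {"actions": ["update"]}},
    ]})
    flagged = destructive_changes(sample, {"aws_db_instance"})
    if flagged:
        # In a pipeline, this is where the job would stop and
        # require explicit human approval.
        print("Plan would destroy protected resources:", flagged)
```

A guard like this complements human plan review rather than replacing it.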