
Terraform Testing

Testing Terraform configurations with native tests, Terratest, plan validation, and policy-as-code

Quick Summary
You are an expert in testing Terraform configurations and infrastructure as code.

## Key Points

1. **Static analysis** — `terraform fmt`, `terraform validate`, TFLint
2. **Plan validation** — Inspect the plan output for expected changes
3. **Native tests** — `terraform test` with `.tftest.hcl` files (Terraform 1.6+)
4. **Integration tests** — Terratest or similar tools that apply, verify, and destroy real infrastructure
5. **Policy-as-code** — OPA/Conftest, Sentinel, or Checkov for compliance rules
- **Start with static analysis.** `terraform validate`, `terraform fmt -check`, and TFLint catch most errors with zero cost.
- **Use `terraform test` for module contracts.** Write plan-mode tests that verify outputs and resource configurations without creating infrastructure.
- **Run Terratest with `t.Parallel()`** and use random naming to allow concurrent test runs.
- **Always `defer terraform.Destroy`** in Terratest to clean up resources, even if assertions fail.
- **Enforce policy-as-code in CI.** Use Conftest, Sentinel, or Checkov as a mandatory gate before apply.
- **Test modules, not root configurations.** Modules have clear interfaces that are easier to test. Root configurations are better validated through plan reviews and policy checks.
- **Run `terraform init` before `terraform test`.** The test framework requires initialized providers.

## Quick Example

```bash
# Run TFLint
tflint --init
tflint --recursive
```

```bash
# Validate syntax and internal consistency (no provider APIs called)
terraform init -backend=false
terraform validate
```

Testing — Terraform

You are an expert in testing Terraform configurations and infrastructure as code.

Overview

Testing Terraform code spans multiple layers: static analysis and linting, plan-time validation, integration testing with real infrastructure, and policy-as-code enforcement. Terraform 1.6 introduced a native testing framework (terraform test) that runs HCL-based test files, reducing the need for external tools in many scenarios. For more complex assertions or multi-step workflows, Terratest (Go-based) remains a popular choice.

Core Concepts

Testing Pyramid for Terraform

  1. Static analysis — terraform fmt, terraform validate, TFLint
  2. Plan validation — Inspect the plan output for expected changes
  3. Native tests — terraform test with .tftest.hcl files (Terraform 1.6+)
  4. Integration tests — Terratest or similar tools that apply, verify, and destroy real infrastructure
  5. Policy-as-code — OPA/Conftest, Sentinel, or Checkov for compliance rules

Implementation Patterns

Static Analysis with TFLint

# .tflint.hcl
config {
  call_module_type = "local"
}

plugin "aws" {
  enabled = true
  version = "0.30.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

rule "terraform_naming_convention" {
  enabled = true
  format  = "snake_case"
}
# Run TFLint
tflint --init
tflint --recursive

Terraform Validate

# Validate syntax and internal consistency (no provider APIs called)
terraform init -backend=false
terraform validate

Native Tests (Terraform 1.6+)

Test files use the .tftest.hcl extension and live either alongside the configuration or in a tests/ directory.

# tests/basic.tftest.hcl
provider "aws" {
  region = "us-east-1"
}

variables {
  project     = "test"
  environment = "dev"
  vpc_cidr    = "10.0.0.0/16"
}

run "creates_vpc" {
  command = plan

  assert {
    condition     = aws_vpc.main.cidr_block == "10.0.0.0/16"
    error_message = "VPC CIDR block did not match expected value"
  }

  assert {
    condition     = aws_vpc.main.enable_dns_hostnames == true
    error_message = "DNS hostnames should be enabled"
  }
}

run "creates_correct_number_of_subnets" {
  command = plan

  variables {
    availability_zones = ["us-east-1a", "us-east-1b"]
  }

  assert {
    condition     = length(aws_subnet.public) == 2
    error_message = "Expected 2 public subnets"
  }
}
# tests/integration.tftest.hcl — actually creates infrastructure
variables {
  project     = "tftest"
  environment = "test"
}

run "deploy_and_verify" {
  command = apply

  assert {
    condition     = output.vpc_id != ""
    error_message = "VPC ID output should not be empty"
  }

  assert {
    condition     = length(output.public_subnet_ids) > 0
    error_message = "Should have at least one public subnet"
  }
}
# tests/module.tftest.hcl — testing a module
run "test_vpc_module" {
  command = plan

  module {
    source = "./modules/vpc"
  }

  variables {
    cidr_block         = "10.0.0.0/16"
    availability_zones = ["us-east-1a", "us-east-1b"]
    project            = "test"
    environment        = "ci"
  }

  assert {
    condition     = output.vpc_id != null
    error_message = "Module should output a VPC ID"
  }
}
# Run all tests
terraform test

# Run tests with verbose output
terraform test -verbose

# Run a specific test file
terraform test -filter=tests/basic.tftest.hcl

Terratest (Go)

package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/aws"
	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestVpcModule(t *testing.T) {
	t.Parallel()

	terraformOptions := &terraform.Options{
		TerraformDir: "../modules/vpc",
		Vars: map[string]interface{}{
			"cidr_block":         "10.0.0.0/16",
			"availability_zones": []string{"us-east-1a", "us-east-1b"},
			"project":            "terratest",
			"environment":        "test",
		},
		NoColor: true,
	}

	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)

	vpcID := terraform.Output(t, terraformOptions, "vpc_id")
	assert.NotEmpty(t, vpcID)

	publicSubnetIDs := terraform.OutputList(t, terraformOptions, "public_subnet_ids")
	assert.Equal(t, 2, len(publicSubnetIDs))

	// Verify the VPC actually exists in AWS
	vpc := aws.GetVpcById(t, vpcID, "us-east-1")
	assert.Equal(t, "10.0.0.0/16", *vpc.CidrBlock)
}

func TestVpcModulePlanOnly(t *testing.T) {
	t.Parallel()

	terraformOptions := &terraform.Options{
		TerraformDir: "../modules/vpc",
		Vars: map[string]interface{}{
			"cidr_block":         "10.0.0.0/16",
			"availability_zones": []string{"us-east-1a"},
			"project":            "plantest",
			"environment":        "test",
		},
		PlanFilePath: "/tmp/tfplan",
	}

	// Only runs plan; no infrastructure is created
	planStruct := terraform.InitAndPlanAndShowWithStruct(t, terraformOptions)

	// ResourceChangesMap is keyed by resource address, not by action,
	// so inspect each planned change's action set instead
	for _, rc := range planStruct.ResourceChangesMap {
		assert.False(t, rc.Change.Actions.Delete(), "plan should not delete %s", rc.Address)
	}
}

Plan Validation with JSON Output

# Generate plan as JSON for automated analysis
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
# validate_plan.py — check plan output for compliance
import json
import sys

with open("plan.json") as f:
    plan = json.load(f)

errors = []

for change in plan.get("resource_changes", []):
    if change["type"] == "aws_s3_bucket":
        after = change.get("change", {}).get("after", {})
        # Check that all S3 buckets have versioning configured
        # (note: with AWS provider v4+, versioning moved to a separate
        # aws_s3_bucket_versioning resource, so adjust the check accordingly)
        if not after.get("versioning"):
            errors.append(
                f"S3 bucket {change['address']} missing versioning"
            )

    if change["type"] == "aws_instance":
        after = change.get("change", {}).get("after", {})
        tags = after.get("tags", {})
        if "Environment" not in tags:
            errors.append(
                f"Instance {change['address']} missing Environment tag"
            )

if errors:
    for e in errors:
        print(f"FAIL: {e}", file=sys.stderr)
    sys.exit(1)

print("All plan validations passed.")

Policy-as-Code with Conftest (OPA)

# policy/terraform.rego
package terraform

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_security_group_rule"
    resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
    resource.change.after.from_port == 22
    msg := sprintf("Security group rule %s allows SSH from 0.0.0.0/0", [resource.address])
}

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_instance"
    not resource.change.after.metadata_options
    msg := sprintf("Instance %s must configure IMDSv2 metadata options", [resource.address])
}

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_db_instance"
    resource.change.after.publicly_accessible == true
    msg := sprintf("RDS instance %s must not be publicly accessible", [resource.address])
}
# Run Conftest against the JSON plan
terraform show -json tfplan > plan.json
conftest test plan.json --policy policy/

Checkov for Security Scanning

# Scan Terraform files directly
checkov -d . --framework terraform

# Scan a plan file
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
checkov -f plan.json --framework terraform_plan

Best Practices

  • Start with static analysis. terraform validate, terraform fmt -check, and TFLint catch most errors with zero cost.
  • Use terraform test for module contracts. Write plan-mode tests that verify outputs and resource configurations without creating infrastructure.
  • Reserve integration tests for critical paths. Real infrastructure tests are slow and expensive. Use them for modules that are hard to validate via plan alone (e.g., IAM policies, networking connectivity).
  • Run Terratest with t.Parallel() and use random naming to allow concurrent test runs.
  • Always defer terraform.Destroy in Terratest to clean up resources, even if assertions fail.
  • Enforce policy-as-code in CI. Use Conftest, Sentinel, or Checkov as a mandatory gate before apply.
  • Test modules, not root configurations. Modules have clear interfaces that are easier to test. Root configurations are better validated through plan reviews and policy checks.
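Taken together, the cheap gates above can run as a single pre-apply CI step. A minimal sketch of such a pipeline (assumes TFLint, Conftest, and Checkov are installed on the runner and a policy/ directory like the one shown earlier; adapt the stage boundaries to your CI system):

```sh
#!/usr/bin/env sh
set -eu  # any failing gate aborts the pipeline before apply

# Stage 1: static analysis (no cloud credentials required)
terraform init -backend=false
terraform fmt -check -recursive
terraform validate
tflint --recursive

# Stage 2: plan and policy gates (credentials and backend required)
terraform init -reconfigure
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
conftest test plan.json --policy policy/
checkov -f plan.json --framework terraform_plan

# Stage 3: apply exactly the plan that passed the gates
terraform apply tfplan
```

Applying the saved plan file (rather than re-planning) guarantees that what reaches production is exactly what the policy gates inspected.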

Core Philosophy

Testing Terraform is about building confidence that your infrastructure changes will do what you expect and nothing else. Unlike application code, where a failed test means a broken feature, a failed infrastructure test can mean a production outage, data loss, or a security breach. The stakes justify investing in a layered testing strategy that catches different classes of errors at different speeds and costs.

The testing pyramid applies to infrastructure just as it does to application code. Static analysis and plan-time validation are your foundation: fast, cheap, and able to catch the majority of errors. Native terraform test and plan assertions form the middle tier, verifying module contracts without provisioning real resources. Integration tests with real infrastructure are the apex: slow and expensive, but irreplaceable for validating behaviors that cannot be observed from a plan (IAM policy effectiveness, network connectivity, DNS resolution).

Policy-as-code is testing from the organization's perspective rather than the developer's. While unit and integration tests ask "does this configuration do what I intended," policy checks ask "does this configuration comply with what the organization allows." Both are necessary, and neither substitutes for the other. A configuration that passes all developer tests but opens SSH to the internet should be caught by policy before it reaches production.

Anti-Patterns

  • Testing only in production. Skipping all forms of testing and relying on plan reviews alone means that the first time you discover a misconfiguration is when it causes an incident. Even basic terraform validate and terraform fmt -check in CI catch errors that plan reviews miss.

  • Writing integration tests for everything. Running real infrastructure for every test is slow, expensive, and creates flaky tests that fail due to API rate limits or transient cloud issues. Reserve integration tests for behaviors that genuinely cannot be validated through plan inspection.

  • Asserting on implementation details. Tests that check the exact number of resources, specific internal attribute values, or resource names break every time you refactor. Assert on outputs and observable behavior instead: "the VPC ID is not empty" rather than "there are exactly 7 resources in the plan."

  • Skipping cleanup in integration tests. Forgetting defer terraform.Destroy in Terratest or not using ephemeral test infrastructure means failed test runs leave orphaned resources accumulating cost. Set up automated cleanup sweeps as a safety net.

  • Policy rules tied to resource names or indices. Conftest or Sentinel rules that reference specific resource addresses (like aws_instance.web[0]) break when the configuration is refactored. Write policies based on resource types, attributes, and tags.

Common Pitfalls

  • Not running terraform init before terraform test. The test framework requires initialized providers.
  • Forgetting cleanup in integration tests. If defer Destroy is missing and the test fails, resources linger and accumulate cost.
  • Testing implementation details instead of behavior. Assert on outputs and observable properties, not on internal resource counts or specific attribute values that may change with refactoring.
  • Slow feedback loops. Integration tests that take 15+ minutes discourage frequent testing. Use plan-mode tests for fast feedback and limit integration tests to nightly or pre-release runs.
  • Conftest/OPA rules that are too specific. Policies tied to exact resource names or indices break when the configuration is refactored. Write rules based on resource types and attributes.
  • Not testing variable validation. Write tests that pass invalid values and confirm the validation block rejects them.
  • Ignoring test infrastructure costs. Set up automated cleanup (e.g., aws-nuke scheduled runs) to catch resources that leaked from failed test runs.
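The variable-validation pitfall above has native support: a run block can feed an invalid value and declare the expected failure via expect_failures. A sketch, assuming the module defines a cidr_block variable with a validation block:

```hcl
# tests/validation.tftest.hcl — invalid input must be rejected
run "rejects_malformed_cidr" {
  command = plan

  variables {
    cidr_block = "not-a-cidr"
  }

  # The run passes only if validation of var.cidr_block fails
  expect_failures = [
    var.cidr_block,
  ]
}
```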
