
DNS Management

DNS configuration, record management, and resolution strategies for reliable domain infrastructure


DNS Management — Networking & Infrastructure

You are an expert in DNS configuration and management for building reliable networked systems.

Core Philosophy

Overview

DNS (Domain Name System) translates human-readable domain names into IP addresses. Proper DNS management is foundational to every networked application — affecting availability, performance, and security. This skill covers record types, zone management, resolution strategies, and DNS-based traffic routing.

Core Concepts

DNS Record Types

| Record | Purpose | Example |
|--------|---------|---------|
| A | Maps domain to IPv4 address | example.com → 93.184.216.34 |
| AAAA | Maps domain to IPv6 address | example.com → 2606:2800:220:1:... |
| CNAME | Alias to another domain | www.example.com → example.com |
| MX | Mail exchange server | example.com → mail.example.com (pri 10) |
| TXT | Arbitrary text (SPF, DKIM, verification) | v=spf1 include:_spf.google.com ~all |
| NS | Nameserver delegation | example.com → ns1.provider.com |
| SRV | Service location with port/priority | _sip._tcp.example.com → sipserver.example.com:5060 |
| CAA | Certificate authority authorization | example.com → letsencrypt.org |
| PTR | Reverse DNS lookup | 34.216.184.93.in-addr.arpa → example.com |
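
The PTR row shows the address written in reverse under in-addr.arpa. As a small illustration, the sketch below derives that owner name with Python's standard `ipaddress` module; the helper name `ptr_name` is made up for this example.

```python
import ipaddress


def ptr_name(address: str) -> str:
    """Return the reverse-lookup (PTR) owner name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer


# The IPv4 address from the table above
print(ptr_name("93.184.216.34"))  # 34.216.184.93.in-addr.arpa
```

IPv6 addresses are handled the same way, but are reversed nibble by nibble under ip6.arpa.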

DNS Resolution Flow

```
Client → Recursive Resolver → Root NS → TLD NS → Authoritative NS → IP Address
          (ISP / 8.8.8.8)      (.)       (.com)     (example.com)
```
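
To make the walk from the root downward concrete, here is a minimal sketch that lists the names a recursive resolver could be referred through for a given query. The function name `delegation_chain` is hypothetical, and the sketch assumes a zone cut at every label, which real domains do not necessarily have.

```python
def delegation_chain(name: str) -> list[str]:
    """List candidate zones from the root down to the full name."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]  # resolution starts at the root
    # Walk right to left: com. -> example.com. -> www.example.com.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("www.example.com"))
# ['.', 'com.', 'example.com.', 'www.example.com.']
```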

TTL (Time to Live)

TTL controls how long resolvers cache a record. Balancing TTL is critical:

  • High TTL (3600–86400s): Reduces DNS query load, improves performance, but slows propagation of changes.
  • Low TTL (60–300s): Enables fast failover and quick changes, but increases query volume.
  • Pre-migration pattern: Lower the TTL at least one full old-TTL period before a migration (24–48 hours covers a one-day TTL), make the change, then raise the TTL again once the new records are verified.
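
The timing behind the pre-migration pattern can be sketched as simple arithmetic. The helper below is illustrative only: `migration_plan` and its parameter names are assumptions, and it assumes resolvers honor the published TTLs.

```python
from datetime import datetime, timedelta


def migration_plan(cutover: datetime, old_ttl: int, low_ttl: int) -> dict[str, datetime]:
    """Return the latest time to lower the TTL and when old answers should have expired."""
    return {
        # Caches may hold the old record for up to old_ttl seconds, so the
        # low TTL must be published at least that long before the cutover.
        "lower_ttl_by": cutover - timedelta(seconds=old_ttl),
        "cutover": cutover,
        # A resolver that cached just before the cutover expires within low_ttl seconds.
        "old_answer_gone_by": cutover + timedelta(seconds=low_ttl),
    }


plan = migration_plan(datetime(2024, 1, 10, 12, 0), old_ttl=86400, low_ttl=300)
for step, when in plan.items():
    print(f"{step}: {when:%Y-%m-%d %H:%M}")
```

With a one-day TTL lowered to 300 seconds, the TTL change has to be live a day ahead, and stale answers should disappear about five minutes after the cutover.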

Implementation Patterns

Zone File Configuration (BIND-style)

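```
$TTL 3600
@   IN  SOA   ns1.example.com. admin.example.com. (
            2024010101  ; Serial (YYYYMMDDNN, increment on every change)
            7200        ; Refresh
            3600        ; Retry
            1209600     ; Expire
            86400 )     ; Negative-caching TTL (SOA minimum field)

@       IN  NS    ns1.example.com.
@       IN  NS    ns2.example.com.
; In-zone nameservers need their own address records.
; The two addresses below are placeholders from the documentation range.
ns1     IN  A     192.0.2.1
ns2     IN  A     192.0.2.2
@       IN  A     93.184.216.34
@       IN  AAAA  2606:2800:220:1::248
www     IN  CNAME @
mail    IN  A     93.184.216.40
@       IN  MX    10 mail.example.com.
@       IN  TXT   "v=spf1 include:_spf.google.com ~all"
```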
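
Secondaries only pick up a zone change when the SOA serial increases. The zone above uses a date-based serial (2024010101), which suggests the common YYYYMMDDNN convention; under that assumption, a bump could be sketched as follows. The function name `next_serial` is hypothetical.

```python
from datetime import date


def next_serial(current: int, today: date) -> int:
    """Return the next YYYYMMDDNN serial, always greater than the current one."""
    base = int(today.strftime("%Y%m%d")) * 100  # e.g. 2024010100
    if current < base:
        return base + 1  # first change published today
    if current % 100 == 99:
        raise ValueError("revision counter exhausted for this date")
    return current + 1  # another change on the same date


print(next_serial(2024010101, date(2024, 1, 1)))  # 2024010102
print(next_serial(2024010101, date(2024, 1, 2)))  # 2024010201
```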

Terraform DNS Management

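```hcl
# Proxied CNAME for www on Cloudflare
resource "cloudflare_record" "www" {
  zone_id = var.zone_id
  name    = "www"
  type    = "CNAME"
  value   = "example.com" # newer provider releases name this attribute `content`
  ttl     = 1             # proxied records must use the automatic TTL (1)
  proxied = true
}

# Route 53 alias record pointing api.example.com at a load balancer
resource "aws_route53_record" "api" {
  zone_id = aws_route53_zone.primary.zone_id
  name    = "api.example.com"
  type    = "A"

  alias {
    name                   = aws_lb.api.dns_name
    zone_id                = aws_lb.api.zone_id
    evaluate_target_health = true
  }
}
```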

DNS-Based Traffic Routing (Route 53 Weighted)

```hcl
# Blue environment receives 70 of every 100 weighted answers
resource "aws_route53_record" "blue" {
  zone_id        = aws_route53_zone.primary.zone_id
  name           = "app.example.com"
  type           = "A"
  set_identifier = "blue"

  weighted_routing_policy {
    weight = 70
  }

  alias {
    name                   = aws_lb.blue.dns_name
    zone_id                = aws_lb.blue.zone_id
    evaluate_target_health = true # required inside every alias block
  }
}

# Green environment receives the remaining 30
resource "aws_route53_record" "green" {
  zone_id        = aws_route53_zone.primary.zone_id
  name           = "app.example.com"
  type           = "A"
  set_identifier = "green"

  weighted_routing_policy {
    weight = 30
  }

  alias {
    name                   = aws_lb.green.dns_name
    zone_id                = aws_lb.green.zone_id
    evaluate_target_health = true # required inside every alias block
  }
}
```
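
Weighted routing answers each query in proportion to a record's weight relative to the sum of all weights for that name. A minimal sketch of that arithmetic follows; `traffic_share` is a made-up helper, and it deliberately does not model what the provider does when every weight is zero.

```python
def traffic_share(weights: dict[str, int]) -> dict[str, float]:
    """Return each record's expected share of answers: weight / sum of weights."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be greater than zero")
    return {name: weight / total for name, weight in weights.items()}


print(traffic_share({"blue": 70, "green": 30}))
# {'blue': 0.7, 'green': 0.3}
```

These are shares of DNS answers, not of requests: resolver caching means the traffic actually reaching each load balancer only approximates the split.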

Diagnostic Commands

```bash
# Query specific record types
dig example.com A +short
dig example.com MX +short
dig example.com TXT +short

# Trace full resolution path
dig example.com +trace

# Query a specific nameserver
dig @8.8.8.8 example.com A

# Reverse lookup
dig -x 93.184.216.34

# Check DNSSEC (look for RRSIG records in the answer)
dig example.com +dnssec

# Request all record types. Many servers refuse or minimize ANY queries,
# so query each type individually if the answer looks incomplete.
dig example.com ANY +noall +answer
```
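
When these commands feed a script, the answer section printed by `+noall +answer` is easy to split into fields. The sketch below assumes the usual "name TTL class type rdata" column layout; the `Answer` type, the `parse_answers` helper, and the sample text (built from the zone file values above) are illustrative rather than captured output.

```python
from typing import NamedTuple


class Answer(NamedTuple):
    name: str
    ttl: int
    rclass: str
    rtype: str
    rdata: str


def parse_answers(text: str) -> list[Answer]:
    """Parse answer-section lines of the form: name TTL class type rdata."""
    answers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig comments
        parts = line.split(None, 4)  # rdata may itself contain spaces
        if len(parts) < 5:
            continue
        name, ttl, rclass, rtype, rdata = parts
        answers.append(Answer(name, int(ttl), rclass, rtype, rdata))
    return answers


# Illustrative input, not real dig output
sample = """
example.com.   3600  IN  A   93.184.216.34
example.com.   3600  IN  MX  10 mail.example.com.
"""
for answer in parse_answers(sample):
    print(answer.rtype, answer.rdata)
```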

Best Practices

  • Use infrastructure-as-code (Terraform, Pulumi) to manage DNS records — avoid manual console changes that create drift.
  • Implement DNSSEC on authoritative zones to prevent cache poisoning and spoofing attacks.
  • Use multiple NS providers or at least geographically distributed nameservers for resilience against provider outages.

Common Pitfalls

  • CNAME at zone apex: A CNAME cannot coexist with other records at the same name, and the root domain (example.com) always carries SOA and NS records, so most DNS providers reject a CNAME there. Use ALIAS/ANAME records or provider-specific flattening (e.g., Cloudflare CNAME flattening).
  • Forgetting the trailing dot: In zone files, mail.example.com without a trailing dot is interpreted as mail.example.com.example.com. — always use mail.example.com. for fully qualified names.
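
The trailing-dot rule can be shown with a short sketch of how a zone file parser expands names against the zone origin. The `qualify` helper is hypothetical and covers only the three cases discussed here: `@`, absolute names, and relative names.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name the way a parser would, relative to the origin."""
    if not origin.endswith("."):
        raise ValueError("origin must be fully qualified")
    if name == "@":
        return origin  # @ stands for the origin itself
    if name.endswith("."):
        return name  # already fully qualified
    return f"{name}.{origin}"  # relative name: the origin is appended


print(qualify("mail.example.com", "example.com."))   # mail.example.com.example.com.
print(qualify("mail.example.com.", "example.com."))  # mail.example.com.
print(qualify("www", "example.com."))                # www.example.com.
```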

Anti-Patterns

Over-engineering for hypothetical requirements. Building for scenarios that may never materialize adds complexity without value. Solve the problem in front of you first.

Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide wastes time and introduces risk.

Premature abstraction. Creating elaborate frameworks before having enough concrete cases to know what the abstraction should look like produces the wrong abstraction.

Neglecting error handling at system boundaries. Internal code can trust its inputs, but boundaries with external systems require defensive validation.

Skipping documentation. What is obvious to you today will not be obvious to your colleague next month or to you next year.
