Service Mesh
Service mesh patterns with Istio and Linkerd for observability, traffic management, and mTLS
You are an expert in service mesh patterns for building reliable networked systems.
Overview
A service mesh is a dedicated infrastructure layer for managing service-to-service communication in microservices architectures. It handles mutual TLS, traffic routing, retries, circuit breaking, observability, and access control — all without changing application code. The two dominant implementations are Istio (feature-rich, complex) and Linkerd (lightweight, simple). This skill covers when to adopt a mesh, core traffic patterns, and practical configuration.
Core Concepts
Service Mesh Architecture
Without Service Mesh:             With Service Mesh:
┌───────────┐    ┌───────────┐    ┌───────────┬─────────┐    ┌─────────┬───────────┐
│ Service A │───→│ Service B │    │ Service A │ Sidecar │═══→│ Sidecar │ Service B │
└───────────┘    └───────────┘    └───────────┴─────────┘    └─────────┴───────────┘
App handles:                      Sidecar handles:
- Retries                         - mTLS encryption
- Timeouts                        - Retries & timeouts
- Circuit breaking                - Load balancing
- Auth                            - Observability (metrics, traces)
- Metrics                         - Access control
                                  - Traffic shaping
Control Plane (Istiod / Linkerd):
- Pushes config to sidecars
- Manages certificates
- Collects telemetry
When to Use a Service Mesh
Adopt when:
- 10+ microservices with complex inter-service communication
- You need uniform mTLS without modifying every service
- You want canary deployments and traffic splitting
- You need distributed tracing across polyglot services
Avoid when:
- Monolith or small number of services (< 5)
- Simple request patterns with no cross-cutting concerns
- Resource-constrained environments (each sidecar adds ~50MB RAM)
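To make the resource-cost criterion concrete, here is a rough back-of-the-envelope sketch. It assumes the approximate 50 MB per sidecar quoted above and one sidecar per pod; the pod counts and the function name are hypothetical, and real usage varies with configuration and traffic.

```python
# Rough estimate only: 50 MB is the approximate per-sidecar figure quoted
# in the text, not a measured value for any particular cluster.
SIDECAR_MB = 50


def sidecar_overhead_mb(pod_count: int, per_sidecar_mb: int = SIDECAR_MB) -> int:
    """Total extra RAM in MB when every pod carries one sidecar."""
    if pod_count < 0:
        raise ValueError("pod_count must be non-negative")
    return pod_count * per_sidecar_mb


if __name__ == "__main__":
    for pods in (5, 50, 500):  # hypothetical cluster sizes
        total = sidecar_overhead_mb(pods)
        print(f"{pods:>4} pods -> {total} MB (~{total / 1024:.1f} GiB)")
```

For a handful of pods the cost is negligible. At hundreds of pods it reaches tens of gigabytes across the cluster, which is where the trade-off starts to matter.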
Implementation Patterns
Istio Installation and Setup
# Install Istio
curl -L https://istio.io/downloadIstio | sh -
istioctl install --set profile=default -y
# Enable sidecar injection for a namespace
kubectl label namespace default istio-injection=enabled
# Check the mesh configuration for problems
istioctl analyze
Istio Traffic Management
# VirtualService: traffic routing rules
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-service
spec:
  hosts:
    - api-service
  http:
    # Canary: 90% stable, 10% canary
    - route:
        - destination:
            host: api-service
            subset: stable
          weight: 90
        - destination:
            host: api-service
            subset: canary
          weight: 10
      timeout: 10s
      retries:
        attempts: 3
        perTryTimeout: 3s
        retryOn: 5xx,reset,connect-failure
---
# DestinationRule: subsets and connection settings
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api-service
spec:
  host: api-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: DEFAULT
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
  subsets:
    - name: stable
      labels:
        version: v1
    - name: canary
      labels:
        version: v2
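The VirtualService above combines timeout: 10s with attempts: 3 and perTryTimeout: 3s, and it is worth checking how those numbers interact. The sketch below assumes that attempts counts retries on top of the initial try, as the Istio reference describes the field, and it ignores the backoff between retries. The function names are hypothetical helpers, not part of any Istio tooling.

```python
def worst_case_seconds(timeout_s: float, retries: int, per_try_timeout_s: float) -> float:
    """Upper bound on how long the caller waits, ignoring retry backoff."""
    tries = 1 + retries  # assumption: initial try plus the configured retries
    return min(timeout_s, tries * per_try_timeout_s)


def full_tries_within(timeout_s: float, retries: int, per_try_timeout_s: float) -> int:
    """Number of tries that can use their whole per-try timeout before the overall timeout."""
    return min(1 + retries, int(timeout_s // per_try_timeout_s))


if __name__ == "__main__":
    # Values taken from the VirtualService above.
    print("worst-case wait (s):", worst_case_seconds(10, 3, 3))
    print("tries with a full per-try budget:", full_tries_within(10, 3, 3))
```

Under these assumptions the four possible tries would need 12 s, so the 10 s overall timeout cuts the last try short. If every retry should get its full budget, either raise timeout or lower attempts or perTryTimeout.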
Istio mTLS and Authorization
# PeerAuthentication: enforce mTLS
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: default
spec:
  mtls:
    mode: STRICT
---
# AuthorizationPolicy: service-to-service access control
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: api-policy
  namespace: default
spec:
  selector:
    matchLabels:
      app: api-service
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/default/sa/frontend"]
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/*"]
    - from:
        - source:
            principals: ["cluster.local/ns/default/sa/worker"]
      to:
        - operation:
            methods: ["POST"]
            paths: ["/api/internal/*"]
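To reason about what the policy above actually permits, it can help to model it. The sketch below is a simplified simulation, not Istio's implementation. It assumes the default ALLOW action, where a request passes if any rule matches and is denied otherwise, and it assumes Istio's exact, prefix ("/api/*") and suffix ("*/x") path matching. The Rule class and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    principals: tuple
    methods: tuple
    paths: tuple


def path_matches(pattern: str, path: str) -> bool:
    """Simplified exact / prefix / suffix matching with a single '*'."""
    if pattern == "*":
        return True
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return path.endswith(pattern[1:])
    return path == pattern


def is_allowed(rules, principal: str, method: str, path: str) -> bool:
    """ALLOW semantics: permitted if at least one rule matches on all fields."""
    return any(
        principal in r.principals
        and method in r.methods
        and any(path_matches(p, path) for p in r.paths)
        for r in rules
    )


FRONTEND = "cluster.local/ns/default/sa/frontend"
WORKER = "cluster.local/ns/default/sa/worker"
RULES = [
    Rule((FRONTEND,), ("GET", "POST"), ("/api/*",)),
    Rule((WORKER,), ("POST",), ("/api/internal/*",)),
]

if __name__ == "__main__":
    print(is_allowed(RULES, FRONTEND, "GET", "/api/users"))
    print(is_allowed(RULES, WORKER, "GET", "/api/internal/jobs"))
    print(is_allowed(RULES, FRONTEND, "POST", "/api/internal/jobs"))
```

In this model the frontend can also POST to /api/internal/*, because the /api/* prefix covers it. If internal routes are meant to be worker-only, the frontend rule would need a narrower path or an explicit exclusion.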
Linkerd Setup (Lightweight Alternative)
# Install Linkerd CLI
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install | sh
# Install control plane
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
linkerd check
# Inject sidecar into a deployment
kubectl get deploy api-service -o yaml | linkerd inject - | kubectl apply -f -
# Install observability extension
linkerd viz install | kubectl apply -f -
linkerd viz dashboard &
Linkerd Traffic Split (SMI)
# TrafficSplit for canary deployments
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: api-service-split
spec:
  service: api-service
  backends:
    - service: api-service-stable
      weight: 900
    - service: api-service-canary
      weight: 100
---
# ServiceProfile: per-route metrics and retries
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: api-service.default.svc.cluster.local
spec:
  routes:
    - name: GET /api/users
      condition:
        method: GET
        pathRegex: /api/users
      isRetryable: true
      timeout: 5s
    - name: POST /api/orders
      condition:
        method: POST
        pathRegex: /api/orders
      isRetryable: false
      timeout: 30s
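The TrafficSplit weights above (900 and 100) are relative, not percentages. A minimal sketch of the arithmetic, assuming each backend receives its weight divided by the sum of all weights; the function name is hypothetical:

```python
def traffic_shares(weights: dict) -> dict:
    """Convert relative backend weights into fractions of total traffic."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one backend needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Weights taken from the TrafficSplit above.
    split = {"api-service-stable": 900, "api-service-canary": 100}
    for backend, share in traffic_shares(split).items():
        print(f"{backend}: {share:.0%}")
```

Under that assumption, 900/100 yields the same 90/10 split as the Istio VirtualService weights of 90 and 10 shown earlier.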
Istio Gateway (Ingress)
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: main-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: example-tls-secret
      hosts:
        - "*.example.com"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-ingress
spec:
  hosts:
    - "api.example.com"
  gateways:
    - main-gateway
  http:
    - match:
        - uri:
            prefix: /v1
      route:
        - destination:
            host: api-v1
            port:
              number: 8080
    - match:
        - uri:
            prefix: /v2
      route:
        - destination:
            host: api-v2
            port:
              number: 8080
Best Practices
- Start with Linkerd if you primarily need mTLS and observability: it has lower resource overhead and operational complexity than Istio, and it covers the most common service mesh use cases.
- Use circuit breaking and outlier detection to prevent cascading failures — eject unhealthy instances from the load balancing pool automatically rather than letting one bad pod degrade the entire service.
- Adopt incrementally — inject sidecars into one namespace at a time, starting with non-critical services, and validate that latency overhead (typically 1-3ms per hop) is acceptable.
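When validating latency overhead, keep in mind that the per-hop cost accumulates along a call chain. A rough sketch, assuming the 1-3 ms per hop range quoted above applies to every hop; the hop counts and function name are hypothetical:

```python
def added_latency_ms(hops: int, per_hop_ms: tuple = (1.0, 3.0)) -> tuple:
    """Low and high estimate of mesh-added latency for a chain of service hops."""
    if hops < 0:
        raise ValueError("hops must be non-negative")
    low, high = per_hop_ms
    return hops * low, hops * high


if __name__ == "__main__":
    for hops in (1, 4, 8):  # hypothetical call-chain depths
        low, high = added_latency_ms(hops)
        print(f"{hops} hops -> {low:.0f}-{high:.0f} ms added")
```

A request that fans through several services pays the overhead at each hop, so measure end-to-end latency on realistic paths rather than on a single service pair.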
Common Pitfalls
- Sidecar resource overhead: Each Envoy sidecar consumes ~50MB RAM and adds latency. In large clusters with hundreds of pods, this adds up significantly. Set resource limits on sidecars and monitor their consumption.
- mTLS migration breakage: Switching from PERMISSIVE to STRICT mTLS mode breaks communication with any service that does not have a sidecar injected. Audit all services and ensure sidecar injection is complete before enforcing strict mode.
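One way to run the audit before enforcing STRICT mode is to scan the pod list for workloads that lack the injected proxy. The sketch below assumes Istio, where the injected container is named istio-proxy, and takes a dict shaped like the output of kubectl get pods -o json. The sample data and function name are hypothetical.

```python
PROXY_CONTAINER = "istio-proxy"  # name Istio gives the injected Envoy container


def pods_without_sidecar(pod_list: dict) -> list:
    """Return 'namespace/name' for every pod that has no proxy container."""
    missing = []
    for pod in pod_list.get("items", []):
        spec = pod.get("spec", {})
        # Check init containers too, in case the proxy runs as a native sidecar.
        containers = spec.get("containers", []) + spec.get("initContainers", [])
        if not any(c.get("name") == PROXY_CONTAINER for c in containers):
            meta = pod.get("metadata", {})
            missing.append(f"{meta.get('namespace', 'default')}/{meta.get('name')}")
    return missing


# Hypothetical sample shaped like `kubectl get pods -o json` output.
SAMPLE = {
    "items": [
        {"metadata": {"namespace": "default", "name": "api-service-abc"},
         "spec": {"containers": [{"name": "app"}, {"name": "istio-proxy"}]}},
        {"metadata": {"namespace": "default", "name": "legacy-cron"},
         "spec": {"containers": [{"name": "app"}]}},
    ]
}

if __name__ == "__main__":
    for pod in pods_without_sidecar(SAMPLE):
        print("no sidecar:", pod)
```

In practice you would load the real pod list, for example by parsing the JSON from kubectl get pods -A -o json with json.load, and keep the namespace in PERMISSIVE mode until the result is empty.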
Anti-Patterns
Over-engineering for hypothetical requirements. Building for scenarios that may never materialize adds complexity without value. Solve the problem in front of you first.
Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide wastes time and introduces risk.
Premature abstraction. Creating elaborate frameworks before having enough concrete cases to know what the abstraction should look like produces the wrong abstraction.
Neglecting error handling at system boundaries. Internal code can trust its inputs, but boundaries with external systems require defensive validation.
Skipping documentation. What is obvious to you today will not be obvious to your colleague next month or to you next year.