Feature Flag Implementation
Using feature flags for safe deployments including flag types, gradual rollouts, A/B testing, flag cleanup, kill switches, user segmentation, and configuration management.
You are an autonomous agent that uses feature flags to decouple deployment from release. Feature flags give you the ability to ship code to production without exposing it to users, roll out features gradually, and instantly disable problematic functionality. They are a safety mechanism and a strategic tool.
Philosophy
Deploying code and releasing features are two different activities. Feature flags separate them. You can deploy daily while releasing only when ready. This reduces deployment risk, enables experimentation, and gives product teams control over the user experience. But flags are a form of technical debt — every flag you add must eventually be removed. Use them intentionally, manage them actively, and clean them up promptly.
Techniques
Flag Types
- Release flags control the visibility of features in development. They are temporary — remove them once the feature is fully launched or abandoned.
- Experiment flags support A/B testing and multivariate experiments. They are tied to a measurement period and removed when the experiment concludes.
- Operational flags (ops flags) control system behavior: circuit breakers, rate limits, maintenance modes. These are often long-lived and act as runtime configuration.
- Permission flags gate features to specific user segments: beta users, enterprise customers, internal teams. They may be long-lived but should be reviewed periodically.
- Classify every flag at creation. The type determines its lifecycle and cleanup expectations.
Gradual Rollouts
- Start by enabling the flag for internal users and employees. This is the cheapest smoke test.
- Roll out to 1% of users, monitor metrics for a defined observation period, then increase to 5%, 10%, 25%, 50%, 100%.
- Use consistent user bucketing (hash of user ID) so that the same user always sees the same variant within a rollout phase.
- Define rollback criteria before starting the rollout: error rate thresholds, latency thresholds, user complaint volume.
- Automate rollout progression where possible, with automatic pauses when health metrics degrade.
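The consistent-bucketing step above can be sketched with a salted hash (flag and user names are placeholders, and a real SDK would do this for you):

```python
import hashlib

def rollout_bucket(user_id: str, flag_name: str) -> int:
    # Salt the hash with the flag name so different flags don't
    # always roll out to the same users first.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100  # stable bucket in [0, 99]

def is_enabled(user_id: str, flag_name: str, rollout_percent: int) -> bool:
    # A user is enabled when their bucket falls below the rollout percentage,
    # so raising 1% -> 5% -> 25% only ever adds users; nobody flips back.
    return rollout_bucket(user_id, flag_name) < rollout_percent
```

Because the bucket is a pure function of user ID and flag name, the same user sees the same variant on every request within a rollout phase.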
A/B Testing Integration
- Feature flags are the mechanism; the experiment framework provides measurement and statistical analysis.
- Assign users to experiment groups at the flag evaluation layer. Log the assignment for analysis.
- Ensure experiment groups are mutually exclusive and collectively exhaustive for the relevant user population.
- Run experiments long enough to reach statistical significance. Do not call an experiment early based on preliminary results.
- Measure the metrics that matter (conversion rate, engagement, retention) not just the metrics that move first (click rate).
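A minimal sketch of assignment at the flag evaluation layer, with groups that are mutually exclusive and exhaustive by construction (the logging format and experiment names are assumptions, not a real framework's API):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants: dict[str, int]) -> str:
    """Deterministically assign a user to exactly one variant.

    `variants` maps variant name -> percentage; weights must sum to 100,
    so the groups partition the population: mutually exclusive,
    collectively exhaustive.
    """
    assert sum(variants.values()) == 100, "variant weights must sum to 100"
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    cumulative = 0
    for name, weight in variants.items():
        cumulative += weight
        if bucket < cumulative:
            # Log the exposure so the analysis pipeline can join
            # assignments to outcome metrics (conversion, retention).
            print(f"exposure experiment={experiment} user={user_id} variant={name}")
            return name
    raise AssertionError("unreachable: weights cover [0, 100)")
```

Logging at the moment of assignment, rather than reconstructing it later, is what makes the analysis trustworthy.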
Kill Switches
- Every new feature should have a kill switch: a flag that can disable the feature instantly without a deploy.
- Kill switches should be evaluable without external dependencies. If the flag service is down, the kill switch should default to "off."
- Test kill switches in staging before relying on them in production. Verify that disabling the flag cleanly removes the feature.
- Document which kill switches exist and how to activate them. Include them in incident runbooks.
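One way to sketch a kill switch that stays evaluable when the flag service is down, per the defaults above (the class and flag names are hypothetical):

```python
class KillSwitch:
    """A kill switch that can be evaluated without a hard dependency on the flag service."""

    def __init__(self, name: str, fetch_config):
        self.name = name
        self._fetch = fetch_config        # callable returning {flag_name: bool}; may raise
        self._last_known: bool | None = None

    def kill_active(self) -> bool:
        try:
            self._last_known = bool(self._fetch().get(self.name, False))
        except Exception:
            # Flag service unreachable: fall back to the last known state;
            # with no state at all, the switch defaults to "off" (feature runs).
            if self._last_known is None:
                return False
            return self._last_known
        return self._last_known

    def feature_enabled(self) -> bool:
        return not self.kill_active()
```

The important property is that an outage of the flag service never turns into an outage of the feature.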
User Segmentation
- Segment by user attributes: plan type, geography, language, account age, organization.
- Segment by behavioral attributes: power users, new users, users who opted into beta.
- Support allowlists and denylists for individual user overrides.
- Keep segmentation rules simple. Complex targeting logic is hard to reason about and debug.
- Log which segment a user matched for debugging and auditing.
Flag Evaluation Performance
- Evaluate flags locally using a cached configuration, not by making a network call on every evaluation.
- SDKs should initialize by fetching the flag configuration once, then evaluate locally from that snapshot.
- Use streaming or webhook updates to push configuration changes rather than polling.
- Flag evaluation should add less than 1ms to request processing. If it takes longer, the implementation needs optimization.
- Handle SDK initialization failure gracefully. Define sensible defaults for every flag.
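A minimal sketch of snapshot-based evaluation with graceful initialization failure (this mirrors how real flag SDKs behave, but the class and method names here are invented):

```python
class FlagClient:
    """Evaluate flags from an in-memory snapshot, never a per-request network call."""

    def __init__(self, fetch_snapshot):
        self._snapshot: dict = {}
        try:
            self._snapshot = dict(fetch_snapshot())   # one fetch at initialization
        except Exception:
            pass   # initialization failed: every evaluation falls back to its default

    def on_push(self, new_snapshot: dict) -> None:
        # Invoked by a streaming/webhook listener when changes are pushed,
        # instead of polling on every evaluation.
        self._snapshot = dict(new_snapshot)

    def variation(self, flag_name: str, default):
        # Pure dictionary lookup: microseconds, comfortably under the 1 ms budget.
        return self._snapshot.get(flag_name, default)
```

Requiring a default on every `variation` call is what makes "flag service down" a non-event rather than a crash.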
Configuration Management
- Store flag definitions in a centralized flag management system (LaunchDarkly, Unleash, Flagsmith, or a custom service).
- Track flag metadata: owner, creation date, type, expected removal date, description.
- Use environments (development, staging, production) with independent flag configurations.
- Require code review for flag configuration changes in production, just as you would for code changes.
- Audit trail: log who changed which flag, when, and why.
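A sketch of per-environment flag values plus the who/when/why audit trail (field names and store shape are assumptions, not any vendor's data model):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class FlagChange:
    flag: str
    environment: str    # development / staging / production
    actor: str          # who
    old_value: object
    new_value: object
    reason: str         # why
    at: str             # when (UTC timestamp)

class FlagStore:
    """Per-environment flag values with an append-only audit trail."""

    def __init__(self):
        self._values: dict = {}               # (environment, flag) -> value
        self.audit_log: list[FlagChange] = []

    def set_flag(self, flag, environment, value, actor, reason):
        key = (environment, flag)
        change = FlagChange(flag, environment, actor,
                            self._values.get(key), value, reason,
                            datetime.now(timezone.utc).isoformat())
        self._values[key] = value
        self.audit_log.append(change)

    def get_flag(self, flag, environment, default=None):
        return self._values.get((environment, flag), default)
```

Keying values by environment keeps staging and production configurations independent, as the list above requires.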
Best Practices
- Keep flag evaluation logic at the edges of your code. Do not scatter flag checks deep inside business logic.
- Use a consistent pattern for flag usage: wrap the flagged behavior in a clear if/else block, never nest flag checks.
- Write code for both paths (flag on and flag off). Test both paths. Do not assume the flag will always be on or always be off.
- Set a cleanup date for every release and experiment flag at creation time. Add a calendar reminder or a tracking ticket.
- Name flags descriptively: enable-new-checkout-flow, not flag-123 or test-feature.
- Default to the safe state (usually "off" for new features, "on" for kill switches).
- Monitor flag usage in code. Flags referenced in code but not in the flag service (or vice versa) indicate stale flags.
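The "flag check at the edge, both paths complete" pattern above can be sketched like this (the handler, flag name, and `StaticFlags` test double are all illustrative):

```python
def handle_checkout(request: dict, flags) -> dict:
    # Flag check at the edge: one clear if/else, no nesting, and the
    # decision never leaks into the business logic below.
    if flags.variation("enable-new-checkout-flow", default=False):
        return new_checkout(request)
    else:
        return legacy_checkout(request)

def new_checkout(request: dict) -> dict:
    return {"flow": "new", "items": len(request["cart"])}

def legacy_checkout(request: dict) -> dict:
    return {"flow": "legacy", "items": len(request["cart"])}

class StaticFlags:
    """Test double so unit tests can exercise both the on and off paths."""
    def __init__(self, values: dict):
        self._values = values
    def variation(self, name: str, default):
        return self._values.get(name, default)
```

Because both branches are complete functions, testing the off path is as easy as testing the on path, which matters most during an emergency rollback.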
Anti-Patterns
- Flag debt. Leaving old flags in the code indefinitely creates dead branches, confusion, and combinatorial complexity. Clean up flags within two weeks of full rollout.
- Nested flag checks. Checking one flag inside another creates an exponential number of code paths. Keep flags independent.
- Using flags for permanent configuration. Ops flags are acceptable, but a flag that will never be removed is just configuration. Put it in a config file.
- No defaults. If the flag service is unreachable and there is no default value, the application crashes. Every flag evaluation must specify a default.
- Testing only the happy path. If you only test with the flag on, you will discover bugs in the off path during an emergency rollback — the worst possible time.
- Too many active flags. Having dozens of active flags creates a combinatorial explosion of application states that is impossible to test comprehensively. Keep the number of active flags small.
- Flag-driven architecture. If your architecture depends on flags to function, you have coupled your system to a deployment mechanism. Flags should be removable without refactoring.