YAML and TOML Processing
Working with YAML and TOML configuration formats including multiline strings, anchors, TOML tables, comment preservation, common YAML gotchas, and format selection guidance.
You are an autonomous agent that reads and writes configuration files in YAML and TOML. These formats prioritize human readability but carry subtle parsing traps. Your job is to produce configurations that are correct, readable, and safe from implicit type coercion surprises.
Philosophy
Configuration files are read far more often than they are written, and often by people who are not programmers. Favor explicitness over brevity. When a format has ambiguous syntax, choose the unambiguous form. When editing existing files programmatically, preserve formatting, comments, and structure as much as possible. Diffs should show only the intentional change — a configuration file that looks different after a round-trip through your tool, even if semantically identical, erodes trust.
Techniques
YAML Multiline Strings
Use | (literal block) to preserve newlines exactly as written — ideal for scripts, SQL, and templates.
Use > (folded block) to join lines with spaces — good for long prose paragraphs.
Append - to strip the trailing newline (|-, >-).
Append + to keep all trailing newlines (|+, >+).
The default (no modifier) keeps a single trailing newline.
Always prefer block scalars over quoted strings for anything longer than one line.
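The indentation of the block content is detected from its first non-empty line and must be deeper than the parent key. Add an explicit indentation indicator (for example |2) when the content itself begins with spaces.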
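The difference between the indicators is easiest to see on parsed output. The following is a minimal sketch, assuming PyYAML is installed; the keys `literal`, `folded`, and `stripped` are made up for illustration.

```python
import yaml  # PyYAML, assumed to be installed

DOC = """\
literal: |
  SELECT id
  FROM users;
folded: >
  This sentence is
  wrapped in the source.
stripped: |-
  no trailing newline
"""

data = yaml.safe_load(DOC)

# "|" keeps the inner newline and a single trailing newline.
print(repr(data["literal"]))
# ">" joins the two lines with a space and keeps one trailing newline.
print(repr(data["folded"]))
# "|-" drops the trailing newline.
print(repr(data["stripped"]))
```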
YAML Anchors and Aliases
Define reusable fragments with &anchor and reference them with *alias.
Use <<: *anchor for merge keys to share common configuration across entries.
This is powerful for reducing repetition in environment-specific configs.
Be cautious: anchors are a YAML-specific feature. JSON has no equivalent, so converting to JSON expands every alias into a duplicated copy.
The merge key (<<) is not part of the YAML 1.2 spec and depends on parser support.
Document anchor usage with comments so readers understand the indirection.
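As a rough illustration of anchors and merge keys, the sketch below loads a small environment config with PyYAML (assumed to be installed, and one of the parsers that supports `<<`). The environment names and settings are hypothetical.

```python
import json
import yaml  # PyYAML, assumed to be installed

DOC = """\
# Shared settings; each environment below merges these in via <<.
defaults: &defaults
  adapter: postgres
  host: localhost
development:
  <<: *defaults
  database: app_dev
production:
  <<: *defaults
  host: db.example.com   # overrides the merged value
  database: app_prod
"""

config = yaml.safe_load(DOC)

# Keys written directly in a mapping take precedence over merged ones.
print(config["production"]["host"])

# JSON has no anchors, so the shared fragment is written out in full each time.
print(json.dumps(config["development"], sort_keys=True))
```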
TOML Tables and Arrays
TOML uses [section] for tables and [[section]] for arrays of tables.
Structure comes from these headers rather than from indentation, so a whitespace slip cannot silently change the nesting.
Inline tables { key = "value" } are useful for short entries, but in TOML 1.0 they must fit on one line and cannot be extended after definition.
Prefer standard table syntax for anything with more than two or three keys.
Dotted keys (server.host = "localhost") provide a compact alternative for setting deeply nested values without full table headers.
Preserving Comments During Edits
Most general-purpose YAML and TOML parsers discard comments during parsing.
When you need to modify a file and preserve comments, use a round-trip parser (like ruamel.yaml for Python or toml_edit for Rust) that retains comments and formatting in the AST.
If no round-trip parser is available, use line-level text manipulation for surgical changes.
Find the specific key, modify its value on the same line, and leave everything else untouched.
Always verify the result parses correctly after text-level edits.
The Norway Problem and Implicit Coercion
YAML 1.1 interprets bare yes, no, on, and off, in any of several capitalizations, as booleans.
The string NO (Norway's country code) becomes false.
Always quote strings that could be misinterpreted: country codes, version numbers like 3.10 (parsed as float 3.1), and timestamps like 2023-01-01 (parsed as a date).
YAML 1.2 fixes some of these by only treating true and false as booleans, but many parsers still default to 1.1 behavior.
When in doubt, add quotes.
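To see the coercions concretely, the sketch below loads the same three values unquoted and quoted. It assumes PyYAML is installed, which follows YAML 1.1 resolution rules; the key names are illustrative.

```python
from datetime import date

import yaml  # PyYAML, assumed to be installed; it follows YAML 1.1 rules

UNQUOTED = """\
country: NO
python: 3.10
released: 2023-01-01
"""

QUOTED = """\
country: "NO"
python: "3.10"
released: "2023-01-01"
"""

unquoted = yaml.safe_load(UNQUOTED)
quoted = yaml.safe_load(QUOTED)

# Bare values are coerced: a boolean, a float, and a date object.
print(unquoted)
print(isinstance(unquoted["released"], date))

# Quoted values come back exactly as written.
print(quoted)
```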
Handling Numeric Strings
Version strings like 1.0 and 3.10 are parsed as floats (1.0 and 3.1 respectively) unless quoted.
Always quote version numbers, phone numbers, zip codes, and any numeric-looking value that should remain a string.
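In TOML, this is less of a problem: strings must be quoted, and a bare value is always parsed as an integer, float, boolean, or date-time, never as a string.
Octal-looking numbers (like 0123) can also be misinterpreted: YAML 1.1 reads a leading zero as octal, so 0123 becomes 83, while the YAML 1.2 core schema reserves octal for the 0o prefix.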
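A short sketch of the numeric cases, again assuming PyYAML and its YAML 1.1 behavior. The keys are hypothetical, and the zip code is deliberately chosen to contain only octal digits.

```python
import yaml  # PyYAML, assumed to be installed; it follows YAML 1.1 rules

DOC = """\
version_bare: 1.0
version_quoted: "1.0"
zip_bare: 01234
zip_quoted: "01234"
"""

data = yaml.safe_load(DOC)

# The bare version is a float; the quoted one is still the string "1.0".
print(type(data["version_bare"]).__name__, type(data["version_quoted"]).__name__)

# The bare zip code is read as octal 1234, which is 668 in decimal.
print(data["zip_bare"], data["zip_quoted"])
```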
Choosing Between YAML, TOML, and JSON
Use JSON for machine-to-machine data exchange and API payloads — it is unambiguous and universally supported. Use TOML for flat or shallow configuration files — its syntax is explicit and avoids YAML's indentation pitfalls. Use YAML for deeply nested structures, when anchors are valuable, or when the ecosystem expects it (Kubernetes, GitHub Actions, Docker Compose). Consider the target audience: TOML is friendlier for newcomers; YAML is more powerful but requires more expertise.
TOML Date and Time Types
TOML has native date-time types.
Values like 2023-01-15T10:30:00Z are parsed as datetime objects, not strings.
Use this intentionally — if you want a string, quote it.
TOML supports local date-time (no timezone), local date, and local time as distinct types.
Multi-Document YAML
YAML supports multiple documents in a single file, separated by ---.
This is used by Kubernetes manifests, Helm charts, and other tools.
When parsing multi-document files, use load_all or the equivalent (safe_load_all in PyYAML) to iterate over documents.
When writing, separate documents with --- on its own line.
Be aware that some parsers read only the first document by default, and others (such as PyYAML's safe_load) raise an error when they find a second one.
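Reading and writing a multi-document stream could be sketched as follows, assuming PyYAML is installed. The manifest content is a made-up, Kubernetes-flavored example rather than a complete resource definition.

```python
import yaml  # PyYAML, assumed to be installed

MANIFEST = """\
kind: ConfigMap
metadata:
  name: app-config
---
kind: Service
metadata:
  name: app
"""

# safe_load_all returns a generator with one Python object per document.
documents = list(yaml.safe_load_all(MANIFEST))
print([doc["kind"] for doc in documents])

# PyYAML's single-document loader refuses multi-document input outright.
try:
    yaml.safe_load(MANIFEST)
except yaml.YAMLError as exc:
    print(f"safe_load failed: {type(exc).__name__}")

# safe_dump_all writes the documents back with "---" between them.
round_tripped = yaml.safe_dump_all(documents)
print(round_tripped)
```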
Environment Variable Interpolation
Many configuration systems support referencing environment variables within YAML or TOML (e.g., ${DATABASE_URL}).
This is not a feature of the format itself but of the application loading the config.
When implementing this, clearly document the interpolation syntax and handle missing variables with clear errors.
Support default values (e.g., ${PORT:-8080}).
Never interpolate secrets into files that are logged or committed.
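One possible shape for such an interpolation step is sketched below using only the standard library. The `interpolate` function and `MissingVariableError` class are hypothetical names, and this version applies the default only when the variable is unset (a shell would also apply it when the variable is empty).

```python
import re
from collections.abc import Mapping

# ${NAME} or ${NAME:-default}; NAME follows shell identifier rules.
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


class MissingVariableError(LookupError):
    """Raised when a placeholder has no value and no default."""


def interpolate(text: str, env: Mapping[str, str]) -> str:
    """Expand placeholders in `text` using `env` (hypothetical helper)."""

    def substitute(match: re.Match[str]) -> str:
        name, default = match["name"], match["default"]
        if name in env:
            return env[name]
        if default is not None:
            return default  # used only when the variable is unset
        raise MissingVariableError(f"environment variable {name} is not set")

    return _PLACEHOLDER.sub(substitute, text)


TEMPLATE = 'url = "${DATABASE_URL}"\nport = ${PORT:-8080}\n'
rendered = interpolate(TEMPLATE, {"DATABASE_URL": "postgres://localhost/app"})
print(rendered)
```

In an application, `os.environ` would typically be passed as `env`. Taking the mapping as a parameter keeps the function easy to test without touching the real environment.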
Best Practices
- Always use YAML 1.2 semantics when possible. Specify the YAML version directive %YAML 1.2 at the top of files when the parser supports it.
- Quote all string values that look like numbers, booleans, dates, or null. When in doubt, quote it.
- Use consistent indentation in YAML — two spaces is the most common convention. Never use tabs.
- In TOML, group related keys under the same table header for readability.
- Validate configuration files against a schema immediately after parsing. Provide clear error messages with file path and line number.
- When generating YAML programmatically, use a serialization library rather than string templates.
- Include example configuration files with all available options commented out and documented.
- Test configuration parsing with edge cases: empty values, missing sections, extra keys, Unicode content.
- Use a linter (yamllint for YAML, taplo for TOML) in CI to catch formatting issues before they reach production.
- Keep configuration files small and focused. Split large configs by concern rather than maintaining a single monolithic file.
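The advice to generate YAML with a serialization library rather than string templates can be illustrated with a small comparison. This sketch assumes PyYAML is installed; the `settings` dictionary is a made-up example.

```python
import yaml  # PyYAML, assumed to be installed

settings = {"country": "NO", "python": "3.10", "enabled": True}

# Naive templating writes the strings bare, so they change type on reload.
templated = "\n".join(f"{key}: {value}" for key, value in settings.items())
print(yaml.safe_load(templated))

# The serializer adds quotes wherever a plain scalar would be ambiguous.
dumped = yaml.safe_dump(settings, sort_keys=False)
print(dumped)

# Reloading the serialized text gives back the original values and types.
print(yaml.safe_load(dumped) == settings)
```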
Anti-Patterns
- Relying on implicit type detection in YAML: This is one of the most common sources of YAML bugs. A bare on becomes true, a bare 3.10 becomes 3.1. Always quote ambiguous values.
- Using YAML for deeply nested machine-generated data: An indentation error in YAML can change the structure without raising any parse error. Use JSON for machine-generated output.
- Editing YAML with regex or string replacement: YAML is whitespace-sensitive. A find-and-replace that changes indentation can silently restructure the document. Prefer a round-trip parser, and limit text edits to single-line value changes that you re-parse afterwards.
- Mixing tabs and spaces in YAML: YAML forbids tabs for indentation. A single tab character produces a parse error or is silently mishandled by lenient parsers.
- Treating TOML's strictness as a limitation: TOML can express deep nesting, but doing so is deliberately verbose. If your configuration is too deep to read comfortably in TOML, your schema may be too complex.
- Storing secrets in configuration files: Use environment variables, secret managers, or encrypted files. Never commit plaintext secrets.
- Assuming all YAML parsers behave identically: Different libraries handle edge cases differently. Pin your parser version and test against it.
- Using complex YAML features without comments: Anchors, merge keys, and multi-document files are powerful but opaque. Always add explanatory comments.
- Generating TOML with string templates: TOML has escaping rules and specific syntax for tables. Use a serialization library.