JSON Manipulation
Advanced JSON processing and manipulation including deep merging, path-based access, schema validation, diffing, streaming large files, and preserving key ordering.
JSON Manipulation
You are an autonomous agent that frequently reads, transforms, and writes JSON data. JSON is deceptively simple — its flat syntax hides complexities around merging, ordering, precision, and scale. Treat JSON manipulation as structured data surgery: precise, predictable, and lossless.
Philosophy
JSON is the lingua franca of data interchange, but it was designed for simplicity, not for every data modeling need. Understand its limitations (no comments, no dates, no undefined, IEEE 754 floats only) and work within them. When transforming JSON, preserve what you do not explicitly intend to change. When producing JSON, be consistent in formatting and ordering so that diffs remain meaningful over time.
Techniques
Deep Merging
Shallow merges (Object.assign or the spread operator) only copy top-level keys.
For nested structures, use recursive deep merge with a clear policy.
Does an array in the source replace or concatenate with the target array?
Do null values delete keys or set them to null?
Document your merge semantics before implementing.
Common strategies include replace-arrays (simpler, predictable), concat-arrays (additive), and merge-by-index (positional matching).
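A recursive deep merge with the replace-arrays strategy can be sketched as follows; `deepMerge` is a hypothetical helper name, and the null-deletes policy shown here is one choice among several:

```javascript
// Deep merge with an explicit, documented policy:
// - objects merge recursively
// - arrays are replaced wholesale (the "replace-arrays" strategy)
// - a null in the source deletes the key from the result
function deepMerge(target, source) {
  const isObject = (v) =>
    v !== null && typeof v === "object" && !Array.isArray(v);

  const result = { ...target };
  for (const [key, value] of Object.entries(source)) {
    if (value === null) {
      delete result[key]; // policy: null deletes the key
    } else if (isObject(value) && isObject(result[key])) {
      result[key] = deepMerge(result[key], value); // recurse into objects
    } else {
      result[key] = value; // policy: arrays and scalars replace
    }
  }
  return result;
}

const base = { server: { host: "localhost", port: 8080 }, tags: ["a"] };
const patch = { server: { port: 9090 }, tags: ["b", "c"] };
deepMerge(base, patch);
// → { server: { host: "localhost", port: 9090 }, tags: ["b", "c"] }
```

Swapping in concat-arrays would mean changing only the array branch, which is why the policy decision should be made once and documented.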
Path-Based Access
Use JSONPath or lodash-style dot notation (a.b[0].c) to access deeply nested values without manual traversal.
This makes code more readable and less fragile to structural changes.
When setting values at a path, create intermediate objects and arrays automatically.
Handle edge cases: what happens when the path crosses a non-object value?
Libraries like lodash.get, jmespath, and jsonpath-plus provide battle-tested implementations.
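A minimal sketch of path-based get and set, assuming dot-and-bracket paths like `a.b[0].c`; the helper names are hypothetical, and in production lodash.get/lodash.set cover far more edge cases (quoted keys, keys containing dots):

```javascript
// "a.b[0].c" → ["a", "b", 0, "c"]  (keys containing dots are not supported)
function parsePath(path) {
  return path
    .replace(/\[(\d+)\]/g, ".$1")
    .split(".")
    .filter((s) => s !== "")
    .map((s) => (/^\d+$/.test(s) ? Number(s) : s));
}

function getPath(obj, path, fallback) {
  let node = obj;
  for (const key of parsePath(path)) {
    // Edge case: the path crossed a non-object value.
    if (node === null || typeof node !== "object") return fallback;
    node = node[key];
  }
  return node === undefined ? fallback : node;
}

function setPath(obj, path, value) {
  const keys = parsePath(path);
  let node = obj;
  keys.slice(0, -1).forEach((key, i) => {
    if (node[key] === null || typeof node[key] !== "object") {
      // Create intermediate containers: an array if the next key is numeric.
      node[key] = typeof keys[i + 1] === "number" ? [] : {};
    }
    node = node[key];
  });
  node[keys[keys.length - 1]] = value;
  return obj;
}

getPath({ a: { b: [{ c: 5 }] } }, "a.b[0].c"); // → 5
setPath({}, "a.b[0].c", 7);                    // → { a: { b: [{ c: 7 }] } }
```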
Schema Validation
Validate incoming JSON against a JSON Schema before processing.
This catches structural problems early with clear error messages rather than cryptic runtime failures deep in business logic.
Use additionalProperties: false when you want strict shape checking.
Validate after deserialization, not against raw strings.
Use schema composition ($ref, allOf, oneOf) to build reusable schemas for shared types.
Collect all validation errors rather than failing on the first one — this gives the caller a complete picture of what needs fixing.
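In practice a real validator such as Ajv (with its `allErrors` option) should do this work; the hand-rolled checker below only illustrates the error-collecting shape, with a hypothetical `validateUser` schema invented for the example:

```javascript
// Sketch: collect every validation error with its path instead of
// throwing on the first one, so the caller sees the complete picture.
function validateUser(data) {
  const errors = [];
  if (typeof data.name !== "string") {
    errors.push({ path: "/name", message: "must be a string" });
  }
  if (!Number.isInteger(data.age) || data.age < 0) {
    errors.push({ path: "/age", message: "must be a non-negative integer" });
  }
  // Strict shape checking, analogous to additionalProperties: false.
  for (const key of Object.keys(data)) {
    if (!["name", "age"].includes(key)) {
      errors.push({ path: "/" + key, message: "unexpected property" });
    }
  }
  return errors; // an empty array means valid
}

validateUser({ name: 1, age: -1, role: "x" }); // → three errors, not one
```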
Diffing JSON Objects
To compare two JSON objects, produce a structured diff that shows added, removed, and changed keys with their paths. Use RFC 6902 (JSON Patch) format for machine-readable diffs. For human-readable diffs, show the path, old value, and new value on each line. When diffing arrays, decide whether to compare by index or by a key field. Index-based diffing is simpler but produces noisy diffs when items are inserted or removed from the middle.
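A minimal index-based diff that emits RFC 6902-style operations can look like this sketch (JSON Pointer escaping of "/" and "~" in keys is omitted for brevity):

```javascript
// Structural diff emitting add / remove / replace operations with
// JSON Pointer paths. Arrays are compared by index, the simpler of
// the two strategies, so mid-array insertions produce noisy diffs.
function diff(a, b, path = "") {
  if (a === b) return [];
  const isObj = (v) => v !== null && typeof v === "object";
  if (!isObj(a) || !isObj(b) || Array.isArray(a) !== Array.isArray(b)) {
    return [{ op: "replace", path, value: b }];
  }
  const ops = [];
  for (const key of Object.keys(a)) {
    if (!(key in b)) ops.push({ op: "remove", path: `${path}/${key}` });
  }
  for (const key of Object.keys(b)) {
    if (!(key in a)) {
      ops.push({ op: "add", path: `${path}/${key}`, value: b[key] });
    } else {
      ops.push(...diff(a[key], b[key], `${path}/${key}`));
    }
  }
  return ops;
}

diff({ a: 1, b: 2 }, { a: 1, c: 3 });
// → [{ op: "remove", path: "/b" }, { op: "add", path: "/c", value: 3 }]
```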
Handling Circular References
Standard JSON.stringify throws on circular references.
Detect cycles during serialization using a WeakSet of visited objects.
Replace circular references with a sentinel value like "[Circular]" or omit them entirely.
When building data structures, prefer trees or DAGs over cyclic graphs when JSON serialization is needed (note that shared references in a DAG are duplicated, not preserved, on serialization).
If you must serialize graphs, use a reference-based scheme where each object has an ID and relationships are stored as ID references.
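The WeakSet approach can be sketched as a replacer function; `safeStringify` is a hypothetical helper name:

```javascript
// Serialize despite cycles: track visited objects in a WeakSet and
// replace repeats with a "[Circular]" sentinel. Caveat: this also
// flags shared (non-circular) references, since any revisited object
// is treated as a cycle.
function safeStringify(value, indent) {
  const seen = new WeakSet();
  return JSON.stringify(
    value,
    (key, val) => {
      if (val !== null && typeof val === "object") {
        if (seen.has(val)) return "[Circular]";
        seen.add(val);
      }
      return val;
    },
    indent
  );
}

const node = { name: "root" };
node.self = node;    // plain JSON.stringify(node) would throw here
safeStringify(node); // → '{"name":"root","self":"[Circular]"}'
```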
Streaming Large JSON Files
For JSON files that exceed available memory, use streaming parsers (SAX-style) that emit events for each token.
Libraries like JSONStream (Node.js), ijson (Python), and System.Text.Json's Utf8JsonReader (C#) support this pattern.
Structure your data as JSON arrays of objects so each element can be processed independently.
For output, write opening brackets, stream individual objects with comma separation, and close the brackets.
NDJSON (one JSON object per line) is often a better choice for large datasets because each line can be parsed independently without a streaming parser.
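The NDJSON idea in miniature, using an in-memory string for illustration; in a real Node.js program the split would be replaced by readline over a file stream to keep memory flat:

```javascript
// NDJSON: one JSON document per line. Each line parses independently,
// so arbitrarily large files can be processed record by record with a
// plain line reader and no streaming JSON parser.
const ndjson = [
  '{"id":1,"name":"alpha"}',
  '{"id":2,"name":"beta"}',
].join("\n");

const records = ndjson
  .split("\n")
  .filter((line) => line.trim() !== "") // tolerate a trailing newline
  .map((line) => JSON.parse(line));

records.length;   // → 2
records[1].name;  // → "beta"
```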
JSON Patch and Merge Patch
RFC 6902 (JSON Patch) defines a sequence of operations (add, remove, replace, move, copy, test) for precise mutations. RFC 7396 (JSON Merge Patch) is simpler — it is a partial document where null means "delete this key." Choose Patch for complex, auditable changes; choose Merge Patch for simple partial updates. JSON Patch supports a "test" operation that can verify preconditions before applying changes, enabling optimistic concurrency control.
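RFC 7396 is small enough to sketch in full; `mergePatch` is a hypothetical helper name, but the semantics below follow the RFC (null deletes, objects merge, everything else, including arrays, replaces):

```javascript
// RFC 7396 JSON Merge Patch: the patch is a partial document where
// null means "delete this key", nested objects merge recursively,
// and any non-object value (including arrays) replaces the target.
function mergePatch(target, patch) {
  const isObj = (v) =>
    v !== null && typeof v === "object" && !Array.isArray(v);
  if (!isObj(patch)) return patch; // non-object patch replaces wholesale
  const result = isObj(target) ? { ...target } : {};
  for (const [key, value] of Object.entries(patch)) {
    if (value === null) delete result[key]; // null deletes the key
    else result[key] = mergePatch(result[key], value);
  }
  return result;
}

mergePatch(
  { title: "Hello", author: { name: "Ann", email: "a@x.io" } },
  { author: { email: null }, tags: ["news"] }
);
// → { title: "Hello", author: { name: "Ann" }, tags: ["news"] }
```

Note that Merge Patch cannot set a key to null or express array edits, which is exactly when the heavier JSON Patch format earns its keep.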
Preserving Key Ordering
JSON spec says objects are unordered, but humans and diffs care about ordering. When round-tripping JSON (read, modify, write), use parsers that preserve insertion order. Sort keys consistently when generating JSON for version control — alphabetical is the most common convention. When order matters semantically, use an array of key-value pairs instead of an object.
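Alphabetical key sorting for version-controlled output can be sketched as a recursive transform before serialization; `sortKeys` is a hypothetical helper name:

```javascript
// Canonical output for version control: recursively sort object keys,
// then serialize, so semantically identical documents diff cleanly.
function sortKeys(value) {
  if (Array.isArray(value)) return value.map(sortKeys); // arrays keep order
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.keys(value)
        .sort()
        .map((key) => [key, sortKeys(value[key])])
    );
  }
  return value;
}

JSON.stringify(sortKeys({ b: 1, a: { d: 2, c: 3 } }));
// → '{"a":{"c":3,"d":2},"b":1}'
```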
Handling Special Values
JSON has no representation for Infinity, NaN, undefined, Date objects, BigInt, or binary data. When serializing these values, define an explicit convention: dates as ISO 8601 strings, BigInts as strings with a type hint, binary as base64. Document these conventions in your API contract. When deserializing, use reviver functions to reconstruct typed values from their string representations.
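One such convention, sketched as a replacer/reviver pair; the `$bigint` type hint and the ISO-date regex are illustrative choices for this example, not a standard:

```javascript
// Convention: dates as ISO 8601 strings, BigInts as objects with a
// "$bigint" type hint. Note that Date.prototype.toJSON runs before the
// replacer, so the replacer already sees dates as ISO strings.
function replacer(key, value) {
  if (typeof value === "bigint") return { $bigint: value.toString() };
  return value;
}

function reviver(key, value) {
  if (value !== null && typeof value === "object" && "$bigint" in value) {
    return BigInt(value.$bigint);
  }
  if (typeof value === "string" && /^\d{4}-\d{2}-\d{2}T[\d:.]+Z$/.test(value)) {
    return new Date(value);
  }
  return value;
}

const text = JSON.stringify({ when: new Date(0), id: 123n }, replacer);
// text === '{"when":"1970-01-01T00:00:00.000Z","id":{"$bigint":"123"}}'
const back = JSON.parse(text, reviver);
// back.when is a Date again; typeof back.id === "bigint"
```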
Best Practices
- Always specify indentation when writing JSON meant for human reading or version control. Two spaces is conventional.
- Use `JSON.parse` with a reviver function to convert date strings, BigInt representations, or other typed values during deserialization.
- When generating JSON programmatically, build the data structure in memory first, then serialize once. Do not construct JSON by string concatenation.
- Handle `undefined` values explicitly — they are silently dropped by `JSON.stringify`, which can cause subtle data loss.
- For numeric precision, be aware that JSON numbers are IEEE 754 doubles. Integers beyond 2^53 lose precision. Use string representation for large IDs or monetary values.
- Validate at system boundaries (API inputs, file reads) but trust data within your own pipeline to avoid redundant checks.
- When writing JSON configuration files, include a `$schema` reference so editors can provide autocompletion and validation.
- Prefer JSONC or JSON5 for configuration files that humans will edit, but always produce strict JSON for API output.
- When logging JSON, use single-line format for log aggregation tools. Pretty-print only for human debugging.
- Use the content type `application/json` consistently in HTTP responses and verify it on receipt.
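The precision point above is easy to demonstrate: the first safe-to-lose integer is Number.MAX_SAFE_INTEGER + 2, i.e. 2^53 + 1:

```javascript
// Integers past Number.MAX_SAFE_INTEGER (2^53 - 1) silently lose
// precision on parse, which is why large IDs belong in strings.
JSON.parse('{"id": 9007199254740993}').id;   // → 9007199254740992 (off by one)
JSON.parse('{"id": "9007199254740993"}').id; // → "9007199254740993" (exact)
```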
Anti-Patterns
- Building JSON with string concatenation — One unescaped quote or missing comma produces invalid JSON. Always serialize from data structures.
- Catching JSON.parse errors silently — A malformed input usually means an upstream bug. Log the error with context (source, first 200 chars of input) for debugging.
- Mutating shared JSON objects in place — Multiple consumers may hold references. Clone before mutation using `structuredClone` or a deep clone utility.
- Using JSON for binary data — Base64 encoding inflates size by 33%. Use binary formats or multipart encoding for files and images.
- Storing large datasets as a single JSON file — This forces full deserialization for any access. Use NDJSON, a database, or partitioned files instead.
- Ignoring encoding — JSON must be UTF-8 per RFC 8259. Do not assume Latin-1 or other encodings.
- Relying on key order for logic — Using object key position as a data channel is fragile and confusing. Use arrays when order matters.
- Deeply nesting without schema documentation — A 10-level-deep JSON structure without a schema is unnavigable. Document the shape with JSON Schema or TypeScript types.
- Using eval() to parse JSON — This is a severe security vulnerability. Always use JSON.parse or a dedicated parser library.
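A sketch of the parse-with-context pattern from the anti-pattern list above; the function name, source label, and 200-character prefix are illustrative choices:

```javascript
// Parse failures usually signal an upstream bug: surface them with
// enough context to debug, instead of swallowing them silently.
function parseJsonOrThrow(text, source) {
  try {
    return JSON.parse(text);
  } catch (err) {
    throw new Error(
      `Invalid JSON from ${source}: ${err.message}; ` +
        `input starts with: ${text.slice(0, 200)}`
    );
  }
}

parseJsonOrThrow('{"a": 1}', "billing-api"); // → { a: 1 }
// parseJsonOrThrow("{oops", "billing-api") throws with source and prefix
```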