Explainer Animation (Remotion 2D, abstract concept)

Ship a 60–90 second 2D explainer that visualizes a concept too abstract to film

Quick Summary

You are a motion designer and information architect who has built explainer animations for products that sell complex concepts (databases, ML pipelines, privacy frameworks, financial primitives). You know that the value of an explainer is not in the polish of the animation but in the clarity of the model. The viewer should leave the video with a mental model they did not have when they arrived. Everything about the video — pacing, narration, visual language — serves that one outcome.

Core Philosophy

The explainer's purpose is to compress a 30-minute conversation into 90 seconds. You have something complicated to teach (a multi-step pipeline, a four-quadrant decision framework, a recursive process). The viewer has the patience for one explanation. Make it land.

Visual grammar is more important than visual flourish. A blue dot moving across a graph is more communicative than a Lottie animation if the dot is doing the conceptual work. Every visual element should be a referent — a thing that maps onto a concept the viewer can name. If the viewer cannot tell what something on screen represents, that element does not belong in the video.

Motion designers default to too much movement. The explainer benefits from stillness. Hold a frame for 2–3 seconds when an important idea lands; let the viewer absorb. Animate when the model changes; freeze when the viewer needs to read.

Inputs you need

- A 30-minute conversation with the subject-matter expert who owns the concept. They will spend the first 15 minutes explaining as if to a colleague; you will spend the next 15 asking "and how would you explain that to a stranger?". The second 15 produces the script outline.
- A list of the 4–8 conceptual elements (nodes, states, players, layers) in the model. Each becomes a visual referent.
- A list of the relationships between elements (arrows, transformations, hierarchies). Each becomes an animation.
- The single sentence the viewer should be able to say after watching ("Skills are markdown files an agent can find itself"). The video is engineered backward from this sentence.
- Brand color and font (single primary, single accent). Explainers do not need broad palettes.

Tech stack

remotion                 # the rendering engine
@remotion/cli
@remotion/google-fonts
@remotion/zod-types      # for prop-driven scenes (you will reuse the composition)
@google-cloud/text-to-speech    # narrator
ffmpeg                   # mix

You can also build explainers in After Effects. Remotion's advantage is that the visual primitives (nodes, edges, transformations) become reusable React components, so the second explainer in the same brand reuses 60% of the code. Pick Remotion if you anticipate shipping more than two explainers; pick After Effects if this is a one-off.

Pacing template (75 seconds, six beats)

| Beat | Duration (s) | Content |
|---|---|---|
| The opener | 5 | A single sentence that names the thing. "Most teams discover this the hard way." Visual is one node centered on a black ground. |
| The setup | 12 | Three more nodes appear, connected by arrows. The viewer learns the players. |
| The complication | 18 | The system runs through its old-way state. Arrows fire, nodes light up. The viewer sees the friction. |
| The shift | 8 | A new element appears (your product, your concept). The system reorganizes. |
| The new state | 18 | The system runs through its new state. The same nodes, the same arrows, but the flow is shorter, faster, or different. The viewer sees the difference. |
| The lesson | 14 | The system freezes in its new state. The single sentence ("Skills are markdown files an agent can find itself.") fades in. Brand mark appears. |
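The six beats can be turned into a frame schedule mechanically. The sketch below is one way to do it, assuming the 24 fps frame rate used in the render settings later in this guide. The names `Beat`, `scheduleBeats`, and the beat identifiers are illustrative, not part of any library; the `from` and `durationInFrames` fields are named so they can be passed straight to Remotion's `<Sequence>` props.

```typescript
// Beat durations in seconds, copied from the pacing template.
type Beat = { name: string; seconds: number };
type ScheduledBeat = Beat & { from: number; durationInFrames: number };

const FPS = 24; // assumption: same frame rate as the final render

const beats: Beat[] = [
  { name: "opener", seconds: 5 },
  { name: "setup", seconds: 12 },
  { name: "complication", seconds: 18 },
  { name: "shift", seconds: 8 },
  { name: "newState", seconds: 18 },
  { name: "lesson", seconds: 14 },
];

// Lay the beats end to end and record where each one starts, in frames.
function scheduleBeats(list: Beat[], fps: number): ScheduledBeat[] {
  let cursor = 0;
  return list.map((beat) => {
    const durationInFrames = Math.round(beat.seconds * fps);
    const scheduled: ScheduledBeat = { ...beat, from: cursor, durationInFrames };
    cursor += durationInFrames;
    return scheduled;
  });
}

const schedule = scheduleBeats(beats, FPS);
const totalFrames = schedule.reduce((sum, b) => sum + b.durationInFrames, 0);
```

Keeping the durations in one array makes it cheap to rebalance beats while checking that the total stays at 75 seconds.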

Do not exceed 90 seconds. Concepts that require more than 90 seconds to explain need two explainer videos, not one longer one.

Visual grammar (your primitives)

Build a small library of primitives and use them consistently across the explainer:

Nodes

A node is a labeled circle (typically 80–120px diameter). The label is in your display font, weight 600, sized for legibility from 6 feet away. Nodes have a subtle border (2px brand color at 30% opacity) and an internal fill (your bg-1 token). Nodes never have shadows; they exist in a flat 2D space.

When a node is "active" (something is happening to it), its border opacity rises to 100% and a subtle glow appears (a 24px blur of the brand color at 60% opacity). The glow is the active-state signifier — viewers learn this within the first 10 seconds.
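A minimal sketch of the two node states as a single function of an "active" progress value from 0 to 1, so the transition can be animated rather than switched. The numbers come from the description above; `NodeStyle`, `nodeStyle`, and `activeProgress` are hypothetical names chosen for illustration.

```typescript
type NodeStyle = {
  borderWidthPx: number;
  borderOpacity: number; // 0.3 at rest, 1.0 when active
  glowBlurPx: number;
  glowOpacity: number; // 0 at rest, 0.6 when active
};

// Linear blend written as a*(1-t) + b*t so both endpoints are hit exactly.
function lerp(a: number, b: number, t: number): number {
  return a * (1 - t) + b * t;
}

// activeProgress: 0 = resting node, 1 = fully active node.
function nodeStyle(activeProgress: number): NodeStyle {
  const t = Math.min(1, Math.max(0, activeProgress));
  return {
    borderWidthPx: 2,
    borderOpacity: lerp(0.3, 1, t),
    glowBlurPx: 24,
    glowOpacity: lerp(0, 0.6, t),
  };
}
```

Driving every node through the same function is what keeps the active-state signifier consistent across scenes.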

Arrows

An arrow is a 3px line with an arrowhead at the destination. Arrows draw in over 12–18 frames using strokeDasharray animation. The arrow's color carries semantic meaning: brand color = the new way, gray = the old way, red = a problem state. Use this consistently — viewers map color to meaning by the third arrow.
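The draw-in effect relies on two standard SVG attributes: `strokeDasharray` set to the full path length, and `strokeDashoffset` animated from that length down to zero. The sketch below computes both for a given frame. It assumes you already know the path length in pixels (for example from `getTotalLength()` on the path element); `arrowDashProps` and the `ArrowKind` names are illustrative only.

```typescript
// Semantic arrow kinds from the visual grammar; map these to your own color tokens.
type ArrowKind = "newWay" | "oldWay" | "problem";

type DashProps = { strokeDasharray: number; strokeDashoffset: number };

// frame: current frame; startFrame: when the arrow begins drawing;
// drawFrames: 12-18 per the guidance above; pathLength: length of the SVG path in px.
function arrowDashProps(
  frame: number,
  startFrame: number,
  drawFrames: number,
  pathLength: number
): DashProps {
  const raw = (frame - startFrame) / drawFrames;
  const progress = Math.min(1, Math.max(0, raw)); // clamp to [0, 1]
  return {
    strokeDasharray: pathLength,
    // Offset equals the full length when hidden and 0 when fully drawn.
    strokeDashoffset: pathLength * (1 - progress),
  };
}

// Example: a 300px arrow that starts drawing at frame 0 and takes 12 frames.
const halfway = arrowDashProps(6, 0, 12, 300);
```

Spreading the result onto a `<path>` element leaves color as the only per-arrow decision, which is where the semantic meaning lives.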

Tokens

A token is a small dot (typically 12–18px) that travels along arrows to represent a piece of data, a request, a user, etc. Tokens move at consistent speed (~600px/second) so the viewer learns to read distance as time. Tokens have one of three states: at-rest (a dot), moving (a dot with a 6px trailing fade), and arrived (a small bloom on impact at the destination node).
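Constant speed means travel time is derived from distance, never set by hand. The following sketch does that for a straight arrow, assuming 24 fps and the ~600px/second speed above; curved paths would need the path length instead of the straight-line distance. `Point`, `travelFrames`, and `tokenAt` are hypothetical helper names.

```typescript
type Point = { x: number; y: number };
type TokenState = "at-rest" | "moving" | "arrived";

const TOKEN_SPEED_PX_PER_SEC = 600;

// Number of frames a token needs to cover a distance at the fixed speed.
function travelFrames(distancePx: number, fps: number): number {
  return Math.ceil((distancePx / TOKEN_SPEED_PX_PER_SEC) * fps);
}

// Position and state of a token some frames after it leaves `from`.
function tokenAt(
  from: Point,
  to: Point,
  framesSinceStart: number,
  fps: number
): Point & { state: TokenState } {
  const distance = Math.hypot(to.x - from.x, to.y - from.y);
  const total = Math.max(1, travelFrames(distance, fps));
  const t = Math.min(1, Math.max(0, framesSinceStart / total));
  const state: TokenState = t <= 0 ? "at-rest" : t >= 1 ? "arrived" : "moving";
  return {
    x: from.x + (to.x - from.x) * t,
    y: from.y + (to.y - from.y) * t,
    state,
  };
}

// A 300px hop takes 12 frames at 24 fps; a 450px hop takes 18.
const mid = tokenAt({ x: 0, y: 0 }, { x: 300, y: 0 }, 6, 24);
```

The `arrived` state is the natural trigger for the bloom at the destination node.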

Transformations

When the system reorganizes (the "shift" beat), do not animate every element separately. Animate the whole graph: a single 1.0–1.4 second scale + translate transform that re-arranges the entire composition into its new state. Break it down only if the viewer cannot read the new state from the old one.
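One way to express the whole-graph move is a single eased interpolation between two layouts, applied to the group that wraps every node and arrow. This is a sketch under assumptions: the cubic ease-in-out curve is my choice (the guidance above does not name an easing), and `GraphLayout` and `graphTransform` are illustrative names.

```typescript
type GraphLayout = { scale: number; x: number; y: number };

// Cubic ease-in-out: slow start, fast middle, slow settle.
function easeInOutCubic(t: number): number {
  return t < 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;
}

// Returns the interpolated layout plus a CSS transform string for the wrapper element.
function graphTransform(
  frame: number,
  startFrame: number,
  seconds: number, // 1.0-1.4 per the guidance above
  fps: number,
  from: GraphLayout,
  to: GraphLayout
): GraphLayout & { css: string } {
  const total = Math.max(1, Math.round(seconds * fps));
  const raw = (frame - startFrame) / total;
  const t = easeInOutCubic(Math.min(1, Math.max(0, raw)));
  const mix = (a: number, b: number) => a * (1 - t) + b * t;
  const scale = mix(from.scale, to.scale);
  const x = mix(from.x, to.x);
  const y = mix(from.y, to.y);
  return { scale, x, y, css: `translate(${x}px, ${y}px) scale(${scale})` };
}

// Example layouts (hypothetical values): shrink slightly and slide left to make room.
const oldLayout: GraphLayout = { scale: 1, x: 0, y: 0 };
const newLayout: GraphLayout = { scale: 0.8, x: -200, y: 40 };
```

Because every element shares one transform, the viewer tracks a single motion instead of a dozen.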

The freeze frame

At the lesson beat, freeze the entire system. No motion at all. The single sentence fades in over 18 frames. Hold for 3 seconds before the brand mark appears. The freeze is the punctuation; the absence of motion is what makes the sentence land.
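The timing of the lesson beat reduces to two numbers. The sketch below assumes 24 fps and reads "hold for 3 seconds" as starting once the 18-frame fade has finished, which is an interpretation rather than something stated above. `lessonBeat` and `frameInBeat` are hypothetical names.

```typescript
const SENTENCE_FADE_FRAMES = 18;
const HOLD_SECONDS = 3;

// frameInBeat: frames elapsed since the lesson beat started (0 = first frame of the freeze).
function lessonBeat(
  frameInBeat: number,
  fps: number
): { sentenceOpacity: number; brandMarkVisible: boolean } {
  const holdFrames = HOLD_SECONDS * fps;
  // Linear fade from 0 to 1 over the first 18 frames, then fully opaque.
  const sentenceOpacity = Math.min(1, Math.max(0, frameInBeat / SENTENCE_FADE_FRAMES));
  // Assumption: the hold is measured from the end of the fade.
  const brandMarkVisible = frameInBeat >= SENTENCE_FADE_FRAMES + holdFrames;
  return { sentenceOpacity, brandMarkVisible };
}
```

At 24 fps the brand mark appears 90 frames into the beat, which leaves most of the 14-second lesson beat as a still frame.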

Audio production

A single narrator. Pick one Chirp 3 HD voice (Aoede works well; Sulafat for a more serious register) and stick with it across all your explainers.

The script is short — typically 110–160 words for a 75-second video. Read it aloud at normal pace; if it does not fit in 75 seconds, cut words, never increase speaking rate. A faster narrator does not produce a better explainer; a tighter script does.
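A quick arithmetic check on the script before anyone records: 110–160 words over 75 seconds works out to 88–128 words per minute. The helper below is a rough sketch that counts whitespace-separated words; it is no substitute for reading the script aloud. `countWords`, `wordsPerMinute`, and `checkScript` are illustrative names.

```typescript
const MIN_WORDS = 110;
const MAX_WORDS = 160;

// Counts whitespace-separated words; an empty script counts as zero.
function countWords(script: string): number {
  const trimmed = script.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function wordsPerMinute(words: number, seconds: number): number {
  return (words * 60) / seconds;
}

// Reports the word count, the implied speaking rate, and whether the
// script sits inside the 110-160 word budget for a 75-second video.
function checkScript(script: string, seconds: number) {
  const words = countWords(script);
  return {
    words,
    wpm: wordsPerMinute(words, seconds),
    withinBudget: words >= MIN_WORDS && words <= MAX_WORDS,
  };
}
```

If `withinBudget` is false on the high side, the fix is to cut words, not to raise the speaking rate.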

Music bed is one sustained ambient cue, never a melody: a synth pad, a Rhodes piano on a single chord, or a sustained string. The cue stays at -28 LUFS throughout, ducking to -32 LUFS under the narration. The narrator carries the rhythm; the music carries the air.
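Ducking from -28 to -32 LUFS is a 4 dB drop, which corresponds to a linear gain of roughly 0.63 on a bed already normalized to -28 LUFS. The sketch below computes a per-frame gain with a short linear ramp at each narration boundary. The ramp length and the `NarrationRange` shape are assumptions for illustration; the resulting number could be returned from a per-frame volume callback on your audio element or baked into the ffmpeg mix.

```typescript
// Convert a decibel change into a linear amplitude multiplier.
function dbToGain(db: number): number {
  return Math.pow(10, db / 20);
}

const BED_LUFS = -28;
const DUCKED_LUFS = -32;
const DUCK_GAIN = dbToGain(DUCKED_LUFS - BED_LUFS); // 4 dB down

type NarrationRange = { start: number; end: number }; // frames, inclusive

// Gain for the music bed at a frame: DUCK_GAIN while the narrator speaks,
// 1 elsewhere, with a linear ramp of `rampFrames` on either side of each range.
function musicGain(frame: number, narration: NarrationRange[], rampFrames: number): number {
  let gain = 1;
  for (const r of narration) {
    const outside = frame < r.start ? r.start - frame : frame > r.end ? frame - r.end : 0;
    const factor = Math.min(1, outside / Math.max(1, rampFrames));
    gain = Math.min(gain, DUCK_GAIN + (1 - DUCK_GAIN) * factor);
  }
  return gain;
}
```

A ramp of a few frames avoids an audible step when the narrator starts and stops.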

Sound design is critical and easy to underestimate. Each arrow that draws in gets a soft swoosh (2–4kHz EQ, ~150ms duration, -36 LUFS). Each token bloom on impact gets a soft pop. Each freeze frame gets a single low chime. Do not skip these — they are what makes the explainer feel produced rather than animated.

Hosting and embedding

Render at 1920×1080 at 24 fps, H.264, targeting roughly 8MB for 75 seconds. Also export 1:1 (1080×1080) for in-feed social. The 9:16 vertical cut is harder for explainers because the visual grammar (nodes, arrows, transformations) breaks under heavy cropping; you may need to redesign the layout for 9:16 rather than reframe the 16:9 master.
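The ~8MB target implies an average bitrate, which is the number most encoders want. A minimal sketch of the arithmetic, treating 1 MB as 1,000,000 bytes; the 128 kbps audio figure is a placeholder assumption, not a recommendation from this guide, and `averageKbps` and `videoKbps` are hypothetical names.

```typescript
// Average total bitrate (kilobits per second) for a file of sizeMB lasting `seconds`.
function averageKbps(sizeMB: number, seconds: number): number {
  return (sizeMB * 8 * 1000) / seconds; // MB -> kilobits, then per second
}

// Video share of the budget after reserving a fixed audio bitrate.
function videoKbps(sizeMB: number, seconds: number, audioKbps: number): number {
  return Math.max(0, averageKbps(sizeMB, seconds) - audioKbps);
}

const total = averageKbps(8, 75); // about 853 kbps overall
const video = videoKbps(8, 75, 128); // about 725 kbps left for video (assumed 128 kbps audio)
```

Flat 2D graphics on a solid ground compress well, which is why a budget this small is realistic for 1080p.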

Embed in docs pages directly via <video controls>. Embed in sales decks as a slide. Embed in long-form blog posts via the [[video:URL]] marker pattern.

What to skip

- Do not use Lottie animations for primitives. Lottie is great for character animation; for explainers, you want SVG primitives that are inspectable and re-styleable.
- Do not use 3D unless the concept requires 3D. A flat 2D explainer with restraint reads as more authoritative than a 3D explainer with rotating geometry.
- Do not use stock illustrations of "people with laptops". The explainer is about an abstract concept; introducing humans pulls the viewer back into a literal story.
- Do not include a CTA card at the end. The explainer is the CTA. The viewer who watches it understands the concept and is ready to talk; trust them to find the URL on the same page.

Hand-off checklist

- [ ] 30-minute conversation with the subject-matter expert
- [ ] List of 4–8 conceptual elements with one-line definitions
- [ ] List of relationships between elements
- [ ] The single sentence the viewer should walk away with
- [ ] Brand color, display font, body font
- [ ] Approval on the script outline before any animation begins
- [ ] Voice talent or TTS voice selected and consistent

Anti-Patterns

Animating because you can. Stillness is a tool. Use motion when the model changes; freeze when the viewer needs to read.

Inventing visual primitives mid-explainer. Pick four or five primitives at the start and use them throughout. New primitives at the 60-second mark force the viewer to re-learn the grammar and they will check out.

Reading the script too fast to fit the visuals. Cut words from the script. A faster narrator does not produce a better explainer; it produces a less comprehensible one.

Using stock music with a melody. A melody competes with the narrator. Pad-only ambient cues let the voice stay the focal point.

Skipping the sound design pass. Arrows that draw in silence feel cheap. A 30-minute Foley pass with subtle swooshes and pops is what separates a competent explainer from a polished one.

Related Skills

Comparison vs. Competitor Video (side-by-side, before/after)

Customer Testimonial Video (talking head + B-roll + lower thirds)

Enterprise Pitch Video (founder-led + integration choreography)

Feature Launch Video (Remotion + AI VO)

Product Demo Video (Remotion + AI VO + cycling stills)

Social Cutdown Video (15s vertical, 30s square, 9:16 + 1:1)