visual-storytelling-auditor
Detects the #1 AI screenplay failure: writing for the page instead of the screen.
skilldb get screenplay-audit-skills/visual-storytelling-auditor
Visual Storytelling Auditor
Identifies when a screenplay fails to use the visual medium — the most fundamental AI screenwriting failure. Film is pictures that move. If you can close your eyes and lose nothing, it's not a screenplay.
When to Use This Skill
- Scenes are mostly people sitting and talking
- The script would work equally well as a radio play
- Action lines are sparse, vague, or describe internal states
- No visual storytelling — all meaning conveyed through dialogue
- Coverage notes say "uncinematic" or "feels like a stage play"
- Every scene is set in generic interiors (office, apartment, restaurant, car)
The Visual Storytelling Failures
Failure 1 — Talking Heads
Scenes where two characters sit in a fixed location and exchange dialogue with minimal physical action. This is television from the 1950s, not cinema.
AI's default scene:
INT. COFFEE SHOP - DAY
Sarah and Mark sit across from each other.
SARAH
I've been thinking about what
you said.
MARK
And?
SARAH
You were right. I need to make
a change.
(they continue talking for 2 pages)
Cinematic version:
The conversation happens while they DO something — walking through a farmer's market, driving, cleaning up after a party, building something, waiting in a hospital hallway. The environment creates visual interest, the activity creates rhythm (pauses to examine produce, react to traffic, scrub a dish), and the location adds meaning.
Diagnostic: For each dialogue scene, note the location and what characters are physically doing. If the answer is "sitting" or "standing" with no significant physical action for more than a page, it's a talking head scene.
Talking-head percentage: Divide the number of talking-head scenes by the total number of scenes with dialogue. Above 40%, the script is visually inert.
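The percentage above is simple enough to compute once each dialogue scene has been marked by hand. The following is a minimal sketch, not part of the skill itself: the `DialogueScene` structure and both function names are hypothetical, and the talking-head judgment is assumed to come from a human reader applying the diagnostic.

```python
from dataclasses import dataclass


@dataclass
class DialogueScene:
    """One scene that contains dialogue (hypothetical structure for this sketch)."""
    slugline: str
    is_talking_head: bool  # decided by a reader using the diagnostic above


def talking_head_percentage(scenes: list[DialogueScene]) -> float:
    """Talking-head scenes divided by total scenes with dialogue, as a percentage."""
    if not scenes:
        return 0.0  # no dialogue scenes, so nothing to flag
    flagged = sum(1 for scene in scenes if scene.is_talking_head)
    return 100.0 * flagged / len(scenes)


def is_visually_inert(scenes: list[DialogueScene], threshold: float = 40.0) -> bool:
    """Apply the 'above 40%' rule; the threshold is kept as a parameter."""
    return talking_head_percentage(scenes) > threshold
```

Scenes without dialogue are left out of the input entirely, since the denominator is scenes with dialogue rather than all scenes.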
Failure 2 — Invisible Action Lines
AI writes action lines that describe things the camera can't show.
Unfilmable action lines:
Sarah thinks about her mother's death.
He remembers the first time they met.
She feels a wave of regret wash over her.
Mark considers his options carefully.
The weight of the decision hangs over her.
None of these can be filmed. A film can only capture what is VISIBLE or AUDIBLE.
Filmable replacements:
Sarah touches the empty chair at the head of the table.
He pulls a faded photo from his wallet. Doesn't look at it — just holds it.
She picks up the phone. Puts it down. Picks it up again.
Mark stares at the two doors. Steps toward one. Stops.
Sarah stands at the edge of the cliff. Wind whips her hair. She doesn't step back.
Detection method: Search action lines for:
- Thinks, wonders, considers, remembers, realizes, understands, knows, feels
- Any verb describing internal cognition or emotion
- Abstract nouns: weight, tension, gravity, atmosphere, sense, feeling
Each is a flag. Replace with visible behavior.
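The search described above could be automated with two regular expressions. This is a rough sketch under stated assumptions: the function name `flag_action_line` is hypothetical, the verb pattern only covers the base and third-person present forms listed above (screenplay action lines are normally present tense), and the caller is assumed to have already separated action lines from dialogue.

```python
import re

# Verbs from the list above, matched as base form or with a trailing "s".
# Other inflections (past tense, "-ing") are deliberately not covered here.
COGNITION_VERBS = re.compile(
    r"\b(think|wonder|consider|remember|realize|understand|know|feel)s?\b",
    re.IGNORECASE,
)

# Abstract nouns from the list above.
ABSTRACT_NOUNS = re.compile(
    r"\b(weight|tension|gravity|atmosphere|sense|feeling)\b",
    re.IGNORECASE,
)


def flag_action_line(line: str) -> list[str]:
    """Return flagged words in one action line, lowercased (verbs first, then nouns)."""
    hits = [match.group(0).lower() for match in COGNITION_VERBS.finditer(line)]
    hits += [match.group(0).lower() for match in ABSTRACT_NOUNS.finditer(line)]
    return hits


if __name__ == "__main__":
    for action_line in (
        "Sarah thinks about her mother's death.",
        "Sarah touches the empty chair at the head of the table.",
    ):
        print(action_line, "->", flag_action_line(action_line))
```

A word list will produce false positives (for example "the weight drops off the barbell" is perfectly filmable), so each hit is a prompt for review and not an automatic error.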
Failure 3 — Dialogue Carrying All Information
AI uses dialogue to convey information that should be communicated visually.
Dialogue-dependent:
SARAH
This house is falling apart.
Look at the cracks in the walls.
Visual storytelling:
Sarah runs her hand along a crack in the plaster.
It crumbles at her touch. A CHUNK falls to the floor.
She looks up. Water stains spread across the ceiling
like a bruise.
The audience sees it. They don't need to be told.
The silent movie test: Could you tell this story as a silent film? What percentage of the plot, emotion, and character would survive without any dialogue?
- 70%+ silent: excellent visual storytelling
- 50-70%: adequate but dialogue-dependent
- Below 50%: the screenplay is a radio play with pictures
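The three bands above can be written as a small lookup. This is only a sketch: the function name is hypothetical, the percentage itself remains a reader's estimate, and because the stated bands overlap at exactly 70%, the sketch assumes 70 belongs to the top band.

```python
def silent_movie_verdict(silent_percent: float) -> str:
    """Map an estimated 'survives without dialogue' percentage to the bands above."""
    if not 0 <= silent_percent <= 100:
        raise ValueError("silent_percent must be between 0 and 100")
    if silent_percent >= 70:  # assumption: exactly 70 counts as "70%+"
        return "excellent visual storytelling"
    if silent_percent >= 50:
        return "adequate but dialogue-dependent"
    return "the screenplay is a radio play with pictures"
```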
Failure 4 — Static Staging
All scenes take place in contained, static locations. No movement, no changing environments, no visual progression.
AI's location defaults:
- INT. LIVING ROOM
- INT. OFFICE
- INT. RESTAURANT
- INT. CAR (parked)
- INT. BEDROOM
Cinematic staging:
- Move the conversation to a location that creates visual tension (a construction site, a boat, a crowded subway, a hospital)
- Have characters walk through spaces instead of sitting in them
- Change the environment during the scene (lights go out, rain starts, crowd arrives)
- Use vertical space (characters on different levels — rooftop, stairwell, balcony)
- Use the background (something happening behind the characters that comments on or contrasts with the dialogue)
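The default locations listed in this section can be spotted by scanning scene headings. The sketch below is an illustration only: the function name is hypothetical, the set of generic interiors is copied from the list above, and the heading pattern assumes the common `INT. LOCATION - TIME` form (combined `INT./EXT.` headings are not handled).

```python
import re

# Location defaults copied from the list above.
GENERIC_INTERIORS = {"LIVING ROOM", "OFFICE", "RESTAURANT", "CAR", "BEDROOM"}

# Captures INT/EXT and the location, ignoring an optional " - TIME" suffix.
SLUGLINE = re.compile(r"^(INT|EXT)\.\s+(.+?)(?:\s+-\s+.*)?$")


def is_generic_interior(slugline: str) -> bool:
    """True if the heading is an interior set in one of the default locations."""
    match = SLUGLINE.match(slugline.strip().upper())
    if match is None or match.group(1) != "INT":
        return False
    # Drop parenthetical notes such as "(parked)" before comparing.
    location = re.sub(r"\s*\(.*?\)", "", match.group(2)).strip()
    return location in GENERIC_INTERIORS
```

Counting how many headings return `True` gives a quick sense of how much of the script sits in default rooms; it says nothing about whether a particular scene uses its room well.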
Failure 5 — No Visual Motifs
AI doesn't plant and pay off visual elements across the script. Great screenplays have recurring images that accumulate meaning.
Visual motif examples:
- A clock that appears in three scenes, each time showing less time
- A red coat visible in a sea of grey in multiple scenes
- Water imagery that shifts from calm to turbulent as the story escalates
- A door that's open in Act 1, closed in Act 2, broken in Act 3
- An empty chair that represents an absent character
Diagnostic: Can you identify 3+ recurring visual elements in the script? If not, the script lacks visual language.
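If a reader tags the notable visual elements in each scene, the 3+ check above becomes a counting exercise. This sketch makes two assumptions not stated in the source: the tags are applied by hand, and "recurring" means appearing in at least two scenes. Both function names are hypothetical.

```python
from collections import Counter


def recurring_visual_elements(
    scene_elements: list[set[str]], min_scenes: int = 2
) -> list[str]:
    """Return tagged elements that appear in at least `min_scenes` scenes, sorted."""
    counts = Counter(
        element for elements in scene_elements for element in elements
    )
    return sorted(name for name, seen in counts.items() if seen >= min_scenes)


def has_visual_language(scene_elements: list[set[str]], required: int = 3) -> bool:
    """Apply the '3+ recurring visual elements' diagnostic."""
    return len(recurring_visual_elements(scene_elements)) >= required
```

Each scene is a set, so an element mentioned several times within one scene still counts once for that scene.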
Failure 6 — No Environmental Storytelling
The environment doesn't reflect, contrast with, or comment on the dramatic action.
AI approach: Locations are neutral containers. The coffee shop is just a coffee shop.
Cinematic approach: Locations are active storytelling partners:
- A breakup in a restaurant where other couples are celebrating anniversaries
- A job interview in an office where the interviewer's walls display achievement awards the protagonist will never have
- A reunion at a playground where children play — emphasizing the passage of time
- A confession during a storm, where the thunder covers the most important words
Failure 7 — Props and Objects Unused
AI doesn't use physical objects as storytelling devices. Great screenplays are full of objects that carry meaning.
Object storytelling:
- A wedding ring removed, placed on a nightstand, noticed by another character
- A gun in a drawer that's opened twice before it's used
- A letter that's written, crumpled, uncrumpled, rewritten, and finally burned
- A toy that connects a parent's storyline to a child's
- A suitcase packed and unpacked three times before the character actually leaves
Visual Storytelling Score
For each scene, rate:
| Dimension | Score 1-5 |
|---|---|
| Physical action (characters doing, not just talking) | |
| Location specificity (could only happen here) | |
| Environmental storytelling (setting adds meaning) | |
| Object/prop use (things carry weight) | |
| Filmability (action lines describe visible/audible things) | |
| Visual information (camera tells us things dialogue doesn't) | |
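The table does not say how the six ratings combine, so the following sketch only validates a scene's ratings and reports their plain total out of 30. The dimension keys and the function name are hypothetical, and the unweighted sum is an assumption, not the skill's documented method.

```python
# Hypothetical keys, one per row of the table above.
DIMENSIONS = (
    "physical_action",
    "location_specificity",
    "environmental_storytelling",
    "object_prop_use",
    "filmability",
    "visual_information",
)


def scene_total(ratings: dict[str, int]) -> int:
    """Check that every dimension is rated 1-5 and return the total (6 to 30)."""
    missing = [name for name in DIMENSIONS if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    for name in DIMENSIONS:
        if not 1 <= ratings[name] <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {ratings[name]}")
    return sum(ratings[name] for name in DIMENSIONS)
```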
Manuscript-Level Metrics
VISUAL STORYTELLING REPORT:
Talking head scenes: 24 of 42 (57%) — CRITICAL
Unfilmable action lines: 67 instances
Scenes with significant physical action: 12 of 42 (29%)
Unique locations: 8 (low variety)
Visual motifs identified: 0
Props/objects used dramatically: 2
OVERALL VISUAL SCORE: 28/100
DIAGNOSIS: This screenplay is a radio play. Major visual
storytelling overhaul needed.
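The "N of M (P%)" lines in the report can be produced with a small helper. This is a sketch with a hypothetical function name. It covers only the ratio lines: the source does not define how the overall score out of 100 is derived, so that figure is not computed here.

```python
def ratio_line(label: str, count: int, total: int) -> str:
    """Format one report line as 'label: count of total (percent%)'."""
    if total <= 0:
        raise ValueError("total must be positive")
    percent = round(100 * count / total)  # rounded to a whole percent
    return f"{label}: {count} of {total} ({percent}%)"


if __name__ == "__main__":
    print(ratio_line("Talking head scenes", 24, 42))
    print(ratio_line("Scenes with significant physical action", 12, 42))
```

With the sample figures from the report above, the two calls reproduce the 57% and 29% values shown there.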
Rewrite Toolkit
Convert Dialogue to Visual
For every flagged scene, provide a visual alternative:
CURRENT (dialogue-dependent):
Sarah tells Mark she's leaving.
VISUAL REWRITE OPTIONS:
a) Mark comes home to find the closet half-empty. Her side of the
medicine cabinet is bare. The silence tells him.
b) Sarah packs while Mark watches. She reaches for the photo on
the nightstand. Leaves it. That's the moment he knows.
c) The moving truck is already there when Mark turns the corner.
He stops walking. Stands in the middle of the street.
The Five Visual Questions
For every scene, ask:
- What does the audience SEE that they don't HEAR?
- What is in the BACKGROUND that comments on the FOREGROUND?
- What OBJECT in this scene carries emotional weight?
- What would this scene look like with NO DIALOGUE?
- What does the LOCATION tell us that the characters don't?
Anti-Patterns
- Demanding constant visual pyrotechnics. Some scenes should be two people talking. The question is whether the SCRIPT AS A WHOLE has visual storytelling, not whether every scene is an action sequence.
- Over-directing in action lines. "CAMERA PUSHES IN on her trembling hand" is directing on the page. Write what the audience sees, not how the camera should capture it.
- Eliminating all dialogue. The goal isn't silence — it's balance. Great screenplays use dialogue AND visuals, each doing what it does best.
- Confusing visual with expensive. A character staring at an uneaten birthday cake is visual storytelling. It costs nothing. Visual doesn't mean VFX.
- Ignoring genre. A courtroom drama will be more dialogue-heavy than an action film. But even courtroom scenes can use visual storytelling (reactions, objects, spatial dynamics).
Install this skill directly: skilldb add screenplay-audit-skills