PromptFork

Veo talking-character scene — lip-sync optimization and the reaction shot technique

Veo's native speech generation is its standout feature. This prompt gives you a clip template optimized for clean lip-sync (shorter sentences, consonant-heavy words), the reaction-shot technique for variety, and the audio mixing levels that make dialogue sound professional.

Prompt
Veo can generate synced speech — its standout feature. Here's how to get the cleanest results:

BASE TEMPLATE:
'[Shot size — see shot guide] of [character description] [action while speaking], in [setting], [lighting], [mood/style].
The character says: "[YOUR LINE — see lip-sync optimization]"
Audio: [ambient sound at low level], natural lip-sync, [tone of voice], [dialogue about 6 dB above ambient].'

Example: 'Medium close-up of a friendly barista wiping the counter and looking up with a warm smile, in a cozy morning cafe with exposed brick, warm window light with soft shadows, documentary style.
The character says: "Morning! The usual today?"
Audio: quiet espresso machine hum, gentle background chatter at low level, natural lip-sync, warm and friendly tone, dialogue clear above ambient.'
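
If you generate many clips, the base template can be filled programmatically. The following is a minimal sketch, not part of any Veo tooling: the function name build_veo_prompt and its parameter names are hypothetical, and it only assembles the prompt text. It does not call Veo.

```python
def build_veo_prompt(shot, character, action, setting, lighting, style,
                     line, ambient, tone):
    """Fill the three-part base template; every argument is free text."""
    # Part 1: the visual description (shot size, subject, setting, look).
    visual = f"{shot} of {character} {action}, in {setting}, {lighting}, {style}."
    # Part 2: the spoken line, kept in double quotes as literal speech.
    speech = f'The character says: "{line}"'
    # Part 3: the audio description, with ambient kept quieter than the voice.
    audio = (f"Audio: {ambient} at low level, natural lip-sync, "
             f"{tone} tone, dialogue clear above ambient.")
    return "\n".join([visual, speech, audio])


prompt = build_veo_prompt(
    shot="Medium close-up",
    character="a friendly barista",
    action="wiping the counter and looking up with a warm smile",
    setting="a cozy morning cafe with exposed brick",
    lighting="warm window light with soft shadows",
    style="documentary style",
    line="Morning! The usual today?",
    ambient="quiet espresso machine hum, gentle background chatter",
    tone="warm and friendly",
)
print(prompt)
```

Keeping the spoken line as its own argument makes it easy to swap dialogue between clips while the look of the scene stays fixed.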

LIP-SYNC OPTIMIZATION (what actually syncs well vs what doesn't):
• SHORTER SENTENCES SYNC BETTER: Keep each line to 5-12 words. Longer sentences accumulate sync drift. If you need more dialogue, split into separate clips.
• CONSONANT-HEAVY WORDS ARE CLEARER: Words with strong lip movements (P, B, M, F, V, W, TH) sync more visibly than vowel-heavy words. 'Perfect morning for a fresh brew' syncs better than 'Oh, I see you are here again.'
• AVOID: Long questions with rising intonation at the end (sync often drifts), words with silent letters, rapid-fire dialogue, whispering (lip movement too subtle to sync).
• BEST: Declarative sentences, greetings, short questions, exclamations.
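
The length and consonant rules above can be turned into a rough pre-flight check for a line of dialogue. This is a sketch under stated assumptions: check_line is a hypothetical helper, it works from spelling rather than pronunciation, and it only counts words that begin with one of the listed letters (P, B, M, F, V, W, TH). Treat the result as a hint, not a measurement of sync quality.

```python
import re

# Letters from the list above; "th" is handled as a two-letter onset.
VISIBLE_ONSETS = ("p", "b", "m", "f", "v", "w", "th")


def check_line(line, max_words=12):
    """Return word count, visible-onset count, and a too-long flag."""
    words = re.findall(r"[A-Za-z']+", line)
    visible = sum(1 for w in words if w.lower().startswith(VISIBLE_ONSETS))
    return {
        "word_count": len(words),
        "visible_onsets": visible,
        "too_long": len(words) > max_words,  # split into separate clips
    }


good = check_line("Perfect morning for a fresh brew")
weak = check_line("Oh, I see you are here again.")
print(good)
print(weak)
```

The two sample lines are the ones compared above: the first has five of six words starting with a visible consonant, the second has none.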

THE REACTION SHOT TECHNIQUE (for longer dialogue scenes):
Don't try to put a full conversation in one clip. Instead:
• Clip 1: Character A speaks their line (medium close-up)
• Clip 2: Character B LISTENING and REACTING (close-up — nodding, smiling, furrowing brow) with Character A's voice continuing as voiceover
• Clip 3: Character B responds
This is how real filmmakers handle dialogue — the reaction shot is often more interesting than the speaking shot. It also hides any lip-sync imperfections because the listener's mouth isn't supposed to be moving.

SHOT SIZE GUIDE FOR DIALOGUE:
• Close-up (face fills frame): Best for emotional lines, confessions, intensity. Lip-sync is most scrutinized here — use your strongest lines.
• Medium close-up (head + shoulders): THE SWEET SPOT. Close enough to read lips and emotion, forgiving enough that minor sync issues aren't distracting.
• Medium shot (waist up): Good when body language matters (gesturing, working while talking). Lip-sync less critical at this distance.
• Wide shot: Avoid for dialogue — lips too small to sync meaningfully.

AUDIO MIXING GUIDANCE:
• Dialogue should sit approximately 6 dB above ambient sound (roughly twice the amplitude). In practice: describe ambient sounds as 'quiet,' 'low level,' 'subtle background' and the voice as 'clear,' 'prominent,' 'warm tone.'
• Always include at least one ambient sound — pure silence feels uncanny. Even 'room tone, quiet air conditioning hum' adds realism.
• If adding music in post: keep it 12 dB below dialogue. Music should support, never compete with speech.
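
When you mix in an editor that uses linear gain instead of decibels, the level differences above convert with the standard amplitude formula, gain = 10^(dB/20). A minimal sketch follows; the function name db_to_gain and the variable names are illustrative only.

```python
def db_to_gain(db):
    """Convert a level difference in decibels to a linear amplitude ratio."""
    return 10 ** (db / 20)


# Dialogue about 6 dB above ambient: close to double the amplitude.
dialogue_over_ambient = db_to_gain(6)

# Music 12 dB below dialogue: close to one quarter of the amplitude.
music_under_dialogue = db_to_gain(-12)

print(round(dialogue_over_ambient, 2), round(music_under_dialogue, 2))
```

A positive difference means louder than the reference and a negative one means quieter, so "6 dB above" is +6 and "12 dB below" is -12.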

Tips: keep dialogue to ONE short line per clip for the cleanest sync; put the spoken words in quotes so Veo treats them as literal speech; describe the voice TONE (warm, authoritative, playful, conspiratorial) not just the words — delivery matters; for a conversation between two characters, generate each character's lines as separate clips and cut them together in editing.
Source
promptfork seed
License
CC-BY-4.0
Published
6/22/2026
