Core Principles

Treat prompt writing as storytelling. Use coherent and natural language instead of stacking isolated keywords.
Only describe dynamic events. Do not describe static visual details already visible in the input image (e.g., clothing, jewelry).
Keep prompts clear, free of contradictions, and light on negation. Use specific, concrete descriptions and guide actions step by step.
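
The principles above can be partly checked before a prompt is submitted. The following is a minimal sketch, not part of any official tooling: the function name lint_prompt, the negation word list, and the keyword-stacking thresholds are all assumptions chosen for illustration.

```python
import re

# Hypothetical word list; extend or trim it for your own prompts.
NEGATION_WORDS = {"no", "not", "never", "without", "don't", "doesn't", "cannot"}


def lint_prompt(prompt: str) -> list[str]:
    """Return heuristic warnings for a prompt; an empty list means none."""
    warnings = []
    words = re.findall(r"[a-z']+", prompt.lower())

    # Minimal negation: flag every negation word that appears.
    negations = sorted({w for w in words if w in NEGATION_WORDS})
    if negations:
        warnings.append("negation found: " + ", ".join(negations))

    # Keyword stacking: four or more comma-separated fragments,
    # each at most three words long (assumed thresholds).
    fragments = [f.strip() for f in prompt.split(",") if f.strip()]
    if len(fragments) >= 4 and all(len(f.split()) <= 3 for f in fragments):
        warnings.append("looks like stacked keywords; write full sentences")

    return warnings


print(lint_prompt("woman, smiling, coffee, window, 4k"))
print(lint_prompt("The woman smiles and talks to the camera."))
```

These checks are only heuristics. A prompt that passes them can still be vague or contradictory, so they complement a careful read rather than replace it.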

Best Practice Template

Recommended Structure

Camera movement + Emotion of the speaking character + Speaking state (talking / crying / singing / …) + Specific actions + (Optional) Background events / Other characters' actions

Complete Example

"The camera slowly moves from the side to a medium-front shot.
A young woman is sitting by the window, calm, gently picking up the coffee cup and smiling as she talks to the camera.
A boy beside her quietly watches her and occasionally turns to the camera and smiles.
The lights in the background are gently flickering."

Note: Including "the character talks/sings" improves lip-sync performance.
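
The recommended structure can be assembled programmatically when prompts are generated in bulk. The sketch below is one possible way to do it; build_prompt and its parameter names are hypothetical, and the sentence pattern used for the character is just one natural-sounding option.

```python
def build_prompt(camera, subject, emotion, speaking_state, actions, background=""):
    """Join the template components in the recommended order.

    camera and background are full sentences; the other arguments are
    short phrases that are combined into one sentence about the character.
    All names here are illustrative, not part of any API.
    """
    sentences = [
        camera.strip(),
        # Emotion + specific actions + speaking state in a single sentence.
        f"{subject} is {emotion}, {actions} while {speaking_state}.",
    ]
    if background.strip():
        # Optional background events or other characters' actions.
        sentences.append(background.strip())
    return " ".join(sentences)


prompt = build_prompt(
    camera="The camera slowly moves from the side to a medium-front shot.",
    subject="A young woman",
    emotion="calm",
    speaking_state="talking to the camera",
    actions="gently picking up the coffee cup",
    background="The lights in the background are gently flickering.",
)
print(prompt)
```

Keeping the speaking state as a required argument makes it hard to forget the "talks/sings" phrase mentioned in the note above.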

Writing Examples

Continuous Actions & Movement Control

Use "first… then…" to choreograph multi-step actions.
Always include "talking" or "singing" if the character must speak.
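
As a rough illustration of the two rules above, the following sketch chains steps with "first… then…" and checks whether a prompt mentions speech. The helper names choreograph and mentions_speech, and the list of verb forms, are assumptions made for this example.

```python
import re

# Assumed verb forms that signal speech or singing.
SPEECH_PATTERN = re.compile(
    r"\b(talk(s|ed|ing)?|speak(s|ing)?|spoke|spoken|sing(s|ing)?|sang|sung)\b",
    re.IGNORECASE,
)


def choreograph(subject: str, steps: list[str]) -> str:
    """Chain action phrases into one 'first ..., then ...' sentence."""
    if not steps:
        raise ValueError("at least one step is required")
    if len(steps) == 1:
        return f"{subject} {steps[0]}."
    parts = [f"{subject} first {steps[0]}"]
    parts += [f"then {step}" for step in steps[1:]]
    return ", ".join(parts) + "."


def mentions_speech(prompt: str) -> bool:
    """Return True if the prompt contains a talking or singing verb."""
    return SPEECH_PATTERN.search(prompt) is not None


sentence = choreograph(
    "The man",
    ["walks forward", "stops and puts his hands on his hips while speaking"],
)
print(sentence)
print(mentions_speech(sentence))
```

A check like mentions_speech is useful as a last step before submission when the character is expected to speak.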

Explosion Turn Back Action
"The man walks forward first, then stops and puts his hands on his hips while speaking.
Then he turns around to look at the explosion behind him, showing his back.
The clothes on his back have been blown to pieces."
Walking Singing Follow Shot
"The camera follows. The man turns around, looks at the camera and walks forward, singing.
He first touches his collar with both hands, then spreads his hands and raises his head."

Speak First Then Gesture / Gesture First Then Speak

Remove Cigarette Before Speaking
"The man quickly took the cigarette out of his mouth, then started speaking to the camera."
Speaking Turn And Run
"The camera zoomed in. The woman spoke to the camera, and after finishing, turned around and ran backward."

Multi-Character Action Control

Describe the interactions between characters.

Subject Detection

You must specify which subject speaks or sings.
One subject → pass its mask_url
Multiple subjects → pass a list of mask_urls
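
The one-subject and multi-subject cases above might be handled as in the sketch below. Only the name mask_url comes from this guide; the payload shape, the "prompt" key, the build_request helper, and the example URLs are all assumptions, so check the actual API reference before relying on them.

```python
def build_request(prompt: str, mask_urls: list[str]) -> dict:
    """Build a hypothetical request payload for one or more speaking subjects."""
    if not mask_urls:
        raise ValueError("specify at least one speaking or singing subject")

    payload = {"prompt": prompt}
    if len(mask_urls) == 1:
        # One subject: pass its mask URL directly.
        payload["mask_url"] = mask_urls[0]
    else:
        # Multiple subjects: pass the list of mask URLs (assumed field shape).
        payload["mask_url"] = list(mask_urls)
    return payload


# Placeholder URLs for illustration only.
single = build_request("The boy talks to the girl.", ["https://example.com/boy.png"])
multi = build_request(
    "The boy and the girl sing together.",
    ["https://example.com/boy.png", "https://example.com/girl.png"],
)
print(single)
print(multi)
```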
Boy Catches Up And Talks
"Following the camera, the boy quickened his pace to catch up with the girl and then started talking to her.
The girl stopped, turned around impatiently, and stared at the boy."
Torch Man Speaks Then Looks At Camera
"Following the camera, the man holding a torch on the left looked at the others while speaking.
Then he turned back to look into the camera."

Camera Movement Control

Precisely control movement type, speed, and end position.
Use words such as "rapidly" or "quickly" to get a stronger, more responsive camera movement.

Circular Pan Glasses Removal
"Pan the camera to the right in a circular motion and focus on the man's front face.
The man quickly takes off his glasses and speaks while facing forward."
Circular Pan Rapid Zoom In
"Pan the camera to the right in a circular motion.
The character speaks while facing forward.
Then quickly zoom in and focus on the character's front face.
The character puts down the binoculars and speaks."

Emotional Level & Micro-Expression Control

Calm To Panic Transition
"First, the man looked calm and spoke to the camera.
Then he suddenly became nervous and panicked, frowning tightly."
Contempt To Pride Sequence
"First, the woman rolled her eyes disdainfully and spoke with contempt.
Then she supported her chin with her index finger.
Finally, she played with the box on the table with a proud expression."

Special Effects / Adding Unreal Elements

Distorted Air Ripple Absorption
"Distorted air ripples appear.
The father and son are sucked into the depths like dust and disappear instantly,
leaving only the remote control and phone on the empty sofa."
Evil Sunglasses Chick
"The chick puts on sunglasses, holds two guns, and speaks evilly."

Camera Focus Control (Depth of Field)

Focus On Fallen King
"The reporter reports to the camera.
The king behind collapses.
The camera focuses on the collapsed king.
The reporter quickly turns around and speaks while looking behind."
Focus On Soldier Behind
"The woman turns around and speaks while looking at the soldier behind.
The camera focuses on the soldier.
Then she turns back to speak to the camera."

Environmental Interaction Description

Gentle Breeze Interaction
"A gentle breeze blows.
The character's clothes and hair flutter slightly.
The leaves in the background sway."
Selfie Orangutan Color Changing Lava
"The orangutan maintains a selfie pose and talks to the camera.
The volcanic lava behind gradually turns blue."

Precautions & Recommendations

Recommended Practices

Upload high-resolution input images.
Complex scenes may require multiple generations.
If a prompt does not work well, try generating without one.

Avoiding Common Issues

Structural stability is best for clips of up to about 15 seconds.
A very small face region may cause the mouth not to move.
Characters who leave and re-enter the frame may lose consistency.
Extreme poses or angles may degrade results.
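
Some of the issues above can be flagged before generation. The sketch below is only illustrative: the 15-second limit comes from this guide, while the preflight_warnings helper, its parameters, and the default minimum face ratio of 0.05 are assumptions with no official backing.

```python
def preflight_warnings(duration_s: float, face_area_ratio: float,
                       min_face_ratio: float = 0.05) -> list[str]:
    """Return warnings for settings that this guide says may hurt results.

    face_area_ratio is the face bounding-box area divided by the image area.
    min_face_ratio is a hypothetical threshold, not a documented value.
    """
    warnings = []
    if duration_s > 15:
        warnings.append("duration above ~15 s may reduce structural stability")
    if face_area_ratio < min_face_ratio:
        warnings.append("face region is very small; the mouth may not move")
    return warnings


print(preflight_warnings(duration_s=20, face_area_ratio=0.01))
print(preflight_warnings(duration_s=10, face_area_ratio=0.2))
```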