
Reference image

The reference image is the foundation of how your video agent looks and talks. Our model picks up on every detail of the image and uses it to infer micro-expressions, poses, and even background motion. Here are some properties of a reference image that we think drive high-quality video agents.
Iterate on your images in the LemonSlice web app with our built-in AI image editor.
Larger faces produce the best facial detail. Make the head as large as possible. If you want your video agent to use hand gestures, use a half-body portrait while keeping the face relatively large.
Large Face
Don’t zoom out too much. Our model renders every pixel in real time, so focus those pixels on the character, not the background.
Zoom Level
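One rough way to check the framing advice above before uploading an image is to measure how much of the frame the face occupies. The sketch below assumes you already have a face bounding box from a face detector of your choice. The `Box` type, the function names, and the `MIN_FACE_COVERAGE` cutoff are all hypothetical, chosen for illustration; they are not part of LemonSlice and the cutoff is not a documented value.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned face bounding box in pixels (hypothetical helper type)."""
    left: int
    top: int
    width: int
    height: int


# Assumed cutoff for illustration only; not a documented LemonSlice value.
MIN_FACE_COVERAGE = 0.15


def face_coverage(image_width: int, image_height: int, face: Box) -> float:
    """Return the fraction of the image area covered by the face box."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    # Clip the box to the image so a partly off-frame box is not over-counted.
    x0 = max(face.left, 0)
    y0 = max(face.top, 0)
    x1 = min(face.left + face.width, image_width)
    y1 = min(face.top + face.height, image_height)
    visible_area = max(x1 - x0, 0) * max(y1 - y0, 0)
    return visible_area / (image_width * image_height)


def is_zoomed_out(
    image_width: int,
    image_height: int,
    face: Box,
    min_coverage: float = MIN_FACE_COVERAGE,
) -> bool:
    """Return True when the face covers less of the frame than min_coverage."""
    return face_coverage(image_width, image_height, face) < min_coverage
```

For example, `is_zoomed_out(1000, 1000, Box(0, 0, 100, 100))` flags an image where the face covers only 1% of the frame. Treat the result as a prompt to crop tighter, not as a guarantee of quality.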
Use an image in which the subject has a neutral expression and looks directly into the camera. Your reference image controls the expression and pose of your video agent throughout the call. For example, a reference image of a person looking up may cause the agent to look up for the entire call.
Expression and Pose
Drive simple animations from a single image. Our model animates simple scenes using the reference image, the agent prompt, and the idle prompt. Limit animations to relatively simple, looping motion such as waves in water and atmospheric effects. The model can also animate people in the background, but their motion won’t always be coherent.
Background
Example idle prompts for the images:
  • Left: Waves are moving in the background.
  • Center: Person driving and talking. Her hair is blowing in the wind. The background is moving as the car is moving.
  • Right: Person talking holding a bowl. Steam is coming out of the bowl.
For non-human or cartoon characters, use an image with the mouth slightly open. This will show the model how you want the mouth to animate.
Mouth

Choosing a voice

Voice intonation affects facial expression. For example, a voice with an angry intonation will yield an angry expression. Likewise, a monotonous voice will produce a neutral facial expression. Some voices work better than others, so if your video agent’s mouth looks off, try another voice.