Lemon Slice-2

December 2025

We present Lemon Slice-2, a novel video diffusion transformer model and inference framework that enables real-time, interactive avatar experiences. Powered by a 20-billion-parameter, few-step causal model, it generates 20 frames per second on a single GPU. Efficient attention and caching strategies enable ultra-fast response times in an interactive setting and infinite-length videos with zero error accumulation. Lemon Slice-2 supports full-body avatar generation with expressive and semantically aware gestures. It is now available to the public for general use.

breaking the real-time barrier

Graph showing Lemon Slice-2 video generation speed exceeds real-time barrier and outperforms competitors

Lemon Slice-2 generates video frames faster than they can be watched. The strategies we used to break the real-time barrier include causal attention, a novel training paradigm inspired by distribution matching distillation, efficient caching, CUDA graph acceleration, and quantization.
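To make the causal attention idea concrete, here is a minimal sketch of a frame-wise (block-causal) attention mask, in which every token may attend to tokens in its own frame and in earlier frames but never to later ones. The function name `block_causal_mask`, the frame and token counts, and the choice of frame-level granularity are all assumptions for illustration; the announcement only states that causal attention is used, not how it is laid out.

```python
def block_causal_mask(num_frames: int, tokens_per_frame: int) -> list[list[bool]]:
    """Return mask[q][k], True when query token q may attend to key token k.

    Hypothetical layout: tokens are ordered frame by frame, and a token can
    see its own frame plus all earlier frames, never a later frame.
    """
    total = num_frames * tokens_per_frame
    mask = []
    for q in range(total):
        q_frame = q // tokens_per_frame
        # A key is visible when it belongs to the same or an earlier frame.
        mask.append([(k // tokens_per_frame) <= q_frame for k in range(total)])
    return mask


# Illustrative sizes only: 3 frames of 2 tokens each.
mask = block_causal_mask(num_frames=3, tokens_per_frame=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Because no token depends on future frames, frames can be produced one after another as audio arrives, which is what makes streaming generation possible in principle.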

ultra-fast response times

Graph showing Lemon Slice-2 has a very fast time to first byte (only 730 milliseconds)

Users of Lemon Slice-2 experience an average response time of 2.8 seconds. Video generation accounts for only 730 milliseconds of that time, about 26%.
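The arithmetic behind those figures can be checked in a few lines. This sketch uses only the two numbers quoted above; the variable names are illustrative, and the breakdown of the remaining time is not given in the source, so it is reported only as a single remainder.

```python
# Figures quoted in the text.
total_response_s = 2.8      # average end-to-end response time
video_generation_s = 0.730  # portion spent on video generation

# Share of the response time spent generating video.
share = video_generation_s / total_response_s

# Time spent in everything other than video generation.
remaining_s = total_response_s - video_generation_s

print(f"video generation share: {share:.0%}")
print(f"remaining time: {remaining_s:.2f} s")
```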

any character

videos generated in real-time from a single image and audio sample on one GPU

Example characters: Bear, Woman, Cute, Rock, Gorilla

any style

videos generated in real-time from a single image and audio sample on one GPU

Example styles: Catgirl, Funko, Lemon, Manga, Pop Art, Statue

expressive gestures & scene awareness

videos generated in real-time from a single image and audio sample on one GPU

Example videos: Manga, Old Man, Peasant, Woman, Fountain, Moto

infinite video

Graph showing Lemon Slice-2 can generate infinitely long videos without error accumulation, far exceeding competitors

As an autoregressive model, Lemon Slice-2 is not limited to generating videos of a fixed length. Critically, unlike other autoregressive models, it does not experience error accumulation, allowing for infinite-length video generation.
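One way to picture why an autoregressive generator has no fixed output length is a loop that conditions each new frame on a bounded window of recent context. The sketch below is a toy illustration under stated assumptions, not the Lemon Slice-2 implementation: `generate_frames` and `context_frames` are hypothetical names, the "model step" is a placeholder that returns the frame index, and the sliding-window cache is an assumed design. It shows only that memory stays constant however long the stream runs; it does not model how error accumulation is avoided.

```python
from collections import deque
from itertools import islice
from typing import Iterator


def generate_frames(context_frames: int) -> Iterator[tuple[int, tuple[int, ...]]]:
    """Yield (frame, visible_context) pairs indefinitely.

    Hypothetical sketch: the cache keeps only the most recent
    `context_frames` frames, so its size never grows with video length.
    """
    cache: deque = deque(maxlen=context_frames)  # oldest entries drop off automatically
    frame_index = 0
    while True:
        # Placeholder for a real model step conditioned on `cache`.
        frame = frame_index
        cache.append(frame)
        yield frame, tuple(cache)
        frame_index += 1


# The generator never terminates on its own; the caller decides when to stop.
frames = list(islice(generate_frames(context_frames=4), 10))
for frame, window in frames:
    print(frame, window)
```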

real-time interactions

Lemon Slice-2 enables real-time interactions with any character. Below we show screen recordings of the embeddable widget powered by the model, now available for general use.

Example recordings: Emotional Support, Education, Entertainment, Technical Support