I made a playable ping pong game where every frame is AI-generated. This is an interactive diffusion model I built from scratch.
A custom diffusion model, built from scratch, generates every frame of a playable game in real time.
An independent developer has built a fully playable ping pong game with a unique twist: every frame is generated in real time by a custom diffusion model. The model, built from scratch, takes the player's input (the up and down arrow keys, which move a paddle) and generates the corresponding image frame by frame. This approach, similar to projects like Decart's Oasis (an AI-generated take on Minecraft) and Google's upcoming Genie 3, treats gameplay as a continuous generative process rather than the rendering of pre-defined assets.
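The loop described above can be sketched in a few lines of Python. This is only an illustration of the general idea (sample a frame from noise, condition it on the previous frame and the player's action, then feed the result back in), not the developer's code. The names `denoise_step`, `generate_frame`, and `play`, the 8x8 resolution, and the four denoising steps are all assumptions, and `denoise_step` is a stub standing in for a trained network.

```python
import random

HEIGHT, WIDTH = 8, 8   # assumed toy resolution, not the project's
NUM_STEPS = 4          # assumed number of denoising steps per frame


def denoise_step(noisy, prev_frame, action, step):
    """Hypothetical stand-in for the trained network.

    A real model would predict a cleaner image from the noisy image, the
    previous frame, the action, and the step index. This stub ignores
    `action` and `step` and simply moves each pixel halfway toward the
    previous frame so that the loop is runnable.
    """
    return [
        [0.5 * n + 0.5 * p for n, p in zip(noisy_row, prev_row)]
        for noisy_row, prev_row in zip(noisy, prev_frame)
    ]


def generate_frame(prev_frame, action, rng):
    """Sample one frame: start from Gaussian noise, then denoise step by step."""
    frame = [[rng.gauss(0.0, 1.0) for _ in range(WIDTH)] for _ in range(HEIGHT)]
    for step in reversed(range(NUM_STEPS)):
        frame = denoise_step(frame, prev_frame, action, step)
    return frame


def play(actions, seed=0):
    """Generate one frame per action, feeding each frame back in as context."""
    rng = random.Random(seed)
    frame = [[0.0] * WIDTH for _ in range(HEIGHT)]  # blank starting frame
    frames = []
    for action in actions:
        frame = generate_frame(frame, action, rng)
        frames.append(frame)
    return frames


frames = play(["up", "up", "none", "down"])
```

In a real setup, the actions would come from the keyboard each tick and the stub would be replaced by a call to the trained model; the feedback structure of the loop would stay the same.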
The model was trained on a synthetic dataset of roughly 100,000 image-action pairs. Initial results after just one hour of training on an NVIDIA T4 GPU were rudimentary, but refining the training logic yielded significantly improved, coherent gameplay after three hours. This rapid iteration suggests that lightweight, accessible models can drive simple interactive experiences without massive computational resources. The project also illustrates a shift from AI as a content creation tool to AI as a core game engine component, one that generates coherent visual sequences in response to real-time interaction.
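One plausible way to produce such a synthetic dataset is to run a small hand-written version of the game with random inputs and record what happens. The sketch below assumes exactly that; the source does not describe the actual generator or sample format. The frame size, paddle length, the simplified physics (the ball always bounces at the paddle column), and the `(frame, action, next_frame)` layout are all assumptions made for illustration.

```python
import random

HEIGHT, WIDTH = 16, 16   # assumed frame size
PADDLE_LEN = 4           # assumed paddle height in pixels
ACTIONS = {"up": -1, "none": 0, "down": 1}  # paddle movement per tick


def render(paddle_y, ball_x, ball_y):
    """Draw a binary frame: paddle in column 0, ball as a single pixel."""
    frame = [[0] * WIDTH for _ in range(HEIGHT)]
    for y in range(paddle_y, paddle_y + PADDLE_LEN):
        frame[y][0] = 1
    frame[ball_y][ball_x] = 1
    return frame


def step(state, action):
    """Advance the hand-written game by one tick."""
    paddle_y, ball_x, ball_y, vx, vy = state
    # Move the paddle and clamp it to the screen.
    paddle_y = min(max(paddle_y + ACTIONS[action], 0), HEIGHT - PADDLE_LEN)
    # Bounce off the top and bottom walls.
    if not 0 <= ball_y + vy < HEIGHT:
        vy = -vy
    # Bounce off the right wall and, as a simplification, always off the
    # paddle column on the left, so the ball never leaves the screen.
    if not 1 <= ball_x + vx < WIDTH:
        vx = -vx
    return (paddle_y, ball_x + vx, ball_y + vy, vx, vy)


def make_dataset(num_samples, seed=0):
    """Record (frame, action, next_frame) samples from random play."""
    rng = random.Random(seed)
    state = (
        rng.randrange(HEIGHT - PADDLE_LEN + 1),  # paddle top row
        rng.randrange(1, WIDTH),                 # ball x, never in column 0
        rng.randrange(HEIGHT),                   # ball y
        rng.choice((-1, 1)),                     # ball x velocity
        rng.choice((-1, 1)),                     # ball y velocity
    )
    samples = []
    for _ in range(num_samples):
        action = rng.choice(list(ACTIONS))
        next_state = step(state, action)
        samples.append((render(*state[:3]), action, render(*next_state[:3])))
        state = next_state
    return samples


dataset = make_dataset(1000)
```

Because the simulator is cheap to run, raising `num_samples` to around 100,000 would match the scale reported for the project, and the model would then be trained to predict each `next_frame` from the preceding frame and action.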
This work demonstrates the feasibility of end-to-end generative AI for interactive media. While current results are simple, the technical foundation points toward a future where complex game worlds and narratives could be generated dynamically in response to player actions, moving beyond scripted events and pre-rendered environments.
- Every frame of the interactive ping pong game is generated in real time by a custom diffusion model.
- The model was trained for 3 hours on a T4 GPU using a synthetic dataset of ~100k image-action pairs.
- The project follows the same concept as larger interactive generative models such as Google's upcoming Genie 3.
Why It Matters
It prototypes a future where game worlds are generated dynamically by AI in real time, rather than rendered from pre-built assets.