I built a Telegram bot that controls ComfyUI video generation from my phone – approve or regenerate each shot with one tap
Approve or regenerate AI video shots with one tap from your couch, eliminating PC babysitting.
A developer, LooPene44, has built a custom Python pipeline that frees AI video creators from sitting at their workstations while shots render. The pipeline connects ComfyUI, the popular node-based interface, to the Telegram messaging platform through its Bot API. Users define a video scene in a JSON file (specifying parameters such as StartFrame, prompts, CFG scale, and steps) and then start generation remotely. Once a shot is complete, which takes approximately 130 seconds on an RTX 5070 Ti GPU, the system automatically sends both a preview frame and the full RIFE-interpolated 32fps MP4 video to a Telegram chat. The message comes with two buttons: one to approve the shot and proceed, and another to regenerate it with a new random seed.
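
To make the review step concrete, here is a minimal sketch of how a scene file, the preview-frame extraction, and the two-button message could fit together. It is not the author's code: the scene keys other than `StartFrame`, the `approve:`/`regen:` callback strings, and all function names are assumptions for illustration, and the preview is assumed to be the first frame of the video.

```python
import json

# Hypothetical scene definition. The post only says the JSON holds StartFrame,
# prompts, CFG scale and steps; key names and values here are illustrative.
SCENE = {
    "shot_id": "shot_01",
    "StartFrame": "frames/shot_01.png",
    "prompt": "slow dolly-in on a rainy street at night",
    "negative_prompt": "blurry, distorted",
    "cfg": 3.5,
    "steps": 20,
}


def preview_frame_cmd(video_path: str, image_path: str) -> list:
    """Build an ffmpeg command that writes the first video frame to an image."""
    return ["ffmpeg", "-y", "-i", video_path, "-frames:v", "1", image_path]


def review_keyboard(shot_id: str) -> dict:
    """Telegram InlineKeyboardMarkup with one row holding the two buttons."""
    return {
        "inline_keyboard": [[
            {"text": "Approve", "callback_data": f"approve:{shot_id}"},
            {"text": "Regenerate", "callback_data": f"regen:{shot_id}"},
        ]]
    }


def send_video_fields(chat_id: int, shot_id: str, caption: str) -> dict:
    """Form fields for a sendVideo call; the MP4 goes in the 'video' file part."""
    return {
        "chat_id": chat_id,
        "caption": caption,
        # In a multipart request, reply_markup must be a JSON-serialized string.
        "reply_markup": json.dumps(review_keyboard(shot_id)),
    }
```

With `requests`, these fields would be posted to `https://api.telegram.org/bot<token>/sendVideo` along the lines of `requests.post(url, data=fields, files={"video": open(path, "rb")})`, and the command list would be passed to `subprocess.run`. The preview image can be sent the same way through `sendPhoto`.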
The technical stack is deliberately lightweight: Python with the `requests` library for API calls and `ffmpeg` for frame extraction, with simple `getUpdates` polling instead of webhooks. Generation uses the Wan 2.2 I2V 14B model in a GGUF quantized format (Q6_K), run through a dual KSampler setup that handles the high-noise and low-noise stages. The core change is a shift from passively waiting at the PC to reviewing each shot asynchronously: every render ends in a quick approve-or-regenerate decision that can be made from a phone. The project reflects a growing trend of developers building bespoke automation and interface layers on top of powerful but complex AI toolkits to streamline creative workflows.
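
The `getUpdates` polling approach can be sketched as follows. This is an illustration under assumptions rather than the project's actual loop: the function names are hypothetical, and the `approve:<shot_id>` / `regen:<shot_id>` callback format is invented for the example. The `offset`, `timeout`, and `allowed_updates` parameters are part of the Telegram Bot API.

```python
import json

API_URL = "https://api.telegram.org/bot{token}/{method}"


def fetch_updates(token: str, offset: int, timeout: int = 30) -> list:
    """Long-poll getUpdates once and return the raw list of updates."""
    # Third-party dependency, imported here so the parser below stays stdlib-only.
    import requests

    resp = requests.get(
        API_URL.format(token=token, method="getUpdates"),
        params={
            "offset": offset,
            "timeout": timeout,
            "allowed_updates": json.dumps(["callback_query"]),
        },
        timeout=timeout + 10,  # HTTP timeout must outlast the long poll
    )
    resp.raise_for_status()
    return resp.json()["result"]


def parse_decisions(updates: list, offset: int = 0):
    """Return ([(action, shot_id), ...], next_offset) for button presses."""
    decisions = []
    for update in updates:
        # Passing update_id + 1 as the next offset confirms this update,
        # so Telegram does not deliver it again.
        offset = max(offset, update["update_id"] + 1)
        data = update.get("callback_query", {}).get("data", "")
        action, sep, shot_id = data.partition(":")
        if sep and action in ("approve", "regen"):
            decisions.append((action, shot_id))
    return decisions, offset
```

A main loop would repeatedly call `parse_decisions(fetch_updates(token, offset), offset)` and act on each decision. Polling avoids the public HTTPS endpoint a webhook would need, which suits a pipeline running on a home PC.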
- Remote control via Telegram bot allows approving or regenerating video shots with one tap from a mobile device
- Each shot generation takes ~130s on an RTX 5070 Ti and delivers a 32fps RIFE-interpolated MP4 for review
- Workflow uses a JSON scene definition and a Python pipeline with ComfyUI's API and the Wan 2.2 I2V 14B model
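
As a rough sketch of the ComfyUI side, the snippet below patches an API-format workflow with scene settings and a fresh seed, then queues it through ComfyUI's `/prompt` endpoint. The helper names, the choice to apply the same seed, CFG, and steps to every sampler node, and the local address are assumptions; the author's actual workflow wiring is not shown in the post.

```python
import copy
import random

COMFY_URL = "http://127.0.0.1:8188"  # ComfyUI's default local address; adjust as needed


def new_seed() -> int:
    """Pick a fresh random seed for a regenerate request."""
    return random.randint(0, 2**32 - 1)


def apply_scene(workflow: dict, scene: dict, seed: int) -> dict:
    """Return a patched copy of an API-format workflow (node_id -> node dict)."""
    patched = copy.deepcopy(workflow)
    for node in patched.values():
        if not node.get("class_type", "").startswith("KSampler"):
            continue
        inputs = node.get("inputs", {})
        # KSampler exposes "seed"; KSamplerAdvanced exposes "noise_seed".
        for key in ("seed", "noise_seed"):
            if key in inputs:
                inputs[key] = seed
        # Assumption: both samplers take the same cfg/steps from the scene file.
        for key in ("cfg", "steps"):
            if key in inputs and key in scene:
                inputs[key] = scene[key]
    return patched


def queue_prompt(workflow: dict) -> str:
    """Submit the workflow to ComfyUI and return the prompt_id it reports."""
    import requests  # third-party; imported lazily so the helpers above stay stdlib-only

    resp = requests.post(f"{COMFY_URL}/prompt", json={"prompt": workflow}, timeout=30)
    resp.raise_for_status()
    return resp.json()["prompt_id"]
```

Regenerating a shot then amounts to `queue_prompt(apply_scene(workflow, scene, new_seed()))`. Prompt text and the start frame would be patched into their own nodes in the same way, using node IDs from the exported workflow.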
Why It Matters
It shows how a small custom automation layer can turn long-running AI video generation into a mobile, asynchronous review loop, so creators no longer have to sit at the PC between shots.