Meow-Omni 1 decodes cat intent with 71% accuracy using quad-modal AI
This open-source MLLM fuses video, audio, and physiological data to understand what your cat really means.
Understanding what animals actually feel is notoriously difficult because of "semantic aliasing": the same external signal (e.g., a cat's purr) can mean radically different things depending on context. Existing multimodal AI systems rely on superficial pattern matching and cannot process high-frequency biological time-series data, which leaves them blind to physiological context. To address this, a team of researchers from multiple institutions introduces Meow-Omni 1, the first open-source, quad-modal large language model built specifically for computational ethology. The model fuses video, audio, and physiological time-series streams with textual reasoning through specialized scientific encoders integrated into a unified backbone. It uses physiologically grounded cross-modal alignment to infer latent internal states rather than merely matching surface behaviors.
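The fusion idea can be pictured with a toy sketch. It assumes, without confirmation from the source, that each modality encoder emits a sequence of embeddings, that a per-modality projection maps them to a shared width, and that the projected tokens are concatenated into one sequence for the backbone. Every name (`make_projection`, `project`, `fuse_streams`), every dimension, and the random weights are hypothetical placeholders, not Meow-Omni 1's actual components.

```python
import random
from typing import Dict, List, Tuple

Vector = List[float]


def make_projection(in_dim: int, out_dim: int, seed: int) -> List[Vector]:
    """Random out_dim x in_dim matrix standing in for a learned projection."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights: List[Vector], vec: Vector) -> Vector:
    """Plain matrix-vector product: maps one embedding to the shared width."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def fuse_streams(
    streams: Dict[str, List[Vector]], shared_dim: int
) -> List[Tuple[str, Vector]]:
    """Project every modality to shared_dim and concatenate into one sequence."""
    fused: List[Tuple[str, Vector]] = []
    for seed, (name, tokens) in enumerate(streams.items()):
        if not tokens:
            continue  # a missing modality simply contributes no tokens
        weights = make_projection(len(tokens[0]), shared_dim, seed)
        fused.extend((name, project(weights, tok)) for tok in tokens)
    return fused


# Hypothetical encoder outputs: (number of tokens) x (encoder width) per modality.
streams = {
    "video": [[0.1] * 8 for _ in range(4)],
    "audio": [[0.2] * 6 for _ in range(3)],
    "physio": [[0.3] * 2 for _ in range(5)],
    "text": [[0.4] * 4 for _ in range(2)],
}
fused = fuse_streams(streams, shared_dim=16)
print(len(fused), len(fused[0][1]))  # 14 16
```

Tagging each token with its modality name keeps the sequence inspectable. A real backbone would more likely use learned modality embeddings or special delimiter tokens, which this sketch does not attempt to model.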
Meow-Omni 1 was evaluated on MeowBench, a new expert-verified quad-modal benchmark designed to test intent recognition in cats. The model achieved state-of-the-art accuracy of 71.16%, substantially outperforming leading vision-language and omni-modal baselines. The researchers have released the complete open-source pipeline, including model weights, the training framework, and the Meow-10K dataset. This work establishes a scalable paradigm for interspecies intent understanding and moves foundation models toward real-world applications such as veterinary diagnostics (e.g., distinguishing pain from contentment) and wildlife conservation (e.g., monitoring stress in endangered species).
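For readers unfamiliar with how such a score is computed, here is a minimal sketch of an accuracy metric. It assumes the reported figure is plain top-1 accuracy over single-label intent predictions; the summary does not describe MeowBench's scoring protocol, and the function name `top1_accuracy` and the intent labels below are hypothetical.

```python
from typing import Sequence


def top1_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of examples whose predicted intent equals the reference label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot score an empty benchmark")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical intent labels, not taken from MeowBench.
labels = ["pain", "contentment", "hunger", "play"]
predictions = ["pain", "contentment", "hunger", "contentment"]
print(f"{top1_accuracy(predictions, labels):.2%}")  # 75.00%
```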
- First open-source quad-modal LLM fusing video, audio, and physiological data with text for animal behavior analysis.
- Achieves 71.16% accuracy on MeowBench, a new expert-verified benchmark for feline intent recognition.
- Open-sourced with model weights, training framework, and Meow-10K dataset to accelerate interspecies communication research.
Why It Matters
By decoding animal intent from multimodal signals, Meow-Omni 1 moves AI-assisted veterinary diagnostics and wildlife conservation monitoring closer to real-world use.