Meta’s First AI Model From Its Superintelligence Lab Doesn’t Exactly Spark Joy
Meta's new model claims second place in multimodal benchmarks but lags in coding and agentic tasks.
Meta has re-entered the frontier AI race with Muse Spark, the first model from its newly formed Superintelligence Lab led by Alexandr Wang. The company claims the natively multimodal model represents a significant leap over prior efforts like LLaMa, with support for tool use, visual chain-of-thought, and multi-agent orchestration. According to Meta's own benchmarks, it now ranks competitively, sitting just behind top models from Google and OpenAI on multimodal performance and holding its own in reasoning tests against leaders like Anthropic's Claude.
However, the launch comes with caveats. Meta did not release an accompanying research paper, and its past benchmark claims have drawn skepticism, so these performance figures warrant scrutiny. Muse Spark notably continues to struggle with coding and agentic tasks, areas where it fails to challenge Anthropic's dominance. The model is positioned for immediate integration into Meta's core apps—Facebook, Instagram, Messenger, and WhatsApp—with a dual focus on monetization through personalized, affiliate-style shopping recommendations and on handling consumer health queries, though data privacy concerns may hinder adoption of the latter.
- Claims second place in multimodal benchmarks, behind only Gemini 3.1 Pro and GPT-5.4, but lacks a published paper for verification.
- Built for integration across Meta's apps, emphasizing personalized shopping recommendations and health data processing as key differentiators.
- Lags behind competitors in coding and agentic task completion, failing to challenge Anthropic's lead in these areas.
Why It Matters
Meta is back in the AI race with a model built for direct product integration, targeting commerce and health use cases.