Open Source

Microsoft Presents "TRELLIS.2": An Open-Source, 4B-Parameter Image-to-3D Model Producing Up to 1536³ PBR-Textured Assets, Built on Native 3D VAEs with 16× Spatial Compression for Efficient, Scalable, High-Fidelity Asset Generation

Open-source image-to-3D model generates high-fidelity PBR assets using native 3D VAEs with 16× spatial compression.

Deep Dive

Microsoft has unveiled TRELLIS.2, a state-of-the-art open-source model for image-to-3D generation. With 4 billion parameters, it can produce 3D assets up to 1536³ voxel resolution, complete with physically based rendering (PBR) textures. The model introduces a 'field-free' sparse voxel structure called O-Voxel, which bypasses traditional volumetric field limitations to handle complex topologies and sharp geometric features. Built on native 3D variational autoencoders (VAEs) with 16× spatial compression, TRELLIS.2 delivers efficient, scalable generation without sacrificing detail.
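To put the resolution and compression figures in perspective, here is a rough back-of-the-envelope sketch. It assumes that "16× spatial compression" means each axis is downsampled by a factor of 16, which the announcement does not spell out, and it counts a dense grid even though the model's voxel structure is sparse. The function name is illustrative and not part of the TRELLIS.2 code.

```python
def latent_grid_side(resolution: int, compression: int) -> int:
    """Side length of the latent grid, ASSUMING per-axis downsampling."""
    if resolution % compression != 0:
        raise ValueError("resolution must be divisible by the compression factor")
    return resolution // compression


RESOLUTION = 1536   # maximum voxel resolution per axis (from the announcement)
COMPRESSION = 16    # spatial compression factor (assumed to apply per axis)

side = latent_grid_side(RESOLUTION, COMPRESSION)
dense_voxels = RESOLUTION ** 3      # cells in a dense full-resolution grid
latent_cells = side ** 3            # cells in a dense latent grid
reduction = dense_voxels // latent_cells

print(f"latent grid: {side}^3 = {latent_cells:,} cells")
print(f"dense grid:  {RESOLUTION}^3 = {dense_voxels:,} cells")
print(f"cell-count reduction: {reduction}x")
```

Under that assumption, a 1536³ grid maps to a 96³ latent grid, a 4096-fold reduction in cell count. Because only occupied voxels are stored in a sparse structure, the real numbers on both sides would be smaller.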

The release includes a research paper, code on GitHub, and a live demo on Hugging Face. The model is designed for professionals in gaming, film, and simulation, where rapid creation of high-quality 3D assets is critical. By making the model open source, Microsoft aims to democratize access to advanced 3D generation and let developers integrate it into custom pipelines. The 4B-parameter size does require substantial GPU resources for local deployment, but the hosted demo offers a cloud-based alternative.
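As a rough guide to what the parameter count implies for local deployment, the sketch below estimates the memory needed just to hold 4 billion weights at a few common precisions. The precision TRELLIS.2 actually ships in is not stated here, so the dtype table is an assumption, and the estimate leaves out activations, the VAE's intermediate tensors, and framework overhead, all of which add to the real requirement.

```python
# Bytes per parameter for common numeric precisions (illustrative choices).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Memory needed to store the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


NUM_PARAMS = 4_000_000_000  # 4B parameters (from the announcement)

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.2f} GiB")
```

At half precision the weights alone come to roughly 7.5 GiB, which is a lower bound on GPU memory rather than a hardware recommendation.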

Key Points
  • TRELLIS.2 uses a field-free O-Voxel structure for efficient 3D reconstruction with complex topologies.
  • Supports up to 1536³ resolution with full PBR materials for realistic asset texturing.
  • Built on native 3D VAEs with 16× spatial compression, reducing memory footprint while maintaining fidelity.

Why It Matters

Open-source, high-fidelity 3D generation from images accelerates asset creation for gaming, VR, and simulation workflows.