[D] OOD and Spandrels, or What You Should Know About EBMs
EBMs avoid 'spandrel' artifacts that cause MLPs to hallucinate linear patterns in unknown regions.
New research from Stanford University provides a critical comparison between Energy-Based Models (EBMs) and traditional Multi-Layer Perceptrons (MLPs), demonstrating that the two are not merely equivalent reformulations of the same learning problem. The study trained both model types on identical 2D datasets—including a split circle, twist function, and noisy 'kissing pyramids'—with equivalent parameter counts. When queried densely around the training data, MLPs produced striking linear artifacts called 'spandrels' in out-of-distribution (OOD) regions, incorrectly extrapolating piecewise linear patterns. In contrast, EBMs, which learn a scalar energy function that is low for plausible variable configurations and high elsewhere, showed no such artifacts, cleanly indicating low probability in unsampled areas.
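The difference in query modes can be sketched in a few lines. This is an illustrative toy, not the paper's training procedure: a regression MLP f(x) must emit *some* y for every x, even far from the data, whereas an EBM scores configurations with an energy that can simply be high ("don't know") in unsampled regions. Here a trained EBM is stood in for by a toy energy—distance to the nearest training point—over an arc of a split circle whose lower half is never sampled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: an arc of a circle; the lower half-plane is never sampled.
theta = rng.uniform(0.3, np.pi - 0.3, 200)
data = np.stack([np.cos(theta), np.sin(theta)], axis=1)

def energy(point, data):
    """Toy stand-in for a learned energy: distance to the nearest
    training point. Low near the data manifold, high elsewhere."""
    return np.min(np.linalg.norm(data - point, axis=1))

on_manifold = np.array([np.cos(1.0), np.sin(1.0)])  # lies on the sampled arc
ood = np.array([0.0, -1.0])                         # deep in the unsampled region

print(energy(on_manifold, data))  # small: model is confident here
print(energy(ood, data))          # large: model flags "low probability"
```

An MLP queried at the same OOD point would still return a definite y value; the energy view makes "this region was never covered" an explicit, queryable quantity.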
This difference is most pronounced with discontinuous distributions. In a key experiment, training data deliberately omitted samples from a function's 'kink' or discontinuity. The ReLU-MLP hallucinated a linear connection across the gap, imposing continuity where none existed in the true data-generating process. The EBM, by contrast, assigned high energy throughout the unsampled valley, declining to commit to any prediction there. The findings, visualized in detailed density plots, confirm that EBMs make fundamentally different predictions than MLPs, especially near distribution boundaries and discontinuities, challenging the assumption that they converge to the same solutions.
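The linear bridge is not a training fluke but a structural property of ReLU networks, which are piecewise linear by construction. A minimal hand-built example (weights chosen for illustration, not learned, and not the paper's models): a two-unit ReLU net that fits a step function *perfectly* on both sampled plateaus, yet fills the never-sampled gap around the discontinuity with a confident straight ramp—a spandrel.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Step-function data is sampled only on x <= -0.2 (target 0) and
# x >= 0.2 (target 1); the gap (-0.2, 0.2) around the discontinuity
# is absent from training. This 2-unit ReLU net fits every sample exactly:
def f(x):
    return relu((x + 0.2) / 0.4) - relu((x - 0.2) / 0.4)

# Zero error on both sampled plateaus...
print(round(float(f(-0.5)), 9), round(float(f(-0.2)), 9))  # 0.0 0.0
print(round(float(f(0.2)), 9), round(float(f(1.0)), 9))    # 1.0 1.0

# ...but in the unsampled gap the piecewise-linear net bridges the
# plateaus with a ramp, asserting continuity the data never showed:
print(round(float(f(0.0)), 9))  # 0.5
```

Nothing in the training loss penalizes this bridge, since the loss only sees sampled x; an energy model can instead assign the whole gap high energy.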
The implications are significant for safety-critical applications. MLPs' tendency to generate confident but incorrect linear extrapolations in OOD regions—a form of 'hallucination'—poses risks in fields like autonomous systems or medical diagnosis where models encounter novel scenarios. EBMs offer a more conservative and potentially safer alternative by not imposing unjustified structural assumptions like continuity on the data. This work, building on foundational papers from Yann LeCun and others, suggests EBMs warrant renewed attention for building robust AI systems that know what they don't know.
- MLPs create linear 'spandrel' artifacts in OOD regions, incorrectly assuming data continuity.
- EBMs showed no spandrels in tests on three 2D functions, better handling distribution discontinuities.
- When training data omitted a function's 'kink', MLPs hallucinated a linear connection while EBMs correctly remained uncertain.
Why It Matters
EBMs' resistance to spandrel artifacts makes them safer for real-world AI where models face novel, out-of-distribution scenarios.