B-spline decoupling slashes Transformer model size while preserving accuracy

New R-CMTF-BSD algorithm compresses Vision and Swin Transformers with B-splines…

Deep Dive

A new paper on arXiv (2605.18794) by Joppe De Jonghe, Van Tien Pham, and Mariya Ishteva tackles the long-standing challenge of compressing large Transformer models. Their method, Robust Basis Spline Decoupling (R-CMTF-BSD, where CMTF refers to the coupled matrix-tensor factorization at its core), rethinks how multivariate functions are represented inside neural networks. Instead of polynomial or piecewise-linear internal functions, which can be unstable or limited in expressiveness, the authors use B-splines (smooth, locally supported basis functions) to model the univariate nonlinearities in a decoupled layer.
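To make the idea concrete, a decoupled representation is usually written as f(x) = W g(Vᵀx): a linear map V, a bank of single-input functions g_i, and a second linear map W. The sketch below evaluates such a layer with each g_i expressed as a B-spline via the standard Cox-de Boor recursion. It is only an illustration under stated assumptions, not the authors' implementation: the function names, the knot vector, and all numeric values are hypothetical, and it shows evaluation only, not how R-CMTF-BSD fits the factors.

```python
def bspline_basis(j, k, knots, z):
    """Value at z of the j-th B-spline of degree k (Cox-de Boor recursion)."""
    if k == 0:
        # Degree 0: indicator of the half-open knot span [t_j, t_{j+1}).
        return 1.0 if knots[j] <= z < knots[j + 1] else 0.0
    left = right = 0.0
    span_left = knots[j + k] - knots[j]
    if span_left > 0:  # terms over repeated knots are taken as 0
        left = (z - knots[j]) / span_left * bspline_basis(j, k - 1, knots, z)
    span_right = knots[j + k + 1] - knots[j + 1]
    if span_right > 0:
        right = ((knots[j + k + 1] - z) / span_right
                 * bspline_basis(j + 1, k - 1, knots, z))
    return left + right


def spline(coeffs, k, knots, z):
    """Univariate function g(z) = sum_j coeffs[j] * B_{j,k}(z)."""
    return sum(c * bspline_basis(j, k, knots, z) for j, c in enumerate(coeffs))


def decoupled_layer(x, V, W, coeffs, k, knots):
    """Evaluate f(x) = W g(V^T x) with one B-spline g_i per branch.

    V: r vectors of length n (the columns of V), W: m rows of length r,
    coeffs: r lists of B-spline coefficients (one list per branch).
    """
    z = [sum(vi * xi for vi, xi in zip(v, x)) for v in V]       # z = V^T x
    g = [spline(c, k, knots, zi) for c, zi in zip(coeffs, z)]   # g_i(z_i)
    return [sum(w * gi for w, gi in zip(row, g)) for row in W]  # W g


# Hypothetical toy setup: quadratic splines on [0, 1) with one interior knot.
DEGREE = 2
KNOTS = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0]  # 7 knots -> 4 basis functions
V = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 1.0], [2.0, -1.0]]
COEFFS = [[0.0, 0.2, 0.8, 1.0], [1.0, 0.5, 0.5, 1.0]]
y = decoupled_layer([0.2, 0.4], V, W, COEFFS, DEGREE, KNOTS)
```

Because each basis function is nonzero on only a few knot spans, changing one coefficient alters g_i only locally, which is the "locally supported" property mentioned above. Note that this sketch returns zero for inputs outside the half-open knot range [0, 1); a practical implementation would need to choose how to handle the boundary.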

The algorithm combines a constrained coupled matrix-tensor factorization with Tikhonov regularization and normalization, which the authors say makes it both robust and numerically stable. Tested on Vision Transformer and Swin Transformer architectures, R-CMTF-BSD reportedly reduces parameter counts significantly while keeping accuracy competitive with baseline compression methods. That could make it easier to deploy smaller, faster Transformer models on edge devices without a major loss in accuracy.
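As a reminder of what a Tikhonov term does, here is the textbook regularized least-squares problem for fitting B-spline coefficients c to samples (z_n, y_n). The symbols Φ, c, y, and λ are chosen here for illustration. This is a generic sketch and not the paper's objective, which additionally involves the coupled matrix-tensor factorization and its constraints.

```latex
\min_{c}\; \lVert y - \Phi c \rVert_2^2 + \lambda \lVert c \rVert_2^2,
\qquad \Phi_{nj} = B_{j,k}(z_n),
\qquad
c^\star = (\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top y .
```

For any λ > 0 the matrix ΦᵀΦ + λI is positive definite, so the solution is unique even when some basis functions see few or no samples. That is the usual reason such a penalty improves numerical stability.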

Key Points
  • R-CMTF-BSD uses B-splines instead of polynomial or piecewise-linear functions for more stable and expressive decoupling in Transformer compression.
  • The method reduces parameter count on Vision and Swin Transformers while keeping accuracy competitive with baseline compression methods.
  • Built on constrained coupled matrix-tensor factorization with Tikhonov regularization for robust optimization.

Why It Matters

Efficient Transformer compression enables smaller, faster models for deployment on resource‑constrained hardware without major accuracy loss.