Research & Papers

[R] Fast WTConv: Accelerated Implementation for "Wavelet Convolutions for Large Receptive Fields"

This simple drop-in replacement for depthwise convolutions just got massively faster...

Deep Dive

WTConv, a popular wavelet-based convolution layer with over 500 citations since July 2024, now has an optimized implementation. The new code supports CUDA (NVIDIA GPUs), Metal (Apple GPUs via MPS), and Triton backends. It runs significantly faster while keeping WTConv's benefits: larger receptive fields and measurable accuracy gains across diverse computer vision tasks. Because the layer is a drop-in replacement for depthwise convolutions, users can swap it into existing networks in place of those layers.
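To make the "larger receptive fields" point concrete, here is a minimal pure-Python sketch of the general idea behind a wavelet-domain convolution. It is not the optimized implementation and does not use its API. It assumes a one-level 1D Haar wavelet (an assumption made for illustration; the real layer works on 2D feature maps), and the helper names haar_forward, haar_inverse, conv_same, and receptive_field are hypothetical. The point it illustrates: a small kernel applied to the downsampled low-frequency band touches twice as many input samples per level.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal 1D Haar transform (even-length input)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def conv_same(x, kernel):
    """Zero-padded 'same' cross-correlation with an odd-length kernel."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def receptive_field(levels, kernel_size):
    """Input samples covered by a kernel applied after `levels` Haar levels."""
    return kernel_size * 2 ** levels


# Push a unit impulse through: transform, filter the low band, invert.
signal = [0.0] * 16
signal[8] = 1.0
low, high = haar_forward(signal)
filtered_low = conv_same(low, [1.0, 1.0, 1.0])
# The high band is zeroed here only to isolate the low-band path.
spread = haar_inverse(filtered_low, [0.0] * len(high))
touched = sum(1 for v in spread if abs(v) > 1e-12)
print(touched, receptive_field(levels=1, kernel_size=3))
```

In this toy setup, a 3-tap kernel on the half-resolution band influences 6 output samples rather than 3, and each extra decomposition level doubles that span again without enlarging the kernel.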

Why It Matters

Researchers and engineers who use WTConv can now train and run their networks faster without sacrificing accuracy.