SpikeHash: Spiking Neural Networks for Efficient Cross-Modal Hashing Retrieval

Brain-inspired spiking network achieves competitive retrieval accuracy with significantly lower energy use and fewer parameters.

Deep Dive

Cross-modal hashing retrieval aims to encode heterogeneous data (e.g., images and text) into compact binary codes for efficient Hamming-space search. Existing methods typically learn semantics in continuous feature spaces and apply a sign operation to generate binary codes, weakly coupling training with discrete retrieval. SpikeHash addresses this by formulating hashing as spike-state evolution, directional spike interaction, and competitive spike readout. The framework first converts image and text features into multi-timestep spike sequences. These sequences jointly drive a shared hash state, with each modality influencing the other's firing dynamics via directional spike modulation. Crucially, instead of a continuous hash head, SpikeHash uses a positive-negative spiking readout where each hash bit is produced by temporal competition between paired spike channels, enabling discrete outputs directly.
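For context, the conventional pipeline that the paragraph above contrasts with SpikeHash can be sketched in a few lines: continuous features are thresholded with a sign operation, and retrieval ranks database items by Hamming distance. This is a minimal illustration, not code from the paper; the function names `binarize` and `hamming_rank` and the toy feature values are assumptions made for the example.

```python
import numpy as np

def binarize(features: np.ndarray) -> np.ndarray:
    """Map continuous features to {0, 1} bits by thresholding at zero (sign step)."""
    return (features > 0).astype(np.uint8)

def hamming_rank(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Return database indices sorted by Hamming distance to the query code."""
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(distances, kind="stable")

# Toy 4-bit example (hypothetical values): one text query, three image items.
text_feat = np.array([0.9, -0.2, 0.4, -0.7])
image_feats = np.array([
    [0.5, -0.1, 0.3, -0.9],   # same sign pattern as the query
    [-0.5, 0.1, 0.3, -0.9],   # two bits differ
    [-0.5, 0.1, -0.3, 0.9],   # all four bits differ
])

query_code = binarize(text_feat)
db_codes = binarize(image_feats)
ranking = hamming_rank(query_code, db_codes)
print(query_code, ranking)
```

The sign step here is applied only after the features are computed, which is the weak coupling between training and discrete retrieval that the summary describes.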

Experiments on three benchmark datasets (not named in this summary) show that SpikeHash achieves competitive retrieval accuracy while significantly reducing parameter count, operation count, and estimated energy consumption during the hash learning stage. This suggests a promising spiking alternative to conventional continuous hash mapping, particularly for energy-constrained or edge-deployed multimodal retrieval systems. The project page provides further details. The work was submitted to arXiv in May 2026; the authors' institution is not specified.

Key Points
  • SpikeHash converts image and text features into multi-timestep spike sequences for temporal hash state evolution.
  • Directional spike modulation enables each modality to influence the other's firing dynamics, improving cross-modal interaction.
  • The positive-negative spiking hash readout uses temporal competition between paired spike channels, reducing parameters and energy versus continuous hash heads.
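The idea of a positive-negative spiking readout can be sketched roughly as follows. This is an illustrative guess at the general mechanism, not the paper's implementation: it assumes simple integrate-and-fire neurons with a soft reset, a hypothetical pair of weight matrices `w_pos` and `w_neg`, and a rule that sets a bit to 1 when the positive channel fires more often than its negative partner over the time window. The actual neuron model, competition rule, and the directional modulation between modalities may well differ.

```python
import numpy as np

def if_spike_counts(currents: np.ndarray, threshold: float = 1.0) -> np.ndarray:
    """Run integrate-and-fire neurons over T steps and count spikes per neuron.

    currents has shape (T, n_neurons). A neuron fires when its membrane
    potential reaches the threshold, then has the threshold subtracted.
    """
    membrane = np.zeros(currents.shape[1])
    counts = np.zeros(currents.shape[1], dtype=int)
    for step_current in currents:
        membrane += step_current
        fired = membrane >= threshold
        counts += fired
        membrane[fired] -= threshold   # soft reset (an assumption)
    return counts

def pos_neg_readout(spikes: np.ndarray, w_pos: np.ndarray, w_neg: np.ndarray) -> np.ndarray:
    """Produce one bit per paired channel: 1 if the positive channel out-fires the negative."""
    pos_counts = if_spike_counts(spikes @ w_pos)
    neg_counts = if_spike_counts(spikes @ w_neg)
    return (pos_counts > neg_counts).astype(np.uint8)

# Hypothetical input: 4 timesteps, 3 input spike channels, 2 hash bits.
spikes = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 1],
                   [1, 0, 1]])
w_pos = np.array([[0.6, 0.1], [0.2, 0.3], [0.5, 0.0]])
w_neg = np.array([[0.1, 0.4], [0.3, 0.3], [0.2, 0.7]])

bits = pos_neg_readout(spikes, w_pos, w_neg)
print(bits)
```

In this toy setup the output is already binary, so no separate sign step is needed after the readout, which is the property the summary highlights.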

Why It Matters

Offers a compact, energy-efficient alternative to conventional continuous hash mapping for multimodal retrieval.