Research & Papers

IntSeqBERT: Learning Arithmetic Structure in OEIS via Modulo-Spectrum Embeddings

A new AI model uses modular arithmetic to predict the next term in complex integer sequences, reaching 95.85% magnitude accuracy and a 7.4x gain in exact next-term prediction over tokenized baselines.

Deep Dive

Researcher Kazuhisa Nakasho has introduced IntSeqBERT, a novel dual-stream Transformer encoder designed specifically for predicting integer sequences from the Online Encyclopedia of Integer Sequences (OEIS). The model addresses a fundamental limitation of standard tokenized AI models, which struggle with the vast numerical range and periodic arithmetic patterns found in sequences like factorials and exponentials. IntSeqBERT's innovation lies in its dual encoding system: each number is represented both by a continuous log-scale magnitude embedding and by sine/cosine embeddings of its residues under 100 moduli (2 through 101). These two streams are fused using FiLM (Feature-wise Linear Modulation) and trained jointly on 274,705 sequences with three prediction heads.
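A minimal PyTorch sketch of this dual-stream encoding, assuming standard components; DualStreamEmbedding, the projection layers, and all dimensions are illustrative guesses, not the paper's actual architecture:

```python
import math
import torch
import torch.nn as nn

class DualStreamEmbedding(nn.Module):
    """Hypothetical sketch of an IntSeqBERT-style number encoding:
    a log-scale magnitude stream fused with a modulo-residue stream via FiLM."""

    def __init__(self, d_model: int = 256, moduli=range(2, 102)):
        super().__init__()
        self.moduli = list(moduli)                # 100 moduli: 2 .. 101
        self.mag_proj = nn.Linear(1, d_model)     # continuous log magnitude -> d_model
        # two features (sin, cos) per modulus -> d_model
        self.mod_proj = nn.Linear(2 * len(self.moduli), d_model)
        # FiLM: the modulo stream produces a per-channel scale (gamma) and shift (beta)
        self.film = nn.Linear(d_model, 2 * d_model)

    def forward(self, n: torch.Tensor) -> torch.Tensor:
        # n: integer tensor of shape (batch, seq_len)
        sign = torch.sign(n).float()
        log_mag = sign * torch.log1p(n.abs().float())       # signed log scale
        mag = self.mag_proj(log_mag.unsqueeze(-1))          # magnitude stream

        feats = []
        for m in self.moduli:
            angle = 2 * math.pi * (n % m).float() / m       # residue as a phase
            feats += [torch.sin(angle), torch.cos(angle)]
        mod = self.mod_proj(torch.stack(feats, dim=-1))     # modulo stream

        gamma, beta = self.film(mod).chunk(2, dim=-1)       # FiLM parameters
        return gamma * mag + beta                           # fused representation
```

Encoding each residue as a point on the unit circle makes values that are close modulo m close in embedding space, a structure a tokenized model has no way to exploit.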

At 91.5 million parameters, the 'Large' version of IntSeqBERT achieves 95.85% accuracy in predicting the magnitude of the next term and 50.38% Mean Modulo Accuracy, outperforming a standard tokenized Transformer baseline by 8.9 and 4.5 percentage points, respectively. An ablation study confirmed the critical role of the modulo stream, which accounts for a 15.2-point gain in modulo accuracy and an additional 6.2-point boost in magnitude prediction. The model's residue and magnitude predictions are converted into concrete integer outputs via a probabilistic Chinese Remainder Theorem-based solver, yielding a 7.4-fold improvement in exact next-term prediction (19.09% vs. 2.59% Top-1 accuracy).
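The deterministic core of any CRT-based decoding step is residue reconstruction. The sketch below assumes hard (argmax) residue predictions rather than the paper's probabilistic scoring, and the helper names crt_pair and reconstruct are hypothetical:

```python
from math import gcd

def crt_pair(r1: int, m1: int, r2: int, m2: int):
    """Combine x = r1 (mod m1) and x = r2 (mod m2) into one congruence.
    Returns (r, lcm(m1, m2)), or None if the congruences conflict."""
    g = gcd(m1, m2)
    if (r2 - r1) % g != 0:
        return None                      # inconsistent residues
    lcm = m1 // g * m2
    # Solve m1 * t = (r2 - r1) (mod m2) via the inverse of m1/g mod m2/g
    t = ((r2 - r1) // g * pow(m1 // g, -1, m2 // g)) % (m2 // g)
    return (r1 + m1 * t) % lcm, lcm

def reconstruct(residues):
    """Fold predicted residues {modulus: residue} into one integer
    in [0, lcm of all moduli)."""
    r, m = 0, 1
    for mod, res in residues.items():
        combined = crt_pair(r, m, res % mod, mod)
        if combined is None:
            raise ValueError(f"conflicting residue mod {mod}")
        r, m = combined
    return r

# Example: recover 42 from its residues mod 5, 7, and 9 (lcm = 315)
print(reconstruct({5: 42 % 5, 7: 42 % 7, 9: 42 % 9}))  # -> 42
```

With 100 moduli spanning 2 through 101, the combined modulus is astronomically large, so a consistent set of residues pins down the next term essentially uniquely within the magnitude range the model predicts.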

Analysis of the model's 'modulo spectrum' revealed a strong negative correlation between information gain and Euler's totient ratio φ(m)/m, providing empirical evidence that composite moduli capture the underlying arithmetic structure of OEIS sequences more efficiently than primes. The reasoning follows from the Chinese Remainder Theorem: a single composite modulus m encodes the residues modulo each of its prime-power factors simultaneously, so one composite residue aggregates information that would otherwise require several prime moduli. The research demonstrates that explicitly modeling modular arithmetic properties, rather than treating numbers as opaque tokens, is a powerful approach for AI systems dealing with mathematical reasoning and pattern recognition in numerical domains.
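For intuition, Euler's totient ratio φ(m)/m depends only on m's distinct prime factors; this illustrative snippet (not from the paper) shows that primes sit near 1.0 while smooth composites fall well below:

```python
def totient_ratio(m: int) -> float:
    """Euler's totient ratio phi(m)/m via the product of (1 - 1/p)
    over the distinct prime factors p of m."""
    ratio, n, p = 1.0, m, 2
    while p * p <= n:
        if n % p == 0:
            ratio *= 1 - 1 / p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:                    # leftover prime factor
        ratio *= 1 - 1 / n
    return ratio

# A prime modulus sits at the high end; highly composite moduli fall much lower:
for m in (97, 64, 60, 30):
    print(m, round(totient_ratio(m), 3))
# 97 0.99, 64 0.5, 60 0.267, 30 0.267
```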

Key Points
  • IntSeqBERT encodes each number with dual magnitude and modulo embeddings over 100 moduli (2-101), fused via FiLM.
  • The 91.5M-parameter model achieves 95.85% magnitude accuracy and a 7.4x improvement in exact next-term prediction over tokenized baselines.
  • Modulo stream ablation shows it contributes +15.2 pt to Mean Modulo Accuracy and +6.2 pt to magnitude accuracy.

Why It Matters

This demonstrates a new AI architecture for mathematical reasoning that could improve tools for research, cryptography, and automated theorem proving.