Graph Tokenization for Bridging Graphs and Transformers
A new tokenization method converts complex graphs into sequences, enabling off-the-shelf BERT models to reach state-of-the-art results on 14 benchmarks, often beating specialized graph models.
A team of researchers has developed a framework that bridges graph-structured data and Transformer models, a longstanding challenge in machine learning. The core innovation is a graph tokenization method that converts complex, non-sequential graph data into a format that standard sequence models such as BERT can consume. This is achieved in two steps: first, a reversible graph serialization technique flattens the graph into a sequence while preserving its structural information, guided by global statistics of graph substructures. Second, the serialized sequence is passed through Byte Pair Encoding (BPE), the same tokenization algorithm used by large language models, which learns to merge frequently occurring substructures into meaningful, discrete tokens.
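The paper's exact serialization scheme is not spelled out in this summary, so the Python sketch below substitutes a simple depth-first serializer and trains a standard BPE tokenizer on the resulting strings using the Hugging Face `tokenizers` package. The graph data, function names, and vocabulary size are all illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of the serialize-then-tokenize pipeline. The depth-first
# serializer below is a stand-in: the paper's statistics-guided, fully
# reversible scheme is not reproduced here. Requires the Hugging Face
# `tokenizers` package (pip install tokenizers).
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer


def serialize_graph(adj, labels, root=0):
    """Flatten a graph into a string via depth-first traversal.

    adj:    dict mapping node id -> list of neighbor ids
    labels: dict mapping node id -> node label (e.g. an atom type)
    Parentheses delimit branches, which makes the encoding invertible
    for trees; a fully reversible graph encoding, as the paper claims,
    would also need markers for the cycles that are dropped here.
    """
    out, visited = [], set()

    def dfs(u):
        visited.add(u)
        out.append(labels[u])
        for v in adj[u]:
            if v not in visited:
                out.append("(")   # open a branch
                dfs(v)
                out.append(")")   # close the branch (backtrack)

    dfs(root)
    return "".join(out)


# Toy corpus: two small molecule-like graphs (illustrative data only).
graphs = [
    ({0: [1, 2], 1: [0], 2: [0, 3], 3: [2]},
     {0: "C", 1: "O", 2: "C", 3: "N"}),
    ({0: [1], 1: [0, 2], 2: [1]},
     {0: "C", 1: "C", 2: "O"}),
]
corpus = [serialize_graph(adj, lab) for adj, lab in graphs]  # e.g. "C(O)(C(N))"

# Train a character-level BPE on the serialized graphs, so frequently
# co-occurring substructures merge into single discrete tokens.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
trainer = BpeTrainer(vocab_size=64, special_tokens=["[UNK]"])
tokenizer.train_from_iterator(corpus, trainer)

print(tokenizer.encode(corpus[0]).tokens)
```

On a realistically sized corpus, the learned merges would plausibly correspond to recurring structural motifs, which is the intuition behind treating BPE as a substructure discoverer.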
The empirical results are compelling: the approach allows off-the-shelf Transformer architectures to be applied directly to graph benchmarks without any model modifications, as sketched below. The proposed method achieved state-of-the-art performance on 14 benchmark datasets, frequently surpassing both specialized Graph Neural Networks (GNNs) and dedicated graph transformer models. This unlocks the vast ecosystem of pre-trained sequence models and their associated tooling for graph-based tasks, from social network analysis to molecular property prediction. The work, accepted as a poster at ICLR 2026, is a significant step toward unifying disparate AI architectures and could simplify the pipeline for researchers and engineers working with relational data.
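To illustrate the "no model modifications" claim, here is a minimal continuation of the sketch above: the BPE-encoded graph sequence is fed to a stock `BertForSequenceClassification` from the `transformers` library. The tiny model configuration and the binary label count are assumptions for demonstration, not the paper's training setup.

```python
# Continuing the sketch: feed the BPE-encoded graph sequence to an
# unmodified BERT. The small configuration and the binary label count
# are illustrative assumptions. Requires `torch` and `transformers`.
import torch
from transformers import BertConfig, BertForSequenceClassification

config = BertConfig(
    vocab_size=tokenizer.get_vocab_size(),  # vocabulary from the BPE step
    hidden_size=128,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_labels=2,  # e.g. a binary molecular property
)
model = BertForSequenceClassification(config)

# One serialized graph -> token ids -> logits, exactly as for text.
ids = tokenizer.encode(corpus[0]).ids
logits = model(input_ids=torch.tensor([ids])).logits
print(logits.shape)  # torch.Size([1, 2])
```

This toy model is randomly initialized; swapping in a pre-trained checkpoint would additionally require reconciling its vocabulary with the graph tokenizer's, which is one reason the tokenization step is the crux of the approach.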
- Framework combines reversible graph serialization with BPE tokenization to create sequential representations of graphs.
- Enables standard Transformers (e.g., BERT) to process graphs directly, achieving SOTA on 14 benchmark datasets.
- Frequently outperforms specialized models like Graph Neural Networks (GNNs) and dedicated graph transformers.
Why It Matters
Unlocks powerful pre-trained Transformer models for critical graph-based tasks in drug discovery, network analysis, and recommendation systems.