Research & Papers

BiTA: Bidirectional Gated Recurrent Unit-Transformer Aggregator in a Temporal Graph Network Framework for Alert Prediction in Computer Networks

New TGN architecture uses bidirectional GRU-Transformer to catch multi-scale cyber threats...

Deep Dive

Researchers Zahra Makki Nayeri and Mohsen Rezvani have introduced BiTA (Bidirectional Gated Recurrent Unit-Transformer Aggregator), a temporal aggregation function for Temporal Graph Networks (TGNs) designed to predict cyber attack alerts in computer networks. Traditional TGN methods rely on unidirectional or single-mechanism temporal aggregation, which struggles to capture the recursive, multi-scale temporal patterns common in real-world attacks. BiTA redesigns the aggregation step so that it jointly encodes bidirectional sequential dependencies (using a bidirectional GRU) and long-range contextual relations (using a Transformer) over each node's temporal neighborhood. According to the authors, the two mechanisms provide complementary temporal reasoning at different scales without increasing model depth or capacity, and the original TGN memory and message-passing structure is left unchanged.
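As a rough illustration of the aggregation described above, the following dependency-free Python sketch runs a GRU forward and backward over one node's time-ordered neighborhood, applies scaled dot-product self-attention to the same sequence, and concatenates the results. Everything in it is an assumption made for illustration: the function names, the dimensions, the untrained random weights, the omission of biases and learned attention projections, and the use of concatenation to fuse the two branches. The paper's actual layers and fusion rule may differ.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_gru(rng, d_in, d_h):
    """Untrained (input, hidden) weights for the update z, reset r and candidate n gates."""
    return {g: (rand_matrix(rng, d_h, d_in), rand_matrix(rng, d_h, d_h)) for g in "zrn"}

def gru_step(p, x, h):
    """One GRU update without bias terms (a simplification)."""
    sig = lambda a: 1.0 / (1.0 + math.exp(-a))
    z = [sig(a + b) for a, b in zip(matvec(p["z"][0], x), matvec(p["z"][1], h))]
    r = [sig(a + b) for a, b in zip(matvec(p["r"][0], x), matvec(p["r"][1], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b) for a, b in zip(matvec(p["n"][0], x), matvec(p["n"][1], rh))]
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

def run_gru(p, xs, d_h):
    """Return the final hidden state after reading xs in the given order."""
    h = [0.0] * d_h
    for x in xs:
        h = gru_step(p, x, h)
    return h

def self_attention_pool(xs):
    """Scaled dot-product self-attention with identity projections, then mean pooling."""
    d = len(xs[0])
    outputs = []
    for q in xs:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in xs]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([sum(w / total * k[j] for w, k in zip(weights, xs)) for j in range(d)])
    return [sum(o[j] for o in outputs) / len(outputs) for j in range(d)]

def aggregate(neighborhood, fwd, bwd, d_h):
    """Fuse the sequential and contextual views by concatenation (assumed fusion rule)."""
    h_forward = run_gru(fwd, neighborhood, d_h)          # oldest to newest
    h_backward = run_gru(bwd, neighborhood[::-1], d_h)   # newest to oldest
    context = self_attention_pool(neighborhood)          # all-pairs, long-range view
    return h_forward + h_backward + context

rng = random.Random(0)
d_in, d_h = 4, 3
fwd, bwd = init_gru(rng, d_in, d_h), init_gru(rng, d_in, d_h)
# Five hypothetical time-ordered event feature vectors for a single node.
neighborhood = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(5)]
agg = aggregate(neighborhood, fwd, bwd, d_h)
print(len(agg))  # 2 * d_h + d_in = 10
```

In a TGN-style pipeline, a vector like `agg` would stand in for the output of the temporal aggregation step, leaving the memory and message-passing components as they are. A practical implementation would use a deep learning framework with trained weights rather than plain Python lists.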

On real-world alert datasets, the authors report that BiTA improves on state-of-the-art temporal graph models in area under the curve (AUC), average precision, mean reciprocal rank, and per-category prediction accuracy. The gains hold in both the transductive setting (known nodes at new timestamps) and the inductive setting (nodes not seen during training), which suggests the model generalizes in dynamic network environments. The framework is described as scalable and interpretable, making it a candidate for real-time cyber threat anticipation and adaptive intrusion detection systems. The work was submitted to arXiv on April 3, 2026, and represents a step toward more proactive network defense mechanisms.
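For readers unfamiliar with the metrics named above, here is a minimal sketch of their textbook definitions in plain Python. It is not the paper's evaluation code. The function names and toy scores are hypothetical, AUC is assumed to mean area under the ROC curve, and the reciprocal-rank setup (one true alert scored against a set of negative candidates per query) is an assumption about how ranking is done.

```python
def roc_auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a positive appears.

    Assumes at least one positive label.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k
    return total / hits

def mean_reciprocal_rank(pos_scores, neg_scores_per_query):
    """Average of 1 / rank of the true item among its negative candidates."""
    reciprocal_ranks = []
    for p, negs in zip(pos_scores, neg_scores_per_query):
        rank = 1 + sum(1 for n in negs if n > p)
        reciprocal_ranks.append(1.0 / rank)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

# Toy data: 1 = alert occurred, 0 = no alert; scores are hypothetical model outputs.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
mrr = mean_reciprocal_rank([0.9, 0.7, 0.3], [[0.8, 0.4, 0.2]] * 3)
print(round(auc, 4), round(ap, 4), round(mrr, 4))  # 0.6667 0.7556 0.6111
```

AUC and average precision summarize how well alerts are separated from non-alerts across all thresholds, while mean reciprocal rank rewards placing the true alert near the top of a candidate list.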

Key Points
  • BiTA combines bidirectional GRU and Transformer to capture both sequential and long-range temporal patterns in network alerts
  • Reported to outperform state-of-the-art temporal graph models on AUC, average precision, mean reciprocal rank, and per-category accuracy across real-world datasets
  • Works under both transductive and inductive settings, showing strong generalization to unseen network nodes

Why It Matters

More accurate alert prediction could let network security teams anticipate attacks instead of reacting to them, shortening response times.