Graph-based microservice performance anomaly detection reaches 96.2% accuracy when repurposed for intrusion detection
96.2% accuracy on 21,438 request graphs, but trial-level splits reveal gaps
A new preprint from Yunjian Ma investigates whether graph-based performance anomaly detection in microservices can double as intrusion detection. The study uses a synthetic e-commerce benchmark deployed with Docker Compose and runs 50 controlled trials covering five attack types alongside normal workloads. Telemetry data (metrics, logs, and distributed traces) was collected, and each request trace was converted into a request-level invocation graph with multi-modal node features derived from timestamped logs and per-service performance metrics. A two-layer graph convolutional network (GCN) was trained for 6-way classification on 21,438 request graphs, achieving 96.2% test accuracy and a macro F1 of 0.955 under a graph-level random split.
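To make the pipeline above concrete, here is a minimal sketch of how one trace might become an invocation graph and pass through a two-layer GCN. It is an illustration under stated assumptions, not the preprint's implementation: the service names, the three-value feature vectors, the hidden size of 8, the mean pooling step, and the random untrained weights are all hypothetical, and the normalization follows the standard GCN formulation rather than anything confirmed by the paper.

```python
import math
import random

def build_graph(spans, features):
    """Turn one request trace into (nodes, normalized adjacency, feature matrix).

    spans: (caller, callee) service pairs taken from one distributed trace.
    features: service name -> list of floats (hypothetical log/metric values).
    """
    nodes = sorted({name for pair in spans for name in pair})
    idx = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    # Undirected adjacency with self-loops: A + I.
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for caller, callee in spans:
        adj[idx[caller]][idx[callee]] = 1.0
        adj[idx[callee]][idx[caller]] = 1.0
    deg = [sum(row) for row in adj]
    # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}.
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    return nodes, norm, [features[name] for name in nodes]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def gcn_forward(norm_adj, x, w1, w2, w_out):
    """Two graph-convolution layers, mean pooling, then a softmax classifier."""
    h1 = relu(matmul(matmul(norm_adj, x), w1))
    h2 = relu(matmul(matmul(norm_adj, h1), w2))
    pooled = [[sum(row[j] for row in h2) / len(h2) for j in range(len(h2[0]))]]
    logits = matmul(pooled, w_out)[0]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    return [e / sum(exps) for e in exps]

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical trace: gateway calls orders, which calls payments and inventory.
spans = [("gateway", "orders"), ("orders", "payments"), ("orders", "inventory")]
# Hypothetical per-service features: [error-log count, CPU share, latency in s].
features = {
    "gateway": [0.0, 0.20, 0.05],
    "orders": [2.0, 0.65, 0.40],
    "payments": [0.0, 0.30, 0.12],
    "inventory": [1.0, 0.25, 0.08],
}
nodes, norm_adj, x = build_graph(spans, features)

rng = random.Random(0)  # untrained weights, for shape-checking only
probs = gcn_forward(norm_adj, x,
                    random_matrix(rng, 3, 8),   # layer 1: 3 features -> 8
                    random_matrix(rng, 8, 8),   # layer 2: 8 -> 8
                    random_matrix(rng, 8, 6))   # classifier: 8 -> 6 classes
```

The output `probs` is a six-entry probability vector, one entry per class in a 6-way setup. With untrained weights the values carry no meaning; the sketch only shows how trace structure (the adjacency) and logs and metrics (the node features) enter the same model.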
However, evaluation under a stricter trial-level split revealed significant limitations: trace structure alone proved insufficient for reliable detection, while adding logs and metrics improved performance. Strong flattened baselines (non-graph models) also outperformed the shallow GCN on the engineered feature set. Modality ablation, runtime analysis, t-SNE, and confusion-matrix inspection supported the same picture: graph-based methods are not a silver bullet for microservice intrusion detection. The paper concludes that while graph representations capture useful dependency information, current shallow graph models lag behind simpler approaches when features are well engineered. This challenges the assumption that graph neural networks automatically beat traditional ML for security monitoring.
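The difference between the two evaluation protocols can be sketched in a few lines. This is a toy illustration, assuming a trial-level split means that every graph from a given trial lands on the same side of the split; the `trial` and `graph_id` fields, the 25% test fraction, and the ten-trial toy dataset are hypothetical and not taken from the preprint.

```python
import random

def graph_level_split(graphs, test_frac, seed=0):
    """Shuffle individual graphs; one trial can land in both train and test."""
    rng = random.Random(seed)
    shuffled = graphs[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]

def trial_level_split(graphs, test_frac, seed=0):
    """Hold out whole trials so no trial contributes to both sides."""
    rng = random.Random(seed)
    trials = sorted({g["trial"] for g in graphs})
    rng.shuffle(trials)
    n_test = max(1, int(len(trials) * test_frac))
    test_trials = set(trials[:n_test])
    train = [g for g in graphs if g["trial"] not in test_trials]
    test = [g for g in graphs if g["trial"] in test_trials]
    return train, test

def shared_trials(train, test):
    """Trials that appear on both sides of a split."""
    return {g["trial"] for g in train} & {g["trial"] for g in test}

# Toy dataset: 10 trials with 20 request graphs each (hypothetical sizes).
graphs = [{"trial": t, "graph_id": t * 20 + i}
          for t in range(10) for i in range(20)]

g_train, g_test = graph_level_split(graphs, test_frac=0.25)
t_train, t_test = trial_level_split(graphs, test_frac=0.25)

leaky = shared_trials(g_train, g_test)   # non-empty: trials straddle the split
clean = shared_trials(t_train, t_test)   # empty: trials are kept whole
```

Under the graph-level split, graphs from the same trial sit on both sides, so a model can score well by recognizing trial-specific conditions. Holding out whole trials removes that shortcut, which is one plausible reason results drop under the stricter protocol.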
- 50 controlled trials across 5 attack types using a Docker Compose e-commerce benchmark
- Two-layer GCN achieved 96.2% test accuracy and a macro F1 of 0.955 on 21,438 request graphs under a graph-level random split
- Trial-level evaluation showed trace structure alone is insufficient; flattened baselines outperformed the shallow GCN on engineered features
Why It Matters
The study suggests graph-based performance anomaly detection has potential for intrusion detection, but it needs multimodal telemetry and careful, trial-level evaluation before it can replace traditional approaches.