Neuro-Symbolic Learning for Predictive Process Monitoring via Two-Stage Logic Tensor Networks with Rule Pruning

A new two-stage method injects domain rules into AI models, improving accuracy in compliance-heavy tasks like fraud detection.

Deep Dive

A team of researchers has published a neuro-symbolic AI framework for predictive process monitoring, a critical task in domains like fraud detection and healthcare. The method, developed by Fabrizio De Santis, Gyunam Park, and Francesco Zanichelli, addresses a major limitation of purely data-driven models: they cannot incorporate hard, domain-specific logical rules. For instance, a medical treatment must follow a specific sequence, or a financial transaction must comply with regulatory steps. The new approach uses Logic Tensor Networks (LTNs) to formalize these rules, expressed in Linear Temporal Logic and first-order logic, and make them differentiable, allowing them to be learned alongside historical data patterns.
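To make "differentiable rules" concrete, here is a minimal sketch of how a temporal rule can be scored with a truth degree between 0 and 1 instead of a hard true/false. The operator choices (product t-norm, Reichenbach implication, mean aggregation) and all function names are assumptions made for illustration. The summary above does not say which operators the paper uses.

```python
# Illustrative fuzzy-logic semantics for one temporal rule.
# Operator choices and names are assumptions, not taken from the paper.

def fuzzy_implies(a: float, b: float) -> float:
    # Reichenbach implication: smooth in both arguments.
    return 1.0 - a + a * b

def eventually(truths: list[float]) -> float:
    # Soft "eventually": probabilistic sum over the given steps.
    # Returns 0.0 for an empty list (nothing can still happen).
    not_yet = 1.0
    for t in truths:
        not_yet *= 1.0 - t
    return 1.0 - not_yet

def rule_satisfaction(p_a: list[float], p_b: list[float]) -> float:
    """Truth degree of: "whenever A occurs, B eventually occurs later".

    p_a[t] and p_b[t] are the model's probabilities that activity A or B
    happens at step t of a trace. The per-step degrees are averaged.
    """
    if not p_a:
        return 1.0  # an empty trace violates nothing
    degrees = []
    for t in range(len(p_a)):
        later_b = eventually(p_b[t + 1:])
        degrees.append(fuzzy_implies(p_a[t], later_b))
    return sum(degrees) / len(degrees)

# A trace where A is followed by B satisfies the rule fully;
# a trace where A is never followed by B is penalized.
compliant = rule_satisfaction([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
violating = rule_satisfaction([1.0, 0.0, 0.0], [0.0, 0.0, 0.0])
```

Because every operation is a sum or a product, the satisfaction score varies smoothly with the predicted probabilities, which is what would let a gradient-based optimizer push a model toward rule-compliant predictions.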

However, a core challenge with LTNs is their tendency to prioritize satisfying logical formulas at the expense of predictive accuracy, which can degrade model performance. The researchers' key contribution is a two-stage optimization strategy that addresses this. First, a weighted axiom loss during pre-training prioritizes learning from the data itself. Then, a novel rule-pruning stage analyzes the dynamics of how well each axiom is satisfied, retaining only those rules that are both logically consistent and actually contribute to the model's predictions. This pruning step was shown to be essential; without it, injecting domain knowledge could severely hurt performance.
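The two stages can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the loss form, the pruning criterion (final satisfaction level plus improvement during pre-training), the thresholds, and every name in the code are hypothetical, since the summary does not give the exact formulas.

```python
# Hypothetical sketch of a two-stage scheme: weighted axiom loss, then pruning.
# All names, thresholds, and criteria are illustrative assumptions.

def weighted_loss(data_loss: float, axiom_sats: list[float],
                  axiom_weight: float) -> float:
    # Stage 1: a small axiom_weight keeps the data term dominant.
    if axiom_sats:
        axiom_loss = 1.0 - sum(axiom_sats) / len(axiom_sats)
    else:
        axiom_loss = 0.0
    return data_loss + axiom_weight * axiom_loss

def prune_axioms(sat_history: dict[str, list[float]],
                 min_final: float = 0.7,
                 min_gain: float = 0.0) -> list[str]:
    """Stage 2: keep axioms whose satisfaction ends high and did not fall.

    sat_history maps an axiom name to its satisfaction degree per
    pre-training epoch. An axiom whose satisfaction drops while the data
    loss is being minimized is treated as conflicting with the data.
    """
    kept = []
    for name, sats in sat_history.items():
        final = sats[-1]
        gain = sats[-1] - sats[0]
        if final >= min_final and gain >= min_gain:
            kept.append(name)
    return kept

loss = weighted_loss(data_loss=0.5, axiom_sats=[0.8, 0.6], axiom_weight=0.1)

history = {
    "r1": [0.5, 0.8, 0.9],  # improves and ends high: kept
    "r2": [0.9, 0.6, 0.4],  # degrades during training: pruned
    "r3": [0.3, 0.4, 0.5],  # improves but stays low: pruned
}
kept = prune_axioms(history)
```

In this sketch, only the surviving axioms would be passed to the second training stage, so rules that conflict with the observed data no longer pull the model away from accurate predictions.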

Evaluation on four real-world sequential event logs demonstrated that this framework significantly outperforms standard, purely data-driven baselines. The gains are especially pronounced in compliance-heavy scenarios where few training examples follow all the rules. By ensuring predictions adhere to known domain constraints, the model becomes not only more accurate but also more trustworthy and auditable for regulated industries.

Key Points
  • Integrates hard domain rules (e.g., compliance sequences) into AI models using differentiable Logic Tensor Networks (LTNs).
  • Uses a novel two-stage optimization with rule pruning to prevent logical constraints from degrading predictive accuracy.
  • Excels in compliance-constrained scenarios with limited data, outperforming data-only models on four real-world event logs.

Why It Matters

Enables more accurate, trustworthy, and regulation-compliant AI for critical applications in finance, healthcare, and logistics.