A Three-stage Neuro-symbolic Recommendation Pipeline for Cultural Heritage Knowledge Graphs
A new hybrid AI pipeline combines knowledge-graph embeddings and symbolic querying to surface hidden connections in a cultural heritage graph of roughly 3.2 million RDF triples.
A research team from Jagiellonian University has published a novel methodology for building recommendation systems on complex cultural heritage knowledge graphs. The paper, 'A Three-stage Neuro-symbolic Recommendation Pipeline for Cultural Heritage Knowledge Graphs,' presents a complete hybrid pipeline designed to interpret semantic relationships within heterogeneous historical data.
The technical core of the work is a three-stage process. First, the pipeline generates knowledge-graph embeddings; the authors evaluate four model families, including TransE, ComplEx, ConvE, and CompGCN. Second, it performs approximate nearest-neighbor search over those embeddings using the HNSW (Hierarchical Navigable Small World) algorithm. Third, it applies SPARQL-driven semantic filtering for precise, rule-based refinement of the candidates. The pipeline was developed and evaluated on the Jagiellonian University Heritage Metadata Portal (JUHMP) knowledge graph, part of the CHExRISH project. This graph contains approximately 3.2 million RDF triples describing people, events, objects, and historical relations affiliated with the university.
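The three stages can be illustrated with a small, standard-library-only sketch. It is not the authors' implementation: the entity names, vectors, and triples are hypothetical toy data, exact cosine search stands in for HNSW, and a plain set lookup stands in for a SPARQL query against the real graph.

```python
import math

# Stage 1 stand-in: hypothetical toy embeddings. In the paper these would come
# from a trained model such as TransE, ComplEx, ConvE, or CompGCN.
EMBEDDINGS = {
    "ex:scholar_1": [0.9, 0.1, 0.0],
    "ex:scholar_2": [0.85, 0.15, 0.05],
    "ex:instrument_1": [0.8, 0.2, 0.1],
    "ex:lecture_1": [0.7, 0.3, 0.0],
    "ex:seal_1": [0.0, 0.1, 0.9],
}

# Hypothetical symbolic facts, written as (subject, predicate, object) triples.
TRIPLES = {
    ("ex:scholar_1", "rdf:type", "ex:Person"),
    ("ex:scholar_2", "rdf:type", "ex:Person"),
    ("ex:instrument_1", "rdf:type", "ex:Object"),
    ("ex:lecture_1", "rdf:type", "ex:Event"),
    ("ex:seal_1", "rdf:type", "ex:Object"),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(query, k):
    """Stage 2 stand-in: exact top-k search (HNSW would approximate this ranking)."""
    scored = [
        (cosine(EMBEDDINGS[query], vec), entity)
        for entity, vec in EMBEDDINGS.items()
        if entity != query
    ]
    scored.sort(reverse=True)
    return [entity for _, entity in scored[:k]]


def has_type(entity, rdf_class):
    """Stage 3 stand-in: a type check that a SPARQL ASK query could express."""
    return (entity, "rdf:type", rdf_class) in TRIPLES


def recommend(query, k, required_class):
    """Retrieve k embedding neighbors, then keep those passing the symbolic rule."""
    return [e for e in nearest_neighbors(query, k) if has_type(e, required_class)]


print(recommend("ex:scholar_1", k=3, required_class="ex:Object"))
```

In this toy setup the embedding stage proposes candidates of any type, and the symbolic stage narrows them to the class the user asked for. That is the division of labor the article describes, with the caveat that a real deployment would swap in a trained model, an HNSW index, and a SPARQL endpoint.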
The research addresses a significant challenge in digital humanities: providing meaningful recommendations from sparse, incomplete, and highly varied metadata. By combining neural embedding techniques (the 'neuro' part) with symbolic logic and querying (the 'symbolic' part), the system can surface non-obvious connections, such as linking a historical figure to a related artifact via an intermediary event, while remaining explainable. The final recommender's output was validated through expert evaluation, which confirmed its practical utility despite the data's inherent noise and heterogeneity. The work offers a reproducible blueprint for institutions looking to make their large digital archives more discoverable and interconnected.
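The kind of explanation mentioned above, a figure linked to an artifact through an intermediary event, can be sketched as a two-hop lookup over triples. The snippet below is an illustrative assumption rather than the paper's method: the triples, the `ex:` predicate names, and the helper `shared_intermediaries` are all hypothetical.

```python
# Hypothetical toy triples: (subject, predicate, object).
TRIPLES = [
    ("ex:scholar_1", "ex:participatedIn", "ex:lecture_1"),
    ("ex:instrument_1", "ex:usedIn", "ex:lecture_1"),
    ("ex:scholar_1", "ex:bornIn", "ex:town_1"),
]


def shared_intermediaries(source, target, triples):
    """Find nodes that both source and target point to.

    Returns a list of (source_predicate, intermediary, target_predicate)
    tuples, each of which can be read as a human-readable justification.
    """
    # Index everything the source entity links to, keyed by the linked node.
    linked_from_source = {}
    for s, p, o in triples:
        if s == source:
            linked_from_source.setdefault(o, []).append(p)

    # Any node the target also links to is a shared intermediary.
    paths = []
    for s, p, o in triples:
        if s == target and o in linked_from_source:
            for source_predicate in linked_from_source[o]:
                paths.append((source_predicate, o, p))
    return paths


for src_pred, node, tgt_pred in shared_intermediaries(
    "ex:scholar_1", "ex:instrument_1", TRIPLES
):
    print(f"ex:scholar_1 -{src_pred}-> {node} <-{tgt_pred}- ex:instrument_1")
```

Against a real RDF store, the same pattern could be written as a SPARQL query with two triple patterns sharing one variable. The Python version only makes the idea concrete.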
- Hybrid pipeline combines neural knowledge-graph embeddings (e.g., ComplEx) with symbolic SPARQL filtering for precise, explainable recommendations.
- Tested on a real-world knowledge graph (JUHMP) containing ~3.2 million RDF triples that describe historical people, events, and objects.
- Expert evaluation confirmed the system produces useful recommendations even from sparse and heterogeneous cultural heritage metadata.
Why It Matters
The work gives museums and archives a blueprint for making large, complex digital collections intelligible and discoverable to researchers and the public.