Implementation and Privacy Guarantees for Scalable Keyword Search on SOLID-based Decentralized Data with Granular Visibility Constraints
New privacy-preserving search indexes 2,689 KB of metadata across distributed personal data stores...
Researchers led by Mohamed Ragab from the University of Southampton have unveiled ESPRESSO, a decentralized framework enabling scalable keyword-based search across Solid-based personal online data stores (pods) while respecting user-defined visibility policies. Solid is a decentralized web architecture where users store data in pods hosted on compliant servers, retaining full sovereignty over their information. ESPRESSO addresses the fundamental challenge of searching across distributed pods by constructing WebID-scoped indexes within each pod and employing privacy-aware metadata to enable efficient source selection and ranking across servers. The framework is detailed in a paper submitted to arXiv on April 23, 2026, under Computer Science > Databases.
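The flow described above (per-pod indexes scoped to a WebID, plus metadata used to pick which pods to query) can be illustrated with a minimal Python sketch. This is not ESPRESSO's implementation: the `Pod` class, `build_metadata`, `federated_search`, the whitespace tokenizer, and the example WebIDs are all hypothetical names introduced here. The metadata is also a plain keyword-to-pod map, which would itself leak information; the paper's privacy-aware metadata is not reproduced.

```python
from collections import defaultdict


def tokenize(text):
    """Naive whitespace tokenizer (assumption; real systems do far more)."""
    return set(text.lower().split())


class Pod:
    """Hypothetical pod holding one inverted index per WebID."""

    def __init__(self, pod_id):
        self.pod_id = pod_id
        # webid -> keyword -> set of document names that WebID may read
        self.indexes = defaultdict(lambda: defaultdict(set))

    def add_document(self, name, text, readers):
        # Index the document only under the WebIDs allowed to read it.
        for webid in readers:
            for word in tokenize(text):
                self.indexes[webid][word].add(name)

    def search(self, webid, keyword):
        # A WebID only ever sees its own index, never another reader's.
        index = self.indexes.get(webid, {})
        return sorted(index.get(keyword.lower(), set()))


def build_metadata(pods):
    """Server-side map: (webid, keyword) -> ids of pods worth querying."""
    meta = defaultdict(set)
    for pod in pods:
        for webid, index in pod.indexes.items():
            for word in index:
                meta[(webid, word)].add(pod.pod_id)
    return meta


def federated_search(pods, meta, webid, keyword):
    """Source selection via metadata, then per-pod lookup."""
    by_id = {pod.pod_id: pod for pod in pods}
    results = {}
    for pod_id in sorted(meta.get((webid, keyword.lower()), set())):
        hits = by_id[pod_id].search(webid, keyword)
        if hits:
            results[pod_id] = hits
    return results
```

In this sketch, a searcher whose WebID was never granted access to a document has no index entry for it, so both source selection and the pod-level lookup return nothing for that searcher.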
The paper further introduces a formal threat model for ESPRESSO, analyzing the security and privacy risks that arise when indexes and metadata are generated, aggregated, and used. These risks include unintended metadata leakage and the potential for adversaries to infer sensitive information about the data held in personal data stores. From this analysis, the authors identify key design principles that limit metadata exposure and mitigate unauthorized inference. The work provides a foundation for evaluating privacy-preserving decentralized search and informs the design of systems with stronger privacy guarantees, which are crucial for real-world adoption of decentralized data ecosystems like Solid.
- ESPRESSO enables scalable keyword search across distributed Solid pods using WebID-scoped indexes
- Formal threat model identifies risks of metadata leakage and unauthorized inference from index metadata
- Framework supports user-defined visibility policies, keeping data sovereignty with pod owners
Why It Matters
Enables practical, privacy-respecting search across decentralized personal data stores, crucial for Solid ecosystem adoption.