Proximity Measure of Information Object Features for Solving the Problem of Their Identification in Information Systems
A new paper proposes a unified proximity measure to match data from disparate sources, handling both quantitative and qualitative features.
A new research paper by Volodymyr Yuzefovych, submitted to arXiv under the identifier 2604.04939, tackles a core problem in AI and data science: determining whether disparate pieces of information describe the same physical object. The work introduces a novel quantitative-qualitative proximity measure for scenarios where data flows into a central system from multiple independent sources. This situation is common in fields like intelligence analysis, customer data platforms, and IoT sensor fusion.
Unlike many existing methods, Yuzefovych's proposed framework does not require transforming feature values to a common scale, which simplifies implementation. It uses a probabilistic measure to assess the proximity of numerical (quantitative) features and a possibility measure for categorical (qualitative) features, explicitly accounting for the errors made when feature values are determined. The 14-page paper, which includes 12 figures, formally demonstrates that the measure satisfies the required mathematical axioms, establishing its theoretical soundness for practical use in information systems.
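The paper's own formulas are not reproduced in this summary, so the following is only a minimal sketch of the general idea, not the author's method. It assumes independent Gaussian measurement errors for quantitative features and possibility distributions over categories for qualitative ones, scored with the standard sup-min consistency from possibility theory. All function names, and the choice of combining features by taking the minimum, are hypothetical.

```python
import math


def quantitative_proximity(x1, sigma1, x2, sigma2):
    """Probability-style proximity of two noisy numeric readings.

    Assumption (not from the paper): errors are independent and Gaussian,
    so under the "same object" hypothesis the difference x1 - x2 has
    standard deviation sqrt(sigma1^2 + sigma2^2). The score is the
    two-sided tail probability of a difference at least this large.
    """
    s = math.sqrt(sigma1 ** 2 + sigma2 ** 2)
    if s == 0.0:
        # Error-free readings: they either coincide or they do not.
        return 1.0 if x1 == x2 else 0.0
    return math.erfc(abs(x1 - x2) / (s * math.sqrt(2.0)))


def qualitative_proximity(pi1, pi2):
    """Possibility-style proximity of two categorical readings.

    Each reading is a dict mapping category -> possibility in [0, 1].
    The score is the sup-min consistency: the best category-wise overlap.
    """
    common = set(pi1) & set(pi2)
    return max((min(pi1[c], pi2[c]) for c in common), default=0.0)


def combined_proximity(scores):
    """Hypothetical aggregation: the weakest feature match decides."""
    return min(scores)


# Two sources report a length in metres (with error) and a colour.
length_score = quantitative_proximity(12.0, 0.5, 12.4, 0.5)
colour_score = qualitative_proximity(
    {"red": 1.0, "orange": 0.4},
    {"orange": 1.0, "red": 0.3},
)
overall = combined_proximity([length_score, colour_score])
```

Each score already lies in [0, 1] because it is normalized by the feature's own error model, which is one way to avoid rescaling raw feature values to a common scale before comparing them.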
- Proposes a unified proximity measure for matching data from multiple independent sources to a single real-world object.
- Uses a probabilistic measure for quantitative features and a possibility measure for qualitative features, without transforming feature values to a common scale.
- The 14-page paper formally shows that the measure satisfies the required mathematical axioms, giving entity resolution a theoretically grounded framework.
Why It Matters
The measure provides a formal, axiomatically grounded method for critical data fusion tasks in intelligence analysis, customer analytics, and sensor networks.