The CriticalSet problem: Identifying Critical Contributors in Bipartite Dependency Networks
A linear-time algorithm finds critical nodes in networks with 250M+ edges...
Researchers Sebastiano A. Piccolo and Andrea Tagarelli have formalized the CriticalSet problem: given a bipartite dependency network of contributors and items, find the set of k contributors whose removal isolates the largest number of items. They prove that the problem is NP-hard and that it requires maximizing a supermodular set function, which undermines standard greedy algorithms, whose approximation guarantees rely on submodularity. To solve this, they model CriticalSet as a coalitional game and derive ShapleyCov, a closed-form centrality measure based on the Shapley value that represents the expected number of items isolated by a contributor's departure.
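To make the objective concrete, here is a minimal Python sketch on a toy network. It is not the authors' code. It assumes the coalitional game assigns to each set of contributors the number of items whose contributors all lie in that set. Under that assumption the Shapley value has a simple closed form: each item splits one unit of credit equally among its contributors. The data and all function names (`DEPENDS_ON`, `isolated_items`, `shapley_style_score`, `shapley_brute_force`) are hypothetical, and the paper's exact ShapleyCov definition may differ.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical toy network: item -> set of contributors it depends on.
DEPENDS_ON = {
    "item1": {"a"},
    "item2": {"a", "b"},
    "item3": {"b", "c"},
    "item4": {"c"},
    "item5": {"a", "b", "c"},
}


def isolated_items(removed, depends_on=DEPENDS_ON):
    """Objective: count items whose contributors have ALL been removed."""
    removed = set(removed)
    return sum(1 for contributors in depends_on.values() if contributors <= removed)


def shapley_style_score(depends_on=DEPENDS_ON):
    """Closed form under the assumed game: sum of 1/degree(item) over supported items."""
    scores = {}
    for contributors in depends_on.values():
        share = Fraction(1, len(contributors))
        for c in contributors:
            scores[c] = scores.get(c, 0) + share
    return scores


def shapley_brute_force(depends_on=DEPENDS_ON):
    """Average marginal gain in isolated items over every removal order (tiny inputs only)."""
    players = sorted(set().union(*depends_on.values()))
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        removed = set()
        for p in order:
            before = isolated_items(removed, depends_on)
            removed.add(p)
            totals[p] += isolated_items(removed, depends_on) - before
    return {p: total / len(orders) for p, total in totals.items()}


if __name__ == "__main__":
    # Supermodularity in action: removing "b" gains more once "a" is already gone.
    print(isolated_items({"b"}) - isolated_items(set()))
    print(isolated_items({"a", "b"}) - isolated_items({"a"}))
    # The closed form agrees with the brute-force Shapley value on this toy network.
    print(shapley_style_score() == shapley_brute_force())
```

The increasing marginal gain is what makes the objective supermodular: a contributor's removal matters more the more of an item's other contributors are already gone, so a greedy pick that looks only at immediate gain can miss sets that are critical jointly.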
Leveraging these insights, the researchers propose MinCov, a linear-time iterative peeling algorithm that accounts for connection redundancy and prioritizes contributors who uniquely support many items. Extensive experiments on synthetic and large-scale real datasets, including a Wikipedia graph with over 250 million edges, show that MinCov and ShapleyCov significantly outperform traditional baselines. Notably, MinCov achieves near-optimal performance, within 0.02 AUC of a Stochastic Hill Climbing metaheuristic, while running several orders of magnitude faster. The result is a set of practical tools for analyzing critical dependencies in complex networks.
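The summary does not define the AUC figure. One plausible reading, which is an assumption here, is the normalized area under the curve of isolated items as contributors are removed in ranked order. The sketch below scores a ranking that way. The toy data and the names `isolation_curve` and `normalized_auc` are hypothetical, and this is neither MinCov nor the authors' evaluation code.

```python
# Hypothetical toy network: item -> set of contributors it depends on.
DEPENDS_ON = {
    "item1": {"a"},
    "item2": {"a", "b"},
    "item3": {"b", "c"},
    "item4": {"c"},
    "item5": {"a", "b", "c"},
}


def isolation_curve(ranking, depends_on):
    """Isolated-item counts after removing the top-k contributors, for k = 0..len(ranking).

    Assumes the ranking lists each contributor at most once.
    """
    # Contributors still supporting each item.
    remaining = {item: len(cs) for item, cs in depends_on.items()}
    # Reverse index: contributor -> items it supports.
    supports = {}
    for item, cs in depends_on.items():
        for c in cs:
            supports.setdefault(c, []).append(item)

    curve, isolated = [0], 0
    for c in ranking:
        for item in supports.get(c, []):
            remaining[item] -= 1
            if remaining[item] == 0:  # last supporter just left
                isolated += 1
        curve.append(isolated)
    return curve


def normalized_auc(curve, n_items):
    """Trapezoidal area under the curve, scaled so the maximum possible value is 1."""
    steps = len(curve) - 1
    area = sum((curve[i] + curve[i + 1]) / 2 for i in range(steps))
    return area / (steps * n_items)


if __name__ == "__main__":
    for ranking in (["a", "c", "b"], ["b", "a", "c"]):
        curve = isolation_curve(ranking, DEPENDS_ON)
        print(ranking, curve, normalized_auc(curve, len(DEPENDS_ON)))
```

Each edge is visited once, so scoring a ranking takes time linear in the number of edges. A ranking that removes uniquely supporting contributors early pushes the curve up sooner and earns a larger area.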
- CriticalSet problem is proven NP-hard, requiring supermodular function maximization
- MinCov algorithm achieves near-optimal performance within 0.02 AUC of metaheuristic baselines
- Tested on a Wikipedia graph with 250M+ edges, running orders of magnitude faster than the Stochastic Hill Climbing metaheuristic
Why It Matters
Enables efficient identification of critical nodes in massive bipartite networks, with applications in supply chains, knowledge graphs, and social networks.