Making the complete OpenAIRE citation graph easily accessible through compact data representation
A massive academic dataset just became accessible to anyone with a laptop.
Deep Dive
Researchers have compressed the OpenAIRE academic citation graph so that it fits on a standard computer. The original dataset contains over 200 million publications and 2 billion citations and typically requires terabytes of storage. The processed version shrinks it to 32 GB while preserving the full network structure. The authors also provide a simple data format and a Python pipeline to support community use and future updates to the graph.
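To give a sense of how a citation graph can be stored compactly, here is a minimal sketch of a compressed sparse row (CSR) adjacency layout. It is an illustration only: the summary above does not describe the authors' actual format, so the CSR choice, the function names `build_csr` and `references_of`, and the toy edge list are all assumptions, not the published pipeline.

```python
from array import array


def build_csr(num_nodes, edges):
    """Build a CSR adjacency structure from (citing, cited) id pairs.

    Hypothetical helper for illustration. Node ids are assumed to be
    integers in [0, num_nodes). Returns (offsets, targets), where the
    works cited by node i are targets[offsets[i]:offsets[i + 1]].
    """
    edges = list(edges)

    # Count outgoing citations per node.
    counts = [0] * num_nodes
    for src, _ in edges:
        counts[src] += 1

    # Prefix sums give each node's start position in the targets array.
    offsets = array("q", [0] * (num_nodes + 1))
    for i in range(num_nodes):
        offsets[i + 1] = offsets[i] + counts[i]

    # Fill targets, keeping one moving write cursor per node.
    cursor = list(offsets[:num_nodes])
    targets = array("I", [0] * len(edges))
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1

    return offsets, targets


def references_of(offsets, targets, node):
    """Return the ids cited by `node` as a plain list."""
    return list(targets[offsets[node]:offsets[node + 1]])


# Toy graph: 4 publications, 5 citations (made-up data).
edges = [(0, 1), (0, 2), (1, 2), (3, 0), (3, 2)]
offsets, targets = build_csr(4, edges)
```

In this layout each citation costs one unsigned integer (4 bytes on most platforms) and each publication one 8-byte offset, instead of a full record per edge. That is the general idea behind compact graph representations; the real dataset may use a different encoding.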
Why It Matters
This unlocks large-scale network analysis for researchers and developers who lack access to high-performance computing infrastructure.