Building realistic electric transmission grid dataset at scale: a pipeline from open dataset
21,697-bus Eastern Interconnection model now available for AC-OPF analysis without restricted data.
Microsoft Research has introduced a pipeline that constructs geographically grounded, electrically coherent transmission grid models solely from publicly available datasets, and has released an open dataset covering 48 U.S. states and interconnection-scale networks. The largest model, the Eastern Interconnection, includes 21,697 buses. The work tackles a key bottleneck: realistic transmission-level grid data is classified as critical infrastructure, making it inaccessible to most researchers. By using OpenStreetMap and other open geographic and energy data, the pipeline produces models that support alternating current optimal power flow (AC-OPF) analysis, enabling physics-based investigations of congestion, capacity constraints, and infrastructure changes without restricted or commercial datasets.
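To make "physics-based investigation of congestion" concrete, here is a minimal Python sketch of the kind of per-line check that AC analysis enables: it computes the complex power entering one transmission line from the bus voltages at its two ends, using the standard pi-model of a line, and compares the result with a thermal rating. It is not taken from the Microsoft Research pipeline or dataset. The function names `line_flow_mva` and `is_overloaded`, the 100 MVA base, and all numeric values are assumptions chosen for illustration.

```python
import cmath
import math


def line_flow_mva(v_from, v_to, angle_from_deg, angle_to_deg,
                  r, x, b_charging, base_mva=100.0):
    """Complex power (MW + j MVAr) entering a line at its 'from' end.

    Hypothetical helper for illustration. Voltages, r, x and b_charging
    are per-unit values on the given MVA base; the line is represented
    by the standard pi-model (series impedance, half the charging
    susceptance at each end).
    """
    vf = cmath.rect(v_from, math.radians(angle_from_deg))
    vt = cmath.rect(v_to, math.radians(angle_to_deg))
    y_series = 1 / complex(r, x)           # series admittance
    y_shunt = complex(0.0, b_charging / 2)  # shunt admittance at the from end
    i_from = y_series * (vf - vt) + y_shunt * vf
    return vf * i_from.conjugate() * base_mva


def is_overloaded(s_mva, rating_mva):
    """True if the apparent power magnitude exceeds the thermal rating."""
    return abs(s_mva) > rating_mva


if __name__ == "__main__":
    # Made-up example: a lossless line with a 5 degree angle difference.
    s = line_flow_mva(1.0, 1.0, 5.0, 0.0, r=0.0, x=0.1, b_charging=0.0)
    print(f"P = {s.real:.1f} MW, Q = {s.imag:.1f} MVAr")
    print("overloaded at 80 MVA rating:", is_overloaded(s, 80.0))
```

A full AC-OPF goes further than this check: it solves for the generator dispatch and bus voltages that minimize cost while satisfying the power balance at every bus and limits like this one on every line. The single-line calculation only shows why line impedances and ratings must be electrically plausible for such studies to be meaningful.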
The pipeline has been validated across the continental United States and is designed to generalize to other regions with comparable open data sources. Demonstrated applications include assessing transmission expansion potential, targeting line upgrades, and evaluating the placement of large datacenter loads. The models preserve the geographic structure of transmission corridors, substations, and generators, while transparently accounting for uncertainty in operational parameters. The open dataset is particularly valuable for data-driven and AI-based grid analysis, which requires large volumes of physically plausible grid data. Microsoft Research sees it as a way to democratize power systems research, enabling faster development of solutions for modern stresses such as demand growth and renewable integration.
- Dataset spans 48 U.S. states and includes the full Eastern Interconnection grid with 21,697 buses
- Models support AC-OPF analysis, enabling physics-based study of congestion, capacity, and demand siting
- Built entirely from public data (OpenStreetMap and other open geographic and energy data) with transparent uncertainty reporting
Why It Matters
The dataset democratizes power grid modeling, enabling AI-driven analysis and planning without reliance on restricted or proprietary data.