Robotics

CoPark's self-play RL achieves 70-85% reactive parking success with 3-6% collision rate

arXiv cs.RO June 04, 2026

⚡ Autonomous parking that yields, queues, and reverses, all learned via self-play

Deep Dive

CoPark tackles two conflicting goals: high-precision parking and safe interaction with other vehicles. The key innovation is a partner-threat-modulated, channel-asymmetric release of the precomputed prior. A continuous threat signal shifts longitudinal control to the residual head for yielding, while lateral control stays anchored to the offline plan, preserving sub-meter slot alignment. A closed-loop refinement layer corrects residual errors from action-grid discretization. The policy is trained on six parking lots and evaluated zero-shot on the new reactive-parking benchmark (Dragon Lake Parking, DLP, and DeepScenario Open 3D, DSC3D).
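To make the channel-asymmetric idea concrete, here is a minimal sketch in Python. It is not CoPark's implementation: the `Action` type, the `blend_action` function, the linear blending rule, and the assumption that the threat signal lies in [0, 1] are all hypothetical choices made for illustration. The sketch only shows the general shape of the idea, where the longitudinal channel is handed to the learned residual head as threat rises and the lateral channel stays on the precomputed plan.

```python
from dataclasses import dataclass


@dataclass
class Action:
    accel: float  # longitudinal command (hypothetical units)
    steer: float  # lateral command (hypothetical units)


def blend_action(prior: Action, residual: Action, threat: float) -> Action:
    """Illustrative channel-asymmetric blend gated by a threat signal.

    Assumption: threat is a scalar in [0, 1]; values outside are clamped.
    Assumption: the blend is linear; the actual rule may differ.
    """
    t = min(max(threat, 0.0), 1.0)
    # Longitudinal channel: shift toward the residual head as threat rises.
    accel = (1.0 - t) * prior.accel + t * residual.accel
    # Lateral channel: stay anchored to the offline plan regardless of threat.
    steer = prior.steer
    return Action(accel=accel, steer=steer)


if __name__ == "__main__":
    plan = Action(accel=1.0, steer=0.2)      # from the precomputed plan
    learned = Action(accel=-2.0, steer=0.5)  # from the residual head
    for threat in (0.0, 0.5, 1.0):
        print(threat, blend_action(plan, learned, threat))
```

With no threat the output follows the plan exactly. At full threat the acceleration comes entirely from the residual head (here, braking to yield), while the steering command is unchanged in both cases.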

Results show CoPark achieves roughly 70-85% success with a 3-6% collision rate, outperforming classical planners, imitation learning, and large-scale RL baselines. The system also exhibits emergent interactive behaviors such as reverse-yielding, mid-maneuver yielding, tight-corridor passing, and queuing, none of which were explicitly programmed. This suggests that self-play RL on top of robust plan priors can produce safe, socially aware autonomous driving maneuvers in constrained environments.

Key Points

Residual-policy architecture: precomputed plan provides fixed prior, learned head handles reactive corrections
70-85% success rate with 3-6% collisions on zero-shot benchmarks (DLP, DSC3D)
Emergent behaviors: reverse-yielding, queuing, tight-corridor passing — all unscripted

Why It Matters

CoPark demonstrates that self-play RL can safely coordinate multiple autonomous vehicles in tight spaces, reducing collisions while achieving high parking precision.
