// the find
OpenDriveLab/UniAD
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
UniAD, which won the CVPR 2023 Best Paper Award, treats autonomous driving as a single end-to-end pipeline: perception (tracking + map segmentation) feeds into motion prediction, which feeds into occupancy forecasting, which feeds into planning. The key bet is that planning improves when the planner can query upstream task representations directly rather than consume disconnected outputs. This is a research codebase, not a production system.
The hierarchical task chaining is the actual contribution here: the planning head attends to motion queries, which attend to tracked object queries, giving gradients a meaningful path back through the whole stack rather than hitting a detach boundary. The two-stage training setup (perception first, then end-to-end) is pragmatic and reproducible, not just a hand-wavy 'train it all together'. The v2.0 update to mmdet3d 1.x and torch 2.x is genuinely useful; the original shipped against ancient mmcv and was a dependency nightmare. NAVSIM benchmark support, listed for v2.0, targets a more realistic simulation-based signal than the open-loop nuScenes planning metric.
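To make the query chaining concrete, here is a minimal dependency-free sketch. It is not UniAD's code: `cross_attend` and `plan_from` are hypothetical names, the attention has no learned projections, and the vectors are random. It only illustrates that when the plan query attends to motion queries, which in turn attend to track queries, a change in a tracked-object feature propagates all the way to the plan. A finite-difference nudge stands in for gradient flow here.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, context):
    """Scaled dot-product cross-attention with a residual connection.

    Illustrative only: no learned projections, single head.
    """
    dim = len(context[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, c)) / math.sqrt(dim) for c in context]
        weights = softmax(scores)
        mixed = [sum(w * c[i] for w, c in zip(weights, context)) for i in range(dim)]
        out.append([qi + mi for qi, mi in zip(q, mixed)])
    return out

def plan_from(track_queries, motion_queries, plan_query):
    # Stage 1: motion queries read from tracked-object queries.
    motion = cross_attend(motion_queries, track_queries)
    # Stage 2: the single plan query reads from the updated motion queries.
    return cross_attend([plan_query], motion)[0]

rng = random.Random(0)

def rand_vecs(n, dim=4):
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]

track_queries = rand_vecs(3)
motion_queries = rand_vecs(3)
plan_query = rand_vecs(1)[0]

baseline = plan_from(track_queries, motion_queries, plan_query)

# Nudge one feature of one tracked object and recompute the plan features.
perturbed_tracks = [list(v) for v in track_queries]
perturbed_tracks[0][0] += 0.5
perturbed = plan_from(perturbed_tracks, motion_queries, plan_query)

sensitivity = max(abs(a - b) for a, b in zip(baseline, perturbed))
print(f"max change in plan features: {sensitivity:.4f}")
```

In an autograd framework the same dependency is what lets the planning loss update the tracking layers; detaching the motion features before the planner would cut that path.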
The open-loop planning metric (L2 error plus collision rate on nuScenes) has been publicly disputed, even by the authors: there's a pinned issue thread about it, and the metric is known to be gameable without actually driving well. Training requires at least 8×A100 GPUs and takes days; there's no lighter path to experimenting with just the planning head. The codebase is tightly coupled to mmdet3d's config system, so customization means understanding three layers of framework before you can change anything meaningful. The nuPlan and NAVSIM tooling promised for v2.0 (Q2 2025 ETA) still shows as unchecked TODOs in the README as of the last push.
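For readers unfamiliar with the disputed metric, the sketch below shows roughly what an open-loop score of this kind computes. It is a simplification under stated assumptions, not the nuScenes evaluation code: the function names, the sample layout, and the 1 m point-distance collision check are all hypothetical stand-ins for the real geometry-based check.

```python
import math

def l2_error(planned, expert):
    """Mean Euclidean distance between planned and logged expert waypoints."""
    dists = [math.dist(p, e) for p, e in zip(planned, expert)]
    return sum(dists) / len(dists)

def collides(planned, obstacles_per_step, radius=1.0):
    """True if any planned waypoint comes within `radius` metres of an
    obstacle recorded at the same timestep (assumed simplification)."""
    for waypoint, obstacles in zip(planned, obstacles_per_step):
        if any(math.dist(waypoint, ob) < radius for ob in obstacles):
            return True
    return False

def open_loop_metrics(samples, radius=1.0):
    """Average L2 error and fraction of samples with a collision."""
    l2 = sum(l2_error(s["planned"], s["expert"]) for s in samples) / len(samples)
    rate = sum(collides(s["planned"], s["obstacles"], radius) for s in samples) / len(samples)
    return l2, rate

samples = [
    {   # plan matches the log exactly, obstacles far away
        "planned": [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)],
        "expert": [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)],
        "obstacles": [[(5.0, 1.0)], [(5.0, 2.0)], [(5.0, 3.0)]],
    },
    {   # plan swerves away from the log and passes close to an obstacle
        "planned": [(0.0, 1.0), (3.0, 2.0), (4.0, 3.0)],
        "expert": [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)],
        "obstacles": [[], [(3.0, 2.5)], []],
    },
]

avg_l2, collision_rate = open_loop_metrics(samples)
print(f"avg L2: {avg_l2:.3f} m, collision rate: {collision_rate:.2f}")
```

Both numbers are computed against a recorded log: the other agents never react to the plan, so the score rewards imitating the logged trajectory rather than driving well in a live scene.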