Models
Lagrange: An Open-Vocabulary, Energy-Based Sparse Framework for Generalized End-to-End Driving
The article introduces Lagrange, an open-vocabulary, energy-based sparse framework for end-to-end autonomous driving that targets the difficulty existing models have with complex, open-world environments. It uses Masked Latent Fields (MLF) and Vision-Language Models (VLMs) to generate continuous semantic visual tokens, filters entities with an intent-driven masked cross-attention mechanism, and makes decisions through Lagrangian action minimization. In offline evaluations on the nuScenes and CODA benchmarks the framework shows promising results, and it is presented as a robust, interpretable approach to real-world driving scenarios that require compliance with vehicle kinematics and collision avoidance.
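To make the entity-filtering idea concrete, the following is a minimal sketch of generic single-head masked cross-attention, not the article's implementation. The function name `masked_cross_attention`, the boolean `keep_mask`, and the treatment of the driving intent as a single query vector are all assumptions made for illustration; the summary does not specify how Lagrange derives its mask or its tokens.

```python
import numpy as np


def masked_cross_attention(query, keys, values, keep_mask):
    """Generic single-head cross-attention with a hard token mask.

    query:     (d,)     hypothetical intent embedding
    keys:      (n, d)   hypothetical visual-token keys
    values:    (n, d_v) hypothetical visual-token values
    keep_mask: (n,)     bool, True for tokens the query may attend to
    Returns the attended vector (d_v,) and the attention weights (n,).
    """
    if not keep_mask.any():
        raise ValueError("keep_mask must keep at least one token")

    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)             # scaled dot-product scores
    scores = np.where(keep_mask, scores, -np.inf)  # masked tokens get -inf
    scores = scores - scores.max()                 # numerical stability
    weights = np.exp(scores)                       # exp(-inf) evaluates to 0
    weights = weights / weights.sum()              # softmax over kept tokens
    return weights @ values, weights


# Illustrative inputs only: four tokens, two of them filtered out.
rng = np.random.default_rng(0)
intent = rng.normal(size=8)
token_keys = rng.normal(size=(4, 8))
token_values = rng.normal(size=(4, 3))
keep = np.array([True, False, True, False])

attended, attn = masked_cross_attention(intent, token_keys, token_values, keep)
```

Masked-out tokens receive exactly zero weight, so they cannot influence the attended vector. A hard mask is the simplest choice; a soft relevance weighting would be another option, and the summary does not say which one the framework uses.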
autonomous driving, energy-based, planning