Safety
Is Your Trajectory Displacement Safe in Long-tail?
The article introduces FluidTest, an evaluation pipeline that addresses the shortcomings of existing autonomous driving evaluation methods in long-tail scenarios. It combines a pairwise WebUI for human annotation, a taxonomy of 32 semantic threats, and a three-agent verification system to strengthen safety assessment. Experiments on the WOD-E2E dataset show that FluidTest identifies significant safety threats in planner trajectories and that high scores on performance metrics do not guarantee safety, a finding that matters to practitioners building reliable autonomous systems.
autonomous driving, evaluation, safety