Safety
Task-Aligned Stability Analysis of Vision-Language Models for Autonomous Driving Hazard Detection
The study analyzes vision-language models (VLMs) for hazard detection in autonomous driving, focusing on the correlation between corruption-induced embedding drift and task-aligned hazard scores derived from CLIP image-text similarities. Using controlled corruptions on the BDD100K dataset, it finds that the relationship between representation drift and decision drift varies significantly with the type of corruption. Robustness benchmarks should therefore report task-specific stability measures alongside general embedding stability metrics. For practitioners, this informs the design of more reliable models for real-world autonomous driving.
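To make the two quantities concrete, the following is a minimal sketch of how embedding drift and hazard-score drift could be compared. It is not the study's implementation. The function names, the two-prompt softmax form of the hazard score, the temperature of 0.01, and the synthetic stand-in embeddings are all assumptions made for illustration; a real analysis would use CLIP embeddings of clean and corrupted BDD100K frames.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length along the given axis."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def embedding_drift(clean_emb, corrupted_emb):
    """Per-image cosine distance between clean and corrupted embeddings."""
    c = l2_normalize(clean_emb)
    k = l2_normalize(corrupted_emb)
    return 1.0 - np.sum(c * k, axis=-1)


def hazard_score(image_emb, hazard_text_emb, safe_text_emb, temperature=0.01):
    """Assumed hazard score: softmax weight of a 'hazard' prompt versus a
    'safe' prompt, computed from image-text cosine similarities."""
    img = l2_normalize(image_emb)
    sims = np.stack(
        [img @ l2_normalize(hazard_text_emb), img @ l2_normalize(safe_text_emb)],
        axis=-1,
    )
    logits = sims / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs[..., 0] / probs.sum(axis=-1)


def pearson(a, b):
    """Pearson correlation between two 1-D arrays (NaN if either is constant)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else float("nan")


# Synthetic stand-ins for CLIP embeddings (hypothetical data, not BDD100K).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 64))
severity = rng.uniform(0.0, 1.0, size=(200, 1))  # per-image corruption strength
corrupted = clean + severity * rng.normal(size=(200, 64))
hazard_text = rng.normal(size=64)
safe_text = rng.normal(size=64)

rep_drift = embedding_drift(clean, corrupted)
dec_drift = np.abs(
    hazard_score(corrupted, hazard_text, safe_text)
    - hazard_score(clean, hazard_text, safe_text)
)
print("correlation between representation and decision drift:",
      pearson(rep_drift, dec_drift))
```

Running the same comparison separately for each corruption type would show whether the correlation differs across corruptions, which is the kind of variation the summary describes.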
hazard detection, autonomous driving, robustness