MultimodalarXiv cs.AI — 10 d ago

Seeing Roads Through Words: A Language-Guided Framework for RGB-T Driving Scene Segmentation

The article presents CLARITY, a novel framework for RGB-T driving scene segmentation that adapts its fusion strategy based on the detected scene conditions, leveraging vision-language model priors. It introduces mechanisms to preserve dark-object semantics and a hierarchical decoder for structural consistency, achieving state-of-the-art results on the MFNet dataset with 62.3% mean Intersection over Union (mIoU) and 77.5% mean Accuracy (mAcc). This approach addresses the challenges of adverse illumination and enhances segmentation accuracy, which is critical for improving the robustness of autonomous driving systems.

semantic segmentationautonomous drivingrgb-thermalrelevance 0.00 · engagement 0.00

Read at source ↗← all news