Multimodal
FADA: Accessible fetal ultrasound interpretation and annotation with a selectively distilled unified vision-language model
FADA, a unified vision-language model based on Qwen3.5-VL, has been released to support fetal ultrasound interpretation and annotation, addressing the shortage of trained sonographers in low- and middle-income countries. It integrates clinical interpretation, classification, detection, and segmentation in a single pipeline through selective distillation from four domain-specific models, achieving a mean Dice score of 0.8820 for segmentation and an mAP@0.50 of 0.7671 for detection. The 0.8B-parameter model is optimized for edge deployment and runs on consumer hardware such as the Qualcomm Snapdragon 7 Gen 1, making AI-assisted fetal assessment accessible in resource-constrained environments.
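For readers unfamiliar with the two reported metrics, the following is a minimal sketch of their standard definitions, not the evaluation code used for FADA. The function names (`dice_score`, `box_iou`, `is_true_positive`) and the toy masks and boxes are hypothetical, chosen only for illustration. Dice measures overlap between a predicted and a reference segmentation mask; the "@0.50" in mAP@0.50 means a predicted box counts as a match only if its intersection-over-union (IoU) with a reference box is at least 0.50.

```python
def dice_score(pred, target):
    """Dice = 2|A and B| / (|A| + |B|) for two binary masks given as 2D lists of 0/1."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.50):
    """Matching criterion behind the '@0.50' suffix: IoU must reach the threshold."""
    return box_iou(pred_box, gt_box) >= threshold


# Toy data (hypothetical): 2 overlapping pixels, 3 foreground pixels in each mask.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
print(round(dice_score(pred_mask, true_mask), 4))

print(is_true_positive((0, 0, 10, 10), (0, 0, 10, 5)))
print(is_true_positive((0, 0, 10, 10), (5, 5, 15, 15)))
```

This sketch covers only the per-mask and per-box building blocks. A full mAP computation additionally ranks detections by confidence, builds a precision-recall curve per class, and averages the resulting average precision across classes.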
ultrasound, vision-language, clinical interpretation