Research
SpeechDx: A Multi-Task Benchmark for Clinical Speech AI
SpeechDx is a new benchmark for clinical speech AI that spans 12 datasets and 27 tasks, organized by the stages of speech production: conceptualization, formulation, and articulation. It evaluates 12 state-of-the-art audio encoders under zero-shot cross-condition transfer and finds that large-scale speech models perform best overall, while domain-specific models help only on closely related tasks. For AI practitioners, the benchmark offers a common basis for comparing clinical speech AI methods and is intended to improve generalization across diverse health conditions.
speech, clinical, ai