Inference
NaturalFlow: Reducing Disruptive Pauses for Natural Speech Flow in Simultaneous Speech-to-Speech Translation
The article introduces NaturalFlow, a fluency-aware optimization framework for simultaneous speech-to-speech translation that aims to reduce disruptive pauses in translated speech. By leveraging model-internal signals such as linguistic diversity and temporal variability, it achieves a balance between low latency and natural speech flow. Experimental results demonstrate that NaturalFlow maintains competitive latency and translation quality while enhancing the acoustic fluency of the output, which is critical for improving user experience in real-time communication applications.
speech translation, natural language processing