Research
Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions in LSTM Networks
The paper reports a "multiple-descent" phenomenon in LSTM networks: on real-world tasks, performance goes through repeated cycles of degradation and recovery as training continues, particularly after overtraining. Using asymptotic stability analysis, the authors link these cycles to transitions between order and chaos in the network dynamics. They find that locally optimal training steps fall at the critical transition points, and that the best model performance typically occurs at the first transition from order to chaos. For practitioners, this suggests that tracking these dynamics can inform training strategies and improve model performance in LSTM architectures.
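To make the order-versus-chaos distinction concrete, the following is a minimal sketch of one common way to probe the asymptotic stability of a recurrent map: estimating the largest Lyapunov exponent by tracking how fast two nearby trajectories diverge. It is not the authors' procedure. The function names (`lstm_step`, `random_lstm`, `largest_lyapunov`), the random (untrained) weights, and the constant zero input are all assumptions made for illustration. A negative estimate indicates ordered (contracting) dynamics and a positive one indicates chaotic dynamics.

```python
import numpy as np


def sigmoid(z):
    # Numerically stable logistic function expressed through tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def lstm_step(state, W, b, x):
    """One update of a standard LSTM cell.

    state is concat(h, c); W has shape (4n, n + m) and maps [h, x] to the
    pre-activations of the input, forget, output and candidate gates.
    """
    n = state.size // 2
    h, c = state[:n], state[n:]
    z = W @ np.concatenate([h, x]) + b
    i = sigmoid(z[0 * n:1 * n])
    f = sigmoid(z[1 * n:2 * n])
    o = sigmoid(z[2 * n:3 * n])
    g = np.tanh(z[3 * n:4 * n])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return np.concatenate([h_new, c_new])


def random_lstm(n_hidden, n_input, scale, seed=0):
    # Hypothetical helper: Gaussian weights stand in for a trained model.
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=scale, size=(4 * n_hidden, n_hidden + n_input))
    b = np.zeros(4 * n_hidden)
    return W, b


def largest_lyapunov(W, b, n_hidden, n_input, steps=2000, burn_in=200,
                     eps=1e-8, seed=0):
    """Two-trajectory estimate of the largest Lyapunov exponent."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_input)  # assumption: autonomous dynamics with zero input
    s = rng.normal(scale=0.5, size=2 * n_hidden)
    for _ in range(burn_in):  # discard the initial transient
        s = lstm_step(s, W, b, x)

    d = rng.normal(size=2 * n_hidden)
    s_pert = s + eps * d / np.linalg.norm(d)
    log_sum = 0.0
    for _ in range(steps):
        s = lstm_step(s, W, b, x)
        s_pert = lstm_step(s_pert, W, b, x)
        diff = s_pert - s
        dist = np.linalg.norm(diff)
        if dist == 0.0:
            # Perturbation collapsed below floating-point resolution.
            return float("-inf")
        log_sum += np.log(dist / eps)
        # Rescale the perturbation back to size eps along its current direction.
        s_pert = s + eps * diff / dist
    return log_sum / steps


# Usage: compare a small and a large weight scale (values chosen arbitrarily).
for scale in (0.05, 2.0):
    W, b = random_lstm(16, 4, scale, seed=1)
    print(scale, largest_lyapunov(W, b, 16, 4))
```

For a trained network, one would substitute the learned weights at each checkpoint and drive the cell with representative inputs rather than zeros. How the exponent's sign changes across checkpoints is the kind of signal the summary above refers to.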
deep-learning, LSTM, multiple-descent