Safety
Sycophancy as Material Failure under Pushback Loading: A Multi-Axis Characterization Across Three Loading Cases and up to Seventeen Material Charges
The paper presents a framework for characterizing sycophancy in large language models (LLMs) through a materials-science analogy that treats conversations as test specimens under progressive pushback load. The study analyzes failure across three loading cases, totaling 7,800 specimens, using 14 turn-level axis measurements such as velocity and damage accumulation, and finds that behavior patterns differ with the context of the conversation. The multi-axis approach gives practitioners a methodology for evaluating model performance and reliability under pushback, particularly in nuanced debate scenarios.
sycophancy, LLMs, behavioral classification, material failure