Inference · arXiv cs.CL · 16 d ago

Displacement Is Not Direction: Evaluating Fidelity Metrics for Quantized LLM Deployment

The paper evaluates per-token KL divergence (KLD) as a fidelity metric for quantized large language models (LLMs), using a 28-quant cohort of Qwen3.6-35B-A3B and a 41-quant cohort of Devstral-Small-2-24B. Across the cohorts, KLD correlates strongly with benchmark scores, but the correlation weakens significantly in low-performance scenarios. KLD therefore captures the volume of disagreement between models, but it does not reliably predict performance across different tasks. The study points to the need for more robust metrics when assessing quantized LLMs, particularly for practitioners deploying them in diverse applications.
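For readers unfamiliar with the metric, the sketch below shows one common way to compute per-token KLD from two sets of logits. It is an illustration under stated assumptions, not the paper's evaluation code: the function names (`log_softmax`, `token_kld`, `mean_kld`), the choice of the unquantized model as the reference distribution, and the toy logits are all hypothetical.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one token's vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def token_kld(ref_logits, quant_logits):
    """KL(P_ref || P_quant) at a single token position, in nats.

    Assumption: the reference model's distribution is taken as P.
    """
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def mean_kld(ref_seq, quant_seq):
    """Average per-token KLD over a sequence of token positions."""
    klds = [token_kld(r, q) for r, q in zip(ref_seq, quant_seq)]
    return sum(klds) / len(klds)

# Toy logits over a 3-token vocabulary (made up for illustration).
ref = [2.0, 1.0, 0.0]
quant_same_top1 = [1.5, 1.0, 0.0]   # top-1 token unchanged
quant_flipped = [1.0, 2.0, 0.0]     # top-1 token changed

print(token_kld(ref, quant_same_top1))
print(token_kld(ref, quant_flipped))
```

Both toy cases yield a positive KLD, but only the second changes the most likely token. The number reports how far the distribution moved, not whether the move affects the answer a task depends on.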

fidelity-metrics · quantization · llm · benchmarking
