ai-digest.dev
arXiv cs.AI · 15 d ago

Cost-Optimal LLM Routing with Limited User Feedback under User Satisfaction Guarantees

The paper introduces SLARouter, an online routing algorithm that minimizes inference cost for large language model (LLM) applications while meeting Service Level Agreements (SLAs) on user satisfaction. SLARouter learns from sparse, one-sided user feedback and comes with theoretical guarantees for cost optimality and SLA compliance. Across various LLM benchmarks, it cuts operating costs by a factor of up to 2.2 compared to existing methods. For practitioners, this means more efficient resource allocation in LLM applications without extensive tuning or complete feedback signals.

Tags: llm, routing, cost-optimization, sla, online-learning