Research · arXiv cs.AI · 4 d ago

PoQ-Judge: A Multi-Architecture Evaluation Framework for Cost-Aware Proof-of-Quality in Decentralized LLM Inference

The article presents PoQ-Judge, a framework for reference-free quality evaluation in decentralized LLM inference: judge models score each query-output pair directly, so no reference output is needed. It evaluates three judge architectures (TextCNN, MiniLM, and DeBERTa) and reports that the best model reaches a Pearson correlation of 0.747 with ground-truth proxies, outperforming prior reference-based evaluators. For practitioners, the reported benefit is a 72.7% reduction in evaluation cost with minimal loss of quality.
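To make the headline metric concrete, here is a minimal sketch of how a Pearson correlation between judge scores and ground-truth proxy scores could be computed. The function name `pearson` and the score lists are hypothetical and purely illustrative. They are not taken from the paper, and this is not the authors' evaluation code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (assumption): scores a judge model assigned to five
# query-output pairs, and the matching ground-truth proxy scores.
judge_scores = [0.9, 0.4, 0.7, 0.2, 0.6]
proxy_scores = [0.8, 0.5, 0.6, 0.1, 0.7]

r = pearson(judge_scores, proxy_scores)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the judge ranks and scales outputs much like the proxy does. The reported 0.747 would indicate a fairly strong but imperfect linear agreement.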

decentralized-llm · evaluation · quality