Coding · arXiv cs.AI · 2 d ago

LLM-Based Code Documentation Generation and Multi-Judge Evaluation

A framework for automated code documentation generation uses eight Large Language Models (LLMs), including GPT, Gemini, Qwen, and LLaMA variants. It is built on the PocketFlow orchestration framework and combines modular pipelines with prompt engineering. Quality is assessed by the MultiLLMasJudges evaluation framework, in which four independent LLMs score each documentation output against nine criteria. On an open-source medical physics library, the evaluation showed a 42% performance gap between the best and worst models. The results suggest the approach could improve documentation quality and reduce manual effort in critical domains such as healthcare.
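To make the multi-judge idea concrete, here is a minimal sketch of how scores from four judges on nine criteria might be combined into one number per model, and how a best-to-worst gap could be expressed as a percentage. It is not the paper's implementation: the criterion names, the plain averaging, the `aggregate` and `relative_gap` helpers, and the definition of the gap (relative to the best score) are all assumptions made for illustration, since the summary does not specify them.

```python
from statistics import mean

# Hypothetical placeholders: the summary does not name the nine criteria.
CRITERIA = [f"criterion_{i}" for i in range(1, 10)]


def aggregate(scores_by_judge):
    """Average each judge's nine criterion scores, then average across judges.

    scores_by_judge maps a judge name to a dict of {criterion: score}.
    Unweighted averaging is an assumption, not the paper's stated method.
    """
    per_judge = []
    for judge, scores in scores_by_judge.items():
        missing = set(CRITERIA) - set(scores)
        if missing:
            raise ValueError(f"{judge} is missing scores for {sorted(missing)}")
        per_judge.append(mean(scores[c] for c in CRITERIA))
    return mean(per_judge)


def relative_gap(best, worst):
    """Gap between two scores as a percentage of the best score (assumed definition)."""
    return (best - worst) / best * 100


# Illustrative numbers only; they are not results from the paper.
judges = ["judge_a", "judge_b", "judge_c", "judge_d"]
model_x = {j: {c: 4 for c in CRITERIA} for j in judges}
model_y = {j: {c: 2 for c in CRITERIA} for j in judges}

gap = relative_gap(aggregate(model_x), aggregate(model_y))
```

Averaging within each judge before averaging across judges keeps one judge from dominating if a real pipeline let judges skip criteria; a weighted scheme or a median across judges would be equally plausible choices.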

