Training · arXiv cs.CL · 7 d ago

Small LLMs for Biomedical Claim Verification: Cost-Effective Fine-Tuning, Structural Dataset Shortcuts, and Cross-Domain Generalization

Researchers fine-tuned three small LLMs, Phi-3-mini (3.8B), Qwen2.5-3B, and Mistral-7B, with QLoRA on the SciFact and HealthVer datasets for biomedical claim verification and report significant performance gains. Notably, Mistral-7B outperformed both GPT-4o and GPT-5 by up to 12% in F1 while using only 1,008 training examples, which suggests that small models can be effective and inexpensive in this domain. The work also highlights how dataset structure affects cross-domain generalization. The authors plan to release all code and adapter checkpoints, which should help practitioners build cost-effective LLM solutions.

fine-tuning · biomedical · cross-domain
