Open Source · arXiv cs.AI · 12 d ago

Mordal: Automated Pretrained Model Selection for Vision Language Models

Mordal is an automated framework for selecting the pretrained vision language model (VLM) best suited to a specific task. It reduces candidate evaluation time and GPU usage, requiring 8.9x to 11.6x fewer GPU hours than traditional grid search. It also outperforms existing model selection techniques by approximately 69% in weighted Kendall's τ, a rank-correlation metric, across various tasks. For practitioners, this shortens the path to deploying an effective VLM in real-world applications such as healthcare and robotics.
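To make the weighted Kendall's τ figure more concrete, here is a minimal sketch of one common way to compute such a score. The summary does not say which weighting Mordal uses, so the hyperbolic weights below (pairs involving the truly best models count more), the function name `weighted_kendall_tau`, and the example scores are all assumptions for illustration only. Ties receive no special treatment.

```python
from itertools import combinations


def sign(v: float) -> int:
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def weighted_kendall_tau(true_scores, predicted_scores):
    """Simplified weighted Kendall's tau (illustrative, not Mordal's code).

    Each model i gets weight 1 / (rank_i + 1), where rank 0 is the model
    with the highest true score. A pair (i, j) is weighted by the sum of
    its two model weights, so misordering top models costs more.
    """
    n = len(true_scores)
    if n != len(predicted_scores) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")

    # Rank candidates by true score, best first.
    order = sorted(range(n), key=lambda i: true_scores[i], reverse=True)
    rank = {idx: r for r, idx in enumerate(order)}
    weight = [1.0 / (rank[i] + 1) for i in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(n), 2):
        w = weight[i] + weight[j]
        # +1 if the pair is ordered the same way in both lists, -1 if not.
        agreement = sign(true_scores[i] - true_scores[j]) * sign(
            predicted_scores[i] - predicted_scores[j]
        )
        numerator += w * agreement
        denominator += w
    return numerator / denominator


# Hypothetical true task accuracies for four candidate models.
true_acc = [0.9, 0.8, 0.7, 0.6]
top_swapped = [0.8, 0.9, 0.7, 0.6]     # selector confuses the two best
bottom_swapped = [0.9, 0.8, 0.6, 0.7]  # selector confuses the two worst

tau_top = weighted_kendall_tau(true_acc, top_swapped)
tau_bottom = weighted_kendall_tau(true_acc, bottom_swapped)
print(tau_top, tau_bottom)
```

Under this weighting, confusing the two best candidates lowers the score more than confusing the two worst, which is the property that makes a weighted variant attractive for model selection, where only the top of the ranking is deployed.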

vision language models · automated model selection
