Open Source
Mordal: Automated Pretrained Model Selection for Vision Language Models
Mordal is an automated framework for selecting the pretrained vision language model (VLM) best suited to a specific task. It cuts candidate evaluation time and GPU usage, requiring 8.9x to 11.6x fewer GPU hours than grid search. It also outperforms existing model selection techniques by approximately 69% in weighted Kendall's τ, a rank-correlation measure, across various tasks. For practitioners, this streamlines the deployment of effective VLMs in real-world applications such as healthcare and robotics.
vision language models, automated model selection