Multimodal · arXiv cs.AI · 21 h ago

Modeling Complex Behaviors: Multi-Personality Composition and Dynamic Switching in Vision-Language Models

The paper introduces a framework for explicit personality conditioning in Multimodal Large Language Models (MLLMs), covering three settings: single-personality induction, multi-personality induction, and personality switching. Experiments indicate that personality induction improves image captioning performance but can degrade performance on visual question answering. The results highlight the complex interplay of personality traits in MLLMs and the need for specialized methods for personality modeling and evaluation. Code is to be released upon acceptance.

personality · vision-language · model behavior
