Training
Task-Aware Structured Memory for Dynamic Multi-modal In-Context Learning
The paper introduces TASM (Task-Aware Structured Memory), a framework that makes in-context learning (ICL) in multi-modal large language models (MLLMs) more efficient. TASM combines task-vector guided compression with semantics-aware token merging via bipartite graph matching to build a dynamic, hierarchical memory consisting of a compact Core Memory and a Latent Bank. This memory supports adaptive retrieval while minimizing bias and preserving semantic structure. The approach targets the scalability limits imposed by context-window length and KV cache cost, which makes it relevant to practitioners who need to reduce memory use in MLLMs.
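To make the token-merging idea concrete, here is a minimal sketch of generic bipartite token merging in plain Python. It is not the paper's implementation: the summary does not specify how TASM partitions tokens, scores similarity, or uses task vectors, so the alternating A/B split, the cosine similarity, the unweighted averaging, and the names `cosine`, `bipartite_merge`, and `r` are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def bipartite_merge(tokens, r):
    """Reduce a token list by r entries (hypothetical helper, not TASM's API).

    Tokens are split into two sets by alternating position (assumption).
    Each token in set A proposes an edge to its most similar token in set B;
    the r strongest edges are kept and those A tokens are averaged into
    their B partner. Returns the unmerged A tokens followed by the
    (possibly updated) B tokens.
    """
    a = [[float(x) for x in t] for t in tokens[0::2]]
    b = [[float(x) for x in t] for t in tokens[1::2]]
    r = max(0, min(r, len(a)))
    if r == 0 or not b:
        # Nothing to merge: return a float copy in the original order.
        return [[float(x) for x in t] for t in tokens]

    # One candidate edge per A token: (similarity, index in A, index in B).
    edges = []
    for i, u in enumerate(a):
        sims = [cosine(u, v) for v in b]
        j = max(range(len(b)), key=sims.__getitem__)
        edges.append((sims[j], i, j))

    # Keep the r most similar pairs (list.sort is stable, so ties keep order).
    edges.sort(key=lambda e: -e[0])
    merged = edges[:r]
    merged_a = {i for _, i, _ in merged}

    # Average each merged A token into its B partner.
    sums = [v[:] for v in b]
    counts = [1] * len(b)
    for _, i, j in merged:
        for k, x in enumerate(a[i]):
            sums[j][k] += x
        counts[j] += 1
    b_new = [[x / c for x in s] for s, c in zip(sums, counts)]

    kept_a = [u for i, u in enumerate(a) if i not in merged_a]
    return kept_a + b_new
```

Calling `bipartite_merge(tokens, r)` on `n` tokens returns `n - r` tokens when `r` does not exceed the size of set A. A semantics-aware or task-aware variant would presumably replace the similarity score or the choice of which edges to keep, but those details are not given in the summary.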
multi-modal, memory, in-context learning