Research · arXiv cs.CL · 7 d ago

LLM-based Embeddings: Attention Values Encode Sentence Semantics Better Than Hidden States

The paper proposes deriving sentence representations from Large Language Models (LLMs) using attention value vectors rather than final-layer hidden states, which are typically optimized for next-token prediction. Its Value Aggregation (VA) technique pools token value vectors across multiple layers and, in the training-free setting, outperforms existing methods, including the ensemble-based MetaEOL. An enhanced variant, Aligned Weighted VA (AlignedWVA), aligns attention outputs with the LLM's residual stream and reaches state-of-the-art performance for LLM-based embeddings, a result relevant to practitioners who need efficient and effective sentence representations in NLP applications.
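The pooling idea behind VA can be illustrated with a minimal sketch. The summary does not specify the pooling operator, the layer selection, or how AlignedWVA weights and aligns the values, so everything below is an assumption for illustration: the function names `mean_pool` and `value_aggregation` are hypothetical, uniform mean pooling stands in for whatever the paper actually uses, and the value vectors are toy numbers rather than real model outputs.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Sequence[float]]) -> Vector:
    """Average a non-empty sequence of equal-length vectors element-wise."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def value_aggregation(
    values_per_layer: Sequence[Sequence[Sequence[float]]],
    layers: Sequence[int],
) -> Vector:
    """Pool value vectors over tokens, then over the chosen layers.

    values_per_layer[l][t] is the value vector of token t at layer l.
    Assumption: plain mean pooling at both steps (not confirmed by the source).
    """
    per_layer = [mean_pool(values_per_layer[l]) for l in layers]
    return mean_pool(per_layer)


# Toy input: 3 layers, 2 tokens, dimension 2 (made-up numbers).
toy_values = [
    [[1.0, 0.0], [3.0, 2.0]],  # layer 0
    [[0.0, 4.0], [2.0, 0.0]],  # layer 1
    [[5.0, 5.0], [5.0, 5.0]],  # layer 2 (not selected below)
]

# Pool over layers 0 and 1 only; the result is a single sentence vector.
embedding = value_aggregation(toy_values, layers=[0, 1])
print(embedding)
```

In a real setting the per-layer value vectors would have to be extracted from the model's attention modules; how that is done depends on the model implementation and is not covered by the summary.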

llm · sentence-representation · embeddings
