From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG
The article introduces EPIC (Efficient Preference-aligned Index Construction), an approach to on-device Retrieval-Augmented Generation (RAG) that builds its memory around user preferences in order to reduce memory usage and improve retrieval accuracy. Compared with existing baselines, EPIC reduces indexing memory by a factor of 2,404, improves preference-following accuracy by 18.79%, and lowers retrieval latency by a factor of 32.17. It operates within a memory budget of under 1 MB and supports a latency of 5.21 to 29.35 ms per query across multiple platforms. For practitioners, this means more efficient and responsive personal AI agents that preserve user privacy by managing context locally.
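To make the idea of preference-aligned memory under a fixed budget concrete, here is a minimal Python sketch. It is not EPIC's algorithm, which this summary does not describe. It only illustrates the general principle of keeping preference-bearing context and dropping the rest so the memory stays within a byte budget. The `Snippet` type, the `preference_score` field, the `min_score` threshold, and the `build_memory` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

BUDGET_BYTES = 1_000_000  # illustrative stand-in for the "under 1 MB" constraint


@dataclass
class Snippet:
    text: str
    preference_score: float  # hypothetical: how strongly the text states a user preference


def build_memory(snippets, budget_bytes=BUDGET_BYTES, min_score=0.5):
    """Keep the highest-scoring preference snippets that fit in the byte budget."""
    kept, used = [], 0
    for s in sorted(snippets, key=lambda s: s.preference_score, reverse=True):
        if s.preference_score < min_score:
            break  # everything after this point scores even lower
        size = len(s.text.encode("utf-8"))
        if used + size > budget_bytes:
            continue  # too large for the remaining space; a smaller one may still fit
        kept.append(s)
        used += size
    return kept, used


history = [
    Snippet("I am vegetarian and avoid mushrooms.", 0.9),
    Snippet("What is the capital of France?", 0.1),
    Snippet("Please keep answers short.", 0.8),
]
memory, used = build_memory(history, budget_bytes=64)
```

In this toy run the generic question is dropped because it carries no preference signal, and the two preference statements are kept because together they fit in the 64-byte budget. A real system would need some way to produce the scores, which this sketch simply assumes as input.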