Research · arXiv cs.AI · 10 d ago

FOUNDv2: Learning Unified User Quantized Tokenizers for User Representation

FOUNDv2 is a user representation learning framework built around the Unified User Quantized Tokenizer (U2QT), which maps heterogeneous user data into a compact discrete token space. U2QT works in two stages: feature extraction, followed by token discretization with a multi-view RQ-VAE. Representing users as discrete tokens reduces storage and computational costs, and multi-scale alignment objectives improve predictive performance. The framework has been deployed on Alipay, which indicates that it scales to large personalized-service applications.
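To make the token discretization step concrete, here is a minimal sketch of residual quantization, the mechanism underlying an RQ-VAE: each level picks the codeword nearest to what the previous levels left unexplained, and the sequence of chosen indices becomes the discrete token sequence. This is not the FOUNDv2 implementation. The function names, the toy codebooks, and the 2-dimensional input are all assumptions made for illustration, and the sketch omits the encoder, the multi-view design, codebook learning, and the alignment objectives.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Codebook = List[Vector]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def residual_quantize(
    vector: Sequence[float], codebooks: Sequence[Codebook]
) -> Tuple[List[int], Vector]:
    """Quantize `vector` level by level; return (token indices, final residual)."""
    residual = list(vector)
    tokens: List[int] = []
    for codebook in codebooks:
        # Pick the codeword closest to the part not yet explained.
        index = min(
            range(len(codebook)),
            key=lambda i: squared_distance(residual, codebook[i]),
        )
        tokens.append(index)
        # Subtract the chosen codeword; the next level refines what is left.
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return tokens, residual


def reconstruct(tokens: Sequence[int], codebooks: Sequence[Codebook]) -> Vector:
    """Sum the selected codewords to approximate the original vector."""
    out = [0.0] * len(codebooks[0][0])
    for index, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[index])]
    return out


# Hypothetical, hand-picked codebooks (a real RQ-VAE learns these).
CODEBOOKS: List[Codebook] = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],   # coarse level
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25]],  # fine level
]

tokens, residual = residual_quantize([1.2, 1.0], CODEBOOKS)
print(tokens)                          # [1, 1]
print(reconstruct(tokens, CODEBOOKS))  # [1.25, 1.0]
```

In this toy setup a user embedding is stored as two small integers instead of a float vector, which is the kind of saving a discrete token space provides; the reconstruction error shrinks as more levels are added.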

user-representation · tokenization · machine-learning · relevance 0.00 · engagement 0.00
