Research
FOUNDv2: Learning Unified User Quantized Tokenizers for User Representation
FOUNDv2 is a user representation learning framework built around the Unified User Quantized Tokenizer (U2QT), which maps heterogeneous user data into a compact discrete token space. The architecture works in two stages: feature extraction, followed by a multi-view RQ-VAE that discretizes the extracted features into tokens. This design reduces storage and computational costs, while multi-scale alignment objectives improve predictive performance. The framework has been deployed on Alipay, which indicates that it scales to large applications built around personalized services.
user-representation, tokenization, machine-learning
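To make the token discretization stage more concrete, here is a minimal sketch of the greedy residual quantization step that gives an RQ-VAE its name. It is an illustration under stated assumptions, not the FOUNDv2 implementation: the function name `residual_quantize`, the dimensions, and the randomly initialized codebooks are all hypothetical, and the encoder, decoder, multi-view handling, and training losses are omitted.

```python
import numpy as np


def residual_quantize(x, codebooks):
    """Greedy residual quantization of a single vector.

    x         : array of shape (d,), e.g. a user embedding (assumed input).
    codebooks : list of arrays of shape (K, d), one per quantization level.
    Returns the chosen token ids and the reconstructed vector.
    """
    residual = np.asarray(x, dtype=np.float64).copy()
    recon = np.zeros_like(residual)
    tokens = []
    for codebook in codebooks:
        # Pick the codeword closest (squared L2) to the current residual.
        dists = np.sum((codebook - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        tokens.append(idx)
        # Add the codeword to the reconstruction and quantize what is left.
        recon += codebook[idx]
        residual -= codebook[idx]
    return tokens, recon


# Hypothetical sizes; in a trained model the codebooks are learned, not random.
rng = np.random.default_rng(0)
dim, levels, codebook_size = 8, 3, 16
codebooks = [
    rng.normal(scale=1.0 / (2 ** level), size=(codebook_size, dim))
    for level in range(levels)
]
user_embedding = rng.normal(size=dim)
tokens, recon = residual_quantize(user_embedding, codebooks)
print("tokens:", tokens)
```

Each user vector is stored as one small integer per level rather than as a dense float vector, which is the source of the storage savings a discrete token space offers.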