RAG
VCG: A Multimodal Retrieval Framework for E-Commerce Video Feeds under Extreme Cold-Start Conditions
The paper introduces the Video Candidate Generation (VCG) system, a multimodal retrieval framework designed for e-commerce video feeds facing extreme cold-start conditions. VCG uses a domain-adapted vision-language model based on CLIP to perform zero-shot retrieval: it maps users and videos into a shared semantic space, so candidates can be retrieved without interaction history and without inheriting engagement biases. Evaluation results indicate that VCG improves video completion rates by 50% compared to traditional methods, demonstrating its effectiveness in real-world applications.
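The retrieval step described above can be illustrated with a minimal sketch. It assumes that users and videos have already been embedded into the same vector space and that candidates are ranked by cosine similarity. The summary does not say how VCG builds its user representations or which similarity measure it uses, so the toy 3-dimensional vectors, the video identifiers, and the helper names (`normalize`, `cosine`, `retrieve`) are all hypothetical stand-ins, not the paper's implementation.

```python
import math


def normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def retrieve(user_vec, video_vecs, k=2):
    """Return the k videos closest to the user in the shared space.

    No interaction history is consulted: ranking depends only on the
    embeddings, which is what makes the retrieval zero-shot.
    """
    scored = [(vid, cosine(user_vec, vec)) for vid, vec in video_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings standing in for vision-language model output.
video_vecs = {
    "sneaker_review": [0.9, 0.1, 0.0],
    "lipstick_tutorial": [0.0, 0.2, 0.9],
    "running_gear_haul": [0.8, 0.3, 0.1],
}
# Hypothetical user embedding in the same space.
user_vec = [1.0, 0.2, 0.0]

top = retrieve(user_vec, video_vecs, k=2)
for video_id, score in top:
    print(f"{video_id}: {score:.3f}")
```

Because a newly uploaded video only needs an embedding to be retrievable, it can be ranked alongside established videos from the moment it is indexed. At production scale the exhaustive scan in `retrieve` would typically be replaced by an approximate nearest-neighbor index.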
multimodal retrieval, e-commerce, video feeds