Inference
RLRC: Reinforcement Learning-based Recovery for Compressed Vision-Language-Action Models
The paper presents RLRC, a three-stage compression and recovery pipeline for Vision-Language-Action (VLA) models: structured pruning, followed by supervised fine-tuning (SFT) and reinforcement learning (RL) to recover the performance lost to pruning. The authors report up to an 8x reduction in memory usage and a 2.3x inference speedup without compromising task success rates, outperforming existing compression techniques across multiple VLA architectures. The method is relevant to practitioners who need to deploy VLA models on resource-constrained devices.
vision-language-action, model compression