Training
Qwen 27B for planning, Qwen 35B-A3B for execution?
The discussion considers pairing two models: Qwen 27B for planning long-horizon tasks and Qwen 35B-A3B for executing them. Qwen 27B reportedly runs at approximately 7-10 tokens per second, while Qwen 35B-A3B reaches around 18 tokens per second, so the execution model generates roughly 1.8 to 2.6 times faster than the planning model. The trade-off is between the slower model's planning output and the faster model's throughput on the execution steps. This is relevant to practitioners choosing a deployment setup for task-oriented AI applications, particularly where steps are processed sequentially and generation time accumulates.
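To make the trade-off concrete, the arithmetic can be sketched as below. Only the throughput figures (7-10 and 18 tokens per second) come from the discussion; the token budgets and function names are hypothetical, and the estimate ignores prompt processing and model-switching overhead.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to generate `tokens` at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


def split_pipeline_seconds(plan_tokens: int, exec_tokens: int,
                           planner_tps: float, executor_tps: float) -> float:
    """Total decode time when one model plans and another executes."""
    return (generation_seconds(plan_tokens, planner_tps)
            + generation_seconds(exec_tokens, executor_tps))


# Throughput figures quoted in the discussion.
PLANNER_TPS_LOW, PLANNER_TPS_HIGH = 7.0, 10.0   # Qwen 27B
EXECUTOR_TPS = 18.0                             # Qwen 35B-A3B

# Hypothetical workload: a short plan followed by a longer execution trace.
PLAN_TOKENS = 700
EXEC_TOKENS = 3600

if __name__ == "__main__":
    slow = split_pipeline_seconds(PLAN_TOKENS, EXEC_TOKENS, PLANNER_TPS_LOW, EXECUTOR_TPS)
    fast = split_pipeline_seconds(PLAN_TOKENS, EXEC_TOKENS, PLANNER_TPS_HIGH, EXECUTOR_TPS)
    # Baseline: the planning model alone handles both phases at its slower rate.
    single = generation_seconds(PLAN_TOKENS + EXEC_TOKENS, PLANNER_TPS_LOW)
    print(f"split pipeline: {fast:.0f}-{slow:.0f} s, planner only (7 tok/s): {single:.0f} s")
```

Under these assumed budgets, most of the time is spent in the execution phase, which is why routing it to the faster model changes the total more than speeding up the planner would. Whether the split pays off in practice also depends on plan quality and on the cost of loading or hosting two models, which this sketch does not model.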
qwen, planning