Training · Reddit r/LocalLLaMA · 14 d ago

Qwen 27B for planning, Qwen 35B-A3B for execution?

The post asks whether to split a task-oriented workflow across two models: Qwen 27B for planning long-horizon tasks and Qwen 35B-A3B for executing them. The reported speeds are roughly 7-10 tokens per second for Qwen 27B and about 18 tokens per second for Qwen 35B-A3B, so the slower model would be reserved for the plan while the model that generates roughly 1.8 to 2.6 times faster handles execution. The question matters to practitioners deciding how to divide work between models in multi-step, sequential pipelines.
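The trade-off can be made concrete with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the throughput figures come from the post, but the token counts (`PLAN_TOKENS`, `EXEC_TOKENS`) and the function names are hypothetical, and the model ignores prompt processing and any time spent loading or swapping models.

```python
# Rough generation-time estimate for a planner/executor split.
# Throughput figures are the ones reported in the post; everything else
# (token counts, function names) is an illustrative assumption.

PLANNER_TPS_RANGE = (7.0, 10.0)  # Qwen 27B, tokens per second (from the post)
EXECUTOR_TPS = 18.0              # Qwen 35B-A3B, tokens per second (from the post)

PLAN_TOKENS = 500    # hypothetical: tokens generated for the plan
EXEC_TOKENS = 4000   # hypothetical: tokens generated while executing the plan


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to generate `tokens` at a steady decode rate."""
    return tokens / tokens_per_second


def split_pipeline_seconds(planner_tps: float) -> float:
    """Plan on the slower model, execute on the faster one."""
    return (generation_seconds(PLAN_TOKENS, planner_tps)
            + generation_seconds(EXEC_TOKENS, EXECUTOR_TPS))


def planner_only_seconds(planner_tps: float) -> float:
    """Baseline: the slower model generates both plan and execution."""
    return generation_seconds(PLAN_TOKENS + EXEC_TOKENS, planner_tps)


if __name__ == "__main__":
    for tps in PLANNER_TPS_RANGE:
        split = split_pipeline_seconds(tps)
        single = planner_only_seconds(tps)
        print(f"planner at {tps:.0f} tok/s: "
              f"split {split:.0f} s vs planner-only {single:.0f} s")
```

Under these assumptions the split only pays off when execution dominates the token budget; if the plan were the longer part, the planner's speed would set the total time regardless of the executor.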

qwen · planning · relevance 0.00 · engagement 0.00
