Research
Think Fast: Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models
This study evaluates the no-chain-of-thought (no-CoT) reasoning capabilities of frontier AI models on more than 30,000 questions drawn from 43 benchmarks. It finds that the 50% task-completion time horizon (TH) of these models has been doubling annually, with GPT-5.5 reaching a TH of over 3 minutes and a reasoning token horizon exceeding 1,500 tokens. The findings suggest that THs could exceed 7 minutes by 2028. Because this growth reflects reasoning that models carry out internally, without a visible chain of thought, it makes monitoring and safety assurance harder. Practitioners should factor these trends in task completion and reasoning into the design of oversight mechanisms for future AI systems.
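As a rough illustration of the doubling arithmetic behind these figures, the sketch below assumes constant exponential growth with a one-year doubling time. The function names, the 3-minute starting value, and the 7-minute target are taken or invented for illustration only; this is not the study's fitting procedure, which the summary does not describe.

```python
import math


def project_horizon(th_now_min: float, years_ahead: float,
                    doubling_time_years: float = 1.0) -> float:
    """Extrapolate a time horizon (in minutes) under constant exponential growth.

    Hypothetical helper: assumes the horizon doubles every
    `doubling_time_years` years, with no change in the trend.
    """
    return th_now_min * 2 ** (years_ahead / doubling_time_years)


def years_to_reach(th_now_min: float, th_target_min: float,
                   doubling_time_years: float = 1.0) -> float:
    """Years needed to grow from th_now_min to th_target_min under the same assumption."""
    return doubling_time_years * math.log2(th_target_min / th_now_min)


if __name__ == "__main__":
    # Illustrative inputs only: a 3-minute horizon today and annual doubling.
    print(project_horizon(3.0, years_ahead=1.0))
    print(years_to_reach(3.0, 7.0))
```

Under these assumptions, a 3-minute horizon would pass 7 minutes after roughly 1.2 years (log2(7/3)). The real timeline depends on the measurement date and on how the trend was fitted, neither of which is stated in this summary.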
ai-models, reasoning, task-completion