Research
CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges
CreativeBench is a benchmark for evaluating machine creativity in code generation. It consists of two subsets, CreativeBench-Combo and CreativeBench-Explore. Its automated pipeline uses reverse engineering and self-play, and it separates creativity from hallucination with a unified metric that combines quality and novelty. The findings indicate that larger models improve combinatorial creativity but may show diminishing returns on exploratory tasks. The proposed EvoRePE strategy improves creative performance by internalizing evolutionary search patterns, a result relevant to practitioners developing more effective generative models.
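To illustrate how a unified quality-and-novelty metric can separate creativity from hallucination, here is a minimal sketch. The summary does not give the formula, so the multiplicative form, the [0, 1] score ranges, the function name `creativity_score`, and the sample candidates are all assumptions made for illustration, not the benchmark's actual definition.

```python
def creativity_score(quality: float, novelty: float) -> float:
    """Combine quality and novelty into one score.

    ASSUMPTION: a simple product of two scores in [0, 1]. The source only
    says the metric unifies quality and novelty; the exact form is not given.
    """
    if not (0.0 <= quality <= 1.0 and 0.0 <= novelty <= 1.0):
        raise ValueError("quality and novelty must lie in [0, 1]")
    return quality * novelty


# Hypothetical candidates: (label, quality, novelty).
candidates = [
    ("correct but conventional", 1.0, 0.1),
    ("novel but failing tests", 0.0, 0.9),
    ("correct and novel", 0.9, 0.8),
]

# Rank candidates from most to least creative under the assumed metric.
ranked = sorted(
    candidates,
    key=lambda c: creativity_score(c[1], c[2]),
    reverse=True,
)

for label, quality, novelty in ranked:
    print(f"{label}: {creativity_score(quality, novelty):.2f}")
```

Under a product, a solution that is novel but incorrect scores zero, which is one way a single number could keep hallucinated output from being counted as creative. A weighted sum would not have that property.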
creativity benchmark llm