Research · arXiv cs.AI · 10 d ago

LLM Jaggedness Unlocks Scientific Creativity

The paper introduces SciAidanBench, a benchmark that assesses the scientific creativity of large language models (LLMs) by measuring their ability to generate unique, coherent responses to open-ended scientific questions. Evaluations of 19 base models across 30 variants reveal a phenomenon the authors term "jaggedness": creative performance that is uneven across tasks, prompts, and domains. The paper proposes treating this jaggedness as a resource for improving model performance through techniques such as inference-time compute and knowledge pooling, and suggests that understanding these variability patterns can guide the development of more effective meta-model ensembles for scientific idea generation.

creativity · llm · benchmark
