Research
DecompSR: A dataset for decomposed analyses of compositional multihop spatial reasoning
DecompSR is a benchmark dataset of over 5 million datapoints for analyzing compositional spatial reasoning in Large Language Models (LLMs). It allows several aspects of compositionality (productivity, substitutivity, overgeneralization, and systematicity) to be varied independently, and its correctness is verified with a symbolic solver. Results on the dataset indicate that LLMs struggle with productive and systematic generalization in spatial reasoning tasks, pointing to areas for improvement in model training and evaluation.
dataset, spatial-reasoning, llm
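
To make the idea of checking multihop spatial answers with a symbolic solver concrete, here is a minimal sketch. It is not DecompSR's solver or data format: the relation vocabulary (`OFFSETS`), the fact tuples, and the `solve` function are all hypothetical and chosen only for illustration. The sketch treats each relation as a unit step on a grid, propagates positions along the chain of facts, and reads the answer off the final displacement.

```python
# Hypothetical relation vocabulary (an assumption, not DecompSR's own).
OFFSETS = {
    "left": (-1, 0),
    "right": (1, 0),
    "above": (0, 1),
    "below": (0, -1),
}


def solve(facts, source, target):
    """Return the relation of `source` to `target`, or None if unconnected.

    `facts` is a list of (a, relation, b) tuples meaning "a is <relation> of b".
    """
    # Store every fact in both directions so the chain can be walked either way.
    edges = {}
    for a, rel, b in facts:
        dx, dy = OFFSETS[rel]
        edges.setdefault(b, []).append((a, dx, dy))
        edges.setdefault(a, []).append((b, -dx, -dy))

    # Place the target at the origin and propagate coordinates outward.
    positions = {target: (0, 0)}
    stack = [target]
    while stack:
        node = stack.pop()
        x, y = positions[node]
        for neighbour, dx, dy in edges.get(node, []):
            if neighbour not in positions:
                positions[neighbour] = (x + dx, y + dy)
                stack.append(neighbour)

    if source not in positions:
        return None

    # Convert the displacement of the source into a direction label.
    x, y = positions[source]
    horizontal = "left" if x < 0 else "right" if x > 0 else ""
    vertical = "below" if y < 0 else "above" if y > 0 else ""
    return "-".join(p for p in (vertical, horizontal) if p) or "overlap"


# Two-hop example: A is left of B, B is above C. Where is A relative to C?
story = [("A", "left", "B"), ("B", "above", "C")]
print(solve(story, "A", "C"))  # above-left
```

In a sketch like this, the number of facts that must be chained corresponds to the hop count, so longer stories probe productivity, while renaming entities or relation words without changing the geometry would probe substitutivity.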