JAMER: Project-Level Code Framework Dataset and Benchmark on Professional Game Engines
The article introduces JamSet and JamBench, the first project-level code framework dataset and benchmark for professional game engines, derived from over 240,000 open-source projects submitted to Game Jam competitions. Built on the Godot engine, JamSet contains 8,133 verified projects, 300 of which were manually validated to form JamBench. The benchmark evaluates two tasks, theme-driven generation and code completion, using metrics such as the Structural Completeness Score (SCS) and the Behavioral Alignment Score (BAS). Runtime pass rates decline significantly as project size increases, which indicates that architectural design is a key barrier for AI models in game development and makes the dataset a crucial resource for advancing research in AI-driven game coding.
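To make the project-size finding concrete, the sketch below shows one way a runtime pass rate could be tabulated per size bucket. It is only an illustration: the function name `pass_rate_by_size`, the use of file count as the size measure, the bucket boundaries, and the toy records are all assumptions made here, not the article's evaluation code or data.

```python
from bisect import bisect_left
from collections import defaultdict


def pass_rate_by_size(records, upper_bounds):
    """Return the fraction of passing projects in each size bucket.

    records: iterable of (size, passed) pairs; size is assumed to be a file count.
    upper_bounds: sorted inclusive upper bounds for the buckets. Sizes above
    the last bound land in one extra overflow bucket.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for size, passed in records:
        idx = bisect_left(upper_bounds, size)  # index of first bound >= size
        totals[idx] += 1
        passes[idx] += int(passed)
    return {idx: passes[idx] / totals[idx] for idx in sorted(totals)}


# Toy records invented purely for illustration (not taken from JamBench).
toy = [(3, True), (4, True), (5, False), (12, True), (15, False), (40, False)]
rates = pass_rate_by_size(toy, upper_bounds=[5, 20])
print(rates)
```

Bucket 0 covers sizes up to 5, bucket 1 covers sizes 6 to 20, and bucket 2 holds everything larger. A falling rate across the buckets is the kind of pattern the article reports, although its actual size measure and thresholds may differ.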