CodingarXiv cs.AI — 9 d ago

From Tokens to Regions: CUDA-Sensitive Instruction Tuning for GPU Kernel Generation

The paper introduces CUDA-Sensitive Instruction Tuning (CuSeT), a novel method for generating high-performance CUDA kernels using Large Language Models (LLMs). CuSeT employs adaptive token-level masking and region-aware sample reweighting to enhance the model's understanding of CUDA sensitivity at both token and region levels, resulting in improved functional correctness across various model families. This approach offers a low-cost alternative to traditional supervised fine-tuning and outperforms existing methods while maintaining lower inference costs, making it significant for practitioners focused on efficient GPU kernel generation.

cudakernel generationllmrelevance 0.00 · engagement 0.00

Read at source ↗← all news