Research · arXiv cs.AI · 15 d ago

How LLMs Fail and Generalize in RTL Coding for Hardware Design?

The paper presents a new error taxonomy for large language models (LLMs) writing register-transfer level (RTL) code, a task that requires translating sequential programming into the parallel, temporal logic of hardware design. The taxonomy identifies categories of failure such as syntactic and functional errors. In evaluations on the VerilogEval benchmark, leading models reach a maximum pass rate of 90.8%; the remaining failures are functional errors that resist improvement through scaling or alignment techniques. The findings point to a need for stronger model reasoning, rather than alignment alone, to address the challenges of RTL coding for hardware generation.

RTL coding · hardware design · error taxonomy
