Safety
Capability Minimization as a Safety Primitive: Risk-Aware Causal Gating for Least-Privilege LLM Agents
The article presents Risk-Aware Causal Gating (RACG), a framework that combines causal effect estimation with calibrated risk control to govern when a learned system acts. RACG uses distribution-free bounds to decide whether to act on a model's prediction based on its estimated counterfactual risk rather than on raw confidence, which reduces costly errors while preserving utility. The approach offers a structured way to improve safety and transparency in automated decision systems, particularly in high-stakes settings where reliable performance is critical.
risk control, causal gating, llm agents
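
The gating idea in the summary (act only when a distribution-free bound on estimated risk is low enough) can be illustrated with a minimal sketch. This is not the article's RACG algorithm. It assumes a calibration set of per-decision losses in [0, 1], treated as i.i.d. estimates of the counterfactual harm of acting, and it uses a one-sided Hoeffding bound as one example of a distribution-free bound. The names `hoeffding_upper_bound`, `should_act`, `alpha`, and `delta` are hypothetical, and how the counterfactual losses are estimated is left out entirely.

```python
import math
from typing import Sequence


def hoeffding_upper_bound(losses: Sequence[float], delta: float) -> float:
    """Upper confidence bound on the mean loss, valid with probability >= 1 - delta.

    Assumes losses are i.i.d. and bounded in [0, 1] (an assumption of this
    sketch, not something stated in the article).
    """
    n = len(losses)
    if n == 0:
        # No calibration data: fall back to the worst-case risk.
        return 1.0
    mean = sum(losses) / n
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return min(1.0, mean + slack)


def should_act(losses: Sequence[float], alpha: float, delta: float = 0.05) -> bool:
    """Gate: act only if the risk bound is at most the tolerated risk alpha.

    The decision depends on the bounded risk estimate, not on the model's
    raw confidence score.
    """
    return hoeffding_upper_bound(losses, delta) <= alpha


if __name__ == "__main__":
    # Hypothetical calibration losses for two candidate actions.
    well_tested = [0.0] * 200   # many observations, no observed harm
    barely_tested = [0.0] * 20  # same empirical risk, far less evidence
    print(should_act(well_tested, alpha=0.1))
    print(should_act(barely_tested, alpha=0.1))
```

Both actions have the same empirical risk, but the bound is wider when there is less evidence, so the gate abstains on the action with little calibration data. That is the sense in which a bound-based gate differs from thresholding on confidence.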