Safety
Beyond Runtime Enforcement: Shield Synthesis as Defensibility Analysis for Adversarial Networks
The paper reframes shield synthesis in shielded reinforcement learning as a tool for defensibility analysis of adversarial networks, not only a runtime safety mechanism. It models the problem as a two-player safety game in which the defender's specification identifies the unsafe regions and the attacker's specification limits the permitted actions. Solving the game yields a defensibility verdict that characterizes the system's defensive capabilities. The analysis gives insight into both network architecture and operational behavior, and it highlights that minor changes can significantly affect operational outcomes even while formal safety margins are maintained, offering AI practitioners a new perspective on network defense strategies.
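The two-player safety game described above can be illustrated with a minimal sketch. It is not the paper's implementation: it uses the standard greatest-fixed-point construction for safety games, and it assumes a turn structure in which the defender commits to an action and the attacker then picks any action its specification permits. The function `synthesize_shield`, the toy model (a count of compromised hosts), and the action names are all hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, Set, Tuple

State = Hashable
Action = Hashable


def synthesize_shield(
    states: Iterable[State],
    unsafe: Iterable[State],
    defender_actions: Callable[[State], Iterable[Action]],
    attacker_actions: Callable[[State], Iterable[Action]],
    step: Callable[[State, Action, Action], State],
) -> Tuple[Set[State], Dict[State, Set[Action]]]:
    """Return (winning region, shield) for a finite safety game.

    Assumption: the defender moves first, then the attacker responds.
    """
    # Start from every state the defender's specification allows.
    winning: Set[State] = set(states) - set(unsafe)
    while True:
        # Keep a state only if some defender action stays inside the
        # current region against every permitted attacker action.
        keep = {
            s for s in winning
            if any(
                all(step(s, d, a) in winning for a in attacker_actions(s))
                for d in defender_actions(s)
            )
        }
        if keep == winning:
            break
        winning = keep
    # The shield lists, per winning state, the defender actions that are safe.
    shield = {
        s: {
            d for d in defender_actions(s)
            if all(step(s, d, a) in winning for a in attacker_actions(s))
        }
        for s in winning
    }
    return winning, shield


# Hypothetical toy network: the state is the number of compromised hosts.
STATES = [0, 1, 2, 3]
UNSAFE = [3]  # defender's specification: never reach 3 compromised hosts
GAIN = {"wait": 0, "exploit": 1, "double_exploit": 2}
LOSS = {"monitor": 0, "patch": 1}


def step(s: int, d: str, a: str) -> int:
    # Clamp the host count to the modelled range [0, 3].
    return max(0, min(3, s + GAIN[a] - LOSS[d]))


def defender(_s: int):
    return ["monitor", "patch"]


# Attacker specification A permits one exploit per round; B also permits two.
win_a, shield_a = synthesize_shield(
    STATES, UNSAFE, defender, lambda s: ["wait", "exploit"], step)
win_b, shield_b = synthesize_shield(
    STATES, UNSAFE, defender,
    lambda s: ["wait", "exploit", "double_exploit"], step)

# Defensibility verdict: is the initial state inside the winning region?
defensible_a = 0 in win_a
defensible_b = 0 in win_b
```

In this toy model the verdict is the membership of the initial state in the winning region, and the shield is the per-state set of defender actions that stay inside it. Comparing `win_a` with `win_b` shows how a small change to the attacker's specification can change the verdict, which is the kind of sensitivity the summary refers to.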
reinforcement-learning, adversarial, networks