Research
RedactionBench
RedactionBench is a new benchmark for evaluating the redaction of personally identifiable information (PII) across diverse contexts. It comprises 200 annotated documents from 11 domains and introduces a metric, R-Score, that assesses contextual redaction while reducing the impact of superficial formatting variations. The benchmark shows that consensus on contextual redactions is difficult to reach among users, which underscores how complex privacy perceptions are and points to the need for improved models and standardized evaluation methods in privacy-preserving AI systems.
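The summary does not define R-Score, so the following is only a minimal sketch of the general idea of scoring redactions without penalizing superficial formatting differences. It is not the benchmark's actual metric. The helper names `normalize` and `span_f1`, the normalization rules, and the use of a set-based F1 are all assumptions made for illustration.

```python
import re
import unicodedata


def normalize(span: str) -> str:
    """Collapse case, Unicode form, punctuation, and whitespace differences.

    Hypothetical normalization; the rules used by R-Score are not stated.
    """
    text = unicodedata.normalize("NFKC", span).casefold()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation such as "(" or "-"
    return " ".join(text.split())         # squeeze runs of whitespace


def span_f1(predicted: list[str], gold: list[str]) -> float:
    """F1 over sets of normalized redacted spans (illustrative, not R-Score)."""
    pred = {normalize(s) for s in predicted} - {""}
    ref = {normalize(s) for s in gold} - {""}
    if not pred and not ref:
        return 1.0  # nothing to redact and nothing redacted
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a predicted span "(555) 123-4567" matches a gold span "555 123 4567" because both normalize to the same string, so a system is not penalized for keeping or dropping the punctuation around a phone number.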
llm, redaction, privacy