ai-digest.dev
Research · arXiv cs.AI · 21 h ago

CleanPatrick: A Benchmark for Image Data Cleaning

CleanPatrick is a new large-scale benchmark for image data cleaning. It is built on the Fitzpatrick17k dermatology dataset and annotated with 496,377 binary annotations from 933 medical crowd workers. The benchmark covers three kinds of data-quality issue: off-topic samples, near-duplicates, and label errors. It formalizes issue detection as a ranking task, in which a cleaning method orders samples by how likely each is to be problematic, and it evaluates those orderings with standard ranking metrics. This lets practitioners compare image-cleaning strategies systematically, and the benchmark reports how self-supervised representations and classical anomaly detection methods perform on real-world data.
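To make the ranking formulation concrete, here is a minimal sketch of how a cleaning method's output could be scored. The summary only says "standard ranking metrics", so the choice of average precision and precision@k is an assumption, as are the function names and the toy scores and labels, none of which come from the benchmark.

```python
from typing import Sequence


def _ranked_indices(scores: Sequence[float]) -> list[int]:
    # Highest score first: the method's most suspicious samples lead the ranking.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def average_precision(scores: Sequence[float], is_issue: Sequence[bool]) -> float:
    """Mean of the precision values measured at each true issue in the ranking."""
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(_ranked_indices(scores), start=1):
        if is_issue[idx]:
            hits += 1
            precision_sum += hits / rank
    # With no true issues, average precision is undefined; return 0.0 by convention.
    return precision_sum / hits if hits else 0.0


def precision_at_k(scores: Sequence[float], is_issue: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked samples that are true issues."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = _ranked_indices(scores)[:k]
    return sum(1 for idx in top_k if is_issue[idx]) / k


# Hypothetical example: four images, two of which annotators flagged as issues.
scores = [0.9, 0.8, 0.1, 0.4]            # issue scores from some cleaning method
is_issue = [True, False, False, True]    # ground truth from annotations

print(round(average_precision(scores, is_issue), 3))  # 0.833
print(precision_at_k(scores, is_issue, k=2))          # 0.5
```

A ranking view suits data cleaning because a reviewer usually inspects only the top of the list: precision@k describes how much of a fixed review budget lands on real issues, while average precision summarizes the whole ordering.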

data cleaning · benchmark · image data · machine learning