Research · arXiv cs.AI · 4 d ago

LASA: A Weak Supervision Method for Open-Vocabulary Scene Sketch Semantic Segmentation

LASA is a weakly supervised method for open-vocabulary scene sketch semantic segmentation. It uses a structure-aware framework to assign semantic labels to line drawings without pixel-level annotations. LASA employs Layer-wise Accumulated Structural Attention, which aggregates attention from multiple layers of Vision Transformer models, and reports mean Intersection over Union (mIoU) gains of +3.43, +8.01, and +15.74 on the FS-COCO, SFSD, and FrISS datasets, respectively. For practitioners, the approach targets the difficulty of semantic understanding in sketches and provides a mechanism for hierarchical semantic alignment, which may improve model performance in applications where labeled data is scarce.
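For readers unfamiliar with the metric behind those gains, here is a minimal sketch of how mIoU is commonly computed from per-pixel class labels. The function name `mean_iou`, the flat-list input format, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the summary does not describe the paper's exact evaluation protocol.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes.

    pred, target: flat sequences of integer class ids, one per pixel.
    Classes that appear in neither sequence are skipped (an assumption;
    other protocols count them differently).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # class absent from both; skip it
        ious.append(inter / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5 up to floating-point rounding.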

Tags: semantic segmentation, scene understanding, weak supervision
