Segment Anything without Supervision

Wang, XuDong; Yang, Jingfeng; Darrell, Trevor


arXiv:2406.20081 (cs)

[Submitted on 28 Jun 2024]


Abstract: The Segment Anything Model (SAM) requires labor-intensive data labeling. We present Unsupervised SAM (UnSAM) for promptable and automatic whole-image segmentation that does not require human annotations. UnSAM uses a divide-and-conquer strategy to "discover" the hierarchical structure of visual scenes. We first apply top-down clustering methods to partition an unlabeled image into instance/semantic-level segments. Then, for all pixels within a segment, a bottom-up clustering method iteratively merges them into larger groups, forming a hierarchical structure. These unsupervised multi-granular masks are then used to supervise model training. Evaluated across seven popular datasets, UnSAM achieves results competitive with its supervised counterpart SAM and surpasses the previous state of the art in unsupervised segmentation by 11% in terms of average recall (AR). Moreover, we show that supervised SAM can also benefit from our self-supervised labels. By integrating our unsupervised pseudo masks into SA-1B's ground-truth masks and training UnSAM with only 1% of SA-1B, a lightly semi-supervised UnSAM can often segment entities overlooked by supervised SAM, exceeding SAM's AR by over 6.7% and its average precision (AP) by 3.9% on SA-1B.
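
The divide-and-conquer idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithms, so everything here is an assumption for illustration only: a small k-means with farthest-point initialization stands in for the top-down step, centroid-distance merging at increasing thresholds stands in for the bottom-up step, and per-pixel feature vectors are taken as given. The function names `top_down_partition`, `bottom_up_merge`, and `divide_and_conquer` are hypothetical and are not taken from the UnSAM code.

```python
import numpy as np

def top_down_partition(features, k, iters=10):
    """Divide (stand-in): k-means with deterministic farthest-point init."""
    centers = [features[0]]
    for _ in range(k - 1):
        # Squared distance of every pixel to its nearest chosen center.
        d = np.min([((features - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = features[labels == c].mean(0)
    return labels

def bottom_up_merge(features, thresholds):
    """Conquer (stand-in): start from single pixels and repeatedly merge the
    closest pair of groups; record the grouping reached at each threshold."""
    groups = [[i] for i in range(len(features))]
    levels = []
    for t in sorted(thresholds):
        while len(groups) > 1:
            cents = np.array([features[g].mean(0) for g in groups])
            d = np.linalg.norm(cents[:, None] - cents[None], axis=-1)
            np.fill_diagonal(d, np.inf)
            i, j = np.unravel_index(d.argmin(), d.shape)
            if d[i, j] > t:
                break  # nothing left that is similar enough at this level
            groups[i] = groups[i] + groups[j]
            del groups[j]
        levels.append([list(g) for g in groups])
    return levels  # fine-to-coarse groupings, one per threshold

def divide_and_conquer(features, k, thresholds):
    """Coarse segments first, then a hierarchy of sub-groups inside each."""
    features = np.asarray(features, dtype=float)
    coarse = top_down_partition(features, k)
    hierarchy = {}
    for seg in np.unique(coarse):
        idx = np.flatnonzero(coarse == seg)
        levels = bottom_up_merge(features[idx], thresholds)
        # Map local indices back to pixel indices in the whole image.
        hierarchy[int(seg)] = [[idx[g].tolist() for g in lvl] for lvl in levels]
    return coarse, hierarchy

# Toy "image": 8 pixels with 2-D features, two objects, each with two parts.
toy = [[0, 0], [0.1, 0], [1, 0], [1.1, 0],
       [10, 10], [10.1, 10], [11, 10], [11.1, 10]]
coarse, hierarchy = divide_and_conquer(toy, k=2, thresholds=[0.2, 2.0])
```

In this toy setup, each coarse segment yields a part-level grouping at the smaller threshold and a single whole-segment group at the larger one. In the pipeline the abstract describes, multi-granular masks of this kind serve as pseudo labels for training the segmentation model.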

Comments:	Code: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2406.20081 [cs.CV]
	(or arXiv:2406.20081v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.20081

Submission history

From: Xudong Wang
[v1] Fri, 28 Jun 2024 17:47:32 UTC (22,085 KB)
