Computer Science > Computer Vision and Pattern Recognition

arXiv:2402.16338 (cs)

[Submitted on 26 Feb 2024 (v1), last revised 11 Mar 2024 (this version, v4)]

Title: BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning of SAM

Authors: Li Zhang, Youwei Liang, Ruiyi Zhang, Amirhosein Javadi, Pengtao Xie

Abstract: The Segment Anything Model (SAM), a foundation model pretrained on millions of images and segmentation masks, has significantly advanced semantic segmentation, a fundamental task in computer vision. Despite its strengths, SAM encounters two major challenges. Firstly, it struggles with segmenting specific objects autonomously, as it relies on users to manually input prompts like points or bounding boxes to identify targeted objects. Secondly, SAM faces challenges in excelling at specific downstream tasks, like medical imaging, due to a disparity between the distribution of its pretraining data, which predominantly consists of general-domain images, and the data used in downstream tasks. Current solutions to these problems, which involve finetuning SAM, often lead to overfitting, a notable issue in scenarios with very limited data, like in medical imaging. To overcome these limitations, we introduce BLO-SAM, which finetunes SAM based on bi-level optimization (BLO). Our approach allows for automatic image segmentation without the need for manual prompts, by optimizing a learnable prompt embedding. Furthermore, it significantly reduces the risk of overfitting by training the model's weight parameters and the prompt embedding on two separate subsets of the training dataset, each at a different level of optimization. We apply BLO-SAM to diverse semantic segmentation tasks in general and medical domains. The results demonstrate BLO-SAM's superior performance over various state-of-the-art image semantic segmentation methods.
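
The two-level split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scalar `w` stands in for the model's weight parameters, the scalar `p` stands in for the learnable prompt embedding, and the function names, the linear model, and the first-order alternating update are all assumptions made for illustration. The only idea taken from the abstract is that the two parameter groups are updated on two separate subsets of the training data.

```python
# Toy illustration only (hypothetical names, not the BLO-SAM code):
# a scalar "weight" w and a scalar "prompt" p stand in for SAM's weight
# parameters and the learnable prompt embedding.

def mse_grads(w, p, data):
    """Gradients of the mean squared error of the model y = w * x + p."""
    n = len(data)
    grad_w = sum(2 * (w * x + p - y) * x for x, y in data) / n
    grad_p = sum(2 * (w * x + p - y) for x, y in data) / n
    return grad_w, grad_p

def alternating_bilevel(data, steps=2000, lr=0.05):
    """First-order alternating approximation of a bi-level scheme.

    Lower level: update w on the first subset with p held fixed.
    Upper level: update p on the second subset using the updated w.
    """
    half = len(data) // 2
    subset_lower, subset_upper = data[:half], data[half:]
    w, p = 0.0, 0.0
    for _ in range(steps):
        grad_w, _ = mse_grads(w, p, subset_lower)  # lower-level step
        w -= lr * grad_w
        _, grad_p = mse_grads(w, p, subset_upper)  # upper-level step
        p -= lr * grad_p
    return w, p

# Noise-free synthetic data generated from y = 2x + 1.
data = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 10)]
w, p = alternating_bilevel(data)
print(round(w, 4), round(p, 4))
```

Because each parameter group only ever sees its own subset, neither can be fit to the full training set at once, which is the intuition behind the overfitting claim. A real bi-level method may also propagate gradients through the lower-level update; this sketch omits that.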

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2402.16338 [cs.CV]
	(or arXiv:2402.16338v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.16338

Submission history

From: Li Zhang
[v1] Mon, 26 Feb 2024 06:36:32 UTC (5,384 KB)
[v2] Thu, 29 Feb 2024 17:04:01 UTC (5,384 KB)
[v3] Sat, 2 Mar 2024 00:22:06 UTC (5,384 KB)
[v4] Mon, 11 Mar 2024 16:40:27 UTC (5,384 KB)
