When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario

Chengcheng Han, Liqing Cui, Renyu Zhu, Jianing Wang, Nuo Chen, Qiushi Sun, Xiang Li, Ming Gao

Abstract

Large pre-trained language models (PLMs) have garnered significant attention for their versatility and potential for solving a wide spectrum of natural language processing (NLP) tasks. However, the cost of running these PLMs may be prohibitive. Furthermore, some PLMs, such as GPT-3, may not be open-sourced due to commercial considerations and potential risks of misuse. The parameters and gradients of PLMs are unavailable in this scenario. To solve the issue, black-box tuning has been proposed, which utilizes derivative-free optimization (DFO), instead of gradient descent, for training task-specific continuous prompts. However, these gradient-free methods still exhibit a significant performance gap compared to gradient-based methods. In this paper, we introduce gradient descent into the black-box tuning scenario through knowledge distillation. Furthermore, we propose a novel method, GDFO, which integrates gradient descent and derivative-free optimization to optimize task-specific continuous prompts in a harmonized manner. Experimental results show that GDFO can achieve significant performance gains over previous state-of-the-art methods.
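
As a toy illustration of the derivative-free side of the black-box setting described in the abstract, the sketch below tunes a small continuous vector using only scalar loss values, never gradients. It is not the GDFO method, whose details are not given on this page. Everything in it is an assumption made for illustration: the names `black_box_loss` and `dfo_search`, the quadratic objective standing in for a PLM API call, and the simple random search standing in for whatever DFO algorithm a real system would use.

```python
import random


def black_box_loss(prompt):
    """Hypothetical stand-in for a PLM API call.

    Returns only a scalar loss; the caller has no access to gradients.
    The quadratic form and the hidden target are illustrative assumptions.
    """
    target = [0.5, -1.0, 2.0, 0.0]
    return sum((p - t) ** 2 for p, t in zip(prompt, target))


def dfo_search(loss_fn, dim, iters=2000, sigma=0.3, seed=0):
    """Simple random search: perturb the current vector with Gaussian
    noise and keep the candidate only if its loss is strictly lower."""
    rng = random.Random(seed)
    best = [0.0] * dim
    best_loss = loss_fn(best)
    for _ in range(iters):
        candidate = [x + rng.gauss(0.0, sigma) for x in best]
        candidate_loss = loss_fn(candidate)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss


# Loss of the all-zeros starting vector, for comparison.
initial_loss = black_box_loss([0.0] * 4)
prompt, final_loss = dfo_search(black_box_loss, dim=4)
print(f"initial loss: {initial_loss:.3f}, final loss: {final_loss:.3f}")
```

The search only ever queries `loss_fn` for a number, which mirrors the constraint that parameters and gradients of the underlying model are unavailable. Each step costs one query, so the query budget (`iters` here) is the main cost in such a setting.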

Anthology ID: 2023.findings-acl.55
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 868–880
URL: https://aclanthology.org/2023.findings-acl.55/
DOI: 10.18653/v1/2023.findings-acl.55
Cite (ACL): Chengcheng Han, Liqing Cui, Renyu Zhu, Jianing Wang, Nuo Chen, Qiushi Sun, Xiang Li, and Ming Gao. 2023. When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario. In Findings of the Association for Computational Linguistics: ACL 2023, pages 868–880, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario (Han et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.55.pdf
