Computer Science > Computer Vision and Pattern Recognition
[Submitted on 12 Jun 2023]
Title:Sticker820K: Empowering Interactive Retrieval with Stickers
View PDFAbstract:Stickers have become a ubiquitous part of modern-day communication, conveying complex emotions through visual imagery. To facilitate the development of more powerful algorithms for analyzing stickers, we propose a large-scale Chinese sticker dataset, namely Sticker820K, which consists of 820k image-text pairs. Each sticker has rich and high-quality textual annotations, including descriptions, optical characters, emotional labels, and style classifications. Although vision-language tasks in the domain of natural images have been well studied, directly applying the those models, such as CLIP, to sticker data is not an optimal solution due to the discrepant nature between natural and emotive image data. Therefore, we propose StickerCLIP as a benchmark model on the Sticker820K dataset. For the text-to-image retrieval task, our StickerCLIP demonstrates strong superiority over the CLIP, which achieves an absolute gain of 66.0\% in mean recall on the Sticker820K test set. Additionally, we endeavor to extend the recently popularized LLM by means of prompt tuning, integrating its ability for sticker retrieval and allowing users to retrieve stickers through instructions. We validate the feasibility of this method, demonstrating the immense potential of prompt tuning in expanding LLM abilities while not affecting the quality of upstream tasks.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.