short-paper

Open access

Automated categorization of pre-trained models in software engineering: A case study with a Hugging Face dataset

Authors:

Claudio Di Sipio,

Riccardo Rubei,

Davide Di Ruscio,

Phuong T. Nguyen

EASE '24: Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering

Pages 351 - 356

https://doi.org/10.1145/3661167.3661215

Published: 18 June 2024

Abstract

Software engineering (SE) activities have been revolutionized by the advent of pre-trained models (PTMs), defined as large machine learning (ML) models that can be fine-tuned to perform specific SE tasks. However, users with limited expertise may need help selecting the appropriate model for their current task. To tackle the issue, the Hugging Face (HF) platform simplifies the use of PTMs by collecting, storing, and curating several models. Nevertheless, the platform currently lacks a comprehensive categorization of PTMs designed specifically for SE: the existing tags are better suited to generic ML categories.

This paper introduces an approach to bridge the gap by enabling the automatic classification of PTMs for SE tasks. First, we utilize a public dump of HF to extract PTM information, including model documentation and associated tags. Then, we employ a semi-automated method to identify SE tasks and their corresponding PTMs from existing literature. The approach involves creating an initial mapping between HF tags and specific SE tasks, using a similarity-based strategy to identify PTMs with relevant tags. The evaluation shows that model cards are informative enough to classify PTMs considering the pipeline tag. Moreover, we provide a mapping between SE tasks and stored PTMs by relying on model names.
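To make the idea of a similarity-based mapping between tags and SE tasks more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a generic string-similarity measure from Python's standard library (`difflib.SequenceMatcher`), which is not necessarily the measure used in the paper. The task list, the example tags, the `threshold` value, and the function names are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical SE task names; the paper derives its own list from the literature.
SE_TASKS = ["code generation", "code summarization", "code review", "bug fixing"]


def normalize(text: str) -> str:
    """Lowercase the text and treat hyphens and underscores as spaces."""
    return text.lower().replace("-", " ").replace("_", " ").strip()


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def map_tags_to_tasks(tags, tasks=SE_TASKS, threshold=0.8):
    """Map each tag to its most similar task, keeping only matches at or above the threshold."""
    mapping = {}
    for tag in tags:
        best_task = max(tasks, key=lambda task: similarity(tag, task))
        if similarity(tag, best_task) >= threshold:
            mapping[tag] = best_task
    return mapping


if __name__ == "__main__":
    # Illustrative tags only; real tags would come from the HF dump.
    example_tags = ["code-generation", "Code_Review", "image-classification"]
    print(map_tags_to_tasks(example_tags))
```

In this sketch, a tag with no sufficiently similar task is simply left unmapped, which mirrors the observation that generic ML tags do not always correspond to an SE task. The threshold is an assumed tuning parameter and would need to be calibrated on real data.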

Published In

EASE '24: Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering

June 2024

728 pages

ISBN: 9798400717017

DOI: 10.1145/3661167

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper
Research
Refereed limited

Funding Sources

FRINGE PNRR Project
EMELIOT PRIN project
TRex-SE PRIN Project

Conference

EASE 2024

EASE 2024: 28th International Conference on Evaluation and Assessment in Software Engineering

June 18 - 21, 2024

Salerno, Italy

Acceptance Rates

Overall Acceptance Rate 71 of 232 submissions, 31%
