DOI: 10.1109/ICSE48619.2023.00206

An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry

Published: 26 July 2023

Abstract

Deep Neural Networks (DNNs) are being adopted as components in software systems. Creating and specializing DNNs from scratch has grown increasingly difficult as state-of-the-art architectures grow more complex. Following the path of traditional software engineering, machine learning engineers have begun to reuse large-scale pre-trained models (PTMs) and fine-tune these models for downstream tasks. Prior works have studied reuse practices for traditional software packages to guide software engineers towards better package maintenance and dependency management. We lack a similar foundation of knowledge to guide behaviors in pre-trained model ecosystems.
In this work, we present the first empirical investigation of PTM reuse. We interviewed 12 practitioners from the most popular PTM ecosystem, Hugging Face, to learn the practices and challenges of PTM reuse. From this data, we model the decision-making process for PTM reuse. Based on the identified practices, we describe useful attributes for model reuse, including provenance, reproducibility, and portability. Three challenges for PTM reuse are missing attributes, discrepancies between claimed and actual performance, and model risks. We substantiate these identified challenges with systematic measurements in the Hugging Face ecosystem. Our work informs future directions for optimizing deep learning ecosystems by automatically measuring useful attributes and potential attacks, and envisions future research on infrastructure and standardization for model registries.
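The reuse workflow the paper studies is the download-and-fine-tune loop against the Hugging Face Hub. As a minimal sketch of that workflow (illustrative only, not code from the paper; the checkpoint name, toy data, and hyperparameters are placeholder assumptions), a practitioner using the Hugging Face transformers library might proceed as follows:

# Minimal PTM reuse sketch with Hugging Face transformers (illustrative; not from the paper).
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# 1. Reuse: download a pre-trained checkpoint (PTM) from the Hugging Face Hub.
checkpoint = "bert-base-uncased"  # placeholder PTM choice
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# 2. Fine-tune: one gradient step on toy sentiment data stands in for real training.
batch = tokenizer(["great movie", "terrible movie"], padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()

# 3. Save the adapted model; it could then be re-shared on the Hub,
#    continuing the reuse chain whose practices and risks the paper examines.
model.save_pretrained("./fine-tuned-model")
tokenizer.save_pretrained("./fine-tuned-model")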




Published In

ICSE '23: Proceedings of the 45th International Conference on Software Engineering
May 2023
2713 pages
ISBN:9781665457019
  • General Chair:
  • John Grundy,
  • Program Co-chairs:
  • Lori Pollock,
  • Massimiliano Di Penta

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Publication History

Published: 26 July 2023


Author Tags

  1. software reuse
  2. empirical software engineering
  3. machine learning
  4. deep learning
  5. software supply chain
  6. engineering decision making
  7. cybersecurity
  8. trust

Qualifiers

  • Research-article

Conference

ICSE '23: 45th International Conference on Software Engineering
May 14-20, 2023
Melbourne, Victoria, Australia

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%


Cited By

  • (2024) "Models Are Codes: Towards Measuring Malicious Code Poisoning Attacks on Pre-trained Model Hubs," Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 2087-2098. https://doi.org/10.1145/3691620.3695271. Online publication date: 27-Oct-2024.
  • (2024) "What do we know about Hugging Face? A systematic literature review and quantitative validation of qualitative claims," Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 13-24. https://doi.org/10.1145/3674805.3686665. Online publication date: 24-Oct-2024.
  • (2024) "Automated categorization of pre-trained models in software engineering: A case study with a Hugging Face dataset," Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, pp. 351-356. https://doi.org/10.1145/3661167.3661215. Online publication date: 18-Jun-2024.
  • (2024) "Interoperability in Deep Learning: A User Survey and Failure Analysis of ONNX Model Converters," Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1466-1478. https://doi.org/10.1145/3650212.3680374. Online publication date: 11-Sep-2024.
  • (2024) "ModelGo: A Practical Tool for Machine Learning License Analysis," Proceedings of the ACM Web Conference 2024, pp. 1158-1169. https://doi.org/10.1145/3589334.3645520. Online publication date: 13-May-2024.
  • (2024) "Active Code Learning: Benchmarking Sample-Efficient Training of Code Models," IEEE Transactions on Software Engineering, vol. 50, no. 5, pp. 1080-1095. https://doi.org/10.1109/TSE.2024.3376964. Online publication date: 13-Mar-2024.
  • (2024) "The State of Documentation Practices of Third-Party Machine Learning Models and Datasets," IEEE Software, vol. 41, no. 5, pp. 52-59. https://doi.org/10.1109/MS.2024.3366111. Online publication date: 1-Sep-2024.
