DOI: 10.1145/3411764.3445538
Research article · Open access

UMLAUT: Debugging Deep Learning Programs using Program Structure and Model Behavior

Published: 07 May 2021

Abstract

Training deep neural networks can generate non-descriptive error messages or produce unusual output without any explicit errors at all. While experts rely on tacit knowledge to apply debugging strategies, non-experts lack the experience required to interpret model output and correct Deep Learning (DL) programs. In this work, we identify DL debugging heuristics and strategies used by experts, categorize the types of errors novices run into when writing ML code, and map them onto opportunities where tools could help. We use these findings to guide the design of Umlaut. Umlaut checks DL program structure and model behavior against these heuristics; provides human-readable error messages to users; and annotates erroneous model output to facilitate error correction. Umlaut links code, model output, and tutorial-driven error messages in a single interface. We evaluated Umlaut in a study with 15 participants to determine its effectiveness in helping developers find and fix errors in their DL programs. Participants using Umlaut found and fixed significantly more bugs, and implemented fixes for more bugs, than participants in a baseline condition.
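
The checks described above are heuristics over program structure (e.g., layer shapes and activations) and model behavior (e.g., input statistics). As a rough illustration only, the sketch below shows two such heuristics in plain Python/Keras: a mismatch between a logits-producing final layer and a loss configured for probabilities, and unnormalized image inputs. The function names, thresholds, and messages are hypothetical and are not Umlaut's actual API.

# Hypothetical sketch only: these function names are illustrative and are not
# Umlaut's actual API. It shows the kind of structural and behavioral
# heuristics an automated checker could apply to a Keras model before training.
import numpy as np
import tensorflow as tf


def check_final_activation(model, loss_from_logits):
    """Flag a mismatch between the final layer's activation and the loss configuration."""
    messages = []
    last_activation = getattr(model.layers[-1], "activation", None)
    outputs_logits = last_activation in (None, tf.keras.activations.linear)
    if loss_from_logits and not outputs_logits:
        messages.append("Loss expects raw logits, but the final layer already applies "
                        "an activation; remove the activation or set from_logits=False.")
    if not loss_from_logits and outputs_logits:
        messages.append("Loss expects probabilities, but the final layer outputs raw "
                        "logits; add a softmax activation or set from_logits=True.")
    return messages


def check_input_scale(x):
    """Flag image-like inputs that look unnormalized (e.g., raw 0-255 pixel values)."""
    if float(np.max(x)) > 1.5:  # heuristic threshold for unscaled pixel data
        return ["Input values exceed 1.0; consider rescaling (e.g., divide by 255)."]
    return []


if __name__ == "__main__":
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),  # raw logits, no softmax
    ])
    # Deliberate bugs: the loss expects probabilities, and the inputs are unscaled pixels.
    x = np.random.randint(0, 256, size=(32, 28, 28)).astype("float32")
    for message in check_final_activation(model, loss_from_logits=False) + check_input_scale(x):
        print("WARNING:", message)

In Umlaut itself, such checks are linked to human-readable, tutorial-driven explanations and to the offending code and model output, rather than surfaced as bare warnings.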



        Information

        Published In

        CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
        May 2021
        10,862 pages
        ISBN: 978-1-4503-8096-6
        DOI: 10.1145/3411764
        This work is licensed under a Creative Commons Attribution 4.0 International License.


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. End-User ML
        2. ML Debugging
        3. ML Development

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        CHI '21

        Acceptance Rates

        Overall Acceptance Rate 6,199 of 26,314 submissions, 24%



        Article Metrics

        • Downloads (Last 12 months): 528
        • Downloads (Last 6 weeks): 65
        Reflects downloads up to 29 Sep 2024


        Cited By

        • (2024) BTTackler: A Diagnosis-based Framework for Efficient Deep Learning Hyperparameter Optimization. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2340-2351. https://doi.org/10.1145/3637528.3671933. Online publication date: 25-Aug-2024.
        • (2024) Jigsaw: Supporting Designers to Prototype Multimodal Applications by Chaining AI Foundation Models. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-15. https://doi.org/10.1145/3613904.3641920. Online publication date: 11-May-2024.
        • (2024) DeepCNN: A Dual Approach to Fault Localization and Repair in Convolutional Neural Networks. IEEE Access, 12, 50321-50334. https://doi.org/10.1109/ACCESS.2024.3384981. Online publication date: 2024.
        • (2024) When debugging encounters artificial intelligence: state of the art and open challenges. Science China Information Sciences, 67(4). https://doi.org/10.1007/s11432-022-3803-9. Online publication date: 21-Feb-2024.
        • (2024) Common challenges of deep reinforcement learning applications development: an empirical study. Empirical Software Engineering, 29(4). https://doi.org/10.1007/s10664-024-10500-5. Online publication date: 14-Jun-2024.
        • (2024) An empirical study of fault localization in Python programs. Empirical Software Engineering, 29(4). https://doi.org/10.1007/s10664-024-10475-3. Online publication date: 13-Jun-2024.
        • (2023) Semantic-Based Neural Network Repair. Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 150-162. https://doi.org/10.1145/3597926.3598045. Online publication date: 12-Jul-2023.
        • (2023) Rapsai: Accelerating Machine Learning Prototyping of Multimedia Applications through Visual Programming. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-23. https://doi.org/10.1145/3544548.3581338. Online publication date: 19-Apr-2023.
        • (2023) Multi-Objective White-Box Test Input Selection for Deep Neural Network Model Enhancement. 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), 521-532. https://doi.org/10.1109/ISSRE59848.2023.00051. Online publication date: 9-Oct-2023.
        • (2023) Repairing DNN Architecture: Are We There Yet? 2023 IEEE Conference on Software Testing, Verification and Validation (ICST), 234-245. https://doi.org/10.1109/ICST57152.2023.00030. Online publication date: Apr-2023.
